
Envisioning is an emerging technology research institute and advisory.



TTC (Test-Time Compute)

Allocating additional computational resources during inference to improve reasoning and output quality

Year: 2024 · Generality: 689

Test-Time Compute (TTC) is the strategy of spending more computational resources at inference time (when a model is answering a question) rather than at training time (when a model learns from data) to improve output quality, reasoning depth, and solution accuracy. This inverts the scaling intuition from the era of large language models, where bigger models trained on more data generally performed better. The key insight is that reasoning can be "thought out" as well as "learned in"—sometimes it's better to allocate 1000 inference steps to a single problem than to train a larger model.

Technically, test-time compute manifests through several mechanisms. Chain-of-thought prompting (asking the model to "think step-by-step") is a lightweight form, expanding reasoning tokens without extra training. Beam search, where the model generates multiple candidate solutions and ranks them, is another approach. More recent models like OpenAI's o1 and o3 are trained with reinforcement learning to produce long chains of thought, which lets them search through reasoning pathways at inference, backtrack when stuck, and verify intermediate steps. Some frameworks use ensembles, running multiple inference passes and aggregating results. Others use verification loops, where a language model proposes a solution and a second system checks its correctness. All of these techniques increase the compute spend at test time while keeping the base model fixed or even smaller.
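The best-of-N-with-verification pattern can be sketched in a few lines. The following is a toy illustration, not any model's actual API: `propose` stands in for a stochastic generator, `verify` for an independent checker, and the problem (finding a divisor of a composite number) is invented for the example.

```python
import random

def propose(n, rng):
    """Stand-in for a stochastic model: propose a candidate divisor of n."""
    return rng.randint(2, n - 1)

def verify(n, candidate):
    """Verifier: check the proposal independently of the proposer."""
    return n % candidate == 0

def best_of_n(n, budget, seed=0):
    """Spend up to `budget` inference passes; return the first verified answer."""
    rng = random.Random(seed)
    for _ in range(budget):
        candidate = propose(n, rng)
        if verify(n, candidate):
            return candidate
    return None  # budget exhausted without a verified solution
```

The base "model" never changes; only the inference budget does. A larger `budget` raises the chance of a verified answer at a linear cost in compute, which is the core trade TTC makes.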

TTC matters because it suggests a new scaling frontier: instead of doubling model parameters every year, systems can allocate compute dynamically based on problem difficulty. A simple query might need minimal reasoning; a complex math problem might benefit from 100x more compute. This enables more efficient systems—smaller, cheaper base models augmented with adaptive reasoning. It also changes how we think about AI capability: a system that reasons longer isn't necessarily smarter, but it's more thorough. However, TTC increases latency and cost per query, creating new tradeoffs between speed, accuracy, and expense.
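The adaptive-allocation idea above can be made concrete with a toy self-consistency (majority-vote) setup. Everything here is invented for illustration: `noisy_solver` stands in for a model whose per-pass accuracy reflects problem difficulty, and voting over more samples plays the role of spending more test-time compute.

```python
import random
from collections import Counter

def noisy_solver(true_answer, accuracy, rng):
    """Toy model: right with probability `accuracy`, else a nearby wrong answer."""
    if rng.random() < accuracy:
        return true_answer
    return true_answer + rng.choice([-2, -1, 1, 2])

def self_consistency(true_answer, accuracy, samples, seed=0):
    """Run `samples` independent passes and return the majority-vote answer."""
    rng = random.Random(seed)
    votes = Counter(noisy_solver(true_answer, accuracy, rng) for _ in range(samples))
    return votes.most_common(1)[0][0]

def success_rate(samples, trials=300, accuracy=0.6):
    """Fraction of trials where majority voting recovers the true answer."""
    hits = sum(self_consistency(42, accuracy, samples, seed=t) == 42
               for t in range(trials))
    return hits / trials
```

A solver that is right only 60% of the time per pass becomes far more reliable when 25 votes are aggregated, while an easy query (high per-pass accuracy) gains little from extra samples. That asymmetry is exactly what makes difficulty-based compute allocation attractive, and each extra sample adds latency and cost, which is the tradeoff noted above.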

Related

Test-Time Training (TTT)
A technique where models update their parameters during inference to improve performance.
Generality: 520

Evaluation-Time Compute
Computational resources consumed when an AI model runs inference on new data.
Generality: 627

TTFT (Test Time Fine-Tuning)
Adapting a pre-trained model's parameters on new data during inference.
Generality: 520

Inference Scaling
Improving model outputs by allocating more compute during inference rather than during training.
Generality: 812

Inference-Time Reasoning
A trained model's process of applying learned knowledge to generate outputs on new data.
Generality: 751

Thinking Tokens
Hidden reasoning tokens consumed during inference for internal step-by-step reasoning, invisible to users.
Generality: 605