Prompting · intermediate
Test-Time Compute (inference-time compute)
Test-time compute is the extra reasoning, sampling, or search a model can do at inference time to improve quality — more thinking tokens, more candidate answers, or verifier-guided search.
Explanation
Pre-2024 LLMs mostly answered directly, so the compute spent on a response was set by the length of the answer itself. Reasoning models (o1, R1) generate many "thinking" tokens before the user-visible answer. Beyond longer chains of thought (CoT), test-time compute can also mean best-of-N sampling, tree search over candidate steps, verifier-guided reranking, and self-consistency voting.
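Self-consistency voting, the last technique listed above, is the simplest to sketch: sample several answers independently and keep the most common one. The code below is a minimal illustration, not a reference implementation. The `sample_answer` callable is a hypothetical stand-in for a model call that returns a final answer string, and the canned answers only simulate sampling.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(sample_answer: Callable[[], str], n: int) -> Tuple[str, float]:
    """Draw n answers and return the most common one with its vote share.

    sample_answer is a hypothetical stand-in for one sampled model call
    that returns only the final answer (not the reasoning).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    votes = Counter(sample_answer() for _ in range(n))
    answer, count = votes.most_common(1)[0]
    return answer, count / n


# Simulated samples: three of five reasoning paths agree on "42".
canned = iter(["42", "41", "42", "42", "17"])
answer, share = self_consistency(lambda: next(canned), n=5)
print(answer, share)  # 42 0.6
```

The vote share doubles as a rough confidence signal: a low share suggests the samples disagree and the task may need more samples or a different method.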
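The empirical finding driving this: on reasoning-heavy problems, a fixed model often gets more accurate as it is given more test-time compute, and in some reported settings that extra inference compute is a better use of budget than training a larger model. This makes test-time compute a third scaling axis alongside parameters and data. The gains are not uniform: they depend on how difficult the problem is for the base model and on how reliable the verifier or selection step is.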
Practically, more test-time compute means longer latency and higher per-call cost, so it is reserved for harder tasks.
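A back-of-the-envelope calculation makes the trade-off concrete. The sketch below assumes each sample is independently correct with probability `p` and that a perfect verifier picks a correct sample whenever one exists. Both are idealizations, so the numbers are an upper bound rather than a prediction. Cost is taken to grow linearly with the number of samples.

```python
def coverage(p: float, n: int) -> float:
    """Chance that at least one of n independent samples is correct.

    Assumes independent samples with per-sample success rate p; real
    samples from one model are correlated, so this is optimistic.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - (1.0 - p) ** n


# Each doubling of n doubles the cost but adds less coverage than the last.
for n in (1, 2, 4, 8, 16, 32):
    print(f"samples={n:>2}  relative cost={n:>2}x  coverage={coverage(0.2, n):.3f}")
```

The diminishing returns in the last column are one reason extra samples are usually spent only where a single attempt is unlikely to be enough.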
Examples
- o1 thinking for 30 seconds before answering a math olympiad problem.
- Best-of-32 sampling on a coding task, then choosing the highest-scored answer.
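The best-of-N example above can be sketched in a few lines. This is an illustration under stated assumptions: `generate` and `score` are hypothetical callables standing in for a sampled model call and a scorer (for a coding task, the scorer might be the fraction of unit tests passed). The stub scorer here is only a placeholder.

```python
from typing import Callable, Tuple


def best_of_n(
    generate: Callable[[], str],
    score: Callable[[str], float],
    n: int,
) -> Tuple[str, float]:
    """Sample n candidates and return the highest-scoring one with its score.

    generate and score are hypothetical stand-ins for a model call and a
    verifier or reward function. Ties go to the earliest candidate.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate() for _ in range(n)]
    scored = [(score(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Stub usage: three canned "solutions" scored by a placeholder metric (length).
canned = iter(["pass", "return x + y", "return x"])
best, best_score = best_of_n(lambda: next(canned), score=len, n=3)
print(best, best_score)  # return x + y 12
```

The quality of the result depends on the scorer as much as on N: with a weak scorer, more samples mostly add more ways to pick a wrong answer.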
When to use test-time compute
When quality matters more than latency and per-call cost: math, code, research, structured planning.
Frequently asked
How is Test-Time Compute related to Reasoning Model?
A reasoning model is one way of spending test-time compute: it uses extra inference compute to think step by step before answering. OpenAI o1/o3 and DeepSeek R1 are reasoning models, and Anthropic's extended thinking is a mode that gives Claude models the same kind of step-by-step reasoning.