Inference · beginner
Greedy Decoding
Greedy decoding always picks the single highest-probability next token. It is deterministic, fast, and often dull.
Explanation
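Greedy decoding is the simplest decoding strategy: at each step the model scores every candidate next token, and the one with the highest probability is appended to the output, with no randomness involved. For tasks with a single correct answer (extraction, classification, code with a known answer) it is usually fine and gives reproducible output. For creative or open-ended generation, it tends to produce repetitive, generic text.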
It is equivalent to setting temperature=0 (or top-k=1) in most APIs and frameworks. Use it for evals and whenever you need deterministic output.
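The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any library's implementation: the vocabulary, the `TOY_LOGITS` table, and the `next_token_logits` helper are all hypothetical stand-ins for a real model, which would condition on the full context rather than only the last token.

```python
# Hypothetical toy "model": maps the most recent token to scores (logits)
# over a tiny vocabulary. A real model would look at the whole context.
VOCAB = ["<eos>", "the", "cat", "sat"]
TOY_LOGITS = {
    "<bos>": [0.1, 2.0, 0.5, 0.2],
    "the":   [0.1, 0.2, 1.8, 0.9],
    "cat":   [0.3, 0.1, 0.2, 2.2],
    "sat":   [2.5, 0.4, 0.1, 0.3],
}


def next_token_logits(tokens):
    """Return one score per vocabulary entry for the next position."""
    return TOY_LOGITS[tokens[-1]]


def greedy_decode(max_new_tokens=10):
    """Repeatedly append the single highest-scoring token (argmax)."""
    tokens = ["<bos>"]
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        # Greedy step: pick the index with the largest score. No randomness.
        best = max(range(len(logits)), key=logits.__getitem__)
        token = VOCAB[best]
        if token == "<eos>":
            break
        tokens.append(token)
    return tokens[1:]  # drop the start marker


print(greedy_decode())
```

Because each step is a plain argmax, calling `greedy_decode()` twice on the same inputs returns the same sequence, which is the reproducibility property that makes greedy decoding useful for evals.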
Examples
- Asking a model "What is 2+2?" — greedy is fine.
- Setting temperature=0 in a code-generation API call.
Frequently asked
What is Greedy Decoding?
Greedy decoding always picks the single highest-probability next token. It is deterministic, fast, and often dull.
What is an example of greedy decoding?
Asking a model "What is 2+2?" — greedy is fine.
How is Greedy Decoding related to Temperature?
Greedy Decoding and Temperature are both inference concepts. Temperature is a generation parameter that controls randomness. A temperature of 0 is deterministic (always pick the most likely token), which is exactly greedy decoding; higher values produce more diverse, surprising output.
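To make the link concrete, here is a small sketch of temperature-scaled softmax, assuming the usual formulation in which logits are divided by the temperature before normalizing. The function name and the example logits are illustrative, not taken from any particular library. As the temperature shrinks toward 0, nearly all probability mass lands on the highest-scoring token, so sampling behaves like greedy decoding. A temperature of exactly 0 would divide by zero in this formula, which is why implementations commonly treat it as a special case meaning "take the argmax".

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as argmax")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative scores for three candidate tokens
for t in (1.0, 0.5, 0.05):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

The token with the largest logit stays the most likely at every temperature; lowering the temperature only sharpens the distribution around it.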
Is Greedy Decoding considered beginner?
Greedy Decoding is generally considered beginner-level material in the AI and LLM space.