ModelTerms

Inference · beginner

Greedy Decoding

Greedy decoding always picks the single highest-probability next token. It is deterministic, fast, and often dull.

Explanation

Greedy decoding is the simplest decoding strategy: at every step it takes the token with the highest probability, with no sampling involved. For factual tasks (extraction, classification, code with a known answer) it is usually fine and gives reproducible output. For creative or open-ended generation, it tends to produce repetitive, generic text.
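To make the step-by-step behavior concrete, here is a minimal sketch of a greedy decoding loop. The TOY_MODEL table, its probabilities, and the greedy_decode helper are all made up for illustration; a real language model computes next-token probabilities from the whole context rather than looking them up from the previous token alone.

```python
# Hypothetical toy model: maps the previous token to next-token probabilities.
# The numbers are invented for illustration only.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"dog": 0.7, "cat": 0.3},
    "cat": {"sat": 0.9, "</s>": 0.1},
    "dog": {"sat": 0.4, "</s>": 0.6},
    "sat": {"</s>": 1.0},
}


def greedy_decode(model, start="<s>", end="</s>", max_steps=10):
    """Repeatedly pick the single most probable next token until `end`."""
    tokens = []
    prev = start
    for _ in range(max_steps):
        probs = model[prev]
        # The greedy step: take the argmax, never sample.
        next_token = max(probs, key=probs.get)
        if next_token == end:
            break
        tokens.append(next_token)
        prev = next_token
    return tokens


print(greedy_decode(TOY_MODEL))  # ['the', 'cat', 'sat']
```

Because there is no random choice anywhere in the loop, calling greedy_decode twice on the same table returns the same sequence, which is the reproducibility property described above.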

In most APIs and frameworks, greedy decoding is equivalent to setting temperature=0 (or top-k=1). Use it for evals and whenever you need deterministic output.
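The link to temperature can be illustrated with a small sketch, assuming the common formulation in which logits are divided by the temperature before a softmax. The function names and the example logits below are hypothetical. As the temperature shrinks toward 0, the distribution concentrates on the highest-scoring token, which is the token greedy decoding would pick.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature (temperature must be > 0)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Index of the highest logit, i.e. the greedy choice."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.5, 0.1, 0.01):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])

print("greedy choice:", greedy_pick(logits))
```

This sketch does not accept temperature=0 itself, since that would divide by zero. The argmax in greedy_pick is the limiting case, and restricting sampling to the top 1 token (top-k=1) selects the same index.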

Examples

  • Asking a model "What is 2+2?" — greedy is fine.
  • Setting temperature=0 in a code-generation API call.

Frequently asked

How is Greedy Decoding related to Temperature?

Greedy Decoding and Temperature are both inference concepts. Temperature is a generation parameter that controls randomness. A temperature of 0 is deterministic (always pick the most likely token), which is exactly greedy decoding; higher values produce more diverse, surprising output.

Is Greedy Decoding considered beginner?

Greedy Decoding is generally considered beginner-level material in the AI and LLM space.
