Inference · beginner

Greedy Decoding

Greedy decoding always picks the single highest-probability next token. It is deterministic, fast, and often dull.

Published May 29, 2026

Explanation

Greedy decoding is the simplest decoding strategy: at every step it takes the single most likely token from the model's output distribution, with no randomness involved. For tasks with one right answer (extraction, classification, code with a known answer) it is usually fine and gives reproducible output. For creative or open-ended generation, it tends to produce repetitive, generic text.
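A minimal sketch of the loop described above, assuming a toy lookup table in place of a real model. The names NEXT_TOKEN_PROBS and greedy_decode and all probabilities are invented for illustration; a real model would compute the distribution from the full context, not just the previous token.

```python
# Hypothetical toy "model": maps the previous token to a next-token distribution.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.55, "dog": 0.45},
    "a": {"dog": 0.7, "cat": 0.3},
    "cat": {"sat": 0.9, "</s>": 0.1},
    "dog": {"sat": 0.6, "</s>": 0.4},
    "sat": {"</s>": 1.0},
}


def greedy_decode(start="<s>", max_steps=10):
    """Repeatedly append the single highest-probability next token."""
    tokens = [start]
    for _ in range(max_steps):
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        # The greedy step: take the argmax, never sample.
        next_token = max(dist, key=dist.get)
        tokens.append(next_token)
        if next_token == "</s>":
            break
    return tokens


print(greedy_decode())  # ['<s>', 'the', 'cat', 'sat', '</s>']
```

Because nothing in the loop is random, calling greedy_decode() again returns exactly the same sequence.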

Greedy decoding is equivalent to setting temperature=0 (or top-k=1) in most API frameworks. Use it for evals and whenever you need deterministic output.
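The equivalence with temperature=0 and top-k=1 can be checked numerically. The sketch below is an illustration under assumptions, not any framework's implementation: the logits are made up, and softmax_with_temperature and sample_top_k are hypothetical helper names. Dividing by a temperature of exactly 0 would fail in this code, so it only approaches 0 from above.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature (temperature must be > 0 here)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k, rng):
    """Keep the k highest-logit tokens, renormalize, then sample one index."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax_with_temperature([logits[i] for i in top], 1.0)
    return rng.choices(top, weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5]  # made-up scores for three tokens
greedy_index = max(range(len(logits)), key=lambda i: logits[i])

# As temperature shrinks, probability mass piles onto the argmax token.
for t in (1.0, 0.5, 0.1, 0.01):
    print(t, [round(p, 4) for p in softmax_with_temperature(logits, t)])

# With k=1 only the argmax token survives truncation, so "sampling" is greedy.
rng = random.Random(0)
print(all(sample_top_k(logits, 1, rng) == greedy_index for _ in range(100)))
```

In both cases the distribution collapses onto one token, which is why either setting reproduces greedy output.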

Examples

Asking a model "What is 2+2?" — greedy is fine.
Setting temperature=0 in a code-generation API call.

Frequently asked

How is Greedy Decoding related to Temperature?

Temperature is a generation parameter that controls randomness, and greedy decoding is what you get at its lowest setting: at temperature 0 the most likely token is always picked, while higher values produce more diverse, surprising output.

Temperature · Inference

Temperature is a generation parameter that controls randomness. 0 is deterministic (always pick the most likely token); higher values produce more diverse, surprising output.

Sampling · Inference

Sampling is the act of choosing the next token from the model's output distribution, typically after applying temperature and a truncation strategy like top-p or top-k.

Beam Search · Inference

Beam search explores several candidate continuations in parallel, keeping the top-k partial sequences at each step. Common in translation; rare in modern LLM chat.
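To show how keeping several partial sequences differs from greedy decoding, here is a small sketch over a made-up next-token table. The table, the probabilities, and the beam_search helper are all hypothetical; a beam width of 1 reduces to greedy decoding.

```python
import math

# Hypothetical toy "model": previous token -> next-token distribution.
NEXT_TOKEN_PROBS = {
    "<s>": {"A": 0.6, "B": 0.4},
    "A": {"x": 0.55, "y": 0.45},
    "B": {"z": 0.9, "w": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
    "w": {"</s>": 1.0},
}


def beam_search(beam_width, max_steps=5):
    """Return (log_probability, tokens) of the best sequence found."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_steps):
        candidates = []
        for logp, tokens in beams:
            if tokens[-1] == "</s>":
                candidates.append((logp, tokens))  # finished: carry forward
                continue
            for tok, p in NEXT_TOKEN_PROBS[tokens[-1]].items():
                candidates.append((logp + math.log(p), tokens + [tok]))
        # Keep only the beam_width most probable partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
        if all(tokens[-1] == "</s>" for _, tokens in beams):
            break
    return beams[0]


greedy_logp, greedy_tokens = beam_search(beam_width=1)
beam_logp, beam_tokens = beam_search(beam_width=2)
print(greedy_tokens, round(math.exp(greedy_logp), 2))  # ['<s>', 'A', 'x', '</s>'] 0.33
print(beam_tokens, round(math.exp(beam_logp), 2))      # ['<s>', 'B', 'z', '</s>'] 0.36
```

Greedy commits to "A" because it is the most likely first token, and so never sees that "B" leads to a more probable sequence overall. A beam of width 2 keeps both options alive long enough to find it.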

Top-k · Inference

Top-k restricts token sampling to the k highest-probability tokens, then samples from that set. A simpler alternative to top-p.

Sources

Hugging Face — Generation strategies