ModelTerms

Inference · intermediate

Beam Search

Beam search explores several candidate continuations in parallel, keeping the top-k partial sequences at each step. Common in translation; rare in modern LLM chat.

Explanation

At each step, beam search expands every current candidate with every possible next token, scores each resulting sequence by its cumulative log-probability, and keeps only the top beam_width of them. The output is the highest-scoring complete sequence.
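The expand, score, and prune loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: `beam_search`, `toy_model`, `TOY_PROBS`, and the `<eos>` end marker are hypothetical names made up for this example, and the "model" is a hand-written lookup table rather than a real network.

```python
import math
from typing import Callable, Dict, List, Tuple

Tokens = Tuple[str, ...]

def beam_search(
    next_token_logprobs: Callable[[Tokens], Dict[str, float]],
    beam_width: int,
    max_len: int,
    eos: str = "<eos>",
) -> Tuple[Tokens, float]:
    """Return (sequence, cumulative log-probability) of the best hypothesis."""
    beams: List[Tuple[Tokens, float]] = [((), 0.0)]  # start from the empty prefix
    finished: List[Tuple[Tokens, float]] = []

    for _ in range(max_len):
        # Expand: every live candidate is extended by every possible next token.
        candidates = [
            (seq + (token,), score + logp)
            for seq, score in beams
            for token, logp in next_token_logprobs(seq).items()
        ]
        if not candidates:
            break
        # Score and prune: keep the top beam_width by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))  # complete sequence, stop extending it
            else:
                beams.append((seq, score))
        if not beams:
            break

    # Prefer complete sequences; fall back to partial ones if max_len was hit first.
    pool = finished or beams
    return max(pool, key=lambda c: c[1])

# Hypothetical toy distribution: prefix -> next-token probabilities.
TOY_PROBS = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}

def toy_model(prefix: Tokens) -> Dict[str, float]:
    # Any prefix not in the table is assumed to end the sequence.
    probs = TOY_PROBS.get(prefix, {"<eos>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}

best_seq, best_score = beam_search(toy_model, beam_width=2, max_len=5)
print(best_seq, round(math.exp(best_score), 2))  # ('b', 'x', '<eos>') 0.36
```

With a beam width of 2 the search keeps the "b" prefix alive after the first step, which lets it find the sequence with probability 0.36. Real decoders usually add refinements such as length normalization, which this sketch leaves out.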

Strong for sequence-to-sequence tasks with a clear correct answer, such as machine translation or summarization evaluation. Rarely used in open-ended chat: it tends to produce safe, generic completions, and because it scores beam_width candidates at every step, it costs roughly beam_width times as much compute as sampling a single sequence.

Examples

  • Translation systems with beam width 4-10.
  • Decoding for ASR (speech-to-text).

Frequently asked

How is Beam Search related to Greedy Decoding?

Both are decoding strategies used at inference time. Greedy decoding always picks the single highest-probability next token, which makes it equivalent to beam search with a beam width of 1. It is deterministic, fast, and often dull.
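A small worked case shows what greedy decoding gives up by committing to one token at a time. The sketch below uses a hypothetical two-step probability table (`TOY_PROBS`) and two illustrative helpers, `greedy_decode` and `best_sequence`; none of these names come from a real library.

```python
import math
from typing import Dict, Tuple

Tokens = Tuple[str, ...]
Table = Dict[Tokens, Dict[str, float]]

# Hypothetical toy distribution: prefix -> next-token probabilities.
TOY_PROBS: Table = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}

def greedy_decode(table: Table, steps: int) -> Tuple[Tokens, float]:
    """Pick the single most probable next token at every step."""
    seq: Tokens = ()
    prob = 1.0
    for _ in range(steps):
        dist = table[seq]
        token = max(dist, key=dist.get)
        prob *= dist[token]
        seq += (token,)
    return seq, prob

def best_sequence(table: Table, steps: int) -> Tuple[Tokens, float]:
    """Exhaustively score every sequence (only feasible for tiny examples)."""
    frontier = [((), 1.0)]
    for _ in range(steps):
        frontier = [
            (seq + (tok,), p * q)
            for seq, p in frontier
            for tok, q in table[seq].items()
        ]
    return max(frontier, key=lambda c: c[1])

greedy_seq, greedy_prob = greedy_decode(TOY_PROBS, steps=2)
optimal_seq, optimal_prob = best_sequence(TOY_PROBS, steps=2)
print(greedy_seq, round(greedy_prob, 2))    # ('a', 'x') 0.33
print(optimal_seq, round(optimal_prob, 2))  # ('b', 'x') 0.36
```

Greedy commits to "a" because it is the likelier first token, and so never sees that "b" leads to a more probable overall sequence. A beam width of 2 would keep both prefixes and recover the better one in this table.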

Is Beam Search considered intermediate?

Beam Search is generally considered intermediate-level material in the AI and LLM space.
