Architecture · intermediate
Decoder (decoder-only)
A decoder is a transformer module that generates a sequence one token at a time, using causal self-attention so that each position attends only to itself and earlier positions. GPT-style LLMs are decoder-only.
Explanation
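In a decoder, position t can attend only to positions up to and including t, never to later ones. The output at position t is trained to predict the token at position t + 1, so this causal constraint keeps the model from seeing the token it is asked to predict. That is what makes next-token prediction well-defined, and it is also what lets the model generate text autoregressively at inference time: each newly produced token is appended to the input and the model predicts the next one.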
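The causal constraint above can be sketched as a mask applied to attention scores before the softmax. This is a minimal pure-Python illustration, not how any particular library implements it; the function names `causal_mask` and `masked_softmax` and the uniform toy scores are assumptions made for the example.

```python
import math

def causal_mask(n):
    """Return an n x n matrix where entry [t][s] is True if position t may attend to position s."""
    return [[s <= t for s in range(n)] for t in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores, giving zero weight to masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        # Disallowed (future) positions are set to -inf, so exp() maps them to 0.
        masked = [x if ok else -math.inf for x, ok in zip(row, allowed)]
        peak = max(masked)  # finite, because each position may always attend to itself
        exps = [math.exp(x - peak) for x in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy example: 4 positions with identical raw scores (assumed values).
scores = [[0.0] * 4 for _ in range(4)]
attn = masked_softmax(scores, causal_mask(4))
for row in attn:
    print([round(w, 2) for w in row])
```

With identical scores, each row spreads its weight evenly over the positions it is allowed to see, and every entry above the diagonal is zero.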
Most modern LLMs (GPT-4, Claude, Llama, Mistral, Gemini) are decoder-only, or are generally understood to be where the architecture has not been published in full. They drop the encoder entirely and just stack decoder layers. This simple design scales well, and the same stack both reads the prompt and generates the continuation.
Examples
- GPT-4 generating a paragraph token by token.
- Claude continuing a conversation in the same decoder loop.
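The token-by-token loop in these examples can be sketched as greedy autoregressive decoding. In the sketch below, `toy_logits` is a hypothetical stand-in for a real decoder forward pass (it simply favors the previous token id plus one), and `greedy_generate`, `vocab_size`, and `eos_id` are illustrative names rather than any model's actual API. Production systems typically sample from the distribution rather than always taking the top token.

```python
def toy_logits(tokens, vocab_size=5):
    """Hypothetical stand-in for a decoder forward pass.

    Returns one score per vocabulary id, favoring (last token + 1) mod vocab_size.
    """
    favored = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == favored else 0.0 for i in range(vocab_size)]

def greedy_generate(prompt, max_new_tokens, eos_id=None):
    """Append the highest-scoring next token until the budget or eos_id is reached."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = toy_logits(tokens)  # the model only ever sees tokens produced so far
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)       # the new token becomes part of the next input
        if next_id == eos_id:
            break
    return tokens

print(greedy_generate([0, 1], max_new_tokens=4))  # prints [0, 1, 2, 3, 4, 0]
```

The same loop handles both the prompt and the continuation: the prompt is just the initial contents of `tokens`.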
Frequently asked
What is a decoder?
A decoder is a transformer module that generates a sequence one token at a time, using causal self-attention so that each position attends only to itself and earlier positions. GPT-style LLMs are decoder-only.
What is an example of a decoder?
GPT-4 generating a paragraph token by token.
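How is a decoder related to an encoder?
Both are transformer modules built from attention layers. An encoder reads an input sequence and produces a contextualized representation: a vector per token that captures meaning in context, with every token able to attend to the whole input. A decoder instead uses causal attention, so each position sees only itself and earlier positions, which is what allows it to generate text one token at a time.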
Is the decoder considered an intermediate topic?
The decoder is generally considered intermediate-level material in the AI and LLM space.