Architecture · intermediate
Encoder
An encoder is a transformer module that reads an input sequence and produces a contextualized representation — a vector per token that captures meaning in context.
Explanation
Encoders use bidirectional self-attention: each token can attend to every other token in the input, both before and after it. That makes them well suited to understanding tasks (classification, retrieval, embeddings), where the whole input is available up front and no text has to be generated autoregressively.
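To make "each token can attend to every other token" concrete, here is a minimal sketch of single-head self-attention in plain Python. It is a toy under stated assumptions, not how any particular model is implemented: the query, key, and value projections are taken to be the identity, and the function name `self_attention` and its `causal` flag are hypothetical names chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, causal=False):
    """Toy single-head self-attention (assumption: identity Q/K/V projections).

    tokens: list of equal-length float vectors, one per token.
    causal=False: encoder-style, every token attends to all tokens.
    causal=True:  decoder-style, token i attends only to tokens 0..i.
    Returns (outputs, weights), where weights[i][j] is how much token i attends to token j.
    """
    n, d = len(tokens), len(tokens[0])
    outputs, weights = [], []
    for i, query in enumerate(tokens):
        visible = range(i + 1) if causal else range(n)
        # Scaled dot-product score between token i and each visible token j.
        scores = [sum(a * b for a, b in zip(query, tokens[j])) / math.sqrt(d)
                  for j in visible]
        row = softmax(scores)
        # Masked-out (future) positions get weight 0 so every row has length n.
        row = row + [0.0] * (n - len(row))
        weights.append(row)
        # Each output vector is the attention-weighted average of all token vectors.
        outputs.append([sum(row[j] * tokens[j][k] for j in range(n)) for k in range(d)])
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, bidir = self_attention(tokens, causal=False)
_, causal = self_attention(tokens, causal=True)

print(all(w > 0 for w in bidir[0]))  # True: the first token sees every token
print(causal[0])                     # [1.0, 0.0, 0.0]: the first token sees only itself
```

In the bidirectional case the first token's representation already mixes in information from later tokens, which is what lets an encoder build context-aware vectors from the whole input.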
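BERT is the best-known encoder-only model. Many embedding models are also encoders; OpenAI's text-embedding-3 models are usually grouped with them, although OpenAI has not published their architecture. For sentence-level tasks, encoder outputs are typically pooled into a single vector representing the whole input, for example by averaging the token vectors or by taking the vector of a special token such as BERT's [CLS].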
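Pooling can be sketched in a few lines. The example below shows mean pooling, one common choice; the helper name `mean_pool` and the toy vectors are assumptions made for illustration, and a real model would supply the per-token vectors and the padding mask.

```python
def mean_pool(token_vectors, attention_mask):
    """Average per-token vectors, skipping padding positions (mask value 0)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(vec[k] for vec in kept) / len(kept) for k in range(dim)]

# Toy encoder output: three positions, the last one is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
attention_mask = [1, 1, 0]

sentence_vector = mean_pool(token_vectors, attention_mask)
print(sentence_vector)  # [2.0, 3.0]
```

Excluding padding matters: averaging over all three positions would drag the result toward zero and make the vector depend on how much the input was padded.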
Examples
- BERT classifying a sentence as positive or negative.
- A sentence embedding model encoding a query for vector search.
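The vector search example can be sketched as a cosine-similarity ranking. This is a minimal illustration under stated assumptions: the three-dimensional vectors are made up by hand rather than produced by a real embedding model, and the document ids and the `search` helper are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=2):
    """Return the ids of the top_k documents most similar to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Hand-made stand-ins for document embeddings (not real model output).
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax":  [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.1, 0.0]  # stand-in for the encoded query

results = search(query_vec, index)
print(results)  # ['doc_cats', 'doc_dogs']
```

In a real system the same encoder would embed both the documents and the query, and an approximate nearest-neighbor index would usually replace the full scan.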
Frequently asked
What is an encoder?
An encoder is a transformer module that reads an input sequence and produces a contextualized representation — a vector per token that captures meaning in context.
What is an example of an encoder?
BERT classifying a sentence as positive or negative.
How is an encoder related to a decoder?
Encoders and decoders are both transformer architecture components. A decoder generates a sequence one token at a time, using causal self-attention so each token sees only earlier ones, whereas an encoder reads the whole input at once with bidirectional attention. GPT-style LLMs are decoder-only.
Is the encoder considered intermediate?
The encoder is generally considered intermediate-level material in the AI and LLM space.