Agents & Tools · advanced
Cross-Encoder
A cross-encoder takes a (query, document) pair as joint input and outputs a single relevance score. It is slower than the bi-encoders used for dense retrieval but typically much more accurate, which makes it the standard reranker architecture.
Explanation
A bi-encoder (the typical embedding model) encodes the query and each document independently into vectors and scores them by dot product. This is fast because the whole corpus can be pre-encoded and indexed. A cross-encoder feeds the query and document together into a single transformer pass and outputs one score. This is slow, since scoring N candidate documents takes N forward passes per query and nothing can be pre-indexed, but every layer can attend across the query-document boundary, which is where the accuracy gain comes from.
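The difference in call pattern can be sketched in a few lines. This is a minimal illustration, not a real model: `toy_encode` and `toy_cross_score` are hypothetical stand-ins (bag-of-words vectors and token overlap) chosen only so the example runs without any ML library. What matters is where the work happens: document vectors are built once up front for the bi-encoder, while the cross-encoder is called once per (query, document) pair.

```python
import math
from collections import Counter

def toy_encode(text):
    """Toy stand-in for a bi-encoder: an L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}

def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())

def toy_cross_score(query, doc):
    """Toy stand-in for one cross-encoder forward pass over (query, doc).

    It sees both texts at once; here it simply returns the fraction of
    query tokens that also occur in the document.
    """
    q_tokens = query.lower().split()
    d_tokens = set(doc.lower().split())
    return sum(tok in d_tokens for tok in q_tokens) / len(q_tokens)

docs = [
    "cross encoders score query document pairs jointly",
    "bi encoders embed documents independently",
    "bananas are rich in potassium",
]

# Bi-encoder pattern: document vectors are computed once, before any query arrives.
doc_vectors = [toy_encode(d) for d in docs]

def bi_encoder_scores(query):
    q_vec = toy_encode(query)  # one encoder call per query
    return [dot(q_vec, d_vec) for d_vec in doc_vectors]

def cross_encoder_scores(query):
    # One joint pass per document: N calls for N documents, nothing cached.
    return [toy_cross_score(query, d) for d in docs]

query = "cross encoders score pairs"
print(bi_encoder_scores(query))
print(cross_encoder_scores(query))
```

With a real model, the per-document cost of `cross_encoder_scores` is a full transformer forward pass, which is why it is normally applied only to a short candidate list.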
In practice, cross-encoders are used as rerankers on top of bi-encoder retrieval: the bi-encoder narrows the corpus to the top K candidates, and the cross-encoder rescores and reorders only those K. K is typically 25-100.
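A minimal sketch of that two-stage pipeline follows. The function name `retrieve_then_rerank` and both scorers are assumptions made for illustration: `cheap_overlap` plays the role of the first-stage retriever and `phrase_aware` plays the role of the cross-encoder. Neither is a real model; they only need to disagree on ordering so the effect of reranking is visible.

```python
def retrieve_then_rerank(query, corpus, first_stage_score, rerank_score, k=50, top_n=5):
    """Two-stage ranking: cheap scorer over the whole corpus, costly scorer over the top k."""
    # Stage 1: score every document with the cheap scorer and keep the top k.
    candidates = sorted(
        corpus, key=lambda doc: first_stage_score(query, doc), reverse=True
    )[:k]
    # Stage 2: rescore only those k candidates with the expensive scorer.
    reranked = sorted(
        candidates, key=lambda doc: rerank_score(query, doc), reverse=True
    )
    return reranked[:top_n]

def cheap_overlap(query, doc):
    """Hypothetical first-stage scorer: number of distinct shared tokens."""
    return len(set(query.split()) & set(doc.split()))

def phrase_aware(query, doc):
    """Hypothetical reranker: number of query bigrams that appear in the document."""
    q, d = query.split(), doc.split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(bigram in doc_bigrams for bigram in zip(q, q[1:]))

corpus = [
    "the encoder scores the cross product",
    "a cross encoder scores query document pairs",
    "cooking pasta takes ten minutes",
    "encoder cross talk in hardware",
]

results = retrieve_then_rerank(
    "cross encoder scores", corpus, cheap_overlap, phrase_aware, k=3, top_n=2
)
print(results)
```

Here the first stage cannot separate the first two documents (both share three tokens with the query), while the second stage promotes the one that matches the query as a phrase. The expensive scorer runs only `k` times, regardless of corpus size.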
BERT-base cross-encoders trained on MS MARCO are a strong default. BGE-reranker and Cohere Rerank are modern productionized versions.
Examples
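- BGE-reranker-large taking 50 (query, doc) pairs and outputting 50 scores in roughly 100 ms on a GPU.
- A ColBERT model, a "late interaction" variant that encodes query and document separately into per-token vectors and compares them only at scoring time, giving some bi-encoder speed with cross-encoder-like quality.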
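The late-interaction scoring used by ColBERT (often called MaxSim) can be sketched as follows: for each query token vector, take the highest dot product against any document token vector, then sum those maxima. The vectors below are made-up two-dimensional values for illustration only; a real model would produce them with a transformer, and the document-side vectors could be computed ahead of time.

```python
def dot(u, v):
    """Dot product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best document-token match."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Hypothetical per-token embeddings (one vector per token).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has a strong match for each query token
doc_b = [[0.5, 0.5], [0.75, 0.0]]             # only partial matches

print(maxsim_score(query_vecs, doc_a))
print(maxsim_score(query_vecs, doc_b))
```

Because the only query-time interaction is a set of dot products and a max, this is far cheaper than a joint transformer pass, while still comparing the two texts token by token.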
Frequently asked
How is Cross-Encoder related to Reranker?
Reranker names the role; cross-encoder names the architecture that usually fills it. A reranker is a second-pass scoring model that takes the top-K retrieved candidates and reorders them by joint relevance to the query. It is typically a cross-encoder, and it can substantially improve retrieval precision at relatively low cost because it scores only K candidates rather than the whole corpus.
Is Cross-Encoder considered advanced?
Cross-Encoder is generally considered advanced-level material in the AI and LLM space.