Agents & Tools · intermediate
Hybrid Search (hybrid retrieval)
Hybrid search combines vector (semantic) and keyword (BM25) retrieval and fuses their results — usually via Reciprocal Rank Fusion — to get the best of both: semantic recall and exact-match precision.
Explanation
Pure vector search finds semantically similar passages but can miss exact keyword matches (acronyms, product names, error codes). Pure keyword search nails exact matches but misses paraphrases: a query for "affordable laptop" shares no terms with a passage about a "cheap notebook". Hybrid runs both and fuses the rankings.
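To make the keyword-side failure concrete, here is a minimal sketch that counts shared terms between a query and a passage. It is a crude stand-in for lexical matching, not BM25 itself; the helper name `shared_terms` and the sample passage text are assumptions made for illustration.

```python
def shared_terms(query: str, passage: str) -> set[str]:
    """Return the lowercase whitespace-separated terms that appear in both strings."""
    return set(query.lower().split()) & set(passage.lower().split())


# A paraphrase has no terms in common, so a purely lexical retriever has nothing to match on.
paraphrase = shared_terms("affordable laptop", "cheap notebook")

# An exact identifier matches literally, however rare the token is.
exact = shared_terms(
    "error ECONNREFUSED 5432",
    "Postgres refused the connection: ECONNREFUSED on port 5432",
)

print(paraphrase)
print(sorted(exact))
```

The first result is empty and the second contains the error code and port number, which is the gap a vector retriever fills in the first case and a keyword retriever fills in the second.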
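The dominant fusion method is Reciprocal Rank Fusion (RRF): for each document, sum 1/(k + rank) over every retriever that returned it, where rank is the document's 1-based position in that retriever's list and k is a smoothing constant (typically 60); then sort by the sum, highest first. A document missing from one list simply contributes nothing for that retriever. Other approaches include weighted score combinations.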
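The RRF rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any particular database's implementation: the function name `rrf_fuse`, the document IDs, and the alphabetical tie-break are all hypothetical choices.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each ranking is ordered best-first. A document absent from a
    ranking contributes nothing for that retriever.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic (assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([vector_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF uses only ranks, it needs no normalization between cosine similarities and BM25 scores, which live on different scales. Documents that both retrievers return (here `doc_a` and `doc_c`) end up above documents that only one retriever found.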
Hybrid retrieval is a common default in production RAG. Several vector databases (Weaviate, Qdrant, Pinecone) support it directly; on PostgreSQL, you can build it yourself by combining pgvector with the built-in full-text search (tsvector).
Examples
- A code-search RAG: vector finds "the file that does X" semantically, BM25 finds files containing the exact function name; hybrid catches both.
- A support bot: hybrid handles both "how do I cancel my subscription?" (semantic) and "error ECONNREFUSED 5432" (exact).
When to use hybrid search
In almost any production RAG system; the gains cost little extra once your vector DB supports hybrid queries.
Frequently asked
What is Hybrid Search?
Hybrid search combines vector (semantic) and keyword (BM25) retrieval and fuses their results — usually via Reciprocal Rank Fusion — to get the best of both: semantic recall and exact-match precision.
What is an example of hybrid search?
A code-search RAG: vector finds "the file that does X" semantically, BM25 finds files containing the exact function name; hybrid catches both.
How is Hybrid Search related to Retrieval-Augmented Generation?
Hybrid Search and Retrieval-Augmented Generation are both Agents & Tools concepts, and hybrid search is a strategy for the retrieval step of RAG. RAG retrieves relevant documents from a corpus at query time and includes them in the prompt, letting an LLM answer with up-to-date, source-cited, private information without retraining; hybrid search is one way to decide which documents get retrieved.
When should I use hybrid search?
In almost any production RAG system; the gains cost little extra once your vector DB supports hybrid queries.
Is Hybrid Search considered intermediate?
Hybrid Search is generally considered intermediate-level material in the AI and LLM space.