ModelTerms

Agents & Tools · advanced

Semantic Chunking

Semantic chunking embeds each sentence and inserts a chunk boundary wherever consecutive embeddings diverge sharply — producing chunks that respect topic boundaries rather than character counts.

Explanation

Algorithm: embed every sentence, compute cosine distance between adjacent sentences, place chunk breaks where distance exceeds a threshold (e.g. the 95th percentile of all distances). Topical shifts produce big distance jumps; tightly related sentences stay together.
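The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: `semantic_chunks`, `cosine_distance`, and `percentile` are hypothetical names, the `embed` callable stands in for whatever sentence-embedding model you use, and the input is assumed to be already split into sentences.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity (0 = same direction, larger = less similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def percentile(values: List[float], pct: float) -> float:
    """Linearly interpolated percentile of a non-empty list, pct in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def semantic_chunks(
    sentences: List[str],
    embed: Callable[[str], Vector],  # assumption: any sentence-embedding function
    pct: float = 95.0,
) -> List[List[str]]:
    """Group sentences, breaking where adjacent-sentence distance exceeds the percentile."""
    if not sentences:
        return []
    if len(sentences) == 1:
        return [list(sentences)]

    vectors = [embed(s) for s in sentences]
    # distances[i] is the distance between sentence i and sentence i + 1.
    distances = [
        cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)
    ]
    threshold = percentile(distances, pct)

    chunks: List[List[str]] = []
    current = [sentences[0]]
    for sentence, dist in zip(sentences[1:], distances):
        if dist > threshold:  # sharp jump: start a new chunk before this sentence
            chunks.append(current)
            current = []
        current.append(sentence)
    chunks.append(current)
    return chunks
```

Because the threshold is a percentile of the document's own distances, it is relative: a document whose adjacent sentences are all equally similar produces no breaks at all, which is one reason a size cap is usually added on top.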

Trade-off: more compute up front (one embedding per sentence, versus none for character splitting) in exchange for retrieval that is often better on heterogeneous corpora such as long docs, mixed-topic FAQs, and transcripts.

Semantic chunking is often combined with a chunk-size cap, so that a long section with no distance jump above the threshold does not end up as a single 5000-token chunk.
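One way to apply such a cap is as a second pass over the semantic chunks. The sketch below is an assumption about how this might be done, not a prescribed method: `cap_chunk_size` and `whitespace_tokens` are hypothetical names, and counting whitespace-separated words is only a rough stand-in for a real tokenizer.

```python
from typing import Callable, List


def whitespace_tokens(text: str) -> int:
    """Rough token count; assumption: swap in your model's tokenizer for real use."""
    return len(text.split())


def cap_chunk_size(
    chunks: List[List[str]],
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_tokens,
) -> List[List[str]]:
    """Split any chunk over max_tokens at sentence boundaries.

    Semantic boundaries are preserved: sentences are never moved across
    the original chunk breaks. A single sentence longer than max_tokens
    is kept whole rather than cut mid-sentence.
    """
    capped: List[List[str]] = []
    for chunk in chunks:
        current: List[str] = []
        used = 0
        for sentence in chunk:
            n = count_tokens(sentence)
            if current and used + n > max_tokens:
                capped.append(current)  # flush before the cap is exceeded
                current, used = [], 0
            current.append(sentence)
            used += n
        if current:
            capped.append(current)
    return capped
```

Packing greedily inside each semantic chunk keeps the topic boundaries intact and only adds extra breaks where a chunk is too long to be useful.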

Examples

  • A meeting transcript: semantic chunker breaks on topic-change moments rather than arbitrary token windows.
  • A wiki where some pages are 200 words and others are 50,000: a percentile-based threshold is relative to each page's own distances, so semantic chunking handles both without per-page tuning.

When to use semantic chunking

When documents have variable topic density and recursive chunking is producing low-quality retrievals.

Frequently asked

How is Semantic Chunking related to Chunking?

Semantic chunking is one strategy for chunking. Chunking is the process of splitting source documents into smaller passages before embedding them for retrieval; semantic chunking chooses the boundaries from embedding distances between adjacent sentences rather than from a fixed size.

Is Semantic Chunking considered advanced?

Semantic Chunking is generally considered advanced-level material in the AI and LLM space.

Chunking · Agents & Tools

Chunking is the process of splitting source documents into smaller passages before embedding them for retrieval. Chunk size and boundaries control how relevant retrievals will be.

Recursive Chunking · Agents & Tools

Recursive chunking splits text by trying progressively smaller separators — paragraphs, then sentences, then words — until each chunk fits the target size, preserving natural boundaries where possible.
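For comparison, a rough sketch of the recursive approach follows. It assumes a character-based size limit and the separator order paragraph, sentence, word; `recursive_chunks` is a hypothetical name, and real splitters typically add options such as overlap that are omitted here.

```python
from typing import List, Sequence


def recursive_chunks(
    text: str,
    max_chars: int,
    separators: Sequence[str] = ("\n\n", ". ", " "),  # paragraph, sentence, word
) -> List[str]:
    """Split text on the coarsest separator that lets every chunk fit max_chars."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No natural boundary left: fall back to a hard character split.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) > max_chars:
            # This piece alone is too big: retry it with finer separators.
            chunks.extend(recursive_chunks(piece, max_chars, finer))
            current = ""
        else:
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

Note that the separator sitting exactly at a chunk boundary is dropped in this sketch, and that boundaries depend only on size and punctuation, not on meaning, which is the gap semantic chunking is meant to fill.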

Retrieval-Augmented Generation · Agents & Tools

RAG retrieves relevant documents from a corpus at query time and includes them in the prompt, letting an LLM answer with up-to-date, source-cited, private information without retraining.

Embedding · Architecture

An embedding is a list of numbers (a vector) that represents a piece of input — a word, a sentence, an image — in a space where similar things end up close together.
