Agents & Tools · intermediate
Recursive Chunking (recursive character splitter)
Recursive chunking splits text by trying progressively smaller separators — paragraphs, then sentences, then words — until each chunk fits the target size, preserving natural boundaries where possible.
Explanation
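LangChain's RecursiveCharacterTextSplitter popularized this pattern. The algorithm takes an ordered list of separators and tries the coarsest first: split on "\n\n" (paragraph); if any resulting piece is still too big, recurse on that piece with "\n" (line); if it is still too big, recurse with a finer separator such as ". " (sentence) or " " (word), and so on, falling back to cutting between characters when nothing else fits. Note that LangChain's default separator list is "\n\n", "\n", " ", ""; a sentence separator like ". " has to be supplied explicitly through the separators argument.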
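The recursion described above can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's implementation: the function name recursive_split and the SEPARATORS tuple are hypothetical, overlap is left out, and the separator is dropped wherever a chunk boundary falls.

```python
from typing import List, Sequence

# Assumed separator order for this sketch: paragraph, line, sentence, word, character.
SEPARATORS = ("\n\n", "\n", ". ", " ", "")


def recursive_split(text: str, chunk_size: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into chunks of at most chunk_size characters (no overlap)."""
    if len(text) <= chunk_size:
        return [text] if text else []

    # Out of separators, or reached "": cut between characters.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    if sep not in text:
        # This separator does not occur; try the next, finer one.
        return recursive_split(text, chunk_size, finer)

    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        # Greedily merge neighbouring pieces while they still fit.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too big: recurse with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "aaa bbb\n\nccc ddd eee fff"
    print(recursive_split(sample, chunk_size=10))
```

In the sample, the first paragraph fits as a single chunk, while the second is too long and gets split on spaces instead. Only the oversized piece pays the cost of the finer separator.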
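The result: chunks usually align with paragraph and sentence boundaries, and the finer separators come into play only for long, unbroken blocks of text. The splitter needs little more than a chunk size and an overlap to configure, which makes it a strong default for general-purpose RAG.
It is better than fixed-size chunking in nearly all cases, because fixed-size windows cut wherever the character count runs out, even mid-sentence. Semantic chunking can do better when documents have variable topic density, but recursive chunking is much cheaper.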
Examples
- A 5000-character article: a recursive splitter at 1000 chars with 100-char overlap → about 6 chunks, most ending on a paragraph or sentence boundary.
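The chunk count in this example follows from simple arithmetic: each chunk after the first adds at most chunk size minus overlap new characters. The helper below is a hypothetical back-of-the-envelope estimate, assuming every chunk is filled to the limit. A recursive splitter usually stops a little short of the limit to respect a boundary, so the real count can be somewhat higher.

```python
import math


def estimate_chunk_count(text_len: int, chunk_size: int, overlap: int) -> int:
    """Estimate the minimum number of chunks, assuming every chunk is full."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if text_len <= chunk_size:
        return 1
    # Each chunk after the first contributes (chunk_size - overlap) new characters.
    stride = chunk_size - overlap
    return math.ceil((text_len - overlap) / stride)


if __name__ == "__main__":
    # 5000 characters, 1000-char chunks, 100-char overlap: ceil(4900 / 900)
    print(estimate_chunk_count(5000, 1000, 100))
```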
Frequently asked
How is Recursive Chunking related to Chunking?
Recursive chunking is one strategy for chunking; both are agents & tools concepts. Chunking is the process of splitting source documents into smaller passages before embedding them for retrieval. Chunk size and boundaries strongly affect how relevant retrievals will be, and recursive chunking chooses those boundaries by preferring paragraph and sentence breaks.
Is Recursive Chunking considered intermediate?
Recursive Chunking is generally considered intermediate-level material in the AI and LLM space.