Chunk stride

Bottom line
Chunk stride is the step size between consecutive chunks. When stride is smaller than chunk size, neighboring chunks overlap, so answers that fall on a boundary still appear in full inside at least one retrieved chunk. Most RAG stacks use a stride that yields 10 to 20 percent overlap.

What chunk stride actually is

Stride is the step size your splitter takes between the start of one chunk and the start of the next. If your chunk size is 500 tokens and your stride is 500, chunks butt up against each other with zero overlap. If your stride is 400, every chunk shares its last 100 tokens with the next chunk's first 100 tokens. That shared tail is the overlap, and the relationship is mechanical: overlap equals chunk size minus stride.
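The sliding-window behavior described above can be sketched in a few lines. This is a minimal illustration, not any particular library's splitter: the function name `chunk_with_stride` and the placeholder token list are assumptions made for the example.

```python
from typing import List, Sequence


def chunk_with_stride(tokens: Sequence[str], chunk_size: int, stride: int) -> List[List[str]]:
    """Slide a fixed-size window over `tokens`, advancing `stride` tokens per step."""
    if chunk_size <= 0 or not 0 < stride <= chunk_size:
        raise ValueError("need chunk_size > 0 and 0 < stride <= chunk_size")
    if not tokens:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the current window reaches the end of the document.
        if start + chunk_size >= len(tokens):
            break
        start += stride
    return chunks


# Hypothetical 1,200-token document; real tokens would come from a tokenizer.
tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_with_stride(tokens, chunk_size=500, stride=400)

# The last 100 tokens of one chunk are the first 100 of the next:
# overlap = chunk_size - stride = 500 - 400 = 100.
shared = chunks[0][-100:] == chunks[1][:100]
```

Setting `stride=500` in the same call produces chunks that meet end to end with nothing shared.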

Most RAG tutorials skip this distinction. They tell you to set a chunk size, set an overlap, and move on. The word "stride" rarely appears, even though it is the variable that actually controls the splitter's cursor. Calling it stride matters because it reframes the question: you are not adding overlap on top of chunks, you are choosing how far the window slides each step. A smaller stride means more chunks, more embeddings, more index storage, and more recall safety. A larger stride means fewer chunks, less cost, and a higher risk that a key sentence lands on a seam.

This page is about the forgotten knob. Document chunking decides how to slice; stride decides how much the slices share.

The overlap formula: stride < chunk_size

The arithmetic is simple. With chunk size C and stride S where S is less than or equal to C:

  • Overlap = C - S
  • Overlap ratio = (C - S) / C
  • Chunk count for a document of length L is approximately (L - C) / S + 1

A worked example. You have a 5,000-token article. Pick C = 500 and S = 500 (no overlap) and you get 10 chunks. Pick C = 500 and S = 400 (100-token overlap, 20 percent) and you get 12 full chunks plus a short final chunk for the tail. On longer documents the ratio settles at C / S = 1.25, so the 20 percent overlap costs you about 25 percent more chunks to embed, store, and search.
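The same arithmetic as a small helper, for checking other chunk sizes and strides. The function name `chunk_stats` is an assumption for this sketch, and it assumes a non-empty document. It reports full windows separately from the total that includes a trailing partial chunk, which is why the total can come out one higher than the approximate formula.

```python
import math


def chunk_stats(doc_len: int, chunk_size: int, stride: int) -> dict:
    """Overlap and chunk counts for a sliding window (assumes doc_len > 0)."""
    if not 0 < stride <= chunk_size:
        raise ValueError("need 0 < stride <= chunk_size")
    overlap = chunk_size - stride
    remainder = max(doc_len - chunk_size, 0)
    return {
        "overlap": overlap,
        "overlap_ratio": overlap / chunk_size,
        # Windows that are completely filled with tokens.
        "full_chunks": remainder // stride + 1 if doc_len >= chunk_size else 0,
        # Chunks needed to cover every token, including a short final one.
        "total_chunks": math.ceil(remainder / stride) + 1,
    }


no_overlap = chunk_stats(5000, chunk_size=500, stride=500)
with_overlap = chunk_stats(5000, chunk_size=500, stride=400)
```

For the 5,000-token article, `no_overlap` reports 10 chunks, while `with_overlap` reports 12 full chunks and 13 in total once the tail is covered.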

The common rule of thumb is to keep overlap between 10 and 20 percent of chunk size, which gives a stride of 0.8 to 0.9 times the chunk size. LangChain users typically set chunk_size around 500 to 1000 and chunk_overlap around 50 to 200, which usually lands in or near that band. Note that LangChain's character-based splitters measure both values in characters by default, not tokens. The rule of thumb is not magic. It is a defensible starting point that trades a modest indexing tax for resilience against boundary loss.

Why those numbers and not, say, 50 percent overlap? Because the marginal benefit drops fast. The first 10 percent of overlap typically catches most sentences that cross a chunk seam. Pushing to 30 or 50 percent mostly duplicates content that was already retrievable, while inflating the index. Cost grows at least linearly with overlap: chunk count scales with C / S, so 50 percent overlap doubles the index. Recall does not keep pace.

Why stride matters for AI chatbots

Imagine a paragraph in your help center that explains your refund policy. Your splitter is configured for 500-token chunks with zero overlap, and a chunk boundary happens to fall right in the middle of the explanation. Now chunk N contains "If a customer requests a refund within 30 days," and chunk N+1 contains "we issue store credit for digital goods and a full refund for physical goods." A user asks, "What happens if I want a refund after 31 days?" The retriever ranks chunk N highly because it contains "refund within 30 days," and returns it alone. The chatbot sees half the policy. It hallucinates the rest, or worse, gives the wrong answer with full confidence.

Set stride to 400 instead. Now a 100-token overlap spans the seam, so any sentence shorter than the overlap appears whole in at least one chunk, and that chunk retrieves cleanly. The answer is grounded. This is the entire case for overlap, and it is invisible until you see it fail.
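A toy check of the seam problem, working on token offsets only. Everything here is hypothetical: the helper names, the 1,000-token article, and the sentence positions are chosen to make the boundary effect visible, not taken from a real corpus.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def chunk_spans(doc_len: int, chunk_size: int, stride: int) -> List[Span]:
    """Offsets of each chunk produced by a sliding window over doc_len tokens."""
    spans = []
    start = 0
    while True:
        spans.append((start, min(start + chunk_size, doc_len)))
        if start + chunk_size >= doc_len:
            break
        start += stride
    return spans


def covering_chunks(sentence: Span, spans: List[Span]) -> List[int]:
    """Indices of chunks that contain the whole sentence."""
    s, e = sentence
    return [i for i, (a, b) in enumerate(spans) if a <= s and e <= b]


# Hypothetical policy sentence at tokens 480..540, straddling the 500 boundary.
sentence = (480, 540)
without_overlap = covering_chunks(sentence, chunk_spans(1000, 500, 500))
with_overlap = covering_chunks(sentence, chunk_spans(1000, 500, 400))

# A 140-token sentence is longer than the 100-token overlap and can still be cut.
long_sentence = (380, 520)
still_split = covering_chunks(long_sentence, chunk_spans(1000, 500, 400))
```

With stride 500 no chunk holds the full sentence; with stride 400 the chunk starting at token 400 does. The last case shows the limit of the technique: overlap only protects units shorter than the overlap itself.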

The same dynamic shows up in technical documentation, where a code example might be split from the prose that describes it, or in narrative case studies, where the conclusion sits two paragraphs after the setup. Retrieval-augmented generation pipelines that ignore stride end up brittle in exactly these cases. ChatRaj's default stride preserves roughly 10 percent overlap on free-form website content, which catches most boundary cases without bloating the index.

When to overlap and when to skip

Overlap is not free. A 20 percent overlap means about 25 percent more chunks than a zero-overlap split, because chunk count scales with chunk size divided by stride: 500 / 400 = 1.25. That translates to roughly 25 percent more embedding API calls during ingestion, 25 percent more vector storage, and 25 percent more candidates for the retriever to score. For a small site this is a rounding error. For a 100,000-page documentation corpus it is a real bill.
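To see how the indexing tax scales with overlap, the sketch below tabulates the extra chunk fraction for a few overlap percentages. The helper name `extra_chunk_fraction` is an assumption, and the figures are the long-document approximation (chunk size divided by stride), ignoring edge effects at the end of each document.

```python
def extra_chunk_fraction(chunk_size: int, stride: int) -> float:
    """Approximate extra chunks relative to a zero-overlap split: C / S - 1."""
    if not 0 < stride <= chunk_size:
        raise ValueError("need 0 < stride <= chunk_size")
    return chunk_size / stride - 1.0


chunk_size = 500
tax_by_overlap = {}
for overlap_pct in (0, 10, 20, 30, 50):
    stride = chunk_size * (100 - overlap_pct) // 100
    tax_by_overlap[overlap_pct] = extra_chunk_fraction(chunk_size, stride)

for overlap_pct, tax in tax_by_overlap.items():
    print(f"{overlap_pct:>2}% overlap -> about {tax:.0%} more chunks")
```

The tax grows faster than the overlap percentage itself: 20 percent overlap costs about 25 percent more chunks, and 50 percent overlap doubles the chunk count.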

Use overlap for:

  • Free-form prose with long sentences and multi-clause arguments
  • Technical explanations where a concept spans a few paragraphs
  • Narrative content like case studies or blog posts
  • Help articles where the question and answer might land on different sides of a boundary

Skip or reduce overlap for:

  • FAQ pages where each question and answer pair is self-contained
  • Product catalog rows, where each row is an independent record
  • Table cells, which are already atomic units
  • Structured logs or JSON where each line is a complete record

A 2026 systematic study using SPLADE retrieval on Natural Questions found that overlap added no measurable recall benefit on that corpus, only indexing cost. The lesson is not "skip overlap." The lesson is that stride is a corpus-dependent choice. Run a retrieval eval. If your recall at top-5 is identical with and without overlap, save the money. If it drops two points without overlap, the 10 percent tax is worth it.
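A retrieval eval of this kind can be as small as the sketch below. It computes a hit-rate style recall at k: the fraction of queries whose top-k results include at least one chunk labeled relevant. The function name, the query ids, and the chunk ids are all made up for illustration; real inputs would come from your own retriever and labeled questions, with one labeled set per index since chunk ids differ between splits.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: Dict[str, List[str]],
                relevant: Dict[str, Set[str]],
                k: int = 5) -> float:
    """Fraction of queries with at least one relevant chunk in the top k."""
    if not retrieved:
        return 0.0
    hits = sum(
        1 for query, ranked in retrieved.items()
        if relevant.get(query, set()) & set(ranked[:k])
    )
    return hits / len(retrieved)


# Toy, hand-written rankings (not measured results) for two index builds.
retrieved_no_overlap = {"q1": ["a", "b"], "q2": ["c", "d"]}
relevant_no_overlap = {"q1": {"a"}, "q2": {"z"}}

retrieved_overlap = {"q1": ["a1", "b1"], "q2": ["z1", "c1"]}
relevant_overlap = {"q1": {"a1"}, "q2": {"z1"}}

baseline = recall_at_k(retrieved_no_overlap, relevant_no_overlap, k=5)
candidate = recall_at_k(retrieved_overlap, relevant_overlap, k=5)
gain = candidate - baseline  # weigh this against the extra indexing cost
```

If `gain` is zero on a representative question set, the zero-overlap index is the cheaper choice; if it is meaningfully positive, the overlap is earning its cost.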

For most public website content, the answer is yes, overlap. For tightly structured catalogs, the answer is often no. ChatRaj makes the default sensible and the override available, because there is no universal right answer here, only the right answer for your corpus.

FAQ

Common chunk stride questions

A good starting overlap is ten to twenty percent of the chunk size. For 500-token chunks that is 50 to 100 tokens of overlap, which corresponds to a stride of 400 to 450.
