Stop Chunking Like It's 2022

Session · Engineering track · Tentative

Day: Day 2 — Session Day 1
Time: 2:50pm-3:10pm
Room: Track 3
Track: Search & Retrieval

Accessible with the Engineering pass and above.

About this session

Every RAG system bets everything on a single chunk size. 500 tokens? 800? Pick wrong, and half your queries fail before they start. But here's what nobody tells you: every pick is wrong, because no single chunk size works for all queries. We ran oracle experiments across meeting transcripts, story chapters, and TV scripts. The result? Queries disagree violently on which chunk size works best, sometimes by 40 percentage points. Your "tuned" chunk size isn't a compromise; it's systematic underperformance. In this talk, we'll expose why fixed chunking fails and show you a dead-simple fix: index at multiple chunk sizes, then aggregate at retrieval time using Reciprocal Rank Fusion. No retraining. No LLM overhead. Just 1-37% better recall across benchmarks, from letting each chunk size vote with its ranks instead of forcing every query into a one-size-fits-all box. Walk away knowing exactly when your chunk size is sabotaging you, and how to stop leaving 20-40% of your retrieval performance on the table.
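For a feel of what the fix looks like, here is a minimal sketch of Reciprocal Rank Fusion over several chunk-size indexes. It is an illustration under stated assumptions, not the speakers' implementation: the `rrf_fuse` function, the document IDs, and the choice to fuse at the document level (so chunks of different sizes share a common unit) are all hypothetical, and `k=60` is simply the constant conventionally used with RRF.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse per-chunk-size rankings with Reciprocal Rank Fusion.

    rankings: dict mapping a chunk size to a list of document IDs,
              best hit first (one entry per retrieved chunk).
    Returns document IDs ordered by fused score, best first.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        seen = set()
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Assumption: if several chunks of one document appear in the
            # same ranking, only the best-ranked chunk counts.
            if doc_id in seen:
                continue
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retrieval results for one query at three chunk sizes.
rankings = {
    128: ["doc_b", "doc_a", "doc_c"],
    512: ["doc_a", "doc_b", "doc_d"],
    1024: ["doc_a", "doc_d", "doc_b"],
}
print(rrf_fuse(rankings))
```

Because RRF uses only rank positions, the similarity scores from the different indexes never need to be calibrated against each other, which is consistent with the "no retraining, no LLM overhead" claim above.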

Topics

Evals & Observability · Search & Retrieval (RAG, Deep Research, Web search)

Speakers

Yuval Belfer

Senior Developer Advocate · AI21 Labs

Niv Granot