Retrieval Parent Document and Small-to-Big Indexing
Status: public · Confidence: medium (0.725) · Basis: verified_sources
## TL;DR

Parent-document retrieval indexes small chunks for recall while returning larger context units for answer synthesis.

## Core Explanation

RAG systems often face a chunk-size tradeoff. Small chunks improve matching precision but can remove surrounding context; large chunks preserve context but can dilute retrieval signals. Parent-document and sentence-window patterns split those responsibilities.

Agents should report both the indexed unit and the returned unit. If citations point to a parent document while retrieval matched a child chunk, the system needs enough metadata to show which passage actually supported the answer.

## Source-Mapped Facts

- LangChain documentation describes ParentDocumentRetriever as splitting and storing small chunks while retrieving larger parent documents. ([source](https://reference.langchain.com/python/langchain-classic/retrievers/parent_document_retriever/ParentDocumentRetriever))
- MongoDB Atlas documentation describes parent document retrieval with LangChain as returning larger source documents from smaller indexed chunks. ([source](https://www.mongodb.com/docs/atlas/ai-integrations/langchain/parent-document-retrieval/))
- LlamaIndex documentation describes a SentenceWindowNodeParser for parsing text with sentence windows. ([source](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_window/))

## Further Reading

- [LangChain ParentDocumentRetriever](https://reference.langchain.com/python/langchain-classic/retrievers/parent_document_retriever/ParentDocumentRetriever)
- [MongoDB Atlas Parent Document Retrieval](https://www.mongodb.com/docs/atlas/ai-integrations/langchain/parent-document-retrieval/)
- [LlamaIndex Sentence Window Node Parser](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_window/)
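
## Illustrative Sketch

The small-to-big pattern described in the Core Explanation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the API of LangChain, MongoDB Atlas, or LlamaIndex: the names `Child`, `split_children`, `tokens`, and `retrieve` are hypothetical, sentences stand in for child chunks, and a simple token-overlap score stands in for a vector index. The point is the shape of the result: matching happens on the child, the parent is returned as context, and the metadata records which passage actually matched.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    """Hypothetical indexed unit: one sentence plus a pointer to its parent."""
    parent_id: str
    start: int  # character offset of the sentence within the parent text
    text: str


def split_children(parent_id: str, text: str) -> list[Child]:
    """Split a parent document into sentence-sized child chunks."""
    children = []
    for m in re.finditer(r"[^.!?]+[.!?]?", text):
        raw = m.group()
        stripped = raw.strip()
        if stripped:
            # Skip leading whitespace so the offset points at the sentence itself.
            start = m.start() + (len(raw) - len(raw.lstrip()))
            children.append(Child(parent_id, start, stripped))
    return children


def tokens(s: str) -> set[str]:
    """Lowercased words with surrounding punctuation removed."""
    return {w.strip(".,;:!?").lower() for w in s.split()} - {""}


def retrieve(query: str, parents: dict[str, str], children: list[Child]) -> dict | None:
    """Score child chunks (assumption: token overlap), then return the parent."""
    q = tokens(query)
    best = max(children, key=lambda c: len(q & tokens(c.text)), default=None)
    if best is None or not (q & tokens(best.text)):
        return None  # no child shares any token with the query
    return {
        "indexed_unit": "child_chunk",
        "returned_unit": "parent_document",
        "parent_id": best.parent_id,
        "matched_span": (best.start, best.start + len(best.text)),
        "matched_text": best.text,
        "context": parents[best.parent_id],
    }


# Toy corpus, invented for illustration only.
parents = {
    "doc-a": "Parent retrieval indexes small chunks. Larger parents are returned for synthesis.",
    "doc-b": "Sentence windows keep neighbors of a match. Windows add local context.",
}
children = [c for pid, text in parents.items() for c in split_children(pid, text)]
hit = retrieve("which neighbors does a sentence window keep", parents, children)
```

Here `hit["context"]` carries the whole parent for answer synthesis, while `hit["matched_span"]` and `hit["matched_text"]` let a citation point at the child passage that supported the match rather than at the parent as a whole.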