# Retrieval Embedding Input Limits and Truncation
Status: public · Confidence: medium (0.725) · Basis: verified_sources

## TL;DR

Embedding input limits and truncation policies affect what text reaches a vector index, so agents need chunk and truncation metadata before judging retrieval quality.

## Core Explanation

Embedding models do not accept infinite text. Providers define model-specific input limits, batching limits, token accounting, and truncation or error behavior. If an ingestion pipeline silently truncates long chunks, a retrieval result can look relevant while missing the paragraph, table, or legal clause that mattered.

Useful evidence includes embedding model, tokenizer, input type, chunk text length, token count, batch size, truncation setting, truncation side, provider response metadata, chunk ID, source offset, and whether the pipeline failed or shortened over-limit inputs. This evidence helps agents distinguish poor retrieval from text that was never embedded.

Agents should not only ask whether a vector exists. They should ask which source span was embedded, whether any text was dropped, and whether query and document embeddings used compatible limits and input-type settings.

## Source-Mapped Facts

- Cohere Embed API documentation includes a truncate parameter for handling inputs longer than the maximum token length. ([source](https://docs.cohere.com/v2/reference/embed))
- Vertex AI text embeddings API documentation includes an autoTruncate parameter in the request. ([source](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api))
- Voyage AI embeddings API documentation describes input token limits and a truncation option for embedding requests. ([source](https://docs.voyageai.com/reference/embeddings-api))

## Further Reading

- [Cohere Embed API](https://docs.cohere.com/v2/reference/embed)
- [Vertex AI Text Embeddings API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api)
- [Voyage AI Embeddings API](https://docs.voyageai.com/reference/embeddings-api)
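
## Example Sketch: Recording Truncation Metadata

The evidence listed under Core Explanation can be captured at ingestion time. The following is a minimal sketch, not any provider's actual behavior: `ChunkRecord`, `tokenize`, and `prepare_chunk` are hypothetical names, the whitespace tokenizer is a stand-in for the real model tokenizer (whose counts will differ), and no embedding API is called. It only illustrates how a pipeline might record whether a chunk was shortened, on which side, and what text was actually embedded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkRecord:
    chunk_id: str
    source_offset: int     # character offset of the chunk in the source document
    token_count: int       # tokens in the chunk before any truncation
    embedded_tokens: int   # tokens that would actually be sent for embedding
    truncated: bool
    truncation_side: str   # side that was dropped: "start", "end", or "none"
    embedded_text: str


def tokenize(text: str) -> List[str]:
    # Assumption: whitespace split stands in for the provider's tokenizer.
    return text.split()


def prepare_chunk(
    chunk_id: str,
    text: str,
    source_offset: int,
    max_tokens: int,
    side: str = "end",
    on_overlimit: str = "truncate",
) -> ChunkRecord:
    """Apply a hypothetical over-limit policy and record what happened."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if side not in ("start", "end"):
        raise ValueError("side must be 'start' or 'end'")

    tokens = tokenize(text)
    if len(tokens) <= max_tokens:
        # Within the limit: the full chunk is embedded unchanged.
        return ChunkRecord(chunk_id, source_offset, len(tokens),
                           len(tokens), False, "none", text)

    if on_overlimit == "error":
        # Fail loudly instead of silently shortening the input.
        raise ValueError(
            f"{chunk_id}: {len(tokens)} tokens exceeds limit {max_tokens}"
        )

    # Drop tokens from the chosen side and keep the rest.
    kept = tokens[:max_tokens] if side == "end" else tokens[-max_tokens:]
    return ChunkRecord(chunk_id, source_offset, len(tokens),
                       len(kept), True, side, " ".join(kept))


record = prepare_chunk("doc1#3", "a b c d e f", source_offset=120, max_tokens=4)
print(record.truncated, record.token_count, record.embedded_text)
```

Storing a record like this next to each vector lets an agent later check whether the text it is looking for was ever embedded, rather than inferring that from retrieval scores alone.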