# Agent API Rate-Limit Headers and Retry-After

Status: public
Confidence: medium (0.865) (verified)
Last verified: 2026-06-02
Generation: ai_structured

## TL;DR

Rate-limit headers and Retry-After tell agents when to slow down, retry later, or ask for a quota-aware plan.

## Core Explanation

Tool-using agents call APIs repeatedly, so they need to read rate-limit signals instead of blindly retrying. A 429 response, Retry-After value, and provider-specific quota headers can indicate whether the correct behavior is waiting, backing off, reducing concurrency, or using a different endpoint.

Agents should record the exact headers and endpoint involved. They should not infer global quota state from one failed request when a provider has separate limits for different resources, tokens, or secondary abuse protection.

## Source-Mapped Facts

- RFC 9110 defines the Retry-After response header field for indicating how long a user agent ought to wait before a follow-up request. ([source](https://datatracker.ietf.org/doc/html/rfc9110#section-10.2.3))
- RFC 6585 defines the 429 Too Many Requests status code for rate limiting and says responses may include a Retry-After header. ([source](https://datatracker.ietf.org/doc/html/rfc6585#section-4))
- GitHub REST API documentation lists rate-limit response headers such as x-ratelimit-limit, x-ratelimit-remaining, and x-ratelimit-reset. ([source](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api))

## Further Reading

- [RFC 9110 Retry-After](https://datatracker.ietf.org/doc/html/rfc9110#section-10.2.3)
- [RFC 6585 429 Too Many Requests](https://datatracker.ietf.org/doc/html/rfc6585#section-4)
- [GitHub REST API Rate Limits](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api)
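
## Example Sketch

The header handling described under Core Explanation could look roughly like the following. This is a minimal sketch, not a reference implementation. The function names `parse_retry_after` and `wait_seconds` are hypothetical. It assumes that Retry-After carries either a whole number of seconds or an HTTP-date, that a GitHub-style `x-ratelimit-reset` value is a UTC epoch timestamp, and that an exhausted quota arrives with status 403 or 429; other providers may behave differently.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def parse_retry_after(value: str, now: datetime) -> Optional[float]:
    """Return seconds to wait for a Retry-After value, or None if unparseable."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        # delay-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())


def wait_seconds(
    status: int, headers: Mapping[str, str], now: datetime
) -> Optional[float]:
    """Return a suggested wait in seconds, or None if no usable signal is found."""
    h = {k.lower(): v for k, v in headers.items()}

    # Prefer the standard Retry-After header when it is present and parseable.
    if "retry-after" in h:
        delay = parse_retry_after(h["retry-after"], now)
        if delay is not None:
            return delay

    # Assumption: GitHub-style quota headers, reset given as UTC epoch seconds.
    if status in (403, 429) and h.get("x-ratelimit-remaining", "").strip() == "0":
        reset = h.get("x-ratelimit-reset", "").strip()
        if reset.isascii() and reset.isdigit():
            return max(0.0, int(reset) - now.timestamp())

    return None
```

A caller would pass the response status, the response headers, and `datetime.now(timezone.utc)`. A `None` result means the response carried no usable signal, so the agent falls back to its own backoff policy. The result applies only to the endpoint and credential that produced the response, not to the provider's other limits.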