Limits
| Context | Requests/min | Expensive ops/min |
|---|---|---|
| Authenticated | 600 | 30 |
| Unauthenticated | 60 | 5 |
Expensive operations are POST /v1/documents (uploads), POST /v1/documents/{id}/structured-output, and POST /v1/documents/{id}/chat/completions. All other endpoints count as standard requests.
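To budget client-side calls against the two limits, it can help to classify a request before sending it. The sketch below is one way to do that. The helper name `is_expensive` is hypothetical, and it assumes `{id}` is a single path segment containing no slashes.

```python
import re

# The three expensive operations named in this section.
# Assumption: {id} is one path segment with no "/" in it.
_EXPENSIVE_PATTERNS = [
    re.compile(r"^/v1/documents/?$"),
    re.compile(r"^/v1/documents/[^/]+/structured-output/?$"),
    re.compile(r"^/v1/documents/[^/]+/chat/completions/?$"),
]


def is_expensive(method: str, path: str) -> bool:
    """Return True if the request matches one of the expensive operations."""
    if method.upper() != "POST":
        return False
    return any(pattern.match(path) for pattern in _EXPENSIVE_PATTERNS)
```

A client could use this to keep a separate counter for expensive operations and slow down before reaching 30 (authenticated) or 5 (unauthenticated) per minute.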
Rate limit headers
Every API response includes rate limit information:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Requests remaining |
| X-RateLimit-Reset | Unix timestamp when window resets |
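As an illustration, the three headers can be read into a small structure. This is a minimal sketch: `RateLimitStatus` and `parse_rate_limit` are hypothetical names, and the code assumes all three values arrive as integer strings and that `X-RateLimit-Reset` is in seconds.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (Unix timestamp, assumed seconds)

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets, never negative."""
        current = time.time() if now is None else now
        return max(0.0, self.reset_at - current)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Parse the rate limit headers; return None if any is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

Most HTTP clients expose response headers as a mapping, so `parse_rate_limit(response.headers)` should work with little or no adaptation.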
When rate limited
You receive a 429 response. The Retry-After header tells you how many seconds to wait before retrying.
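A client can turn that header into a wait time along these lines. The function name `retry_after_seconds` and the one-second fallback are assumptions for illustration. The fallback applies only if the header is absent or not a number.

```python
from typing import Mapping, Optional


def retry_after_seconds(
    status_code: int,
    headers: Mapping[str, str],
    default: float = 1.0,
) -> Optional[float]:
    """Return how long to wait after a 429, or None if not rate limited."""
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not numeric: fall back to the assumed default.
        return default
```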
Best practices
- Check headers before hitting limits: monitor X-RateLimit-Remaining
- Use exponential backoff on 429 responses
- Batch small PDFs: prefer one large job over many small ones
- Cache results: store extraction results locally after fetching
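The backoff advice above could be sketched as follows. Everything here is illustrative rather than part of the API: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the base delay, cap, attempt count, and 10% jitter are assumed values. The delay doubles on each 429 and is raised to the Retry-After value when that is longer.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential delay for a zero-based attempt: base, 2*base, 4*base, up to cap."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _ = response
        if status != 429:
            return response
        delay = backoff_delay(attempt, base, cap)
        # Wait at least as long as Retry-After says, if it is present and numeric.
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            delay = max(delay, float(lowered.get("retry-after")))
        except (TypeError, ValueError):
            pass
        # Up to 10% of jitter so many clients do not retry at the same instant.
        sleep(delay + random.uniform(0, delay * 0.1))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function testable without real waiting. If every attempt returns 429, the last 429 response is returned so the caller can decide what to do next.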