Errors - OkraPDF

Error format

All errors return a consistent JSON shape:

{
  "error": {
    "code": "BAD_REQUEST",
    "message": "Human-readable description of the problem"
  }
}
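As a minimal sketch of reading this shape on the client, the helper below pulls `code` and `message` out of a response body. The names `ApiError` and `parse_error` are illustrative and not part of the OkraPDF API; the helper reads `code` defensively in case a response omits it.

```python
import json
from typing import NamedTuple, Optional


class ApiError(NamedTuple):
    """Hypothetical client-side container for the error envelope."""
    code: Optional[str]
    message: str


def parse_error(body: str) -> Optional[ApiError]:
    """Return the error carried by a response body, or None if there is none."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # body is not JSON, so there is no envelope to read
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    # Read "code" with .get() so a missing field yields None instead of raising.
    return ApiError(code=error.get("code"), message=error.get("message", ""))
```

Calling `parse_error(resp.text)` on a failed response gives a value you can branch on by `code`, rather than matching on the human-readable message.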

Error codes

Code	HTTP Status	Description
`BAD_REQUEST`	400	Invalid request body or parameters
`UNAUTHORIZED`	401	Missing or invalid API key
`FORBIDDEN`	403	Valid key but not authorized for this resource
`NOT_FOUND`	404	Document or resource does not exist
`CONFLICT`	409	Resource state conflict
`SCHEMA_VALIDATION_FAILED`	422	Request was well-formed but semantically invalid
`RATE_LIMITED`	429	Too many requests - see rate limits
`INTERNAL_ERROR`	500	Unexpected server error
`TIMEOUT`	504	The operation timed out

Handling errors

Retry strategy

For transient errors (429, 500, 502, 503, 504), use exponential backoff:

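import time
import requests

def api_call_with_retry(url, headers, max_retries=3):
    """GET `url`, retrying transient failures for up to `max_retries` attempts."""
    for attempt in range(max_retries):
        # The 30-second timeout is an arbitrary choice; without one,
        # requests can wait indefinitely on a stalled connection.
        resp = requests.get(url, headers=headers, timeout=30)

        # Anything other than 429 or a 5xx is returned to the caller as-is.
        if resp.status_code != 429 and resp.status_code < 500:
            return resp

        if attempt == max_retries - 1:
            break  # out of attempts; don't sleep before giving up

        if resp.status_code == 429:
            # Honor the server's hint; fall back to 5 seconds if it is absent.
            delay = int(resp.headers.get("Retry-After", 5))
        else:
            # Exponential backoff: 1s, 2s, 4s, ...
            delay = 2 ** attempt
        time.sleep(delay)

    raise RuntimeError(f"Failed after {max_retries} attempts")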
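The retry loop above reads `Retry-After` as an integer number of seconds. The HTTP specification also allows the header to carry an HTTP date, and this page does not say which form the API sends. The sketch below handles both; `retry_after_seconds` is a hypothetical helper name, and the 5-second default mirrors the fallback used above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str], default: float = 5.0,
                        now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    if value is None:
        return default
    try:
        # Delta-seconds form, e.g. "7".
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header; use the fallback delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

The `now` parameter exists only to make the date branch testable; in normal use, omit it.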

Non-retryable errors

Code	Action
`400`	Fix your request body or parameters
`401`	Check your API key
`403`	Verify you own the resource
`404`	Verify the document ID exists
`422`	Check request schema against the API reference
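One way to combine the retryable list and the table above is a small triage function, sketched here. The name `triage` and the fallback for unlisted statuses are assumptions made for illustration; the action strings are copied from the table.

```python
# Transient statuses listed under "Retry strategy".
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})

# Suggested actions, copied from the non-retryable table.
NON_RETRYABLE_ACTIONS = {
    400: "Fix your request body or parameters",
    401: "Check your API key",
    403: "Verify you own the resource",
    404: "Verify the document ID exists",
    422: "Check request schema against the API reference",
}


def triage(status_code: int) -> str:
    """Return "ok", "retry", or a suggested action for an HTTP status."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code in RETRYABLE_STATUSES:
        return "retry"
    # Assumption: statuses this page does not list are treated as non-retryable.
    return NON_RETRYABLE_ACTIONS.get(
        status_code, "Unlisted status; treat as non-retryable"
    )
```

Keeping the decision in one place means the retry loop only needs to check for `"retry"` and can surface every other result to the caller.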

Structured output errors

Two entry points return the error codes below: the /chat/completions endpoint when called with response_format: { type: "json_schema" }, and the MCP extract_data tool.

Code	HTTP Status	Description
`SCHEMA_VALIDATION_FAILED`	422	Extracted data didn’t match your JSON schema. Check field types, required fields, and nesting.
`EXTRACTION_BLOCKED`	422	The model couldn’t extract data: the document has no pages, parsing failed, or all queries returned errors.
`TIMEOUT`	504	Extraction exceeded the time limit (45s by default, 120s over MCP). Simplify the schema or use page ranges.
`DOCUMENT_NOT_FOUND`	404	Document ID doesn’t exist or hasn’t finished processing. Check `get_document_status` first.
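Because `SCHEMA_VALIDATION_FAILED` and `EXTRACTION_BLOCKED` share HTTP status 422, a client has to branch on the error code rather than the status to tell them apart. The lookup below is a sketch of that idea. `Recovery` and `recovery_for` are hypothetical names, and the retry flags are a judgment call drawn from the descriptions above, not documented API behavior.

```python
from typing import NamedTuple


class Recovery(NamedTuple):
    retry: bool  # assumption: whether repeating the call might succeed
    hint: str    # what to change before trying again


STRUCTURED_OUTPUT_RECOVERY = {
    "SCHEMA_VALIDATION_FAILED": Recovery(
        False, "Check field types, required fields, and nesting."),
    "EXTRACTION_BLOCKED": Recovery(
        False, "Confirm the document has pages and parsed successfully."),
    "TIMEOUT": Recovery(
        True, "Simplify the schema or use page ranges."),
    "DOCUMENT_NOT_FOUND": Recovery(
        True, "Check get_document_status, then retry once processing finishes."),
}


def recovery_for(code: str) -> Recovery:
    """Look up the suggested recovery for a structured output error code."""
    return STRUCTURED_OUTPUT_RECOVERY.get(
        code, Recovery(False, "Unrecognized code; inspect the message."))
```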

Example error response

{
  "error": {
    "message": "Structured output timed out",
    "type": "server_error",
    "details": { "timeoutMs": 120000 }
  }
}
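The example above carries `type` and `details` fields rather than `code`, so client code may want to read those fields tolerantly. The sketch below extracts `details.timeoutMs` when it is present; `timeout_seconds` is an illustrative helper name, and nothing here assumes the field appears on every timeout response.

```python
from typing import Any, Mapping, Optional


def timeout_seconds(payload: Mapping[str, Any]) -> Optional[float]:
    """Return error.details.timeoutMs converted to seconds, or None if absent."""
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    details = error.get("details")
    if not isinstance(details, dict):
        return None
    timeout_ms = details.get("timeoutMs")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timeout_ms, (int, float)) and not isinstance(timeout_ms, bool):
        return timeout_ms / 1000
    return None
```

For the example response, this returns `120.0`, which matches the 120s MCP limit noted in the table.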

Tips for reliable extraction

Keep schemas flat when possible; fewer nested objects means faster extraction
Use string types for financial values (e.g. "$215,938 million") rather than numbers to avoid parsing ambiguity
Check the document status before extracting; extraction requires phase: "complete"
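The last two tips can be sketched on the client side as follows. `is_ready` assumes the status payload is a dictionary with a top-level `phase` field, and `parse_amount` assumes amounts look like the example above (an optional `$`, comma-grouped digits, an optional scale word). Both names are hypothetical, and the set of scale words is an illustration rather than an exhaustive list.

```python
import re
from decimal import Decimal

_SCALES = {
    "thousand": Decimal(10) ** 3,
    "million": Decimal(10) ** 6,
    "billion": Decimal(10) ** 9,
}
_AMOUNT = re.compile(
    r"^\s*\$?\s*(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?\s*$",
    re.IGNORECASE,
)


def is_ready(status: dict) -> bool:
    """True once the document status reports phase "complete"."""
    return status.get("phase") == "complete"


def parse_amount(text: str) -> Decimal:
    """Convert an extracted string such as "$215,938 million" to a Decimal."""
    match = _AMOUNT.match(text)
    if match is None:
        raise ValueError(f"Unrecognized amount: {text!r}")
    number = Decimal(match.group(1).replace(",", ""))
    scale = _SCALES[match.group(2).lower()] if match.group(2) else Decimal(1)
    return number * scale
```

Extracting the value as a string and converting it afterwards keeps the unit handling in your own code, where a malformed amount raises an explicit error instead of producing a silently wrong number.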
