The problem
Talking to a PDF normally means stitching together a pipeline:
- Download the file (handle redirects, auth, rate limits)
- Parse it (OCR, layout detection, table extraction)
- Chunk and embed the content
- Build a system prompt within token limits
- Send to an LLM and manage multi-turn state
That’s five services to stand up before you can ask your first question.
The shortcut
The Resolve endpoint collapses this into a single POST:
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "What is the tl;dr?" }]
  }'
That’s it. OkraPDF will:
- Detect the PDF: rewrites /abs/ to /pdf/ for arXiv automatically
- Download and parse with OCR + layout analysis
- Deduplicate — same URL from the same tenant reuses the existing document
- Run the completion against the parsed content
- Return an OpenAI-compatible response
{
  "id": "chatcmpl-m0nfcbyuhu",
  "object": "chat.completion",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "This paper evaluates prompt caching strategies for long-running AI agents..."
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 13010, "completion_tokens": 849, "total_tokens": 13859 }
}
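To make the first step in the list above concrete, the arXiv rewrite presumably amounts to a path substitution. The sketch below is a client-side illustration only: `toArxivPdfUrl` is a hypothetical name, and the service's actual rewrite rules are not documented on this page.

```javascript
// Hypothetical helper: illustrates the /abs/ -> /pdf/ rewrite described above.
// This is an assumption about the behavior, not the service's implementation.
function toArxivPdfUrl(input) {
  const url = new URL(input);
  if (url.hostname === 'arxiv.org' && url.pathname.startsWith('/abs/')) {
    url.pathname = '/pdf/' + url.pathname.slice('/abs/'.length);
  }
  return url.toString();
}

console.log(toArxivPdfUrl('https://arxiv.org/abs/2601.06007'));
// prints "https://arxiv.org/pdf/2601.06007"
```

You do not need to do this yourself; the point is that an abstract-page URL and its PDF URL should end up addressing the same document.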
How wait_ms works
The wait_ms parameter controls how long the server waits for ingestion before responding:
| Value | Behavior |
|---|---|
| 0 | Return immediately: 202 if still processing, 200 if already indexed |
| 3000 | Wait up to 3 seconds, then 200 or 202 |
| 30000 | Wait up to 30 seconds (good default for most PDFs) |
| Omitted | Uses server default (30s) |
| > 120000 | Clamped to 120s max |
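The rules in the table can be read as a small function. This is a sketch of how a client might predict the effective wait; `effectiveWaitMs` and the two constants are hypothetical names, and the handling of negative values is an assumption the table does not cover.

```javascript
const DEFAULT_WAIT_MS = 30000; // server default, per the table
const MAX_WAIT_MS = 120000;    // upper clamp, per the table

// Hypothetical helper mirroring the table; not the server's actual code.
function effectiveWaitMs(waitMs) {
  if (waitMs === undefined || waitMs === null) return DEFAULT_WAIT_MS;
  // Assumption: negative values behave like 0 (the table does not say).
  return Math.min(Math.max(waitMs, 0), MAX_WAIT_MS);
}

console.log(effectiveWaitMs());       // 30000
console.log(effectiveWaitMs(0));      // 0
console.log(effectiveWaitMs(500000)); // 120000
```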
Handling 202 (still processing)
If the document hasn’t finished parsing, you get a 202:
{
  "run_id": "run-doc-73fdc4f0...-1774096446741",
  "document_id": "doc-73fdc4f0d15a4472864629d69a4ce098",
  "status_url": "/document/doc-73fdc4f0.../status",
  "retry_after": 2
}
Poll status_url until the phase is complete, then retry your original request:
# Check status
curl "https://api.okrapdf.com/document/doc-73fdc4f0.../status" \
  -H "Authorization: Bearer $OKRA_API_KEY"
# Retry when ready
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "What is the tl;dr?" }]
  }'
The second call reuses the already-parsed document, so it skips ingestion.
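The poll-then-retry flow can be wrapped in one function. The following is a minimal sketch, not an official client: `resolveWithRetry`, `fetchImpl`, `sleep`, and `maxPolls` are hypothetical names, and two details are assumptions because this page does not specify them: that `retry_after` is in seconds, and that the status body has a `phase` field equal to `"complete"`.

```javascript
const API_BASE = 'https://api.okrapdf.com';

// Sketch: send the request, and on 202 poll status_url before retrying once.
// fetchImpl and sleep are injectable so the loop can run without a network.
async function resolveWithRetry(body, apiKey, opts = {}) {
  const fetchImpl = opts.fetchImpl ?? fetch;
  const sleep = opts.sleep ?? ((ms) => new Promise((r) => setTimeout(r, ms)));
  const maxPolls = opts.maxPolls ?? 30;
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
  const ask = () =>
    fetchImpl(`${API_BASE}/v1/resolve/chat/completions`, {
      method: 'POST',
      headers,
      body: JSON.stringify(body),
    });

  const first = await ask();
  if (first.status !== 202) return first.json();

  const pending = await first.json();
  for (let i = 0; i < maxPolls; i++) {
    // Assumption: retry_after is expressed in seconds.
    await sleep((pending.retry_after ?? 2) * 1000);
    const statusRes = await fetchImpl(`${API_BASE}${pending.status_url}`, { headers });
    const status = await statusRes.json();
    // Assumption: the status body exposes `phase`, and "complete" means ready.
    if (status.phase === 'complete') {
      const retry = await ask();
      return retry.json();
    }
  }
  throw new Error(`Document ${pending.document_id} still processing after ${maxPolls} polls`);
}
```

Capping the number of polls keeps a stuck ingestion from looping forever; adjust `maxPolls` to taste.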
Streaming
Swap the endpoint to get a streaming response:
curl -X POST "https://api.okrapdf.com/v1/resolve/ai-stream" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "Summarize key contributions" }]
  }'
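From JavaScript, a streamed body can be consumed chunk by chunk through the standard `ReadableStream` reader. The sketch below only decodes raw bytes to text, because the wire format of the stream is not described on this page; `readStreamText` is a hypothetical helper name.

```javascript
// Collects a byte stream into text, calling onChunk for each decoded piece.
// Assumption: the stream carries UTF-8 text; no event framing is parsed here.
async function readStreamText(body, onChunk = () => {}) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let text = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    const piece = decoder.decode(value, { stream: true });
    onChunk(piece);
    text += piece;
  }
  text += decoder.decode(); // flush any buffered partial character
  return text;
}
```

With a `fetch` response `res` from the streaming endpoint, a call such as `await readStreamText(res.body, (p) => process.stdout.write(p))` would print output as it arrives.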
Multi-turn follow-ups
Include prior messages for follow-up questions — same source URL reuses the document:
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [
      { "role": "user", "content": "What is the tl;dr?" },
      { "role": "assistant", "content": "This paper evaluates prompt caching for AI agents..." },
      { "role": "user", "content": "What specific cost savings did they measure?" }
    ]
  }'
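Since the request carries the whole conversation each time, the client is the one keeping the history. A small sketch of that bookkeeping, with `appendTurn` as a hypothetical helper name:

```javascript
// Returns a new messages array with the assistant's reply and the next
// user question appended. The input array is not mutated.
function appendTurn(messages, assistantReply, nextQuestion) {
  return [
    ...messages,
    { role: 'assistant', content: assistantReply },
    { role: 'user', content: nextQuestion },
  ];
}

const history = [{ role: 'user', content: 'What is the tl;dr?' }];
const next = appendTurn(
  history,
  'This paper evaluates prompt caching for AI agents...',
  'What specific cost savings did they measure?'
);
console.log(next.length); // 3
```

The resulting array goes into the `messages` field of the next request, alongside the same `source`.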
Source types
The resolve endpoint supports three source types:
URL source
{ "source": { "type": "url", "url": "https://arxiv.org/pdf/1706.03762.pdf" } }
Allowed domains: arxiv.org, hkexnews.hk (more coming).
Filing source
{ "source": { "type": "filing", "exchange": "NASDAQ", "ticker": "AAPL", "slug": "10-K-2023" } }
Looks up a pre-indexed filing by exchange, ticker, and slug.
Public source
{ "source": { "type": "public_source", "id": "your-public-source-id" } }
Points to a shared, pre-indexed document available to all tenants.
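A client can catch obviously malformed sources before sending anything. The sketch below is a hypothetical pre-check, with the server remaining the authority: `checkSource` and `ALLOWED_DOMAINS` are illustrative names, the returned strings reuse the codes from the error table, and treating subdomains as allowed is an assumption.

```javascript
// Domains listed under "URL source"; the list may grow.
const ALLOWED_DOMAINS = ['arxiv.org', 'hkexnews.hk'];

// Returns null when the source looks valid, else a likely error code.
function checkSource(source) {
  if (!source || typeof source !== 'object') return 'INVALID_SOURCE';
  switch (source.type) {
    case 'url': {
      let host;
      try {
        host = new URL(source.url).hostname;
      } catch {
        return 'INVALID_SOURCE';
      }
      // Assumption: subdomains such as www.hkexnews.hk count as allowed.
      const ok = ALLOWED_DOMAINS.some((d) => host === d || host.endsWith('.' + d));
      return ok ? null : 'UNSUPPORTED_SOURCE_URL';
    }
    case 'filing':
      return source.exchange && source.ticker && source.slug ? null : 'INVALID_SOURCE';
    case 'public_source':
      return source.id ? null : 'INVALID_SOURCE';
    default:
      return 'INVALID_SOURCE';
  }
}
```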
Using with the OpenAI SDK
The response shape is OpenAI-compatible, so code that already consumes OpenAI chat completions can read it unchanged. The example below uses plain fetch:
const response = await fetch('https://api.okrapdf.com/v1/resolve/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.OKRA_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: { type: 'url', url: 'https://arxiv.org/abs/2601.06007' },
    wait_ms: 30000,
    messages: [{ role: 'user', content: 'What are the key findings?' }],
  }),
});
const data = await response.json();
if (response.status === 202) {
  // Still processing: there is no choices array yet. Poll data.status_url, then retry.
  console.log(`Not ready yet (retry_after: ${data.retry_after})`);
} else {
  console.log(data.choices[0].message.content);
}
Error handling
| Status | Code | Meaning |
|---|---|---|
| 400 | INVALID_SOURCE | Missing or malformed source object |
| 401 | - | Missing or invalid API key |
| 404 | FILING_NOT_READY | Filing not found or not indexed |
| 404 | PUBLIC_SOURCE_NOT_FOUND | Public source missing or disabled |
| 422 | UNSUPPORTED_SOURCE_URL | URL domain not in allowlist |
| 502 | URL_INGEST_START_FAILED | Failed to start document ingestion |
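In client code, the table can become a lookup that turns a failure into a readable message. This is a sketch under one notable assumption: this page does not show where the code appears in the error response body, so the hypothetical `explainError` takes the status and code as plain arguments.

```javascript
// Meanings copied from the table above, keyed by "status:code".
const ERROR_MEANINGS = {
  '400:INVALID_SOURCE': 'Missing or malformed source object',
  '404:FILING_NOT_READY': 'Filing not found or not indexed',
  '404:PUBLIC_SOURCE_NOT_FOUND': 'Public source missing or disabled',
  '422:UNSUPPORTED_SOURCE_URL': 'URL domain not in allowlist',
  '502:URL_INGEST_START_FAILED': 'Failed to start document ingestion',
};

// Hypothetical helper: 401 has no code in the table, so it is matched on status alone.
function explainError(status, code) {
  if (status === 401) return 'Missing or invalid API key';
  return ERROR_MEANINGS[`${status}:${code}`] ?? `Unexpected error (HTTP ${status})`;
}

console.log(explainError(422, 'UNSUPPORTED_SOURCE_URL')); // URL domain not in allowlist
```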