Overview
OkraPDF has two upload paths. Both produce the same result: a processed document you can query, extract from, and publish.
| | Standard upload | Presign upload |
|---|---|---|
| Steps | 1 request (DO handles everything) | 3 requests (presign + PUT + confirm) |
| Latency to “upload done” | ~3-4s | ~0.5s |
| Worker involved | Yes (receives PDF bytes) | No (PDF goes direct to R2) |
| Durable Object created | Immediately | Only at confirm step |
| Best for | SDK, small files, simplicity | CLI pipelines, large files, speed |
Presign upload (fast path)
Three requests: get a signed URL, upload the PDF directly to storage, then confirm to start processing. A fourth, optional step polls for completion.
Step 1: Get presigned URL
curl -X POST https://api.okrapdf.com/presign \
-H "Authorization: Bearer $OKRA_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fileName":"quarterly-report.pdf"}'
{
"docId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
"uploadUrl": "https://agent-session-events.<account>.r2.cloudflarestorage.com/documents/doc-.../original.pdf?X-Amz-...",
"r2Key": "documents/doc-8db94ee6.../original.pdf",
"fileName": "quarterly-report.pdf",
"expiresIn": 300
}
This is stateless. No Durable Object is created. Typical latency: ~80ms.
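Steps 2 and 3 rely on `$UPLOAD_URL`, `$DOC_ID`, and the `r2Key` from this response. The following is a minimal sketch of capturing them in a POSIX shell. `json_field` and `request_presign` are hypothetical helpers for illustration, not part of OkraPDF; `json_field` only handles flat string fields, so a real JSON parser such as `jq` is the more robust choice where available.

```shell
# json_field NAME JSON: print the string value of a top-level field.
# Hypothetical helper; handles flat string fields only (not numbers or nesting).
json_field() {
  printf '%s' "$2" | tr -d '\n' |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# request_presign FILENAME: call POST /presign and print the raw JSON response.
# Assumes FILENAME contains no double quotes or backslashes.
request_presign() {
  curl -sS -X POST https://api.okrapdf.com/presign \
    -H "Authorization: Bearer $OKRA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"fileName\":\"$1\"}"
}

# Usage (makes a network call):
#   PRESIGN=$(request_presign quarterly-report.pdf)
#   DOC_ID=$(json_field docId "$PRESIGN")
#   UPLOAD_URL=$(json_field uploadUrl "$PRESIGN")
#   R2_KEY=$(json_field r2Key "$PRESIGN")
```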
Step 2: Upload directly to R2
curl -X PUT "$UPLOAD_URL" \
-H "Content-Type: application/pdf" \
--data-binary @quarterly-report.pdf
The PDF goes straight to Cloudflare R2 storage. The Worker is never involved. A 6MB file typically uploads in ~300ms.
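The presign response includes `"expiresIn": 300`, so a pipeline that pauses between Step 1 and Step 2 may want to check that the URL is still usable before sending the bytes. A small sketch, assuming `expiresIn` is measured in seconds (the response does not state the unit) and that the caller recorded the time of the presign request; `url_still_valid` is a hypothetical helper.

```shell
# url_still_valid ISSUED_AT EXPIRES_IN [NOW]
# Succeeds (exit 0) while the presigned URL has not expired.
# All times are Unix epoch seconds; NOW defaults to the current time.
url_still_valid() {
  issued_at=$1
  expires_in=$2   # assumption: seconds, as in the "expiresIn" field
  now=${3:-$(date +%s)}
  [ $((now - issued_at)) -lt "$expires_in" ]
}

# Usage:
#   ISSUED_AT=$(date +%s)          # record right after Step 1
#   if url_still_valid "$ISSUED_AT" 300; then
#     curl -X PUT "$UPLOAD_URL" -H "Content-Type: application/pdf" \
#       --data-binary @quarterly-report.pdf
#   fi
```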
Step 3: Confirm and start processing
curl -X POST https://api.okrapdf.com/document/$DOC_ID/confirm-upload \
-H "Authorization: Bearer $OKRA_API_KEY" \
-H "Content-Type: application/json" \
-d '{"r2Key":"documents/doc-.../original.pdf","fileName":"quarterly-report.pdf"}'
{
"documentId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
"phase": "idle",
"workflowId": "lifecycle-doc-8db94ee6...-1772158807620",
"rootNodeId": "node_doc-8db94ee6..._root"
}
This is the only step that creates a Durable Object. It verifies the R2 upload, computes SHA-256, and starts the lifecycle workflow.
Step 4: Poll status
curl https://api.okrapdf.com/document/$DOC_ID/status \
-H "Authorization: Bearer $OKRA_API_KEY"
{
"documentId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
"phase": "complete",
"fileName": "quarterly-report.pdf",
"pagesCompleted": 42,
"pagesTotal": 42
}
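Polling usually runs in a loop until `phase` reaches `complete`. The sketch below shows one way to do that. The attempt limit of 60 mirrors the `[3/60]` counter in the example script output; the one-second delay is an assumption, since no polling interval is documented. `fetch_status` and `wait_for_complete` are hypothetical names, and only the `complete` phase shown above is treated as terminal.

```shell
# fetch_status DOC_ID: print the status JSON for one document.
fetch_status() {
  curl -sS "https://api.okrapdf.com/document/$1/status" \
    -H "Authorization: Bearer $OKRA_API_KEY"
}

# wait_for_complete DOC_ID [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 once phase is "complete", 1 if attempts run out or a fetch fails.
wait_for_complete() {
  doc_id=$1
  max_attempts=${2:-60}   # assumption: matches the "[3/60]" counter
  delay=${3:-1}           # assumption: interval is not documented
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(fetch_status "$doc_id") || return 1
    # Extract the flat string field "phase" from the response.
    phase=$(printf '%s' "$status" | tr -d '\n' |
      sed -n 's/.*"phase"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
    echo "[$attempt/$max_attempts] phase=$phase"
    if [ "$phase" = "complete" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage: wait_for_complete "$DOC_ID"
```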
Standard upload (simple path)
One request. The Worker receives the PDF bytes and handles everything.
From a URL
curl -X POST https://api.okrapdf.com/document/$DOC_ID/upload-url \
-H "Authorization: Bearer $OKRA_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com/report.pdf"}'
The Worker fetches the PDF from the URL, stores it, and starts processing. Simple but slower: the Worker downloads the file, then uploads to R2.
From a local file (binary)
curl -X POST https://api.okrapdf.com/document/$DOC_ID/upload \
-H "Authorization: Bearer $OKRA_API_KEY" \
-H "Content-Type: application/pdf" \
-H "X-File-Name: report.pdf" \
--data-binary @report.pdf
The entire PDF passes through the Worker. For large files this adds latency.
Presign through the DO (older flow)
# Step 1: Get presigned URL (creates DO)
curl -X POST https://api.okrapdf.com/document/$DOC_ID/presign-upload \
-H "Authorization: Bearer $OKRA_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fileName":"report.pdf"}'
# Step 2: Upload to R2
curl -X PUT "$UPLOAD_URL" \
-H "Content-Type: application/pdf" \
--data-binary @report.pdf
# Step 3: Confirm
curl -X POST https://api.okrapdf.com/document/$DOC_ID/confirm-upload \
-H "Authorization: Bearer $OKRA_API_KEY"
This is the older presign flow. It creates a DO at step 1 (to store metadata), while the new /presign endpoint skips the DO entirely until confirm.
When to use which
Use presign (/presign) when:
- You want the lowest possible upload latency
- You’re building a CLI tool or pipeline
- Files are large (>10MB) and you don’t want them passing through a Worker
- You want the docId before the DO exists (e.g. for optimistic UI)
Use standard upload (/upload-url) when:
- You’re uploading from a public URL (the server fetches it)
- You’re using the SDK (okra.sessions.create(url))
- Simplicity matters more than speed
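Two of these criteria (source type and file size) are easy to automate in a script. The sketch below is one possible encoding, not an official rule: `suggest_upload_path` is a hypothetical helper, and ">10MB" is read as 10,000,000 bytes, which is an assumption. Latency needs and SDK usage still require a human decision.

```shell
# suggest_upload_path SOURCE
# Prints "upload-url" for http(s) URLs, "presign" for local files over the
# size threshold, and "upload" for smaller local files.
suggest_upload_path() {
  src=$1
  case "$src" in
    http://*|https://*)
      echo "upload-url"   # the server fetches the PDF itself
      return 0
      ;;
  esac
  if [ ! -f "$src" ]; then
    echo "not a file: $src" >&2
    return 1
  fi
  size=$(wc -c < "$src" | tr -d ' ')
  threshold=$((10 * 1000 * 1000))   # assumption: ">10MB" as decimal megabytes
  if [ "$size" -gt "$threshold" ]; then
    echo "presign"        # keep large files out of the Worker
  else
    echo "upload"         # single request through the Worker
  fi
}

# Usage: suggest_upload_path ./quarterly-report.pdf
```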
Full script
See scripts/presign-upload-example.sh for a runnable bash script that does all four steps with timing.
./scripts/presign-upload-example.sh ./quarterly-report.pdf
Output:
1. Requesting presigned URL...
docId: doc-8db94ee67e004e20bcf36d65f2677a1a
time: 0.082s
2. Uploading quarterly-report.pdf (1503780 bytes) to R2...
time: 0.327s
3. Confirming upload (creates DO, starts processing)...
phase: idle
workflow: lifecycle-doc-8db94ee6...-1772158807620
time: 2.058s
4. Polling status...
[3/60] phase=complete progress=42/42
=== TIMING ===
Presign: 0.082s (stateless Worker)
Upload: 0.327s (direct to R2, 1503780 bytes)
Confirm: 2.058s (DO created)
Total: 2.467s (until processing starts)
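The total is the sum of the three request timings (0.082 + 0.327 + 2.058 = 2.467). For readers who want to collect similar numbers themselves, here is a hedged sketch that does not reproduce the referenced script: it uses curl's `-w '%{time_total}'` write-out variable to time one request and `awk` to add the results. `timed_curl` and `sum_times` are hypothetical helper names.

```shell
# timed_curl OUTFILE CURL_ARGS...
# Runs curl, writes the response body to OUTFILE, prints total seconds.
timed_curl() {
  out=$1
  shift
  curl -sS -o "$out" -w '%{time_total}' "$@"
}

# sum_times T1 T2 ...: add step timings, print with millisecond precision.
sum_times() {
  printf '%s\n' "$@" |
    LC_ALL=C awk '{ total += $1 } END { printf "%.3f\n", total }'
}

# Usage (makes network calls):
#   T1=$(timed_curl presign.json -X POST https://api.okrapdf.com/presign \
#          -H "Authorization: Bearer $OKRA_API_KEY" \
#          -H "Content-Type: application/json" \
#          -d '{"fileName":"quarterly-report.pdf"}')
#   T2=$(timed_curl /dev/null -X PUT "$UPLOAD_URL" \
#          -H "Content-Type: application/pdf" \
#          --data-binary @quarterly-report.pdf)
#   sum_times "$T1" "$T2"
```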