# Presign Upload (Fast Path)

> Upload PDFs via stateless presigned URLs for lowest latency. Compare with standard upload.

## Overview

OkraPDF has two upload paths. Both produce the same result: a processed document you can query, extract from, and publish.

|                          | Standard upload                   | Presign upload                       |
| ------------------------ | --------------------------------- | ------------------------------------ |
| Steps                    | 1 request (DO handles everything) | 3 requests (presign + PUT + confirm) |
| Latency to "upload done" | \~3-4s                            | \~0.5s                               |
| Worker involved          | Yes (receives PDF bytes)          | No (PDF goes direct to R2)           |
| Durable Object created   | Immediately                       | Only at confirm step                 |
| Best for                 | SDK, small files, simplicity      | CLI pipelines, large files, speed    |

## Presign upload (fast path)

Three requests: get a signed URL, upload the PDF directly to storage, then confirm to start processing. A fourth, optional step polls the status endpoint until processing completes.

### Step 1: Get presigned URL

```bash theme={null}
curl -X POST https://api.okrapdf.com/presign \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"fileName":"quarterly-report.pdf"}'
```

```json theme={null}
{
  "docId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
  "uploadUrl": "https://agent-session-events.<account>.r2.cloudflarestorage.com/documents/doc-.../original.pdf?X-Amz-...",
  "r2Key": "documents/doc-8db94ee6.../original.pdf",
  "fileName": "quarterly-report.pdf",
  "expiresIn": 300
}
```

This call is stateless: no Durable Object is created. Typical latency: **\~80ms**.
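Steps 2 and 3 need the `uploadUrl`, `docId`, and `r2Key` values from this response. A minimal sketch of capturing them into shell variables follows. `json_field` and `presign` are hypothetical helper names, not part of any OkraPDF tooling, and the `sed` pattern only handles flat string fields like the ones shown above. Where `jq` is installed, `jq -r .docId` is the sturdier choice.

```bash
# Hypothetical helper: print the value of a top-level string field.
# Reads JSON on stdin; assumes the value contains no escaped quotes.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Request a presigned URL and keep the fields the later steps rely on.
# Assumes the file name contains no double quotes or backslashes.
# Usage: presign quarterly-report.pdf
presign() {
  local response
  response=$(curl -sS --fail -X POST https://api.okrapdf.com/presign \
    -H "Authorization: Bearer $OKRA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"fileName\":\"$1\"}") || return 1
  DOC_ID=$(printf '%s\n' "$response" | json_field docId)
  UPLOAD_URL=$(printf '%s\n' "$response" | json_field uploadUrl)
  R2_KEY=$(printf '%s\n' "$response" | json_field r2Key)
}
```

After `presign quarterly-report.pdf` succeeds, `$DOC_ID`, `$UPLOAD_URL`, and `$R2_KEY` are available to the commands in the following steps.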

### Step 2: Upload directly to R2

```bash theme={null}
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @quarterly-report.pdf
```

The PDF goes straight to Cloudflare R2 storage. The Worker is never involved. A 6MB file typically uploads in **\~300ms**.
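Without `--fail`, curl exits successfully even when the PUT is rejected, for example once the signed URL has expired (the Step 1 response reports `expiresIn: 300`, presumably seconds). The sketch below is one way to guard the upload. `is_pdf` and `upload_pdf` are hypothetical helper names; the `%PDF-` check relies only on the standard PDF file header and is not something the API is known to require.

```bash
# Hypothetical helper: succeed only if the file starts with the PDF magic bytes.
is_pdf() {
  [ -f "$1" ] && [ "$(head -c 5 "$1")" = "%PDF-" ]
}

# Upload a local PDF to the presigned URL from Step 1.
# --fail makes curl exit non-zero on an HTTP error status.
# Usage: upload_pdf "$UPLOAD_URL" quarterly-report.pdf
upload_pdf() {
  is_pdf "$2" || { echo "not a PDF: $2" >&2; return 1; }
  curl -sS --fail -X PUT "$1" \
    -H "Content-Type: application/pdf" \
    --data-binary @"$2"
}
```

A non-zero exit from `upload_pdf` is a signal to request a fresh URL from `/presign` rather than to call confirm.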

### Step 3: Confirm and start processing

```bash theme={null}
curl -X POST https://api.okrapdf.com/document/$DOC_ID/confirm-upload \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"r2Key":"documents/doc-.../original.pdf","fileName":"quarterly-report.pdf"}'
```

```json theme={null}
{
  "documentId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
  "phase": "idle",
  "workflowId": "lifecycle-doc-8db94ee6...-1772158807620",
  "rootNodeId": "node_doc-8db94ee6..._root"
}
```

This is the only step that creates a Durable Object. It verifies the R2 upload, computes the file's SHA-256 hash, and starts the lifecycle workflow.
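The request body above elides the real `r2Key`; in practice it is the value returned in Step 1. The following sketch builds the body from shell variables. `json_escape`, `confirm_body`, and `confirm_upload` are hypothetical helpers, and the escaping covers only backslashes and double quotes, which is an assumption about what file names contain.

```bash
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the confirm-upload body from an r2Key ($1) and a file name ($2).
confirm_body() {
  printf '{"r2Key":"%s","fileName":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the confirm request. Assumes DOC_ID and R2_KEY hold the Step 1 values.
# Usage: confirm_upload quarterly-report.pdf
confirm_upload() {
  curl -sS --fail -X POST "https://api.okrapdf.com/document/$DOC_ID/confirm-upload" \
    -H "Authorization: Bearer $OKRA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(confirm_body "$R2_KEY" "$1")"
}
```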

### Step 4: Poll status

```bash theme={null}
curl https://api.okrapdf.com/document/$DOC_ID/status \
  -H "Authorization: Bearer $OKRA_API_KEY"
```

```json theme={null}
{
  "documentId": "doc-8db94ee67e004e20bcf36d65f2677a1a",
  "phase": "complete",
  "fileName": "quarterly-report.pdf",
  "pagesCompleted": 42,
  "pagesTotal": 42
}
```
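Processing is asynchronous, so the status endpoint is typically polled until `phase` reaches `complete`. A minimal polling sketch follows. The function names and the one-second interval are assumptions, and the 60-attempt cap simply mirrors the `[3/60]` counter in the example output under "Full script". Only the `idle` and `complete` phases appear on this page, so any failure phase the API may report is not handled here.

```bash
# Fetch the raw status JSON for a document (network call).
fetch_status() {
  curl -sS --fail "https://api.okrapdf.com/document/$1/status" \
    -H "Authorization: Bearer $OKRA_API_KEY"
}

# Hypothetical helper: pull the "phase" string out of a status response.
status_phase() {
  sed -n 's/.*"phase"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until phase=complete or the attempt limit is reached.
# POLL_MAX_ATTEMPTS and POLL_INTERVAL are optional overrides (assumed names).
# Usage: wait_until_complete "$DOC_ID"
wait_until_complete() {
  local attempt=1 phase
  while [ "$attempt" -le "${POLL_MAX_ATTEMPTS:-60}" ]; do
    phase=$(fetch_status "$1" | status_phase)
    echo "[$attempt/${POLL_MAX_ATTEMPTS:-60}] phase=${phase:-unknown}"
    [ "$phase" = "complete" ] && return 0
    sleep "${POLL_INTERVAL:-1}"
    attempt=$((attempt + 1))
  done
  return 1
}
```

The function returns non-zero when the limit is reached without seeing `complete`, so a pipeline can stop instead of waiting indefinitely.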

## Standard upload (simple path)

One request. The Worker receives the PDF bytes and handles everything.

### From a URL

```bash theme={null}
curl -X POST https://api.okrapdf.com/document/$DOC_ID/upload-url \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com/report.pdf"}'
```

The Worker fetches the PDF from the URL, stores it, and starts processing. This is simple but slower, because the Worker must download the file and then upload it to R2.

### From a local file (binary)

```bash theme={null}
curl -X POST https://api.okrapdf.com/document/$DOC_ID/upload \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/pdf" \
  -H "X-File-Name: report.pdf" \
  --data-binary @report.pdf
```

The entire PDF passes through the Worker. For large files this adds latency.

### Two-step presign through DO

```bash theme={null}
# Step 1: Get presigned URL (creates DO)
curl -X POST https://api.okrapdf.com/document/$DOC_ID/presign-upload \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"fileName":"report.pdf"}'

# Step 2: Upload to R2
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

# Step 3: Confirm
curl -X POST https://api.okrapdf.com/document/$DOC_ID/confirm-upload \
  -H "Authorization: Bearer $OKRA_API_KEY"
```

This is the older presign flow. It creates a Durable Object (DO) at step 1 to store metadata, whereas the newer `/presign` endpoint creates no DO until the confirm step.

## When to use which

**Use presign (`/presign`)** when:

* You want the lowest possible upload latency
* You're building a CLI tool or pipeline
* Files are large (>10MB) and you don't want them passing through a Worker
* You want the docId before the DO exists (e.g. for optimistic UI)

**Use standard upload (`/upload-url`)** when:

* You're uploading from a public URL (the server fetches it)
* You're using the SDK (`okra.sessions.create(url)`)
* Simplicity matters more than speed
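In a script, the rules above can be reduced to a small dispatch function. This is only one reading of the guidance: `choose_upload_path` is a hypothetical helper, ">10MB" is interpreted as 10 × 1024 × 1024 bytes, and small local files are routed to `/upload` on the basis of the comparison table in the overview.

```bash
# Hypothetical helper: suggest an upload path for a URL or a local file.
# Prints "upload-url", "presign", or "upload".
choose_upload_path() {
  local limit=$((10 * 1024 * 1024))  # ">10MB" read as 10 MiB (assumption)
  local size
  case "$1" in
    http://*|https://*) echo "upload-url"; return 0 ;;
  esac
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  size=$(( $(wc -c < "$1") ))  # arithmetic expansion trims wc's padding
  if [ "$size" -gt "$limit" ]; then
    echo "presign"
  else
    echo "upload"
  fi
}
```

Callers that care most about latency can skip the size check and use `/presign` for every local file.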

## Full script

See [`scripts/presign-upload-example.sh`](https://github.com/steventsao/agent-session/blob/main/scripts/presign-upload-example.sh) for a runnable bash script that does all four steps with timing.

```bash theme={null}
./scripts/presign-upload-example.sh ./quarterly-report.pdf
```

Output:

```
1. Requesting presigned URL...
   docId:   doc-8db94ee67e004e20bcf36d65f2677a1a
   time:    0.082s

2. Uploading quarterly-report.pdf (1503780 bytes) to R2...
   time:    0.327s

3. Confirming upload (creates DO, starts processing)...
   phase:   idle
   workflow: lifecycle-doc-8db94ee6...-1772158807620
   time:    2.058s

4. Polling status...
   [3/60] phase=complete    progress=42/42

=== TIMING ===
  Presign:  0.082s  (stateless Worker)
  Upload:   0.327s  (direct to R2, 1503780 bytes)
  Confirm:  2.058s  (DO created)
  Total:    2.467s  (until processing starts)
```
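The `Total` line is the sum of the first three steps: 0.082 + 0.327 + 2.058 = 2.467 s. One way to collect comparable numbers yourself is curl's `-w '%{time_total}'` write-out variable, sketched below. `timed_curl` and `sum_times` are hypothetical helper names, and the linked script may measure time differently.

```bash
# Run a curl request, discard the response body, and print total seconds.
# Best suited to the PUT step, whose response body is not needed.
# Usage: timed_curl -X PUT "$UPLOAD_URL" --data-binary @quarterly-report.pdf
timed_curl() {
  curl -sS --fail -o /dev/null -w '%{time_total}\n' "$@"
}

# Add per-step timings and print the sum with millisecond precision.
sum_times() {
  printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.3f\n", total }'
}

# Example with the figures from the output above:
sum_times 0.082 0.327 2.058   # prints 2.467
```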
