# Chat with Any PDF by URL

> One API call to ask questions about any PDF — no upload step, no parsing pipeline, no vector store.

## The problem

Talking to a PDF normally means stitching together a pipeline:

1. Download the file (handle redirects, auth, rate limits)
2. Parse it (OCR, layout detection, table extraction)
3. Chunk and embed the content
4. Build a system prompt within token limits
5. Send to an LLM and manage multi-turn state

That is five moving parts to build and operate before you can ask your first question.

## The shortcut

The **Resolve** endpoint collapses this into a single POST:

```bash theme={null}
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "What is the tl;dr?" }]
  }'
```

That's it. OkraPDF will:

1. **Detect the PDF** — rewrites `/abs/` to `/pdf/` for arXiv automatically
2. **Download and parse** with OCR + layout analysis
3. **Deduplicate** — same URL from the same tenant reuses the existing document
4. **Run the completion** against the parsed content
5. **Return an OpenAI-compatible response**

```json theme={null}
{
  "id": "chatcmpl-m0nfcbyuhu",
  "object": "chat.completion",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "This paper evaluates prompt caching strategies for long-running AI agents..."
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 13010, "completion_tokens": 849, "total_tokens": 13859 }
}
```
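To read this response in typed code, a minimal sketch could look like the following. The interface only covers the fields shown in the sample above, and `summarizeCompletion` is a hypothetical helper name, not part of any OkraPDF SDK.

```typescript
// Minimal types for the fields that appear in the sample response.
interface ChatCompletion {
  id: string;
  object: string;
  choices: {
    message: { role: string; content: string };
    finish_reason: string;
  }[];
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

// Hypothetical helper: pull out the answer text and the total token count.
function summarizeCompletion(c: ChatCompletion): { answer: string; totalTokens: number } {
  const first = c.choices[0];
  if (!first) throw new Error("completion has no choices");
  return { answer: first.message.content, totalTokens: c.usage.total_tokens };
}
```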

## How `wait_ms` works

The `wait_ms` parameter controls how long the server waits for ingestion before responding:

| Value      | Behavior                                                                 |
| ---------- | ------------------------------------------------------------------------ |
| `0`        | Return immediately — `202` if still processing, `200` if already indexed |
| `3000`     | Wait up to 3 seconds, then `200` or `202`                                |
| `30000`    | Wait up to 30 seconds (good default for most PDFs)                       |
| Omitted    | Uses server default (30s)                                                |
| `> 120000` | Clamped to 120s max                                                      |
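The table can be mirrored client-side to predict how long a request may block. This is a sketch only: `effectiveWaitMs` is a hypothetical helper, the server remains authoritative, and the handling of negative or `NaN` input is an assumption because the table does not cover it.

```typescript
const DEFAULT_WAIT_MS = 30_000; // server default when wait_ms is omitted (from the table)
const MAX_WAIT_MS = 120_000; // larger values are clamped to this (from the table)

// Hypothetical helper: the wait the server is expected to apply for a given wait_ms.
function effectiveWaitMs(waitMs?: number): number {
  if (waitMs === undefined) return DEFAULT_WAIT_MS;
  // Assumption: negative or NaN input is treated as 0 here; the docs do not say.
  if (Number.isNaN(waitMs) || waitMs < 0) return 0;
  return Math.min(waitMs, MAX_WAIT_MS);
}
```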

### Handling `202` (still processing)

If the document hasn't finished parsing, you get a `202`:

```json theme={null}
{
  "run_id": "run-doc-73fdc4f0...-1774096446741",
  "document_id": "doc-73fdc4f0d15a4472864629d69a4ce098",
  "status_url": "/document/doc-73fdc4f0.../status",
  "retry_after": 2
}
```

Poll `status_url` (a path relative to `https://api.okrapdf.com`) until the phase is `complete`, then retry your original request:

```bash theme={null}
# Check status
curl "https://api.okrapdf.com/document/doc-73fdc4f0.../status" \
  -H "Authorization: Bearer $OKRA_API_KEY"

# Retry when ready
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "What is the tl;dr?" }]
  }'
```

The retry reuses the already-parsed document, so it skips ingestion and goes straight to the completion.
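The polling step could be sketched in TypeScript as follows. Several details are assumptions rather than documented behavior: the status body is taken to expose a `phase` string field, `retry_after` is taken to be in seconds, and `status_url` is resolved against `https://api.okrapdf.com` as in the curl example. `waitUntilComplete` and `FetchLike` are hypothetical names; the fetch function is injected so the loop can be exercised without a network.

```typescript
interface PendingResponse {
  run_id: string;
  document_id: string;
  status_url: string;
  retry_after: number; // assumed to be seconds
}

// Narrow fetch-shaped type; the global `fetch` satisfies it.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function waitUntilComplete(
  pending: PendingResponse,
  apiKey: string,
  fetchFn: FetchLike,
  maxAttempts = 30,
): Promise<void> {
  // status_url is a path, so resolve it against the API origin.
  const url = new URL(pending.status_url, "https://api.okrapdf.com").toString();
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchFn(url, { headers: { Authorization: `Bearer ${apiKey}` } });
    if (!res.ok) throw new Error(`status check failed with HTTP ${res.status}`);
    // Assumption: the status body carries a `phase` field.
    const body = (await res.json()) as { phase?: string };
    if (body.phase === "complete") return;
    // Wait for the server-suggested interval before the next poll.
    await new Promise((resolve) => setTimeout(resolve, pending.retry_after * 1000));
  }
  throw new Error("document did not finish processing in time");
}
```

Once `waitUntilComplete(pending, apiKey, fetch)` resolves, re-send the original request body unchanged.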

## Streaming

Swap the endpoint to get a streaming response:

```bash theme={null}
curl -X POST "https://api.okrapdf.com/v1/resolve/ai-stream" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [{ "role": "user", "content": "Summarize key contributions" }]
  }'
```
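From TypeScript, the streamed body can be consumed incrementally. The wire format of `/v1/resolve/ai-stream` is not described here, so this sketch only assumes it is UTF-8 text and yields raw chunks without interpreting them. `readTextChunks` is a hypothetical helper name.

```typescript
// Yield decoded text chunks from a byte stream as they arrive.
async function* readTextChunks(body: ReadableStream<Uint8Array>): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact when split across chunks
      yield decoder.decode(value, { stream: true });
    }
    // Flush any bytes still buffered in the decoder.
    const tail = decoder.decode();
    if (tail) yield tail;
  } finally {
    reader.releaseLock();
  }
}
```

With a `fetch` response in hand, iterate with `for await (const chunk of readTextChunks(response.body))` after checking that `response.body` is not null.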

## Multi-turn follow-ups

Include prior messages for follow-up questions — same source URL reuses the document:

```bash theme={null}
curl -X POST "https://api.okrapdf.com/v1/resolve/chat/completions" \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "url", "url": "https://arxiv.org/abs/2601.06007" },
    "wait_ms": 30000,
    "messages": [
      { "role": "user", "content": "What is the tl;dr?" },
      { "role": "assistant", "content": "This paper evaluates prompt caching for AI agents..." },
      { "role": "user", "content": "What specific cost savings did they measure?" }
    ]
  }'
```
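Because the conversation history travels with every request, the client has to keep it. A minimal sketch of that bookkeeping, where `withFollowUp` is a hypothetical helper name:

```typescript
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Return a new history with the assistant's last reply and the next question appended.
// The input array is left untouched.
function withFollowUp(history: Message[], assistantReply: string, nextQuestion: string): Message[] {
  return [
    ...history,
    { role: "assistant", content: assistantReply },
    { role: "user", content: nextQuestion },
  ];
}
```

Send the returned array as `messages` alongside the same `source` object on the next call.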

## Source types

The resolve endpoint supports three source types:

### URL source

```json theme={null}
{ "source": { "type": "url", "url": "https://arxiv.org/pdf/1706.03762.pdf" } }
```

Allowed domains: `arxiv.org`, `hkexnews.hk` (more coming).
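A client-side pre-check can catch obviously unsupported URLs before a request is sent. This is a sketch under assumptions: `isLikelyAllowed` is a hypothetical helper, the list is copied from the line above and will go stale as domains are added, and accepting subdomains is a guess about server behavior. The API's own `422` response remains the source of truth.

```typescript
// Copied from the docs; the server-side allowlist is authoritative.
const ALLOWED_DOMAINS = ["arxiv.org", "hkexnews.hk"];

// Hypothetical pre-check: true if the URL's host is an allowed domain.
function isLikelyAllowed(rawUrl: string): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  // Assumption: subdomains of an allowed domain are also accepted.
  return ALLOWED_DOMAINS.some((d) => host === d || host.endsWith(`.${d}`));
}
```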

### Filing source

```json theme={null}
{ "source": { "type": "filing", "exchange": "NASDAQ", "ticker": "AAPL", "slug": "10-K-2023" } }
```

Looks up a pre-indexed filing by exchange, ticker, and slug.

### Public source

```json theme={null}
{ "source": { "type": "public_source", "id": "your-public-source-id" } }
```

Points to a shared, pre-indexed document available to all tenants.
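In TypeScript, the three shapes above map naturally onto a discriminated union keyed on `type`. The field names come from the JSON examples; `Source` and `describeSource` are hypothetical names used for illustration.

```typescript
type Source =
  | { type: "url"; url: string }
  | { type: "filing"; exchange: string; ticker: string; slug: string }
  | { type: "public_source"; id: string };

// The compiler narrows `s` in each branch, so only valid fields are accessible.
function describeSource(s: Source): string {
  switch (s.type) {
    case "url":
      return `URL ${s.url}`;
    case "filing":
      return `${s.exchange}:${s.ticker} ${s.slug}`;
    case "public_source":
      return `public source ${s.id}`;
  }
}
```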

## Using with the OpenAI SDK

The response shape is OpenAI-compatible, so code that already consumes OpenAI chat completion responses can read it unchanged. The example below calls the endpoint with plain `fetch`:

```typescript theme={null}
const response = await fetch('https://api.okrapdf.com/v1/resolve/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.OKRA_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: { type: 'url', url: 'https://arxiv.org/abs/2601.06007' },
    wait_ms: 30000,
    messages: [{ role: 'user', content: 'What are the key findings?' }],
  }),
});

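// A 202 means the document is still processing, so there are no `choices` yet
if (response.status === 202) {
  throw new Error('Document still processing; poll status_url and retry');
}
if (!response.ok) {
  throw new Error(`Resolve request failed with HTTP ${response.status}`);
}

const data = await response.json();
console.log(data.choices[0].message.content);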
```

## Error handling

| Status | Code                      | Meaning                            |
| ------ | ------------------------- | ---------------------------------- |
| `400`  | `INVALID_SOURCE`          | Missing or malformed source object |
| `401`  | —                         | Missing or invalid API key         |
| `404`  | `FILING_NOT_READY`        | Filing not found or not indexed    |
| `404`  | `PUBLIC_SOURCE_NOT_FOUND` | Public source missing or disabled  |
| `422`  | `UNSUPPORTED_SOURCE_URL`  | URL domain not in allowlist        |
| `502`  | `URL_INGEST_START_FAILED` | Failed to start document ingestion |
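The table can be turned into a small lookup for logging or user-facing messages. This sketch takes the status and code as plain arguments because the shape of the error body is not documented here; `explainError` is a hypothetical helper name.

```typescript
// Meanings copied from the error table.
const ERROR_MEANINGS: Record<string, string> = {
  INVALID_SOURCE: "Missing or malformed source object",
  FILING_NOT_READY: "Filing not found or not indexed",
  PUBLIC_SOURCE_NOT_FOUND: "Public source missing or disabled",
  UNSUPPORTED_SOURCE_URL: "URL domain not in allowlist",
  URL_INGEST_START_FAILED: "Failed to start document ingestion",
};

// Hypothetical helper: map an HTTP status and optional error code to a readable message.
function explainError(status: number, code?: string): string {
  // The table lists no code for 401, so match on status alone.
  if (status === 401) return "Missing or invalid API key";
  const meaning = code ? ERROR_MEANINGS[code] : undefined;
  return meaning ?? `Unexpected error (HTTP ${status})`;
}
```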
