
# Docling

> Parse PDFs locally with IBM's Docling, then deploy to OkraPDF for chat, extraction, and page rendering. Parsing never leaves your machine.

## Why Docling + OkraPDF

[Docling](https://github.com/docling-project/docling) is IBM's open-source document parser. It runs entirely on your machine: no API keys, no cloud calls, no per-page pricing. Its TableFormer model targets complex table extraction (merged cells, nested headers, spanning rows).

Once you've parsed a document, you often need to share it — give a colleague a link to the extracted tables, let a client chat with the report, or serve the structured data via API to downstream systems.

OkraPDF handles that layer. Upload your PDF, deploy Docling's extraction, and get:

* Chat completions over the document
* Structured extraction with JSON schemas
* Page images with bounding box overlays
* Deterministic URLs for every page, table, and figure
* Collection queries across multiple documents

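Parsing happens on your machine. The ingest step sends only structured text and coordinates; the PDF itself is uploaded solely so OkraPDF can render page images, and that upload is optional (see [Data sovereignty](#data-sovereignty)).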

```
PDF bytes ──► [your machine: Docling] ──► structured JSON
                                              │
                                              ▼
                                    [OkraPDF: store + serve]
                                         │
                              ┌──────────┼──────────┐
                              ▼          ▼          ▼
                           chat     page images   API URLs
```

## Install

```bash theme={null}
# Docling (Python)
pip install docling requests

# OkraPDF API key
export OKRA_API_KEY=okra_...
```

<Note>
  Docling requires Python 3.10+ and \~4 GB RAM for the layout + table models.
  First run downloads models from HuggingFace (\~500 MB).
</Note>

## How it works

The integration is a three-step pipeline:

1. **Upload** the PDF to OkraPDF with `skip_parse=true` — stores the file for page rendering, but skips OCR. No extraction charges.
2. **Parse** the PDF locally with Docling — `DocumentConverter().convert()` returns a `DoclingDocument` with text, tables, figures, and bounding boxes.
3. **Ingest** the Docling output into the OkraPDF document — replaces the extraction layer. The document is now live with chat, search, and API access.

## Full example

```python theme={null}
import os
import sys

import requests
from docling.document_converter import DocumentConverter

API_URL = "https://api.okrapdf.com"
API_KEY = os.environ["OKRA_API_KEY"]
PDF_PATH = sys.argv[1]  # e.g. "quarterly-report.pdf"

# ── Step 1: Upload PDF (skip_parse — no OCR charge) ─────────────

with open(PDF_PATH, "rb") as f:
    resp = requests.post(
        f"{API_URL}/v1/documents?skip_parse=true",
        files={"file": (os.path.basename(PDF_PATH), f, "application/pdf")},
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    resp.raise_for_status()
    doc_id = resp.json()["documentId"]

print(f"Uploaded: {doc_id}")

# ── Step 2: Parse locally with Docling ───────────────────────────

result = DocumentConverter().convert(PDF_PATH)
doc_dict = result.document.export_to_dict()

print(f"Parsed: {len(doc_dict.get('pages', {}))} pages, "
      f"{len(doc_dict.get('texts', []))} texts, "
      f"{len(doc_dict.get('tables', []))} tables")

# ── Step 3: Send raw Docling JSON — server handles everything ────

resp = requests.post(
    f"{API_URL}/document/{doc_id}/ingest",
    json={"data": doc_dict, "vendor": "docling", "mode": "replace"},
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
)
resp.raise_for_status()

print("\nDocument live:")
print(f"  Chat:       {API_URL}/document/{doc_id}/chat/completions")
print(f"  Markdown:   {API_URL}/v1/documents/{doc_id}/full.md")
print(f"  Page 1:     {API_URL}/v1/documents/{doc_id}/pg_1.png")
print(f"  Page 1 md:  {API_URL}/v1/documents/{doc_id}/pg_1.md")
```

No client-side mapping needed. You send the raw `export_to_dict()` output and OkraPDF's server-side Docling plugin handles bbox conversion (BOTTOMLEFT → 0-1 relative), table cell restructuring (flat grid → row/cell hierarchy), and label passthrough. The raw Docling JSON is also stored verbatim for auditability.

## What Docling extracts

Docling's output includes structured labels and bounding boxes for every element:

| Docling label                 | What it is                      |
| ----------------------------- | ------------------------------- |
| `text`                        | Body paragraph                  |
| `section_header`              | Section heading                 |
| `title`                       | Document title                  |
| `list_item`                   | Bulleted or numbered list entry |
| `table`                       | Structured table with cell grid |
| `picture` / `chart`           | Figure with optional caption    |
| `footnote`                    | Footnote text                   |
| `page_header` / `page_footer` | Running headers and footers     |
| `key_value_region`            | Key-value pair (forms)          |
| `formula`                     | Mathematical formula            |
| `code`                        | Code block                      |

All labels are passed through to OkraPDF as-is. OkraPDF maps them to canonical types at the rendering boundary — you always get the original Docling label in the API response.
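
To see which labels a given document actually produces before you ingest it, you can tally the `label` field across the dict returned by `export_to_dict()`. The sketch below assumes the items live under the `texts`, `tables`, and `pictures` keys (the same keys the full example counts) and that each item carries a `label` string. `count_labels` is an illustrative helper, not part of Docling or OkraPDF.

```python
from collections import Counter


def count_labels(doc_dict: dict) -> Counter:
    """Tally Docling labels across the item lists of an exported document dict."""
    counts: Counter = Counter()
    # Assumption: items are stored under these top-level keys, each with a "label".
    for key in ("texts", "tables", "pictures"):
        for item in doc_dict.get(key, []):
            counts[item.get("label", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Hand-built stand-in for result.document.export_to_dict()
    sample = {
        "texts": [
            {"label": "section_header", "text": "Summary"},
            {"label": "text", "text": "Revenue grew."},
            {"label": "text", "text": "Costs fell."},
        ],
        "tables": [{"label": "table"}],
        "pictures": [],
    }
    for label, n in count_labels(sample).most_common():
        print(f"{label}: {n}")
```

The resulting counts can be compared against the `type` counts in the snapshot export described under "Verify it's lossless".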

## Bounding box conversion

Docling reports bounding boxes with a **BOTTOMLEFT** origin in absolute page units (PDF points, as in the 612x792 page below). OkraPDF uses **0-1 relative** coordinates with a top-left origin.

The conversion flips the Y axis and normalizes by page dimensions:

```python theme={null}
# Docling: l=72, t=720, r=300, b=700 on a 612x792 page (bottom-left origin)
l, t, r, b = 72, 720, 300, 700
page_width, page_height = 612, 792

# OkraPDF: x=0.118, y=0.091, w=0.373, h=0.025 (top-left origin, 0-1 relative)
x = l / page_width                   # 72/612 = 0.118
y = (page_height - t) / page_height  # (792-720)/792 = 0.091
w = (r - l) / page_width             # (300-72)/612 = 0.373
h = (t - b) / page_height            # (720-700)/792 = 0.025
```
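
The same arithmetic wrapped in a function, as a minimal sketch. OkraPDF performs this conversion server-side, so you don't need it for ingest; it may be handy for checking coordinates locally. `to_relative_bbox` is a hypothetical helper, and it assumes the input box uses a bottom-left origin with `t` above `b`.

```python
def to_relative_bbox(l: float, t: float, r: float, b: float,
                     page_width: float, page_height: float) -> dict:
    """Convert a bottom-left-origin box to 0-1 relative x/y/w/h with a top-left origin."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    return {
        "x": l / page_width,
        "y": (page_height - t) / page_height,  # flip Y: distance from the top edge
        "w": (r - l) / page_width,
        "h": (t - b) / page_height,            # t > b when the origin is bottom-left
    }


box = to_relative_bbox(72, 720, 300, 700, page_width=612, page_height=792)
print({k: round(v, 3) for k, v in box.items()})
# → {'x': 0.118, 'y': 0.091, 'w': 0.373, 'h': 0.025}
```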

## Table structure

Docling's TableFormer model extracts table cells as a flat array with row/column grid indices:

```json theme={null}
{
  "table_cells": [
    {"text": "Revenue", "start_row_offset_idx": 0, "start_col_offset_idx": 0},
    {"text": "$10M",    "start_row_offset_idx": 0, "start_col_offset_idx": 1},
    {"text": "Profit",  "start_row_offset_idx": 1, "start_col_offset_idx": 0},
    {"text": "$2M",     "start_row_offset_idx": 1, "start_col_offset_idx": 1}
  ]
}
```

OkraPDF's server-side Docling plugin groups these into its `table > row > cell` hierarchy:

```json theme={null}
{
  "type": "table",
  "children": [
    {"type": "row", "children": [
      {"type": "cell", "value": "Revenue"},
      {"type": "cell", "value": "$10M"}
    ]},
    {"type": "row", "children": [
      {"type": "cell", "value": "Profit"},
      {"type": "cell", "value": "$2M"}
    ]}
  ]
}
```
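
One way to perform that grouping, as a rough sketch: bucket cells by `start_row_offset_idx`, then order each row by `start_col_offset_idx`. `group_table_cells` is a hypothetical helper for illustration only. It ignores row and column spans, and OkraPDF's actual server-side implementation may differ.

```python
from collections import defaultdict


def group_table_cells(table_cells: list[dict]) -> dict:
    """Turn a flat list of grid-indexed cells into a table > row > cell tree."""
    rows: dict[int, list[dict]] = defaultdict(list)
    for cell in table_cells:
        rows[cell["start_row_offset_idx"]].append(cell)

    return {
        "type": "table",
        "children": [
            {
                "type": "row",
                "children": [
                    {"type": "cell", "value": c["text"]}
                    # Order cells left to right within the row.
                    for c in sorted(rows[r], key=lambda c: c["start_col_offset_idx"])
                ],
            }
            for r in sorted(rows)  # Order rows top to bottom.
        ],
    }
```

Feeding it the four `table_cells` shown above yields the two-row structure in the second JSON block, regardless of the order the cells arrive in.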

## Using with the CLI

If you already have a Docling JSON output file, use the CLI to upload and ingest separately:

```bash theme={null}
# Upload PDF (no parsing)
okra upload report.pdf --skip-parse
# → doc-abc123...

# Ingest Docling output
curl -X POST https://api.okrapdf.com/document/doc-abc123/ingest \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d @docling-output.json
```
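
Note that the Python example posts the Docling dict inside an envelope with `data`, `vendor`, and `mode` fields. Assuming the endpoint expects the same envelope from curl, the file passed to `-d @...` needs that wrapper too. A small sketch for producing it from a raw Docling export follows; `wrap_for_ingest`, `write_payload`, and the file names are illustrative.

```python
import json
from pathlib import Path


def wrap_for_ingest(docling_dict: dict, mode: str = "replace") -> dict:
    """Wrap a raw Docling export in the envelope used by the Python example."""
    return {"data": docling_dict, "vendor": "docling", "mode": mode}


def write_payload(src: str, dst: str) -> None:
    """Read a raw Docling JSON file and write the wrapped ingest payload."""
    raw = json.loads(Path(src).read_text(encoding="utf-8"))
    Path(dst).write_text(json.dumps(wrap_for_ingest(raw)), encoding="utf-8")
```

For example, `write_payload("docling-raw.json", "docling-output.json")` would produce a file suitable for the curl command above, under the assumption stated.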

## Data sovereignty

This pattern gives you full control over where PDF bytes are processed:

| Step              | Where it runs          | What's sent                   |
| ----------------- | ---------------------- | ----------------------------- |
| PDF parsing       | Your machine (Docling) | Nothing — fully local         |
| Upload            | OkraPDF API            | PDF bytes (for page images)   |
| Ingest            | OkraPDF API            | Structured text + coordinates |
| Chat / extraction | OkraPDF edge           | Queries only                  |

For maximum privacy, you can skip the PDF upload entirely and use `POST /v1/documents/ingest` to create a document from structured data alone — but you won't get page images or PDF download.

## Verify it's lossless

OkraPDF stores the raw Docling JSON server-side and preserves the original labels, so no blocks are relabeled or dropped at ingest. You can check this by comparing the snapshot export against your local Docling output (set `DOC_ID` to the document ID returned by the upload step):

```bash theme={null}
# 1. Check the snapshot — raw Docling types preserved as-is
curl -s -H "Authorization: Bearer $OKRA_API_KEY" \
  "https://api.okrapdf.com/exports/$DOC_ID/snapshot" | python3 -c "
import sys, json
d = json.load(sys.stdin)
types = {}
has_bbox = 0
for page in d['pages']:
    for b in page['blocks']:
        types[b['type']] = types.get(b['type'], 0) + 1
        if b.get('bbox'): has_bbox += 1
total = sum(types.values())
print(f'Total blocks: {total}, with bbox: {has_bbox}')
print('Types:')
for t, c in sorted(types.items(), key=lambda x: -x[1]):
    print(f'  {t}: {c}')
"
```

Example output for a 2-page resume:

```
Total blocks: 169, with bbox: 169
Types:
  text: 79
  list_item: 48
  section_header: 41
  picture: 1
```

Notice the types are Docling's raw labels (`section_header`, `list_item`) — not mapped to generic types. OkraPDF resolves these to canonical types only at the rendering boundary (markdown export, chat context), so the original fidelity is always available via the API.

```bash theme={null}
# 2. Compare block count: local vs deployed
python3 -c "
from docling.document_converter import DocumentConverter
result = DocumentConverter().convert('report.pdf')
doc = result.document.export_to_dict()
local = len(doc.get('texts', [])) + len(doc.get('tables', [])) + len(doc.get('pictures', []))
print(f'Local Docling blocks: {local}')
"

curl -s -H "Authorization: Bearer $OKRA_API_KEY" \
  "https://api.okrapdf.com/exports/$DOC_ID/snapshot" | \
  python3 -c "
import sys, json
d = json.load(sys.stdin)
deployed = sum(len(p['blocks']) for p in d['pages'])
print(f'Deployed OkraPDF blocks: {deployed}')
"
```

If the counts match and the types are raw Docling labels, no blocks were dropped or relabeled during ingest.
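
Matching totals can hide offsetting differences (for example, one extra `text` and one missing `list_item`), so a per-label comparison is a slightly stronger check. The sketch below assumes the snapshot shape used above (`pages[].blocks[].type`) and that local items carry a `label` field under `texts`, `tables`, and `pictures`. `diff_label_counts` is an illustrative helper, not part of either tool.

```python
from collections import Counter


def diff_label_counts(doc_dict: dict, snapshot: dict) -> dict[str, tuple[int, int]]:
    """Return {label: (local_count, deployed_count)} for labels whose counts differ."""
    local = Counter(
        item.get("label", "unknown")
        for key in ("texts", "tables", "pictures")
        for item in doc_dict.get(key, [])
    )
    deployed = Counter(
        block["type"] for page in snapshot["pages"] for block in page["blocks"]
    )
    return {
        label: (local[label], deployed[label])
        for label in sorted(local.keys() | deployed.keys())
        if local[label] != deployed[label]
    }
```

An empty result means every label appears the same number of times locally and in the snapshot. It does not compare text content or coordinates.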

## Standalone example

A complete standalone script is available at [`examples/docling-ingest.py`](https://github.com/okrapdf/okrapdf/blob/main/examples/docling-ingest.py).

## See also

* [Ingest API](/features/ingest-api) — reference for the ingest endpoint
* [Local Extraction + Redaction](/cookbook/local-redact-docling) — Docling + PII redaction
* [Batch Processing](/cookbook/batch-processing) — process multiple PDFs
