Overview
Branch creates a zero-copy fork of a document. Ingest with mode: "replace" supersedes
existing nodes on affected pages. Together they let you correct extraction errors in
isolation and compare results.
The problem
OCR textlayers sometimes get numbers wrong — dropped signs, lost precision. A completion
grounded in bad extraction data gives wrong answers. You need a way to fix the data without
mutating the original document.
Flow
Original doc ──→ branch ──→ ingest(replace) ──→ re-query
(0.6)%, 14.7%               14.8% (correct)     ✓ PASS
Full example
import { OkraClient } from 'okrapdf';

const client = new OkraClient({ apiKey: process.env.OKRA_API_KEY });
const docId = 'doc-abc123';

// 1. Ask the original: gets the wrong answer from the textlayer
const baseline = await client.generate(docId,
  'What was the effective tax rate in FY2022 vs FY2021?'
);
console.log(baseline.answer); // "(0.6)% and 14.7%" (wrong)

// 2. Branch (zero-copy fork, ~2s)
const branch = await client.request('/v1/documents/' + docId + '/branch', {
  method: 'POST',
});
const branchId = branch.id;

// 3. Ingest corrected table data on the branch
await client.request('/document/' + branchId + '/ingest', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    vendor: 'canonical',
    mode: 'replace', // supersedes existing nodes on affected pages
    data: {
      pages: [{
        pageNumber: 77,
        blocks: [{
          type: 'table',
          label: 'Tax Reconciliation',
          value: 'Income tax expense/(benefit) | $31 | (0.6)% | ($743) | 14.8%',
          children: [
            { type: 'row', children: [
              { type: 'cell', value: 'Income tax expense/(benefit)' },
              { type: 'cell', value: '$31' },
              { type: 'cell', value: '(0.6)%' },
              { type: 'cell', value: '($743)' },
              { type: 'cell', value: '14.8%' },
            ]}
          ]
        }]
      }]
    }
  }),
});

// 4. Wait for processing
await client.wait(branchId);

// 5. Re-query: gets the correct answer
const improved = await client.generate(branchId,
  'What was the effective tax rate in FY2022 vs FY2021?'
);
console.log(improved.answer); // "(0.6)% and 14.8%" (correct)
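The request body in step 3 states each row twice: once as a pipe-joined `value` string and once as `row`/`cell` children. A small helper can keep the two in sync. The following is a minimal sketch; `buildTableIngestBody` and its parameters are hypothetical and not part of the okrapdf SDK, and the payload shape is inferred only from the example above.

```typescript
// Hypothetical helper (not part of the okrapdf SDK): builds the JSON body used
// in step 3 from plain rows, so `value` and the row/cell children cannot drift.
type Cell = { type: 'cell'; value: string };
type Row = { type: 'row'; children: Cell[] };

interface TableBlock {
  type: 'table';
  label: string;
  value: string;
  children: Row[];
}

interface IngestBody {
  vendor: string;
  mode: 'append' | 'replace';
  data: { pages: { pageNumber: number; blocks: TableBlock[] }[] };
}

function buildTableIngestBody(
  pageNumber: number,
  label: string,
  rows: string[][],
  // This helper defaults to 'replace' because it targets corrections;
  // the API itself defaults to 'append'.
  mode: 'append' | 'replace' = 'replace',
): IngestBody {
  const children = rows.map((cells): Row => ({
    type: 'row',
    children: cells.map((value): Cell => ({ type: 'cell', value })),
  }));
  // Assumption: cells are joined with " | " as in the example; joining
  // multiple rows with newlines is a guess, since the example has one row.
  const value = rows.map((cells) => cells.join(' | ')).join('\n');
  return {
    vendor: 'canonical',
    mode,
    data: { pages: [{ pageNumber, blocks: [{ type: 'table', label, value, children }] }] },
  };
}

const ingestBody = buildTableIngestBody(77, 'Tax Reconciliation', [
  ['Income tax expense/(benefit)', '$31', '(0.6)%', '($743)', '14.8%'],
]);
```

The result could then be sent as `body: JSON.stringify(ingestBody)` in the ingest request.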
How mode: "replace" works
| Mode | Behavior |
| --- | --- |
| append (default) | New nodes are added alongside existing ones. |
| replace | Existing nodes on affected pages get status = 'superseded', then new nodes are hydrated. Completions only read non-superseded nodes. |
Superseded nodes are not deleted — they stay in the graph for audit trail purposes.
This follows the same append-only, never-overwrite pattern as vendor_log.
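To make the difference between the two modes concrete, here is a toy in-memory model of the behavior described above. It is not OkraPDF's implementation: the `GraphNode` shape, the `'active'` status name, and the function names are all assumptions made for illustration.

```typescript
// Toy model only. The real node graph and its storage are not shown here.
type NodeStatus = 'active' | 'superseded'; // 'active' is an assumed name

interface GraphNode {
  id: string;
  page: number;
  value: string;
  status: NodeStatus;
}

function ingest(
  graph: GraphNode[],
  incoming: { id: string; page: number; value: string }[],
  mode: 'append' | 'replace' = 'append',
): GraphNode[] {
  const affectedPages = new Set(incoming.map((n) => n.page));
  // In replace mode, existing nodes on affected pages are marked superseded.
  // They are kept in the returned graph, never removed.
  const existing = graph.map((node): GraphNode =>
    mode === 'replace' && affectedPages.has(node.page)
      ? { ...node, status: 'superseded' }
      : node,
  );
  const added = incoming.map((n): GraphNode => ({ ...n, status: 'active' }));
  return [...existing, ...added];
}

// Completions only read nodes that have not been superseded.
function readableNodes(graph: GraphNode[]): GraphNode[] {
  return graph.filter((node) => node.status !== 'superseded');
}

const original: GraphNode[] = [
  { id: 'n1', page: 77, value: '14.7%', status: 'active' },
  { id: 'n2', page: 78, value: 'unrelated', status: 'active' },
];
const corrected = ingest(original, [{ id: 'n3', page: 77, value: '14.8%' }], 'replace');
```

In this model `corrected` still holds all three nodes, but `readableNodes(corrected)` returns only the page 78 node and the new page 77 node, which mirrors the audit-trail property: old data is hidden from reads rather than deleted.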
Branch response
{
  "id": "doc-forked-...",
  "branched_from": "doc-abc123",
  "phase": "complete",
  "row_counts": {
    "meta": 26,
    "nodes": 380,
    "edges": 379,
    "page_ledger": 190,
    "vendor_log": 12,
    "document_log": 419
  }
}
The branch is immediately queryable — same phase, same nodes. Mutations on the branch
don’t affect the original.
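Before ingesting into a branch, it can be useful to confirm that the response really describes a completed fork of the expected document. The sketch below types the response and checks it at runtime. The `BranchResponse` interface is inferred from the sample response only, not from a published schema, and `isBranchResponse` and `isReadyBranchOf` are hypothetical names.

```typescript
// Shape inferred from the sample response; other fields or `phase` values
// may exist in practice.
interface BranchResponse {
  id: string;
  branched_from: string;
  phase: string;
  row_counts: Record<string, number>;
}

// Runtime type guard: checks only the fields shown in the sample.
function isBranchResponse(value: unknown): value is BranchResponse {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const counts = v.row_counts;
  if (typeof counts !== 'object' || counts === null) return false;
  const countMap = counts as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    typeof v.branched_from === 'string' &&
    typeof v.phase === 'string' &&
    Object.keys(countMap).every((key) => typeof countMap[key] === 'number')
  );
}

// Hypothetical check: the branch forks the expected document and is complete.
function isReadyBranchOf(value: unknown, sourceDocId: string): boolean {
  return (
    isBranchResponse(value) &&
    value.branched_from === sourceDocId &&
    value.phase === 'complete'
  );
}
```

A caller might run `isReadyBranchOf(branch, docId)` right after the branch request and stop early if it returns `false`.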
When to use this
Eval corrections: fix extraction errors on specific pages to measure impact on downstream completions
A/B testing vendors: branch, re-ingest with a different vendor’s output, compare answers
Human-in-the-loop: reviewer corrects a table, ingests the fix on a branch, promotes if better
Safe experimentation: try schema changes or re-extraction without risking production data
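For eval corrections and vendor comparisons, the final step is usually to compare the answer from the original document with the answer from the branch. One simple way to do this for numeric answers is sketched below: extract the percentage tokens from each answer and report the positions where they differ. `extractPercentages` and `diffPercentages` are hypothetical helpers, not part of the okrapdf SDK, and the regular expression only covers formats like `14.8%`, `-0.6%`, and `(0.6)%`.

```typescript
// Pulls percentage tokens such as "14.8%" or "(0.6)%" out of an answer string.
function extractPercentages(answer: string): string[] {
  return answer.match(/\(?-?\d+(?:\.\d+)?\)?%/g) ?? [];
}

interface Difference {
  index: number;
  baseline: string | undefined;
  candidate: string | undefined;
}

// Compares the two answers token by token, in order of appearance.
function diffPercentages(baseline: string, candidate: string): Difference[] {
  const a = extractPercentages(baseline);
  const b = extractPercentages(candidate);
  const diffs: Difference[] = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    if (a[i] !== b[i]) diffs.push({ index: i, baseline: a[i], candidate: b[i] });
  }
  return diffs;
}

// Answers taken from the full example on this page.
const answerDiffs = diffPercentages('(0.6)% and 14.7%', '(0.6)% and 14.8%');
console.log(answerDiffs);
```

An empty result means the branch did not change any percentage in the answer, which is itself a useful signal when measuring whether a correction had any downstream effect.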
Live demo
See the interactive version at ingest-branch-demo.pages.dev.
Ingest API: Push pre-parsed vendor output into OkraPDF.
Structured Extraction: Extract typed data from documents with Zod schemas.