# Multi-Document Analysis

> Compare, aggregate, and analyze data across multiple PDFs with `okra chat`

## Overview

`okra chat` with `--doc` or `-c` runs a lead agent that queries multiple documents simultaneously. Pass comma-separated job IDs to `--doc`, or name a collection with `-c`.

```bash
okra chat "compare the revenue figures" --doc ocr-abc123,ocr-def456
```

## Basic: compare two documents

```bash
okra chat "compare the revenue figures in these two reports" --doc ocr-abc123,ocr-def456
```

The agent will:

1. Query each document for revenue data
2. Synthesize a side-by-side comparison
3. Return a unified answer with citations

## Compare 3+ documents from your recent jobs

```bash
# Grab the 5 most recent completed jobs and compare them
JOBS=$(okra jobs list --jq '[.[] | select(.status=="completed")] | .[0:5] | map(.job_id) | join(",")' | tr -d '"')
okra chat "what are the common themes across these documents?" --doc "$JOBS"
```
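If the job IDs already live in a shell array rather than coming from `okra jobs list`, the comma-separated `--doc` value can be built locally. The sketch below is one way to do that; `join_ids` is a hypothetical helper (not part of the okra CLI), and the IDs are the placeholders used elsewhere on this page.

```bash
# Hypothetical helper: join all arguments with commas.
join_ids() {
  local IFS=,
  printf '%s\n' "$*"
}

DOCS=(ocr-abc123 ocr-def456 ocr-ghi789)   # placeholder job IDs
DOC_ARG=$(join_ids "${DOCS[@]}")
echo "$DOC_ARG"

# Then pass the joined value to okra chat:
# okra chat "what are the common themes across these documents?" --doc "$DOC_ARG"
```

Quoting `"$DOC_ARG"` keeps the list as a single argument even if an ID contains unusual characters.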

## Use a collection

Collections group related documents (e.g., all quarterly reports for one company).

```bash
# Create a collection
okra collections create "TSMC 2024 Filings"

# Add documents to it
okra collections add <collection-id> ocr-abc123 ocr-def456 ocr-ghi789

# Query the entire collection
okra chat "summarize the key financial metrics across all filings" -c "TSMC 2024 Filings"
```

## Targeted questions

The agent is most effective with specific questions:

```bash
# Good: specific data point
okra chat "what is the total revenue for each company in FY2024?" --doc ocr-a,ocr-b,ocr-c

# Good: comparison
okra chat "compare the risk factors sections — what risks appear in one but not the other?" --doc ocr-a,ocr-b

# Good: aggregation
okra chat "calculate the average gross margin across all reports" --doc ocr-a,ocr-b,ocr-c,ocr-d

# Less useful: vague
okra chat "tell me about these" --doc ocr-a,ocr-b
```

## How it works under the hood

When you pass multiple documents, `okra chat` sends a request to the orchestrator endpoint, which:

1. **Starts a lead agent** (Kimi K2.5) with a `query_document` tool
2. **Fans out** questions to each DocumentAgent DO's `/completion` endpoint
3. **Collects** results from each document
4. **Synthesizes** findings into a unified answer

Each document query runs against the DO's local SQLite database, so page reads are sub-millisecond and require no external database hops.
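The fan-out, collect, and synthesize steps can be illustrated with a local simulation. This is only a sketch of the pattern, not the orchestrator's implementation: `query_document` here is a stub that echoes a canned line instead of calling a `/completion` endpoint, `fan_out` and `synthesize` are hypothetical names, and the "synthesis" is plain concatenation rather than an LLM call.

```bash
# Stub standing in for the lead agent's query_document tool (assumption:
# the real tool calls the document's /completion endpoint).
query_document() {
  local doc_id="$1" question="$2"
  echo "[$doc_id] answer to: $question"
}

# Fan out one question to every document in parallel, then collect
# the results in the order the documents were given.
fan_out() {
  local question="$1"; shift
  local tmp doc
  tmp=$(mktemp -d)
  for doc in "$@"; do
    query_document "$doc" "$question" > "$tmp/$doc.txt" &
  done
  wait   # block until every per-document query has finished
  for doc in "$@"; do
    cat "$tmp/$doc.txt"
  done
  rm -rf "$tmp"
}

# Placeholder synthesis: label and indent the collected findings.
synthesize() {
  echo "Combined findings:"
  sed 's/^/  - /'
}

fan_out "total revenue?" ocr-abc123 ocr-def456 | synthesize
```

Writing each result to its own file keeps parallel outputs from interleaving, which is why the collect step can print them in a stable order.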

## Quiet mode for scripting

```bash
# Plain-text output, no spinners
okra chat "compare revenue" --doc ocr-a,ocr-b --quiet > result.txt

# Stream clean output directly to another command/file
okra chat "compare revenue" --doc ocr-a,ocr-b --quiet | tee result.txt
```
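Quiet mode also makes it straightforward to batch several questions against the same documents. The following is a minimal sketch, assuming `okra` is on your `PATH`; `slugify` and `ask` are hypothetical helpers, and the job IDs are placeholders.

```bash
#!/usr/bin/env bash
set -euo pipefail

DOCS="ocr-a,ocr-b"   # placeholder job IDs

# Turn a question into a safe file name, e.g. "Compare revenue!" -> "compare-revenue".
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//'
}

# Ask one question and save the plain-text answer to <slug>.txt.
ask() {
  local question="$1"
  okra chat "$question" --doc "$DOCS" --quiet > "$(slugify "$question").txt"
}

# Only run the batch when the CLI is actually installed.
if command -v okra >/dev/null 2>&1; then
  ask "compare revenue"
  ask "compare gross margin"
fi
```

Each answer lands in its own file (`compare-revenue.txt`, `compare-gross-margin.txt`), so later steps in a pipeline can pick up individual results.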

## Follow-up queries

Rerun against the same document set with a refined prompt:

```bash
okra chat "summarize the key risks" --doc ocr-a,ocr-b,ocr-c
okra chat "now compare only liquidity and cash flow risk" --doc ocr-a,ocr-b,ocr-c
```
