Overview
Every OkraPDF document can have an Eval Agent attached. When enabled, it evaluates each chat completion asynchronously, checking for hallucinated facts, compliance violations, or custom policy rules you define. The eval never blocks the response. Results appear in the document event log within seconds.
Enable eval on a document
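As a minimal sketch, an eval configuration might be assembled like this. Only the key names `enabled` and `instructions` come from the configuration table on this page. The helper `build_eval_config` is hypothetical, and the endpoint and HTTP method used to submit the configuration are not shown here, so the code stops at producing the JSON body.

```python
import json


def build_eval_config(instructions: str, enabled: bool = True) -> dict:
    """Build an eval config using the documented field names.

    Hypothetical helper: only the dictionary keys come from the docs.
    """
    # Assumption: an enabled eval needs criteria to judge against.
    if enabled and not instructions.strip():
        raise ValueError("instructions must be a non-empty string")
    return {"enabled": enabled, "instructions": instructions}


config = build_eval_config(
    "Flag any figure in the response that does not appear in the document."
)
payload = json.dumps(config, indent=2)  # JSON body, ready to send
print(payload)
```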
How it works
- You chat normally: POST /document/:id/chat/completions returns instantly, with no added latency.
- Three hooks fire asynchronously via the document’s internal queue:
  - turn.before: evaluates the user query before the LLM runs
  - tool.execute.after: evaluates tool call results
  - turn.after: evaluates the final response against the document
- EvalAgent judges each event using a fast LLM (Haiku or Kimi-K2.5), with your instructions as the evaluation criteria.
- Results are logged to the document event log, viewable via the API or the info page.
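The flow above can be illustrated with a small in-process simulation. This is a sketch under stated assumptions, not OkraPDF's implementation: an `asyncio.Queue` stands in for the document's internal queue, `eval_worker` stands in for EvalAgent, and all three hooks are enqueued in one go for brevity. The point it shows is that the completion returns before any eval has run.

```python
import asyncio

HOOKS = ("turn.before", "tool.execute.after", "turn.after")


async def eval_worker(queue: asyncio.Queue, log: list) -> None:
    """Drain queued eval events and record a result for each one."""
    while True:
        event = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for the eval LLM call
        log.append({"hook": event["hook"], "status": "evaluated"})
        queue.task_done()


async def chat_completion(queue: asyncio.Queue, query: str) -> str:
    """Return the answer immediately; eval events are only enqueued."""
    for hook in HOOKS:
        queue.put_nowait({"hook": hook, "query": query})
    return f"answer to: {query}"


async def main() -> tuple:
    queue: asyncio.Queue = asyncio.Queue()
    log: list = []
    worker = asyncio.create_task(eval_worker(queue, log))
    answer = await chat_completion(queue, "What was Q3 revenue?")
    evaluated_at_return = len(log)  # evals have not started yet
    await queue.join()              # wait for the background evals
    worker.cancel()
    return answer, evaluated_at_return, [entry["hook"] for entry in log]


answer, evaluated_at_return, hooks_logged = asyncio.run(main())
print(answer, evaluated_at_return, hooks_logged)
```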
Check eval results
Guardrail examples
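Each guardrail category in this section could be written as an `instructions` string. The wording below is illustrative only and is not taken from OkraPDF. The dictionary keys and the `eval_config_for` helper are hypothetical names chosen for this sketch.

```python
# Illustrative instruction strings, one per guardrail category.
# The wording is an assumption; adapt it to your own policy.
GUARDRAIL_INSTRUCTIONS = {
    "hallucination_detection": (
        "Flag any number, date, or named entity in the response that "
        "cannot be found in the document."
    ),
    "source_accuracy": (
        "Flag any quotation or clause reference that does not match the "
        "document text exactly."
    ),
    "pii_leakage": (
        "Flag responses that reveal personal data such as email addresses, "
        "phone numbers, or account numbers."
    ),
    "scope_enforcement": (
        "Flag responses that answer questions unrelated to this document."
    ),
    "tone_and_brand_voice": (
        "Flag responses that are informal, speculative, or off-brand."
    ),
}


def eval_config_for(guardrail: str) -> dict:
    """Wrap one instruction string in a config using the documented keys."""
    return {"enabled": True, "instructions": GUARDRAIL_INSTRUCTIONS[guardrail]}


print(eval_config_for("pii_leakage")["instructions"])
```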
Hallucination detection (financial documents)
Source accuracy (legal/compliance)
PII leakage prevention
Scope enforcement (narrow the agent)
Tone and brand voice
Configuration options
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Turn eval on/off |
| scope | "document" \| "user" | "document" | Eval context scope: per-document or per-user turn history |
| instructions | string | (none) | Natural language evaluation criteria |
| model | object | auto | Override the eval model (see below) |
| maxRecentTurns | number | 5 | How many recent turns to include as context |
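The table can be mirrored client-side to catch mistakes before a configuration is sent. This is a sketch: the field names and defaults come from the table, while the `EvalConfig` class, its validation rules, and the choice to represent the `auto` model default as `None` are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional

VALID_SCOPES = ("document", "user")


@dataclass
class EvalConfig:
    enabled: bool = False
    scope: str = "document"
    instructions: Optional[str] = None
    model: Optional[dict] = None  # None stands for the "auto" default
    maxRecentTurns: int = 5

    def __post_init__(self) -> None:
        if self.scope not in VALID_SCOPES:
            raise ValueError(f"scope must be one of {VALID_SCOPES}")
        # Assumption: a negative turn count is never meaningful.
        if not isinstance(self.maxRecentTurns, int) or self.maxRecentTurns < 0:
            raise ValueError("maxRecentTurns must be a non-negative integer")

    def to_payload(self) -> dict:
        """Drop unset optional fields so that defaults apply."""
        return {k: v for k, v in asdict(self).items() if v is not None}


defaults = EvalConfig().to_payload()
print(defaults)
```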
Custom eval model
By default, EvalAgent uses Claude Haiku (if ANTHROPIC_API_KEY is set) or Kimi-K2.5 via OpenRouter. To override this choice, set the model field.
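The selection order described above might be sketched as follows. Everything about the shape of the `model` object is an assumption: the `provider` and `model` keys and the identifier strings are placeholders, and `resolve_eval_model` is a hypothetical function. Treating Kimi-K2.5 as the fallback when the key is absent is a reading of the sentence above, not a confirmed rule.

```python
import os
from typing import Mapping, Optional


def resolve_eval_model(
    override: Optional[dict] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Pick the eval model: explicit override first, then key-based default."""
    if override:
        return dict(override)
    if env.get("ANTHROPIC_API_KEY"):
        # Placeholder identifier, not a real model ID.
        return {"provider": "anthropic", "model": "claude-haiku"}
    # Assumed fallback when no Anthropic key is configured.
    return {"provider": "openrouter", "model": "kimi-k2.5"}


print(resolve_eval_model(env={"ANTHROPIC_API_KEY": "sk-example"}))
print(resolve_eval_model(env={}))
```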
Disable eval
Architecture
EvalAgent is a separate Durable Object that runs independently from the document’s completion handler.
- No latency impact: eval events are written to the document’s internal queue during completion, then processed asynchronously in a separate DO wake.
- Durable: queued eval events survive DO hibernation. If the eval LLM is slow, events retry with exponential backoff.
- Scoped: each document (or user, if scope is "user") gets its own EvalAgent instance with its own turn history.
- Fail-open: if the eval LLM errors or times out, the completion is unaffected. Errors are logged, never surfaced to the user.
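The retry and fail-open behavior in the list above can be sketched in a few lines. This is an illustration under assumptions, not the EvalAgent source: the attempt count, base delay, and the `run_eval_fail_open` name are invented for the example, and a plain function call stands in for the eval LLM.

```python
import time
from typing import Callable, Optional


def run_eval_fail_open(
    judge: Callable[[], str],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    errors: Optional[list] = None,
) -> Optional[str]:
    """Retry `judge` with exponential backoff; return None instead of raising."""
    for attempt in range(max_attempts):
        try:
            return judge()
        except Exception as exc:
            if errors is not None:
                errors.append(repr(exc))  # logged, never raised to the caller
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return None  # fail-open: the completion is unaffected


calls = {"n": 0}


def flaky_judge() -> str:
    """Fail twice with a timeout, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("eval LLM timed out")
    return "pass"


delays: list = []
verdict = run_eval_fail_open(flaky_judge, sleep=delays.append)
print(verdict, delays)
```

Passing `sleep=delays.append` records the backoff schedule instead of waiting, which keeps the example fast to run.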
See also
- Chat — document chat completions
- Output Schema — structured extraction with validation