Overview
Every OkraPDF document can have an Eval Agent attached. When enabled, it evaluates each chat completion asynchronously — checking for hallucinated facts, compliance violations, or custom policy rules you define.
The eval never blocks the response. Results appear in the document event log within seconds.
User question → Completion runs → Response sent immediately
                       │
                       ▼ (async, via queue)
               EvalAgent evaluates
                       │
                       ▼
                 Alerts logged
Enable eval on a document
curl -X PUT https://api.okrapdf.com/document/$DOC_ID/config/eval \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "instructions": "Flag any response that cites numbers, dates, or facts not found in the document."
  }'
Response:
{
  "document_id": "doc-abc123",
  "spec_version": 1,
  "eval": {
    "enabled": true,
    "scope": "document",
    "instructions": "Flag any response that cites numbers, dates, or facts not found in the document.",
    "maxRecentTurns": 5
  }
}
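The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The helper name `build_eval_request` is hypothetical and not part of any SDK; the endpoint, headers, and body fields are taken from the curl example above. The helper only constructs the request, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

API_BASE = "https://api.okrapdf.com"  # base URL from the curl example above


def build_eval_request(doc_id: str, api_key: str, instructions: str,
                       enabled: bool = True) -> urllib.request.Request:
    """Build (but do not send) the PUT request that configures eval.

    build_eval_request is a hypothetical helper name, not an official API.
    """
    body = json.dumps({"enabled": enabled, "instructions": instructions})
    return urllib.request.Request(
        url=f"{API_BASE}/document/{doc_id}/config/eval",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(req)` and decode the response body with `json.load`.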
How it works
- You chat normally: POST /document/:id/chat/completions returns without waiting for the eval, so no latency is added.
- Three hooks fire asynchronously via the document’s internal queue:
  - turn.before: evaluates the user query before the LLM runs
  - tool.execute.after: evaluates tool call results
  - turn.after: evaluates the final response against the document
- EvalAgent judges each event using a fast LLM (Haiku or Kimi-K2.5), with your instructions as the evaluation criteria.
- Results are logged to the document event log, viewable via the API or the info page.
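The ordering described above can be illustrated with a toy in-memory model. This is a sketch of the idea only, not OkraPDF's implementation: `handle_completion`, `drain_eval_queue`, and the `judge` callback are hypothetical names, a `collections.deque` stands in for the document's internal queue, and the answer is passed in rather than generated.

```python
from collections import deque
from typing import Callable

eval_queue: deque = deque()  # stand-in for the document's internal queue
event_log: list = []         # stand-in for the document event log


def handle_completion(query: str, answer: str, tool_results: list) -> str:
    """Enqueue one eval event per hook, then return the answer right away."""
    eval_queue.append(("turn.before", query))
    for result in tool_results:
        eval_queue.append(("tool.execute.after", result))
    eval_queue.append(("turn.after", answer))
    return answer  # no judge has run yet, so the caller never waits on eval


def drain_eval_queue(judge: Callable[[str, str], str]) -> None:
    """Process queued events later, outside the request path."""
    while eval_queue:
        hook, payload = eval_queue.popleft()
        # Store (hook, verdict) pairs; the real log format is not modeled here.
        event_log.append((hook, judge(hook, payload)))
```

The point of the split is that `handle_completion` only appends to the queue, so a slow or failing judge can affect `drain_eval_queue` but never the response.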
Check eval results
curl "https://api.okrapdf.com/document/$DOC_ID/events?limit=10" \
  -H "Authorization: Bearer $OKRA_API_KEY"
Example entries:
[
  {
    "event": "log",
    "detail": {
      "message": "[EvalAgent] turn.after completed: 2 action(s)"
    }
  },
  {
    "event": "log",
    "detail": {
      "message": "[EvalAgent] info: Response correctly reports that revenue data is not in the document. No hallucinated figures detected."
    }
  }
]
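When scanning the event log from a script, it can help to keep only the EvalAgent entries. The sketch below assumes the response is a JSON array shaped like the example entries above, and that EvalAgent messages always start with the `[EvalAgent]` prefix. `eval_messages` is a hypothetical helper name, and the second sample entry is made up for illustration.

```python
import json

PREFIX = "[EvalAgent] "  # prefix seen in the example entries; assumed stable


def eval_messages(events: list) -> list:
    """Return EvalAgent log messages with the prefix removed."""
    messages = []
    for entry in events:
        if entry.get("event") != "log":
            continue
        message = entry.get("detail", {}).get("message", "")
        if message.startswith(PREFIX):
            messages.append(message[len(PREFIX):])
    return messages


# Sample input: the first entry mirrors the example above, the second is invented.
sample = json.loads("""[
  {"event": "log", "detail": {"message": "[EvalAgent] turn.after completed: 2 action(s)"}},
  {"event": "log", "detail": {"message": "unrelated entry"}}
]""")
print(eval_messages(sample))  # ['turn.after completed: 2 action(s)']
```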
Guardrail examples
Hallucination detection (financial documents)
curl -X PUT .../config/eval -d '{
"enabled": true,
"instructions": "Flag any response that cites dollar amounts, percentages, or financial metrics not explicitly stated in the document. Be strict — estimates or inferred values must be flagged."
}'
Source accuracy (legal/compliance)
curl -X PUT .../config/eval -d '{
"enabled": true,
"instructions": "Verify that every claim in the response has a direct source in the document. Flag any response that paraphrases in a way that changes the meaning. Flag missing citations or page references."
}'
PII leakage prevention
curl -X PUT .../config/eval -d '{
"enabled": true,
"instructions": "Flag if the response contains Social Security numbers, account numbers, or personal addresses from the document. These should be redacted, not exposed in chat responses."
}'
Scope enforcement (narrow the agent)
curl -X PUT .../config/eval -d '{
"enabled": true,
"instructions": "This document is an employee handbook. Flag any response that answers questions outside the scope of HR policies, benefits, and workplace procedures. The agent should decline off-topic questions."
}'
Tone and brand voice
curl -X PUT .../config/eval -d '{
"enabled": true,
"instructions": "Flag responses that use casual language, slang, or first-person voice. All responses should be professional and written in third-person."
}'
Configuration options
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Turn eval on/off |
| scope | "document" \| "user" | "document" | Eval context scope: per-document or per-user turn history |
| instructions | string | (none) | Natural-language evaluation criteria |
| model | object | auto | Override the eval model (see below) |
| maxRecentTurns | number | 5 | How many recent turns to include as context |
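A client-side check against the table above can catch typos before a request is sent. This is a minimal sketch: `validate_eval_config` is a hypothetical helper, the rules are read off the table only, and the server's actual validation may differ. In particular, requiring `maxRecentTurns` to be a positive integer is an assumption, since the table only says "number".

```python
def validate_eval_config(config: dict) -> list:
    """Return a list of problems found in an eval config dict (empty if none)."""
    problems = []
    allowed = {"enabled", "scope", "instructions", "model", "maxRecentTurns"}
    for key in config:
        if key not in allowed:
            problems.append(f"unknown field: {key}")
    if "enabled" in config and not isinstance(config["enabled"], bool):
        problems.append("enabled must be a boolean")
    if config.get("scope", "document") not in ("document", "user"):
        problems.append('scope must be "document" or "user"')
    if "instructions" in config and not isinstance(config["instructions"], str):
        problems.append("instructions must be a string")
    if "model" in config and not isinstance(config["model"], dict):
        problems.append("model must be an object")
    turns = config.get("maxRecentTurns", 5)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(turns, bool) or not isinstance(turns, int) or turns < 1:
        problems.append("maxRecentTurns must be a positive integer (assumed)")
    return problems
```

For example, a misspelled key such as `enable` is reported as an unknown field instead of being silently sent.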
Custom eval model
By default, EvalAgent uses Claude Haiku (if ANTHROPIC_API_KEY is set) or Kimi-K2.5 via OpenRouter. To override the default, set the model field:
curl -X PUT .../config/eval -d '{
  "enabled": true,
  "instructions": "...",
  "model": {
    "provider": "anthropic",
    "model": "claude-haiku-4-5-20251001"
  }
}'
Or use OpenRouter for any model:
curl -X PUT .../config/eval -d '{
  "enabled": true,
  "instructions": "...",
  "model": {
    "provider": "openrouter",
    "model": "google/gemini-2.5-flash"
  }
}'
Disable eval
curl -X PUT .../config/eval -d '{"enabled": false}'
Architecture
EvalAgent is a separate Durable Object that runs independently from the document’s completion handler.
- No latency impact: eval events are written to the document’s internal queue during completion, then processed asynchronously in a separate Durable Object (DO) wake.
- Durable: queued eval events survive DO hibernation. If the eval LLM is slow, events retry with exponential backoff.
- Scoped: each document (or each user, if scope is "user") gets its own EvalAgent instance with its own turn history.
- Fail-open: if the eval LLM errors or times out, the completion is unaffected. Errors are logged, never surfaced to the user.
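Retry with exponential backoff combined with fail-open behavior can be sketched as follows. The retry count, base delay, and function names are illustrative assumptions; the source does not state the actual values or how the Durable Object schedules retries.

```python
import time
from typing import Callable, Optional


def run_eval_fail_open(judge: Callable[[], str],
                       max_attempts: int = 4,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep,
                       log: Callable[[str], None] = print) -> Optional[str]:
    """Call judge(), retrying with exponential backoff; never raise.

    Returns the verdict, or None if every attempt failed (fail-open).
    """
    for attempt in range(max_attempts):
        try:
            return judge()
        except Exception as exc:  # eval errors are logged, not propagated
            log(f"eval attempt {attempt + 1} failed: {exc}")
            if attempt < max_attempts - 1:
                # Delay doubles each time: base, 2*base, 4*base, ...
                sleep(base_delay * 2 ** attempt)
    return None
```

Injecting `sleep` and `log` keeps the sketch easy to exercise without real waiting; a caller that receives `None` simply has no verdict to record, and the completion itself is untouched.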
See also
- Chat — document chat completions
- Output Schema — structured extraction with validation