Dynamic Agent Workflows

Dynamic agent workflows are available for internal use and private beta. For stable production extraction, use structured extraction or built-in invoice workflows.

Overview

Dynamic agent workflows let you express a document workflow as plain JavaScript:

agent() runs one prompted agent call and requires a JSON Schema return contract.
parallel() runs multiple agent calls behind a fan-out barrier.
pipeline() maps work across pages, files, rows, providers, or other input arrays.
phase() marks a named stage boundary for traceability and review.

The JavaScript between those calls is controller code. Use it for branching, loops, scoring, consensus, budget checks, and accept-or-abstain decisions. The agents do not validate the workflow definition themselves; they just run their task and return JSON that matches the schema you gave them.

When to use it

Use dynamic agent workflows when a single extraction prompt is not enough:

Pattern	Example
Provider A/B tests	Run text layer, VLM OCR, and a vendor parser on the same pages, then judge discrepancies.
N-provider fan-out	Compare multiple extraction providers against one typed schema.
Risk-controlled OCR	Probe the same page under multiple views and accept only consensus outputs.
Human review splits	Run detect, emit review artifacts, then apply changes in a second phase after approval.
Policy-controlled runs	Enforce user budget, privacy, provider, and mutation limits around every agent/tool call.

Authoring grammar

Every script is a JavaScript controller. It can include loader-readable metadata, controller variables, loops, branches, and return.

export const meta = {
  name: "Filing A/B extractor",
  description: "Compare text-layer and visual extraction before returning facts.",
  phases: [
    { title: "Extract", detail: "Run independent probes." },
    { title: "Judge", detail: "Merge and flag discrepancies." }
  ]
};

meta must be a pure object literal when present. Use it for human-facing names and phase descriptions; do not put runtime logic in it. Every real unit of work should be an agent() call with a task prompt and a JSON Schema under the schema key.

export const meta = {
  name: "Financial metric review",
  description: "Extract, compare, and judge financial metrics."
};

const SUMMARY = {
  type: "object",
  required: ["metrics", "risks"],
  properties: {
    metrics: {
      type: "array",
      items: {
        type: "object",
        required: ["label", "value", "page"],
        properties: {
          label: { type: "string" },
          value: { type: "string" },
          page: { type: "integer" }
        }
      }
    },
    risks: {
      type: "array",
      items: { type: "string" }
    }
  }
};

phase("Extract");

const precision = agent(
  "Extract only clearly stated financial metrics from args.document_id.",
  { label: "precision-extract", schema: SUMMARY }
);

const completeness = agent({
  label: "completeness-extract",
  prompt: "Extract all likely financial metrics from args.document_id. Mark uncertain values in risks.",
  schema: SUMMARY
});

const [a, b] = await parallel([precision, completeness]);

phase("Judge");

const judged = await agent(
  `Compare these two extractions and return the best merged result:
precision=${JSON.stringify(a)}
completeness=${JSON.stringify(b)}`,
  { label: "judge", schema: SUMMARY }
);

return judged;

The contract key is always schema. Do not use returns, output_schema, or json_schema.

Controller API

API	Shape	Notes
`args`	`const docId = args.document_id`	Run inputs passed from `run_workflow` or `POST /v1/runs`.
`phase(label)`	`phase("Extract")`	Marks the current stage for events and UI traces.
`agent(prompt, options)`	`await agent("Extract rows.", { label, schema })`	Runs one schema-contracted worker agent.
`agent(options)`	`await agent({ prompt, label, schema, phase })`	Object style, useful for generated scripts.
`parallel(items)`	`await parallel([() => agent(...), () => agent(...)])`	Fan-out barrier. Prefer thunks so the parallel group owns when work starts.
`pipeline(items, mapper, options)`	`await pipeline(pages, page => agent(...), { concurrency: 4 })`	Sequential by default. `concurrency` is capped at 16.
`log(...items)`	`log("judge input", summary)`	Adds trace logs without becoming a workflow step.
`return value`	`return judged`	Becomes `controller_output` on the run.

Readiness model

There are two workflow types:

Type	How readiness works
Finite catalog definitions	Okra can recompute readiness from `definition.steps`, so `validateWorkflowDefinitionReadiness(...)` applies.
Agent workflow scripts	Okra treats the script as code. Readiness is `ready` or `parse_error`, based on the loader/analyzer result. There is no `valid` or `invalid` verdict from a finite-DAG validator.

The static planned field is only an estimate from AST analysis. It can count literal phase(), agent(), and parallel() calls, but loops and pipeline() can emit more or fewer agents at runtime. Run failures are execution state, not blueprint validity. If the controller throws before any agent() starts, the run can fail with no completed agent steps. If an agent returns invalid JSON, the run records the agent failure and schema errors.

Run from MCP

Connect your client to the OkraPDF MCP server, then use:

Tool	Purpose
`draft_workflow`	Save the workflow source and return its workflow id, readiness, AST, visualization, and static agent analysis.
`run_workflow`	Start a run for a saved workflow.
`view_workflow_run`	Read run status, controller output, per-agent outputs, events, logs, and errors.
`view_workflow`	Inspect a workflow blueprint before running it. For agent scripts, `validation` is `null` and readiness comes from the script analysis.

Example agent instruction:

Draft a workflow named "filing A/B extractor" with this JavaScript source.
Every agent() call must include schema. Do not run it yet.

Then:

Run workflow wf_... with inputs { "document_id": "doc-..." } and show view_workflow_run output.

Run from the API

Create a workflow resource from a JavaScript source file:

cat > workflow.js <<'JS'
export const meta = {
  name: "Smoke agent workflow",
  description: "Minimal dynamic agent workflow smoke test."
};

phase("Smoke");

const result = await agent("Return { ok: true }.", {
  label: "smoke",
  schema: {
    type: "object",
    required: ["ok"],
    properties: {
      ok: { type: "boolean" }
    }
  }
});

return result;
JS

curl -X POST https://api.okrapdf.com/v1/workflows \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --rawfile code workflow.js '{
    name: "Smoke agent workflow",
    code: $code
  }')"

Run it:

curl -X POST https://api.okrapdf.com/v1/runs \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "workflow_id": "wf_...",
    "inputs": {
      "document_id": "doc-..."
    }
  }'

Check the run:

curl https://api.okrapdf.com/v1/runs/run_... \
  -H "Authorization: Bearer $OKRA_API_KEY"

Run trace

Runs return both final output and a trace of agentic events. The event stream is the product-level provenance layer for debugging and review.

{
  "object": "run",
  "status": "completed",
  "execution_mode": "agent_workflow_script",
  "controller_output": { "ok": true },
  "agents": [
    {
      "id": "agent_1",
      "label": "smoke",
      "status": "completed",
      "output": { "ok": true }
    }
  ],
  "events": [
    { "type": "run.started" },
    { "type": "phase.entered", "phase": "Smoke" },
    { "type": "agent.started", "agent_id": "agent_1" },
    { "type": "schema.validated", "agent_id": "agent_1" },
    { "type": "agent.completed", "agent_id": "agent_1" },
    { "type": "run.completed" }
  ]
}

Typical event types include run.started, phase.entered, parallel.started, agent.started, schema.validated, agent.completed, agent.failed, log, run.completed, and run.failed.

Parallel patterns

For simple fan-out, pass agent promises:

const [textLayer, vision] = await parallel([
  agent("Extract using the PDF text layer.", { label: "text-layer", schema: PAGE_TEXT }),
  agent("Extract by visually reading the page image.", { label: "vlm", schema: PAGE_TEXT })
]);

For stronger provenance around a parallel group, pass functions so the controller can tag all branch agents with the group:

const [textLayer, vision] = await parallel([
  () => agent("Extract using the PDF text layer.", { label: "text-layer", schema: PAGE_TEXT }),
  () => agent("Extract by visually reading the page image.", { label: "vlm", schema: PAGE_TEXT })
]);

Then compare the branches:

const discrepancies = await agent(
  `Compare text-layer and VLM extraction. Flag missing text, substitutions, numbers, dates, and bbox disagreements.
text_layer=${JSON.stringify(textLayer)}
vlm=${JSON.stringify(vision)}`,
  { label: "discrepancy-review", schema: DISCREPANCIES }
);

Current boundary

Dynamic agent workflows are ready for internal workflows, demos, MCP-driven authoring, and playground A/B tests. They are not yet the public untrusted-user sandbox for arbitrary long-running code. Under the hood, Okra stores the author-facing script, analyzes its AST for visualization, runs host-managed agents with strict JSON contracts, and returns traceable events. The next production-hardening step is lowering each agent() call into durable Cloudflare Workflow step.do() boundaries while keeping the same authoring grammar.

​Overview

​When to use it

​Authoring grammar

​Controller API

​Readiness model

​Run from MCP

​Run from the API

​Run trace

​Parallel patterns

​Current boundary

Overview

When to use it

Authoring grammar

Controller API

Readiness model

Run from MCP

Run from the API

Run trace

Parallel patterns

Current boundary