
What it is

okrapdf/okrapdf-redact-skill is an open-source Codex skill for local PDF redaction. It does three things in one pass:
  • detects visible sensitive values on page images
  • writes normalized bbox JSON
  • burns black-box redactions into a new PDF
The current default model is google/gemini-3.1-flash-lite-preview via OpenRouter. Use Qwen when you want a second opinion or behavior closer to OkraPDF’s Qwen-heavy extraction stack.
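The normalized bbox JSON has to be mapped back to page pixels before a box can be drawn. The sketch below shows one way to do that. It assumes each box is stored as `[x0, y0, x1, y1]` with values in the 0 to 1 range and a top-left origin; the skill's actual schema may differ, so check `<name>.redactions.json` first. The function name `bbox_to_pixels` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def bbox_to_pixels(
    bbox: Sequence[float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized box to integer pixel coordinates.

    ASSUMPTION: bbox is [x0, y0, x1, y1] in the 0..1 range with a
    top-left origin. This layout is illustrative, not the skill's
    confirmed format.
    """
    # Clamp stray values so a slightly out-of-range model output
    # cannot produce coordinates outside the page.
    x0, y0, x1, y1 = (min(max(float(v), 0.0), 1.0) for v in bbox)

    # Tolerate swapped corners.
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))

    # Round outward so the black box never undercovers the value.
    return (
        math.floor(left * page_width),
        math.floor(top * page_height),
        math.ceil(right * page_width),
        math.ceil(bottom * page_height),
    )


if __name__ == "__main__":
    print(bbox_to_pixels([0.25, 0.5, 0.75, 1.0], 800, 600))
```

Rounding outward is a deliberate choice for redaction: a box that is one pixel too large is harmless, while one that is one pixel too small can leak part of a character.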

Why use it

Use this skill when you need bbox-grounded redaction of real PDFs rather than text masking in markdown or completion context. Each run produces:
  • <name>.redacted.vlm.pdf
  • <name>.redactions.json
  • <name>.redaction-preview/
The preview PNGs are the source of truth. Open them before trusting the final PDF.
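A small pre-review check can confirm that all three outputs exist and list the preview PNGs to open. This is a sketch under two assumptions the source does not confirm: that outputs are written next to the input PDF, and that `<name>` is the input filename without its `.pdf` extension. The helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def expected_outputs(pdf_path: Path) -> Dict[str, Path]:
    """Build the three output paths for a given input PDF.

    ASSUMPTION: outputs sit beside the input and <name> is the
    input's stem (e.g. form.pdf -> form.redacted.vlm.pdf).
    """
    parent, name = pdf_path.parent, pdf_path.stem
    return {
        "redacted_pdf": parent / f"{name}.redacted.vlm.pdf",
        "boxes_json": parent / f"{name}.redactions.json",
        "preview_dir": parent / f"{name}.redaction-preview",
    }


def missing_outputs(pdf_path: Path) -> List[str]:
    """Return the labels of any expected outputs that do not exist."""
    return [
        label
        for label, path in expected_outputs(pdf_path).items()
        if not path.exists()
    ]


def preview_pages(pdf_path: Path) -> List[Path]:
    """List preview PNGs in sorted order, or [] if there are none."""
    preview_dir = expected_outputs(pdf_path)["preview_dir"]
    if not preview_dir.is_dir():
        return []
    return sorted(preview_dir.glob("*.png"))


if __name__ == "__main__":
    pdf = Path("/path/to/file.pdf")
    print("missing:", missing_outputs(pdf))
    for page in preview_pages(pdf):
        print("review:", page)
```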

Install

Clone the repo and install the Python dependencies:
git clone https://github.com/okrapdf/okrapdf-redact-skill
cd okrapdf-redact-skill
python3 -m pip install -r requirements.txt
Set credentials:
export OPENROUTER_API_KEY=...
export OKRA_API_KEY=...
OKRA_API_KEY can also come from ~/.okra/config.json if the okra CLI is already authenticated.
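The lookup order described above (environment variable, then config file) can be sketched as follows. The precedence of the environment variable and the JSON field name `api_key` are assumptions; the source only says the key "can also come from" the config file and does not document its layout. `resolve_okra_api_key` is a hypothetical helper, not part of the skill.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG = Path.home() / ".okra" / "config.json"


def resolve_okra_api_key(
    config_path: Path = DEFAULT_CONFIG,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the Okra API key, or None if it cannot be found."""
    env = os.environ if env is None else env

    # 1. ASSUMPTION: an explicitly exported variable wins.
    key = env.get("OKRA_API_KEY")
    if key:
        return key

    # 2. Fall back to the CLI config file, if present and readable.
    try:
        data = json.loads(config_path.read_text())
    except (OSError, json.JSONDecodeError):
        return None

    # ASSUMPTION: the key is stored under "api_key". Inspect your
    # own config.json to confirm the real field name.
    if isinstance(data, dict):
        return data.get("api_key") or None
    return None


if __name__ == "__main__":
    found = resolve_okra_api_key()
    print("OKRA_API_KEY available" if found else "OKRA_API_KEY not found")
```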

Run

python3 scripts/redact_pdf.py /path/to/file.pdf
Pick a model explicitly:
python3 scripts/redact_pdf.py /path/to/file.pdf \
  --model google/gemini-3.1-flash-lite-preview
Or switch to Qwen:
python3 scripts/redact_pdf.py /path/to/file.pdf \
  --model qwen/qwen3-vl-235b-a22b-instruct

Notes

  • Gemini Lite currently works well for tight value boxes on government forms.
  • The skill keeps one best box per field type per page.
  • The default path drops obvious Gemini receipt-number duplicates in the top barcode/header strip.
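The "one best box per field type per page" rule can be illustrated with a short sketch. It assumes each detection is a dict with `page`, `field_type`, and `confidence` keys and that "best" means highest confidence; the skill may rank boxes differently, so treat this as an illustration of the grouping idea rather than the actual implementation.

```python
from typing import Dict, Iterable, List, Tuple


def keep_best_per_field(detections: Iterable[dict]) -> List[dict]:
    """Keep one detection per (page, field_type) pair.

    ASSUMPTION: the keys "page", "field_type", and "confidence" are
    hypothetical, and "best" is taken to mean highest confidence.
    """
    best: Dict[Tuple[int, str], dict] = {}
    for det in detections:
        key = (det["page"], det["field_type"])
        current = best.get(key)
        # Strict ">" keeps the first detection when scores tie.
        if current is None or det["confidence"] > current["confidence"]:
            best[key] = det
    # Return in a stable page/field order for easier review.
    return sorted(best.values(), key=lambda d: (d["page"], d["field_type"]))


if __name__ == "__main__":
    sample = [
        {"page": 1, "field_type": "receipt_number", "confidence": 0.61},
        {"page": 1, "field_type": "receipt_number", "confidence": 0.93},
        {"page": 1, "field_type": "date_of_birth", "confidence": 0.88},
        {"page": 2, "field_type": "receipt_number", "confidence": 0.70},
    ]
    for det in keep_best_per_field(sample):
        print(det)
```

A consequence of this rule is that a field type appearing twice on one page, such as two different receipt numbers, yields only one box, which is another reason to check the preview PNGs.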
For the skill internals and prompt contract, see the repo: