open-source OCR · self-host it · bring your own model
OCR as a building block.
Open source. Not a SaaS.
Parse a document into structured, confidence-scored data and relay it where the work actually happens. Deploy the scan worker to your own Cloudflare account, bring your own model — no subscription, no lock-in, no database.
// give your agent eyes that emit structured data and know where to send it
{
"scan_id": "scn_7f3a91c2",
"status": "ok",
"fields": { "merchant": "Blue Bottle", "total": 18.50 },
"confidence": { "merchant": 0.97, "total": 0.92 },
"needs_review": [],
"field_source": { "merchant": "rescued", "total": "deterministic" },
"raw_text": "Blue Bottle … TOTAL 18.50",
"meta": {
"engine": "ocr+rescue", "model": "claude-haiku-4-5",
"tokens": { "input": 2584, "output": 40, "provider_cost_usd": 0.0028 }, // on your key (BYO)
"scan_credits": 1,
"total_credits": 1 // just plumbing — model is on your key
}
} Open source. Self-host the whole thing.
The scan worker is public and MIT. Deploy it to your own Cloudflare account and you own the whole path — your keys, your infrastructure, your data. No database, no accounts, no billing. Or skip the ops and let us run it.
git clone https://github.com/parserelay/worker
cd worker && pnpm install
npx wrangler secret put ANTHROPIC_API_KEY # or OPENAI_API_KEY
npx wrangler secret put GLM_OCR_API_KEY # or MISTRAL_API_KEY
npx wrangler secret put API_KEYS # keys you accept
npx wrangler deploy # POST /v1/scan is live -
@parserelay/worker the scan core
Self-host POST /v1/scan on your own Cloudflare account — no database, your keys.
github ↗ -
@parserelay/core the contract
The scan request/response types — one envelope behind all three surfaces.
github ↗ -
@parserelay/mcp agents
The scan tool over MCP — stdio + streamable-HTTP, in any MCP host.
github ↗ -
@parserelay/scanner React
<DeadSimpleMicroScanner> — drop-in capture / upload → fields.
github ↗ -
@parserelay/client REST
Thin typed client for POST /v1/scan.
github ↗ -
@parserelay/angular Angular
The drop-in scanner component for Angular (v0).
github ↗
// every box is MIT — on npm and GitHub. bring your own keys; nothing phones home.
One request body. One envelope. Three ways in.
Move between sync, fire-and-forget relay, and an embedded component without changing how you build the request or parse the result.
- 01
MCP tool
Agents call
scannatively and get the envelope back inline.npx -y @parserelay/mcp - 02
REST
For everything else, plus the async relay path returning
202 accepted.POST /v1/scan - 03
Component
Drop-in capture/upload → fields. Angular is v0; React is a port.
<DeadSimpleMicroScanner />
Deterministic‑first. You pay for the hard fields.
Cheap OCR parses what it can for free. The expensive model is a rescue tool,
not the extraction tool: it only touches the fields the confidence gate flags, and it
reads the actual page image to do it. Unresolved fields come back null and
flagged, never hallucinated.
// most fields cost $0.00 · you only pay for the ones that fight back
- signed
- HMAC header your worker can trust.
- retried
- exponential backoff on delivery failure.
- idempotent
- a retried scan never double-posts.
- dead-lettered
- persistent failures land somewhere, not nowhere.
One contract. Five ways to read.
The engine is a parameter, not a product — same request body, same envelope.
Pick how the work gets done, or let auto decide. Cost scales with how hard the
document fights back, not with the mode you name.
| engine | how it reads | confidence | cost |
|---|---|---|---|
| autodefault |
Text layer → ocr+rescue; otherwise vision.
Best for: mixed or unknown inputs — let the service pick. | scored | $ |
| ocr |
OCR backend only — raw text parsed straight to fields, no LLM rescue. The OCR model bills per token, so it's cheap but not free.
Best for: labeled / structured text (forms, exports) where fields resolve straight from the text — or when you just want raw_text. It can't read an unlabeled receipt. | null * | ¢ |
| ocr+rescue | Deterministic parse; the model rescues only the flagged fields, reading the page image. Best for: partly-structured docs where most fields parse cleanly and only a few fight back. (If nothing parses, it rescues everything — reach for vision.) | scored | $ |
| ocr+check | The model structures every field from the OCR text, then self-checks (do the line items sum to the total?). Best for: dense, multi-line docs — invoices, receipts. Full structuring + a math check, and cheaper than vision since it reads text, not pixels. | scored | $$ |
| vision | The model reads the image directly. Best for: photos and scans with no usable text layer — and the most faithful on messy, unlabeled layouts. | scored | $$ |
// * ocr-only returns confidence: null by design — no LLM judges field correctness, so the missing score is itself the honest signal.
Don't want to run it? We'll host it.
Self-hosting is free and always will be. If you'd rather skip the ops, use the hosted API — we run the worker, you pay per scan in prepaid credits, bring your own keys, no subscription. Pricing, the token→credit map, and a cost calculator live on the hosted page.
Hosted pricing & calculator →