
Architecture & privacy

ModelPilot is split into two parts so that sensitive data cannot reach us. A local proxy classifies each request on your box; a hosted brain returns the routing decision from nothing more than a task category and numeric features.

Request flow

your app ──▶ local proxy ──▶ Anthropic API ──▶ your app
                 │  (classifies locally; forwards with YOUR key)
                 ▼
            ModelPilot brain   ◀── category + token counts only
            (floors · economics · gate · entitlement · mode)

  1. Your app calls the proxy instead of api.anthropic.com (one line: the base URL).
  2. The proxy classifies the request locally into a task category and extracts numeric features (token estimates, flags) — no prompt text leaves the process.
  3. It asks the brain for a decision, sending only the category + features. The brain applies the per-category floors, switch economics, the confidence gate, your entitlement, and your mode.
  4. The proxy forwards the (possibly cheaper) request to Anthropic with your key and returns the response — adding x-modelpilot-decision / x-modelpilot-routed headers.
  5. If the brain is unreachable, the proxy fails open: the request goes straight to Anthropic, unrouted.
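
Steps 2 through 5 can be sketched roughly as follows. This is a minimal illustration, not the actual proxy: `classify_locally`, `handle`, the payload field names, the `model` and `reason` fields of the decision, and the header values are all assumptions made for the example. The brain and the upstream API are passed in as plain callables so the control flow, including the fail-open path, is easy to follow.

```python
from dataclasses import dataclass, asdict
from typing import Callable


@dataclass(frozen=True)
class Features:
    category: str          # task category chosen by the local classifier
    input_tokens_est: int  # rough token estimate, never the text itself
    has_tools: bool        # boolean flag


def classify_locally(request: dict) -> Features:
    """Toy stand-in for the local classifier (keyword heuristic, purely illustrative)."""
    text = " ".join(str(m.get("content", "")) for m in request.get("messages", []))
    category = "classification" if "classify" in text.lower() else "general"
    return Features(category, max(1, len(text) // 4), bool(request.get("tools")))


def handle(request: dict,
           ask_brain: Callable[[dict], dict],
           forward: Callable[[dict], dict]) -> tuple[dict, dict]:
    """Return (upstream response, extra response headers)."""
    features = classify_locally(request)                       # step 2
    # Only the category, numeric features, and requested model are sent (step 3).
    payload = {**asdict(features), "requested_model": request["model"]}
    try:
        decision = ask_brain(payload)
    except OSError:                                            # step 5: fail open
        # Brain unreachable: forward the original request unchanged.
        # Returning no extra headers here is an assumption of this sketch.
        return forward(request), {}
    model = decision.get("model", request["model"])
    headers = {
        "x-modelpilot-decision": str(decision.get("reason", "")),
        "x-modelpilot-routed": "true" if model != request["model"] else "false",
    }
    return forward({**request, "model": model}), headers       # step 4
```

Catching `OSError` covers Python's built-in `ConnectionError` and `TimeoutError`; a real client would likely also handle its HTTP library's own exception types.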

What leaves your box — and what never does

Stays local                   │ Sent to ModelPilot
──────────────────────────────┼────────────────────────────────────────────────────────
Prompt text & system prompts  │ Task category (e.g. classification)
Model outputs / completions   │ Token-count estimates & boolean flags (has-tools, etc.)
Your Anthropic API key        │ Requested model + your deployment id / API key
Any customer data             │ Aggregate savings dollars & counts (for billing)

Defense in depth: the brain and metering/telemetry endpoints reject any payload that contains prompt/output/secret-looking keys (HTTP 422) — even though the client already guarantees aggregates only.
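
To make the rejection rule concrete, here is a hedged sketch of how a payload could be screened for forbidden keys. The deny-list (`FORBIDDEN_NAMES`, `SECRET_HINTS`) and the function names are hypothetical; the text above says only that prompt, output, and secret-looking keys are rejected with HTTP 422, so the actual matching rules may differ.

```python
from typing import Any, Iterator

# Assumed deny-list: exact key names that suggest prompt or output text...
FORBIDDEN_NAMES = {"prompt", "system", "messages", "content", "completion", "output", "text"}
# ...and substrings that suggest a credential.
SECRET_HINTS = ("secret", "password", "api_key")


def forbidden_keys(payload: Any, path: str = "") -> Iterator[str]:
    """Yield the path of every key that looks like prompt, output, or secret data."""
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else str(key)
            name = str(key).lower()
            if name in FORBIDDEN_NAMES or any(hint in name for hint in SECRET_HINTS):
                yield here
            # Keep walking so nested offenders are reported too.
            yield from forbidden_keys(value, here)
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            yield from forbidden_keys(item, f"{path}[{index}]")


def screen(payload: dict) -> tuple[int, list[str]]:
    """Return (status, offending keys): 422 if any forbidden key is present, else 200."""
    bad = list(forbidden_keys(payload))
    return (422, bad) if bad else (200, [])
```

A payload holding only a category and numeric features passes, while one with a nested `prompt` key is refused:

```python
screen({"category": "classification", "input_tokens_est": 812, "has_tools": False})
# (200, [])
screen({"category": "classification", "features": {"prompt": "hi"}})
# (422, ['features.prompt'])
```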

Metering & billing

Your local ledger records counts and dollars (never prompt text). The client periodically reports the delta of realized savings to the console as aggregates; your bill each cycle is 20% of the realized savings reported for that cycle. You can audit every figure in the dashboard, where it is recomputed from the actual tokens each request used.
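
The delta reporting and the 20% fee can be illustrated with a small worked example. This is a sketch under assumptions: the `SavingsLedger` class, its method names, the way savings are computed (baseline cost minus actual cost), and rounding to the cent are all hypothetical; only the 20% rate comes from the text above.

```python
from decimal import Decimal, ROUND_HALF_UP

FEE_RATE = Decimal("0.20")  # from the text: the bill is 20% of realized savings


class SavingsLedger:
    """Local ledger holding counts and dollars only, never prompt text."""

    def __init__(self) -> None:
        self.total = Decimal("0")      # cumulative realized savings
        self.reported = Decimal("0")   # portion already sent to the console
        self.requests = 0

    def record(self, baseline_cost: str, actual_cost: str) -> None:
        # Assumed definition: savings = cost of the requested model - cost actually paid.
        self.total += Decimal(baseline_cost) - Decimal(actual_cost)
        self.requests += 1

    def take_delta(self) -> Decimal:
        """Return savings accrued since the last report and mark them as reported."""
        delta = self.total - self.reported
        self.reported = self.total
        return delta


def cycle_bill(deltas: list[Decimal]) -> Decimal:
    """Fee for one billing cycle: 20% of the reported deltas, rounded to cents (assumed)."""
    total = sum(deltas, Decimal("0"))
    return (total * FEE_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Because each report carries only the difference since the previous one, a repeated call to `take_delta()` with no new requests in between returns zero, so the same savings are not counted twice.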

How routing protects quality

The client is inspectable

The proxy is a thin, publishable client: it contains the commodity classifier and the forwarding logic, but none of the routing IP (floors, price table, economics) — those stay server-side. You can read exactly what it sends.

© 2026 ModelPilot · krethikram@gmail.com