How it works
One control loop, one registry, one rule: no license, no autonomous action. This page walks the whole system — from the proving ground that manufactures evidence to the fingerprint that voids a license the night the model changes.
The sandbox is live telemetry — a microservice flight-simulator you can deliberately break and heal, with parameterised severity. Splunk Cloud is the reasoning context, reached exclusively through the Splunk MCP Server. And Warrant itself is an MCP server, so external agents are gated by the same registry.
Also rendered as Mermaid in the repo: architecture_diagram.md
The ordering is the whole trick: by step 4 the agent has committed, in writing, to what the world will look like if it’s right. There is nothing to argue about at step 7.
| # | step | what happens | touchpoint |
|---|---|---|---|
| 1 | SENSE | Read live metrics: error_rate, db_connections, p95 latency | sandbox |
| 2 | CONTEXT | Pull real operational context from Splunk | MCP · splunk_run_query, saia_* |
| 3 | DIAGNOSE | A brain proposes one bounded action and a stated confidence | heuristic or Gemini |
| 4 | PREDICT | Commit to a falsifiable forecast band — a control limit learned from healthy data — before acting | statistics, not vibes |
| 5 | GATE | Reversible + in-blast-radius only; autonomous only with a valid license, else human approval | certification |
| 6 | ACT | Execute the bounded remediation | sandbox control API |
| 7 | VERIFY | Read the live metric back against the committed band | reality |
| 8 | LEDGER | Record the graded outcome — correct?, confidence, fingerprint — and re-evaluate the license | certification |
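Steps 4 and 7 can be sketched in a few lines. This is a minimal illustration, not Warrant's implementation: the page says only that the band is "a control limit learned from healthy data", so the mean ± 3 standard deviations rule, the function names `control_band` and `verify`, and the sample values are all assumptions.

```python
from statistics import mean, stdev

def control_band(healthy_samples, k=3.0):
    """Return (lower, upper) limits as mean +/- k standard deviations.

    Assumption: a Shewhart-style control limit; the real rule may differ.
    """
    mu = mean(healthy_samples)
    sigma = stdev(healthy_samples)
    return mu - k * sigma, mu + k * sigma

def verify(observed, band):
    """Step 7: grade the live metric against the band committed at step 4."""
    lower, upper = band
    return lower <= observed <= upper

# Hypothetical error_rate readings taken while the sandbox was healthy.
healthy_error_rate = [0.010, 0.012, 0.009, 0.011, 0.010, 0.013, 0.008]

band = control_band(healthy_error_rate)  # committed before acting
healed = verify(0.011, band)             # metric returned to normal
still_broken = verify(0.25, band)        # metric stayed far outside the band
```

Because the band is fixed before the action runs, the grade at step 7 is a plain comparison with nothing left to interpret.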
An agent can’t earn trust without acting, and shouldn’t be allowed to act without trust. Worse, real incidents are too rare to certify on: five production successes are an anecdote, not a track record. The proving ground solves both problems at once:
it manufactures incidents in the sandbox — leak, bad deploy, cache stampede — at parameterised severity and noise, so no two exams are identical, and runs the agent through them as accelerated exams. Each exam is the full control loop: the agent diagnoses, commits to a prediction, acts, and is graded by the metric coming back inside the band — or not. Fifteen graded outcomes land in the ledger in seconds.
Pilots earn licenses in simulators, not by crashing planes. Agents should too.
One license per action class (restart_connection_pool, rollback_deploy, clear_cache…), because competence doesn’t transfer between actions. The first three conditions below must all hold; probation and margin then adjust the evidence bar and the verdict:
| condition | test | what it prevents |
|---|---|---|
| confidence | wilson_lower_bound(hits, n) ≥ threshold | a lucky streak counting as competence — 3/3 raw is 100%, but Wilson says you may only claim ~44% |
| evidence | n ≥ min_samples | licensing on an anecdote |
| calibration | brier(stated confidence, outcomes) ≤ max | trust bought with bravado — confidently wrong fails even with a passing hit-rate |
| probation | evidence bar += 2 × production strikes | a suspended agent retrying exam suites until one gets lucky |
| margin | thin margin → ALLOW_WITH_MONITORING | treating “barely cleared the bar” as a blank cheque |
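The first four rows of the table can be sketched as follows. The Wilson lower bound and the Brier score are the standard formulas; the `z = 1.96` (95%) setting, the default thresholds, and the name `evaluate_license` are assumptions for illustration, and the margin rule is left out because the page does not define "thin".

```python
import math

def wilson_lower_bound(hits, n, z=1.96):
    """Lower end of the Wilson score interval for a hit rate of hits/n."""
    if n == 0:
        return 0.0
    p = hits / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / denom

def brier(outcomes):
    """Mean squared gap between stated confidence and what happened.

    outcomes: list of (stated_confidence, correct) pairs.
    """
    if not outcomes:
        return 1.0  # no evidence: treat as worst case
    return sum((c - (1.0 if ok else 0.0)) ** 2 for c, ok in outcomes) / len(outcomes)

def evaluate_license(outcomes, threshold=0.7, min_samples=10,
                     max_brier=0.2, strikes=0):
    """True only if evidence, confidence and calibration all pass.

    All default values are hypothetical. Each production strike raises
    the evidence bar by 2, as in the probation row.
    """
    n = len(outcomes)
    hits = sum(1 for _, ok in outcomes if ok)
    required = min_samples + 2 * strikes
    return (n >= required
            and wilson_lower_bound(hits, n) >= threshold
            and brier(outcomes) <= max_brier)

lucky_streak = wilson_lower_bound(3, 3)   # about 0.44 at z = 1.96
exam_suite = [(0.9, True)] * 15           # fifteen graded successes
licensed = evaluate_license(exam_suite)
on_probation = evaluate_license(exam_suite, strikes=3)  # bar is now 16
```

With `z = 1.96`, a 3/3 record yields a lower bound of roughly 0.44, which matches the figure quoted in the confidence row.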
The ledger behind every license is tamper-evident: each outcome is sha256-chained to the one before it, and every record is labelled by evidence — measured (Warrant read the metric itself) vs self-reported (the agent’s word) — so an auditor can see exactly how much of a license rests on what.
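A hash chain of this kind can be sketched with the standard library alone. The record layout, the all-zero genesis hash, and the field names below are assumptions; the page states only that each outcome is sha256-chained to its predecessor and labelled by evidence type.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record

def _digest(prev_hash, outcome):
    """Hash the previous record's hash together with this outcome."""
    payload = json.dumps(outcome, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_record(ledger, outcome):
    """Append an outcome, linking it to the record before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({
        "outcome": outcome,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, outcome),
    })

def verify_chain(ledger):
    """Recompute every link; any edited record breaks the chain."""
    prev_hash = GENESIS
    for record in ledger:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["outcome"]):
            return False
        prev_hash = record["hash"]
    return True

ledger = []
append_record(ledger, {"action": "restart_connection_pool", "correct": True,
                       "confidence": 0.9, "evidence": "measured"})
append_record(ledger, {"action": "restart_connection_pool", "correct": False,
                       "confidence": 0.8, "evidence": "self-reported"})
intact = verify_chain(ledger)

ledger[1]["outcome"]["correct"] = True  # rewrite history after the fact
tampered = verify_chain(ledger)
```

Editing any past outcome changes its hash, so the chain stops verifying at the first altered record.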
The same math runs natively in Splunk as a saved search — splunk/trust_ledger.spl — so on a HEC-enabled tenant the license registry is a Splunk dashboard.
One violated prediction in production suspends the license on the spot — the agent rolls back its own action, escalates with full context, and returns to supervised mode. It re-earns autonomy the same way it earned it the first time: with evidence.
The fingerprint is the part nobody else has: every license is pinned to model_id + prompt_version. When the brain changes — a vendor model update, a prompt tweak, a swap from heuristic to Gemini — the registry notices before the new brain acts once, not after its first incident. A track record belongs to the brain that earned it.
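One way to picture the pinning, as a sketch under stated assumptions: the page names the two inputs (`model_id` and `prompt_version`) but not how they are combined, so the hashing scheme, the `License` type, and the example identifiers here are hypothetical.

```python
import hashlib
from dataclasses import dataclass

def fingerprint(model_id, prompt_version):
    """Derive a short, stable identifier for one brain configuration."""
    raw = f"{model_id}\n{prompt_version}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]

@dataclass(frozen=True)
class License:
    action_class: str
    brain_fingerprint: str  # the brain that earned this track record

def license_valid(lic, action_class, model_id, prompt_version):
    """A license covers one action class and exactly one brain."""
    return (lic.action_class == action_class
            and lic.brain_fingerprint == fingerprint(model_id, prompt_version))

# Hypothetical identifiers for illustration only.
lic = License("rollback_deploy", fingerprint("heuristic-v1", "prompt-3"))

same_brain = license_valid(lic, "rollback_deploy", "heuristic-v1", "prompt-3")
new_prompt = license_valid(lic, "rollback_deploy", "heuristic-v1", "prompt-4")
other_action = license_valid(lic, "clear_cache", "heuristic-v1", "prompt-3")
```

The check runs at the gate, so a changed model or prompt is caught before the new brain takes its first autonomous action.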
The trust gate isn’t a framework you adopt — it’s a tool call. Any external agent asks permission before acting and reports the outcome after; the registry does the rest. The repo includes a self-contained proof (python -m warrant.mcp_demo): an independent agent earns, uses, and loses autonomy purely over MCP.
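The ask-then-report pattern, together with suspension on a violated prediction, can be sketched as an in-memory registry. This does not use MCP and does not reproduce Warrant's tool names: `TrustRegistry`, `request_permission`, `report_outcome`, and the verdict strings `ALLOW` and `REQUIRE_HUMAN_APPROVAL` are all hypothetical stand-ins.

```python
class TrustRegistry:
    """Toy gate: agents ask before acting and report afterwards."""

    def __init__(self):
        self.licensed = set()  # (agent_id, action_class) pairs
        self.strikes = {}      # production failures per pair

    def grant(self, agent_id, action_class):
        """Record a license, e.g. after a passed exam suite."""
        self.licensed.add((agent_id, action_class))

    def request_permission(self, agent_id, action_class):
        """Called before acting; unlicensed actions need a human."""
        if (agent_id, action_class) in self.licensed:
            return "ALLOW"
        return "REQUIRE_HUMAN_APPROVAL"

    def report_outcome(self, agent_id, action_class, prediction_held):
        """Called after acting; one violated prediction suspends."""
        key = (agent_id, action_class)
        if not prediction_held:
            self.licensed.discard(key)
            self.strikes[key] = self.strikes.get(key, 0) + 1

registry = TrustRegistry()
before = registry.request_permission("agent-x", "clear_cache")

registry.grant("agent-x", "clear_cache")
earned = registry.request_permission("agent-x", "clear_cache")

registry.report_outcome("agent-x", "clear_cache", prediction_held=False)
after_miss = registry.request_permission("agent-x", "clear_cache")
```

The agent never inspects the registry's internals; it sees only the verdict, which is what lets any external agent be gated the same way.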
Four acts, ninety seconds — or clone it and run the real thing in two terminals.