WARRANT · Licensing authority for AI agents

Splunk Agentic Ops Hackathon · Observability

Don’t trust your agent.
License it.

Evals tell you how smart an agent is. Warrant tells production how much rope to give it — and takes the rope back. Agents earn revocable, per-action licenses by making falsifiable predictions and being graded by reality, not by a demo that went well.

▶ Watch it earn one — and lose it Why this matters →

Certificate of License

No. WL-0017 · Registry of Autonomous Actions

action classrollback_deploy

holder (fingerprint)heuristic:rules@1.2

wilson lower bound0.31 < 0.50

last outcomeprediction violated

Suspended

Autonomy withdrawn on first violated prediction. Human approval required.

Certificate of License

No. WL-0042 · Registry of Autonomous Actions

action classrestart_connection_pool

holder (fingerprint)heuristic:rules@1.2

wilson lower bound0.57 ≥ 0.50

calibration (brier)0.04 · n=21

Licensed

Bearer may act autonomously within this action class. License is revoked on a violated prediction and void if the holder’s fingerprint changes.

◆ falsifiable predictions ◆ revocable per-action licenses ◆ drift-aware fingerprints ◆ MCP on both sides of the loop

§1 · THE PROBLEM

Everyone is shipping agents that act.
Nobody can say when the human may let go.

The industry’s answer to agent safety is “human-in-the-loop” — an analyst approves every action, forever. That isn’t a safety model; it’s a bottleneck with good intentions. And the moment teams get tired of clicking approve, autonomy gets granted the way it always does: a demo went well, a prompt was tweaked, someone flipped auto‑approve. No number. No evidence. No way to take it back.

A driver’s license doesn’t certify that you’re smart. It certifies you were tested on the road you’ll actually drive — and it can be taken away.

Read the full case — and how Warrant compares to evals, guardrails and human-in-the-loop →

§2 · THE ANSWER

Autonomy as a license: earned, audited, revocable.

Warrant sits between any AI agent and the systems it wants to touch. Before every action it asks one question: does this brain hold a valid license for this action class?

The Proving Ground

Trust starts before production.

Manufactured incidents, accelerated exams. The agent commits to a falsifiable prediction before each fix; reality grades it. Fifteen graded outcomes in seconds — no cold start, no anecdotes.

How exams work →

The License

Three conditions. No exceptions.

A Wilson lower bound that clears threshold, a minimum body of evidence, and Brier-scored calibration — an agent that is confidently wrong fails even with a passing hit-rate.

The licensing math →

The Fingerprint

Licenses die with the brain that earned them.

Every license is pinned to model ID + prompt version. Your vendor updates the model overnight? Every license drops to PROVISIONAL before the new brain acts once.

Drift detection →

And every earned license is issued as a real, printable certificate — bound to the ledger hash →

§3 · THE DEMONSTRATION

Watch an agent earn autonomy — then lose it honestly.

Four acts against a live fault-injection sandbox. The kill-shot is Act II: a decoy incident fools the agent exactly the way it would fool an engineer — and its own falsifiable prediction catches the mistake, rolls it back, and suspends the license. Trustworthy when wrong is the property everything else is missing.

The Proving Ground

3 classes licensed

Fifteen manufactured incidents across three action classes. Wilson bounds climb until the registry shows three LICENSED stamps.

Production — and the decoy

license suspended

Acting alone, the agent resolves real incidents — until a decoy mimics a familiar signature. Its fix misses the band it predicted. It declares itself wrong, rolls back, escalates, and loses the license.

III

The model updates overnight

all licenses → provisional

The fingerprint changes — no failure has happened yet. Warrant doesn’t wait for one: the track record belonged to a brain that no longer exists.

Re-certification

autonomy re-earned

The new brain re-earns each license under its own fingerprint. Autonomy restored — with evidence, not a vibe.

▶ Play the replay (90 seconds) Run it live locally →

§4 · BUILT ON SPLUNK

MCP on both sides of the loop.

Warrant consumes the Splunk MCP Server for every read of Splunk data — and ships its own MCP server, so the trust gate becomes infrastructure any agent in the ecosystem can call: a SOAR playbook, Splunk’s own triage agents, a Claude agent.

# any external agent, over MCP — licenses are pinned to the caller's fingerprint: > warrant_request_action(action_class="restart_connection_pool", agent_fingerprint="gemini-2.5-flash:v3") { "verdict": "ALLOW", "license": { "status": "LICENSED", "confidence": 0.61 } } > warrant_request_action(action_class="rollback_deploy", agent_fingerprint="gemini-2.5-flash:v3") { "verdict": "REQUIRE_APPROVAL", "license": { "status": "SUSPENDED", "strikes": 1 } }

Every Splunk touchpoint, and the full tool reference →

Autonomy is earned.
In writing.

Built solo for the Splunk Agentic Ops Hackathon 2026 · Observability track.

▶ Watch the demo github.com/Uthmannabeel/warrant

Don’t trust your agent.License it.

Everyone is shipping agents that act.Nobody can say when the human may let go.

Autonomy as a license: earned, audited, revocable.

Trust starts before production.

Three conditions. No exceptions.

Licenses die with the brain that earned them.

Watch an agent earn autonomy — then lose it honestly.

The Proving Ground

Production — and the decoy

The model updates overnight

Re-certification

MCP on both sides of the loop.

Autonomy is earned.In writing.

Don’t trust your agent.
License it.

Everyone is shipping agents that act.
Nobody can say when the human may let go.

Autonomy is earned.
In writing.