Two terminals. Four acts.
Zero leaps of faith.

Everything below runs on a laptop with Python 3.11+. The brain defaults to a deterministic heuristic so every run is reproducible; add a Gemini key to let an LLM drive diagnosis through the identical safety harness.

§1 · QUICKSTART

From clone to revoked license in five minutes.

Install

Clone the repo and install into a virtual environment.

git clone https://github.com/Uthmannabeel/warrant
cd warrant
py -3.11 -m venv .venv
.\.venv\Scripts\Activate.ps1   # source .venv/bin/activate on macOS/Linux
pip install -r requirements.txt

Start the sandbox (terminal A)

The microservice flight-simulator: live metrics, parameterised fault injection, three reversible controls. This is the “production” the agent operates on.

python -m uvicorn sandbox.app:app --port 9000

Run the four-act story (terminal B — pick one)

The CLI narrative, the live dashboard, or the MCP gating proof: all three run the same control loop. A fourth option, the trust-decay demo, is self-contained.

# a) the full four-act story in the console
$env:WARRANT_BRAIN="heuristic"; python -m warrant.demo

# b) ...or the live dashboard -> http://localhost:8050
$env:WARRANT_BRAIN="heuristic"; python -m uvicorn warrant.dashboard:app --port 8050

# c) ...or prove the MCP gate: an external agent earns, uses and loses autonomy over MCP
python -m warrant.mcp_demo

# d) ...or watch a license rot from disuse (trust decay, self-contained)
python -m warrant.decay_demo

With the dashboard running, the registry issues a printable license certificate per action class at /certificates, and a live SVG badge at /badge/<action>.svg.

Optional: connect Splunk & an LLM brain

With a Splunk Cloud account running the Splunk MCP Server app (see SETUP.md), the CONTEXT step pulls real SPL results over MCP. Verify connectivity first:

python -m warrant.check_connection # live SPL round-trip through the Splunk MCP Server

§2 · CONFIGURATION

One .env, sane defaults.

variable	purpose	default
WARRANT_BRAIN	Which brain diagnoses: heuristic (deterministic, reproducible) or Gemini	heuristic
GEMINI_API_KEY	Lets Google Gemini drive diagnosis — same harness, fallible brain	unset
SPLUNK_MCP_URL / SPLUNK_MCP_TOKEN	Splunk MCP Server endpoint + encrypted token (audience mcp)	see SETUP.md
autonomy_threshold	Wilson lower bound a license must clear	0.50
autonomy_min_samples	Minimum graded outcomes before licensing	4
calibration_max	Maximum Brier score — confidently-wrong agents fail here	0.4
WARRANT_PROBATION_EXTRA	Extra evidence required per production failure (probation strikes)	2
WARRANT_MONITORING_MARGIN	Wilson margin below which a license is ALLOW_WITH_MONITORING, not full autonomy	0.10
WARRANT_TRUST_HALFLIFE_DAYS	Evidence half-life for trust decay — a license rots unless renewed with fresh outcomes	30
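
To make the defaults above concrete, here is a minimal sketch of how a Wilson lower bound, a Brier score, a monitoring margin and a half-life decay could combine into one licensing decision. It is not Warrant's actual code: the function names, the record shape (confidence, success, age in days), the z value of 1.96, the choice to decay evidence by weighting each outcome, and the choice to count raw records against the minimum-samples rule are all assumptions made for illustration.

```python
import math

def wilson_lower_bound(successes: float, n: float, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a success proportion."""
    if n <= 0:
        return 0.0
    p = successes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(max(0.0, p * (1 - p) / n + z * z / (4 * n * n)))
    return (centre - margin) / denom

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared gap between stated confidence and what happened (0 is perfect)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

def decay_weight(age_days: float, halflife_days: float = 30.0) -> float:
    """An outcome counts half as much after every half-life."""
    return 0.5 ** (age_days / halflife_days)

def license_decision(records, threshold=0.50, min_samples=4, calibration_max=0.4,
                     monitoring_margin=0.10, halflife_days=30.0) -> str:
    """records: list of (confidence, success as 0 or 1, age_days). Hypothetical shape."""
    if len(records) < min_samples:  # assumption: raw count, not decayed count
        return "REQUIRE_APPROVAL"
    weights = [decay_weight(age, halflife_days) for _, _, age in records]
    n_eff = sum(weights)
    s_eff = sum(w * ok for w, (_, ok, _) in zip(weights, records))
    lower = wilson_lower_bound(s_eff, n_eff)
    brier = brier_score([c for c, _, _ in records], [ok for _, ok, _ in records])
    if lower < threshold or brier > calibration_max:
        return "REQUIRE_APPROVAL"
    if lower - threshold < monitoring_margin:
        return "ALLOW_WITH_MONITORING"
    return "ALLOW"
```

Under these assumptions, four fresh successes out of four give a lower bound of about 0.51, which clears the 0.50 threshold but sits inside the 0.10 monitoring margin. The same ten successes that earn full autonomy when fresh fall back to approval once they are 90 days old, because three half-lives shrink them to 1.25 effective samples.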

If Gemini is unreachable the loop falls back to the heuristic automatically — the demo never breaks, and either brain runs through the identical gate/predict/verify/ledger harness. That symmetry is the point: Warrant treats every brain as fallible.
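
The fallback described above can be sketched in a few lines. This is an illustration under stated assumptions, not the repo's implementation: `heuristic_brain`, `diagnose`, the diagnosis dictionary and the 0.05 error-rate rule are hypothetical names and values.

```python
from typing import Callable, Optional

Brain = Callable[[dict], dict]

def heuristic_brain(metrics: dict) -> dict:
    """Deterministic stand-in: the same input always yields the same diagnosis."""
    # Hypothetical rule; the real heuristic is not described in this document.
    if metrics.get("error_rate", 0.0) > 0.05:
        return {"action_class": "restart_connection_pool", "confidence": 0.8,
                "brain": "heuristic"}
    return {"action_class": "no_action", "confidence": 0.6, "brain": "heuristic"}

def diagnose(metrics: dict, llm_brain: Optional[Brain] = None) -> dict:
    """Try the LLM brain first; on any failure fall back to the heuristic."""
    if llm_brain is not None:
        try:
            return llm_brain(metrics)
        except Exception:
            pass  # unreachable or failing LLM: fall through to the heuristic
    return heuristic_brain(metrics)
```

Whichever brain answers, the returned diagnosis has the same shape, so the gate, prediction, verification and ledger steps downstream do not need to know which one produced it.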

§3 · WHAT’S IN THE BOX

Small repo, no magic.

warrant/
  sandbox/    microservice "flight simulator": parameterised faults + 3 reversible controls
  warrant/    loop · proving_ground · certification · ledger · brain · splunk_mcp · mcp_server · dashboard
  splunk/     trust_ledger.spl: the Wilson + Brier licensing math as a native SPL saved search
  web/        this site + the captured demo replay (static, no backend)
  docs/       architecture · demo script · Devpost write-up
  SETUP.md    Splunk account + MCP install + connectivity test

status	component
live	Splunk MCP connectivity — real SPL round-trip over _internal
live	Proving ground, per-action licenses (Wilson + Brier + lifecycle), drift detection
live	Warrant MCP server — external agents gated over MCP, licenses pinned to the caller's fingerprint (warrant.mcp_demo proves a second brain is refused)
live	Trust-but-verify outcomes (Warrant measures the metric itself via metric_url), tamper-evident hash-chained ledger + warrant_verify_ledger audit
live	Probation (strikes raise the evidence bar), graduated autonomy (ALLOW_WITH_MONITORING), late-regression re-check
live	Pluggable brain (heuristic / Gemini), learned control limits, live dashboard
pending tenant	Splunk AI Assistant saia_* hosted-model tools are integrated but need backend activation on the trial tenant; CONTEXT falls back to a direct MCP query
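
For readers unfamiliar with the "tamper-evident hash-chained ledger" listed in the table, the general technique looks roughly like this. The sketch assumes SHA-256 over the previous hash plus a canonical JSON payload; the entry layout and the function names `append_entry` and `verify_ledger` are hypothetical and may differ from what warrant_verify_ledger actually checks.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's prev_hash

def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(ledger: list, payload: dict) -> dict:
    """Append a payload, chaining it to the hash of the entry before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"payload": payload, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, payload)}
    ledger.append(entry)
    return entry

def verify_ledger(ledger: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the one before it, rewriting an old outcome changes every later hash, so a single pass over the file is enough to detect the edit.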

§4 · USING WARRANT IN A REAL ENVIRONMENT

The engine is done. Adoption is wiring, not rebuilding.

Everything above runs against a sandbox that stands in for production. The governance engine — proving ground, licensing, drift, decay, the tamper-evident ledger — and its MCP interface are real and reusable as-is. Putting Warrant in front of a real company’s agents means connecting two points to that company’s stack. No part of the trust logic changes.

Your agent calls the gate — it already speaks MCP

Any tool-using agent (a SOAR playbook, a Splunk Triage agent, a Claude agent) asks permission before acting and reports the outcome after. Nothing about the agent is rewritten; it gains four tool calls against Warrant’s MCP server.

# before acting:
warrant_request_action(action_class="restart_connection_pool",
                       agent_fingerprint="<your model:prompt id>")
# -> ALLOW · ALLOW_WITH_MONITORING · REQUIRE_APPROVAL
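
On the agent side, handling the three verdicts might look like the sketch below. The `Gate` class is an in-memory stand-in for the MCP server, and `fingerprint`, `grant` and `act_if_permitted` are hypothetical helpers; in particular, hashing "model:prompt" is only one plausible way to derive a fingerprint. The sketch assumes licenses are keyed by action class and caller fingerprint, so a second brain asking for the same action is refused.

```python
import hashlib
from typing import Callable

def fingerprint(model: str, prompt: str) -> str:
    """Hypothetical fingerprint: a short hash over the model name and prompt."""
    return hashlib.sha256(f"{model}:{prompt}".encode("utf-8")).hexdigest()[:16]

class Gate:
    """In-memory stand-in for the gate; the real one is reached over MCP."""

    def __init__(self) -> None:
        self._licenses: dict[tuple[str, str], str] = {}

    def grant(self, action_class: str, agent_fingerprint: str,
              decision: str = "ALLOW") -> None:
        self._licenses[(action_class, agent_fingerprint)] = decision

    def request_action(self, action_class: str, agent_fingerprint: str) -> str:
        # No license for this exact (action, fingerprint) pair: a human decides.
        return self._licenses.get((action_class, agent_fingerprint),
                                  "REQUIRE_APPROVAL")

def act_if_permitted(gate: Gate, action_class: str, agent_fingerprint: str,
                     run_action: Callable[[], None]) -> str:
    """Run the action only when the gate allows it; always return the verdict."""
    decision = gate.request_action(action_class, agent_fingerprint)
    if decision in ("ALLOW", "ALLOW_WITH_MONITORING"):
        run_action()
    return decision
```

The agent never decides for itself whether it is trusted; it only branches on the verdict it receives, and anything other than an explicit allow leaves the action unexecuted.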

Point verification at your real metrics

Today Warrant grades an outcome by reading the sandbox’s /metrics. In production you pass your own metrics endpoint — your Splunk, Datadog, Prometheus — and the committed forecast band; Warrant fetches it itself and grades the result, so the agent can’t self-certify.

# after acting (trust-but-verify against YOUR telemetry):
warrant_report_outcome(action_class="restart_connection_pool",
                       metric_url="https://your-splunk/.../error_rate",
                       upper_limit=0.01)
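
The grading step itself can be pictured as follows. This is a sketch under assumptions: it supposes the metrics endpoint returns JSON with a numeric "value" field, which is a hypothetical shape, and `fetch_metric` and `grade_outcome` are illustrative names rather than Warrant's internals. The point it shows is that the grader reads the metric itself instead of accepting a number from the agent.

```python
import json
import urllib.request
from typing import Callable

def fetch_metric(metric_url: str, timeout: float = 5.0) -> float:
    """Read the metric directly. Assumes a JSON body like {"value": 0.004}."""
    with urllib.request.urlopen(metric_url, timeout=timeout) as resp:
        return float(json.load(resp)["value"])

def grade_outcome(metric_url: str, upper_limit: float,
                  fetch: Callable[[str], float] = fetch_metric) -> dict:
    """Grade an action by comparing the observed metric with the committed limit."""
    observed = fetch(metric_url)
    return {
        "observed": observed,
        "upper_limit": upper_limit,
        "success": observed <= upper_limit,
    }
```

Passing `fetch` as a parameter keeps the sketch testable without a live endpoint; swapping in a Splunk, Datadog or Prometheus reader would only change that one function.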

Action execution stays where it already lives — your runbooks, SOAR, or deploy tooling. Warrant governs whether the agent may act; it doesn’t replace how the action runs.

Honest about the gap to production

This is a working proof-of-concept, not a shippable product. Before a real SRE team runs it unattended, you’d add authentication on the MCP server (so only your agents can ask), a shared, durable ledger in place of the local file (the SPL version points the way), and the usual hosting and operational hardening. Those are on the roadmap; the part that’s novel and done is the trust engine and its interface.

Read the code. Run the loop.
Try to fool it.

MIT licensed. The decoy in Act II is waiting for you.

github.com/Uthmannabeel/warrant
