⚠ Demo mode — all steps are live. Stripe charges are test-mode (no real money). SMS approval codes go to the operator's phone and appear in the mockup below each step.
Demo Script — run steps in order
Step 0 — Earn $1,200 — Shows revenue in. No band, no approval, no cap. Earning is always unrestricted by design.
Step 1 — Spend $85 autonomously — Within the agent's authority band → executes with zero human input. PaymentIntent ID auto-fills the Step 7 refund input.
Step 2 — Request $3,500 — Exceeds the band. System escalates and sends a real SMS to the operator's phone. The code appears in the mockup below AND auto-fills Step 3.
Step 3 — Approve with the SMS code — Code is pre-filled. Just hit Approve. Money moves only after a real human approves.
Step 4 — Engage kill switch — Absolute override. No band, no approval, no exception can bypass it.
Step 5 — Prove kill switch blocks everything — Try a normally-fine $40 spend. Gets denied outright. The AI cannot override this.
Step 6 — Release kill switch — Normal evaluation resumes.
Step 7 — Refund $85 — The PaymentIntent ID is pre-filled from Step 1. Refunds always escalate; by design there is no autonomous refund path.
Step 8 — Approve the refund — Second SMS code auto-fills. Hit Approve to execute.
💡 Watch the Live Console in another tab — the audit feed and verdict panels update in real time as you run each step.
Step 0 — Real revenue in (no cap)
Earning isn't gated — only spend is. No band, no approval needed. This is intentional asymmetric design: receiving money carries none of the risk that spending it does.
Step 1 — Autonomous spend (no human needed)
$85.00 is within the agent's authority band — the kernel clears it with zero human involvement. The PaymentIntent ID auto-fills the refund input in Step 7.
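The asymmetry between earning and spending, and the band check from Steps 1 and 2 of the script, can be sketched roughly as follows. This is a minimal illustration, not the demo's actual kernel: the `Verdict` and `Request` types, the `evaluate` function, and the $500 cap value are all assumptions. The demo only tells us that $85 is inside the band and $3,500 is outside it.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"        # executes with no human input
    ESCALATE = "escalate"  # needs an operator approval code


# Hypothetical per-action cap in cents. The real limit is not stated in the
# demo; any value between $85 and $3,500 reproduces the behaviour shown.
PER_ACTION_CAP_CENTS = 50_000


@dataclass(frozen=True)
class Request:
    kind: str          # "earn" or "spend"
    amount_cents: int  # integer cents avoid float rounding on money


def evaluate(request: Request, cap_cents: int = PER_ACTION_CAP_CENTS) -> Verdict:
    """Return a verdict for one request (illustrative logic only)."""
    if request.amount_cents <= 0:
        raise ValueError("amount must be positive")
    if request.kind == "earn":
        # Earning is never gated: no band, no approval, no cap.
        return Verdict.ALLOW
    if request.kind == "spend":
        # Inside the band the agent acts alone; above it a human must approve.
        if request.amount_cents <= cap_cents:
            return Verdict.ALLOW
        return Verdict.ESCALATE
    raise ValueError(f"unknown request kind: {request.kind!r}")
```

Under these assumptions, the $1,200 earn and the $85 spend are allowed immediately, while the $3,500 spend comes back as an escalation rather than a denial.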
Step 2 — Request $3,500
$3,500.00 exceeds the per-action cap, so the kernel cannot approve it on its own. It escalates and sends a real SMS code to the operator's phone. The code appears in the mockup and auto-fills Step 3.
[Phone mockup: Messages · now · Custodian: "Your Custodian approval code is: •••••• This code expires in 10 minutes."]
⬆ Real SMS delivered to operator's phone · Code auto-fills Step 3
Step 3 — Approve with the real code
The code is pre-filled from the SMS and masked so it stays off-screen. Click Approve to execute.
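A one-time approval code with a 10-minute expiry could be handled along these lines. This is a sketch under stated assumptions: the six-digit format is inferred from the six masked characters in the mockup, and `PendingApproval`, `issue_code`, and `approve` are hypothetical names, not the demo's API. SMS delivery itself is left out.

```python
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Optional

CODE_TTL_SECONDS = 10 * 60  # the SMS text says the code expires in 10 minutes


@dataclass
class PendingApproval:
    code: str
    expires_at: float
    used: bool = False


def issue_code(now: Optional[float] = None) -> PendingApproval:
    """Create a random code (assumed six digits) that expires after the TTL."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically strong source
    return PendingApproval(code=code, expires_at=now + CODE_TTL_SECONDS)


def approve(pending: PendingApproval, submitted: str,
            now: Optional[float] = None) -> bool:
    """Accept the code once, only before expiry, using a constant-time compare."""
    now = time.time() if now is None else now
    if pending.used or now >= pending.expires_at:
        return False
    if not hmac.compare_digest(pending.code, submitted):
        return False
    pending.used = True  # single use: a replayed code is rejected
    return True
```

Marking the code as used on first success means the same SMS code cannot authorize a second transfer, which matches the demo sending a fresh code for the refund in Step 8.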
Step 4 — Engage the kill switch
Absolute override — every spend and refund request is denied instantly, regardless of band or cap. Not even a valid approval code can bypass it while this is engaged.
Step 5 — Prove the kill switch overrides everything
A normally-fine $40.00 in-band request — denied outright while the kill switch is engaged. The AI has no recourse.
Step 6 — Release the kill switch
Normal evaluation resumes. Requests are checked against the band and cap again.
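The precedence shown in Steps 4 to 6 amounts to checking the kill switch before any other rule. The sketch below illustrates that ordering; the `Kernel` class, its method names, and the cap value are assumptions made for the example, not the demo's real implementation.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    DENY = "deny"


class Kernel:
    """Illustrative kernel: the kill switch is evaluated before everything else."""

    def __init__(self, cap_cents: int) -> None:
        self.cap_cents = cap_cents  # hypothetical per-action cap
        self.kill_switch = False

    def engage_kill_switch(self) -> None:
        self.kill_switch = True

    def release_kill_switch(self) -> None:
        self.kill_switch = False

    def evaluate_spend(self, amount_cents: int) -> Verdict:
        # First check: an engaged kill switch denies regardless of band or cap.
        if self.kill_switch:
            return Verdict.DENY
        if amount_cents <= self.cap_cents:
            return Verdict.ALLOW
        return Verdict.ESCALATE

    def execute_approved(self, code_is_valid: bool) -> Verdict:
        # Even a valid approval code cannot move money while the switch is on.
        if self.kill_switch:
            return Verdict.DENY
        return Verdict.ALLOW if code_is_valid else Verdict.DENY
```

With a hypothetical cap of $500, the $40 request is allowed before the switch is engaged, denied while it is engaged, and allowed again after release.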
Step 7 — Refund $85
The PaymentIntent ID from Step 1 is pre-filled below. Refunds always require human approval; there is intentionally no autonomous refund path.
[Phone mockup: Messages · now · Custodian: "Your Custodian approval code is: •••••• This code expires in 10 minutes."]
⬆ Real SMS delivered to operator's phone · Code auto-fills Step 8
Step 8 — Approve the refund
The code is pre-filled from the SMS and masked for security. Click Approve refund to execute.
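The refund rule from Steps 7 and 8 differs from the spend rule in one respect: the amount is never compared against the band. A minimal sketch of that policy follows, assuming a hypothetical `RefundRequest` type and `evaluate_refund` function. The PaymentIntent ID in the usage note is a placeholder, not a real Stripe ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundRequest:
    payment_intent_id: str  # ID of the original charge being refunded
    amount_cents: int


def evaluate_refund(request: RefundRequest,
                    kill_switch_engaged: bool = False) -> str:
    """Return "deny" or "escalate"; there is deliberately no "allow" branch."""
    if request.amount_cents <= 0:
        raise ValueError("amount must be positive")
    if kill_switch_engaged:
        # The kill switch denies refunds as well as spends.
        return "deny"
    # No band check: even a tiny refund goes to the operator for approval.
    return "escalate"
```

For example, `evaluate_refund(RefundRequest("pi_example_123", 8_500))` escalates even though an $85 spend would have cleared autonomously.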
Arc complete — see what happened
Every action you just took is recorded in real time. Open the Live Console and look at three things:
Audit Feed → Every earn, spend, denial, kill-switch toggle, and refund — timestamped and in order. The $40 denial during the kill-switch test will show DENIED while every approved action shows the real Stripe PaymentIntent ID.
Net Balance card → Shows revenue in against net spend. The $1,200 earn was uncapped; the $85 autonomous spend and the $3,500 human-approved spend are tracked separately; the $85 refund nets out the Step 1 spend. You can see exactly which dollars the AI moved on its own and which required you.
Ask Nemotron → Click the chat bubble on the Live Console and ask "what just happened?" — you'll get an answer from the actual model that ran all those requests, explaining what it could and couldn't do and why.
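The arithmetic behind the Audit Feed and the Net Balance card can be reproduced with a small tally over the recorded actions. The sketch below is an assumption about how such a summary might be computed, not the console's actual code: `AuditEntry`, `summarize`, and the field names are hypothetical, and the PaymentIntent IDs are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    action: str               # "earn", "spend", or "refund"
    amount_cents: int
    verdict: str              # "APPROVED" or "DENIED"
    approved_by_human: bool
    payment_intent_id: Optional[str] = None  # absent on denied actions


def summarize(entries: List[AuditEntry]) -> Dict[str, int]:
    """Total the approved actions; denied requests move no money."""
    earned = autonomous = human = refunded = 0
    for e in entries:
        if e.verdict != "APPROVED":
            continue
        if e.action == "earn":
            earned += e.amount_cents
        elif e.action == "spend":
            if e.approved_by_human:
                human += e.amount_cents
            else:
                autonomous += e.amount_cents
        elif e.action == "refund":
            refunded += e.amount_cents
    return {
        "revenue_cents": earned,
        "autonomous_spend_cents": autonomous,
        "human_approved_spend_cents": human,
        "refunded_cents": refunded,
        "net_spend_cents": autonomous + human - refunded,
    }


def demo_entries() -> List[AuditEntry]:
    """The money-moving requests from the demo arc, with placeholder IDs."""
    now = datetime.now(timezone.utc)
    return [
        AuditEntry(now, "earn", 120_000, "APPROVED", False, "pi_placeholder_0"),
        AuditEntry(now, "spend", 8_500, "APPROVED", False, "pi_placeholder_1"),
        AuditEntry(now, "spend", 350_000, "APPROVED", True, "pi_placeholder_2"),
        AuditEntry(now, "spend", 4_000, "DENIED", False),  # kill-switch test
        AuditEntry(now, "refund", 8_500, "APPROVED", True, "pi_placeholder_1"),
    ]
```

Running `summarize(demo_entries())` gives $1,200 of revenue and a net spend of $3,500: the $85 refund cancels the $85 autonomous spend, and the denied $40 request contributes nothing.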