Custodian works like a company card plus an expense policy for AI agents. Under the limit, the agent spends autonomously. Over the limit, it calls you first. If a customer lies to get a refund, the kernel catches it, regardless of what the AI concluded. The rules are enforced below the agent at the OS level, not promised in a prompt.
Spend caps and approval flows are commodities now. The hard problem isn't limiting a number — it's that the agent can be wrong, or can lie, and that it shouldn't be trusted to route money through an approved path in the first place.
With competitors, the control lives in their custodial cloud. The agent reaches money by calling their SDK, and safety rests on the assumption that it will use the approved path. They cap the dollar amount but never check whether what the agent claims is even true.
With Custodian, the control lives in Landlock plus kernel egress policy. The agent literally cannot open a socket to a payment endpoint the OS hasn't allowed. And a deterministic verifier checks every fact the agent asserts against ground truth, so it can't lie its way to a payout. Non-custodial, rail-agnostic, self-hosted.
The agent reads the messy real world and makes a recommendation. Then three deterministic, zero-AI layers get the final say — and any one of them can stop the money.
Nemotron reads the email, invoice, or task and proposes an action — refund, payment, provision a GPU. It recommends. It never decides money.
can be wrong · can lie
Every factual claim the agent made is resolved against ground truth. A claim the data refutes is flagged CONTRADICTED before anything downstream trusts it.
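A minimal sketch of how claim verification like this could look, assuming the ground truth is available as a simple key-value ledger. Only the CONTRADICTED flag comes from the description above; the `Claim` type, the `VERIFIED` and `UNVERIFIABLE` verdicts, and the ledger keys are hypothetical names chosen for illustration, not Custodian's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    VERIFIED = "VERIFIED"          # assumed label: ledger agrees with the claim
    CONTRADICTED = "CONTRADICTED"  # from the text: the data refutes the claim
    UNVERIFIABLE = "UNVERIFIABLE"  # assumed label: no ground truth available


@dataclass(frozen=True)
class Claim:
    key: str          # hypothetical ledger key, e.g. "order_1001.delivered"
    asserted: object  # the value the agent says is true


def verify_claim(claim: Claim, ledger: dict) -> Verdict:
    """Compare one asserted fact against the ledger. No model is involved."""
    if claim.key not in ledger:
        return Verdict.UNVERIFIABLE
    if ledger[claim.key] == claim.asserted:
        return Verdict.VERIFIED
    return Verdict.CONTRADICTED


def verify_all(claims: list, ledger: dict) -> dict:
    """Resolve every claim so downstream layers see only verdicts."""
    return {claim.key: verify_claim(claim, ledger) for claim in claims}


# Hypothetical ledger: the order was delivered and cost 4900 cents.
ledger = {"order_1001.delivered": True, "order_1001.amount_cents": 4900}

# The agent repeats a customer's story that the package never arrived.
claims = [
    Claim("order_1001.delivered", False),
    Claim("order_1001.amount_cents", 4900),
    Claim("order_1001.prior_refund", False),
]
verdicts = verify_all(claims, ledger)
```

Because the check is a plain lookup and comparison, the same inputs always produce the same verdicts, which is the property the deterministic layer depends on.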
deterministic · zero-AI
Bands and caps decide AUTONOMOUS / ESCALATE / DENY. Over the cap requires a real human signature (Twilio Verify SMS). The agent never holds both keys.
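The banding logic can be sketched as a pure function. This is an illustration under stated assumptions, not the shipped policy: the text says only that amounts over the cap need a human signature, so the second `hard_cap_cents` ceiling that produces DENY, and both threshold values, are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    AUTONOMOUS = "AUTONOMOUS"  # agent may proceed without a human
    ESCALATE = "ESCALATE"      # a human signature is required first
    DENY = "DENY"              # refused outright


def decide(amount_cents: int, autonomous_cap_cents: int, hard_cap_cents: int) -> Decision:
    """Map an amount to a band using fixed thresholds only.

    Assumption: non-positive amounts and amounts above a hard ceiling
    are denied; the source text does not say when DENY applies.
    """
    if amount_cents <= 0 or amount_cents > hard_cap_cents:
        return Decision.DENY
    if amount_cents <= autonomous_cap_cents:
        return Decision.AUTONOMOUS
    return Decision.ESCALATE


# Hypothetical thresholds: $50 autonomous cap, $500 hard ceiling.
small = decide(2_000, autonomous_cap_cents=5_000, hard_cap_cents=50_000)
medium = decide(20_000, autonomous_cap_cents=5_000, hard_cap_cents=50_000)
large = decide(80_000, autonomous_cap_cents=5_000, hard_cap_cents=50_000)
```

Keeping the thresholds as arguments rather than model output means the agent can propose an amount but has no way to influence which band it lands in.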
enforced at OS level
The agent can lie, and money still can't move wrong. When a customer invents a story to get a refund and the AI recommends approve, the verifier catches that the claim is contradicted by the ledger and the kernel overrides the AI. No competitor can demonstrate this, because their model is "agent asks → check the limit," not "agent asks → check if the agent is lying."
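The override described above amounts to a gate where the agent's recommendation is necessary but never sufficient. The sketch below is a user-space illustration of that ordering only; the function name, the string labels, and the `human_signed` flag are assumptions, and real enforcement would sit in the kernel sandbox rather than in Python.

```python
def may_release(
    agent_recommends_approve: bool,
    claim_verdicts: dict,
    policy_decision: str,
    human_signed: bool = False,
) -> bool:
    """Return True only if every deterministic layer allows the payout."""
    # The agent can only propose; a "no" from the agent ends the flow.
    if not agent_recommends_approve:
        return False
    # Verifier layer: one contradicted fact blocks the payout.
    if any(v == "CONTRADICTED" for v in claim_verdicts.values()):
        return False
    # Policy layer: DENY is final, ESCALATE needs a human signature.
    if policy_decision == "DENY":
        return False
    if policy_decision == "ESCALATE":
        return human_signed
    return policy_decision == "AUTONOMOUS"


# The fraudulent-refund case: the AI says approve, the ledger disagrees.
fraud = may_release(
    agent_recommends_approve=True,
    claim_verdicts={"order_1001.delivered": "CONTRADICTED"},
    policy_decision="AUTONOMOUS",
)

# An honest small refund passes every layer.
honest = may_release(
    agent_recommends_approve=True,
    claim_verdicts={"order_1002.delivered": "VERIFIED"},
    policy_decision="AUTONOMOUS",
)
```

Note that the fraudulent case is blocked even though the amount is inside the autonomous band, which is the difference between checking the limit and checking the claim.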
A real Nous Hermes agent, in a real kernel sandbox, paying real Stripe PaymentIntents — protecting ArgoBox, a production homelab. These numbers are pulled live from the running system as you read this.
Payman, Skyfire, Rain, Ramp, Catena — these are real B2B fintech companies, not hackathon projects. They have card issuance, stablecoin rails, and compliance frameworks we don't. What they don't have is the bottom three rows.
| Capability | Payman · Skyfire · Rain · Ramp · Catena | Custodian |
|---|---|---|
| Spend caps · approval · audit trail | ✓ table stakes | ✓ |
| Real card issuance & payment rails | ✓ (Ramp, Rain) | ✕ not our lane |
| Stablecoin / crypto rails | ✓ (Skyfire, Rain) | ✕ not our lane |
| SOC 2 / KYC compliance | ✓ (Payman, Catena) | ✕ early stage |
| Catches the agent lying — facts vs ground truth | ✕ none | ✓ only us |
| Enforcement below the agent — kernel, not API policy | ✕ none | ✓ only us |
| Self-hosted · non-custodial · rail-agnostic | ✕ they hold the funds | ✓ only us |
Our differentiator isn't payment infrastructure — it's enforcement architecture. The kernel sits underneath whatever rails you use. Plug in Stripe, a bank API, or a stablecoin — the enforcement model doesn't change.
Watch a real agent recommend approving a fraudulent refund — and watch the deterministic kernel override it, with real Stripe IDs and an append-only audit trail.
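One common way to make an audit trail tamper-evident is to hash-chain its entries, so that editing or removing an earlier record breaks every later hash. The sketch below shows that technique as an assumption about how an append-only log could be built; the text does not say how Custodian's trail is implemented, and the event fields and the `pi_example` ID are placeholders.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "refund", "agent": "approve", "verdict": "CONTRADICTED"})
append_entry(log, {"action": "refund", "final": "DENY", "payment_id": "pi_example"})
intact = verify_chain(log)

# Rewriting history after the fact is detectable.
log[0]["event"]["verdict"] = "VERIFIED"
tampered_ok = verify_chain(log)
```

A chain like this detects modification but does not prevent it, so in practice the log would also need to live somewhere the agent cannot write to.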
Money is just the first module. The same kernel governs any consequential action an AI agent can take — provisioning, payroll, data egress, infrastructure.