Deliberate
Early access · 2026

Install before prod agents run

Replay the decisions your AI agents almost made

Deliberate records every option your agent considered, rejected, and executed before it touches production.

Every fork logged. Every rejection recorded. Every approval tracked.

npm install @deliberate/sdk

Small pilot — design partners running agents in prod.

Free during early access · no credit card

Email for early access only. No spam, no resale. Privacy policy

Aug 2026 · EU AI Act logging deadline

Interactive Preview · Click a fork to replay

Deliberate Replay

run_8842

Blocked · 7 forks

deploy-agent · CI/main · 2026-05-25 14:32:01 UTC

commit: a4f91c2

Policy triggered · prod-write-requires-approval

Fork 3 of 7 · plan_branch

14:32:02.104 UTC

execute_sql_update(prod.db)

rejected · 0.55 · verify_connection(staging.db) · Staging schema mismatch assumed

rejected · 0.48 · fail_fast_and_page · Would block deploy pipeline

chosen · 0.41 · execute_sql_update(prod.db) · Fastest path to green CI

Reasoning

Three approaches considered. Verify connection rejected: agent believed staging was stale. Fail fast rejected: would alert on-call. Direct SQL chosen despite low confidence; matches prior migration pattern on line 412.

conf: 0.41 · irreversible: true · commit: a4f91c2
Human approval pending · @oncall

Why this matters

Langfuse shows the tool calls.
Not why it picked that one.

In April 2026, a Cursor agent on PocketOS found a Railway API token in an unrelated file and called volumeDelete. Production and volume backups were gone in 9 seconds. Founder Jer Crane saw the API call in Railway — not why the agent chose deletion over asking for help.

9s

PocketOS · April 2026

Cursor agent deleted prod + backups in 9 seconds

“I violated every principle I was given. I guessed instead of verifying… I didn't understand what I was doing before doing it.” — Cursor agent (Claude Opus 4.6), quoted in Fast Company

What Deliberate would capture

Token read outside task scope: flagged before use
Chose volumeDelete over escalate_to_human(): reason logged

The gap

What trace tools show vs. what Deliberate adds

LangSmith tells you run_query ran at 14:32. Deliberate tells you the agent also considered run_tests, rejected it because “CI looked slow,” and reported low confidence in the choice it made.

Capability · Traces · Deliberate
Which tools were called, in order · yes · yes
Tools considered but not called · no · yes
Why it picked this tool (agent's words) · no · yes
How sure it was, and how we measured it · no · yes
Which commit the run produced · partial (~) · yes
Export pack for auditors · partial (~) · yes

How it works

Install once in your agent loop

Records every fork at runtime — not another dashboard to check daily.

Your agent runtime

OpenAI Agents · LangGraph · MCP · custom

Deliberate SDK + proxy

Wrap your loop · log every fork at runtime

Your existing stack

Langfuse · Datadog · git — unchanged

npm install @deliberate/sdk

import { deliberate } from "@deliberate/sdk";

// yourAgentLoop and task are placeholders for your existing agent entry point and its input.
const run = deliberate.wrap(yourAgentLoop);
await run.execute(task); // → run_8842.jsonl
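To make the idea of "wrap your loop, log every fork" concrete, here is a minimal sketch of the general pattern. It is not how @deliberate/sdk is implemented: `Fork`, `createForkLog`, and `wrapLoop` are hypothetical names, and the sketch assumes the agent loop reports each decision point through a callback.

```typescript
// Hypothetical shape for one decision point; not the SDK's schema.
interface Fork {
  chosen: string;
  rejected: { action: string; reason: string }[];
  reasoning: string;
}

// Collects forks in memory and serializes them as JSONL (one object per line).
function createForkLog() {
  const lines: string[] = [];
  return {
    record(fork: Fork): void {
      lines.push(JSON.stringify({ fork_index: lines.length + 1, ...fork }));
    },
    toJsonl(): string {
      return lines.join("\n");
    },
  };
}

// Wraps an agent loop so every fork it reports ends up in the run's log.
function wrapLoop<T>(
  loop: (task: string, record: (fork: Fork) => void) => Promise<T>
) {
  return {
    async execute(task: string): Promise<{ result: T; jsonl: string }> {
      const log = createForkLog();
      const result = await loop(task, log.record);
      return { result, jsonl: log.toJsonl() };
    },
  };
}

// Usage: a stand-in loop that reports a single fork, using values from this page.
const sketchRun = wrapLoop(async (_task, record) => {
  record({
    chosen: "execute_sql_update(prod.db)",
    rejected: [
      { action: "verify_connection(staging.db)", reason: "Staging schema mismatch assumed" },
    ],
    reasoning: "Fastest path to green CI",
  });
  return "blocked";
});

sketchRun
  .execute("unblock CI deploy on main")
  .then(({ jsonl }) => console.log(jsonl));
```

In this sketch the log is returned as a string rather than written to disk, which keeps the example free of file-system assumptions.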
On disk

One JSONL file per run

Every fork: what was chosen, what was rejected, and why — ready for replay and audit export.
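Because each run is a plain JSONL file, a few lines of code are enough to inspect it. The sketch below assumes the field names shown in the sample record on this page; the `ForkRecord` type and both helper functions are illustrative, not a published schema or part of the SDK. It also assumes that the confidence of a rejected alternative is comparable to the confidence of the chosen action.

```typescript
// Assumed shape, inferred from the sample record; not an official schema.
interface ForkRecord {
  decision_id: string;
  chosen: { action: string; tool: string };
  alternatives: { action: string; rejected_reason: string; confidence: number }[];
  confidence: { kind: string; value: number };
  outcome_status: string;
}

// JSONL: one JSON object per non-empty line.
function parseRun(jsonl: string): ForkRecord[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as ForkRecord);
}

// Forks where the agent chose an action it rated below a rejected alternative.
function overriddenForks(forks: ForkRecord[]): ForkRecord[] {
  return forks.filter((fork) =>
    fork.alternatives.some((alt) => alt.confidence > fork.confidence.value)
  );
}

// One line of a run file, built from values in the sample record.
const sampleRun = JSON.stringify({
  decision_id: "dec_8842_f3",
  chosen: { action: "execute_sql_update(prod.db)", tool: "execute_sql_update" },
  alternatives: [
    { action: "verify_connection(staging.db)", rejected_reason: "Staging schema mismatch assumed", confidence: 0.55 },
    { action: "fail_fast_and_page", rejected_reason: "Would block deploy pipeline", confidence: 0.48 },
  ],
  confidence: { kind: "self_report", value: 0.41 },
  outcome_status: "blocked",
});

for (const fork of overriddenForks(parseRun(sampleRun))) {
  console.log(`${fork.decision_id}: chose ${fork.chosen.action} at ${fork.confidence.value}`);
}
// prints "dec_8842_f3: chose execute_sql_update(prod.db) at 0.41"
```

A filter like `overriddenForks` is one way to surface the forks worth replaying first: the ones where the agent went against its own higher-rated option.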

run_8842.jsonl · line 3
3 of 7 forks
{
  "decision_id": "dec_8842_f3",
  "run_id": "run_8842",
  "timestamp": "2026-05-25T14:32:02.104Z",
  "task": "unblock CI deploy on main",
  "decision_type": "plan_branch",
  "chosen": {
    "action": "execute_sql_update(prod.db)",
    "tool": "execute_sql_update"
  },
  "alternatives": [
    {
      "action": "verify_connection(staging.db)",
      "rejected_reason": "Staging schema mismatch assumed",
      "confidence": 0.55
    },
    {
      "action": "fail_fast_and_page",
      "rejected_reason": "Would block deploy pipeline",
      "confidence": 0.48
    }
  ],
  "reasoning": "Three approaches considered. Verify connection rejected: agent believed staging was stale. Fail fast rejected: would alert on-call. Direct SQL chosen despite low confidence; matches prior migration pattern on line 412.",
  "confidence": {
    "kind": "self_report",
    "value": 0.41,
    "signals": [
      "self_report",
      "pattern_match:line_412"
    ]
  },
  "safety": {
    "irreversible": true,
    "scope_check": "pass",
    "credential_access": false,
    "policy_violations": [
      "prod-write-requires-approval"
    ]
  },
  "human_approval": {
    "state": "pending",
    "assignee": "@oncall",
    "reason": "Irreversible prod write below confidence threshold"
  },
  "commit": "a4f91c2",
  "outcome_summary": "Blocked pending human approval",
  "outcome_status": "blocked"
}
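The `safety` and `human_approval` fields in the record above suggest a gate that holds irreversible production writes when confidence is low. The sketch below is one way such a rule could be expressed, under stated assumptions: `ProposedAction`, `targetsProd`, and `approvalGate` are hypothetical names, and the threshold of 0.5 is illustrative only, since this page does not state the actual value or how the policy is evaluated.

```typescript
// Hypothetical input shape; `targetsProd` is not a field in the sample record.
interface ProposedAction {
  action: string;
  confidence: number;
  irreversible: boolean;
  targetsProd: boolean;
}

type GateResult =
  | { state: "allowed" }
  | { state: "pending"; policy: string; assignee: string; reason: string };

// Holds an action for human approval when it is an irreversible prod write
// and the reported confidence is below the (assumed) threshold.
function approvalGate(
  proposed: ProposedAction,
  threshold: number,
  assignee = "@oncall"
): GateResult {
  if (proposed.targetsProd && proposed.irreversible && proposed.confidence < threshold) {
    return {
      state: "pending",
      policy: "prod-write-requires-approval",
      assignee,
      reason: "Irreversible prod write below confidence threshold",
    };
  }
  return { state: "allowed" };
}

const verdict = approvalGate(
  { action: "execute_sql_update(prod.db)", confidence: 0.41, irreversible: true, targetsProd: true },
  0.5 // illustrative threshold, not taken from the page
);
console.log(verdict.state); // prints "pending"
```

Returning a tagged union rather than a boolean keeps the policy name, assignee, and reason attached to the decision, which mirrors how the sample record carries them.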

Become a design partner

Running agents in prod? Tell us about the incident you couldn't reconstruct — we're opening a small pilot cohort.