Install before prod agents run
Replay the decisions your AI agents almost made
Deliberate records every option your agent considered, rejected, and executed before it touches production.
Every fork logged. Every rejection recorded. Every approval tracked.
npm install @deliberate/sdk
Interactive Preview · Click a fork to replay
Replay · run_8842
Blocked · 7 forks · deploy-agent · CI/main · 2026-05-25 14:32:01 UTC
commit a4f91c2
Policy triggered · prod-write-requires-approval
Fork 3 of 7 · plan_branch
14:32:02.104 UTC
Chosen: execute_sql_update(prod.db)
verify_connection(staging.db) · rejected · Staging schema mismatch assumed
fail_fast_and_page · rejected · Would block deploy pipeline
execute_sql_update(prod.db) · chosen · Fastest path to green CI
Reasoning
Three approaches considered. Verify connection rejected: agent believed staging was stale. Fail fast rejected: would alert on-call. Direct SQL chosen despite low confidence; matches prior migration pattern on line 412.
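To make the blocked fork in the preview concrete, here is a minimal sketch of how a rule like prod-write-requires-approval could be evaluated. It is not the Deliberate SDK's implementation: the `Decision` and `Verdict` types, the `evaluate` function, and the 0.5 threshold are all assumptions made for illustration. Only the policy name, the 0.41 confidence, and the block reason come from this page.

```typescript
// Hypothetical shapes for illustration; not part of @deliberate/sdk.
interface Decision {
  action: string;
  target: "prod" | "staging";
  irreversible: boolean;
  confidence: number; // self-reported, between 0 and 1
}

interface Verdict {
  status: "allowed" | "blocked";
  policyViolations: string[];
  reason?: string;
}

// Assumed threshold: the page does not state the real value.
const CONFIDENCE_THRESHOLD = 0.5;

function evaluate(decision: Decision): Verdict {
  const violations: string[] = [];

  // An irreversible production write with low confidence needs a human.
  if (
    decision.target === "prod" &&
    decision.irreversible &&
    decision.confidence < CONFIDENCE_THRESHOLD
  ) {
    violations.push("prod-write-requires-approval");
  }

  if (violations.length > 0) {
    return {
      status: "blocked",
      policyViolations: violations,
      reason: "Irreversible prod write below confidence threshold",
    };
  }
  return { status: "allowed", policyViolations: [] };
}

// Fork 3 of run_8842, as shown in the preview.
const verdict = evaluate({
  action: "execute_sql_update(prod.db)",
  target: "prod",
  irreversible: true,
  confidence: 0.41,
});
console.log(verdict.status, verdict.policyViolations);
```

Under these assumptions the fork is blocked rather than executed, which matches the "Blocked" state in the preview. A real policy engine would likely take its rules from configuration instead of hard-coding them.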
Langfuse shows the tool calls.
Not why it picked that one.
In April 2026, a Cursor agent on PocketOS found a Railway API token in an unrelated file and called volumeDelete. Production and volume backups were gone in 9 seconds. Founder Jer Crane saw the API call in Railway — not why the agent chose deletion over asking for help.
PocketOS · April 2026
Cursor agent deleted prod + backups in 9 seconds
“I violated every principle I was given. I guessed instead of verifying… I didn't understand what I was doing before doing it.” — Cursor agent (Claude Opus 4.6), quoted in Fast Company
What Deliberate would capture
What trace tools show vs. what Deliberate adds
LangSmith tells you run_query ran at 14:32. Deliberate tells you the agent also considered run_tests, rejected it because “CI looked slow,” and reported low confidence in its choice.
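The contrast above can be sketched as two data shapes. This is an illustration only: `TraceSpan`, `ForkRecord`, and `describeFork` are hypothetical names, the field names are borrowed from the sample record further down this page, and the 0.4 confidence is a made-up value since the comparison gives no number.

```typescript
// What a trace tool typically records: which tool ran, and when.
interface TraceSpan {
  tool: string;
  startedAt: string;
}

// What a fork record adds: the alternatives and why they lost.
// Field names follow the sample record on this page; the types are assumed.
interface Alternative {
  action: string;
  rejected_reason: string;
}

interface ForkRecord {
  chosen: { action: string; tool: string };
  alternatives: Alternative[];
  confidence: { kind: string; value: number };
}

// Render a fork as one human-readable line for an incident review.
function describeFork(fork: ForkRecord): string {
  const rejected = fork.alternatives
    .map((alt) => `${alt.action} (rejected: ${alt.rejected_reason})`)
    .join("; ");
  return (
    `chose ${fork.chosen.action}; considered ${rejected}; ` +
    `confidence ${fork.confidence.value.toFixed(2)}`
  );
}

const span: TraceSpan = { tool: "run_query", startedAt: "14:32" };

const fork: ForkRecord = {
  chosen: { action: "run_query", tool: "run_query" },
  alternatives: [{ action: "run_tests", rejected_reason: "CI looked slow" }],
  confidence: { kind: "self_report", value: 0.4 }, // illustrative value
};

console.log(`${span.tool} ran at ${span.startedAt}`);
console.log(describeFork(fork));
```

The span answers "what ran"; the fork record answers "what else was on the table", which is the part an incident review usually has to reconstruct by hand.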
Install once in your agent loop
Records every fork at runtime — not another dashboard to check daily.
Your agent runtime
OpenAI Agents · LangGraph · MCP · custom
Deliberate SDK + proxy
Wrap your loop · log every fork at runtime
Your existing stack
Langfuse · Datadog · git — unchanged
npm install @deliberate/sdk
import { deliberate } from "@deliberate/sdk";
const run = deliberate.wrap(yourAgentLoop);
await run.execute(task); // → run_8842.jsonl
One JSONL file per run
Every fork: what was chosen, what was rejected, and why — ready for replay and audit export.
{
  "decision_id": "dec_8842_f3",
  "run_id": "run_8842",
  "timestamp": "2026-05-25T14:32:02.104Z",
  "task": "unblock CI deploy on main",
  "decision_type": "plan_branch",
  "chosen": {
    "action": "execute_sql_update(prod.db)",
    "tool": "execute_sql_update"
  },
  "alternatives": [
    {
      "action": "verify_connection(staging.db)",
      "rejected_reason": "Staging schema mismatch assumed",
      "confidence": 0.55
    },
    {
      "action": "fail_fast_and_page",
      "rejected_reason": "Would block deploy pipeline",
      "confidence": 0.48
    }
  ],
  "reasoning": "Three approaches considered. Verify connection rejected: agent believed staging was stale. Fail fast rejected: would alert on-call. Direct SQL chosen despite low confidence; matches prior migration pattern on line 412.",
  "confidence": {
    "kind": "self_report",
    "value": 0.41,
    "signals": [
      "self_report",
      "pattern_match:line_412"
    ]
  },
  "safety": {
    "irreversible": true,
    "scope_check": "pass",
    "credential_access": false,
    "policy_violations": [
      "prod-write-requires-approval"
    ]
  },
  "human_approval": {
    "state": "pending",
    "assignee": "@oncall",
    "reason": "Irreversible prod write below confidence threshold"
  },
  "commit": "a4f91c2",
  "outcome_summary": "Blocked pending human approval",
  "outcome_status": "blocked"
}
Become a design partner
Running agents in prod? Tell us about the incident you couldn't reconstruct — we're opening a small pilot cohort.