Agent Rollbacks Need Rehearsals, Not Hope

Diagram showing an AI agent proposing a change, running it in a sandbox, checking traces, rehearsing rollback, and only then merging

The scariest agent failure is not the one where the bot obviously faceplants.

That one is annoying, but at least everybody can see the smoke.

The scarier failure is the clean-looking PR. Green checks. Confident summary. Nice little bullet list. The agent says it updated the config, migrated the thing, adjusted the integration, cleaned up the scripts, and verified the happy path.

Then production coughs blood six hours later and the rollback plan is: “uh, revert the PR?”

Cute.

Agent rollbacks need rehearsals.

Not vibes. Not a paragraph in the runbook. Not a human squinting at a diff at midnight. A rehearsed, logged, tool-enforced undo path before the change is allowed to merge.

Source freshness check: this post was checked on 2026-06-25. The agent/devtool stack is moving right now: Gemini CLI was pushed on 2026-06-25, Claude Code, OpenAI Agents SDK, MCP specification, GitHub MCP Server, LangGraph, and Semantic Kernel were all pushed on 2026-06-24. The current direction is obvious: agents are becoming active execution surfaces around repos, tools, workflows, and external systems. That makes rollback discipline current, not theoretical.

“Revert the PR” is not a rollback strategy

A git revert is a useful tool.

It is not a complete rollback strategy.

It does not automatically undo:

database writes
migrations that already ran
generated files shipped elsewhere
feature flags flipped by a tool call
tickets/comments/statuses posted to external systems
cache invalidations
secrets rotated by accident
cloud resources created during a run
user-visible state changed outside the repo

Agents make this worse because the change boundary gets fuzzy.

A human might say “I edited three files.”

An agent might edit three files, call an MCP server, regenerate snapshots, touch a migration, retry a flaky command, update a config file, and leave a trail of side effects across tools. If the final PR only shows the repo diff, your rollback view is missing half the damn crime scene.

Confused John Travolta reaction GIF representing an engineer looking for the agent rollback plan after a failure

Every agent change should produce an undo receipt

If an agent can produce a diff, it can produce an undo receipt.

That receipt should say, in boring concrete terms:

change_id: agt_change_2026_06_25_1048
agent: dependency-maintenance-agent
intent: update auth middleware and regenerate client types
repo_diff: pull_request_1842
external_side_effects:
  - tool: github.issue.comment
    target: issue_918
    rollback: delete_comment_or_append_correction
  - tool: feature_flags.update
    target: auth.new_session_validator
    rollback: set false
  - tool: db.migration.plan
    target: migration_20260625_auth_sessions
    rollback: migration_20260625_auth_sessions_down
required_rehearsals:
  - clean checkout revert applies
  - migration down succeeds against fixture database
  - feature flag reset works with scoped token
  - generated client can be restored from previous artifact
result: rollback_rehearsed

Not a novel.

A receipt.

Something CI, reviewers, and incident responders can inspect without asking the model to explain itself after the blast radius has already expanded.

The sandbox should test the undo path too

Most agent safety talk stops at “run it in a sandbox.”

Good start. Not enough.

A sandbox that only proves the forward path is half a seatbelt.

The agent should have to prove:

apply the change
run the relevant checks
capture side effects
execute the rollback
run the checks again
prove the system returns to the expected baseline

If step four fails, the change is not ready.

If step six fails, the change is not ready.

If the agent cannot name what it changed outside git, the change is not ready.

This is not bureaucracy. This is how you keep “agent velocity” from turning into “automated incident generation.”

Tool calls need inverse operations

Agent tool design is still too obsessed with what the agent can do.

The grown-up question is: can it undo what it did?

For every write-capable tool, ask for the inverse:

create_branch → delete_branch
post_comment → delete_comment or supersede_comment
set_feature_flag → restore previous flag value
run_migration → validated down migration or explicit irreversible marker
create_cloud_resource → destroy resource plus cleanup verification
rotate_secret → rollback procedure, dependent service refresh, and audit note

Some operations are intentionally irreversible.

Fine. Then the agent should say that before it runs them, and the workflow should require stronger approval.

“Cannot rollback” is valid metadata.

“Oops” is not.

This Is Fine reaction GIF representing a team pretending an unrehearsed agent rollback will work during an incident

Review should block on rollback evidence

Human review should not be reduced to reading whatever confident summary the agent wrote.

The review checklist needs hard gates:

What changed in git?
What changed outside git?
Which tool calls were write-capable?
Which side effects are reversible?
Which side effects are irreversible?
Was rollback rehearsed in a sandbox?
Did the rollback return tests, fixtures, generated artifacts, and traces to baseline?
Where is the receipt?

If the receipt is missing, block the merge.

If the rollback was not rehearsed, block the merge.

If the agent cannot explain external side effects, block the merge.

Yes, that slows things down.

Good.

You know what slows things down more? A fast agent creating a slow incident.

Agents need change classes

Not every agent task needs the same rollback ceremony.

Fixing a typo is not rotating a secret. Updating docs is not migrating customer state. The answer is not “make every agent change painful.” The answer is change classes.

A sane setup looks like this:

class_0_docs_only:
  rollback: git_revert
  rehearsal: optional

class_1_code_only:
  rollback: git_revert_plus_tests
  rehearsal: required_in_ci

class_2_generated_or_config:
  rollback: restore_artifact_or_previous_config
  rehearsal: required_with_diff_receipt

class_3_external_side_effects:
  rollback: inverse_tool_calls_required
  rehearsal: required_in_sandbox
  approval: human

class_4_irreversible_or_prod_data:
  rollback: impossible_or_partial
  rehearsal: dry_run_required
  approval: explicit_human_plus_runbook

That gives agents room to move without giving them a chainsaw in a daycare.

The real feature is confidence you can unwind

The agent market keeps selling speed.

Speed is nice.

But the feature serious teams will pay for is confidence: the confidence that a delegated change can be inspected, constrained, replayed, and unwound.

That means the agent platform needs receipts. The tool layer needs inverse operations. The sandbox needs rollback tests. The PR needs evidence. The reviewer needs a red button that actually works.

Because the first time your agent breaks something important, nobody will care that it saved twelve minutes on the implementation.

They will ask one question:

Can we roll this back?

If the answer is “probably,” you already fucked up.