Agent Artifacts Need Chain of Custody, Not Mystery Meat

Diagram of an AI agent artifact moving through receipts, policy gates, and a signed artifact store

The next agent security mess is not going to start with a robot dramatically deploying production at midnight.

It is going to start with a file.

A patch. A screenshot. A test log. A generated migration. A browser recording. A “summary” pasted into a ticket. A neat little artifact that looks official because the product wrapped it in a nice card and a green check.

Then someone will ask the only question that matters:

Where did this come from?

And the answer will be some version of: “the agent made it.”

That is not provenance. That is vibes wearing a badge.

Agent artifacts need chain of custody. Every diff, report, file, recording, plan, and deploy bundle should carry receipts: which model, which prompt, which tools, which inputs, which approvals, which policy gates, which tests, which human handoff, and which version of reality it was based on.

No receipt? Then it is not an artifact. It is a rumor with a filename.

Source freshness check: this post was checked on 2026-06-29 against active agent/devtool projects and protocols. Recent commits show the stack is still moving right now: LangGraph on 2026-06-28, GitHub MCP Server on 2026-06-27, Gemini CLI on 2026-06-26, Claude Code on 2026-06-26, OpenAI Agents SDK on 2026-06-25, and Model Context Protocol on 2026-06-25. The current pattern is clear: agents are becoming tool-using execution systems, not chat boxes. That makes artifact provenance a live product problem, not compliance theater.

Generated output is not automatically evidence

Teams are about to drown in agent-generated “evidence.”

The agent says tests passed. The agent says the screenshot matches. The agent says it read the docs. The agent says the dependency is safe. The agent says the migration is reversible. The agent says the PR is ready.

Cool story.

Show the receipt.

An artifact should answer basic questions without making a reviewer dig through a cursed transcript:

what input triggered it?
what model and agent version produced it?
what tools were called?
what commands ran?
what files were read and written?
what network resources were accessed?
what secrets or restricted scopes were explicitly denied?
what policy checks passed?
what tests failed, flaked, or were skipped?
what human approval changed the run state?
what later context invalidated the result?

That sounds heavy until you realize build systems, CI, package registries, deployment platforms, and security teams already live in this world.

Agents do not get an exemption because their summaries are confident.

Confused John Travolta reaction GIF representing a reviewer trying to find where an agent artifact actually came from

The artifact should carry its own passport

A serious agent artifact needs metadata attached at creation time, not reconstructed after the fire.

Something like:

{
  "artifact_id": "art_2026_06_29_0017",
  "type": "pull_request_diff",
  "created_at": "2026-06-29T01:00:00Z",
  "agent_run": "run_7f42",
  "model": "agent-model@version",
  "workspace": "repo:payments-service@8c9a12f",
  "prompt_hash": "sha256:...",
  "tool_receipts": [
    { "tool": "git.diff", "status": "ok" },
    { "tool": "npm.test", "status": "failed_then_passed" },
    { "tool": "browser.screenshot", "status": "ok" }
  ],
  "policy": {
    "network": "deny_by_default",
    "secrets": "not_granted",
    "production_write": "not_granted",
    "human_approval": "required_before_merge"
  },
  "verification": {
    "tests": "passed",
    "lint": "passed",
    "review_required": true
  }
}

You do not need that exact schema. You need the product behavior.

When a developer opens an agent-produced PR, the provenance should be visible. When a manager reads an agent-produced incident summary, the source materials should be traceable. When a deploy system receives an agent-generated bundle, the deploy gate should know whether the artifact came from a clean workspace with approved tools or a haunted side quest.

If your artifact store cannot tell the difference, your agent product is smuggling trust through the UI.

Screenshots and summaries are especially dangerous

Code diffs at least leave a trail. Screenshots and summaries are worse.

They feel like evidence, but they are easy to detach from the run that produced them.

A screenshot without context does not prove the right branch was running. A summary without citations does not prove the agent read the right docs. A test log without command receipts does not prove the exact patch was tested. A browser recording without environment metadata does not prove the production path was exercised.

This is how teams accidentally promote generated artifacts into truth.

The UI says “verification complete.”

The artifact says nothing.

The human assumes the missing context exists somewhere.

It does not.

This Is Fine reaction GIF representing a team trusting agent-generated artifacts without provenance

Chain of custody is product UX

Do not bury provenance in a compliance export nobody opens.

Put it where the work happens.

A PR card should show:

source commit and branch
agent run ID
tool call summary
tests run and tests skipped
files touched
permissions granted
approvals requested
known limitations
replay or inspect button

An agent report should show citations, source timestamps, and stale-source warnings.

A deploy artifact should show whether it was produced from a clean checkout, whether generated files were modified by humans afterward, and whether the agent had permission to touch production-facing config.

This is not “enterprise checkbox” stuff. This is how you keep builders from trusting mystery meat.

The best provenance UX is boring in the happy path and loud when something smells wrong:

This artifact was generated from an uncommitted workspace.

This summary cites source material older than 90 days.

This screenshot was captured before the latest patch.

This diff was produced after a denied tool call.

This deploy bundle has no matching test receipt.

That is useful friction. The good kind. The kind that saves your ass.

Receipts make agents easier to trust, not harder to use

Some teams hear “provenance” and think paperwork.

Wrong framing.

Receipts make agents faster because reviewers stop asking the same defensive questions every time.

Without provenance, every agent artifact starts from zero trust. The human has to inspect, rerun, verify, and mentally reconstruct the story.

With provenance, the review starts halfway up the ladder:

here is what the agent changed
here is how it verified the change
here is what it could not verify
here is what it was forbidden to touch
here is what needs human judgment

That is not bureaucracy. That is compression.

A good agent product should make trustworthy artifacts cheap and untrusted artifacts obvious.

The bottom line

Agents are moving from chat outputs to work outputs.

Work outputs become artifacts.

Artifacts get reviewed, copied, merged, deployed, audited, and used as evidence.

If those artifacts do not carry chain of custody, the organization eventually has to choose between blind trust and manual rework.

Both suck.

So build the receipt layer now:

sign artifacts or at least hash and bind them to a run
store tool receipts beside outputs
make source freshness and environment metadata visible
warn when artifacts are stale, partial, or detached
require policy gates before high-impact artifacts can move downstream
make replay/inspection part of the artifact UX

Agent products that get this right will feel calmer. Less magic show, more cockpit recorder.

The ones that skip it will keep shipping mystery meat and calling it automation.