Clord

The File-Based Coordination Pattern for AI Agents

Forget complex message queues and real-time protocols. The simplest, most reliable way to coordinate AI agents is through plain files on disk. Here's why this pattern works and when to use it.


The Problem with Agent Communication

When people start building multi-agent systems, they instinctively reach for sophisticated communication mechanisms. Message queues. WebSocket channels. Shared memory spaces. Event-driven architectures.

These all work. They're also all dramatically over-engineered for most AI agent workflows.

The vast majority of AI agent coordination follows a simple pattern: one agent produces output, another agent consumes it. That's it. The complexity isn't in the communication; it's in how the work is decomposed and in the quality of each agent's output.

File-based coordination: agents read and write to a shared filesystem, with an orchestrator managing flow

Files as the Universal Interface

Here's the pattern we use for most of our multi-agent workflows:

  1. An orchestrator defines the task breakdown and creates a plan file
  2. Each agent reads its inputs from the filesystem
  3. Each agent writes its outputs to the filesystem
  4. The orchestrator checks outputs and triggers the next step

No message queues. No real-time protocols. No serialisation frameworks. Just files.
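The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than our actual orchestrator: the `Agent` type, the `run_pipeline` function, and the "output must not be empty" check are all assumptions made for the example, and a real agent would call a model instead of a plain function.

```python
from pathlib import Path
from typing import Callable

# Hypothetical agent type: takes the text of its input file, returns output text.
Agent = Callable[[str], str]


def run_pipeline(workdir: Path, steps: list[tuple[str, str, Agent]], plan: str) -> list[str]:
    """Run each (input_name, output_name, agent) step in order.

    Returns the names of the output files that were written.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    # Step 1: the orchestrator writes the plan file.
    (workdir / "ROADMAP.md").write_text(plan, encoding="utf-8")

    written = []
    for input_name, output_name, agent in steps:
        # Step 2: the agent's input comes from the filesystem.
        source = (workdir / input_name).read_text(encoding="utf-8")
        # Step 3: the agent's output goes back to the filesystem.
        out_path = workdir / output_name
        out_path.write_text(agent(source), encoding="utf-8")
        # Step 4: the orchestrator checks the output before moving on.
        # (An emptiness check is an assumed stand-in for real validation.)
        if not out_path.read_text(encoding="utf-8").strip():
            raise RuntimeError(f"{output_name} is empty; stopping")
        written.append(output_name)
    return written
```

Each step only knows two file names, so swapping an agent out, or running one by hand, does not touch the rest of the pipeline.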

Why This Works

Debuggability. When something goes wrong (and it will), you can inspect every agent's input and output by reading files. No log parsing, no event replay, no distributed tracing. Just cat the-file.md.

Resumability. If an agent fails halfway through, whatever it had already written is still on disk. You can inspect it, fix the issue, and resume from the failed step rather than from the beginning. With purely in-memory communication, a crash usually means starting over.
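One way to get this behaviour is sketched below, under some assumptions of our own: the agent streams its output in chunks, unfinished work goes to a `.partial` file (a naming convention invented for this example), and the file is renamed to its final name only when the agent finishes. The `run_step` helper is hypothetical, and it resumes at the granularity of a whole step, not mid-file.

```python
import os
from pathlib import Path
from typing import Callable, Iterable


def run_step(output: Path, chunks: Callable[[], Iterable[str]]) -> bool:
    """Run a step unless its output already exists. Returns True if it ran."""
    if output.exists():
        return False  # finished on a previous run, so skip it when resuming

    partial = output.with_name(output.name + ".partial")
    with partial.open("w", encoding="utf-8") as fh:
        for chunk in chunks():
            fh.write(chunk)
            fh.flush()  # keep progress visible on disk as it happens

    # Only a completed step gets the final file name, so a crash never
    # leaves a half-written file that looks finished.
    os.replace(partial, output)
    return True
```

After a crash, the `.partial` file is there to inspect, and rerunning the pipeline skips every step whose final file already exists.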

Simplicity. Every programming language, every AI model, every tool on earth knows how to read and write files. You're not locked into a framework, a protocol, or a runtime.

Auditability. The entire agent workflow is captured in the filesystem. You have a complete record of what every agent saw and produced. This is invaluable for improving your prompts and debugging quality issues.


The Pattern in Practice

A typical workflow looks like this:

.planning/
├── ROADMAP.md           ← orchestrator creates this
├── phase-1/
│   ├── RESEARCH.md      ← research agent writes this
│   ├── PLAN.md          ← planning agent reads RESEARCH.md, writes this
│   └── VERIFICATION.md  ← verifier reads outputs, writes this
├── phase-2/
│   ├── RESEARCH.md
│   ├── PLAN.md
│   └── VERIFICATION.md
└── STATE.md             ← orchestrator tracks progress here

Each file has a clear producer and clear consumers. The orchestrator reads state files to determine what to run next. Agents read their input files and write their output files. That's the entire coordination layer.
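The producer and consumer relationships in the tree above can be written down as data. The sketch below is an illustration, not our orchestrator: `DEPENDENCIES` and `next_outputs` are hypothetical names, the assumption that the verifier reads both RESEARCH.md and PLAN.md is ours, and it decides what to run from which files exist rather than from STATE.md.

```python
from pathlib import Path

# Each output file mapped to the input files its producer reads.
# Assumption: the verifier consumes both earlier outputs of the phase.
DEPENDENCIES = {
    "RESEARCH.md": [],
    "PLAN.md": ["RESEARCH.md"],
    "VERIFICATION.md": ["RESEARCH.md", "PLAN.md"],
}


def next_outputs(phase_dir: Path) -> list[str]:
    """Return outputs that are still missing but whose inputs all exist."""
    ready = []
    for output, inputs in DEPENDENCIES.items():
        if (phase_dir / output).exists():
            continue  # already produced
        if all((phase_dir / name).exists() for name in inputs):
            ready.append(output)
    return ready
```

Calling `next_outputs(Path(".planning/phase-1"))` on an empty phase directory returns only the research step, because nothing else has its inputs yet.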

A clean, organised workspace — files as coordination is about simplicity and clarity

When Files Beat Messages

Sequential Workflows

Most real-world agent workflows are sequential: research, then plan, then execute, then verify. There's no benefit to real-time messaging when the consumer can't start until the producer is completely finished. A file is the natural output format.

Human-in-the-Loop

When a human needs to review or modify an agent's output before the next step (and they should, especially early on), files are perfect. The human reads the file, makes edits, and the next agent picks up the modified version. Try doing that with a message queue.
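If you want the audit trail to show whether a human changed an agent's output, one option is to record a hash when the agent writes the file and compare it before the next step runs. This is a sketch of that idea only. The `.sha256` sidecar file and the function names are conventions invented for this example.

```python
import hashlib
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 of the file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_agent_output(path: Path) -> None:
    """Store the digest of the file exactly as the agent wrote it."""
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(digest(path), encoding="utf-8")


def edited_by_human(path: Path) -> bool:
    """True if the file no longer matches what the agent wrote."""
    sidecar = path.with_name(path.name + ".sha256")
    return digest(path) != sidecar.read_text(encoding="utf-8")
```

The next agent still just reads the file. The hash only lets the orchestrator note in its records that the version it passed on was reviewed and modified.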

Large Outputs

AI agents often produce substantial outputs — detailed plans, research summaries, full code files. These are awkward to pass through message queues but perfectly natural as files.

When Files Don't Work

To be fair, file-based coordination isn't always the right choice:

  • Real-time collaboration between agents that need to react to each other's output mid-generation — this genuinely needs streaming communication
  • Very high-frequency coordination where hundreds of agents are producing micro-outputs per second — files would be a bottleneck
  • Distributed systems where agents run on different machines without shared filesystem access

For most AI coding workflows, though, none of these apply. Your agents run on the same machine, produce outputs sequentially, and benefit from human review between steps.

Implementation Tips

Use markdown. It's human-readable, AI-readable, and versionable. Structured formats like JSON work too, but markdown is more forgiving and easier to review.

Include metadata. Each file should indicate who produced it, when, and what inputs it consumed. This makes debugging trivial.
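A small header at the top of each file is one way to carry this metadata. The sketch below assumes a front-matter style block delimited by `---` lines, with field names (`producer`, `created`, `inputs`) chosen for the example. None of this is a required format, and the parser handles only simple `key: value` lines.

```python
from datetime import datetime, timezone
from pathlib import Path


def write_with_metadata(path: Path, body: str, producer: str, inputs: list[str]) -> None:
    """Write body to path, preceded by a small metadata header."""
    created = datetime.now(timezone.utc).isoformat(timespec="seconds")
    header = [
        "---",
        f"producer: {producer}",
        f"created: {created}",
        f"inputs: {', '.join(inputs)}",
        "---",
        "",
    ]
    path.write_text("\n".join(header) + "\n" + body, encoding="utf-8")


def read_metadata(path: Path) -> dict[str, str]:
    """Return the header fields as a dict, or an empty dict if there is no header."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0] != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line == "---":
            break  # end of header
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta
```

Because the header is plain text, it shows up in the same `cat` you would use to read the file, so no separate tooling is needed to see where a file came from.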

Use a state file. Keep a single file that tracks which phases are complete, which are in progress, and which are pending. The orchestrator reads it to decide what to run next.
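As an illustration, a state file can be a markdown list with one line per phase. The format below (`- phase-1: complete`), the three status strings, and the helper names are assumptions for this sketch, not a description of what our STATE.md actually contains.

```python
from pathlib import Path
from typing import Optional


def write_state(path: Path, state: dict[str, str]) -> None:
    """Write the state as a markdown list, one '- name: status' line per phase."""
    lines = ["# State", ""] + [f"- {name}: {status}" for name, status in state.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_state(path: Path) -> dict[str, str]:
    """Parse '- name: status' lines back into a dict, ignoring everything else."""
    state = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.startswith("- "):
            continue  # skip the heading and blank lines
        name, _, status = line[2:].partition(":")
        state[name.strip()] = status.strip()
    return state


def next_phase(state: dict[str, str]) -> Optional[str]:
    """Resume an in-progress phase first, otherwise take the first pending one."""
    for wanted in ("in-progress", "pending"):
        for name, status in state.items():
            if status == wanted:
                return name
    return None  # everything is complete
```

A human can edit this file by hand to force a phase to rerun: change `complete` back to `pending` and the orchestrator picks it up on its next pass.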

Keep the directory structure shallow and obvious. If someone unfamiliar with your system looks at the directory, they should understand the workflow from the file and folder names alone.

The Takeaway

The best coordination mechanism is the simplest one that works. For AI agent workflows, that's almost always files.

Don't reach for distributed systems primitives when you're orchestrating 3-5 agents on a single machine. Write files. Read files. Ship.

We'll explore more advanced coordination patterns (including when you actually do need something beyond files) in future posts. But for 90% of real-world AI agent work, this is all you need.