Mayor, Witness, Refinery, Deacon — the six-role vocabulary

neul-labs
architecture, roles

Every orchestration system needs a vocabulary. The vocabulary is not the system; the vocabulary is how humans talk about the system at 11pm when something is on fire. We gave brat one early because we’d been burned, on previous projects, by trying to discuss what was happening using generic words like “worker” and “service” and “manager.” Generic words make every conversation re-derive its terms. Specific words let people skip to the actual question.

So brat has six roles, named after parts of a small-town transport operation. They are: Mayor, Convoy, Task, Witness, Refinery, Deacon. This post is a tour through each of them — what they do, why they exist, and what we deliberately left out.

Mayor

The Mayor is the AI orchestrator. It reads your codebase, breaks work into pieces, and files those pieces as tasks inside a convoy. In the bundled demo it is powered by Claude. In principle it can be powered by any model that can hold context over a few thousand tokens and produce structured output.

The Mayor exists because the alternative is making humans write task lists by hand, and humans are bad at this when the codebase gets large enough. A Mayor invocation looks like this:

brat mayor start
brat mayor ask "analyze src/ for any TODO comments and create tasks for them"
brat status

What the Mayor explicitly is not: a planner that can decide to ship code. It proposes work. It does not approve work. The Refinery and your CI decide what actually lands. We think this separation is important. It means a Mayor that goes off the rails files weird tasks, and weird tasks are visible — they don’t quietly become commits.

The Mayor is a session, not a server. It runs while you’re talking to it and exits when you stop. State persists in the WAL (the write-ahead log).

Convoy

A Convoy is a bundle of related tasks. Think sprint, epic, or feature branch. Convoys are how you keep the work modeled at a level where humans can reason about it: “the upgrade-tokio convoy has eighteen tasks, six are merged, two are blocked, ten are queued.”

You can create a convoy directly:

brat convoy create --name "upgrade-tokio"
brat task add --convoy upgrade-tokio --title "bump tokio in cargo manifest"

Or you can describe one declaratively in .brat/workflows/ as YAML. A sequential workflow lists steps with needs: dependencies; a parallel “convoy” workflow lists legs that run in parallel with an optional synthesis: step at the end that consolidates output.

Convoys are where dependencies live. Tasks within a convoy can depend on each other. The Witness reads those dependencies and refuses to spawn a polecat (brat’s name for an agent session) for a task whose prerequisites are not done.
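
The readiness rule above can be sketched in a few lines of Python. This is an illustration, not brat's implementation: the function name, the task names, and the choice to treat "merged" as the only "done" status are all assumptions.

```python
def spawnable(status, needs):
    """Return queued tasks whose prerequisites are all done.

    status: task name -> status string
    needs:  task name -> list of prerequisite task names
    Treating "merged" as "done" is an assumption made for this sketch.
    """
    return sorted(
        task
        for task, state in status.items()
        if state == "queued"
        and all(status[dep] == "merged" for dep in needs.get(task, ()))
    )

# Hypothetical convoy with a three-step dependency chain.
status = {
    "bump-manifest": "merged",
    "fix-call-sites": "queued",
    "update-docs": "queued",
}
needs = {
    "fix-call-sites": ["bump-manifest"],
    "update-docs": ["fix-call-sites"],
}

ready = spawnable(status, needs)
print(ready)
```

Only the middle task is eligible here: the first is already merged, and the last is still waiting on a prerequisite that has not landed.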

The convoy is also the unit of cancellation. If you decide the whole upgrade-tokio effort is wrong, you cancel the convoy and the Deacon reaps its open sessions on the next pass.

Task

A Task is the smallest piece of work the harness deals with. One task, one polecat (most of the time — some workflow patterns split a task across multiple sessions). Tasks have a title, a status (queued / running / blocked / merged / failed / cancelled), an assigned engine, and a complete event history.

A task is not a commit. A task can produce zero commits (the agent decided there was nothing to do), one commit, or many. The mapping is not enforced by the harness because it cannot reasonably be enforced — the engine decides what changes are needed and the human decides what’s acceptable.

Tasks carry the most metadata of any object in the system. Beyond the fields above, they have:

  • a description (what the agent should do)
  • a context budget (how much code they’re allowed to consult)
  • a timeout (after which the Witness reaps them)
  • a list of resource locks they need
  • a list of prerequisite tasks
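
As a rough picture of that structure, here is a minimal Python record. The status values come from the list earlier in this section; the field names, the units (tokens, seconds), and the example lock name are assumptions, not brat's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    BLOCKED = "blocked"
    MERGED = "merged"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class Task:
    title: str
    description: str                 # what the agent should do
    context_budget_tokens: int       # unit is an assumption
    timeout_seconds: float           # unit is an assumption
    locks: list[str] = field(default_factory=list)          # resource locks needed
    prerequisites: list[str] = field(default_factory=list)  # tasks that must finish first
    status: Status = Status.QUEUED
    engine: Optional[str] = None     # assigned engine, if any
    events: list[str] = field(default_factory=list)         # event history

    def record(self, event: str) -> None:
        """Append an event name to the task's history."""
        self.events.append(event)


# Hypothetical task; the lock name is made up for illustration.
example = Task(
    title="bump tokio in cargo manifest",
    description="Update the tokio dependency and fix any build errors.",
    context_budget_tokens=8000,
    timeout_seconds=600,
    locks=["cargo-manifest"],
)
example.record("task.created")  # event name is illustrative only
```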

If you’re hand-rolling agent orchestration in bash, this is the structure you accidentally re-invent on the third or fourth iteration. We just shipped it.

Witness

The Witness is the role that spawns and supervises agent sessions. We call the individual session a “polecat” — it’s scrappy, runs hot, occasionally bites. The Witness’s job is to pick up a queued task, decide which engine should handle it, spawn the session with the right environment, watch it, and report what it produced.

brat witness run --once   # process one task and exit
brat witness run          # loop until the queue is empty

The Witness is where the bounded-timeout rule lives. Every engine invocation has a timeout. If it expires, the polecat is killed, the task is marked failed (with the partial output preserved), and the lease on whatever lock the polecat held is released. We do not retry automatically.
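
The shape of that rule, sketched with Python's standard `subprocess` module. The function name and the "finished" label are assumptions made for illustration, the stand-in commands are not real engines, and lock release is left out of the sketch.

```python
import subprocess
import sys


def run_polecat(argv, timeout_seconds):
    """Run one command under a hard timeout. Returns (status, output).

    There is deliberately no retry loop: one attempt, one result.
    """
    try:
        done = subprocess.run(argv, capture_output=True, timeout=timeout_seconds)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() has already killed the child at this point.
        # Keep whatever it managed to write before the deadline.
        partial = (exc.stdout or b"").decode(errors="replace")
        return "failed", partial
    status = "finished" if done.returncode == 0 else "failed"
    return status, done.stdout.decode(errors="replace")


# Stand-in "engines": one hangs past its timeout, one completes quickly.
slow = [sys.executable, "-c",
        "import time; print('partial', flush=True); time.sleep(30)"]
fast = [sys.executable, "-c", "print('done')"]

slow_result = run_polecat(slow, timeout_seconds=2)
fast_result = run_polecat(fast, timeout_seconds=30)
print(slow_result[0], fast_result[0])
```

The slow command is reported as failed, with any output it produced before the deadline preserved alongside the status.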

The Witness is also where the engine adapter abstraction lives. Each engine (Claude Code, Aider, OpenCode, Codex, Continue, Gemini, gh copilot) gets a thin adapter that knows how to spawn its CLI, parse its session protocol, and translate exit codes. Adding a new engine means writing an adapter, not forking the harness.
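
One way to picture the adapter boundary, as a Python sketch. The `EngineAdapter` interface, the registry, and the "echo" engine are all hypothetical; none of this reflects the real adapters or any real engine's CLI flags, and session-protocol parsing is left out.

```python
import subprocess
import sys
from typing import Protocol


class EngineAdapter(Protocol):
    """What a supervisor needs from any engine (hypothetical interface)."""
    name: str

    def command(self, task_description: str) -> list[str]: ...
    def outcome(self, exit_code: int) -> str: ...


class EchoAdapter:
    """Made-up engine that just prints the task description."""
    name = "echo"

    def command(self, task_description):
        return [sys.executable, "-c",
                "import sys; print(sys.argv[1])", task_description]

    def outcome(self, exit_code):
        return "ok" if exit_code == 0 else "failed"


ADAPTERS: dict[str, EngineAdapter] = {}


def register(adapter: EngineAdapter) -> None:
    ADAPTERS[adapter.name] = adapter


def run_with(engine: str, description: str):
    """Spawn the chosen engine and translate its exit code."""
    adapter = ADAPTERS[engine]
    proc = subprocess.run(adapter.command(description),
                          capture_output=True, text=True, timeout=30)
    return adapter.outcome(proc.returncode), proc.stdout.strip()


register(EchoAdapter())
result = run_with("echo", "hello")
print(result)
```

The supervising code never mentions a specific engine. Supporting another one would mean registering one more class with the same two methods.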

What the Witness deliberately does not do: prompt engineering. The Mayor (or you) writes the task description; the engine handles the rest. The Witness’s contribution is structural — making sure the right binary runs in the right environment with the right resources for the right amount of time.

Refinery

The Refinery owns the merge queue. When a polecat finishes successfully, the result lands in the Refinery’s input. The Refinery applies your configured policy (rebase / squash / merge), runs your existing CI checks, and either merges the work or rejects it with a reason.

This is the role that does the least magic and provides the most value. CI integration is yours — brat does not replace it. The Refinery just imposes order on the tasks waiting to land, so that conflicts surface predictably and CI doesn’t run a hundred times against branches that will rebase each other later.

If the Refinery rejects a result (CI failed, conflict, policy violation), it writes a task.rejected event and the task moves to blocked. From there, the Mayor or a human can re-queue with adjustments.
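
A minimal sketch of that flow in Python. The `task.rejected` event name and the blocked status come from the description above; the function, the event shape, and the stand-in CI results are assumptions.

```python
from collections import deque


def run_refinery(queue, check, status, events):
    """Drain the queue in order. Merge what passes; reject the rest with a reason.

    check(task) stands in for policy plus CI and returns (passed, reason).
    """
    while queue:
        task = queue.popleft()
        passed, reason = check(task)
        if passed:
            status[task] = "merged"
        else:
            status[task] = "blocked"
            events.append({"type": "task.rejected", "task": task, "reason": reason})


# Hypothetical CI outcomes keyed by task name.
ci_results = {
    "task-a": (True, ""),
    "task-b": (False, "CI failed"),
    "task-c": (True, ""),
}

queue = deque(["task-a", "task-b", "task-c"])
status = {name: "running" for name in queue}
events = []

run_refinery(queue, lambda task: ci_results[task], status, events)
print(status)
print(events)
```

Processing strictly in queue order is the point of the sketch: each result is checked against whatever has already landed, so a rejection is tied to one task and one reason.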

Deacon

The Deacon is the background janitor. It does not appear in many user workflows because it is mostly invisible — that’s the point. Its responsibilities are:

  • Sweep for expired lock leases and release the resources
  • Detect sessions whose heartbeats have gone stale and reap them
  • Compact the materialized view periodically
  • Reconcile the WAL with the local cache after a restart

The Deacon is what makes “process died holding a lock” a non-event. It is what makes “laptop went to sleep mid-task” recoverable. Most of the time you don’t notice it. When you do notice it, it’s because it just cleaned something up you didn’t know was broken.
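
The first two responsibilities reduce to comparing timestamps against the clock. A sketch in Python, assuming leases carry an expiry time and sessions record their last heartbeat; the function name, data shapes, and threshold are made up for illustration.

```python
def sweep(now, leases, heartbeats, stale_after):
    """One janitor pass. Returns (released_locks, reaped_sessions).

    leases:     lock name -> expiry timestamp
    heartbeats: session id -> timestamp of last heartbeat
    Both dicts are updated in place.
    """
    expired = sorted(lock for lock, expires_at in leases.items()
                     if expires_at <= now)
    stale = sorted(sid for sid, last_beat in heartbeats.items()
                   if now - last_beat > stale_after)
    for lock in expired:
        del leases[lock]
    for sid in stale:
        del heartbeats[sid]
    return expired, stale


# Hypothetical state at time 200: one lease has lapsed, one session went quiet.
leases = {"lock-a": 100.0, "lock-b": 250.0}
heartbeats = {"session-1": 195.0, "session-2": 120.0}

released, reaped = sweep(now=200.0, leases=leases,
                         heartbeats=heartbeats, stale_after=30.0)
print(released, reaped)
```

Passing `now` in explicitly keeps the pass deterministic. A process that died holding `lock-a` never has to cooperate for the lock to come back.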

Why six and not eight, or three

We tried other splits. An early version of brat folded the Witness and the Refinery together, on the grounds that both deal with running things. That fell apart the first time we wanted to run several Witnesses on different machines feeding a single Refinery — different concurrency, different rate-limit budgets, different lifecycles.

We also tried folding the Deacon into the Witness because both reap sessions. That fell apart the first time we deliberately ran the harness in read-only mode (no Witnesses) and discovered the cleanup tasks still needed to happen.

The six roles are the smallest split that lets each one have one job. The vocabulary matters because at 11pm, when a task is stuck in blocked, “is this a Refinery decision or a Witness failure?” is a useful question to be able to ask. Generic words don’t let you ask it.

Try it

The fastest tour through the vocabulary is the demo:

./scripts/mayor-demo.sh --with-ui

Open the dashboard at localhost:5173 and you’ll see the convoy, the tasks, and the live sessions. Read the WAL with git log refs/grite/wal if you want to see the underlying events. Both views are the same system, expressed differently.

The full reference is in docs/roles.md, which is the canonical version of this post for anyone who’d rather skip the prose.