Agent Teams in Claude Code: How They Actually Work


If you read my previous post on sub agents, you know the basics: spawn workers, keep your context clean, get results back. Sub agents are great for that. But they have a fundamental limitation. They can only talk to you. They can't talk to each other.

Agent teams change that entirely. Instead of isolated workers reporting back to a single thread, you get a group of Claude Code sessions that can message each other, challenge each other's findings, and coordinate on shared tasks. And the part that changed how I work: you can talk to any of them directly while they're working.

What Sub Agents Can't Do

Say you have one agent refactoring an API layer and another updating the frontend that consumes it. With sub agents, the frontend agent has no idea what the API agent changed. It works off stale assumptions. You end up fixing the mismatch yourself.

With agent teams, these two can talk. The API agent finishes changing an endpoint signature and messages the frontend agent directly: "I renamed the response field from userData to user." The frontend agent adjusts without you doing anything.

That's the core difference. Sub agents are workers that report to a manager. Agent teams are coworkers that collaborate.

What's Actually Running Under the Hood

This is where it gets interesting. Each teammate is a fully independent Claude Code session with its own 200k token context window. But they share infrastructure that makes coordination possible:

A shared task list. All teammates can see what needs doing, what's in progress, and what's done. Tasks have real dependency tracking. If task B is blocked by task A, nobody can claim B until A completes. When a teammate finishes a blocking task, dependent tasks unblock automatically. File locking prevents two teammates from grabbing the same task simultaneously.
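
To make the dependency rule concrete, here is a toy in-memory model in Python. It is a sketch of the behavior described above, not Claude Code's implementation: the `Task` and `TaskList` names and their fields are hypothetical, and a `threading.Lock` stands in for the file lock.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    blocked_by: set = field(default_factory=set)  # names of tasks that must finish first
    owner: Optional[str] = None
    done: bool = False


class TaskList:
    """Toy shared task list: a task is claimable only once its blockers are done."""

    def __init__(self, tasks):
        self._tasks = {t.name: t for t in tasks}
        self._lock = threading.Lock()  # ASSUMPTION: stand-in for the file lock

    def claim(self, name: str, teammate: str) -> bool:
        with self._lock:
            task = self._tasks[name]
            blocked = any(not self._tasks[b].done for b in task.blocked_by)
            if task.owner is not None or task.done or blocked:
                return False
            task.owner = teammate
            return True

    def complete(self, name: str) -> list:
        """Mark a task done and return the tasks that just became claimable."""
        with self._lock:
            self._tasks[name].done = True
            return [
                t.name
                for t in self._tasks.values()
                if name in t.blocked_by
                and t.owner is None
                and not t.done
                and all(self._tasks[b].done for b in t.blocked_by)
            ]


board = TaskList([Task("api"), Task("integration-tests", blocked_by={"api"})])
print(board.claim("integration-tests", "tester"))  # False: still blocked by "api"
print(board.claim("api", "backend"))               # True
print(board.claim("api", "frontend"))              # False: already claimed
print(board.complete("api"))                       # ['integration-tests']
print(board.claim("integration-tests", "tester"))  # True
```

The lock around `claim` is what keeps two teammates from grabbing the same task at the same moment.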

A messaging system. Teammates send direct messages to each other by name, not by some internal ID. When a teammate sends a message, it gets delivered automatically to the recipient. The lead doesn't poll for updates; they arrive as new conversation turns. There's also broadcast, but it's expensive: one broadcast sends a separate message to every teammate, so costs scale linearly with team size.
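
The linear cost of broadcast is easy to see in a toy model. The `Mailboxes` class below is hypothetical and only counts deliveries. It says nothing about how Claude Code actually transports messages.

```python
from collections import defaultdict


class Mailboxes:
    """Toy name-addressed messaging: one inbox per teammate."""

    def __init__(self, members):
        self.members = list(members)
        self.inbox = defaultdict(list)
        self.deliveries = 0  # each delivery is one more message a recipient must process

    def send(self, sender: str, recipient: str, text: str) -> None:
        if recipient not in self.members:
            raise KeyError(f"no teammate named {recipient!r}")
        self.inbox[recipient].append((sender, text))
        self.deliveries += 1

    def broadcast(self, sender: str, text: str) -> None:
        # A broadcast is just one direct message per other member.
        for member in self.members:
            if member != sender:
                self.send(sender, member, text)


team = Mailboxes(["lead", "backend", "frontend", "tests"])
team.send("backend", "frontend", "I renamed the response field from userData to user.")
print(team.deliveries)  # 1
team.broadcast("lead", "Freeze the schema until the migration lands.")
print(team.deliveries)  # 4: the one DM plus three broadcast copies
```

A direct message costs one delivery no matter how big the team is. A broadcast costs one per other member, which is why I save it for information everyone needs.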

A team config file. Stored at ~/.claude/teams/{team-name}/config.json, it lists every member with their name and role. Teammates read this to discover who else is on the team. The task list lives at ~/.claude/tasks/{team-name}/.
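
If you want to peek at who is on a team from a script, something like the following could work. The two paths come from the layout above, but the JSON shape is a guess: I am assuming a top-level `members` list whose entries have `name` and `role` keys, so check a real config file before relying on it. The helper names are mine, and the demo runs against a throwaway directory rather than the real ~/.claude.

```python
import json
import tempfile
from pathlib import Path


def team_paths(team_name: str, claude_dir: Path):
    """Return (config file, task directory) for a team under claude_dir."""
    return (
        claude_dir / "teams" / team_name / "config.json",
        claude_dir / "tasks" / team_name,
    )


def list_members(team_name: str, claude_dir: Path):
    config_path, _ = team_paths(team_name, claude_dir)
    config = json.loads(config_path.read_text())
    # ASSUMPTION: top-level "members" list with "name" and "role" keys.
    return [(m["name"], m["role"]) for m in config["members"]]


# Demo with a hand-written sample config in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    config_path, _ = team_paths("notifications", root)
    config_path.parent.mkdir(parents=True)
    sample = {
        "members": [
            {"name": "backend", "role": "API and WebSocket layer"},
            {"name": "frontend", "role": "React components and hooks"},
        ]
    }
    config_path.write_text(json.dumps(sample))
    print(list_members("notifications", root))
```

For a real team you would pass `Path.home() / ".claude"` as `claude_dir`.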

Automatic idle notifications. After every turn, a teammate goes idle and automatically notifies the lead. This tripped me up at first. I'd see "teammate idle" and think something broke. It didn't. Idle just means they finished their current turn and are waiting. Send them a message and they wake right back up. It's the normal rhythm, not an error.

Peer DM visibility. Here's a subtle one. When one teammate DMs another, the lead gets a brief summary of that message in the idle notification. You don't see the full conversation, but you see enough to know what your teammates are coordinating on without you. This gives you oversight without micromanagement.

Two Ways to Steer the Team

This is the thing I want to emphasize most, because it's what makes teams fundamentally different from sub agents.

Talk to a Teammate Directly

In in-process mode, press Shift+Up/Down to select a teammate, then type. You're talking to that specific agent, not through the lead.

This is for tactical corrections. You see a teammate going down a wrong path, you fix it immediately:

  • "Stop. The auth module uses JWTs, not sessions. Check src/auth/jwt.ts"
  • "Don't refactor that utility function, it's shared with the billing module"
  • "The test is flaky because of the database seed order, not a race condition"

No round-trip through the orchestrator. No waiting for the lead to relay your message. Direct.

You can also press Enter to view a teammate's full session, see exactly what they're doing, and press Escape to interrupt their current turn if they're heading somewhere bad.

Ask the Lead to Handle It

Better when you need coordination across multiple teammates:

"Tell the backend teammate to expose a new endpoint for user preferences, and have the frontend teammate add a settings page that calls it."

The lead breaks that down, messages the right agents, creates tasks, and tracks progress. You gave one instruction, the lead turned it into a coordinated multi-agent workflow.

Another example: "The API teammate just changed the response schema. Make sure the test teammate knows about it." The lead sends the right message to the right agent with the relevant context.

The first approach is surgical. The second is strategic. I bounce between both constantly depending on what the situation needs.

How to Set This Up

It's still experimental, so you need to opt in. Add this to settings.json:

{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

Then describe what you want. No special syntax:

Create an agent team to build the notification system. One teammate for the
backend API and WebSocket layer, one for the React components and hooks,
one for writing integration tests.

Claude creates the team, spawns teammates, sets up the shared task list, and starts assigning work. Press Ctrl+T to toggle the task list and see what everyone's doing.

You can also specify models, for example "Use Sonnet for each teammate", if you want to balance cost and speed.

Delegate Mode

One thing that annoyed me early on: the lead would sometimes start implementing tasks itself instead of delegating. It sees a small task, thinks "I can just do this," and suddenly your orchestrator is heads-down writing code instead of coordinating.

Press Shift+Tab to enable delegate mode. The lead loses access to code-editing tools. It can only spawn teammates, send messages, manage tasks, and synthesize results. Pure coordination.

This matters more than it sounds. Without delegate mode, you end up with a lead that's half-coordinator, half-worker, and bad at both. With it, the lead focuses entirely on breaking down work, assigning it, relaying information between teammates, and synthesizing results. That's what you actually want from an orchestrator.

Plan Approval: The Safety Net Sub Agents Never Had

With sub agents, when one goes off the rails, you don't find out until it's done. With agent teams, you can require teammates to plan before they touch code:

Spawn a teammate to refactor the payment processing module.
Require plan approval before they make any changes.

The teammate works in read-only mode. It reads the code, analyzes dependencies, and sends a plan to the lead. The lead reviews and either approves or rejects with feedback. The teammate stays in plan mode until approved, revises if rejected, and only then starts writing code.

You can shape the lead's judgment: "only approve plans that include test coverage" or "reject any plan that modifies the database schema directly." The lead enforces these criteria on its own, so you don't have to review every plan yourself.
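
The approval loop is a small state machine. The sketch below is my mental model of it, not the real mechanism: the `Mode` states, the `Teammate` class, and the keyword-matching `requires_tests` check are all hypothetical, and the real lead judges plans with a model rather than a string match.

```python
from enum import Enum, auto


class Mode(Enum):
    PLANNING = auto()         # read-only: may read code, may not edit
    AWAITING_REVIEW = auto()  # plan has been sent to the lead
    IMPLEMENTING = auto()     # plan approved: edits allowed


class Teammate:
    def __init__(self):
        self.mode = Mode.PLANNING
        self.feedback = []

    def can_edit(self) -> bool:
        return self.mode is Mode.IMPLEMENTING

    def submit(self, plan: str, lead_review) -> bool:
        """Send a plan to the lead; stay in plan mode unless it is approved."""
        self.mode = Mode.AWAITING_REVIEW
        approved, note = lead_review(plan)
        if approved:
            self.mode = Mode.IMPLEMENTING
        else:
            self.feedback.append(note)
            self.mode = Mode.PLANNING  # revise and resubmit
        return approved


def requires_tests(plan: str):
    """Stand-in for 'only approve plans that include test coverage'."""
    if "test" in plan.lower():
        return True, ""
    return False, "Plan must include test coverage."


worker = Teammate()
print(worker.submit("Refactor charge() into smaller functions", requires_tests))  # False
print(worker.can_edit())                                                          # False
print(worker.submit("Refactor charge() and add unit tests", requires_tests))      # True
print(worker.can_edit())                                                          # True
```

The point is that `can_edit` stays false through any number of rejections. The teammate cannot write code until a plan has passed the lead's criteria.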

Where I've Actually Used This

Cross-Layer Feature Development

A user preferences feature that spans the full stack. Backend needs an API and a database migration. Frontend needs a settings page. Tests need to cover the integration.

With sub agents, the test agent writes tests based on the original spec. As soon as the backend agent changes something mid-implementation, those tests are wrong.

With an agent team, the test teammate watches what the other two produce. When the backend teammate changes a response shape, it messages the test teammate. The tests track the real implementation, not a stale plan.

I directed this by telling the lead: "Have the test teammate wait for the backend teammate to finish the API before writing integration tests, but start on unit tests for the frontend components immediately." The lead set up the task dependencies (integration tests blocked on API completion, unit tests unblocked), and the team self-coordinated from there.

Debugging Where Nobody Agrees on the Cause

Intermittent 502 errors. Could be anything. A single agent picks the first plausible theory and chases it. Confirmation bias.

I spawned four teammates and told them to actively try to disprove each other:

Investigate the 502 errors. Four teammates, four hypotheses.
Have them message each other to challenge findings.
The theory that survives is probably right.

One blamed the load balancer. Another found connection pool evidence. A third checked DNS. They messaged each other, poking holes in each other's analysis, and converged on the actual cause. The adversarial dynamic produces better answers than sequential investigation, because theories get stress-tested instead of just explored.

I could check in on the progress through the lead's idle notifications, which include summaries of the peer messages being exchanged. When one teammate was chasing a dead end, I messaged them directly: "The other teammate already ruled out DNS. Focus on the deployment pipeline instead." Saved probably fifteen minutes of wasted investigation.

Large Codebase Refactoring

Migrating a component library to a new design system. Thirty components. Each one is independent, but they all need to follow the same patterns.

I had one teammate read the design system docs and write a migration guide, then spawned five more teammates that each owned a batch of components. The first teammate broadcast the migration guide, one of the rare cases where broadcast makes sense because everyone genuinely needed the same information. The five workers each handled their batch, messaging the guide-writer when they hit ambiguous cases.

Without the inter-agent communication, each worker would have interpreted the design system slightly differently. With it, the guide-writer became a living reference that maintained consistency across the whole team.

Patterns That Save Time

Give teammates enough context upfront. They load your CLAUDE.md and project config automatically, but they don't inherit the lead's conversation history. Everything task-specific needs to be in the spawn prompt. Be explicit: file paths, constraints, conventions.

Break work along file boundaries, not logical boundaries. Two teammates editing the same file leads to overwrites. If two tasks touch the same file, one teammate owns both. Period.
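
One way to check a task breakdown before handing it to a team is to group tasks by the files they touch, so that any tasks sharing a file (directly or through a chain of overlaps) land with one owner. This is a planning aid I am sketching here, not something Claude Code does for you. The function name and the example task names and paths are made up.

```python
def group_by_shared_files(tasks):
    """Group task names so that tasks touching a common file end up together.

    tasks maps a task name to the set of file paths it will edit.
    """
    parent = {name: name for name in tasks}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_task_for_file = {}
    for name, files in tasks.items():
        for path in files:
            if path in first_task_for_file:
                # Same file seen before: merge this task with the earlier one.
                parent[find(name)] = find(first_task_for_file[path])
            else:
                first_task_for_file[path] = name

    groups = {}
    for name in tasks:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


plan = {
    "add-endpoint": {"api/routes.py", "api/schema.py"},
    "rename-field": {"api/schema.py", "web/types.ts"},
    "settings-page": {"web/Settings.tsx"},
}
print(group_by_shared_files(plan))
# [['add-endpoint', 'rename-field'], ['settings-page']]
```

Each inner list is work for a single teammate. Here the two API tasks both edit api/schema.py, so they go to the same owner.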

Monitor using direct messages, not by watching output scroll. Shift+Up/Down to check on a teammate is faster than trying to parse their terminal output. Ask them "what's your status?" and you get a concise answer.

Use the lead for strategic steering, direct messages for tactical corrections. "Tell the team to prioritize the API layer first" goes to the lead. "Stop, that function signature is wrong" goes directly to the specific teammate.

Don't panic when teammates go idle. Idle is the normal state between turns. It doesn't mean they're stuck or crashed. Send them a message and they wake up. Only worry if a teammate has been idle for an unusually long time with unfinished tasks. That might mean they hit an error.

When Teams Are Overkill

Not everything needs a team. Use a single session or sub agents when:

  • Tasks are sequential, each step depending on the last
  • The whole thing takes less than a few minutes
  • You need tight back-and-forth debugging loops
  • Token budget matters: every teammate is a full Claude instance

The honest tradeoff: teams use significantly more tokens. Each teammate has its own context window. For research, debugging, and multi-layer features, the token cost is worth the speed and coordination. For anything you could finish in a single session without your context getting stale, it's not.

The Shift in How You Work

With a single session, Claude is your assistant. With sub agents, it's your parallel workforce. With agent teams, it's a team you're leading.

The ability to steer teammates mid-task, directly or through the lead, is what makes this more than just "sub agents with extra steps." You're not dispatching work and hoping for the best. You're directing it in real time, course-correcting based on what you see, and letting agents coordinate with each other so you don't have to relay every piece of information yourself.

The first few times, it feels like overkill. Then you hit a task where three agents are working in parallel, one finds something that changes the approach, messages the others, and they all adjust without you doing anything. That's when it clicks.