Agentic Coding: From Single Agents to Agent Teams


There’s a progression happening in how developers work with AI.

First, chat: you drive, AI assists. Back-and-forth, line by line. You’re in the loop on every decision.

Then, agents: you delegate, AI drives. You define the goal, the agent figures out the path. You review at the end.

Now, teams: you orchestrate. Multiple agents work in parallel, communicate with each other, and you synthesize their output.

Each step requires letting go of more control—and getting more leverage in return.

Engineers are control freaks. Affectionately. We got good at our jobs by understanding every layer, tracing every bug, owning every decision.

Delegation requires trust and letting go. Both are unnatural.

The failure modes are predictable:

Over-steering: You give instructions so detailed the agent can’t adapt when something unexpected happens. You’ve essentially written pseudocode and asked the AI to translate it.

Under-specifying: “Make it better” is not a task. Neither is “fix the bugs.” What does “better” mean? Which bugs?

Micromanaging: Checking in every 30 seconds, redirecting constantly, never letting the agent build momentum.

The sweet spot: clear goal, sufficient context, room to maneuver. The same brief you’d give a junior engineer you trusted.

For teams, the stakes compound. Over-steering one agent is inefficient. Over-steering five is chaos.

Agents are amnesiac. Every session starts from zero. The agent doesn’t know your codebase, your conventions, or your preferences.

The fix: a persistent context file that tells the agent who it is and what world it’s operating in.

# CLAUDE.md

## Project Overview
Multi-tenant SaaS platform for invoice processing.
Go backend, React frontend, PostgreSQL database.

## Architecture
- /cmd: Entry points
- /internal/api: HTTP handlers
- /internal/domain: Business logic (no external dependencies)
- /internal/infra: Database, external services

## Conventions
- Errors wrap with context: fmt.Errorf("doing X: %w", err)
- Tests live next to code: foo.go → foo_test.go
- No globals. Dependency injection everywhere.

## Gotchas
- /internal/legacy is untouchable. Don't modify.
- Auth uses custom middleware in /internal/auth. Read before touching.
- Billing service is flaky. Always add retries.

This isn’t a substitute for good task definition—it’s the backdrop. The agent reads this before starting any task.
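
Concretely, code that follows those conventions might look like this. A minimal sketch with hypothetical names, not code from the actual project:

package invoice

import (
	"context"
	"fmt"
)

// Store is injected rather than reached through a global,
// per the "no globals, dependency injection everywhere" convention.
type Store interface {
	Save(ctx context.Context, inv Invoice) error
}

// Invoice is a minimal stand-in for the real domain type.
type Invoice struct {
	ID     string
	Amount int64
}

// Service holds its dependencies explicitly.
type Service struct {
	store Store
}

func NewService(store Store) *Service {
	return &Service{store: store}
}

// Process wraps errors with context, matching the convention above.
func (s *Service) Process(ctx context.Context, inv Invoice) error {
	if err := s.store.Save(ctx, inv); err != nil {
		return fmt.Errorf("saving invoice %s: %w", inv.ID, err)
	}
	return nil
}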

For agent teams, CLAUDE.md becomes even more critical. Each teammate starts with fresh context. They all need the same grounding.

Before jumping to teams, master single-agent delegation. Four patterns cover most use cases: exploration, targeted changes, generation, and analysis.

Use case: New codebase. You need a mental map.

"Explore this codebase and explain the architecture.
Focus on the payment flow. Create a summary doc."

Agent strength: Tireless reading. Can ingest thousands of files without fatigue.

Watch out for: Hallucinated connections. The agent may infer relationships that don’t exist.

Checkpoint: “Show me which files you found that conclusion in.”

Use case: Bug fix. Refactor. Migration. You know the what; you want the agent to handle the how.

"Fix the race condition in handler.go where the cache read
and database write aren't atomic. Don't change the API
signature. Add a test that would have caught this."

Agent strength: Patience for tedious changes. Consistency across many files.

Watch out for: Scope creep. The agent “helpfully” refactors adjacent code you didn’t ask about.

Checkpoint: Always review the diff before committing.
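
For the race-condition task above, the diff you review might land on something like this. A minimal sketch, assuming a map-backed cache guarded by a mutex; every name here is hypothetical:

package api

import (
	"context"
	"database/sql"
	"fmt"
	"sync"
)

// Invoice is a minimal stand-in for the real domain type.
type Invoice struct {
	ID      string
	Amount  int64
	Version int64
}

// Handler fronts the database with an in-memory cache.
type Handler struct {
	mu    sync.Mutex
	cache map[string]Invoice
	db    *sql.DB
}

func NewHandler(db *sql.DB) *Handler {
	return &Handler{cache: make(map[string]Invoice), db: db}
}

// UpdateInvoice now holds one lock across the cache read and the database
// write, so concurrent callers can no longer interleave between them.
// The exported signature is unchanged, as the task required.
func (h *Handler) UpdateInvoice(ctx context.Context, inv Invoice) error {
	h.mu.Lock()
	defer h.mu.Unlock()

	if existing, ok := h.cache[inv.ID]; ok && existing.Version >= inv.Version {
		return nil // stale update; nothing to write
	}
	if _, err := h.db.ExecContext(ctx,
		`UPDATE invoices SET amount = $1, version = $2 WHERE id = $3`,
		inv.Amount, inv.Version, inv.ID,
	); err != nil {
		return fmt.Errorf("updating invoice %s: %w", inv.ID, err)
	}
	h.cache[inv.ID] = inv
	return nil
}

The test the prompt asks for would hammer UpdateInvoice from multiple goroutines and assert that the cache and the database still agree afterward.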

Use case: Scaffolding a new feature, service, or project.

"Create a new API endpoint for password reset.
Follow the patterns in /internal/api/auth.
Include validation, error handling, tests,
and update the OpenAPI spec."

Agent strength: Speed. Can scaffold in minutes what takes hours.

Watch out for: Plausible but wrong. Generated code compiles and runs but has subtle bugs.

Checkpoint: Read the generated code like a PR from someone you don’t fully trust yet.
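
"Plausible but wrong" deserves a concrete picture. A generated helper like this (hypothetical, not from any real project) compiles, runs, and passes a happy-path test, yet ships a security bug:

package auth

import (
	"encoding/hex"
	"math/rand" // the subtle bug: not a cryptographically secure source
)

// NewResetToken returns a hex token for a password-reset link.
// It looks reasonable at a glance, which is exactly the problem.
func NewResetToken() string {
	b := make([]byte, 32)
	rand.Read(b) // security tokens need crypto/rand, not math/rand
	return hex.EncodeToString(b)
}

Nothing here fails in testing. It fails when an attacker predicts a token.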

Use case: Code review. Security audit. Performance analysis. You want insight, not action.

"Review this PR for security issues.
Don't suggest fixes, just identify problems
and explain severity."

Agent strength: Unbounded attention. Checks things humans skim past.

Watch out for: False positives. Over-flagging stylistic issues as problems.

Checkpoint: Calibrate by verifying a few findings yourself.

Sometimes a task is too complex for one agent but doesn’t need a full team.

Subagents run within a single Claude Code session. The main agent spawns helpers for research or verification. They report back to the main agent only—no inter-agent communication.

"Research the best pagination library for our Go API.
Spawn a subagent to investigate options while you
continue implementing the endpoint structure."

Good for:

  • Breaking down a complex task without losing your main context
  • Parallel research while implementation continues
  • Verification steps that shouldn’t interrupt the main flow

Limitation: Subagents can only report back to the main agent. They can’t talk to each other. For true parallelism with coordination, you need teams.

Agent teams are multiple Claude Code instances working together. One session acts as team lead, coordinating work and synthesizing results. Teammates work independently, each in its own context window, and can communicate directly with each other.

Unlike with subagents, you can interact with individual teammates directly, without going through the lead.

Teams are experimental and disabled by default:

# In environment or settings.json
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
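
If you'd rather keep the flag in settings.json than in your shell, the env block is one place for it. A sketch, assuming your settings file uses the documented env map:

{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}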

Then tell Claude to create a team:

"Create an agent team to explore this CLI tool design
from different angles: one teammate on UX, one on
technical architecture, one playing devil's advocate."

Claude spawns the team, creates a shared task list, and coordinates work based on your prompt.

Teams shine for:

  • Tasks with clear boundaries that parallelize: Three reviewers looking at different aspects of a PR
  • Research from multiple angles: Exploring a design decision from UX, architecture, and skeptic perspectives
  • Investigating bugs with parallel hypotheses: One agent checks logs, one traces code, one reproduces locally
  • Exploration without code changes: Reviewing, researching, analyzing

Teams struggle with:

  • Code changes to the same files: Two teammates editing the same file leads to overwrites. Partition by file ownership.
  • Work that can’t be clearly partitioned: If the tasks are deeply interdependent, a single agent with subagents may be better.
  • Running unattended for too long: Teams need check-ins. Letting them run unsupervised increases the risk of wasted effort.

Different team structures for different problems:

Leader creates team → spawns workers → workers report to leader → leader synthesizes

Most common pattern. One orchestrator, multiple specialists. Good for research, exploration, multi-perspective review.

Leader creates tasks → workers self-assign → leader monitors progress

For embarrassingly parallel work where workers are interchangeable. Each worker grabs the next available task.

Agent A → Agent B → Agent C
(each waits for predecessor)

Sequential processing with handoffs. When work must flow through stages.
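
A pipeline brief, sketched with hypothetical tasks, might read:

"Create an agent team to migrate the billing handlers.
Teammate 1 rewrites the SQL queries, teammate 2 updates
the handlers once the queries land, teammate 3 writes
integration tests against the result. Each stage starts
only when the previous one reports done."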

Multiple agents get same task → each proposes solution → leader picks best

For decisions where you want diverse perspectives. Let three agents design the API, then pick the best approach.

Worker does task → watcher monitors → watcher can trigger rollback

For critical operations needing safety checks. The watcher doesn’t do the work—it validates the worker’s output.

"Create an agent team to review this PR from three angles:

- Teammate 1: Security vulnerabilities
  Look for injection risks, auth bypasses, data exposure.

- Teammate 2: Performance issues
  Look for N+1 queries, memory leaks, slow algorithms.

- Teammate 3: Code simplicity
  Look for over-engineering, premature abstraction, YAGNI violations.

Each should send findings to me. Don't suggest fixes,
just identify and explain."

The three teammates explore in parallel, each with their own context. You synthesize their findings into a coherent review.

This works well because:

  • Clear boundaries (each teammate owns a perspective)
  • No code changes (no file conflicts)
  • Parallelizable (they don’t depend on each other)

Best practices, drawn from the official docs and hard-won experience:

Start without code. If you’re new to teams, begin with tasks that don’t require writing code: reviewing a PR, researching a library, investigating a bug. These show the value of parallel exploration without the coordination challenges of parallel implementation.

Partition by file ownership. Two teammates editing the same file leads to overwrites. Break the work so each teammate owns a different set of files. If that’s not possible, use a single agent instead.

Check in regularly. Monitor teammates’ progress, redirect approaches that aren’t working, synthesize findings as they come in. Letting a team run unattended for too long increases wasted effort.

Synthesize as you go. Don’t just collect findings at the end. Integrate insights incrementally. Early synthesis often reveals that you need to redirect one of the teammates.

You writing code yourself
         ↓
Chat: AI assists your coding
         ↓
Single agent: AI codes, you review
         ↓
Subagents: AI spawns helpers within session
         ↓
Agent team: Multiple AIs in parallel, you orchestrate

Each rung trades control for leverage. The skill is knowing which rung fits the task:

  • Quick question or code snippet: Chat
  • Well-defined task you could do but shouldn’t spend time on: Single agent
  • Complex task with research subtasks: Agent with subagents
  • Parallelizable exploration or review: Agent team

Don’t use a team when an agent will do. Don’t use an agent when chat will do. Match the tool to the task.

Over-steering: Instructions so detailed the agent can’t adapt. Describe the goal and constraints, not the procedure.

Under-specifying: Vague success criteria. “Make it better” fails. “Reduce p99 latency below 100ms” succeeds.

Ignoring clarification requests: The agent asks for clarification, you say “just figure it out.” Those questions are signal. Answer them or realize your task was underspecified.

No checkpoints: For complex tasks, build in checkpoints. “After step 1, show me the plan before proceeding.” Especially for teams running in parallel.

Not reviewing proportionally: The code works, you ship it, bugs emerge. Treat agent output like junior engineer output. Review proportional to risk.

We’re early. Agent teams are still experimental. The patterns will evolve. But the direction is clear: more leverage, more parallelism, more orchestration.

The progression from chat to agent to team mirrors how you’d delegate to humans. First you pair. Then you assign tasks. Then you manage a team.

The skill isn’t prompting. It’s knowing what to delegate, how to partition, and when to intervene.

The developers who learn to orchestrate will ship what used to take a team—alone.