Everyone wants to build "the next Claude Code" โ€” a multi-agent system where a lead orchestrates specialists, tasks flow through a kanban, reviewers gate quality, and humans kick back. On a whiteboard it looks elegant. In production, it bleeds tokens like a broken pipe. The uncomfortable math: in a realistic multi-agent setup with a lead, three teammates, and ten tasks running end-to-end, roughly 92% of the tokens in a session are not doing useful work. They're tool schemas and protocol text, repeated turn after turn, agent after agent. That's not a bug in any one prompt โ€” it's structural. The numbers tell the story. Total session cost: ~920,000 tokens. Actual work โ€” planning, execution, back-and-forth โ€” roughly 70K. Tool definitions alone: ~850,000. That's an order of magnitude larger than all the actual work combined. Every teammate gets the full toolbox on every turn โ€” kanban tools, cross-team coordination tools, process tools โ€” even though most agents only ever touch five or six of them. A developer who needs task_get and task_complete is paying for twenty other definitions they'll never call. Seven traps make this worse. Protocol text gets copy-pasted three or four times (the 90-line task protocol shows up in spawn prompts, member briefings, assignment messages, and every reconnect). There's no planning phase so you pay twice for rework โ€” build wrong, review, throw out, rebuild. Teammates carry stale context across tasks. Assignment messages teach things the briefing already taught. Post-compact recovery re-injects everything verbatim. Relay messages repeat their own instructions every single time. The fix isn't prompt-level trimming โ€” that's rearranging chairs. The real wins are architectural: scope tools by role (a reviewer doesn't need kanban management), add a planning phase before any developer touches code, and make teammates ephemeral per task instead of persistent across the epic. Spawn a fresh developer per task, kill it on completion, start clean next time. The spawn cost is a few thousand tokens. The carrying cost of stale context is unbounded. Protocol injection needs deduplication too โ€” inject once at spawn, reference after that. Slender assignment messages with a one-line pointer instead of inline tool examples. Make cross-team tools conditional โ€” inject them only when they become relevant.

Key Takeaways

Tool definitions are the silent killer: 850K of 920K tokens in a modest run 92% overhead is structural, not prompt-level โ€” fix the architecture Ephemeral teammates beat persistent ones: spawn cost < carrying cost Protocol deduplication, role-scoped tools, planning gates โ€” build these in from day one

The Bottom Line

If your agent bill is surprising you, the answer isn't 'Claude is expensive.' It's that you're paying to re-teach the same agent the same things on every turn and shipping it tools it never touches. Multi-agent systems aren't Claude Code with extra agents bolted on โ€” they're a different design problem, and the cost model is dominated by things that don't show up in any single prompt review. Build for it from the start or watch 92% of your tokens evaporate into schema repetition.