One Agent Is Not Enough
Claude Code is remarkable at working through a task one step at a time. But some jobs are simply too big for one context window: a color migration across 45 files, a security audit of an entire codebase, rewriting every industry page on a site. One agent grinds through those serially. Dozens of agents finish them in minutes.
That is why we built Agent Army - an open-source Claude skill that deploys a two-layer hierarchy of parallel agents, with a top-tier commander model orchestrating the whole operation. It works anywhere Claude skills run - Claude Code for engineering work, Cowork for business and document work. It has become one of our favorite tools internally, and it is free on GitHub.
And we can prove we use it: everything on this page came out of one deployment. A three-wave army wrote this post, produced the video below with Agent Army driving HyperFrames, and then audited its own work before we hit publish.
Swarms vs. Armies: The Distinction That Matters
Most "multi-agent" setups you see are swarms: one agent splits its single context window across a handful of sub-agents. That is one brain, divided. Useful, but the ceiling is still one context window.
An army is different: every Layer 1 agent gets its own full 1M-token context window, and each one commands its own swarm of sub-agents underneath. Many independent brains, not one brain divided.
The result is a structure that looks like a real org chart. A commander at the top doing the thinking. Team leads in the middle, each owning a domain with a full context window. Workers at the bottom executing precise briefs against specific files.
How Agent Army Works
The commander thinks, the army executes
The commander is the strongest model in the session - Claude Fable 5 by default, or Opus 4.8 if that is what you are running. It never touches files itself. It runs reconnaissance, weighs every file by size, groups the work into domains, writes complete self-contained briefs for every agent, and verifies the results. Commander quality is the ceiling on army quality, so the skill refuses to run the commander on a smaller model.
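To make "weighs every file by size, groups the work into domains" concrete, here is a minimal sketch of one way a size-aware split could work. It is not the skill's actual logic, which this post does not spell out. It assumes a plain greedy balance by byte count, and the names `Domain` and `group_by_size` are hypothetical. A real grouping would presumably also consider which files belong together.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """One team lead's share of the work (hypothetical structure)."""
    files: list[str] = field(default_factory=list)
    total_bytes: int = 0


def group_by_size(file_sizes: dict[str, int], team_leads: int) -> list[Domain]:
    """Greedy balance: largest files first, each to the currently lightest domain."""
    if team_leads < 1:
        raise ValueError("need at least one team lead")
    domains = [Domain() for _ in range(team_leads)]
    # Sort by size descending; the path breaks ties so the result is deterministic.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = min(domains, key=lambda d: d.total_bytes)
        lightest.files.append(path)
        lightest.total_bytes += size
    return domains
```

Placing the largest files first keeps one oversized file from landing on a domain that is already full, which is the situation the size weighting is meant to avoid.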
Power levels: you choose the model mix
The newest version of the skill asks one question before deploying: how much power do you want? The commander stays in charge either way (Fable 5 in the mixes below) - your answer sets which Claude models fill the two layers underneath it.
- Max Power - Fable 5 commanding Opus 4.8 team leads and Opus 4.8 workers. For gnarly refactors and correctness-critical migrations.
- Heavy (our default) - Fable 5 commanding Opus 4.8 team leads that orchestrate Sonnet 4.6 workers. Smart management, fast execution.
- Balanced - Sonnet 4.6 throughout the swarms. For mechanical migrations with clear patterns.
- Economy - Sonnet 4.6 team leads, Haiku 4.5 workers. For high-volume, dead-simple find-and-replace across hundreds of files.
Every agent call passes its model explicitly, so an Economy run never silently bills at Opus rates. And any single sub-agent assigned a monster file can be escalated a tier up - the plan calls it out before anything deploys.
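The power levels above amount to a small lookup table. The sketch below is illustrative only: the model names come from the list, but `POWER_LEVELS`, `PowerLevel`, and `worker_model` are hypothetical names, not the skill's API. It reads Balanced as Sonnet 4.6 in both layers, and it assumes escalation stops at Opus 4.8.

```python
from dataclasses import dataclass

# Worker tiers, weakest to strongest, using the model names from the list above.
TIERS = ["Haiku 4.5", "Sonnet 4.6", "Opus 4.8"]


@dataclass(frozen=True)
class PowerLevel:
    team_lead: str
    worker: str


# Assumption: "Balanced" means Sonnet 4.6 for both team leads and workers.
POWER_LEVELS = {
    "max": PowerLevel("Opus 4.8", "Opus 4.8"),
    "heavy": PowerLevel("Opus 4.8", "Sonnet 4.6"),
    "balanced": PowerLevel("Sonnet 4.6", "Sonnet 4.6"),
    "economy": PowerLevel("Sonnet 4.6", "Haiku 4.5"),
}


def worker_model(level: str, escalate: bool = False) -> str:
    """Return the explicit model name for a worker, one tier up if escalated."""
    base = POWER_LEVELS[level].worker
    if not escalate:
        return base
    # Assumption: escalation is capped at the top worker tier.
    return TIERS[min(TIERS.index(base) + 1, len(TIERS) - 1)]
```

Because every lookup returns a concrete model name, nothing falls back to a session default, which is the point of passing the model explicitly on every call.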
Safety rails that assume agents lie
Parallel agents fail in predictable ways, so the skill hardens against each one. A git checkpoint branch is created before any file is touched. A deployment gate verifies that every file is owned by exactly one agent - no overlaps, no gaps. And after every wave, the commander runs a phantom-completion check: it cross-references git diff against every agent's report, because an agent claiming "COMPLETE" with no matching diff is lying. Trust the diff, not the report.
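Both checks are simple set comparisons. The sketch below is a hypothetical illustration, not the skill's code: `check_ownership`, `changed_files`, and `phantom_completions` are invented names, and the report format (agent name mapped to a status string) is an assumption. The only external call is the standard `git diff --name-only`.

```python
import subprocess


def check_ownership(assignments: dict[str, list[str]], all_files: set[str]) -> list[str]:
    """Deployment gate: return problems; an empty list means one owner per file."""
    problems = []
    owner_of: dict[str, str] = {}
    for agent, files in assignments.items():
        for path in files:
            if path in owner_of:
                problems.append(f"overlap: {path} owned by {owner_of[path]} and {agent}")
            else:
                owner_of[path] = agent
    for path in sorted(all_files - set(owner_of)):
        problems.append(f"gap: {path} has no owner")
    return problems


def changed_files(base_ref: str, repo: str = ".") -> set[str]:
    """Tracked files that differ from base_ref, e.g. the checkpoint branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base_ref],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return {line for line in out.splitlines() if line}


def phantom_completions(
    reports: dict[str, str],
    assignments: dict[str, list[str]],
    changed: set[str],
) -> list[str]:
    """Agents that reported COMPLETE but changed none of their assigned files."""
    return sorted(
        agent
        for agent, status in reports.items()
        if status == "COMPLETE" and not (set(assignments.get(agent, [])) & changed)
    )
```

A check like this only makes sense for waves that are expected to modify files. A review agent that legitimately changes nothing would need a different signal.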
Waves, not one-shot runs
The army works in up to four adaptive waves: Execute makes the changes, Audit sends fresh Opus agents to review the first wave's work, Propagate updates tests and docs that reference what changed, and Notify drafts the PR description and changelog. Each wave's report becomes the next wave's reconnaissance.
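The hand-off between waves can be pictured as a loop in which each report feeds the next wave. This is a sketch under stated assumptions: `run_waves`, `run_wave`, and `should_run` are hypothetical names, reports are plain strings here, and the post does not say how the skill decides which waves to skip.

```python
from typing import Callable

# Wave order as described above.
WAVES = ["execute", "audit", "propagate", "notify"]


def run_waves(
    initial_recon: str,
    run_wave: Callable[[str, str], str],
    should_run: Callable[[str, str], bool] = lambda wave, recon: True,
) -> dict[str, str]:
    """Run up to four waves in order, feeding each report into the next wave."""
    reports: dict[str, str] = {}
    recon = initial_recon
    for wave in WAVES:
        # Hypothetical hook: "adaptive" is modelled as a per-wave skip decision.
        if not should_run(wave, recon):
            continue
        report = run_wave(wave, recon)
        reports[wave] = report
        recon = report  # this wave's report is the next wave's reconnaissance
    return reports
```

For example, a three-wave run like the one that produced this post would correspond to a `should_run` that returns False for one of the four waves.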
Why It Is One of Our Favorites
We use Agent Army on real client work constantly. Our largest internal run put 60+ concurrent agents to work in a single session. We used a 24-agent army to audit our own website end to end - conversion paths, performance, analytics wiring - and it surfaced issues a single-agent review had walked right past. Full-site brand migrations that used to eat an afternoon now finish inside a coffee break, with a rollback branch waiting if anything goes sideways.
This very post is the latest example. Wave one researched and drafted the article while a parallel team scripted and rendered the video. Wave two sent fresh Opus agents to audit both - fact-checking the stats against our verified numbers and tightening the copy. Wave three propagated everything into the site: data entries, structured metadata, imports. And it is not just a coding trick: the same skill runs armies in Cowork for proposals, research decks, and document-heavy business work.
The pattern generalizes: any task with six or more independent units of work is a candidate for an army. Below that, the orchestration overhead is not worth it - and the skill will tell you so instead of deploying anyway.
Part of a Library People Actually Use
Agent Army ships inside our open-source claude-skills library - 162 MIT-licensed skills spanning business operations, sales, engineering, and AI agent architecture. The library has earned 170+ GitHub stars, is listed on 13+ independent skill directories, and claudemarketplaces.com alone reports 29,000+ installs across our skills. Everything is free, no strings attached.
Try It Yourself
Getting started takes under a minute:
- Clone the library: git clone https://github.com/OneWave-AI/claude-skills.git ~/.claude/skills
- Open Claude Code (or Cowork) on Fable or Opus and type /agent-army with your task
- Pick an army size and a power level when asked, review the plan, and deploy
Start with something mechanical - a rename, a color migration, an audit - and watch the wave reports come in. Once you have seen 20 agents finish an afternoon of work before your coffee cools, single-agent workflows feel slow.
And if you want this kind of leverage wired into your business - not just your codebase - that is literally what we do.



