Anchored, a new open-source project from developer @chafoo, tackles one of the most persistent frustrations in AI-assisted development: the widening gap between code generation speed and verification rigor. The tool introduces structured evidence gates into autonomous AI coding workflows, forcing agents to prove their work before marking tasks complete. Currently available as a pre-1.0 plugin for Claude Code, Anchored runs entirely through CLI commands bundled inside the plugin—no npm installs or MCP configuration required.
The Verification Gap Problem
AI coding assistants can generate entire features in minutes, but verification still happens at human speed. Tests pass. PRs merge. And somehow you're left wondering whether the test actually proves what it claims to prove—or why a particular architectural decision was made three hours into an autonomous run. Anchored's core thesis: prompts are requests, not guarantees. An agent can skip criteria, lose track over long executions, or point at evidence that doesn't actually validate anything. The tool flips this by making verification structural rather than disciplinary—a boundary the AI cannot cross without proof.
How the Four-Step Lifecycle Works
Every piece of work moves through plan → refine → build → wrap, and no criterion reaches "done" without attached evidence accepted by an independent checker. During plan, tasks are broken into testable acceptance criteria—if something can't be verified, it isn't a valid criterion. The refine step checks the plan against your actual codebase and project rules, catching gaps, bad assumptions, and soft criteria before any code is written. Build happens phase-by-phase; a phase cannot complete just because code compiles or tests turn green—each criterion needs validated evidence, and rule violations cause rejection. Finally, wrap summarizes the run since verification already occurred during execution.
Separation of Concerns: The Key Innovation
The architectural insight driving Anchored is separation between the agent that writes code and the one that evaluates whether it's proven. An independent instance assesses each criterion—what was required, what changed, which check proves it, whether evidence is sufficient, and whether rules were violated. Evidence isn't prose; it's structured state stored in a versioned anchored.yml at your repo root. This makes the entire process reviewable by the whole team rather than trapped in someone's chat history or personal knowledge.
Tier Model: Epic → Task → Phase
Anchored operates across three scales—epics, tasks, and phases—with consistent lifecycle stages for each. Epics represent large bodies of work that get decomposed into tasks; tasks break down into phases; phases contain the actual implementation steps with evidence gates. The plugin namespace is "a," so commands appear as /a:plan, /a:refine, /a:build, and /a:wrap. Right now Anchored isn't listed on the official Claude Code marketplace—installers add the GitHub repo (chafoo/anchored) as a marketplace source to pull it in.
Technical Architecture
The engine lives in core/ as a standalone Node-compatible package (not yet published separately), while the plugin bundles everything users need. Tests run via bun with spec-coverage, unit, e2e, and integration suites before any build passes. TypeScript compiles to dist/, and the whole project carries MIT licensing. The v3 architecture has been dogfood-validated internally, though @chafoo notes APIs may still shift before 1.0.
Key Takeaways
- Anchored enforces verification as a hard boundary, not a suggestion—the agent cannot proceed without accepted evidence
- Code-writing agents and proof-evaluating agents are intentionally separate instances to prevent conflicts of interest
- The versioned anchored.yml at repo root makes the entire process team-reviewable, not hidden in chat histories
- Currently pre-1.0 with GitHub-based installation for Claude Code; official marketplace listing pending
The Bottom Line
This is exactly what autonomous AI coding has been missing—a forcing function that closes the verification gap by design rather than hoping human reviewers catch everything. If you're running long agentic coding sessions and wondering why you don't fully trust the output, Anchored might be the infrastructure gap you've been ignoring.