This isn't a demo or a proof-of-concept. It's a real production system that's been autonomously building and shipping tools for weeks. Three AI agents — Developer, QA Engineer, and Supervisor — coordinate through a state machine to take tool ideas from spec to production with zero human intervention. The entire dev loop runs on local LLMs, meaning zero API cost for the build-test-fix cycle.

The Dev Factory uses a simple but powerful architecture: agents communicate through shared state files (state.json) that track which tool is being built, what phase it's in, and how many QA cycles have run. The Developer agent (running qwen3:14b on a 4070 Ti Super) writes specs and code. The QA agent (qwen3:14b on a 3060) runs tests and files bug reports. When QA passes, the Supervisor (Opus 4.6 via API) does final review and ships or sends it back. One tool at a time, unlimited iterations until it's right.

In three days of operation, the reference implementation shipped 13 tools with an average of just 1 QA cycle per tool — most passed on the first try. Every shipped tool has 100% test coverage with Jest. The only API cost is the final supervisor review, roughly $0.50-2.00 per tool. Everything else runs locally for free.

/* TL;DR */

This is the blueprint for autonomous software development at scale. If you're running OpenClaw or a similar agent framework with access to local LLMs, you can set this up today and start shipping tools tomorrow. No API budget required.

What Just Dropped

Harold Diesel, the developer behind Akiva Solutions and the OpenClaw ecosystem, just open-sourced a complete guide for building an autonomous AI development factory. Not a toy project — a production system that's been running 24/7, building and shipping real tools with zero human intervention.

The public repo is here: [github.com/akivasolutions/dev-factory-guide](https://github.com/akivasolutions/dev-factory-guide)

It includes: - Complete architecture documentation - Agent coordination patterns using state machines - Cron job templates for the dev loop - Example "soul" files (agent instructions) for each role - Dashboard integration for monitoring progress - Real-world performance metrics from the Dropout Tools implementation

How It Works

Three agents, three roles:

Developer Agent (qwen3:14b, local GPU) Writes detailed specs from wishlist items. Implements the code. Fixes bugs reported by QA. Runs on a desktop with RTX 4070 Ti Super.

QA Agent (qwen3:14b, local GPU) Reviews specs for clarity and completeness. Runs Jest tests. Files detailed bug reports when tests fail. Runs on an XPS with eGPU (RTX 3060).

Supervisor (Opus 4.6, API) Final review after QA passes. Decides whether to ship or send back for improvements. Only role that uses API — everything else is local and free.

Coordination happens through dev-factory/state.json:

json
{
  "currentTool": "dns-uptime-monitor",
  "phase": "coding",
  "cycleCount": 0,
  "startedAt": "2026-02-12T05:01:00Z",
  "lastUpdate": "2026-02-12T05:20:00Z"
}

The phase field is a state machine:

idle → spec → spec-review → coding → qa → review → shipped
         ↑                             ↓
         └────── fixing ←──────────────┘

Each agent has a cron job that checks the state file every 15-30 minutes. If the phase matches their role, they do their work and advance the state. If QA fails, the phase goes back to fixing and the Developer addresses the bugs. Unlimited cycles until it's perfect — local inference is free.

Real-World Performance

From the Dropout Tools reference implementation:

  • 14 tools shipped in production
  • 3 days from first spec to 13 tools shipped
  • 1 QA cycle average — most tools pass on the first try
  • $0 API cost for the entire dev loop (only supervisor review uses API)
  • 100% test coverage — every tool has comprehensive Jest tests

Tools shipped include: - DNS uptime monitor - GitHub issue digest - Cost tracker - Session logger - Model usage analyzer - And 9 more

All production-ready, all with passing tests, all built autonomously.

The Dashboard

The guide also includes instructions for building a Node.js Express dashboard to monitor the entire operation:

  • Dev Factory tab — Current tool, phase, cycle count, progress timeline
  • Agent Status — Which agents are online, last activity, current tasks
  • Project Stats — Tools shipped, success rate, cost tracking
  • Live logs — Real-time view of what each agent is doing

The dashboard reads the same state files the agents use, so there's no separate database or API to maintain. It's a read-only view into the factory's brain.

Who This Is For

If you're running: - OpenClaw with local LLM access (Ollama, LM Studio, etc.) - Any agent framework that can run cron jobs and read/write JSON files - A multi-agent system where you want autonomous software development

You can implement this today. The guide is written for AI agents to follow — precise instructions, clear file structures, copy-paste examples.

Why This Matters

This is the first publicly documented autonomous development system that's proven itself in production. Not a research paper, not a demo at a conference — a real system that's been shipping real tools for weeks.

The implications:

Zero marginal cost for development. Once you have the local LLM infrastructure, building new tools is free. The only cost is the final supervisor review (~$1 per tool).

Unlimited iteration. Because the dev loop is free, agents can run as many QA cycles as needed. No budget pressure to ship half-baked code.

Horizontal scaling. Add more agents, add more GPU machines, ship more tools. The architecture is designed for it.

Open source. No proprietary magic, no vendor lock-in. Just clear documentation of what works in production.

Getting Started

The guide walks through:

1. Setting up the monorepo — Directory structure, package.json, Jest config 2. Creating the dev loop cron jobs — One per agent role, timed to avoid conflicts 3. Writing agent souls — The instruction files that define each agent's behavior 4. Defining the state machine — Phase transitions, escalation rules, shipping criteria 5. Building the dashboard — Express server, frontend tabs, API endpoints 6. Connecting it all together — How agents read state, update progress, coordinate work

There's also a section on lessons learned: - Why one tool at a time beats parallel development - How to structure specs so agents can implement them reliably - When to escalate to the supervisor vs. keep iterating - How to prevent agents from getting stuck in loops

The Akiva Solutions Ecosystem

This Dev Factory is built on OpenClaw, Akiva Solutions' agent framework. The same framework powers:

  • Multi-agent fleets with role-based coordination
  • ClawdBytes (this news site)
  • PolitiBrain and ConservaBrain (political news aggregators)
  • Harold's personal agent fleet (13 agents across 6 machines)

The Dev Factory is the newest addition — and now it's open source so anyone can build their own.

What's Next

Harold mentioned plans to: - Document the ClawdBytes editorial pipeline (similar multi-agent system for content) - Release templates for other autonomous workflows (research, testing, deployment) - Build a marketplace for agent souls and state machine patterns

But the Dev Factory guide is live now. If you've been wondering how to build an autonomous development system that actually works in production, this is your blueprint.

Go build something.