Karajan v3.0.0: A Local-First Multi-Agent Orchestrator That Conducts AI CLIs Like an Orchestra

If you've spent any real time with Claude Code, Codex, or Gemini CLI tools, you know the grind. You describe what you want, watch the model work, catch it hallucinating an API call or touching the wrong layer, stop it, correct it, relaunch it. Repeat until your coffee's cold and your patience is thinner. Developer manufosela has been living this loop—and built Karajan to escape it.

The Conductor Metaphor Made Real

Karajan (named after legendary conductor Herbert von Karajan) is a local-first multi-agent orchestrator that runs AI coding assistants as a coordinated pipeline instead of isolated processes. The system spawns Claude, Codex, Gemini, Aider, and other CLI agents as child processes and routes tasks through defined roles: task → triage → researcher → architect → planner → coder → SonarQube (static analysis gate) → tester → reviewer → security pass → committer (opens PR). Each role can use a different provider: the coder might be Claude, the reviewer Codex, the triage agent something cheaper. Automatic fallback kicks in when any agent hits its quota.
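To make the child-process-plus-fallback idea concrete, here is a minimal Python sketch. It is not Karajan's code: the function names run_agent and run_role are hypothetical, the agent commands are whatever argv lists the caller supplies, and treating a non-zero exit code as "this agent is unavailable" is an assumption standing in for real quota detection.

```python
import subprocess
from typing import List, Sequence


def run_agent(argv: Sequence[str], prompt: str, timeout_s: float = 600.0) -> str:
    """Run one CLI agent as a child process, sending the prompt on stdin."""
    result = subprocess.run(
        list(argv), input=prompt, capture_output=True, text=True, timeout=timeout_s
    )
    if result.returncode != 0:
        # Assumption: any non-zero exit means the agent could not do the job.
        raise RuntimeError(result.stderr.strip() or f"exit code {result.returncode}")
    return result.stdout


def run_role(candidates: List[Sequence[str]], prompt: str) -> str:
    """Try each candidate command in order; fall back when one fails."""
    errors = []
    for argv in candidates:
        try:
            return run_agent(argv, prompt)
        except (RuntimeError, OSError, subprocess.TimeoutExpired) as exc:
            errors.append(f"{argv[0]}: {exc}")
    raise RuntimeError("all agents failed: " + "; ".join(errors))
```

A role would then be configured as an ordered list of commands, preferred provider first. A full pipeline is just this call repeated per role, with each role's output passed along as the next prompt.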

Zero API Cost, Multi-Provider Routing

The architectural decision to spawn agents as child processes rather than calling /v1/messages is what makes Karajan interesting: it uses your existing Pro/Plus/Max subscriptions without racking up new API bills. A powerful coder plus a strict reviewer plus a cheap triage agent means the right model for each role, with no extra spend. The TDD-first pipeline requires tests alongside code changes, and if something fails, the reviewer feeds back to the coder automatically—bounded by maxIterations and maxIterationMinutes so it can't spiral into an infinite loop of AI talking to itself.
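The bounded feedback loop is easier to reason about in code. The sketch below is illustrative only: review_loop, code_step, and review_step are hypothetical names, and it assumes maxIterationMinutes is a wall-clock budget for the whole loop (the source does not say whether it applies per iteration or in total).

```python
import time
from typing import Callable, Tuple


def review_loop(
    code_step: Callable[[str], str],
    review_step: Callable[[str], Tuple[bool, str]],
    task: str,
    max_iterations: int = 5,
    max_iteration_minutes: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[bool, str, int]:
    """Alternate coder and reviewer until approval or a budget runs out.

    Returns (approved, last_output, iterations_used).
    """
    deadline = clock() + max_iteration_minutes * 60
    feedback = ""
    output = ""
    for iteration in range(1, max_iterations + 1):
        if clock() >= deadline:
            # Time budget exhausted before this iteration could start.
            return False, output, iteration - 1
        # After the first pass, the reviewer's feedback is appended to the task.
        prompt = task if not feedback else f"{task}\n\nReviewer feedback:\n{feedback}"
        output = code_step(prompt)
        approved, feedback = review_step(output)
        if approved:
            return True, output, iteration
    # Iteration budget exhausted without approval.
    return False, output, max_iterations
```

Both limits are checked independently, so whichever runs out first stops the loop and hands control back to the human.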

MCP Server, Solomon, and Local RAG

Karajan ships as an MCP server (kj_run, kj_plan, kj_code, kj_review, kj_audit, kj_rag_query) for integration directly inside Claude Code, Cursor, or any MCP host. There's also Solomon—an AI judge role consulted only when the central Brain orchestrator hits a real dilemma like security versus deadline pressure. Each project gets its own embedded code index at ~/.karajan/rag.db with support for six embedding providers (Ollama, OpenAI, Voyage, Cohere, Mistral, local ONNX) and tree-sitter-based chunkers for JavaScript, TypeScript, Python, Rust, Go, and Java. The Story Board dashboard runs locally at http://localhost:4000 as a single source of truth showing every user story, session, plan, and RAG metric.
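For readers new to local RAG, the retrieval step boils down to "embed the question, rank stored chunks by similarity". The toy sketch below shows only that idea. It is not how Karajan's index works: the embed function is a bag-of-words stand-in for a real embedding provider, the index is a plain in-memory list rather than a database, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def embed(text: str) -> Vector:
    """Toy bag-of-words vector; a real setup would call an embedding provider."""
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: Dict[str, str]) -> List[Tuple[str, Vector]]:
    """Embed every code chunk once, keyed by a chunk id."""
    return [(chunk_id, embed(text)) for chunk_id, text in chunks.items()]


def query(index: List[Tuple[str, Vector]], question: str, top_k: int = 2) -> List[str]:
    """Return the ids of the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:top_k]]
```

In a real pipeline the chunks would come from a syntax-aware chunker (one function or class per chunk) rather than arbitrary strings, which is what keeps the retrieved context meaningful to the planner and coder roles.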

v3.0.0: Runtime Alignment, Not Feature Creep

The headline for this release is boring in the best way: Node 20 hit end-of-life on April 30, 2026, and three dependencies independently bumped to majors requiring Node 22+ (lint-staged 17, commander 15, better-sqlite3 12.10). Rather than staggered patches, v3.0.0 bundles everything into one release. Migration is a single command: nvm install 22.22.1 && npm i -g karajan-code@3, then kj doctor to verify your setup. No public API changes if you were already on Node 22—kj run, MCP tools, role templates, RAG, Story Board, audit, and telemetry all work exactly as before.
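If you want to gate the upgrade in your own tooling, a pre-flight check against the Node 22 baseline is only a few lines. This is a sketch, not what kj doctor does internally; the function names are hypothetical, and the input is assumed to be the string printed by node --version.

```python
import re

NODE_BASELINE_MAJOR = 22  # v3.0.0 requires Node 22+, per the release notes above


def node_major(version: str) -> int:
    """Extract the major version from a string like 'v22.22.1'."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised Node version: {version!r}")
    return int(match.group(1))


def meets_baseline(version: str) -> bool:
    """True when the given Node version satisfies the Node 22 baseline."""
    return node_major(version) >= NODE_BASELINE_MAJOR
```

For example, meets_baseline("v20.19.0") is False, so a CI script could stop there and print the migration command instead of attempting the install.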

Hardware Footprint and Profiles

The README now includes explicit hardware requirements so you know what you're getting into. The kj binary itself is 5.2 MB on npm with ~/.karajan/ running around 40 MB. Optional heavy layers: Ollama at 6.55 GB, SonarQube at 1.47 GB, and the qmd cache at roughly 2.2 GB. Three install profiles are available: Minimal (~250 MB), Recommended (~8.5 GB), and Full (~11 GB). The kj audit command runs deterministic checks for dead code, unused dependencies, security findings via OSV/Semgrep/SonarQube, accessibility hints, Web Performance budgets (Core Web Vitals via Chrome DevTools MCP), and an AI Harness Scorecard that gives your repo a 0-100 score for how "AI-friendly" it currently is.
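Picking a profile is simple arithmetic against your free disk space. The sketch below uses the approximate profile sizes quoted above; the function name and the 1 GB headroom default are assumptions for illustration, not anything Karajan ships.

```python
from typing import Optional

# Approximate install sizes in GB, taken from the figures quoted above.
PROFILES_GB = {"Minimal": 0.25, "Recommended": 8.5, "Full": 11.0}


def largest_profile_that_fits(free_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the biggest profile that fits in free space minus some headroom."""
    budget = free_gb - headroom_gb
    fitting = [(size, name) for name, size in PROFILES_GB.items() if size <= budget]
    # max() on (size, name) tuples picks the largest size that still fits.
    return max(fitting)[1] if fitting else None
```

With 10 GB free and the default headroom, the budget is 9 GB, so this picks Recommended rather than Full.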

From Shell Script to Orchestrated Pipeline

The version history reveals the scope of what manufosela has built. v0.x was literally one shell script: task → claude → diff → codex review → done. Hardcoded, no retry logic, worked roughly on a Tuesday afternoon. By v1.0-v1.3 it had quality gates with SonarQube and a role-based pipeline (BaseRole, BaseAgent, 12 configurable roles). The MCP server arrived in v1.2, followed by rate-limit detection and automatic fallback in v1.4-v1.7. Version 2.0 introduced the Karajan Brain as central orchestrator deciding routing and compressing outputs between roles. The "I use this daily" phase of v2.x brought stack-aware audit, Docker/shell installer, parallel user stories via git worktrees, domain knowledge files (DOMAIN.md), i18n support for English and Spanish end-to-end, Web Performance as a first-class quality gate, and the AI Harness Scorecard integrated into kj audit.

Key Takeaways

Karajan orchestrates multiple AI CLI tools (Claude, Codex, Gemini) as child processes—no new API costs using your existing subscriptions
The role-based pipeline includes triage, researcher, architect, planner, coder, SonarQube gate, tester, reviewer, security pass, and auto-committer with PR creation
MCP server integration lets you drive the full pipeline from inside Claude Code or Cursor without switching contexts
Local RAG indexes your codebase so AI agents understand what already exists before planning changes
v3.0.0 bumps to Node 22 baseline with a single migration command—zero API changes for existing users already on Node 22+

The Bottom Line

If you've been manually coordinating between Claude Code and Codex, or watching one AI tool go sideways while you're in the middle of something else, Karajan is worth your attention. It's not trying to replace these tools. It's doing what a good conductor does: imposing a unified vision so dozens of musicians sound like one. Install it (nvm install 22.22.1 && npm i -g karajan-code@3), run kj doctor, describe what you want once, and let the orchestra play.
