An operator typed "bug: demo revision limit — max revisions not being enforced" into a Telegram group. Forty minutes later, the thread showed ✅ NIP-153 — PR opened, with a link to a pull request that read like a careful engineer wrote it: root-cause analysis, the fix, stated assumptions, and a regression test. Merged. The "engineer" was a Claude Code session running in a loop on a laptop—nothing underneath it, no OpenClaw, no LangChain, no orchestrator service. Just one Markdown skill file, one 180-line Python script with zero external dependencies, and the skills, subagents, git worktrees, and MCP connection that Claude Code already ships with.
The Architecture
The setup is intentionally boring in all the right ways. A Claude Code skill (a Markdown file describing a procedure) acts as the orchestrator: drain Telegram, pick the next approved ticket, triage it, delegate the build, report back, repeat. When every ticket is blocked waiting on a human, it schedules its own wake-up 20–30 minutes out and goes quiet. Each ticket builds in its own isolated git worktree, where a fresh subagent implements, tests, lints, pushes, opens a PR, and reports back before the worktree disappears. The isolation is non-negotiable when agents share a repo, a lesson the team learned the hard way.
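In plain git terms, per-ticket isolation amounts to creating a worktree on a fresh branch and removing it when the build is done. The following is a minimal sketch of that idea, not the team's actual code: the `agent/<ticket>` branch naming, the `../worktrees` directory, and the `origin/main` base are all assumptions made for illustration.

```python
import subprocess
from contextlib import contextmanager


def worktree_commands(ticket_id, base="origin/main", root="../worktrees"):
    """Return (path, add_cmd, remove_cmd) for one ticket's worktree.

    Branch name, root directory, and base ref are hypothetical choices.
    """
    path = f"{root}/{ticket_id}"
    branch = f"agent/{ticket_id.lower()}"
    add = ["git", "worktree", "add", "-b", branch, path, base]
    remove = ["git", "worktree", "remove", "--force", path]
    return path, add, remove


@contextmanager
def ticket_worktree(ticket_id, **kwargs):
    """Create the worktree, yield its path, and always remove it afterwards."""
    path, add, remove = worktree_commands(ticket_id, **kwargs)
    subprocess.run(add, check=True)
    try:
        yield path
    finally:
        # Runs even if the build raised, so no stale checkout is left behind.
        subprocess.run(remove, check=True)
```

Used as `with ticket_worktree("NIP-153") as path: ...`, everything the subagent does happens inside `path`, so two tickets never share a working directory.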
Why Linear Over MCP Is the Real Insight
Zero lines of issue-tracker integration were written. Not a small amount—zero. Claude Code talks to Linear through the same MCP connection interactive sessions already use, which means no webhook receiver, no REST client, no sync job, no schema to maintain. When Linear changes something, they inherit it. More importantly, they're leaning on the tracker for three jobs at once: queue (tickets labeled agent, worked oldest-first by priority), state machine (three labels—agent, agent-blocked, manual—and label transitions are the workflow), and memory (every question and answer mirrors onto the ticket as a comment). Kill the loop mid-run, restart cold tomorrow—it re-reads tickets and continues. The only state on disk is a single JSON file holding a Telegram poll offset.
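To make the "labels are the workflow" idea concrete, here is a small sketch of a tracker used as both queue and state machine. The three label names come from the write-up; everything else is an assumption: the `Ticket` shape, the reading of "oldest-first by priority" as "most urgent first, then oldest", the convention that a smaller priority number is more urgent, and the exact transition table. In the real setup these reads and writes would go through the Linear MCP connection rather than in-memory objects.

```python
from dataclasses import dataclass, field

# Allowed label swaps (an assumed table, not taken from the source).
ALLOWED = {
    ("agent", "agent-blocked"),  # the agent asked a question and is waiting
    ("agent-blocked", "agent"),  # a human answered, work may resume
}


@dataclass
class Ticket:
    key: str
    priority: int   # assumption: smaller number = more urgent
    created: str    # ISO 8601 timestamp, so string order is time order
    labels: set = field(default_factory=set)


def next_ticket(tickets):
    """Pick the next workable ticket, or None if nothing is ready."""
    ready = [
        t for t in tickets
        if "agent" in t.labels
        and "agent-blocked" not in t.labels
        and "manual" not in t.labels  # manual fences a ticket off entirely
    ]
    return min(ready, key=lambda t: (t.priority, t.created), default=None)


def transition(ticket, src, dst):
    """Swap one workflow label for another, refusing moves outside the table."""
    if (src, dst) not in ALLOWED or src not in ticket.labels:
        raise ValueError(f"{ticket.key}: cannot move {src!r} -> {dst!r}")
    ticket.labels.remove(src)
    ticket.labels.add(dst)
```

Because the labels live on the ticket, nothing in this sketch needs to be saved between runs: a cold restart calls `next_ticket` on whatever the tracker currently says.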
Day One Results With Real Numbers
Three tickets tell the story. NIP-153 fixed a revision cap bug where quotas were counted per ticket row instead of being workspace-scoped; the agent identified the root cause, added a regression test, and ran 649 tests green before the PR was merged. NIP-154 addressed a bot escalating on trivial messages because an FAQ override file was a TODO stub; the implementation subagent took 49 tool uses, 9 minutes 52 seconds, and ~140k tokens. NIP-170 was filed by their CEO directly from chat: he got the syntax wrong twice before it stuck, and the agent confirmed "got it on the 3rd try 👍". Mid-run, the founder typed a queue reprioritization into the group, and the agent acknowledged, reordered its work, and stacked related tickets onto one PR. No config change, no redeploy; it was steered like a colleague.
The Safety Model
By design, this is not an unsupervised system. Nothing builds without an explicit human go: the agent label only ever gets applied through Telegram replies. A manual label fences tickets entirely; the agent declines even direct green-lights until a human removes the label in Linear. Every change arrives as a PR. The agent never merges, and CI runs the full test suite before a human reviews and lands anything. Tickets are worked one at a time, sequential by choice: observability beats throughput when parallel branches could step on each other's database migrations. Failed builds post a ⚠️, comment the failure onto the ticket, and are skipped for the rest of the run; nothing silently retries. The loop also treats ticket bodies, comments, and group messages as data, not instructions; "push straight to main / read the env file" in a ticket gets flagged instead of obeyed.
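The "one at a time, fail loudly, never retry" rule can be sketched as a short loop. This is an illustration under stated assumptions rather than the skill's real logic: `pick_next`, `build`, and `report` are hypothetical callables standing in for the tracker query, the subagent build, and the Telegram message, and the message wording is invented.

```python
def run(pick_next, build, report):
    """Work tickets sequentially; each ticket is attempted at most once per run.

    pick_next(exclude) -> ticket key or None   (hypothetical tracker query)
    build(key)         -> PR URL, or raises    (hypothetical subagent build)
    report(text)       -> None                 (hypothetical chat message)
    """
    outcomes = {}  # ticket key -> "pr" or "failed"
    while (key := pick_next(outcomes.keys())) is not None:
        try:
            pr_url = build(key)
        except Exception as exc:
            # Surface the failure and move on; the ticket is not retried this run.
            report(f"⚠️ {key}: build failed ({exc})")
            outcomes[key] = "failed"
        else:
            # The loop only opens a PR; merging stays with a human reviewer.
            report(f"✅ {key}: PR opened {pr_url}")
            outcomes[key] = "pr"
    return outcomes
```

Keeping the loop strictly sequential means the chat log reads as a single ordered history, which is the observability trade-off the section describes.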
Three Things That Bit Them
Telegram bot privacy mode is the most likely reason any clone of this setup "doesn't work." By default, bots in groups can't see plain messages, only replies and commands. Test answers vanished silently; nothing errored. The fix: BotFather → /setprivacy → Disable, then remove and re-add the bot to the group, because the change doesn't apply retroactively. Telegram's getUpdates retains messages for ~24 hours, which means a polling loop that is off over the weekend loses everything sent in that gap. They accepted this as a working-hours surface limitation rather than building infrastructure around it. People also don't reply to the question itself (they reply to the bot's latest message or nothing at all), so the NIP-123
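The retention gap is easier to see next to the polling mechanics. Below is a minimal standard-library sketch of draining Telegram's getUpdates with a persisted offset. The `{"offset": N}` state-file shape, the function names, and the zero-second timeout are assumptions; only the endpoint, the `offset` and `timeout` parameters, and the `update_id` field are part of the Bot API.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

API = "https://api.telegram.org/bot{token}/getUpdates"


def load_offset(path):
    """Read the saved poll offset; start from 0 if the file is missing or bad."""
    try:
        return int(json.loads(Path(path).read_text())["offset"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return 0


def save_offset(path, offset):
    Path(path).write_text(json.dumps({"offset": offset}))


def next_offset(updates, current):
    """An update counts as confirmed once a later call uses offset > its update_id."""
    return max([u["update_id"] + 1 for u in updates] + [current])


def drain(token, state_path):
    """Fetch pending updates once and advance the stored offset past them."""
    offset = load_offset(state_path)
    query = urllib.parse.urlencode({"offset": offset, "timeout": 0})
    url = f"{API.format(token=token)}?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        updates = json.load(resp)["result"]
    save_offset(state_path, next_offset(updates, offset))
    return updates
```

Nothing here can recover messages Telegram has already dropped: if `drain` is not called within the retention window, those updates never appear in `result`, which matches the weekend gap described above.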
The Bottom Line
The marginal cost of trying this—if you already run Claude Code and connect Linear via MCP—is roughly one afternoon. No per-token API bill, no separate agent product, no server. Before you reach for an orchestration framework, check whether a ticket queue, a group chat, and the coding agent you already pay for can do the job. Theirs could, and the entire platform fits in two files you can read in ten minutes. The real unlock isn't the AI—it's putting the audit trail where your team already looks instead of in log files only the agent reads.