We're generating more code than ever before, but that doesn't mean we're building better software. In a talk at CommitConf 2026 (titled "AI Doesn't Write Good Software: The Environment Does"), developer Adrian Ferrera makes the case that generative AI has slashed the cost of producing code while leaving the harder problems untouched: understanding what you've built, validating that it actually works, maintaining it over time, and knowing whether it fits your system. The uncomfortable truth, Ferrera argues, is that if the bottleneck used to be writing code, now it's everything else.

The Industry Was Already Broken

AI didn't introduce new problems into software development; it supercharged existing ones. For years, teams have been rushing before understanding, building before validating, leaving tests for later, and treating technical debt as inevitable rather than solvable. CISQ estimated that in 2022 alone, poor software quality cost the United States approximately $2.41 trillion, with accumulated technical debt sitting at around $1.52 trillion. Before AI, we already had a serious problem. Now we can make those same mistakes at staggering speed. As Ferrera puts it: "AI doesn't create new technical debt. It accelerates the technical debt we already knew how to create."

Verification Debt Is the Real Risk

Ferrera introduces a concept he calls "verification debt": AI can generate code that looks correct on the surface, but someone still has to verify that it works in context. We've dramatically reduced the cost of producing code, but not the cost of understanding, validating, deploying, operating, and maintaining it. The new bottleneck is less about writing code and more about knowing whether that code is actually correct. Ferrera suggests the real risk isn't AI replacing developers; it's AI letting teams automate their lack of judgment.

A Workflow Built for Judgment

The thesis is straightforward: AI writes good software when it works within an environment that provides clear boundaries, reliable feedback, and spaces for human intervention. Quality doesn't depend on the model — it depends on the workflow where you integrate it. Ferrera proposes a five-phase framework he calls "Discovery → Plan → Review → Implement → Verify." It's not another Agile methodology for AI agents; it's a minimal control structure designed to clarify when you want AI to reason, when to propose, when to execute, and when humans need to validate.

Breaking Down the Phases

In Discovery, teams should use AI to understand problems, explore alternatives, identify risks, and surface ambiguities, but not to generate code yet. Humans contribute what AI lacks: business context, real constraints, historical decisions, and risk sensitivity. The Plan phase turns reasoning into concrete steps, specifying what will change, which modules might be affected, and what tests are needed. Review happens before implementation and focuses on intent and approach rather than syntax, catching misdirected solutions before the sunk cost of written code makes them hard to discard. In Implement, AI executes within that decision framework rather than improvising out-of-scope changes. Finally, Verify gathers signals: automated tests, type checks, linters, static analysis, CI pipelines, security reviews, and observability.
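The five phases described above can be read as a small state machine with human gates. The sketch below is only an illustration under stated assumptions: the `Workflow` class, the `HUMAN_GATES` set, and the rule that a rejected review returns to Plan are hypothetical choices made for this example, not something the talk prescribes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    DISCOVERY = "Discovery"
    PLAN = "Plan"
    REVIEW = "Review"
    IMPLEMENT = "Implement"
    VERIFY = "Verify"


# Phase order as named in the article.
ORDER = [Phase.DISCOVERY, Phase.PLAN, Phase.REVIEW, Phase.IMPLEMENT, Phase.VERIFY]

# Assumption for this sketch: these phases need explicit human sign-off.
HUMAN_GATES = {Phase.REVIEW, Phase.VERIFY}


@dataclass
class Workflow:
    phase: Phase = Phase.DISCOVERY
    history: list = field(default_factory=list)
    done: bool = False

    def advance(self, human_approved: bool = False) -> Phase:
        """Leave the current phase; gated phases require human approval."""
        if self.done:
            raise RuntimeError("workflow already finished")
        if self.phase in HUMAN_GATES and not human_approved:
            raise PermissionError(f"{self.phase.value} requires human approval")
        self.history.append(self.phase)
        idx = ORDER.index(self.phase)
        if idx == len(ORDER) - 1:
            self.done = True  # Verify was the last phase
        else:
            self.phase = ORDER[idx + 1]
        return self.phase

    def reject(self) -> Phase:
        """A rejected review goes back to Plan, before any code exists."""
        if self.phase is not Phase.REVIEW:
            raise ValueError("only the Review phase can be rejected")
        self.phase = Phase.PLAN
        return self.phase
```

In this sketch, calling `advance()` during Review without `human_approved=True` raises an error, which mirrors the idea that the plan is judged by a person before any implementation begins.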

Human in the Loop Means Deciding at the Right Moment

"Human in the Loop isn't reviewing at the end," Ferrera says. "It's deciding at the right moment." This distinction matters: humans aren't emotional compilers checking every line after the fact. They're there to contribute intention, context, judgment, experience, responsibility, and sensitivity to future impact — before building, before accepting solutions, before integrating changes, before taking on technical risk.

Orchestrators, Harnesses, and Humans

Ferrera describes three pillars for AI-augmented development: orchestrators provide structure by separating responsibilities (when to reason, decide, execute, verify, or request intervention); harnesses provide feedback through tests, types, linters, architecture rules, CI pipelines, security reviews, and staging environments; humans provide judgment. "An agent without harnesses is speed without direction," Ferrera notes. All three elements are necessary — none replaces the others.

Key Takeaways

  • AI accelerates existing problems rather than creating new ones; poor practices just run faster now
  • Verification debt builds up when teams accept AI output they have not validated in context
  • The five-phase workflow (Discovery → Plan → Review → Implement → Verify) introduces judgment before acceleration
  • Human oversight matters most at decision points, not after the fact
  • Build your harnesses: tests, types, linters, CI, security reviews, and architecture rules

The Bottom Line

Ferrera's argument lands because it's not anti-AI — it's pro-discipline. The teams that thrive won't be those using AI most aggressively; they'll be those who design better systems for responsible use. Define your harnesses, protect your standards, keep judgment at the center. Because speed without direction isn't progress — it's just faster failure.