Anthropic published research in February 2026 that should make every engineering leader pause. A randomized controlled trial involving 52 mostly-junior engineers found that those using AI coding assistants while learning a new library scored 17% lower on comprehension tests compared to developers who relied solely on search and documentation. The study, authored by two Anthropic researchers, measured how well developers understood the code they had just written—not whether they finished faster.
The Numbers Behind the Finding
The productivity result was underwhelming: no significant difference in completion time between groups. But the mastery data told a different story. The AI-assisted group scored 17% lower on a comprehension test covering code reading, debugging, and conceptual questions about what they had just built. With Cohen's d = 0.738 and p = 0.010, this isn't noise: it is a statistically significant, medium-to-large effect, roughly equivalent to dropping two letter grades. The paper states the tradeoff plainly: "AI helps you finish. It can hurt your understanding of what you finished."
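To make the effect-size claim concrete, here is a minimal sketch of how Cohen's d is computed from two groups' scores. The score distributions, group means, and random seed below are invented for illustration; only the 52-person sample and the reported d = 0.738 come from the paper.

```python
import numpy as np

def cohens_d(group_a: np.ndarray, group_b: np.ndarray) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * group_a.var(ddof=1) +
                  (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2)
    return (group_a.mean() - group_b.mean()) / np.sqrt(pooled_var)

# Hypothetical 26-person groups; a d near 0.7 means the group means sit
# roughly 0.7 pooled standard deviations apart.
rng = np.random.default_rng(42)
search_docs = rng.normal(loc=70, scale=12, size=26)  # search/docs control
ai_assisted = rng.normal(loc=61, scale=12, size=26)  # AI-assisted group
print(f"d = {cohens_d(search_docs, ai_assisted):.3f}")
```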
How You Use It Matters More Than Whether You Use It
The most striking finding came from analyzing how developers within the AI group used the tool differently. Conceptual-inquiry users—those who asked questions like "what does this do?" and "why this pattern?"—scored 65% or higher on comprehension tests. Code-delegation users who prompted "write this function" scored below 40%. Same tool, same task, same time allotment, dramatically different outcomes.
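The inquiry-versus-delegation split can be made concrete. Below is a hypothetical sketch of how prompts might be bucketed into the two styles the study describes; the marker lists and classification rule are invented for illustration and do not reproduce the study's actual coding scheme.

```python
# Invented keyword markers for the two usage styles; not from the study.
INQUIRY_MARKERS = ("what does", "why", "how does", "difference between", "explain")
DELEGATION_MARKERS = ("write", "generate", "implement", "create a function")

def classify_prompt(prompt: str) -> str:
    """Crudely bucket a prompt as conceptual inquiry or code delegation."""
    p = prompt.lower()
    if any(marker in p for marker in DELEGATION_MARKERS):
        return "delegation"
    if any(marker in p for marker in INQUIRY_MARKERS):
        return "inquiry"
    return "other"

assert classify_prompt("Why does this pattern break under load?") == "inquiry"
assert classify_prompt("Write this function for me.") == "delegation"
```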
Supporting Evidence From Independent Researchers
The Anthropic finding isn't isolated. METR's July 2025 study found that 16 experienced open-source developers believed AI sped them up by 20% when they were actually 19% slower on real repositories. A November 2025 difference-in-differences analysis of Cursor adoption showed that open-source projects adopting the tool saw a 281% spike in lines added in the first month, but static-analysis warnings rose 29.7% and code complexity rose 40.7%. The study notes that "a 100% increase in code complexity is associated with a 64.5% decrease in development velocity over time."
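For readers unfamiliar with the method, difference-in-differences compares the pre/post change in adopting projects against the same-period change in comparable non-adopting projects, so that trends affecting everyone cancel out. A minimal sketch, with every number invented for illustration (none are from the Cursor analysis):

```python
# Monthly lines added, averaged per project. All values are hypothetical.
adopters_pre, adopters_post = 1200.0, 4570.0   # projects that adopted the tool
controls_pre, controls_post = 1150.0, 1230.0   # comparable non-adopters

# The control group's change estimates what would have happened anyway;
# subtracting it isolates the change attributable to adoption.
did = (adopters_post - adopters_pre) - (controls_post - controls_pre)
print(f"DiD estimate: {did:+.0f} lines added per month")
```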
Where the Evidence Is More Nuanced
To be fair, peer-reviewed counter-evidence exists. A 2025 study by Cui, Demirer, Jaffe, and coauthors published in Management Science ran three field experiments with 4,867 developers at Microsoft, Accenture, and a Fortune 100 company, finding a 26.08% increase in completed tasks with Copilot access. However, the gains concentrated heavily among less-experienced developers while senior developers showed smaller effects, and METR's work on legacy code suggested negative effects for experienced engineers.
What Builders Should Actually Do
If Anthropic's skill-formation result is directionally correct, and the independent evidence above points the same way, the practical change isn't avoiding AI. It's changing how you interact with it. Ask AI questions rather than delegating code generation: "What's the difference between X and Y here?" or "Why does this pattern break in case Z?" Then write the code yourself, or write a first version and ask AI to critique it. Reserve delegation for throwaway scripts, spikes, and one-off tasks you'll never touch again.
Key Takeaways
- AI coding tools deliver modest completion-time gains: 0–30%, not 5x or 10x
- Comprehension dropped 17% for AI-assisted developers relative to those using only search and documentation
- Inquiry-based usage yields 65%+ comprehension; delegation yields sub-40% scores
- Junior developers on tractable tasks see real wins; seniors on legacy code may see losses