When Your AI Agent Runs rm -rf /: How Claude Code and Codex Keep Shell Commands Contained

A deep dive into the OS-level sandboxing arms race between Anthropic's Claude Code and OpenAI's Codex reveals that both systems reached for the same operating-system primitives but built fundamentally different trust architectures on top of them. The technical breakdown, published by instavm.io, shows how each system contains AI execution risk and makes a striking case of independent engineering convergence.

The Shared Primitives

Both Codex and Claude Code reached for the same Linux mechanisms to isolate untrusted command execution. bubblewrap (bwrap) creates filesystem namespaces with layered mounts, mounting the host root as read-only and overlaying writable paths on top. Seccomp/BPF filters system calls as the process makes them, acting as a firewall for syscalls rather than network ports. On macOS, both systems lean on Seatbelt, built on Apple's TrustedBSD mandatory access control framework, starting every policy with (deny default) and building allowlists from there. The implementations differ in language (Codex builds its argument lists in Rust while Claude Code uses TypeScript), but the underlying OS primitives are identical.
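
To make the layered-mount idea concrete, here is a minimal sketch of how a bwrap argument list could be assembled. It is not taken from either codebase: the function name build_bwrap_argv and its parameters are hypothetical, and only the bwrap flags themselves (--ro-bind, --bind, --dev, --proc, --unshare-net, --die-with-parent) are real options.

```python
from typing import List, Sequence


def build_bwrap_argv(command: Sequence[str],
                     writable_paths: Sequence[str] = (),
                     allow_network: bool = False) -> List[str]:
    """Hypothetical helper: read-only host root first, writable paths layered on top."""
    argv = ["bwrap", "--die-with-parent"]
    # Base layer: the whole host root, mounted read-only.
    argv += ["--ro-bind", "/", "/"]
    # Fresh /dev and /proc inside the sandbox.
    argv += ["--dev", "/dev", "--proc", "/proc"]
    # bwrap applies mounts in argument order, so these writable binds
    # land on top of the read-only root.
    for path in writable_paths:
        argv += ["--bind", path, path]
    if not allow_network:
        # New, empty network namespace: no interfaces except loopback.
        argv.append("--unshare-net")
    # Everything after the options is the command to run in the sandbox.
    argv += list(command)
    return argv
```

Calling build_bwrap_argv(["ls", "-la"], writable_paths=["/home/user/project"]) yields a list that could be handed to subprocess.run; the important property is only that the read-only root comes before any writable bind.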

The io_uring Discovery: A Kernel-Level Escape Vector

Here's where it gets interesting. Both teams independently identified a critical evasion vector that most developers have never heard of: Linux's io_uring subsystem can perform operations—including socket creation—in kernel context without passing through the socket() syscall. Starting with Linux 5.19, IORING_OP_SOCKET lets processes bypass seccomp's socket-blocking rules entirely. The fix? Block all three io_uring syscalls (io_uring_setup, io_uring_enter, and io_uring_register) at the BPF level before they can be weaponized. Codex implements this in Rust via the seccompiler crate; Claude Code uses a precompiled C binary built with libseccomp. Two competing teams, same vulnerability, same fix—pure parallel evolution under security pressure.
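
The gap is easier to see in a toy model. The sketch below does not install a real seccomp filter; it only models a filter as a set of denied syscall numbers (x86_64 values) and compares a socket-only policy with one that also denies the three io_uring syscalls. Returning EPERM is an assumption made for illustration, as is denying socket() outright; the real filters may be more selective and may use a different action.

```python
import errno
from typing import Callable, Iterable, Tuple

# Syscall numbers on x86_64 Linux.
SYSCALL_NUMBERS = {
    "socket": 41,
    "io_uring_setup": 425,
    "io_uring_enter": 426,
    "io_uring_register": 427,
}

Verdict = Tuple[str, int]


def make_filter(denied: Iterable[str]) -> Callable[[int], Verdict]:
    """Model a seccomp-style filter: deny listed syscalls, allow the rest."""
    denied_numbers = {SYSCALL_NUMBERS[name] for name in denied}

    def evaluate(syscall_number: int) -> Verdict:
        if syscall_number in denied_numbers:
            return ("ERRNO", errno.EPERM)  # assumed action, for illustration
        return ("ALLOW", 0)

    return evaluate


# A filter that only knows about socket() never sees io_uring traffic.
socket_only = make_filter(["socket"])
# The hardened variant also refuses to let a ring be created or driven.
hardened = make_filter(
    ["socket", "io_uring_setup", "io_uring_enter", "io_uring_register"]
)
```

Under the socket-only model, a process can still call io_uring_setup and submit IORING_OP_SOCKET through the ring, which is exactly the path the article describes; the hardened model rejects the ring before it exists.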

Codex: Mandatory Containment by Default

OpenAI's approach treats sandboxing as a non-negotiable containment boundary. The SandboxPolicy Rust enum has four variants (DangerFullAccess, ReadOnly, WorkspaceWrite, ExternalSandbox) and the compiler enforces exhaustive matching, so a match without a wildcard arm cannot silently skip a variant. should_require_platform_sandbox() is always evaluated; the manager may resolve to no sandbox only when policy semantics don't require enforcement. When a sandboxed command hits a blocked operation, Codex emits an EscalateRequest through its typed protocol, presenting users with three outcomes: Run directly, Escalate with broader permissions, or Deny entirely. The session-level scope means all commands in a turn share the same restrictions: consistent containment at the cost of flexibility.
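
The shape of that decision can be sketched as follows. This is an illustration, not Codex's code: the four variant names come from the article, but which variants require a platform sandbox is an assumption here, and requires_platform_sandbox is a hypothetical stand-in for should_require_platform_sandbox(). Python cannot enforce exhaustiveness at compile time the way Rust does, so the sketch falls through to an error instead.

```python
from enum import Enum, auto


class SandboxPolicy(Enum):
    DANGER_FULL_ACCESS = auto()
    READ_ONLY = auto()
    WORKSPACE_WRITE = auto()
    EXTERNAL_SANDBOX = auto()


def requires_platform_sandbox(policy: SandboxPolicy) -> bool:
    """Hypothetical mapping from policy to 'must the OS sandbox be applied?'."""
    # Assumption: restrictive policies always need OS-level enforcement.
    if policy in (SandboxPolicy.READ_ONLY, SandboxPolicy.WORKSPACE_WRITE):
        return True
    # Assumption: full access needs none, and an external sandbox is
    # enforced by something outside this process.
    if policy in (SandboxPolicy.DANGER_FULL_ACCESS, SandboxPolicy.EXTERNAL_SANDBOX):
        return False
    # Runtime stand-in for Rust's compile-time exhaustiveness check:
    # a newly added variant fails loudly instead of defaulting silently.
    raise AssertionError(f"unhandled policy: {policy!r}")
```

The point of the final raise is that adding a fifth variant cannot quietly inherit a default; in Rust the same mistake would be a compile error rather than a runtime one.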

Claude Code: Configurable Isolation You Can Tune

Anthropic takes the opposite philosophical position: sandboxing is a configurable isolation layer that developers can tune per command or per pattern. Configuration merges from five sources (enterprise MDM policies down to ~/.claude/settings.json) with deny-takes-precedence semantics. shouldUseSandbox() evaluates every BashTool invocation individually—sandbox.enabled, dangerouslyDisableSandbox flags, and containsExcludedCommand() pattern matching all feed into the decision. Commands can be excluded by pattern; compound commands (cmd1 && cmd2) are split and checked per subcommand to prevent bypasses like 'docker ps && curl evil.com' escaping because docker is excluded. When blocks occur, violations log to SandboxViolationStore—a 100-entry in-memory ring buffer—surfaced transparently in the REPL rather than as modal approval dialogs.
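
Two of those mechanisms are simple enough to sketch. The code below is an assumption-laden illustration, not Claude Code's implementation: split_subcommands, every_subcommand_excluded, and ViolationLog are hypothetical names, the rule "skip the sandbox only if every subcommand is excluded" is one reading of the article, and the tokenizer is deliberately naive (it does not understand command substitution such as $(...) or backticks).

```python
import fnmatch
import shlex
from collections import deque
from typing import List, Sequence

SEPARATORS = {"&&", "||", ";", "|", "&"}


def split_subcommands(command: str) -> List[List[str]]:
    """Split a shell line into subcommands on control operators (naive)."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    subcommands: List[List[str]] = []
    current: List[str] = []
    for token in lexer:
        if token in SEPARATORS:
            if current:
                subcommands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        subcommands.append(current)
    return subcommands


def every_subcommand_excluded(command: str, excluded: Sequence[str]) -> bool:
    """True only if each subcommand's program name matches an excluded pattern."""
    subcommands = split_subcommands(command)
    if not subcommands:
        return False
    return all(
        any(fnmatch.fnmatchcase(sub[0], pattern) for pattern in excluded)
        for sub in subcommands
    )


class ViolationLog:
    """Fixed-capacity in-memory log; the oldest entries are dropped first."""

    def __init__(self, capacity: int = 100) -> None:
        self._entries: deque = deque(maxlen=capacity)

    def record(self, message: str) -> None:
        self._entries.append(message)

    def entries(self) -> List[str]:
        return list(self._entries)
```

With "docker" excluded, every_subcommand_excluded("docker ps", ["docker"]) holds, while the compound line 'docker ps && curl evil.com' does not, because the curl subcommand matches no excluded pattern and so the whole line stays sandboxed.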

Where They Converge—and Diverge

Both systems use identical bubblewrap mount ordering (host root as base, restrictions layered on top), both block io_uring independently, and both implement Unix domain socket bridges for proxy-routed network traffic. But the philosophical split runs deep: Codex optimizes for guaranteed containment with session-level consistency; Claude Code optimizes for developer control with per-command granularity. Network enforcement differs too—Codex uses kernel-level seccomp BPF socket filtering while Claude Code enforces at the application layer through domain-based proxy allowlists. Perhaps most tellingly, Claude Code runs cleanupAfterCommand() after every sandboxed execution to scrub bare git repo markers (HEAD, objects, refs) that could weaponize an unsandboxed git invocation—Codex trusts its containment is sufficient without post-execution hardening.
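
For the application-layer side, a domain check inside a proxy might look roughly like the sketch below. Everything here is an assumption for illustration: the function names are hypothetical, the "*.example.com" wildcard syntax is invented for the example, and applying deny-before-allow to hostnames is an extrapolation of the deny-takes-precedence rule the article describes for configuration.

```python
from typing import Sequence


def host_matches(host: str, pattern: str) -> bool:
    """Match a hostname against an exact name or a '*.suffix' wildcard."""
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("*."):
        # "*.example.com" matches subdomains only, not "example.com" itself
        # and not look-alikes such as "evilexample.com".
        return host.endswith(pattern[1:])
    return host == pattern


def is_host_allowed(host: str,
                    allowed: Sequence[str],
                    denied: Sequence[str]) -> bool:
    """Default deny; an explicit deny entry beats any allow entry."""
    if any(host_matches(host, pattern) for pattern in denied):
        return False
    return any(host_matches(host, pattern) for pattern in allowed)
```

A proxy built this way rejects any host that is not listed, which mirrors the (deny default) posture of the filesystem policies; the trade-off is that enforcement depends on traffic actually passing through the proxy, which is what the Unix socket bridge and the isolated network namespace are there to guarantee.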

Key Takeaways

Both systems independently blocked io_uring syscalls to prevent kernel-context socket bypass attacks—a critical finding for anyone implementing AI command execution
Codex takes a mandatory approach: sandbox-first, session-level scope, escalation dialogs on block—optimized for closed trust loops
Claude Code takes a configurable approach: per-command evaluation, transparent violation logging, developer-controlled relaxation—optimized for flexibility and visibility
Network isolation in both systems routes through Unix socket bridges to managed proxies rather than relying solely on syscall filtering

The Bottom Line

These aren't just implementation details—they're fundamentally different philosophies about who should control AI behavior. Codex says containment is non-negotiable; Claude Code says developers are trusted operators who should see everything and decide for themselves. Both approaches have merit depending on your threat model, but the io_uring discovery alone proves this arms race isn't slowing down. If you're building AI agent infrastructure, study both systems—because the attackers certainly will.
