Inside Tengu: Reverse Engineering Reveals What Claude Code Actually Does Under the Hood

The repository wtfwhs/tengu-decoded dropped on GitHub with a simple premise: document exactly what Anthropic's Claude Code binary does, version by version. What makes this different from typical speculation is the methodology—researcher wtfwhs extracted and beautified 733,000 lines of plaintext JavaScript directly from the Bun-compiled executable (v2.1.197), making every claim reproducible via committed raw data and carved bundles.

The Binary Architecture Shift

From v2.1.32 to v2.1.169, Claude Code underwent a fundamental packaging change: it moved from Node.js SEA to a Bun standalone executable. This wasn't just an implementation swap—it meant the entire JavaScript application became recoverable as cleartext. No decompilation or bytecode analysis required. The bundle sits inside the binary like embedded assets in a compiled Go program, extractable with basic tools like dd and grep. For anyone curious about what their AI coding assistant actually phones home with, this architectural shift opened the floodgates.
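The dd-and-grep style of extraction described above can be sketched in a few lines of Python. This is a minimal illustration that runs on a synthetic byte string, not on the real binary. The marker string, the helper names, and the NUL-terminator heuristic are all assumptions made for the example and are not taken from the repository.

```python
from typing import Iterator


def find_marker_offsets(blob: bytes, marker: bytes) -> Iterator[int]:
    """Yield every byte offset at which `marker` occurs in `blob` (like `grep -b`)."""
    start = 0
    while True:
        idx = blob.find(marker, start)
        if idx == -1:
            return
        yield idx
        start = idx + 1


def carve(blob: bytes, start: int, end: int) -> str:
    """Return blob[start:end] as text (like `dd skip=... count=...`)."""
    return blob[start:end].decode("utf-8", errors="replace")


# Synthetic stand-in for an executable: binary header, embedded JS, binary trailer.
# The marker below is hypothetical; a real bundle would need its own anchor string.
marker = b"// @hypothetical-bundle-start"
fake_binary = b"\x7fELF\x00\x01" + marker + b"\nconsole.log('hi');\n" + b"\x00\x00\xff"

offsets = list(find_marker_offsets(fake_binary, marker))
# Assumption for this sketch: the embedded text ends at the first NUL byte.
end = fake_binary.index(b"\x00", offsets[0])
source = carve(fake_binary, offsets[0], end)
print(offsets, repr(source))
```

Against a real executable, the same two steps would read the file with `open(path, "rb")` and use an anchor string that actually occurs in the bundle.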

Telemetry: What's Actually Leaving Your Machine

The telemetry findings are where things get interesting. Segment has been completely removed from the stack. Anthropic now relies on first-party event logging at /api/event_logging/v2/batch as the primary pipeline, with Datadog US5 configured as an allow-listed mirror that is off by default. The latest analyzed version defines 1,163 distinct telemetry events and 243 feature flags. GrowthBook drives feature-flag experimentation, while OpenTelemetry and Perfetto traces are optional extras for users who opt in. Notably, the same Datadog token appears to have persisted through multiple release cycles, which could be an oversight or a deliberate design choice and is worth monitoring.
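One way to picture a primary pipeline with an optional mirror is the routing sketch below. It assumes that "allow-listed" means only named events are forwarded to the mirror. The class, the queue fields, and the event names are hypothetical, and the code models the routing decision only, without any network calls.

```python
from dataclasses import dataclass, field


@dataclass
class TelemetryRouter:
    """Hypothetical model: every event goes to the primary sink, some to a mirror."""

    mirror_enabled: bool = False  # the article reports the mirror is off by default
    mirror_allow_list: frozenset = frozenset()  # assumed: event names allowed to mirror
    primary_queue: list = field(default_factory=list)
    mirror_queue: list = field(default_factory=list)

    def emit(self, name: str, payload: dict) -> None:
        event = {"name": name, "payload": payload}
        # The first-party pipeline always receives the event.
        self.primary_queue.append(event)
        # The mirror receives it only when enabled AND the name is allow-listed.
        if self.mirror_enabled and name in self.mirror_allow_list:
            self.mirror_queue.append(event)


default_router = TelemetryRouter()
default_router.emit("hypothetical_startup", {})

mirrored_router = TelemetryRouter(
    mirror_enabled=True,
    mirror_allow_list=frozenset({"hypothetical_startup"}),
)
mirrored_router.emit("hypothetical_startup", {})
mirrored_router.emit("hypothetical_other", {})
print(len(mirrored_router.primary_queue), len(mirrored_router.mirror_queue))
```

Under these assumptions a default install sends nothing to the mirror, and an enabled mirror still sees only the allow-listed subset.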

Device Fingerprinting: Not What You'd Expect

Perhaps the most surprising finding concerns device identification. The transmitted device_id isn't hardware-derived—it's a random 256-bit token generated via crypto.randomBytes(32) and stored in ~/.claude.json. While Claude Code does read the OS machine UUID, it strips everything down to just host.arch before transmission. In short: your install gets a unique random identifier rather than a fingerprint tied to your actual hardware. This is more privacy-respecting than many alternatives, though the full telemetry payload likely contains other environment signals worth auditing.
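The generate-once, persist, and reuse pattern described here looks roughly like the following. This is a Python analogue, using secrets.token_bytes(32) in place of Node's crypto.randomBytes(32). The JSON key name and the function are hypothetical, and the demo writes to a temporary directory rather than to ~/.claude.json.

```python
import json
import secrets
import tempfile
from pathlib import Path


def load_or_create_device_id(config_path: Path, key: str = "hypothetical_device_id") -> str:
    """Return the stored ID, creating a random 256-bit one on first use."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    if key not in config:
        # 32 random bytes = 256 bits, hex-encoded to 64 characters.
        # Nothing here is derived from hardware or OS identifiers.
        config[key] = secrets.token_bytes(32).hex()
        config_path.write_text(json.dumps(config, indent=2))
    return config[key]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    first = load_or_create_device_id(path)   # generates and persists
    second = load_or_create_device_id(path)  # reads the persisted value back

print(len(first), first == second)
```

Because the value is random rather than derived, deleting the stored entry yields a fresh identifier with no link to the old one.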

Cloud Backend and Managed Agents

The cloud infrastructure reveals ambitious expansion plans. New API endpoints for managed agents include /v1/sessions, /v1/agents, /v1/environments, and /v1/files, plus a Remote-Control bridge operating over wss://bridge.claudeusercontent.com. The background-agent daemon (/background, /tasks, /fork), combined with kairos loop scheduling and cron-style triggers, suggests Claude Code is evolving beyond a CLI wrapper into a persistent autonomous agent platform. Agent Teams functionality has also matured: the internal gate tengu_amber_flint flipped from false to true in recent builds, so coordinator mode and shared team memory are now available by default at the gate level, though they stay inactive until the user opts in via --agent-teams.
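The relationship between the gate and the opt-in flag can be stated as a two-condition check. The sketch below assumes both conditions must hold for Agent Teams to activate. The function name and the way gates and arguments are passed in are hypothetical; only the gate name and the flag come from the findings above.

```python
from typing import Mapping, Sequence


def agent_teams_active(gates: Mapping[str, bool], cli_args: Sequence[str]) -> bool:
    """Hypothetical check: the feature runs only if the gate is on AND the user opted in."""
    gate_on = gates.get("tengu_amber_flint", False)
    opted_in = "--agent-teams" in cli_args
    return gate_on and opted_in


# Older builds: gate off, so the flag alone does nothing.
old_build = agent_teams_active({"tengu_amber_flint": False}, ["--agent-teams"])
# Recent builds: gate on, but still inactive without the flag.
new_build_no_flag = agent_teams_active({"tengu_amber_flint": True}, [])
# Recent builds with explicit opt-in.
new_build_with_flag = agent_teams_active({"tengu_amber_flint": True}, ["--agent-teams"])
print(old_build, new_build_no_flag, new_build_with_flag)
```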

Security Model Evolution

The most significant security change between analyzed versions involves command injection protection. The original 14-category regex pipeline has been replaced entirely: an LLM prefix-classifier handles prompt injection attempts, supplemented by a destructive-command regex and a two-stage auto-mode classifier. Sandbox isolation uses bubblewrap on Linux, seatbelt on macOS, and WFP on Windows. Whether this layered approach is more robust than the previous regex-based system remains debatable—LLM classifiers introduce their own failure modes—but it represents a fundamentally different architecture for handling potentially malicious inputs.
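To make the layering concrete, here is a toy two-layer check in which a regex screens for obviously destructive commands before a classifier is consulted. Everything in it is an assumption for illustration: the pattern, the ordering of the layers, the "ask" and "allow" verdicts, and the stub that stands in for an LLM classifier. None of it is taken from the Claude Code bundle.

```python
import re
from typing import Callable

# Hypothetical pattern: a few destructive commands at the start of a line
# or after a shell separator (;, &, |).
DESTRUCTIVE_PATTERN = re.compile(r"(^|[;&|]\s*)(rm\s+-rf\b|mkfs\b|dd\s+if=)")


def review_command(command: str, classify: Callable[[str], str]) -> str:
    """Return 'ask' if the regex layer fires, else defer to the classifier layer."""
    if DESTRUCTIVE_PATTERN.search(command):
        return "ask"
    return classify(command)


def stub_classifier(command: str) -> str:
    """Stand-in for a model call; always answers 'allow' in this sketch."""
    return "allow"


print(review_command("ls -la", stub_classifier))
print(review_command("echo hi && rm -rf build", stub_classifier))
```

The sketch also shows why the comparison with the old system is hard to settle: the regex layer is predictable but narrow, while whatever sits behind the classifier callable can fail in ways a fixed pattern cannot.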

Key Takeaways

Bun-compiled packaging exposes full JavaScript source as plaintext—extractable with standard Unix tools
Device ID uses random 256-bit tokens, not hardware fingerprints—a privacy-positive design choice
Telemetry stack shifted from Segment to first-party logging plus optional Datadog mirroring
Background agent daemon and kairos scheduling suggest Claude Code is becoming a persistent autonomous platform
The Agent Teams gate is now on by default, but the features still require user opt-in via --agent-teams

The Bottom Line

Tengu Decoded is essential reading for anyone who wants to understand what their AI tools actually do, and for Anthropic it is an uncomfortable degree of transparency. The research shows that Bun's packaging model gives proprietary AI tooling essentially no obfuscation against determined analysis. Whether you view this as a privacy win or a security wake-up call depends on whether you're the user or the vendor. Either way, a binary packaged this way can no longer be treated as opaque.
