When developer rentierdigital installed graphify on two production projects (a Next.js storefront with a Convex backend, and a content pipeline), he expected meaningful token savings from the knowledge graph's cross-document analysis. After 14 days of auditing every JSONL transcript in ~/.claude/projects/, the numbers told a different story: across 60 sessions, Claude Code fired approximately 1,500 hook reminders about the graph (about 25 per session) and invoked it exactly zero times.

The Audit That Exposed the Gap

The setup was thorough. A post-commit git hook digested commits without failure, the MCP server booted cleanly each session with seven tools visible in the panel, and CLAUDE.md contained rules injected verbatim by graphify's installer. By every mechanical measure, the infrastructure worked perfectly. But rentierdigital discovered that tool installation and tool adoption are fundamentally different problems: one you can automate, the other you cannot. The single documented productive use came during a 404 bug hunt on May 2nd, where the graph helped narrow the search zone by pointing at the right module cluster. Even there, the initial diagnosis was wrong: the agent accused humanize.ts when mesh.ts:tryInsertLink was the actual culprit. The net gain was one saved grep and roughly 500 tokens of file reads, marginal next to the 49-71x token savings Mustafa Genc published on GoPenAI (April 2026). Those benchmarks measure controlled scenarios where an agent must use the graph. This audit measured what happens when agents have a choice.

Why Grep Always Wins

The core issue isn't that graphify is poorly designed—it's that AI agents run real-time arbitrage at every step, weighing the cheapest path to completing a task. To answer questions like "where is X defined" or "who calls Y," grep returns exact lines in 1-2 seconds. Reading GRAPH_REPORT.md delivers 500 tokens of thematic summary before the agent still has to open those files. One step versus two. As rentierdigital puts it, text rules enter that arbitrage as signals, not laws. This pattern appears across GitHub issues #18660, #22022, #19635, and notably #57200 filed May 8th—where a reporter burned 85K tokens re-discovering a decision already documented in memory because their rules.md instructions didn't survive context compression. Yajin Zhou articulated it cleanly in March 2026: for AI agents, rules aren't laws, they're suggestions. Christoph Schweres documented the architectural reality a month earlier: after context compression, CLAUDE.md changes status from rule to information the agent may or may not weight.

Where Knowledge Graphs Actually Help

Rentierdigital identifies two setup mistakes that shaped his experience. First, he installed a discovery tool on familiar terrain: both projects had been heavily modified by hand over months, so Claude's working memory was already warm, and cross-document connections do not surprise anyone who wrote the original code. In his words: "I installed a compass inside an apartment I've lived in for five years." Second, he deployed it cold on projects that had already crossed into warm maintenance territory. The use cases where graphify genuinely earns its keep are narrower than marketing suggests: onboarding onto inherited repos where grep is blind at the start, cross-module security audits hunting hidden dependencies across a legacy codebase, and mixed reading lists (papers plus notes plus tweets) where community detection surfaces connections that grep cannot produce. Jo Van Eyck argued for this framing in his "AI coding agents are useless on large codebases" video, and on that specific point about unknown surface areas, he was right.

Three Levers That Actually Work

The article's most valuable contribution is distinguishing between advisory tooling and mechanical intervention. Mustafa Morbel framed the contrast: CLAUDE.md is advisory by nature, hooks are deterministic. Text gets debated; mechanics do not. The first lever: "Block instead of remind." A PreToolUse hook that refuses the alternative action forces the path. If your hook only reminds, you've built a polite voice the agent learns to tune out. If it blocks the call, you've built a wall. The second lever is to make the tool the short path, not a detour: any tool that adds a step loses the arbitrage regardless of how well you wrote its documentation in CLAUDE.md. The third is to introduce tools at the right moment: context tools win on cold start, not during warm maintenance, when the agent has already built pattern recognition around existing workflows.
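To make "block instead of remind" concrete, here is a minimal sketch of a blocking PreToolUse hook in Python. It rests on several assumptions that are not from the article: that the hook receives a JSON event on stdin with tool_name and transcript_path fields, that exiting with code 2 refuses the call and returns stderr to the agent, that transcript lines nest tool calls under message.content, and that the graph tool is exposed as mcp__graphify__query (a hypothetical name). The policy itself (refuse Grep until the graph has been queried once in the session) is only an illustration, not rentierdigital's configuration.

```python
import json
import sys

# Hypothetical names for illustration; adjust to your own setup.
BLOCKED_TOOLS = {"Grep"}                 # the "short path" we refuse at first
PREFERRED_TOOL = "mcp__graphify__query"  # assumed MCP tool name, not confirmed


def graph_used(transcript_lines) -> bool:
    """Return True if any transcript line records a call to PREFERRED_TOOL.

    Assumes each line is a JSON object whose message.content is a list of
    blocks, with tool calls shaped like {"type": "tool_use", "name": ...}.
    """
    for line in transcript_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank, partial, or corrupt lines
        message = record.get("message") if isinstance(record, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if (isinstance(block, dict)
                    and block.get("type") == "tool_use"
                    and block.get("name") == PREFERRED_TOOL):
                return True
    return False


def decide(event: dict, graph_already_used: bool) -> tuple[int, str]:
    """Return (exit_code, stderr_message) for one PreToolUse event."""
    tool = event.get("tool_name", "")
    if tool in BLOCKED_TOOLS and not graph_already_used:
        # Exit code 2 is assumed to block the call (the wall, not the voice).
        return 2, f"Blocked {tool}: call {PREFERRED_TOOL} first, then retry."
    return 0, ""


def main() -> int:
    event = json.load(sys.stdin)
    try:
        with open(event.get("transcript_path"), encoding="utf-8",
                  errors="replace") as handle:
            used = graph_used(handle)
    except (OSError, TypeError):
        used = False  # missing or unreadable transcript: treat as unused
    code, message = decide(event, used)
    if message:
        print(message, file=sys.stderr)
    return code

# As a hook script, finish the file with: sys.exit(main())
```

The decision logic lives in decide() so it can be tested without a live session. A blanket block like this is deliberately blunt: it trades some wasted first calls for a path the agent cannot argue its way around.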

Key Takeaways

  • Tool adoption by AI agents is 10x harder than tool installation—mechanics matter more than documentation
  • Claude Code runs real-time arbitrage at every step; the shortest path tends to win, and CLAUDE.md rules enter only as signals
  • Knowledge graphs work for cold onboarding and heterogeneous corpora, not familiar maintained projects
  • Blocking hooks (walls) force the path; reminder hooks (polite voices) get tuned out
  • Audit your own transcripts before trusting that your rules are actually being followed

The Bottom Line

A knowledge graph the agent doesn't read isn't a tool—it's documentation. If you're relying on written rules to change Claude Code's behavior, you're betting on a property the model doesn't have: it has weights, not laws. Before you trust your next set of CLAUDE.md instructions, run "grep 'tool_use' ~/.claude/projects//*.jsonl | grep " and count what actually happened versus what you wrote.
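For a per-tool count rather than raw grep matches, the same audit can be sketched in Python. This is a rough sketch under an assumed transcript schema: each JSONL line is a JSON object whose message.content is a list of blocks, with tool calls shaped like {"type": "tool_use", "name": ...}. Check a few lines of your own transcripts before trusting the numbers.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterable


def count_tool_calls(lines: Iterable[str]) -> Counter:
    """Count tool_use blocks by tool name across JSONL transcript lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool calls
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "<unnamed>")] += 1
    return counts


def audit(root: Path) -> Counter:
    """Sum tool-call counts over every .jsonl file under root."""
    totals: Counter = Counter()
    for path in root.rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            totals.update(count_tool_calls(handle))
    return totals
```

Calling audit(Path.home() / ".claude" / "projects") and printing totals.most_common() gives the "what actually happened" column. A tool your rules mandate but that never appears in the counter is the gap this audit is meant to expose.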