Throughline, a hook plugin for Claude Code that aggressively offloads tool I/O to SQLite, landed on npm this week, and if you run long Claude Code sessions it deserves a close look. The package comes from developer quolu and targets developers who routinely blow past token limits because every file read, grep result, and Bash output lingers in context until the session ends, long after it has done its job.

How It Works: A Three-Layer Memory Architecture

Throughline splits conversation management into three distinct layers. L2 holds the raw conversation body (user input plus AI response) for the most recent 20 turns, injected as-is. L1 takes everything older than 20 turns and compresses it to roughly one-fifth of its original size while preserving key decision points; summarization runs through Haiku calls under a Claude Max plan, so no separate API key is needed. L3 handles tool I/O: file contents, grep results, Bash output, system messages, and thinking. All of it is moved out of context entirely and into SQLite, and is retrieved only when Claude actually needs it. The result: files that have been read no longer sit around consuming tokens after the AI has moved on.
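
To make the split concrete, here is a minimal Python sketch of the three layers as described above. It is not Throughline's code (the package itself ships as .mjs files). The Turn class, the tool_io table, and the summarize stand-in, which simply truncates text where the real plugin calls Haiku, are all hypothetical names chosen for illustration; only the 20-turn window and the one-fifth ratio come from the description.

```python
import sqlite3
from dataclasses import dataclass, field

RECENT_TURNS = 20   # L2 window size, as stated in the article
SUMMARY_RATIO = 5   # L1 target: roughly one-fifth of the original size


@dataclass
class Turn:
    """Hypothetical record for one conversation turn."""
    user: str
    assistant: str
    tool_outputs: list[str] = field(default_factory=list)


def summarize(text: str) -> str:
    """Stand-in for the Haiku summarization call: keeps the first fifth."""
    return text[: max(1, len(text) // SUMMARY_RATIO)]


def build_context(turns: list[Turn], db: sqlite3.Connection) -> list[str]:
    """Return the text kept in context; tool output is written to SQLite.

    Call once per list of turns: calling it again would insert the same
    tool output a second time.
    """
    db.execute("CREATE TABLE IF NOT EXISTS tool_io (turn INTEGER, body TEXT)")
    for i, turn in enumerate(turns):
        # L3: tool I/O never enters the context list at all.
        db.executemany(
            "INSERT INTO tool_io (turn, body) VALUES (?, ?)",
            [(i, out) for out in turn.tool_outputs],
        )
    db.commit()

    cutoff = max(0, len(turns) - RECENT_TURNS)
    context = []
    for i, turn in enumerate(turns):
        body = f"{turn.user}\n{turn.assistant}"
        # L1: older turns are compressed. L2: recent turns stay verbatim.
        context.append(summarize(body) if i < cutoff else body)
    return context


def fetch_tool_io(db: sqlite3.Connection, turn: int) -> list[str]:
    """Retrieve stored tool output for one turn, only when it is needed."""
    rows = db.execute(
        "SELECT body FROM tool_io WHERE turn = ? ORDER BY rowid", (turn,)
    ).fetchall()
    return [row[0] for row in rows]
```

With 25 turns, the first five come back compressed, the last 20 come back verbatim, and every tool output is reachable only through fetch_tool_io.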

Real Numbers From a 50-Turn Session

The author tested Throughline on a 50-turn session on their own machine. A conversation that normally consumed approximately 125,000 tokens came in at roughly 13,000 tokens, a reduction of about 90 percent. That figure is not an estimate based on character counts divided by four: the plugin reads actual API values from message.usage in the transcript JSONL files, so the numbers are measured rather than approximated. It also auto-detects 1M context windows for larger deployments.
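
As a rough illustration of measuring from usage data instead of character counts, the Python sketch below reads message.usage from JSONL text and reports the prompt-side token count of the most recent record. The field names follow the usage object of the Anthropic Messages API; that the transcript stores them unchanged, and that summing the three input fields matches what Throughline reports, are assumptions. The function names are hypothetical.

```python
import json

# Field names follow the usage object of the Anthropic Messages API.
# That the transcript stores them unchanged is an assumption.
INPUT_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def context_tokens(jsonl_text: str) -> int:
    """Prompt-side tokens reported by the most recent usage record."""
    latest = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        message = record.get("message") if isinstance(record, dict) else None
        usage = message.get("usage") if isinstance(message, dict) else None
        if isinstance(usage, dict):
            # Later records overwrite earlier ones, so the last one wins.
            latest = sum(int(usage.get(name) or 0) for name in INPUT_FIELDS)
    return latest


def reduction_percent(before: int, after: int) -> float:
    """Percentage saved when usage drops from `before` to `after`."""
    return (before - after) / before * 100
```

Applied to the article's figures, reduction_percent(125_000, 13_000) comes to 89.6, which matches the "about 90 percent" quoted above.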

Installation and Memory Persistence

Getting started is straightforward: run npm install -g throughline followed by throughline install. The second command registers the hook in ~/.claude/settings.json, after which the hook runs automatically across all Claude Code projects on your machine, with no per-project configuration required. Throughline also includes a multi-session token monitor, started with throughline monitor. Notably, conversations persist in SQLite even after you run /clear within Claude Code. Memory carries forward to a new session only when you type /tl; the design intentionally prevents accidental firing across parallel windows or VSCode restarts. When you do carry over, Throughline passes along the "next step memo" written by the previous Claude instance and the internal reasoning from the final turn, launching the next Claude in "continue from interruption" mode rather than naive log-reading mode.

Zero Dependencies and Platform Support

The package ships with zero npm dependencies. The tarball published to npm contains only .mjs files, with no build process and no native bindings. Requirements are Node.js 22.5+ (for the built-in node:sqlite module), Claude Code with hooks support, and a Claude Max plan for L1 summarization via Haiku; Windows, macOS, and Linux are all supported. The project is MIT licensed, with source available on GitHub where bug reports and pull requests are welcome.

Key Takeaways

  • Tool I/O (file reads, grep results, Bash output) gets completely removed from context and stored in SQLite instead of lingering until session end
  • A 50-turn test showed token usage drop from ~125K to ~13K, a roughly 90% reduction measured from actual API usage values
  • Memory persists across /clear commands; an explicit /tl command carries state forward between sessions in "continue from interruption" mode
  • Zero npm dependencies—just .mjs files—requiring only Node.js 22.5+ and the built-in node:sqlite module

The Bottom Line

This is exactly the kind of utility that gets invented when someone actually lives inside a tool day-to-day and hits the pain point directly. Tool I/O bloat has been the elephant in the room for every long Claude Code session, and Throughline doesn't patch it—it surgically removes it. If you're running Claude Code seriously, this belongs on your machine.