Show HN: FERNme Builds Agent Memory with Zero LLM Calls and Flat Token Costs Forever

Most AI agents treat memory like a luxury—expensive to write, prone to hallucination, and locked behind vendor walls. FERNme, posted to Hacker News on June 20th by developer mirkofr, takes the opposite approach: it's built for agents that act at scale, across any domain, without burning tokens on every interaction.

The Core Innovation

FERNme runs a fuzzy-edged Hebbian graph where memory updates are pure arithmetic—no LLM calls required. When an agent interacts with a user, the system strengthens connections based on co-occurrence patterns and applies decay to older associations. The result is a per-site preference graph that stays lean: about 25 tokens whether it's tracking someone's first visit or their five-year behavior pattern.

Zero-LLM Writes And Flat Costs

Traditional extraction-based memory systems call an LLM roughly twice per interaction—one for writing, one for retrieval. FERNme eliminates both by using deterministic graph operations instead of generative models to update beliefs. At 120 interactions, this architecture is 77× more token-efficient than full-history approaches. The write path stays LLM-free in every mode; gated/offline enrichment only triggers as an opt-in fallback when the system encounters novel free-text it can't map deterministically.

Performance Benchmarks

On synthetic data generated from 92 third-person profiles, FERNme achieved 75% preference coverage against hidden answer keys, detected preference drift at 94%, and ignored 100% of injection attempts. In cost/quality Pareto analysis, pure mode hit ~52% quality at $0.008 per 1,000 interactions—122× cheaper than Mem0's estimated $0.95 at comparable quality. The system ties frequency counters on static recall but dominates when tastes shift: FERNme scores 0.72 while a frequency baseline collapses to 0.13.

User Ownership And Privacy

Every preference is visible and editable through a glass-box interface. Users can export their data, delete everything, or invoke forget_everywhere() to wipe the profile and unlearn themselves from the population prior—a provable right-to-be-forgotten backed by cryptographic guarantees. The system logs every action in a tamper-evident HMAC chain so users can detect any unauthorized changes.

Injection Resistance

Because writes are arithmetic operations on graph edges—not LLM extraction—page or user text can't be "talked into" becoming a belief. The source claims 100% of tested injection attempts were ignored, which matters enormously for agents deployed in adversarial environments like customer support or e-commerce.

Architecture And Deployment

FERNme ships with three memory modes: pure (default, no LLM), gated (one small call only on novel free-text), and offline (batched consolidation off the hot path). The REST API includes endpoints for /observe, /card, /recall, /edit, /export, /delete, and triggers. An MCP server is available for Claude integration. Storage defaults to SQLite for zero-setup but production deployments can use Postgres 16 with the same interface. The project includes a glass-box memory editor at /ui and a cross-surface graph visualization at /graph.

Honest Caveats

The benchmarks are on synthetic or LLM-authored data, not real users. The Mem0 head-to-head hasn't been run yet—the harness exists but needs an API key to execute. Gated and offline quality numbers are modeled until validated against a live model. These aren't weaknesses; they're the transparent status of a research preview that knows its limits.

Key Takeaways

FERNme uses Hebbian graph learning for zero-LLM memory writes, eliminating extraction overhead entirely
Token costs stay flat (~25 tokens) regardless of profile age—77× smaller than full history at 120 interactions
The system is injection-resistant by construction: deterministic writes can't be corrupted by prompt injection
User-owned glass-box design with cryptographic audit trails and provable right-to-be-forgotten
Open Apache-2.0 license, 88 tests passing, REST + MCP interfaces, SQLite or Postgres storage

The Bottom Line

FERNme isn't trying to win on nuanced preference extraction—that's LLM extraction's domain. It's proving that deterministic, interpretable memory can be cheap enough for production agents at scale while keeping users in control of what gets remembered about them. If the real-human pilot validates these numbers, this is the architecture most agent developers should be building on.

> Show HN: FERNme Builds Agent Memory with Zero LLM Calls and Flat Token Costs Forever