If you've ever watched an AI coding assistant re-propose approaches you already rejected, or ask you to re-explain decisions made three sessions ago, you already know the stateless agent problem. Centri, a new project shared on Hacker News, aims to fix it at the root: not by expanding context windows but by inverting the architecture.

The Memory-First Approach

Centri treats memory as a derived index over an append-only event spine rather than storing information directly in the context window. Every tool call, file edit, decision, and result is durably logged, with optional secret redaction. When you start a new turn, the system assembles fresh context from this ledger using a deterministic curation function. That function attaches a source_event_id to every line of output and runs no LLM at read time, so the same inputs always produce byte-identical briefs.

The project is built from three components that share one durable memory store. The Centri core (Python) handles the append-only event spine, a typed memory graph with bi-temporal supersession, deterministic curation, and optional LLM consolidation, and it exposes both REST and WebSocket surfaces. An OpenCode fork provides the TypeScript/Bun web app shell, patched so that every turn recalls a brief from the core and runtime events flow back into the spine. Finally, a Hermes plugin connects this memory system to other AI workflows by translating memory calls into Centri's HTTP API.
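To make the idea of an append-only spine with deterministic curation concrete, here is a minimal Python sketch. It is not Centri's code: the names EventSpine, Event, and curate_brief, the event kinds, and the "newest decisions first" selection rule are all assumptions made for illustration. Only the source_event_id receipt on each line and the "same inputs, same bytes" property come from the project's description.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: int
    kind: str   # hypothetical kinds: "decision", "file_edit", "tool_call"
    text: str


class EventSpine:
    """Append-only log: events can be added and read, never edited or deleted."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, text: str) -> Event:
        event = Event(event_id=len(self._events) + 1, kind=kind, text=text)
        self._events.append(event)
        return event

    def events(self) -> tuple[Event, ...]:
        # Hand out an immutable snapshot so callers cannot mutate the log.
        return tuple(self._events)


def curate_brief(events, kinds=("decision",), limit=5) -> str:
    """Pure function of its inputs: no clock, no randomness, no LLM call."""
    selected = [e for e in events if e.kind in kinds]
    # Newest first; event_id is unique, so the ordering is total and stable.
    selected.sort(key=lambda e: e.event_id, reverse=True)
    lines = [f"[source_event_id={e.event_id}] {e.text}" for e in selected[:limit]]
    return "\n".join(lines)


spine = EventSpine()
spine.append("decision", "Use SQLite rather than Postgres for local state")
spine.append("tool_call", "pytest -q")
spine.append("decision", "Rejected: caching briefs in the prompt")

brief_a = curate_brief(spine.events())
brief_b = curate_brief(spine.events())

# Identical inputs yield identical bytes, so the digests match as well.
digest_a = hashlib.sha256(brief_a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(brief_b.encode("utf-8")).hexdigest()
print(brief_a)
print(digest_a == digest_b)
```

Because the brief is a pure function of the log, any line in it can be traced back to the event that produced it, and a brief can be regenerated at any time instead of being stored.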

Key Technical Decisions

The bi-temporal supersession model is particularly clever: new truth invalidates old truth in the presented context, but history remains fully auditable. Stale facts never resurface in a brief, while the complete timeline stays accessible for debugging or review. A SQLite FTS5 index enables verbatim recall of exact prior tokens (file names, error strings, identifiers), so nothing gets lost in translation between sessions.

LLM consolidation happens offline, in a background worker that folds the raw spine into an ambient layer containing your identity, active projects, top open loops, recent narrative, and a user profile that captures preferences and conventions you have demonstrated repeatedly. The system also supports importing existing histories from OpenCode, Claude Code, and Cursor, so fresh installs start warm instead of cold.
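The supersession behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Centri's schema: the Fact fields, the MemoryGraph class, and the use of plain integers as timestamps are hypothetical. It tracks two time axes (when a fact became true and when the system recorded it) and marks older facts as superseded instead of deleting them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    key: str
    value: str
    source_event_id: int
    valid_from: int                      # when the fact became true
    recorded_at: int                     # when the system learned it
    superseded_at: Optional[int] = None  # when a newer fact replaced it


class MemoryGraph:
    """Hypothetical store: facts are superseded, never deleted."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def assert_fact(self, key, value, source_event_id, valid_from, recorded_at):
        # Close out any live fact for the same key instead of removing it.
        for fact in self._facts:
            if fact.key == key and fact.superseded_at is None:
                fact.superseded_at = recorded_at
        self._facts.append(Fact(key, value, source_event_id, valid_from, recorded_at))

    def current(self) -> dict[str, str]:
        """What a brief would see: only facts that have not been superseded."""
        return {f.key: f.value for f in self._facts if f.superseded_at is None}

    def as_of(self, t: int) -> dict[str, str]:
        """Audit view: what the system believed at recorded time t."""
        return {
            f.key: f.value
            for f in self._facts
            if f.recorded_at <= t and (f.superseded_at is None or f.superseded_at > t)
        }

    def history(self, key: str) -> list[Fact]:
        return [f for f in self._facts if f.key == key]


graph = MemoryGraph()
graph.assert_fact("test_runner", "unittest", source_event_id=1, valid_from=1, recorded_at=1)
graph.assert_fact("test_runner", "pytest", source_event_id=7, valid_from=6, recorded_at=7)

print(graph.current())
print(graph.as_of(3))
print(len(graph.history("test_runner")))
```

The stale value never appears in current(), which is the view a brief would draw from, yet as_of() and history() can still answer what was believed earlier and which event changed it.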

Deployment and Benchmarks

Centri runs as two systemd services sharing one SQLite database (~/.centri/state.db): centri-core on port 8760 and the OpenCode fork web UI on port 4096. Configuration is entirely environment-driven with BYOK model support: point LITELLM_BASE_URL at any OpenAI-compatible provider. The core ships with a 385-test suite covering the memory graph, curation logic, ACP coding loop, tool contract, and history ingestion.

The project also includes centri-bench, a falsifiable head-to-head benchmark against Letta (v0.16.8 with pgvector). Both the deterministic rubric and the LLM judge scored Centri ahead overall (1.00 composite versus 0.93). The entire gap is attributable to stale-fact supersession handling, where Centri scored a perfect 1.00 versus Letta's 0.67.
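For readers unfamiliar with environment-driven configuration, the pattern mentioned above might look roughly like the following Python sketch. Only LITELLM_BASE_URL, the database path, and the two port numbers come from the project's description. The variable names CENTRI_DB_PATH, CENTRI_CORE_PORT, and CENTRI_WEB_PORT, the Settings class, and the example URL are hypothetical and are not documented Centri settings.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    db_path: Path
    core_port: int
    web_port: int
    llm_base_url: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from environment variables, using the article's defaults."""
    env = os.environ if env is None else env
    return Settings(
        # CENTRI_* names are hypothetical; only the default values are sourced.
        db_path=Path(os.path.expanduser(env.get("CENTRI_DB_PATH", "~/.centri/state.db"))),
        core_port=int(env.get("CENTRI_CORE_PORT", "8760")),
        web_port=int(env.get("CENTRI_WEB_PORT", "4096")),
        # BYOK: any OpenAI-compatible endpoint; None means it was not configured.
        llm_base_url=env.get("LITELLM_BASE_URL"),
    )


# Passing a dict instead of os.environ keeps the example reproducible.
settings = load_settings({"LITELLM_BASE_URL": "http://localhost:4000/v1"})
print(settings.core_port, settings.web_port, settings.llm_base_url)
```

Keeping every setting in the environment means both services can be pointed at the same database and provider from their unit files without editing code.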

Key Takeaways

  • Append-only event spine treats context windows as caches, not storage: derived memory can be thrown away and rebuilt from the log at any time
  • Bi-temporal supersession ensures fresh context while preserving full audit history
  • Deterministic curation with receipts means no LLM call is needed at read time to produce consistent briefs
  • In centri-bench, Centri outscores Letta on stale-fact handling (1.00 vs 0.67), which accounts for the whole composite gap (1.00 vs 0.93)

The Bottom Line

This is the kind of fundamental architecture rethink that the AI agent space needs. Instead of chasing larger context windows, Centri makes the case that you can have persistent memory without sacrificing determinism or auditability. Worth watching closely, or better yet, forking and stress-testing.