A new open-source tool called h5i (pronounced "high-five") gives developers a way to audit exactly what AI coding agents did inside their repos, and it has just crossed 440 GitHub stars after surfacing on Hacker News. The project targets teams using Claude, Codex, Cursor, and similar agents for code generation, offering sandboxed worktrees with built-in provenance tracking that keeps evidence in Git rather than in a vendor's cloud.

The Problem With Black-Box Agents

Traditional code review catches bugs after the fact, but AI-generated patches present a different challenge. When an agent edits files, runs shell commands, reads logs, and retries failed tests, Git only shows you the final diff—it has no record of what prompts produced the change, which attempts failed along the way, or what context the agent saw when making decisions. This opacity makes it hard to answer basic questions like "why should I trust this result?" h5i was built specifically to close that gap by treating AI-generated code not as a simple diff but as a full execution trace with provenance.

How It Works

The tool gives each agent its own sandboxed Git worktree, then records prompts, commands, logs, policies, and review trails alongside the actual changes. Evidence lives in the repo itself via Git refs, so teams don't need to sign up for a SaaS product or trust a third-party audit service. The approach covers single-agent workflows, where it acts as a safer workspace, and multi-agent scenarios, where different agents try competing approaches in isolated sandboxes before one auditable result is merged.
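
To make the worktree-plus-refs idea concrete, here is a minimal sketch built only on stock Git commands driven from Python. It does not use or reproduce h5i's own commands or storage layout, which this article does not describe. The branch name agent-a, the refs/agent-evidence/ namespace, and the record contents are all hypothetical, chosen purely for illustration.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def git(repo: Path, *args: str, stdin: Optional[str] = None) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        input=stdin, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        repo = base / "repo"
        repo.mkdir()
        git(repo, "init", "-q")
        git(repo, "-c", "user.name=demo", "-c", "user.email=demo@example.com",
            "commit", "-q", "--allow-empty", "-m", "initial commit")

        # One isolated worktree and branch per agent (hypothetical names).
        worktree = base / "agent-a"
        git(repo, "worktree", "add", "-b", "agent-a", str(worktree))

        # Store a session record as a Git blob, then point a custom ref at it.
        # The namespace below is an assumption, not h5i's actual layout.
        record = "prompt: fix the failing test\ncommand: pytest -q\n"
        blob = git(repo, "hash-object", "-w", "--stdin", stdin=record)
        ref = "refs/agent-evidence/agent-a/session-1"
        git(repo, "update-ref", ref, blob)

        return {
            "worktree_exists": (worktree / ".git").exists(),
            "stored": git(repo, "cat-file", "-p", ref),
            "refs": git(repo, "for-each-ref", "--format=%(refname)",
                        "refs/agent-evidence"),
        }


if __name__ == "__main__":
    print(demo())
```

Because the record is an ordinary Git object reachable from a ref, it travels with the repository and needs no external service. One caveat of this generic approach: refs outside the usual branch and tag namespaces are not pushed or fetched by default, so sharing them requires an explicit refspec.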

Key Takeaways

  • Prompt versioning with full replay capability
  • Persistent repo-local context and memory across sessions
  • Supervised sandboxed execution environments
  • Command and log capture, with up to 95% token reduction on noisy logs
  • Conflict-free multi-agent workflows with separate worktrees
  • Automated audit trails tied directly to AI-generated code
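
The log-reduction claim is easier to picture with a toy example. The sketch below shows one generic way a noisy log can shrink before it is handed to an agent: collapsing runs of identical consecutive lines. This is an assumption for illustration only; the article does not say how h5i achieves its reduction, and the function name collapse_repeats is hypothetical.

```python
from itertools import groupby


def collapse_repeats(log: str) -> str:
    """Collapse runs of identical consecutive lines into a single annotated line."""
    out = []
    for line, group in groupby(log.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)


if __name__ == "__main__":
    noisy = "start\n" + "retrying connection\n" * 50 + "done\n"
    compact = collapse_repeats(noisy)
    # Whitespace-separated words are only a crude stand-in for model tokens.
    print(len(noisy.split()), "->", len(compact.split()))
    print(compact)
```

How much a real log shrinks depends entirely on how repetitive it is, so the printed ratio here says nothing about the 95% figure quoted above.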

The Bottom Line

h5i is still early-stage, but it addresses a real gap in how teams adopt AI coding assistants. If you're running agents on production repos and can't answer "what actually happened?", you have an accountability problem whether you admit it or not. This project deserves attention from anyone serious about trustworthy AI-assisted development.