Autonomy Project Aims To Build a Controllable, Auditable AI Agent Framework From Scratch

What Is Autonomy?

Autonomy is a new AI agent framework that surfaced on Hacker News this week, pitched as a self-directed agent core built for real engineering environments. The project centers on AgentLoop, an autonomous planning and execution engine designed to run in both interactive and batch modes. According to the landing page, the system follows a predictable cycle: select skills, propose actions, rank candidates using Beam Search with width=3, execute through a governed gateway, then evaluate the outcome and learn from it. The creator is Bill Liu, who describes himself as an AI agent systems engineer focused on building reliable, fully auditable AI.
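
To make that cycle concrete, here is a minimal Python sketch of one pass through it. It is not Autonomy's code: the Candidate class, prune_to_beam, and run_turn are hypothetical names, and the sketch covers only the pruning step of a beam search (keeping the three best candidates) rather than a full multi-step search.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

BEAM_WIDTH = 3  # width quoted on the landing page


@dataclass
class Candidate:
    action: str   # hypothetical: a tool call rendered as text
    score: float  # produced by whatever ranking formula is in use


def prune_to_beam(candidates: List[Candidate], width: int = BEAM_WIDTH) -> List[Candidate]:
    """Keep the `width` highest-scoring candidates, best first."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:width]


def run_turn(
    propose: Callable[[], List[Candidate]],
    execute: Callable[[str], str],
    evaluate: Callable[[Candidate, str], bool],
) -> Optional[bool]:
    """One pass through the cycle: propose -> rank -> execute -> evaluate."""
    beam = prune_to_beam(propose())
    if not beam:
        return None  # nothing was proposed this turn
    best = beam[0]
    result = execute(best.action)
    return evaluate(best, result)
```

The propose, execute, and evaluate callables stand in for the model, the governed gateway, and the outcome evaluator; how Autonomy actually wires those together is not described on the landing page.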

How the Scoring Works

The most interesting technical detail in Autonomy's architecture is its five-dimensional candidate ranking system. Each candidate action is scored with a formula that weights evidence_strength at +0.30 and purpose alignment at +0.10, while penalizing risk (HIGH risk costs −0.35), side effects (−0.20), and repeated or unavailable tool calls (−1.0). The agent is therefore explicitly designed to prefer actions that it can justify and that match the stated intent, and to avoid risky moves, side effects, and re-running tools that have already run. An ApprovalPolicy layer intercepts high-risk commands before they reach ActionGateway, adding a human-readable governance checkpoint that most hobbyist agents skip entirely.
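
The weights above can be turned into a small worked example. This sketch assumes the five terms are simply added together and that evidence_strength and purpose alignment are values between 0 and 1; the landing page gives the weights but not the combination rule, so treat ActionCandidate, score, and needs_approval as illustrative names rather than the project's API.

```python
from dataclasses import dataclass


@dataclass
class ActionCandidate:
    evidence_strength: float        # assumed to lie in [0, 1]
    purpose_alignment: float        # assumed to lie in [0, 1]
    high_risk: bool = False
    has_side_effects: bool = False
    repeated_or_unavailable: bool = False


def score(c: ActionCandidate) -> float:
    """Sum the published weights and penalties (additive rule is an assumption)."""
    total = 0.30 * c.evidence_strength + 0.10 * c.purpose_alignment
    if c.high_risk:
        total -= 0.35
    if c.has_side_effects:
        total -= 0.20
    if c.repeated_or_unavailable:
        total -= 1.0
    return total


def needs_approval(c: ActionCandidate) -> bool:
    """Stand-in for ApprovalPolicy: hold high-risk actions for review."""
    return c.high_risk
```

Under these assumptions a fully justified, safe action tops out at 0.40, the same action flagged HIGH risk drops to 0.05, and a repeated tool call falls to −0.60, which is why a repeat can never outrank a fresh candidate.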

Built-In Skills and Extensible Toolsets

The framework ships with 13 built-in ProcedureSkills covering tasks like api-debugging, browser-navigation, code-editing, systematic-debugging, test-driven-development, and writing-plans. They are loaded dynamically: a model.select_procedure_skills() call on each turn filters them by available tool names and target platform. On the toolsets side, there are four core catalogs: file operations (read/write/patch/diff/outline/symbol_search), terminal execution (shell.execute/process.start/poll/wait/stop), search (text + filesystem), and project utilities (git status/diff/log plus JSON/YAML parsing). Optional add-ons include a Playwright-based browser toolkit and persistent workspace memory with context injection, with integrations for delegate/cronjob/computer_use planned. This is the kind of tooling that turns the project from a demo into something engineers could actually deploy.
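
The per-turn filtering can be pictured as a simple eligibility check. The sketch below is an assumption about how such a filter might look, not the body of model.select_procedure_skills(): the ProcedureSkill fields, the eligible_skills helper, and the browser.goto tool name are all hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, List


@dataclass(frozen=True)
class ProcedureSkill:
    name: str
    required_tools: FrozenSet[str]   # tool names the skill depends on
    platforms: FrozenSet[str]        # empty set means "any platform"


def eligible_skills(
    skills: List[ProcedureSkill],
    available_tools: AbstractSet[str],
    platform: str,
) -> List[ProcedureSkill]:
    """Keep skills whose tools are all available and whose platform matches."""
    return [
        s for s in skills
        if s.required_tools <= available_tools
        and (not s.platforms or platform in s.platforms)
    ]


SKILLS = [
    ProcedureSkill("systematic-debugging", frozenset({"shell.execute"}), frozenset()),
    # "browser.goto" is a made-up tool name used only for illustration.
    ProcedureSkill("browser-navigation", frozenset({"browser.goto"}), frozenset()),
    ProcedureSkill("code-editing", frozenset({"shell.execute"}), frozenset({"linux"})),
]
```

With only shell.execute available on a macOS host, this filter would keep systematic-debugging and drop the other two, one for a missing tool and one for a platform mismatch.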

AI Provider Support

Autonomy is deliberately provider-agnostic. The core supports nine providers, including Ollama for local deployments at localhost:11434/v1, OpenAI via API key, NVIDIA's integrate.api.nvidia.com endpoint (with kimi-k2.6), OpenRouter, xAI, Kimi, and Alibaba's DashScope API. This multi-provider approach means developers aren't locked into a single model backend, which is critical for teams running cost-sensitive workloads or needing on-premise inference for compliance reasons.
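
Because these backends expose OpenAI-compatible endpoints, switching providers can come down to a base URL and an optional key. The registry below is a hedged sketch covering just two of the providers named above; the Provider class, the resolve helper, the NVIDIA_API_KEY variable name, and the exact URL schemes and /v1 paths are assumptions, not Autonomy's configuration format.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Provider:
    base_url: str
    api_key_env: Optional[str]  # None when the backend needs no key


# Hypothetical registry; only hosts mentioned in the article are listed.
PROVIDERS: Dict[str, Provider] = {
    "ollama": Provider("http://localhost:11434/v1", None),
    "nvidia": Provider("https://integrate.api.nvidia.com/v1", "NVIDIA_API_KEY"),
}


def resolve(name: str, env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Return connection settings for a provider, reading the key from `env`."""
    provider = PROVIDERS[name]
    api_key = None
    if provider.api_key_env is not None:
        api_key = env.get(provider.api_key_env)
        if not api_key:
            raise KeyError(f"missing {provider.api_key_env} for provider {name!r}")
    return {"base_url": provider.base_url, "api_key": api_key}
```

Passing os.environ as env would pull keys from the process environment; the local Ollama entry resolves without any key, which is the on-premise case the article points to.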

The Learning Loop and Event Sourcing

After each successful run that meets an outcome evaluation threshold, LearningLoop auto-drafts a new skill and generates a LearningProposal flagged as CANDIDATE status. A background CuratorDaemon thread then automatically merges duplicate skills to prevent the procedure library from ballooning into chaos. Every execution is recorded via full event sourcing — runs, events, recipes, skills, proposals, and curator logs are all persisted in AutonomyStore. This means you get complete replayability out of the box, which is exactly what you'd want when debugging why an agent took a weird action at 3 AM.
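
For readers new to event sourcing, the idea is that state is never stored directly; it is rebuilt by replaying an append-only log. The sketch below illustrates that pattern in memory. It says nothing about AutonomyStore's real schema: the Event and EventLog classes, the rebuild_state helper, and the event kinds are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Event:
    run_id: str
    seq: int                 # position of the event within its run
    kind: str                # hypothetical kinds, e.g. "action_executed"
    payload: Dict[str, Any]


@dataclass
class EventLog:
    _events: List[Event] = field(default_factory=list)

    def append(self, run_id: str, kind: str, payload: Dict[str, Any]) -> Event:
        """Record an event; existing entries are never modified."""
        seq = sum(1 for e in self._events if e.run_id == run_id)
        event = Event(run_id, seq, kind, dict(payload))
        self._events.append(event)
        return event

    def replay(self, run_id: str) -> List[Event]:
        """Return one run's events in the order they were recorded."""
        return [e for e in self._events if e.run_id == run_id]


def rebuild_state(events: List[Event]) -> Dict[str, Any]:
    """Fold a run's events into a summary of what the agent did."""
    state: Dict[str, Any] = {"actions": [], "outcome": None}
    for e in events:
        if e.kind == "action_executed":
            state["actions"].append(e.payload["tool"])
        elif e.kind == "outcome_evaluated":
            state["outcome"] = e.payload["passed"]
    return state
```

Replaying a single run_id through rebuild_state is the 3 AM debugging workflow in miniature: the log shows which tools ran, in what order, and how the outcome was judged.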

Key Takeaways

AgentLoop uses Beam Search (width=3) with a 5-dim scoring formula that penalizes risk and repeated actions heavily
ApprovalPolicy intercepts high-risk tool calls before they reach ActionGateway
13 built-in ProcedureSkills are dynamically loaded per turn based on context, not hardcoded
Multi-provider support spans Ollama plus eight OpenAI-compatible endpoints for flexibility
Full event sourcing with complete replayability ships out of the box — no tracing library needed

The Bottom Line

This is exactly the kind of project that gets ignored by mainstream tech press but ends up being foundational. Autonomy isn't trying to wow anyone with a flashy demo — it's tackling the hard engineering problem of making AI agents controllable, auditable, and safe to run in production environments. If you're building anything serious with autonomous agents today and your current stack feels like duct-tape-and-prayer, Bill Liu's work is worth forking.
