Distill Agent Forces AI to Prove Work Before Calling It Done

Every developer who's handed a task to an AI agent knows the feeling: you ask it to set up a database, deploy your app, or write a test suite, and it comes back with a confident "done" before you've seen a single line of output. Distill Agent flips that script. Before the framework reports a task as finished, it must produce concrete evidence: a file written to disk, a service listening on its port, a test passing in the terminal. No proof, no finish line.

Evidence-Gated Execution: The Core Innovation

Distill's task contract system is simple in concept but strict in practice: before the agent starts working, you declare what counts as completion. A file must exist at /app/config.yaml. The PostgreSQL container must be responding on port 5432. The migration script must exit with code zero. Only when these conditions are met does the ReAct loop consider the task finished. This isn't best-effort reasoning; it's a check against the actual state of the system, so the agent's own claim of success is never enough.

The architecture runs through a FastAPI gateway with WebSocket streaming, which gives real-time visibility into every tool execution. The agent's reasoning and tool output stream to your terminal or control panel as they happen, so you're never staring at a blank prompt wondering what's going on inside the black box. Behind the scenes, a FIFO queue per user session keeps messages strictly ordered even under high concurrency.
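To make the completion contract concrete, here is a minimal sketch of how the three example conditions (file exists, port responds, command exits zero) could be expressed as checks. This is not Distill's actual API: the names Check, file_exists, port_open, exits_zero, and unmet are hypothetical, chosen only for illustration.

```python
import socket
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Check:
    """One completion condition: a label plus a predicate that inspects real state."""
    label: str
    passed: Callable[[], bool]


def file_exists(path: str) -> Check:
    # Evidence: a regular file is present at the given path.
    return Check(f"file exists: {path}", lambda: Path(path).is_file())


def port_open(host: str, port: int, timeout: float = 1.0) -> Check:
    # Evidence: something accepts TCP connections on host:port.
    def probe() -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return Check(f"port open: {host}:{port}", probe)


def exits_zero(argv: List[str]) -> Check:
    # Evidence: the command runs and returns exit code 0.
    def run() -> bool:
        try:
            return subprocess.run(argv, capture_output=True).returncode == 0
        except OSError:
            return False
    return Check("exit 0: " + " ".join(argv), run)


def unmet(contract: List[Check]) -> List[str]:
    """Return labels of checks that still fail; an empty list means the task may finish."""
    return [check.label for check in contract if not check.passed()]


if __name__ == "__main__":
    # Hypothetical contract mirroring the examples in the text.
    contract = [
        file_exists("/app/config.yaml"),
        port_open("127.0.0.1", 5432),
        exits_zero([sys.executable, "-c", "pass"]),
    ]
    print(unmet(contract))
```

In a loop like the one described, the agent would keep working (or report failure) while unmet(contract) is non-empty, rather than trusting its own summary.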

Skill Distillation: When Good Executions Become Reusable Tools

Here's where things get interesting. After successfully completing a complex multi-step task, Distill doesn't just log the success; its built-in skill synthesizer distills what worked into a parameterized Python tool. These aren't static scripts you write once. Each synthesized skill is versioned automatically, and if a newer version starts regressing (fewer tasks succeeding, more errors), Distill rolls it back to the previous known-good version without human intervention. The evaluator runs trajectory analysis on every skill invocation and tracks success rates over time. When the regression crosses a threshold, the rollback triggers automatically. This is version control for AI-generated tooling, something the ecosystem badly needs as we offload more grunt work to autonomous agents.
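The rollback rule can be sketched in a few lines. This is an illustration under stated assumptions, not Distill's implementation: the Skill and SkillVersion classes, the min_runs and max_drop parameters, and the comparison of plain success rates are all hypothetical stand-ins for whatever the evaluator really measures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SkillVersion:
    version: int
    source: str  # the synthesized tool's source code
    outcomes: List[bool] = field(default_factory=list)

    def success_rate(self) -> float:
        # Only called when at least one outcome has been recorded.
        return sum(self.outcomes) / len(self.outcomes)


class Skill:
    """Tracks versions of one skill and rolls back when the active one regresses."""

    def __init__(self, name: str, min_runs: int = 5, max_drop: float = 0.2) -> None:
        self.name = name
        self.min_runs = min_runs  # assumed: runs required before judging a version
        self.max_drop = max_drop  # assumed: tolerated drop in success rate
        self.versions: List[SkillVersion] = []
        self.active = -1  # index into self.versions

    @property
    def active_version(self) -> SkillVersion:
        return self.versions[self.active]

    def publish(self, source: str) -> int:
        """Add a new version and make it active; returns its version number."""
        self.versions.append(SkillVersion(len(self.versions) + 1, source))
        self.active = len(self.versions) - 1
        return self.active_version.version

    def record(self, success: bool) -> bool:
        """Record one invocation outcome; return True if a rollback happened."""
        current = self.active_version
        current.outcomes.append(success)
        if self.active == 0 or len(current.outcomes) < self.min_runs:
            return False  # nothing to roll back to, or too little data yet
        previous = self.versions[self.active - 1]
        if not previous.outcomes:
            return False  # no baseline to compare against
        if previous.success_rate() - current.success_rate() > self.max_drop:
            self.active -= 1  # revert to the previous known-good version
            return True
        return False
```

With these defaults, a version that follows a fully successful predecessor and then fails five times in a row is rolled back on the fifth recorded failure.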

Communication Channels: Your Agent, Your Platform

Distill doesn't lock you into a web UI or proprietary dashboard. It reaches you wherever you already live: Telegram, Discord, Slack, email, or the terminal. Each adapter streams real-time typing indicators and tool execution logs as JSON events, so you're always current on what your agent is doing—no polling, no refresh buttons. The control panel itself is a React + Tailwind chat interface with live token streaming, giving you that modern LLM chatbot feel while maintaining full visibility into the underlying execution. For terminal purists, there's a dedicated TUI mode that strips away the GUI entirely. All adapters are opt-in via environment tokens and degrade gracefully when not configured.
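As a rough illustration of "opt-in via environment tokens", the sketch below enables an adapter only when its token variable is set and non-empty. The variable names are hypothetical placeholders, not the ones Distill actually reads.

```python
import os
from typing import Dict, List, Mapping

# Hypothetical variable names; the real configuration keys may differ.
ADAPTER_TOKENS: Dict[str, str] = {
    "telegram": "TELEGRAM_BOT_TOKEN",
    "discord": "DISCORD_BOT_TOKEN",
    "slack": "SLACK_BOT_TOKEN",
}


def enabled_adapters(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return adapters whose token is present; the rest are skipped silently."""
    return [
        name
        for name, variable in ADAPTER_TOKENS.items()
        if env.get(variable, "").strip()
    ]
```

Calling enabled_adapters() with no tokens set returns an empty list, which is the graceful-degradation case: no adapter starts, and nothing fails.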

Deployment Flexibility: From Laptop to Cloud

Getting Distill running is refreshingly painless for a self-hosted agent framework. The bootstrap scripts handle prerequisites automatically (Node, Python, and Docker if needed), so a fresh machine needs no manual prerequisite installation. For cloud deployment, one-click options exist for Render, Railway, and Fly.io through their respective configuration files. On the execution side, Distill supports three sandbox modes: local shell (the default), Docker containers with full isolation, or serverless endpoints via an HTTP exec shim that can front providers like Daytona, E2B, or Modal. This means you can prototype locally on your laptop, then move execution to cloud sandboxes without rewriting your tasks, because the shim abstracts provider differences behind one consistent contract.
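The "one consistent contract" idea can be sketched as a single run interface with interchangeable backends. Everything below is an assumption for illustration: the Sandbox protocol, the ExecResult fields, the make_sandbox factory, and especially the JSON request and response shapes of the HTTP shim are not taken from Distill. The Docker mode is omitted to keep the sketch short.

```python
import json
import subprocess
import urllib.request
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ExecResult:
    exit_code: int
    stdout: str
    stderr: str


class Sandbox(Protocol):
    def run(self, command: str) -> ExecResult: ...


class LocalShell:
    """Runs the command in the host shell."""

    def run(self, command: str) -> ExecResult:
        proc = subprocess.run(command, shell=True, capture_output=True, text=True)
        return ExecResult(proc.returncode, proc.stdout, proc.stderr)


class HttpExecShim:
    """Posts the command to a remote exec endpoint.

    The request body and the response fields are assumed shapes, not a documented API.
    """

    def __init__(self, url: str, timeout: float = 30.0) -> None:
        self.url = url
        self.timeout = timeout

    def run(self, command: str) -> ExecResult:
        body = json.dumps({"command": command}).encode("utf-8")
        request = urllib.request.Request(
            self.url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=self.timeout) as response:
            payload = json.load(response)
        return ExecResult(payload["exit_code"], payload["stdout"], payload["stderr"])


def make_sandbox(mode: str, url: str = "") -> Sandbox:
    """Pick a backend by name; callers only ever see the run() contract."""
    if mode == "local":
        return LocalShell()
    if mode == "http":
        return HttpExecShim(url)
    raise ValueError(f"unknown sandbox mode: {mode}")
```

Because callers depend only on run(), switching from make_sandbox("local") to make_sandbox("http", url) changes where a command executes without changing the code that issues it.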

Key Takeaways

Evidence gating forces accountability: no more "I'll do it now" promises from AI agents
Skill distillation creates reusable tools from successful trajectories with auto-rollback on regression
Hybrid memory combines SQLite, ChromaDB semantic search, and Neo4j graph relationships for cross-session context
Universal sandboxing supports local, Docker, or serverless execution through a single HTTP interface
MIT licensed open source at github.com/Aspct3434/Distill-Agent with stable core and experimental features clearly marked

The Bottom Line

Distill Agent addresses one of the most annoying failure modes in AI-assisted development: agents that claim completion while leaving your project in a half-baked state. By making evidence mandatory rather than aspirational, it brings accountability to autonomous execution in a way that feels long overdue. If you're running AI agents in production—or even just experimenting—this is the kind of rigor the ecosystem needs.
