BerriAI just dropped LiteLLM Agent Platform, an open-source stack for running AI coding agents like Claude Code and Codex in fully isolated, self-hosted environments. The pitch is simple: give your agents the power to ship code without ever letting them peek at your actual API keys or credentials. It's MIT licensed and live on GitHub now.
The Vault Architecture That Makes This Work
The core innovation here is the credential vault system. When you spin up a sandbox pod in Kubernetes, its environment contains only stub credentials—something like GITHUB_TOKEN=stub_github_a8f1. Every time an agent makes an outbound TLS connection, the vault intercepts it and swaps those stubs for your real keys on the fly. The agent never sees what it's actually using. You get bypass-permissions-level agent capabilities without the 'oops I committed my production AWS keys' nightmare.
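The swap is easiest to picture as a string substitution at the egress point. Here's a minimal bash sketch of the idea—the VAULT table, the swap_stubs function, and the real-token value are all illustrative assumptions, not the platform's actual implementation; only the stub name comes from the article:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map stub credentials to real secrets at egress.
# Only the vault knows the real values; the sandbox env holds stubs.
declare -A VAULT=( ["stub_github_a8f1"]="ghp_example_real_token" )

swap_stubs() {
  # Replace any known stub appearing in an outbound header
  # with the real credential it stands in for.
  local header="$1" stub
  for stub in "${!VAULT[@]}"; do
    header="${header//"$stub"/${VAULT[$stub]}}"
  done
  printf '%s\n' "$header"
}

# Inside the sandbox, the agent only ever sees the stub:
swap_stubs "Authorization: Bearer stub_github_a8f1"
```

The point of the design is that the substitution happens outside the sandbox boundary, so even an agent running with full permissions inside the pod has nothing real to leak.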
Getting Started: Lap CLI and Local Dev
The platform ships with a terminal-first interface called lap (LiteLLM Agent Platform, naturally). Install it via npm from the repo's cli directory, symlink the binary onto your PATH, point it at your running instance, and you're off. Running lap run claude-code or lap run codex spins up a fresh Kubernetes pod, attaches your local terminal to its TTY over WebSocket, and drops you straight into an interactive agent session. Detach with Ctrl-D and the sandbox stays alive for 24 hours.
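Put together, the install-and-run flow looks roughly like this—the cli directory and the lap run commands come from the docs, but the exact symlink target is an assumption:

```shell
# Hypothetical setup walkthrough; symlink path is a guess.
cd cli
npm install
ln -s "$PWD/bin/lap" /usr/local/bin/lap

# Start an interactive sandboxed agent session:
lap run claude-code    # or: lap run codex
# Detach with Ctrl-D; the sandbox stays alive for 24 hours.
```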
Self-Hosting Options: Kind for Dev, EKS for Prod
For local development, bin/kind-up.sh provisions a kind cluster named agent-sbx, installs the agent-sandbox controller from kubernetes-sigs, and loads the harness image. Docker Compose bootstraps Postgres, runs schema migrations, and spins up the web UI on port 3000 plus a worker process. The recommended production path leans on AWS EKS for the sandbox cluster and Render for the web/worker tier—BerriAI provides an eks-up.sh script and a one-click Render Blueprint in the deploy directory.
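Condensed, local bootstrap is essentially two commands—the script name comes from the repo, while the compose invocation and its flags are assumptions about how the stack is wired:

```shell
# Provision the "agent-sbx" kind cluster, install the agent-sandbox
# controller from kubernetes-sigs, and load the harness image:
bin/kind-up.sh

# Bring up Postgres, run schema migrations, and start the web UI
# on port 3000 plus the worker (service layout is an assumption):
docker compose up -d
```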
Developer API for Programmatic Control
Power users can skip both CLI and web UI entirely. The platform exposes REST endpoints for creating agents, opening sessions, sending messages, and reading replies. Check docs/spawn-task-agent.md and src/server/DEVELOPER.md for the full API surface. This makes it trivial to integrate sandboxed coding agents into existing CI/CD pipelines or internal tooling without human-in-the-loop workflows.
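The session lifecycle maps naturally onto a handful of HTTP calls. A hedged sketch with curl—every route, field name, and the base URL below is an illustrative placeholder, not the documented surface; the real routes live in docs/spawn-task-agent.md and src/server/DEVELOPER.md:

```shell
# Assumption: the API is served alongside the web UI on port 3000.
BASE=http://localhost:3000/api

# Create an agent, open a session, send a message, poll for replies.
# All paths and JSON shapes here are hypothetical placeholders.
curl -s -X POST "$BASE/agents"   -d '{"type":"claude-code"}'
curl -s -X POST "$BASE/sessions" -d '{"agent_id":"<agent-id>"}'
curl -s -X POST "$BASE/sessions/<session-id>/messages" \
     -d '{"content":"Fix the failing test in src/utils.ts"}'
curl -s "$BASE/sessions/<session-id>/messages"
```

Because the loop is plain HTTP, wiring it into a CI job is just a matter of scripting these calls and polling until the agent's reply arrives.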
Key Takeaways
- Credential vault architecture means zero credential exposure to agents, even with bypass-level permissions
- Supports Claude Code, Codex, and any agent that runs in a terminal environment
- Kubernetes-native via the official kubernetes-sigs/agent-sandbox CRD
- Sessions persist for 24 hours after detach; otherwise pods are ephemeral by default
- MIT licensed with no commercial restrictions—self-host anywhere
The Bottom Line
This is exactly the kind of infrastructure the AI coding agent ecosystem has been missing. Giving agents real capabilities without credential exposure has always required bespoke solutions or dangerous workarounds. LiteLLM Agent Platform standardizes that pattern, and the MIT license means anyone's free to fork it, audit it, or extend it for enterprise use cases. If you're running AI coding agents in any security-conscious environment, this is worth evaluating seriously.