Weave dropped a new tool on Hacker News Monday that could fundamentally change how developers manage their AI toolchains. The Weave Router is a drop-in proxy for Anthropic, OpenAI, and Gemini that uses an on-box embedder to automatically select the optimal model for each request—instead of relying on manual configuration or vibes-based prompting.
How It Works
The router analyzes incoming requests using a cluster scorer derived from Avengers-Pro research and routes them to whichever enabled provider best matches the task. Weave claims it scored 76.09 on RouterArena's Acc-Cost Arena, landing at #1 on the leaderboard. The system supports Anthropic Messages, OpenAI Chat Completions, and Gemini native APIs with full streaming, tools, and vision capabilities.
Supported Tools
Getting started takes about 30 seconds using a simple installer command that detects Claude Code, Codex, or opencode automatically. For self-hosted setups, a single make full-setup spins up Postgres plus the router on port 8080 with a dashboard at localhost:8080/ui/. Cursor support exists but is marked early beta—performance may not be optimal. The proxy also handles DeepSeek, Kimi, GLM, Qwen, Llama, and Mistral through OpenRouter or any OpenAI-compatible endpoint. Provider keys stay local and encrypted at rest under a BYOK model. Observability comes via OTLP traces to the Weave dashboard, Honeycomb, Datadog, or Grafana.
Technical Details
The router exposes standard endpoints matching each provider's format: POST /v1/messages for Anthropic, POST /v1/chat/completions for OpenAI, and POST /v1beta/models/:action for Gemini. A diagnostic endpoint at POST /v1/route lets you inspect routing decisions without proxying a call. The architecture uses two distinct key types—sk-or-... provider keys live in .env.local while rk_... router keys get sent as Bearer tokens from clients.
Roadmap
Planned features include token-aware rate limiting via Redis sliding windows, tenant hierarchies with sub-installations, and speculative dispatch plus hedging strategies for reducing tail latency. The team is actively working on the Cursor integration to improve its performance characteristics before full release.
Key Takeaways
- Weave Router sits between your AI tools and providers, routing each request intelligently based on task requirements rather than hardcoded model selections
- Installation requires just one command pointing Claude Code, Codex, or opencode at localhost:8080—no manual configuration files needed
- OpenRouter support opens the door to dozens of OSS models while maintaining a single API endpoint for your entire stack
- The system ranked #1 on RouterArena's Acc-Cost metric, suggesting real-world cost/quality tradeoffs are being handled effectively
The Bottom Line
This is exactly the kind of infrastructure that makes AI toolchains actually maintainable at scale. Rather than duct-taping together provider-specific configurations or paying for premium models when cheaper ones would suffice, a smart proxy handles all that cognitive overhead automatically. If you're running Claude Code, Codex, or opencode in production—or planning to—Weave Router is worth the 30-second experiment.