Weave Router Brings Smart Model Routing to Claude Code, Codex, and Cursor

Weave dropped a new tool on Hacker News Monday that could fundamentally change how developers manage their AI toolchains. The Weave Router is a drop-in proxy for Anthropic, OpenAI, and Gemini that uses an on-box embedder to automatically select the optimal model for each request—instead of relying on manual configuration or vibes-based prompting.

How It Works

The router analyzes incoming requests using a cluster scorer derived from Avengers-Pro research and routes them to whichever enabled provider best matches the task. Weave claims it scored 76.09 on RouterArena's Acc-Cost Arena, landing at #1 on the leaderboard. The system supports Anthropic Messages, OpenAI Chat Completions, and Gemini native APIs with full streaming, tools, and vision capabilities.

Supported Tools

Getting started takes about 30 seconds using a simple installer command that detects Claude Code, Codex, or opencode automatically. For self-hosted setups, a single make full-setup spins up Postgres plus the router on port 8080 with a dashboard at localhost:8080/ui/. Cursor support exists but is marked early beta—performance may not be optimal. The proxy also handles DeepSeek, Kimi, GLM, Qwen, Llama, and Mistral through OpenRouter or any OpenAI-compatible endpoint. Provider keys stay local and encrypted at rest under a BYOK model. Observability comes via OTLP traces to the Weave dashboard, Honeycomb, Datadog, or Grafana.

Technical Details

The router exposes standard endpoints matching each provider's format: POST /v1/messages for Anthropic, POST /v1/chat/completions for OpenAI, and POST /v1beta/models/:action for Gemini. A diagnostic endpoint at POST /v1/route lets you inspect routing decisions without proxying a call. The architecture uses two distinct key types—sk-or-... provider keys live in .env.local while rk_... router keys get sent as Bearer tokens from clients.

Roadmap

Planned features include token-aware rate limiting via Redis sliding windows, tenant hierarchies with sub-installations, and speculative dispatch plus hedging strategies for reducing tail latency. The team is actively working on the Cursor integration to improve its performance characteristics before full release.

Key Takeaways

Weave Router sits between your AI tools and providers, routing each request intelligently based on task requirements rather than hardcoded model selections
Installation requires just one command pointing Claude Code, Codex, or opencode at localhost:8080—no manual configuration files needed
OpenRouter support opens the door to dozens of OSS models while maintaining a single API endpoint for your entire stack
The system ranked #1 on RouterArena's Acc-Cost metric, suggesting real-world cost/quality tradeoffs are being handled effectively

The Bottom Line

This is exactly the kind of infrastructure that makes AI toolchains actually maintainable at scale. Rather than duct-taping together provider-specific configurations or paying for premium models when cheaper ones would suffice, a smart proxy handles all that cognitive overhead automatically. If you're running Claude Code, Codex, or opencode in production—or planning to—Weave Router is worth the 30-second experiment.

> Weave Router Brings Smart Model Routing to Claude Code, Codex, and Cursor