June 26, 2026 | Tool | Infrastructure | Coding

Workweave Router Picks Your Model for You, on Every Single Request

Here's a problem everyone running coding agents has and pretends they don't: you send every request to the most expensive model because choosing is annoying. Workweave just open-sourced Router to kill that habit. It's a drop-in proxy that picks the best model for each request using a tiny on-box embedder, not a vibes-based system prompt, and makes that routing decision across Anthropic, OpenAI and Gemini in under 50 milliseconds.

The claim is a 40 to 70 percent cost cut, and the method is the interesting part. Instead of asking a big router model to decide, it uses a cluster-scoring approach from research called Avengers-Pro: score the incoming request, match it to the cluster of models that handles that kind of work well, and send it there. Cheap requests go to cheap models, hard ones go to the heavy hitters, and you stop paying flagship prices to reformat JSON. It speaks the Anthropic Messages API, OpenAI Chat Completions and Gemini's native API, with streaming, tool use and vision intact, so Claude Code, Codex and Cursor just work.
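To make the cluster-scoring idea concrete, here is a minimal Python sketch. It is not Router's code and not the Avengers-Pro implementation. The cluster names, the 3-dimensional "embeddings", the quality and cost numbers, the model tiers and the `alpha` trade-off knob are all invented for illustration. It assumes the request has already been embedded, picks the nearest cluster by cosine similarity, then scores each model by blending its quality on that cluster with how cheap it is.

```python
import math

# Hypothetical cluster centroids in a toy 3-d embedding space.
# A real router would learn these from embedded historical requests.
CENTROIDS = {
    "formatting": (1.0, 0.0, 0.0),
    "debugging": (0.0, 1.0, 0.0),
    "architecture": (0.0, 0.0, 1.0),
}

# Hypothetical quality (0..1) of each model tier on each cluster.
QUALITY = {
    "formatting": {"small": 0.95, "medium": 0.96, "flagship": 0.97},
    "debugging": {"small": 0.55, "medium": 0.80, "flagship": 0.90},
    "architecture": {"small": 0.30, "medium": 0.55, "flagship": 0.92},
}

# Hypothetical cost of each tier, normalized so the flagship costs 1.0.
COST = {"small": 0.05, "medium": 0.30, "flagship": 1.00}


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(embedding, alpha=0.7):
    """Return the model tier with the best quality/cost score.

    alpha=1.0 optimizes purely for quality, alpha=0.0 purely for cost.
    """
    # Step 1: find the cluster whose centroid is closest to the request.
    cluster = max(CENTROIDS, key=lambda c: cosine(embedding, CENTROIDS[c]))
    # Step 2: score every model on that cluster and keep the best one.
    scores = {
        model: alpha * QUALITY[cluster][model] + (1 - alpha) * (1 - COST[model])
        for model in COST
    }
    return max(scores, key=scores.get)


print(route((0.9, 0.1, 0.0)))  # formatting-like request → small
print(route((0.2, 0.9, 0.1)))  # debugging-like request → medium
print(route((0.1, 0.2, 0.9)))  # architecture-like request → flagship
```

With these made-up numbers, the JSON-reformatting style of request lands on the cheapest tier because the quality gap is tiny, while the architecture-style request still reaches the flagship because the quality gap outweighs the price. Lowering `alpha` pushes more traffic toward cheap models.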

The detail that matters for adoption: your provider keys stay on your box, encrypted at rest, and OpenTelemetry traces are built in. This isn't a SaaS middleman skimming your tokens; it's a local gateway. It is written in Go and licensed under ELv2 (the Elastic License 2.0, which is source-available rather than OSI-approved open source), and it is built by the Workweave team, whose engineering-intelligence platform already runs at Robinhood and PostHog.

The bigger picture is that model choice is becoming an optimization problem, not a matter of brand loyalty. A year ago you picked Claude or GPT and stuck with it. Now the smart money runs a portfolio and lets a 50-millisecond classifier arbitrage the price-performance gap on every call. The model isn't the product anymore. The routing layer is.

Link: https://github.com/workweave/router