Multi-Model Routing

Status: Testing
Stack: OpenRouter, Claude API, GPT-4o

Intelligent routing layer that picks the right model for each task — optimising cost and latency per request.

A middleware layer that sits between the application and the AI providers. When a request comes in, the router classifies the task type (reasoning, vision, simple extraction, creative writing) and routes it to the best-suited model: Claude Sonnet for complex reasoning, Haiku for simple classification, GPT-4o for image analysis. The router itself uses Haiku to classify, which adds only 50 ms of overhead.
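The classify-then-route step described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `ROUTES` table, the `route` function, the fallback choice, and the model labels are all hypothetical names (the labels are placeholders, not real API model IDs). The source does not say which model handles creative writing, so Sonnet is assumed. The classifier is passed in as a plain function so the sketch runs without any provider SDK; in practice it would wrap the Haiku classification call.

```python
from typing import Callable

# Hypothetical routing table keyed by the task types named in the text.
# Labels are placeholders, not real API model IDs.
# Assumption: creative writing goes to Sonnet (the source does not say).
ROUTES = {
    "reasoning": "sonnet",
    "vision": "gpt-4o",
    "simple_extraction": "haiku",
    "creative_writing": "sonnet",
}

# Assumption: unrecognised labels fall back to the stronger model.
FALLBACK_MODEL = "sonnet"


def route(request_text: str, classify: Callable[[str], str]) -> str:
    """Return the model label for a request.

    `classify` stands in for the cheap classification call; it takes the
    request text and returns one of the task labels in ROUTES.
    """
    label = classify(request_text).strip().lower()
    return ROUTES.get(label, FALLBACK_MODEL)


def keyword_classifier(text: str) -> str:
    """Toy stand-in for the model-based classifier, for demonstration only."""
    lowered = text.lower()
    if "image" in lowered or "screenshot" in lowered:
        return "vision"
    if "extract" in lowered:
        return "simple_extraction"
    if "write" in lowered:
        return "creative_writing"
    return "reasoning"


if __name__ == "__main__":
    for prompt in (
        "Extract the meta tags from this HTML",
        "Describe what is wrong in this screenshot",
        "Compare these two pricing plans for a growing team",
    ):
        print(prompt, "->", route(prompt, keyword_classifier))
```

Keeping the routing table separate from the classifier means the mapping can be tuned (for example after quality evaluation) without touching the classification prompt.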


Testing covers 500 real requests from the BRVO site audit tool. Current results: routing reduces average cost per request by 41% compared to sending everything to Sonnet, with only a 3% drop in output quality (measured by human evaluation). The biggest win is on simple tasks: extracting meta tags from HTML costs £0.001 with Haiku versus £0.008 with Sonnet, with the same accuracy.

The router is being integrated into all BRVO AI features to reduce client operating costs. When a chatbot answers a simple FAQ, it uses Haiku. When it needs to reason about a complex customer problem, it routes to Sonnet. Clients get better AI at a lower monthly cost.

High-traffic chatbot

A chatbot handling 10,000 messages per day. Simple greetings and FAQs go to Haiku (£0.001/msg). Complex product comparisons go to Sonnet (£0.008/msg). Monthly cost drops from £2,400 to £1,400 with no quality loss on complex queries.
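The figures above can be checked with a little arithmetic. The sketch below assumes a 30-day month and ignores the cost of the classification call itself; neither is stated in the text, and the function names are illustrative. Under those assumptions, £2,400 corresponds to sending every message to Sonnet, and £1,400 implies that a little under half of the messages (about 48%) are handled by Haiku.

```python
MESSAGES_PER_DAY = 10_000
DAYS_PER_MONTH = 30      # assumption: 30-day month
HAIKU_COST = 0.001       # £ per message, from the text
SONNET_COST = 0.008      # £ per message, from the text


def monthly_cost(haiku_share: float) -> float:
    """Monthly cost in £ when `haiku_share` (0..1) of messages go to Haiku."""
    messages = MESSAGES_PER_DAY * DAYS_PER_MONTH
    per_message = haiku_share * HAIKU_COST + (1 - haiku_share) * SONNET_COST
    return messages * per_message


def implied_haiku_share(target_monthly_cost: float) -> float:
    """Share of messages that must go to Haiku to hit a target monthly cost."""
    messages = MESSAGES_PER_DAY * DAYS_PER_MONTH
    all_sonnet = messages * SONNET_COST
    saving_per_rerouted_message = SONNET_COST - HAIKU_COST
    return (all_sonnet - target_monthly_cost) / (messages * saving_per_rerouted_message)


if __name__ == "__main__":
    print(f"All Sonnet: £{monthly_cost(0.0):,.0f} per month")
    share = implied_haiku_share(1_400)
    print(f"Haiku share implied by £1,400: {share:.1%}")
```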

Document processing pipeline

An invoicing system processing 500 PDFs daily. Simple field extraction (date, amount, vendor) uses Haiku. Anomaly detection and fraud flagging routes to Sonnet. 60% cost reduction.

Content moderation

A platform moderating user-generated content. Obviously safe content passes through Haiku instantly. Edge cases route to Sonnet for nuanced judgement. Faster moderation, lower cost, same safety.
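One way to implement this kind of escalation is a confidence-gated cascade, sketched below. The text does not say how "obviously safe" is decided, so the confidence score, the 0.9 threshold, and the names `moderate`, `cheap_check`, and `strong_check` are all assumptions made for illustration. The two checkers are passed in as plain functions standing in for the Haiku and Sonnet calls.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verdict:
    safe: bool
    decided_by: str  # placeholder label of the model that made the decision


def moderate(
    text: str,
    cheap_check: Callable[[str], Tuple[bool, float]],
    strong_check: Callable[[str], bool],
    threshold: float = 0.9,  # assumption: tune against labelled data
) -> Verdict:
    """Pass confidently safe content on the cheap check; escalate the rest.

    `cheap_check` returns (is_safe, confidence in 0..1).
    `strong_check` returns is_safe and is only called for escalated content.
    """
    is_safe, confidence = cheap_check(text)
    if is_safe and confidence >= threshold:
        return Verdict(safe=True, decided_by="haiku")
    # Anything not confidently safe is treated as an edge case.
    return Verdict(safe=strong_check(text), decided_by="sonnet")


if __name__ == "__main__":
    # Stub checkers for demonstration; real ones would call the models.
    def cheap(text: str) -> Tuple[bool, float]:
        return (True, 0.99) if "hello" in text.lower() else (True, 0.6)

    def strong(text: str) -> bool:
        return "scam" not in text.lower()

    print(moderate("Hello everyone!", cheap, strong))
    print(moderate("Click here for a totally legit offer", cheap, strong))
```

Escalating everything that is not confidently safe, including content the cheap check flags as unsafe, keeps the final call on borderline items with the stronger model, at the price of more escalations when the threshold is high.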
