Model System Overview
How ArgentOS routes tasks to the right model tier for optimal cost and performance.
Overview
ArgentOS uses a model router to automatically select a model for each task. Simple queries go to fast, cheap models; complex reasoning goes to more powerful, more expensive ones. The aim is to cut cost without lowering quality on the tasks that need a stronger model.
The Tier System
| Tier | Score Range | Default Model | Cost | Use Case |
|---|---|---|---|---|
| LOCAL | < 0.3 | Qwen3 30B-A3B via Ollama | Free | Simple lookups, quick replies |
| FAST | 0.3 - 0.5 | Claude Haiku | Low | Straightforward questions, memory recall |
| BALANCED | 0.5 - 0.8 | Claude Sonnet | Medium | Most conversations, tool use |
| POWERFUL | > 0.8 | Claude Opus | High | Complex reasoning, multi-step planning |
The model router scores each incoming message on a 0-1 complexity scale and routes it to the corresponding tier.
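As a rough illustration of the score-to-tier mapping in the table, here is a minimal sketch. The names `Tier` and `select_tier` are hypothetical and not taken from ArgentOS. The table leaves the shared boundaries ambiguous, so the sketch assumes 0.3 and 0.5 belong to the higher tier and 0.8 stays in BALANCED (since POWERFUL is listed as `> 0.8`). The real router may draw these lines differently.

```python
from enum import Enum


class Tier(Enum):
    LOCAL = "LOCAL"
    FAST = "FAST"
    BALANCED = "BALANCED"
    POWERFUL = "POWERFUL"


def select_tier(score: float) -> Tier:
    """Map a 0-1 complexity score to a tier (hypothetical helper)."""
    # Reject out-of-range values (this comparison also rejects NaN).
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 0.3:
        return Tier.LOCAL
    # Assumption: 0.5 itself is treated as BALANCED, not FAST.
    if score < 0.5:
        return Tier.FAST
    # Assumption: 0.8 itself is BALANCED, because POWERFUL is "> 0.8".
    if score <= 0.8:
        return Tier.BALANCED
    return Tier.POWERFUL
```

For example, `select_tier(0.42)` returns `Tier.FAST`, and `select_tier(0.9)` returns `Tier.POWERFUL`.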
Key Features
- Automatic routing: No manual model selection needed
- Cost optimization: Simple tasks use cheap models, saving budget for complex work
- Provider diversity: Mix local models, Anthropic, MiniMax, Z.AI, and OpenRouter
- Failover: Automatic fallback when a provider is unavailable
- Auth profiles: Multiple API keys with rotation and cooldown management
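The failover and cooldown behavior listed above can be sketched roughly as follows. This is an illustrative model only: `AuthProfile`, `call_with_failover`, the `send` callback, and the 60-second default cooldown are all assumptions made for the example, not ArgentOS APIs or defaults.

```python
import time
from dataclasses import dataclass


@dataclass
class AuthProfile:
    """Hypothetical stand-in for one API key / provider credential."""
    name: str
    # Time (on the supplied clock) until which the profile is skipped.
    cooldown_until: float = float("-inf")


class AllProfilesUnavailable(Exception):
    """Raised when every profile is cooling down or has just failed."""


def call_with_failover(profiles, send, cooldown_seconds=60.0, now=time.monotonic):
    """Try each profile in order, skipping any that are cooling down.

    `send` is a caller-supplied function that performs the request with
    the given profile and raises an exception on failure.
    """
    errors = []
    for profile in profiles:
        if profile.cooldown_until > now():
            continue  # still cooling down after an earlier failure
        try:
            return send(profile)
        except Exception as exc:
            # Put the failing profile on cooldown and fall through to the next.
            profile.cooldown_until = now() + cooldown_seconds
            errors.append(f"{profile.name}: {exc}")
    raise AllProfilesUnavailable("; ".join(errors) or "all profiles cooling down")
```

Passing the clock in as `now` keeps the sketch easy to test. A real implementation would likely distinguish error types (rate limit, auth failure, outage) and pick the cooldown accordingly.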
Dashboard Integration
The dashboard shows a model badge on each message indicating which tier was used:
- Green = LOCAL (free)
- Yellow = FAST (cheap)
- Blue = BALANCED (standard)
- Purple = POWERFUL (expensive)
Deep Dives
- Model Router -- How complexity scoring and routing work
- Providers -- Supported model providers and configuration
- Auth Profiles -- Managing multiple API keys and subscriptions
- Failover -- How automatic failover and cooldown work