Overview
The Model Router scores the complexity of every agent request and routes it to the cheapest capable model tier. This keeps costs low for simple interactions (greetings, status checks) while ensuring complex tasks (code architecture, multi-step reasoning) get powerful models.
Tier Definitions
| Tier | Score Range | Default Model | Cost | Use Case |
|---|---|---|---|---|
| LOCAL | < 0.3 | Qwen3 30B-A3B via Ollama | Free | Simple queries, greetings, status. Cannot use tools or leverage injected memory context. |
| FAST | 0.3 - 0.5 | Claude Sonnet | $ | Tool calls, memory queries, standard interaction |
| BALANCED | 0.5 - 0.8 | Claude Opus | $$ | Code generation, analysis, multi-step reasoning |
| POWERFUL | > 0.8 | Claude Opus | $$$ | Complex architecture, deep contemplation, critical decisions |
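As a rough illustration of the table, the score-to-tier mapping could look like the sketch below. The names `tierForScore` and `TierThresholds` are hypothetical, and the handling of scores that land exactly on a boundary (0.3, 0.5, 0.8) is an assumption, since the table does not say which side they fall on.

```typescript
type ModelTier = "local" | "fast" | "balanced" | "powerful";

// Hypothetical shape for the tier boundaries.
interface TierThresholds {
  local: number;    // scores below this route to LOCAL
  fast: number;     // scores below this route to FAST
  balanced: number; // scores up to and including this route to BALANCED
}

const DEFAULT_THRESHOLDS: TierThresholds = { local: 0.3, fast: 0.5, balanced: 0.8 };

function tierForScore(score: number, t: TierThresholds = DEFAULT_THRESHOLDS): ModelTier {
  if (score < t.local) return "local";
  if (score < t.fast) return "fast";
  // Assumption: exactly 0.8 stays BALANCED because POWERFUL is listed as "> 0.8".
  if (score <= t.balanced) return "balanced";
  return "powerful";
}
```

Under these assumptions, `tierForScore(0.42)` yields `"fast"` and `tierForScore(0.9)` yields `"powerful"`.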
Tier model assignments are configurable per profile and can be overridden in ~/.argentos/argent.json:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "tiers": {
          "local": { "provider": "ollama", "model": "qwen3:30b-a3b" },
          "fast": { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        }
      }
    }
  }
}
Complexity Scoring Algorithm
The scoreComplexity() function evaluates 7 factors and produces a score between 0 and 1:
Factor 1: Prompt Length
| Length | Score Contribution | Reason |
|---|---|---|
| < 80 chars | +0.05 | Short prompt (greeting, yes/no) |
| 80-300 chars | +0.15 | Medium prompt (simple question) |
| 300-1000 chars | +0.30 | Long prompt (detailed request) |
| > 1000 chars | +0.45 | Very long prompt (complex specification) |
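A minimal sketch of this factor, assuming the length is measured with `String.length` and that the boundary values 300 and 1000 belong to the lower bucket (the table leaves the overlap open). The function name `promptLengthScore` is hypothetical.

```typescript
// Returns the Factor 1 contribution for a prompt.
function promptLengthScore(prompt: string): number {
  const len = prompt.length;
  if (len < 80) return 0.05;    // short prompt (greeting, yes/no)
  if (len <= 300) return 0.15;  // medium prompt (simple question)
  if (len <= 1000) return 0.3;  // long prompt (detailed request)
  return 0.45;                  // very long prompt (complex specification)
}
```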
Factor 2: Thinking Level
The user’s thinking level preference adds a boost:
| Level | Score Contribution |
|---|---|
| xhigh / high | +0.15 |
| medium | +0.10 |
| low / minimal | +0.05 |
Thinking level is treated as a signal of desired reasoning depth, not as an automatic tier escalation. A user who always sets xhigh but sends “hey” still routes to the appropriate tier.
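To make the "hey" case concrete, here is a small sketch that combines the thinking-level boost with the Factor 1 contribution for a short prompt. The names `ThinkingLevel`, `THINKING_BOOST`, and `thinkingLevelScore` are hypothetical, and simple addition of the two factors is an assumption.

```typescript
type ThinkingLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

// Boost values taken from the table above.
const THINKING_BOOST: Record<ThinkingLevel, number> = {
  minimal: 0.05,
  low: 0.05,
  medium: 0.1,
  high: 0.15,
  xhigh: 0.15,
};

function thinkingLevelScore(level: ThinkingLevel): number {
  return THINKING_BOOST[level];
}

// "hey" is under 80 characters, so Factor 1 contributes 0.05.
// Adding the xhigh boost gives roughly 0.20, still below the 0.3 LOCAL boundary
// before any session-type floor is applied.
const heyWithXhigh = 0.05 + thinkingLevelScore("xhigh");
```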
Factor 3: Image Content
If the request includes images, which require vision-capable models, the score is boosted:
| Condition | Score Contribution |
|---|---|
| hasImages: true | +0.30 |
Factor 4: Session Type
The session type applies floors, caps, and boosts:
| Session Type | Behavior |
|---|---|
| Heartbeat | Floor at 0.3 (FAST minimum). Heartbeats run nudges and tool calls that need reliable tool execution. |
| Contemplation | Boosted to minimum 0.85 (POWERFUL). Self-directed thinking requires depth; cheaper models tend to take the easy exit and rest instead. |
| Main (user-facing) | Floor at 0.3 (FAST minimum). User sessions have tools and injected context that local models cannot handle. |
| Subagent | +0.10 boost. Subagent tasks tend to be more focused. |
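The table can be read as a post-processing step on the raw score. The sketch below is one way to express it; `SessionType` and `applySessionType` are hypothetical names, and clamping the subagent boost at 1 is an assumption based on the stated 0 to 1 score range.

```typescript
type SessionType = "heartbeat" | "contemplation" | "main" | "subagent";

function applySessionType(score: number, session: SessionType): number {
  switch (session) {
    case "heartbeat":
    case "main":
      // Floor at the FAST minimum.
      return Math.max(score, 0.3);
    case "contemplation":
      // Raise to at least 0.85 so the request lands in POWERFUL.
      return Math.max(score, 0.85);
    case "subagent":
      // Flat boost; the upper clamp is an assumption.
      return Math.min(score + 0.1, 1);
  }
}
```

For example, a raw score of 0.1 in a main session becomes 0.3, while a raw score of 0.6 in a heartbeat session is left unchanged.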
Factor 5: Tool-Likely Patterns
For user-facing sessions, the router detects prompts likely to trigger tool use:
const toolTriggerPatterns = [
  /\b(save|store|record|log|write)\b.*\b(memory|that|this|it)\b/i,
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
  /\b(send|message|dm|notify|ping)\b.*\b(discord|telegram|slack|email)\b/i,
  /\b(search|look up|find|fetch)\b.*\b(web|online|google|news)\b/i,
  /\b(add|create|start|complete|finish|block)\b.*\b(task|tasks)\b/i,
  /\b(generate|create|make)\b.*\b(image|audio|video|speech)\b/i,
  /\b(open|push|update)\b.*\b(doc|document|panel|canvas)\b/i,
];
When a tool-likely pattern matches and the score is below 0.5 (the BALANCED floor), the score is boosted. This prevents lighter models such as Haiku from describing tool calls in plain text instead of emitting real tool_use blocks.
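A hedged sketch of how the match-and-boost step might work, using two of the patterns listed above. The source does not state how large the boost is; raising the score to exactly 0.5 is an assumption made for illustration, and `applyToolLikelyBoost` is a hypothetical name.

```typescript
// Two of the documented patterns, reused here for brevity.
const sampleToolPatterns: RegExp[] = [
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
];

function applyToolLikelyBoost(prompt: string, score: number): number {
  const toolLikely = sampleToolPatterns.some((re) => re.test(prompt));
  // Assumption: a matching prompt is lifted to the 0.5 boundary, never lowered.
  return toolLikely && score < 0.5 ? 0.5 : score;
}
```

With this sketch, "show my tasks for today" at a raw score of 0.35 is lifted to 0.5, while "hello there" keeps its original score.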
Factor 6: Code/Technical Content
| Pattern Matches | Score Contribution | Reason |
|---|---|---|
| >= 3 patterns | +0.20 | Heavy code/technical content |
| >= 1 pattern | +0.10 | Some code/technical content |
Detection patterns include code blocks, programming keywords (function, class, import), SQL statements, and infrastructure terms (docker, kubernetes).
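The counting logic can be sketched as below. The four regular expressions are illustrative stand-ins for the categories named above, not the router's actual pattern list, and `codeContentScore` is a hypothetical name.

```typescript
// Illustrative patterns only; one per category mentioned in the text.
const codePatterns: RegExp[] = [
  /`{3}/,                               // fenced code block marker
  /\b(function|class|import)\b/,        // programming keywords
  /\b(SELECT|INSERT|UPDATE|DELETE)\b/,  // SQL statements
  /\b(docker|kubernetes)\b/i,           // infrastructure terms
];

function codeContentScore(prompt: string): number {
  const matches = codePatterns.filter((re) => re.test(prompt)).length;
  if (matches >= 3) return 0.2; // heavy code/technical content
  if (matches >= 1) return 0.1; // some code/technical content
  return 0;
}
```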
Factor 7: Reasoning/Analysis Patterns
| Pattern Matches | Score Contribution | Reason |
|---|---|---|
| >= 2 patterns | +0.15 | Reasoning/analysis required |
| >= 1 pattern | +0.05 | Some reasoning needed |
Detection patterns include analysis verbs (analyze, compare, evaluate), trade-off language, step-by-step requests, and system design requests.
Score Reduction: Simple Queries
Simple greetings and acknowledgments (hi, hello, hey, thanks, ok, yes, no) reduce the score, keeping trivial interactions on the cheapest tier.
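One possible shape for this reduction is sketched below. The size of the reduction is not given in the text, so the 0.1 penalty is a placeholder assumption, as are the names `SIMPLE_QUERY` and `applySimpleQueryReduction` and the choice to match only when the whole prompt is a greeting or acknowledgment.

```typescript
// Matches prompts that consist solely of a greeting or acknowledgment.
const SIMPLE_QUERY = /^\s*(hi|hello|hey|thanks|ok|yes|no)[\s.!?]*$/i;

// Placeholder value; the real reduction is not documented here.
const SIMPLE_QUERY_PENALTY = 0.1;

function applySimpleQueryReduction(prompt: string, score: number): number {
  if (!SIMPLE_QUERY.test(prompt)) return score;
  // Never let the score drop below the bottom of the 0 to 1 range.
  return Math.max(0, score - SIMPLE_QUERY_PENALTY);
}
```

Anchoring the pattern to the whole prompt keeps a message like "hey, can you deploy?" from being treated as trivial.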
Background Model Lanes
Five background subsystems have dedicated routing:
| Lane | Target Tier | Rationale |
|---|---|---|
| Contemplation | POWERFUL (0.85+) | Self-directed thinking needs depth |
| SIS | BALANCED | Lesson extraction needs adequate reasoning |
| Heartbeat | FAST (0.3 floor) | Tool calls + nudges need reliable execution |
| Execution Worker | Configurable | Worker has its own model override option |
| Embeddings | LOCAL or configured | Embedding generation is a separate concern |
Named Routing Profiles
Profiles define complete tier mappings and fallback chains:
interface ModelProfile {
  description?: string;
  tiers: Partial<Record<ModelTier, TierModelMapping>>;
  fallbackProfile?: string; // Chain to another profile
}
Built-in Profiles
ArgentOS ships with several built-in profiles including default, budget, minimax-mix, and others defined in src/models/builtin-profiles.ts. User-defined profiles in config take precedence.
Profile Chaining
Profiles can chain to fallback profiles:
{
  "profiles": {
    "budget": {
      "tiers": { "fast": { "provider": "minimax", "model": "MiniMax-M2.1" } },
      "fallbackProfile": "default"
    }
  }
}
Cross-Provider Fallback
When a provider fails (rate limit, error, unavailability), the router walks the fallback chain:
- Primary: The tier’s configured model/provider
- Profile fallback: Next model in the profile’s fallback chain
- Cross-provider: Different provider for the same tier
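The ordering above could be sketched as follows. `buildFallbackCandidates` and `firstWorking` are hypothetical helpers, and the real router presumably calls providers asynchronously, which this synchronous sketch leaves out for brevity.

```typescript
interface ModelRef {
  provider: string;
  model: string;
}

// Orders candidates as primary, then profile fallbacks, then cross-provider
// alternatives, dropping duplicates so no model is attempted twice.
function buildFallbackCandidates(
  primary: ModelRef,
  profileFallbacks: ModelRef[],
  crossProvider: ModelRef[],
): ModelRef[] {
  const seen: Record<string, boolean> = {};
  const ordered: ModelRef[] = [];
  for (const ref of [primary, ...profileFallbacks, ...crossProvider]) {
    const key = `${ref.provider}/${ref.model}`;
    if (seen[key]) continue;
    seen[key] = true;
    ordered.push(ref);
  }
  return ordered;
}

// Tries each candidate in order and returns the first result that does not throw.
function firstWorking<T>(candidates: ModelRef[], attempt: (ref: ModelRef) => T): T {
  let lastError: unknown = new Error("no candidates");
  for (const ref of candidates) {
    try {
      return attempt(ref);
    } catch (err) {
      lastError = err; // rate limit, error, or unavailability: move on
    }
  }
  throw lastError;
}
```

If every candidate fails, the last error is rethrown so the caller can see why the final attempt did not succeed.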
Cycle Detection
The fallback walker tracks visited profiles to prevent infinite loops:
const visited = new Set<string>([startProfileName ?? ""]);
let nextName: string | undefined = startProfile.fallbackProfile;
while (nextName && !visited.has(nextName)) {
  visited.add(nextName);
  // ... resolve and add fallback
}
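For a self-contained view of the same idea, the sketch below resolves a tier by walking `fallbackProfile` links with a visited set. It assumes that a profile missing a tier defers to its fallback profile, which the source implies but does not state outright. `resolveTier` and the `loopA`/`loopB` profiles are hypothetical, and the `TierModelMapping` shape is inferred from the config examples.

```typescript
type ModelTier = "local" | "fast" | "balanced" | "powerful";

interface TierModelMapping {
  provider: string;
  model: string;
}

interface ModelProfile {
  description?: string;
  tiers: Partial<Record<ModelTier, TierModelMapping>>;
  fallbackProfile?: string;
}

function resolveTier(
  profiles: Record<string, ModelProfile>,
  startName: string,
  tier: ModelTier,
): TierModelMapping | undefined {
  const visited = new Set<string>();
  let name: string | undefined = startName;
  // Stop when the chain ends or loops back to a profile already seen.
  while (name !== undefined && !visited.has(name)) {
    visited.add(name);
    const profile: ModelProfile | undefined = profiles[name];
    if (!profile) return undefined;
    const mapping = profile.tiers[tier];
    if (mapping) return mapping;
    name = profile.fallbackProfile;
  }
  return undefined;
}

const exampleProfiles: Record<string, ModelProfile> = {
  default: {
    tiers: {
      fast: { provider: "anthropic", model: "claude-sonnet-4-20250514" },
      balanced: { provider: "anthropic", model: "claude-opus-4-20250514" },
    },
  },
  budget: {
    tiers: { fast: { provider: "minimax", model: "MiniMax-M2.1" } },
    fallbackProfile: "default",
  },
  // Deliberately cyclic pair to exercise the visited set.
  loopA: { tiers: {}, fallbackProfile: "loopB" },
  loopB: { tiers: {}, fallbackProfile: "loopA" },
};
```

Here `resolveTier(exampleProfiles, "budget", "balanced")` falls through to the `default` profile's Opus mapping, and the cyclic pair terminates with `undefined` rather than looping forever.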
Provider Normalization
The router normalizes provider names to handle common variations:
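function normalizeProvider(provider: string): string {
  // Assumption: names are trimmed and lowercased before comparison.
  const normalized = provider.trim().toLowerCase();
  if (normalized === "z.ai" || normalized === "z-ai") return "zai";
  if (normalized === "opencode-zen") return "opencode";
  return normalized;
}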
It also infers providers from model names (claude-* -> anthropic, gpt-* -> openai, glm-* -> zai, etc.) to handle misconfigured profiles.
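Prefix-based inference might look like the sketch below. Only the three prefixes named above are included; `inferProvider` and `MODEL_PREFIXES` are hypothetical names, and case-insensitive matching is an assumption.

```typescript
// Prefix-to-provider pairs taken from the examples in the text.
const MODEL_PREFIXES: Array<[string, string]> = [
  ["claude-", "anthropic"],
  ["gpt-", "openai"],
  ["glm-", "zai"],
];

function inferProvider(model: string): string | undefined {
  const lower = model.toLowerCase();
  for (const [prefix, provider] of MODEL_PREFIXES) {
    if (lower.indexOf(prefix) === 0) return provider;
  }
  // Unknown naming scheme: leave the decision to the caller.
  return undefined;
}
```

Returning `undefined` for unrecognized names lets the caller keep whatever provider the profile specified instead of guessing.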
Dashboard Model Badge
The dashboard displays a color-coded badge on each message showing which tier was used:
| Color | Tier | Meaning |
|---|---|---|
| Green | LOCAL | Free local model |
| Yellow | FAST | Fast/cheap cloud model |
| Blue | BALANCED | Mid-tier cloud model |
| Purple | POWERFUL | Premium cloud model |
Session Model Override
Individual sessions can override the router with a specific model:
type SessionModelOverride = {
  provider: string;
  model: string;
  reason?: string;
};
This is used when the user explicitly selects a model in the dashboard, or when a subsystem (like the execution worker) has a configured model preference.
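A minimal sketch of how an override could take precedence over the routed model. `chooseModel` and `ModelChoice` are hypothetical; the source only defines the `SessionModelOverride` shape.

```typescript
type SessionModelOverride = {
  provider: string;
  model: string;
  reason?: string;
};

interface ModelChoice {
  provider: string;
  model: string;
  source: "override" | "router"; // records which path picked the model
}

function chooseModel(
  routed: { provider: string; model: string },
  override?: SessionModelOverride,
): ModelChoice {
  if (override) {
    return { provider: override.provider, model: override.model, source: "override" };
  }
  return { provider: routed.provider, model: routed.model, source: "router" };
}
```

Keeping a `source` field is a design choice for this sketch: it makes it easy to tell in logs or in the dashboard whether the router or an explicit selection decided the model.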
Memory Query Boost
Memory recall queries receive a +0.25 score boost to ensure they route to at least the FAST tier. Local models lack the context window and reasoning needed to effectively process retrieved memories and synthesize them into useful responses.
Configuration
Full router configuration in ~/.argentos/argent.json:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": true,
        "defaultProfile": "default",
        "thresholds": {
          "local": 0.3,
          "fast": 0.5,
          "balanced": 0.8
        },
        "tiers": {
          "local": { "provider": "ollama", "model": "qwen3:30b-a3b" },
          "fast": { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        },
        "profiles": {}
      }
    }
  }
}
Disabling the Router
Set enabled: false to bypass routing and always use the default model:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": false
      }
    }
  }
}
Key Files
| File | LOC | Description |
|---|---|---|
| src/models/router.ts | ~586 | Complexity scoring and tier routing |
| src/models/types.ts | ~104 | ModelTier, RoutingDecision, ModelProfile types |
| src/models/builtin-profiles.ts | — | Built-in profile definitions |
| src/agents/provider-registry.ts | ~126 | Dynamic provider registry |
| src/agents/provider-registry-seed.ts | ~648 | 15+ provider definitions |