
Overview

The Model Router scores the complexity of every agent request and routes it to the cheapest capable model tier. This keeps costs low for simple interactions (greetings, status checks) while ensuring complex tasks (code architecture, multi-step reasoning) get powerful models.

Tier Definitions

| Tier | Score Range | Default Model | Cost | Use Case |
| --- | --- | --- | --- | --- |
| LOCAL | < 0.3 | Qwen3 30B-A3B via Ollama | Free | Simple queries, greetings, status. Cannot use tools or leverage injected memory context. |
| FAST | 0.3 - 0.5 | Claude Sonnet | $ | Tool calls, memory queries, standard interaction |
| BALANCED | 0.5 - 0.8 | Claude Opus | $$ | Code generation, analysis, multi-step reasoning |
| POWERFUL | > 0.8 | Claude Opus | $$$ | Complex architecture, deep contemplation, critical decisions |

Tier model assignments are configurable per profile and can be overridden in ~/.argentos/argent.json:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "tiers": {
          "local":    { "provider": "ollama",    "model": "qwen3:30b-a3b" },
          "fast":     { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        }
      }
    }
  }
}
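
The score ranges in the tier table reduce to a few threshold comparisons. The following is a minimal sketch, not the router's actual code: `tierForScore` and `TierThresholds` are hypothetical names, the threshold shape mirrors the `thresholds` block in the Configuration section, and the handling of exact boundary values (0.3 and 0.5 round up a tier, 0.8 stays BALANCED) is an assumption read off the table.

```typescript
type ModelTier = "local" | "fast" | "balanced" | "powerful";

// Boundaries between tiers, mirroring the Score Range column.
interface TierThresholds {
  local: number;    // scores below this route to LOCAL
  fast: number;     // scores below this route to FAST
  balanced: number; // scores at or below this route to BALANCED
}

const DEFAULT_THRESHOLDS: TierThresholds = { local: 0.3, fast: 0.5, balanced: 0.8 };

// Hypothetical helper: the real router may treat exact boundaries differently.
function tierForScore(score: number, t: TierThresholds = DEFAULT_THRESHOLDS): ModelTier {
  if (score < t.local) return "local";
  if (score < t.fast) return "fast";
  if (score <= t.balanced) return "balanced";
  return "powerful";
}
```

Under these assumptions a score of 0.3 lands on FAST and 0.85 on POWERFUL, which matches the floors described for user-facing and contemplation sessions.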

Complexity Scoring Algorithm

The scoreComplexity() function evaluates 7 factors and produces a score between 0 and 1:

Factor 1: Prompt Length

| Length | Score Contribution | Reason |
| --- | --- | --- |
| < 80 chars | +0.05 | Short prompt (greeting, yes/no) |
| 80-300 chars | +0.15 | Medium prompt (simple question) |
| 300-1000 chars | +0.30 | Long prompt (detailed request) |
| > 1000 chars | +0.45 | Very long prompt (complex specification) |
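
The length buckets can be sketched as a simple cascade. `promptLengthScore` is a hypothetical name, and because the table's ranges share their endpoints, this sketch assumes that exactly 300 characters counts as medium and exactly 1000 as long.

```typescript
// Hypothetical helper illustrating Factor 1; boundary handling is an assumption.
function promptLengthScore(prompt: string): number {
  const length = prompt.length; // UTF-16 code units, used here as an approximation of "chars"
  if (length < 80) return 0.05;    // short prompt
  if (length <= 300) return 0.15;  // medium prompt
  if (length <= 1000) return 0.30; // long prompt
  return 0.45;                     // very long prompt
}
```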

Factor 2: Thinking Level

The user’s thinking level preference adds a boost:
| Level | Score Contribution |
| --- | --- |
| xhigh / high | +0.15 |
| medium | +0.10 |
| low / minimal | +0.05 |

Thinking level is treated as a signal of desired reasoning depth, not as an automatic tier escalation. A user who always sets xhigh but sends “hey” still routes to the appropriate tier.
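
As a rough illustration of why the thinking level alone does not escalate a tier, the boost can be modeled as a lookup table. The names `ThinkingLevel`, `THINKING_BOOST`, and `thinkingLevelScore` are hypothetical, and returning 0 when no level is set is an assumption.

```typescript
type ThinkingLevel = "xhigh" | "high" | "medium" | "low" | "minimal";

// Contributions taken from the table above.
const THINKING_BOOST: Record<ThinkingLevel, number> = {
  xhigh: 0.15,
  high: 0.15,
  medium: 0.10,
  low: 0.05,
  minimal: 0.05,
};

// Assumption: no boost is applied when the user has not set a level.
function thinkingLevelScore(level?: ThinkingLevel): number {
  return level ? THINKING_BOOST[level] : 0;
}
```

With these numbers, a short prompt such as "hey" at xhigh scores about 0.05 + 0.15, which is still below the 0.3 boundary before any session floor is applied.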

Factor 3: Image Input

If the request includes images (requires vision-capable models):
| Condition | Score Contribution |
| --- | --- |
| hasImages: true | +0.30 |

Factor 4: Session Type

The session type applies floors and boosts:

| Session Type | Behavior |
| --- | --- |
| Heartbeat | Floor at 0.3 (FAST minimum). Heartbeats run nudges and tool calls that need reliable tool execution. |
| Contemplation | Floor at 0.85 (POWERFUL). Self-directed thinking requires depth; cheaper models tend to take the easy "rest" exit. |
| Main (user-facing) | Floor at 0.3 (FAST minimum). User sessions have tools and injected context that local models cannot handle. |
| Subagent | +0.10 boost. Subagent tasks tend to be more focused. |
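
The session adjustments can be sketched as one function over the running score. `applySessionType` and `SessionType` are hypothetical names, and clamping the subagent boost at 1 is an assumption based on the stated 0 to 1 score range.

```typescript
type SessionType = "heartbeat" | "contemplation" | "main" | "subagent";

// Hypothetical helper illustrating Factor 4.
function applySessionType(score: number, session: SessionType): number {
  switch (session) {
    case "heartbeat":
    case "main":
      return Math.max(score, 0.3);  // FAST minimum
    case "contemplation":
      return Math.max(score, 0.85); // POWERFUL minimum
    case "subagent":
      return Math.min(score + 0.10, 1); // boost, clamped to the 0..1 range (assumption)
  }
}
```

A floor only ever raises a score, so a user-facing request that already scored 0.6 is left unchanged.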

Factor 5: Tool-Likely Prompt Detection

For user-facing sessions, the router detects prompts likely to trigger tool use:
const toolTriggerPatterns = [
  /\b(save|store|record|log|write)\b.*\b(memory|that|this|it)\b/i,
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
  /\b(send|message|dm|notify|ping)\b.*\b(discord|telegram|slack|email)\b/i,
  /\b(search|look up|find|fetch)\b.*\b(web|online|google|news)\b/i,
  /\b(add|create|start|complete|finish|block)\b.*\b(task|tasks)\b/i,
  /\b(generate|create|make)\b.*\b(image|audio|video|speech)\b/i,
  /\b(open|push|update)\b.*\b(doc|document|panel|canvas)\b/i,
];
When a tool-likely pattern matches and the score is below 0.5 (the BALANCED floor), the score is boosted. This prevents a lighter model such as Haiku from simulating tool calls as plain text instead of emitting real tool_use blocks.
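
A minimal sketch of the detection and boost follows. It reuses two of the patterns listed above; `isToolLikely`, `applyToolBoost`, and `BALANCED_FLOOR` are hypothetical names, and reading "boosted" as "raised to exactly 0.5" is an assumption, since the size of the boost is not stated.

```typescript
// Two of the documented patterns; the full list would be used in practice.
const toolTriggerPatterns: RegExp[] = [
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
];

const BALANCED_FLOOR = 0.5;

// True when any pattern matches. The patterns have no "g" flag, so test() is stateless.
function isToolLikely(prompt: string): boolean {
  return toolTriggerPatterns.some((pattern) => pattern.test(prompt));
}

// Assumption: a tool-likely prompt scoring below the floor is raised to the floor.
function applyToolBoost(score: number, prompt: string): number {
  return score < BALANCED_FLOOR && isToolLikely(prompt) ? BALANCED_FLOOR : score;
}
```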

Factor 6: Code/Technical Content

| Pattern Matches | Score Contribution | Reason |
| --- | --- | --- |
| 3 or more | +0.20 | Heavy code/technical content |
| 1-2 | +0.10 | Some code/technical content |

Detection patterns include code blocks, programming keywords (function, class, import), SQL statements, and infrastructure terms (docker, kubernetes).

Factor 7: Reasoning/Analysis Patterns

| Pattern Matches | Score Contribution | Reason |
| --- | --- | --- |
| 2 or more | +0.15 | Reasoning/analysis required |
| 1 | +0.05 | Some reasoning needed |

Detection patterns include analysis verbs (analyze, compare, evaluate), trade-off language, step-by-step requests, and system design requests.
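
Factors 6 and 7 share the same shape: count how many detection patterns match, then bucket the count. The sketch below illustrates that shape only. The pattern lists are hypothetical stand-ins built from the categories named in the text, not the router's actual patterns.

```typescript
// Hypothetical pattern lists; the real lists live in the router and are longer.
const codePatterns: RegExp[] = [
  /\b(function|class|import)\b/,        // programming keywords
  /\b(SELECT|INSERT|UPDATE|DELETE)\b/,  // SQL statements (uppercase only in this sketch)
  /\b(docker|kubernetes)\b/i,           // infrastructure terms
];
const reasoningPatterns: RegExp[] = [
  /\b(analyze|compare|evaluate)\b/i,    // analysis verbs
  /\btrade-?offs?\b/i,                  // trade-off language
  /\bstep[- ]by[- ]step\b/i,            // step-by-step requests
];

// Number of distinct patterns that match at least once.
function countMatches(text: string, patterns: RegExp[]): number {
  return patterns.filter((pattern) => pattern.test(text)).length;
}

function codeScore(text: string): number {
  const matches = countMatches(text, codePatterns);
  if (matches >= 3) return 0.20;
  if (matches >= 1) return 0.10;
  return 0;
}

function reasoningScore(text: string): number {
  const matches = countMatches(text, reasoningPatterns);
  if (matches >= 2) return 0.15;
  if (matches >= 1) return 0.05;
  return 0;
}
```

Checking the higher bucket first means a prompt is only ever awarded one contribution per factor.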

Score Reduction: Simple Queries

Simple greetings and acknowledgments (hi, hello, hey, thanks, ok, yes, no) reduce the score, keeping trivial interactions on the cheapest tier.
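
One way to detect such messages is an anchored pattern that only matches when the whole prompt is a bare greeting or acknowledgment. This is a sketch under assumptions: `SIMPLE_QUERY`, `applySimpleQueryReduction`, and the penalty value of 0.1 are hypothetical, since the size of the reduction is not documented here.

```typescript
// Anchored at both ends so "hey" matches but "hey, refactor this module" does not.
const SIMPLE_QUERY = /^\s*(hi|hello|hey|thanks|ok|yes|no)\s*[.!?]*\s*$/i;

// Hypothetical penalty; the real reduction amount is not stated.
const SIMPLE_QUERY_PENALTY = 0.1;

function applySimpleQueryReduction(score: number, prompt: string): number {
  if (!SIMPLE_QUERY.test(prompt)) return score;
  return Math.max(0, score - SIMPLE_QUERY_PENALTY); // never drop below 0
}
```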

Background Model Lanes

Five background subsystems have dedicated routing:
| Lane | Target Tier | Rationale |
| --- | --- | --- |
| Contemplation | POWERFUL (0.85+) | Self-directed thinking needs depth |
| SIS | BALANCED | Lesson extraction needs adequate reasoning |
| Heartbeat | FAST (0.3 floor) | Tool calls + nudges need reliable execution |
| Execution Worker | Configurable | Worker has its own model override option |
| Embeddings | LOCAL or configured | Embedding generation is a separate concern |

Named Routing Profiles

Profiles define complete tier mappings and fallback chains:
interface ModelProfile {
  description?: string;
  tiers: Partial<Record<ModelTier, TierModelMapping>>;
  fallbackProfile?: string;  // Chain to another profile
}

Built-in Profiles

ArgentOS ships with several built-in profiles including default, budget, minimax-mix, and others defined in src/models/builtin-profiles.ts. User-defined profiles in config take precedence.

Profile Chaining

Profiles can chain to fallback profiles:
{
  "profiles": {
    "budget": {
      "tiers": { "fast": { "provider": "minimax", "model": "MiniMax-M2.1" } },
      "fallbackProfile": "default"
    }
  }
}
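
With the configuration above, the budget profile only defines the fast tier, so any other tier would be looked up in default. A minimal sketch of that lookup follows. `resolveTier` is a hypothetical name, and the `TierModelMapping` shape is assumed from the provider/model pairs in the config examples.

```typescript
type ModelTier = "local" | "fast" | "balanced" | "powerful";

// Assumed shape, based on the { provider, model } pairs in the config examples.
interface TierModelMapping {
  provider: string;
  model: string;
}

interface ModelProfile {
  description?: string;
  tiers: Partial<Record<ModelTier, TierModelMapping>>;
  fallbackProfile?: string;
}

// Hypothetical helper: returns the first mapping for `tier` along the fallback chain.
function resolveTier(
  profiles: Record<string, ModelProfile>,
  profileName: string,
  tier: ModelTier,
): TierModelMapping | undefined {
  const visited = new Set<string>(); // guards against profiles that chain in a loop
  let name: string | undefined = profileName;
  while (name !== undefined && !visited.has(name)) {
    visited.add(name);
    const profile: ModelProfile | undefined = profiles[name];
    if (!profile) return undefined; // chain points at an unknown profile
    const mapping = profile.tiers[tier];
    if (mapping) return mapping;
    name = profile.fallbackProfile;
  }
  return undefined;
}
```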

Cross-Provider Fallback

When a provider fails (rate limit, error, unavailability), the router walks the fallback chain:
  1. Primary: The tier’s configured model/provider
  2. Profile fallback: Next model in the profile’s fallback chain
  3. Cross-provider: Different provider for the same tier
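
The three steps above amount to building an ordered candidate list and taking the first entry that is currently usable. The sketch below shows only that ordering. `fallbackOrder`, `firstAvailable`, and `ModelCandidate` are hypothetical names, de-duplicating repeated candidates is an assumption, and a real router would make asynchronous provider calls instead of consulting a synchronous predicate.

```typescript
interface ModelCandidate {
  provider: string;
  model: string;
}

// Orders candidates as: primary, profile fallbacks, then cross-provider options.
function fallbackOrder(
  primary: ModelCandidate,
  profileFallbacks: ModelCandidate[],
  crossProvider: ModelCandidate[],
): ModelCandidate[] {
  const seen = new Set<string>();
  const ordered: ModelCandidate[] = [];
  for (const candidate of [primary, ...profileFallbacks, ...crossProvider]) {
    const key = `${candidate.provider}/${candidate.model}`;
    if (seen.has(key)) continue; // assumption: never retry the same provider/model pair
    seen.add(key);
    ordered.push(candidate);
  }
  return ordered;
}

// Picks the first candidate the caller reports as usable.
function firstAvailable(
  candidates: ModelCandidate[],
  isAvailable: (candidate: ModelCandidate) => boolean,
): ModelCandidate | undefined {
  return candidates.find(isAvailable);
}
```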

Cycle Detection

The fallback walker tracks visited profiles to prevent infinite loops:
const visited = new Set<string>([startProfileName ?? ""]);
let nextName: string | undefined = startProfile.fallbackProfile;

while (nextName && !visited.has(nextName)) {
  visited.add(nextName);
  // ... resolve and add fallback
}
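
The excerpt above elides the loop body. A self-contained sketch of the same idea, which collects the names of the fallback profiles in order, might look like the following. `collectFallbackChain` and `ChainProfile` are hypothetical names, and stopping at an unknown profile name is an assumption.

```typescript
interface ChainProfile {
  fallbackProfile?: string;
}

// Returns the fallback profile names reachable from the start profile, in order.
function collectFallbackChain(
  profiles: Record<string, ChainProfile>,
  startProfileName: string,
): string[] {
  const chain: string[] = [];
  const visited = new Set<string>([startProfileName]);
  let nextName: string | undefined = profiles[startProfileName]?.fallbackProfile;

  while (nextName && !visited.has(nextName)) {
    visited.add(nextName);
    const next: ChainProfile | undefined = profiles[nextName];
    if (!next) break; // assumption: an unknown profile name ends the chain
    chain.push(nextName);
    nextName = next.fallbackProfile; // advance; a revisited name fails the loop condition
  }
  return chain;
}
```

Seeding the visited set with the start profile means a chain that loops back to where it began ends there and does not process the start profile twice.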

Provider Normalization

The router normalizes provider names to handle common variations:
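function normalizeProvider(provider: string): string {
  // Assumed derivation: the original excerpt does not show how `normalized` is computed.
  const normalized = provider.trim().toLowerCase();
  if (normalized === "z.ai" || normalized === "z-ai") return "zai";
  if (normalized === "opencode-zen") return "opencode";
  return normalized;
}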
It also infers providers from model names (claude-* -> anthropic, gpt-* -> openai, glm-* -> zai, etc.) to handle misconfigured profiles.
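
Prefix-based inference can be sketched as a small lookup. Only the three prefixes named above are included. `inferProviderFromModel` is a hypothetical name, and returning undefined for unrecognized models is an assumption.

```typescript
// Prefix-to-provider pairs named in the text; the real list is longer.
const MODEL_PREFIX_TO_PROVIDER: Array<[string, string]> = [
  ["claude-", "anthropic"],
  ["gpt-", "openai"],
  ["glm-", "zai"],
];

// Hypothetical helper: returns undefined when no known prefix matches.
function inferProviderFromModel(model: string): string | undefined {
  const lower = model.trim().toLowerCase();
  const hit = MODEL_PREFIX_TO_PROVIDER.find(([prefix]) => lower.startsWith(prefix));
  return hit?.[1];
}
```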

Dashboard Model Badge

The dashboard displays a color-coded badge on each message showing which tier was used:
| Color | Tier | Meaning |
| --- | --- | --- |
| Green | LOCAL | Free local model |
| Yellow | FAST | Fast/cheap cloud model |
| Blue | BALANCED | Mid-tier cloud model |
| Purple | POWERFUL | Premium cloud model |

Session Model Override

Individual sessions can override the router with a specific model:
type SessionModelOverride = {
  provider: string;
  model: string;
  reason?: string;
};
This is used when the user explicitly selects a model in the dashboard, or when a subsystem (like the execution worker) has a configured model preference.
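
The precedence can be sketched in a few lines: if an override is present it wins, otherwise the routed model is used. `selectModel` and `RoutedModel` are hypothetical names used for illustration only.

```typescript
type SessionModelOverride = {
  provider: string;
  model: string;
  reason?: string;
};

// Hypothetical shape for whatever the router would have chosen.
interface RoutedModel {
  provider: string;
  model: string;
}

// An explicit session override bypasses the router's choice.
function selectModel(routed: RoutedModel, override?: SessionModelOverride): RoutedModel {
  if (override) return { provider: override.provider, model: override.model };
  return routed;
}
```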

Memory Query Boost

Memory recall queries receive a +0.25 score boost to ensure they route to at least the FAST tier. Local models lack the context window and reasoning needed to effectively process retrieved memories and synthesize them into useful responses.

Configuration

Full router configuration in ~/.argentos/argent.json:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": true,
        "defaultProfile": "default",
        "thresholds": {
          "local": 0.3,
          "fast": 0.5,
          "balanced": 0.8
        },
        "tiers": {
          "local":    { "provider": "ollama",    "model": "qwen3:30b-a3b" },
          "fast":     { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        },
        "profiles": {}
      }
    }
  }
}

Disabling the Router

Set enabled: false to bypass routing and always use the default model:
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": false
      }
    }
  }
}

Key Files

| File | LOC | Description |
| --- | --- | --- |
| src/models/router.ts | ~586 | Complexity scoring and tier routing |
| src/models/types.ts | ~104 | ModelTier, RoutingDecision, ModelProfile types |
| src/models/builtin-profiles.ts | | Built-in profile definitions |
| src/agents/provider-registry.ts | ~126 | Dynamic provider registry |
| src/agents/provider-registry-seed.ts | ~648 | 15+ provider definitions |