
# Router Architecture

> Deep dive into the model router — complexity scoring, tier routing, background lanes, and cross-provider fallback.

## Overview

The Model Router scores the complexity of every agent request and routes it to the cheapest capable model tier. This keeps costs low for simple interactions (greetings, status checks) while ensuring complex tasks (code architecture, multi-step reasoning) get powerful models.

```mermaid theme={null}
flowchart LR
  R["Request\n(prompt, session, signals)"] --> CS["Complexity Scoring\n(7 factors)"]
  CS --> |"score ∈ [0,1]"| TS["Tier Selection"]
  TS --> LOCAL["LOCAL < 0.3"]
  TS --> FAST["FAST < 0.5"]
  TS --> BAL["BALANCED < 0.8"]
  TS --> POW["POWERFUL >= 0.8"]
  LOCAL --> P["Provider + Fallback Chain"]
  FAST --> P
  BAL --> P
  POW --> P
```
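
The tier selection step in the diagram can be sketched as a chain of threshold comparisons. This is a minimal sketch, not the router's actual code: `selectTier`, `TierName`, and `TierCutoffs` are hypothetical names, the cut-offs are the defaults quoted on this page, and the clamping and boundary handling are assumptions.

```typescript
// Hypothetical names; only the cut-off values (0.3 / 0.5 / 0.8) come from this page.
type TierName = "local" | "fast" | "balanced" | "powerful";

interface TierCutoffs {
  local: number;    // scores below this route to LOCAL
  fast: number;     // scores below this route to FAST
  balanced: number; // scores below this route to BALANCED
}

const DEFAULT_CUTOFFS: TierCutoffs = { local: 0.3, fast: 0.5, balanced: 0.8 };

function selectTier(score: number, cutoffs: TierCutoffs = DEFAULT_CUTOFFS): TierName {
  // Assumption: out-of-range scores are clamped to [0, 1] before comparison.
  const s = Math.min(1, Math.max(0, score));
  if (s < cutoffs.local) return "local";
  if (s < cutoffs.fast) return "fast";
  if (s < cutoffs.balanced) return "balanced";
  // Assumption: a score exactly at the last cut-off escalates to POWERFUL.
  return "powerful";
}
```

Under these assumptions each lower bound is inclusive, so a score of exactly 0.3 selects FAST, which matches the "FAST minimum" floors described later on this page.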

## Tier Definitions

| Tier         | Score Range | Default Model            | Cost   | Use Case                                                                                     |
| ------------ | ----------- | ------------------------ | ------ | -------------------------------------------------------------------------------------------- |
| **LOCAL**    | \< 0.3      | Qwen3 30B-A3B via Ollama | Free   | Simple queries, greetings, status. **Cannot use tools or leverage injected memory context.** |
| **FAST**     | 0.3 - 0.5   | Claude Sonnet            | \$     | Tool calls, memory queries, standard interaction                                             |
| **BALANCED** | 0.5 - 0.8   | Claude Opus              | \$\$   | Code generation, analysis, multi-step reasoning                                              |
| **POWERFUL** | >= 0.8      | Claude Opus              | \$\$\$ | Complex architecture, deep contemplation, critical decisions                                 |

Tier model assignments are configurable per profile and can be overridden in `~/.argentos/argent.json`:

```json theme={null}
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "tiers": {
          "local":    { "provider": "ollama",    "model": "qwen3:30b-a3b" },
          "fast":     { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        }
      }
    }
  }
}
```

## Complexity Scoring Algorithm

The `scoreComplexity()` function evaluates 7 factors and produces a score between 0 and 1:

### Factor 1: Prompt Length

| Length         | Score Contribution | Reason                                   |
| -------------- | ------------------ | ---------------------------------------- |
| \< 80 chars    | +0.05              | Short prompt (greeting, yes/no)          |
| 80-300 chars   | +0.15              | Medium prompt (simple question)          |
| 300-1000 chars | +0.30              | Long prompt (detailed request)           |
| > 1000 chars   | +0.45              | Very long prompt (complex specification) |
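
The length buckets above map directly onto a small lookup function. The sketch below is illustrative only: `promptLengthContribution` is a hypothetical name, and the table does not say which bucket a prompt of exactly 300 characters falls into, so the boundary handling here is an assumption.

```typescript
// Contribution values come from the table; the function name and the
// treatment of the exact 300-character boundary are assumptions.
function promptLengthContribution(prompt: string): number {
  const length = prompt.length; // counts UTF-16 code units, not grapheme clusters
  if (length < 80) return 0.05;    // short prompt (greeting, yes/no)
  if (length <= 300) return 0.15;  // medium prompt (simple question)
  if (length <= 1000) return 0.30; // long prompt (detailed request)
  return 0.45;                     // very long prompt (complex specification)
}
```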

### Factor 2: Thinking Level

The user's thinking level preference adds a boost:

| Level             | Score Contribution |
| ----------------- | ------------------ |
| `xhigh` / `high`  | +0.15              |
| `medium`          | +0.10              |
| `low` / `minimal` | +0.05              |

<Info>
  Thinking level signals the desired reasoning depth; it does not escalate the tier on its own. A user who always sets `xhigh` but sends "hey" is still routed according to the combined score.
</Info>

### Factor 3: Image Input

If the request includes images (requires vision-capable models):

| Condition         | Score Contribution |
| ----------------- | ------------------ |
| `hasImages: true` | +0.30              |
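
Factors 2 and 3 are both flat additive boosts, so they can be sketched together. The names below (`flatBoosts`, `BoostSignals`, `ThinkingLevelName`) are hypothetical, and treating an unset thinking level as zero is an assumption; only the numeric contributions come from the tables.

```typescript
type ThinkingLevelName = "xhigh" | "high" | "medium" | "low" | "minimal";

// Contributions from the Factor 2 table.
const THINKING_BOOST: Record<ThinkingLevelName, number> = {
  xhigh: 0.15,
  high: 0.15,
  medium: 0.10,
  low: 0.05,
  minimal: 0.05,
};

// Contribution from the Factor 3 table.
const IMAGE_BOOST = 0.30;

interface BoostSignals {
  thinkingLevel?: ThinkingLevelName;
  hasImages?: boolean;
}

function flatBoosts(signals: BoostSignals): number {
  // Assumption: an unset thinking level contributes nothing.
  const thinking = signals.thinkingLevel ? THINKING_BOOST[signals.thinkingLevel] : 0;
  const images = signals.hasImages ? IMAGE_BOOST : 0;
  return thinking + images;
}
```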

### Factor 4: Session Type

The session type applies floors and boosts:

| Session Type           | Behavior                                                                                                          |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------- |
| **Heartbeat**          | Floor at 0.3 (FAST minimum). Heartbeats run nudges and tool calls that need reliable tool execution.              |
| **Contemplation**      | Boosted to minimum 0.85 (POWERFUL). Self-directed thinking requires depth — cheap models take the easy rest exit. |
| **Main** (user-facing) | Floor at 0.3 (FAST minimum). User sessions have tools and injected context that local models cannot handle.       |
| **Subagent**           | +0.10 boost. Subagent tasks tend to be more focused.                                                              |
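
The session rules in the table can be read as a post-processing step on the running score. The following is a sketch under stated assumptions: `applySessionType` and `SessionKind` are hypothetical names, and capping the subagent boost at 1 is an assumption rather than documented behavior.

```typescript
type SessionKind = "heartbeat" | "contemplation" | "main" | "subagent";

function applySessionType(score: number, session: SessionKind): number {
  switch (session) {
    case "heartbeat":
    case "main":
      return Math.max(score, 0.3);      // floor: never below the FAST tier
    case "contemplation":
      return Math.max(score, 0.85);     // floor: always in the POWERFUL range
    case "subagent":
      return Math.min(1, score + 0.10); // additive boost; the cap at 1 is an assumption
  }
}
```

A floor only raises low scores: a main-session request that already scores 0.6 is left unchanged.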

### Factor 5: Tool-Likely Prompt Detection

For user-facing sessions, the router detects prompts likely to trigger tool use:

```typescript theme={null}
const toolTriggerPatterns = [
  /\b(save|store|record|log|write)\b.*\b(memory|that|this|it)\b/i,
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
  /\b(send|message|dm|notify|ping)\b.*\b(discord|telegram|slack|email)\b/i,
  /\b(search|look up|find|fetch)\b.*\b(web|online|google|news)\b/i,
  /\b(add|create|start|complete|finish|block)\b.*\b(task|tasks)\b/i,
  /\b(generate|create|make)\b.*\b(image|audio|video|speech)\b/i,
  /\b(open|push|update)\b.*\b(doc|document|panel|canvas)\b/i,
];
```

<Tip>
  When a tool-likely pattern matches and the score is below 0.5 (the BALANCED floor), the score is boosted. This keeps a lightweight model such as Haiku from simulating tool calls as text instead of emitting real `tool_use` blocks.
</Tip>
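
One way to picture how the patterns and the boost fit together is shown below. It repeats two of the patterns listed above so that it is self-contained. `isToolLikely` and `applyToolBoost` are hypothetical names, and raising the score exactly to the 0.5 floor is an assumption, since this page does not state the size of the boost.

```typescript
// Two of the documented patterns, repeated so the sketch runs on its own.
const sampleToolPatterns: RegExp[] = [
  /\b(remember|don't forget|note that|keep in mind)\b/i,
  /\b(check|show|list|view)\b.*\b(task|tasks|todo|schedule)\b/i,
];

const TOOL_FLOOR = 0.5; // the BALANCED floor mentioned in the tip

function isToolLikely(prompt: string): boolean {
  // The patterns have no `g` flag, so `test` keeps no state between calls.
  return sampleToolPatterns.some((pattern) => pattern.test(prompt));
}

function applyToolBoost(score: number, prompt: string): number {
  // Assumption: "boosted" means raised to the floor; the real boost may differ.
  return isToolLikely(prompt) && score < TOOL_FLOOR ? TOOL_FLOOR : score;
}
```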

### Factor 6: Code/Technical Content

| Pattern Matches | Score Contribution | Reason                       |
| --------------- | ------------------ | ---------------------------- |
| >= 3 patterns   | +0.20              | Heavy code/technical content |
| 1-2 patterns    | +0.10              | Some code/technical content  |

Detection patterns include code blocks, programming keywords (`function`, `class`, `import`), SQL statements, and infrastructure terms (`docker`, `kubernetes`).

### Factor 7: Reasoning/Analysis Patterns

| Pattern Matches | Score Contribution | Reason                      |
| --------------- | ------------------ | --------------------------- |
| >= 2 patterns   | +0.15              | Reasoning/analysis required |
| 1 pattern       | +0.05              | Some reasoning needed       |

Detection patterns include analysis verbs (`analyze`, `compare`, `evaluate`), trade-off language, step-by-step requests, and system design requests.
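
Factors 6 and 7 share the same shape: count how many distinct patterns match, then map the count to a contribution. The sketch below illustrates that shape. The regexes are hypothetical stand-ins for the categories named above (the actual pattern lists are not reproduced on this page), and the function names are assumptions.

```typescript
// Hypothetical patterns: one per documented category, not the router's real lists.
const codePatternsSketch: RegExp[] = [
  /\b(function|class|import)\b/,  // programming keywords
  /\bselect\b.+\bfrom\b/i,        // SQL statements
  /\b(docker|kubernetes)\b/i,     // infrastructure terms
];

const reasoningPatternsSketch: RegExp[] = [
  /\b(analyze|compare|evaluate)\b/i, // analysis verbs
  /\btrade-?offs?\b/i,               // trade-off language
  /\bstep[- ]by[- ]step\b/i,         // step-by-step requests
];

// Each pattern counts at most once, however many times it matches.
function countMatches(prompt: string, patterns: RegExp[]): number {
  return patterns.filter((pattern) => pattern.test(prompt)).length;
}

function codeContribution(prompt: string): number {
  const matches = countMatches(prompt, codePatternsSketch);
  if (matches >= 3) return 0.20; // heavy code/technical content
  if (matches >= 1) return 0.10; // some code/technical content
  return 0;
}

function reasoningContribution(prompt: string): number {
  const matches = countMatches(prompt, reasoningPatternsSketch);
  if (matches >= 2) return 0.15; // reasoning/analysis required
  if (matches >= 1) return 0.05; // some reasoning needed
  return 0;
}
```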

### Score Reduction: Simple Queries

Simple greetings and acknowledgments (`hi`, `hello`, `hey`, `thanks`, `ok`, `yes`, `no`) reduce the score, keeping trivial interactions on the cheapest tier.
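
A reduction like this could be implemented as a whole-message match against the listed words. This is only a sketch: the regex, the function name, and in particular the reduction amount of 0.10 are assumptions, since the page does not state how much the score is reduced.

```typescript
// Matches a message that consists solely of one of the listed words,
// optionally followed by punctuation. The exact rule is an assumption.
const SIMPLE_QUERY = /^(hi|hello|hey|thanks|ok|yes|no)[.!?]*$/i;

// Hypothetical value: the real reduction is not documented on this page.
const SIMPLE_QUERY_REDUCTION = 0.10;

function applySimpleQueryReduction(score: number, prompt: string): number {
  if (!SIMPLE_QUERY.test(prompt.trim())) return score;
  return Math.max(0, score - SIMPLE_QUERY_REDUCTION); // never drop below 0
}
```

Anchoring the pattern to the whole message means that "hey, can you refactor this?" is not treated as a simple query.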

## Background Model Lanes

Five background subsystems have dedicated routing:

| Lane                 | Target Tier         | Rationale                                   |
| -------------------- | ------------------- | ------------------------------------------- |
| **Contemplation**    | POWERFUL (0.85+)    | Self-directed thinking needs depth          |
| **SIS**              | BALANCED            | Lesson extraction needs adequate reasoning  |
| **Heartbeat**        | FAST (0.3 floor)    | Tool calls + nudges need reliable execution |
| **Execution Worker** | Configurable        | Worker has its own model override option    |
| **Embeddings**       | LOCAL or configured | Embedding generation is a separate concern  |

## Named Routing Profiles

Profiles define complete tier mappings and fallback chains:

```typescript theme={null}
interface ModelProfile {
  description?: string;
  tiers: Partial<Record<ModelTier, TierModelMapping>>;
  fallbackProfile?: string;  // Chain to another profile
}
```

### Built-in Profiles

ArgentOS ships with several built-in profiles including `default`, `budget`, `minimax-mix`, and others defined in `src/models/builtin-profiles.ts`. User-defined profiles in config take precedence.

### Profile Chaining

Profiles can chain to fallback profiles:

```json theme={null}
{
  "profiles": {
    "budget": {
      "tiers": { "fast": { "provider": "minimax", "model": "MiniMax-M2.1" } },
      "fallbackProfile": "default"
    }
  }
}
```

## Cross-Provider Fallback

When a provider fails (rate limit, error, unavailability), the router walks the fallback chain:

1. **Primary**: The tier's configured model/provider
2. **Profile fallback**: Next model in the profile's fallback chain
3. **Cross-provider**: Different provider for the same tier
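
The three stages above amount to an ordered list of candidates to try. A minimal sketch of building that list follows; `ModelRef` and `buildAttemptOrder` are hypothetical names, and skipping duplicate provider/model pairs is an assumption about how such a list might be tidied.

```typescript
interface ModelRef {
  provider: string;
  model: string;
}

function buildAttemptOrder(
  primary: ModelRef,
  profileFallbacks: ModelRef[],
  crossProvider: ModelRef[],
): ModelRef[] {
  const seen = new Set<string>();
  const ordered: ModelRef[] = [];
  // Order matters: primary first, then profile fallbacks, then other providers.
  for (const candidate of [primary, ...profileFallbacks, ...crossProvider]) {
    const key = candidate.provider + "/" + candidate.model;
    if (seen.has(key)) continue; // Assumption: a pair is attempted at most once.
    seen.add(key);
    ordered.push(candidate);
  }
  return ordered;
}
```

A caller would then try each entry in turn and stop at the first provider that responds successfully.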

### Cycle Detection

The fallback walker tracks visited profiles to prevent infinite loops:

```typescript theme={null}
const visited = new Set<string>([startProfileName ?? ""]);
let nextName: string | undefined = startProfile.fallbackProfile;

while (nextName && !visited.has(nextName)) {
  visited.add(nextName);
  // ... resolve and add fallback
}
```
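
To make the excerpt above concrete, here is a self-contained sketch of the same visited-set technique that returns the names of the profiles reached through `fallbackProfile`. `ChainProfile` and `collectFallbackChain` are hypothetical names, and ending the walk at an undefined profile is an assumption.

```typescript
interface ChainProfile {
  fallbackProfile?: string;
}

function collectFallbackChain(
  profiles: Record<string, ChainProfile | undefined>,
  startName: string,
): string[] {
  const visited = new Set<string>([startName]);
  const chain: string[] = [];
  let nextName: string | undefined = profiles[startName]?.fallbackProfile;

  while (nextName && !visited.has(nextName)) {
    visited.add(nextName);
    chain.push(nextName);
    // Assumption: a name with no matching profile simply ends the walk.
    nextName = profiles[nextName]?.fallbackProfile;
  }
  return chain;
}
```

With profiles `a` and `b` that point at each other, starting from `a` yields `["b"]` and then stops, because `a` is already in the visited set.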

### Provider Normalization

The router normalizes provider names to handle common variations:

```typescript theme={null}
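function normalizeProvider(provider: string): string {
  // Assumption: names are compared case-insensitively with whitespace trimmed.
  const normalized = provider.trim().toLowerCase();
  if (normalized === "z.ai" || normalized === "z-ai") return "zai";
  if (normalized === "opencode-zen") return "opencode";
  return normalized;
}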
```

<Info>
  The router also infers providers from model names (`claude-*` -> `anthropic`, `gpt-*` -> `openai`, `glm-*` -> `zai`, etc.) to handle misconfigured profiles.
</Info>
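
Prefix-based inference of this kind can be sketched as a small lookup. Only the three prefixes quoted above are included; `inferProviderFromModel` is a hypothetical name, and returning `undefined` for an unrecognized model is an assumption.

```typescript
// Prefix-to-provider pairs quoted on this page; the real list is longer ("etc.").
const PREFIX_TO_PROVIDER: Array<[string, string]> = [
  ["claude-", "anthropic"],
  ["gpt-", "openai"],
  ["glm-", "zai"],
];

function inferProviderFromModel(model: string): string | undefined {
  const name = model.trim().toLowerCase();
  const hit = PREFIX_TO_PROVIDER.find(([prefix]) => name.startsWith(prefix));
  // Assumption: unknown models yield undefined so the caller can decide what to do.
  return hit ? hit[1] : undefined;
}
```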

## Dashboard Model Badge

The dashboard displays a color-coded badge on each message showing which tier was used:

| Color  | Tier     | Meaning                |
| ------ | -------- | ---------------------- |
| Green  | LOCAL    | Free local model       |
| Yellow | FAST     | Fast/cheap cloud model |
| Blue   | BALANCED | Mid-tier cloud model   |
| Purple | POWERFUL | Premium cloud model    |

## Session Model Override

Individual sessions can override the router with a specific model:

```typescript theme={null}
type SessionModelOverride = {
  provider: string;
  model: string;
  reason?: string;
};
```

This is used when the user explicitly selects a model in the dashboard, or when a subsystem (like the execution worker) has a configured model preference.
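
The precedence implied here, where an override wins over the routed choice, can be sketched as follows. `resolveSessionModel`, `RoutedChoice`, and `ResolvedModel` are hypothetical names introduced for illustration, and the `source` field is an assumption added to show which path was taken.

```typescript
// Same shape as the SessionModelOverride type shown above.
type SessionOverrideSketch = { provider: string; model: string; reason?: string };

interface RoutedChoice {
  provider: string;
  model: string;
}

interface ResolvedModel extends RoutedChoice {
  source: "override" | "router"; // hypothetical field, for illustration only
}

function resolveSessionModel(
  routed: RoutedChoice,
  override?: SessionOverrideSketch,
): ResolvedModel {
  if (override) {
    // An explicit override bypasses complexity-based routing entirely.
    return { provider: override.provider, model: override.model, source: "override" };
  }
  return { provider: routed.provider, model: routed.model, source: "router" };
}
```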

## Memory Query Boost

<Info>
  Memory recall queries receive a +0.25 score boost to ensure they route to at least the FAST tier. Local models lack the context window and reasoning needed to effectively process retrieved memories and synthesize them into useful responses.
</Info>

## Configuration

Full router configuration in `~/.argentos/argent.json`:

```json theme={null}
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": true,
        "defaultProfile": "default",
        "thresholds": {
          "local": 0.3,
          "fast": 0.5,
          "balanced": 0.8
        },
        "tiers": {
          "local":    { "provider": "ollama",    "model": "qwen3:30b-a3b" },
          "fast":     { "provider": "anthropic", "model": "claude-sonnet-4-20250514" },
          "balanced": { "provider": "anthropic", "model": "claude-opus-4-20250514" },
          "powerful": { "provider": "anthropic", "model": "claude-opus-4-20250514" }
        },
        "profiles": {}
      }
    }
  }
}
```
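
Because the three thresholds partition the score range, a custom configuration only makes sense if they are ascending and within [0, 1]. The check below is a hypothetical helper for validating a config before use; it is not described as part of ArgentOS, and `validateThresholds` and `ThresholdConfig` are assumed names.

```typescript
interface ThresholdConfig {
  local: number;
  fast: number;
  balanced: number;
}

// Returns a list of human-readable problems; an empty list means the config is usable.
function validateThresholds(t: ThresholdConfig): string[] {
  const problems: string[] = [];
  const entries: Array<[string, number]> = [
    ["local", t.local],
    ["fast", t.fast],
    ["balanced", t.balanced],
  ];
  for (const [name, value] of entries) {
    // Written as a negated range check so that NaN is also rejected.
    if (!(value >= 0 && value <= 1)) problems.push(name + " must be within [0, 1]");
  }
  if (!(t.local < t.fast && t.fast < t.balanced)) {
    problems.push("thresholds must satisfy local < fast < balanced");
  }
  return problems;
}
```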

## Disabling the Router

Set `enabled: false` to bypass routing and always use the default model:

```json theme={null}
{
  "agents": {
    "defaults": {
      "modelRouter": {
        "enabled": false
      }
    }
  }
}
```

## Key Files

| File                                   | LOC   | Description                                    |
| -------------------------------------- | ----- | ---------------------------------------------- |
| `src/models/router.ts`                 | \~586 | Complexity scoring and tier routing            |
| `src/models/types.ts`                  | \~104 | ModelTier, RoutingDecision, ModelProfile types |
| `src/models/builtin-profiles.ts`       | —     | Built-in profile definitions                   |
| `src/agents/provider-registry.ts`      | \~126 | Dynamic provider registry                      |
| `src/agents/provider-registry-seed.ts` | \~648 | 15+ provider definitions                       |
