Model Failover

Automatic failover with exponential backoff, cooldowns, and profile rotation.

Overview

ArgentOS is designed to stay online even when individual API providers or accounts hit limits. The failover system automatically rotates between auth profiles and providers, using exponential backoff to avoid hammering rate-limited services.

Failover Chain

When an API call fails, ArgentOS follows this chain:

Primary Profile → Next Profile (same provider) → Alternative Provider → Error

Example

  1. Call to anthropic:titanium fails (429 rate limit)
  2. anthropic:titanium enters cooldown
  3. Retry with anthropic:webdevtoday
  4. If that also fails, try anthropic:semfreak
  5. If all Anthropic profiles are in cooldown, try MiniMax or Z.AI (if configured for the current tier)
  6. If no providers are available, return an error to the user
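
The selection order in the example above can be sketched as a small function. This is an illustrative sketch only, not ArgentOS's actual implementation: the function name `next_target`, the provider identifiers `minimax` and `zai`, and the idea of passing cooldown state as a set are all assumptions made for the example.

```python
from typing import Optional, Set

# Profile order taken from the example above; a real setup may differ.
PROFILES = ["anthropic:titanium", "anthropic:webdevtoday", "anthropic:semfreak"]
# Hypothetical identifiers for the alternative providers.
FALLBACK_PROVIDERS = ["minimax", "zai"]


def next_target(cooling_down: Set[str], fallbacks_enabled: Set[str]) -> Optional[str]:
    """Return the first usable profile or provider, or None if nothing is available."""
    # Prefer profiles on the primary provider, in configured order.
    for profile in PROFILES:
        if profile not in cooling_down:
            return profile
    # Only then consider alternative providers that are configured.
    for provider in FALLBACK_PROVIDERS:
        if provider in fallbacks_enabled and provider not in cooling_down:
            return provider
    # Nothing left: the caller reports an error to the user.
    return None
```

With `anthropic:titanium` in cooldown, the sketch picks `anthropic:webdevtoday`; with all three Anthropic profiles in cooldown and no alternative provider enabled, it returns `None`.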

Exponential Backoff

When a profile hits a rate limit, it enters a cooldown whose duration doubles with each successive failure, up to a maximum of 8 minutes:

Failure Count    Cooldown Duration
1st              30 seconds
2nd              1 minute
3rd              2 minutes
4th              4 minutes
5th+             8 minutes (max)

A successful call from a profile resets that profile's cooldown.
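
The durations listed above follow a doubling rule starting at 30 seconds and capped at 8 minutes. A minimal sketch of that arithmetic follows; the function and constant names are hypothetical, and how ArgentOS computes the value internally is not stated here.

```python
BASE_COOLDOWN_SECONDS = 30       # cooldown after the 1st failure
MAX_COOLDOWN_SECONDS = 8 * 60    # cap reached at the 5th failure


def cooldown_seconds(failure_count: int) -> int:
    """Cooldown for the Nth failure (N >= 1): doubles each time, up to the cap."""
    if failure_count < 1:
        return 0  # no failures recorded, so no cooldown
    return min(BASE_COOLDOWN_SECONDS * 2 ** (failure_count - 1), MAX_COOLDOWN_SECONDS)
```

For failure counts 1 through 6 this yields 30, 60, 120, 240, 480, and 480 seconds.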

Cooldown Management

Viewing Cooldown Status

argent status

The status command shows which profiles are active, which are in cooldown, and when they will be available again.

Manual Reset

argent model reset-cooldowns

Clears all cooldown timers, making all profiles immediately available.

Error Types

Different errors trigger different behaviors:

Error               Action
429 Rate Limit      Cooldown + rotate to next profile
529 Overloaded      Cooldown + rotate to next profile
401 Unauthorized    Disable profile (requires manual fix)
Network error       Short cooldown + retry
Context overflow    Skip auth profile rotation (not a provider issue)
500 Server Error    Short cooldown + retry once

Context overflow errors do not trigger profile rotation because the problem lies in the request itself, not the provider, so switching profiles would not help.
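
The error-to-action mapping above can be expressed as a lookup. The sketch below is illustrative: the `Action` enum, the `classify` function, and the choice to raise on unlisted status codes are assumptions for the example, since the behavior for other errors is not described here.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    COOLDOWN_AND_ROTATE = "cooldown + rotate to next profile"
    DISABLE_PROFILE = "disable profile (requires manual fix)"
    SHORT_COOLDOWN_AND_RETRY = "short cooldown + retry"
    SHORT_COOLDOWN_AND_RETRY_ONCE = "short cooldown + retry once"
    SKIP_ROTATION = "skip auth profile rotation"


# HTTP status codes listed in the error table.
STATUS_ACTIONS = {
    429: Action.COOLDOWN_AND_ROTATE,
    529: Action.COOLDOWN_AND_ROTATE,
    401: Action.DISABLE_PROFILE,
    500: Action.SHORT_COOLDOWN_AND_RETRY_ONCE,
}


def classify(status: Optional[int] = None, *, network_error: bool = False,
             context_overflow: bool = False) -> Action:
    """Map a failure to the action from the table."""
    # Context overflow is a request problem, so it is checked before anything else.
    if context_overflow:
        return Action.SKIP_ROTATION
    if network_error:
        return Action.SHORT_COOLDOWN_AND_RETRY
    if status in STATUS_ACTIONS:
        return STATUS_ACTIONS[status]
    # Assumption: errors outside the table are surfaced rather than guessed at.
    raise ValueError(f"no documented action for status {status!r}")
```

For example, `classify(429)` returns `Action.COOLDOWN_AND_ROTATE`, while `classify(context_overflow=True)` returns `Action.SKIP_ROTATION`.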

Provider-Specific Notes

Anthropic Max Subscriptions

Max subscriptions have weekly quotas that reset on a rolling basis. When one account's quota is exhausted, failover to another subscription keeps the agent running.

MiniMax Limitations

MiniMax cannot accept tool call history with Anthropic-format IDs. Failover to MiniMax only works for:

  • Fresh sessions with no tool history
  • Non-tool conversations

The model router accounts for this and avoids routing mid-session tool conversations to MiniMax.
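
The eligibility rule above amounts to checking whether a session carries any tool call history. The sketch below assumes a hypothetical message shape (dictionaries with a `role` key and an optional `tool_calls` list); the actual session format and router logic in ArgentOS may differ.

```python
from typing import Any, Dict, List


def can_route_to_minimax(messages: List[Dict[str, Any]]) -> bool:
    """True when the session has no tool calls or tool results in its history."""
    for message in messages:
        # Hypothetical schema: assistant turns list their calls under "tool_calls",
        # and tool results are stored as messages with role "tool".
        if message.get("tool_calls") or message.get("role") == "tool":
            return False
    return True
```

A fresh session (an empty history) and a plain text conversation both pass this check; any session containing a tool call or tool result does not.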

Local Models (Ollama)

Local models via Ollama have no rate limits or costs but are limited in capability. The LOCAL tier is always available as a last-resort fallback for simple queries.