Overview
ArgentOS is designed to stay online even when individual API providers or accounts hit limits. The failover system automatically rotates between auth profiles and providers, using exponential backoff to avoid hammering rate-limited services.
Failover Chain
When an API call fails, ArgentOS follows this chain:
Example
Cross-provider fallback
If all Anthropic profiles are in cooldown, try MiniMax or Z.AI (if configured for the current tier).
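The cross-provider fallback can be pictured as a scan over profiles in provider preference order. The following is a minimal sketch, not ArgentOS's actual implementation: the `Profile` record, its field names, and `pick_profile` are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    """Hypothetical auth profile record; field names are illustrative only."""
    name: str
    provider: str
    cooldown_until: float = 0.0  # Unix timestamp; 0 means not cooling down
    disabled: bool = False


def pick_profile(profiles: list[Profile], provider_order: list[str],
                 now: Optional[float] = None) -> Optional[Profile]:
    """Return the first usable profile, walking providers in preference order."""
    now = time.time() if now is None else now
    for provider in provider_order:
        for profile in profiles:
            if profile.provider != provider:
                continue
            if profile.disabled or profile.cooldown_until > now:
                continue  # skip profiles that are disabled or still cooling down
            return profile
    return None  # every configured profile is currently unavailable
```

With every Anthropic profile in cooldown, a call such as `pick_profile(profiles, ["anthropic", "minimax"])` would return the first available MiniMax profile, or `None` if nothing is usable.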
Exponential Backoff
When a profile hits a rate limit, it enters an exponential backoff cooldown:

| Failure Count | Cooldown Duration |
|---|---|
| 1st | 30 seconds |
| 2nd | 1 minute |
| 3rd | 2 minutes |
| 4th | 4 minutes |
| 5th+ | 8 minutes (max) |
After a successful call from a profile, that profile's cooldown is cleared.
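The schedule in the table is a doubling sequence that starts at 30 seconds and is capped at 8 minutes. A minimal sketch of that arithmetic follows; the function and constant names are hypothetical, and only the durations come from the table.

```python
BASE_COOLDOWN_SECONDS = 30      # cooldown after the 1st failure
MAX_COOLDOWN_SECONDS = 8 * 60   # cap reached at the 5th failure


def cooldown_seconds(failure_count: int) -> int:
    """Return the cooldown for the given consecutive failure count."""
    if failure_count < 1:
        return 0  # no failures recorded, so no cooldown
    # Double the base for each additional failure, then apply the cap.
    return min(BASE_COOLDOWN_SECONDS * 2 ** (failure_count - 1),
               MAX_COOLDOWN_SECONDS)
```

For failure counts 1 through 5 this yields 30, 60, 120, 240, and 480 seconds, and any higher count stays at 480.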
Cooldown Management
Viewing Cooldown Status
Manual Reset
Error Types
Different errors trigger different behaviors:

| Error | Action |
|---|---|
| 429 Rate Limit | Cooldown + rotate to next profile |
| 529 Overloaded | Cooldown + rotate to next profile |
| 401 Unauthorized | Disable profile (requires manual fix) |
| Network error | Short cooldown + retry |
| Context overflow | Skip auth profile rotation (not a provider issue) |
| 500 Server Error | Short cooldown + retry once |
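The table can be read as a lookup from error to action. The sketch below encodes it that way, under assumptions: the `Action` enum, the `action_for` function, and the `kind` strings (`"network"`, `"context_overflow"`) are hypothetical labels for illustration, and raising on unlisted errors is a choice made here because the table does not cover them.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    COOLDOWN_AND_ROTATE = "cooldown + rotate to next profile"
    DISABLE_PROFILE = "disable profile (requires manual fix)"
    SHORT_COOLDOWN_AND_RETRY = "short cooldown + retry"
    SHORT_COOLDOWN_AND_RETRY_ONCE = "short cooldown + retry once"
    SKIP_ROTATION = "skip auth profile rotation"


# HTTP status codes listed in the table.
_STATUS_ACTIONS = {
    429: Action.COOLDOWN_AND_ROTATE,            # rate limit
    529: Action.COOLDOWN_AND_ROTATE,            # overloaded
    401: Action.DISABLE_PROFILE,                # unauthorized
    500: Action.SHORT_COOLDOWN_AND_RETRY_ONCE,  # server error
}

# Non-HTTP failures listed in the table; the keys are illustrative labels.
_KIND_ACTIONS = {
    "network": Action.SHORT_COOLDOWN_AND_RETRY,
    "context_overflow": Action.SKIP_ROTATION,   # not a provider issue
}


def action_for(status: Optional[int] = None, kind: str = "") -> Action:
    """Map an HTTP status or a failure kind to the action from the table."""
    if status in _STATUS_ACTIONS:
        return _STATUS_ACTIONS[status]
    if kind in _KIND_ACTIONS:
        return _KIND_ACTIONS[kind]
    raise ValueError(f"no action listed for status={status!r}, kind={kind!r}")
```

Keeping context overflow separate from the status codes mirrors the table's point that rotating profiles does not help when the request itself is too large.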
Provider-Specific Notes
Anthropic Max Subscriptions
Max subscriptions have weekly quotas that reset on a rolling basis. When one account’s quota is exhausted, failover to another subscription keeps the agent running.
MiniMax Limitations
MiniMax cannot accept tool call history with Anthropic-format IDs. Failover to MiniMax only works for:
- Fresh sessions with no tool history
- Non-tool conversations
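The two conditions above amount to checking whether the session history contains any tool activity before choosing MiniMax as a fallback. The sketch below assumes the history is a list of dicts in the Anthropic Messages shape, where tool calls and results appear as content blocks of type `tool_use` and `tool_result`; how ArgentOS stores sessions internally is not described here, and both function names are hypothetical.

```python
def has_tool_history(messages: list[dict]) -> bool:
    """Return True if any message carries a tool call or tool result block."""
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content cannot hold tool blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") in ("tool_use", "tool_result"):
                return True
    return False


def can_fail_over_to_minimax(messages: list[dict]) -> bool:
    """Allow MiniMax only for fresh sessions or conversations without tools."""
    return not has_tool_history(messages)
```

An empty history and a text-only conversation both pass the check, while a session containing a single `tool_use` block does not.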
Local Models (Ollama)
Local models via Ollama have no rate limits or costs but are limited in capability. The LOCAL tier is always available as an ultimate fallback for simple queries.
