The Agent Loop
How ArgentOS processes each message — receive, think, act, respond.
Overview
The agent loop is the core execution cycle that runs for every incoming message. It follows a receive -> think -> act -> respond pattern, with multiple tool-use iterations possible before a final response.
The Loop
┌─────────────────────────────────────────┐
│ 1. Receive message │
│ 2. Inject context (time, memory, etc.) │
│ 3. Send to LLM │
│ 4. LLM returns response │
│ ├── Text only? → Send response, done│
│ └── Tool use? → Execute tool(s) │
│ └── Send result back to LLM │
│ └── Go to step 4 │
└─────────────────────────────────────────┘
Step by Step
1. Receive Message
The gateway delivers a normalized message to the agent runtime. The message includes the text content, sender identity, channel source, and any attachments.
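A normalized message can be sketched as a small record; the field names below are illustrative, not the runtime's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class InboundMessage:
    # Hypothetical field names mirroring the prose above:
    # text content, sender identity, channel source, attachments.
    text: str
    sender: str
    channel: str
    attachments: list = field(default_factory=list)

msg = InboundMessage(text="What's on my calendar?", sender="user:42", channel="sms")
```

Normalization means every downstream step sees the same shape regardless of whether the message arrived via SMS, Slack, or another channel.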
2. Context Injection
Before sending to the LLM, ArgentOS injects contextual information:
- Timestamp envelope: [Wed 2026-02-12 10:30 America/Chicago], plus the elapsed time since the last message
- Memory bootstrap: relevant memories recalled from MemU based on the message content
- Plugin injections: Any active plugins can inject additional context
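The timestamp envelope could be assembled like this; the exact format string and wording are assumptions based on the example above:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def timestamp_envelope(now: datetime, last: datetime, tz: str = "America/Chicago") -> str:
    """Render a timestamp envelope plus elapsed time since the last
    message (format is illustrative, not ArgentOS's exact template)."""
    local = now.astimezone(ZoneInfo(tz))
    stamp = local.strftime("[%a %Y-%m-%d %H:%M ") + tz + "]"
    minutes = int((now - last).total_seconds() // 60)
    return f"{stamp} ({minutes}m since last message)"
```

Injecting the envelope as plain text in the prompt lets any model reason about local time and conversation gaps without tool calls.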
3. Send to LLM
The assembled prompt (system prompt + conversation history + new message) is sent to the LLM via the model router. The router selects the appropriate model tier based on complexity scoring.
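Tier selection might look like the following sketch; the thresholds and tier names are hypothetical, since the actual complexity-scoring rules aren't specified here:

```python
def select_tier(complexity: float) -> str:
    """Map a 0-1 complexity score to a model tier.
    Thresholds and names are assumptions for illustration."""
    if complexity < 0.3:
        return "fast"       # cheap, low-latency model for simple turns
    if complexity < 0.7:
        return "balanced"   # mid-tier default
    return "frontier"       # strongest model for hard requests
```

Routing by complexity keeps simple exchanges cheap while reserving the strongest model for requests that need it.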
4. Process Response
The LLM response can contain:
- Text blocks: Natural language responses streamed to the user
- Tool use blocks: Requests to execute tools (bash, memory, tasks, browser, etc.)
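Separating the two block types is a simple filter; the block shape below is assumed, loosely modeled on common LLM APIs:

```python
def split_blocks(response_blocks):
    """Partition a response into text blocks and tool-use blocks
    (dict shape is an assumption, not ArgentOS's actual wire format)."""
    text = [b["text"] for b in response_blocks if b["type"] == "text"]
    tools = [b for b in response_blocks if b["type"] == "tool_use"]
    return text, tools
```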
5. Tool Execution
When the model requests a tool, the agent runtime:
- Validates the tool call against the tool schema
- Checks tool policies (sandbox mode, approval requirements)
- Executes the tool handler
- Returns the tool result to the model
The model then processes the result and may request more tools or generate a final response.
This loop can iterate multiple times. A single user message might trigger several tool calls before the agent produces a final answer.
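The iteration described above can be sketched as a bounded loop; `call_llm` and `run_tool` are stand-ins, and the message/result shapes are assumptions:

```python
def agent_loop(call_llm, run_tool, messages, max_iterations=10):
    """Call the LLM repeatedly, executing requested tools, until it
    returns a text-only response (a sketch of the loop, not the
    actual runtime). Bounded to avoid infinite tool-use cycles."""
    for _ in range(max_iterations):
        response = call_llm(messages)
        tool_calls = [b for b in response if b.get("type") == "tool_use"]
        if not tool_calls:
            # Text only: this is the final answer.
            return [b["text"] for b in response if b.get("type") == "text"]
        # Record the assistant turn, run each tool, and feed results back.
        messages.append({"role": "assistant", "content": response})
        results = [
            {"type": "tool_result", "id": c["id"], "content": run_tool(c)}
            for c in tool_calls
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool-use iteration limit reached")
```

The iteration cap is a safety valve: a model stuck requesting tools forever fails loudly instead of looping.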
Streaming
Responses are streamed token-by-token back to the user through the gateway. This means the user sees the response being generated in real-time, similar to the ChatGPT or Claude.ai experience.
Tool execution happens between stream chunks. The user may see a pause while tools are running.
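From the consumer's side, streaming is just iterating text deltas as they arrive; the chunk types below are illustrative:

```python
def stream_text(chunks):
    """Yield text deltas from a chunk stream; non-text chunks (such as
    a tool running between stream segments) produce no output, which
    the user perceives as a pause. Chunk shape is an assumption."""
    for chunk in chunks:
        if chunk["type"] == "text_delta":
            yield chunk["text"]
```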
Error Handling
If a tool execution fails, the error is returned to the model as a tool result with an error flag. The model can then decide to retry, try a different approach, or inform the user about the failure.
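A tool result with an error flag might be shaped like this; the field names are assumptions:

```python
def tool_result(call_id, output=None, error=None):
    """Build a tool result to return to the model. The is_error flag
    lets the model decide whether to retry, change approach, or tell
    the user (field names are illustrative)."""
    return {
        "type": "tool_result",
        "id": call_id,
        "is_error": error is not None,
        "content": error if error is not None else output,
    }
```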
If the LLM API call fails (rate limit, network error), the agent runtime uses the failover system to retry with a different provider or profile.
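Failover reduces to trying providers in order until one succeeds; this is a simplified sketch, not the actual failover system:

```python
def call_with_failover(providers, prompt):
    """Try each (name, call) provider in order; on failure (rate limit,
    network error, etc.) fall through to the next. A minimal sketch of
    the failover behavior described above."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all providers failed: {errors}")
```

A fuller implementation would also distinguish retryable errors from permanent ones and back off between attempts.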