The Agent Loop
How ArgentOS processes each message — receive, think, act, respond.
Overview
The agent loop is the core execution cycle that runs for every incoming message. It follows a receive -> think -> act -> respond pattern, with multiple tool-use iterations possible before a final response.
The Loop
┌─────────────────────────────────────────┐
│ 1. Receive message │
│ 2. Inject context (time, memory, etc.) │
│ 3. Send to LLM │
│ 4. LLM returns response │
│ ├── Text only? → Send response, done│
│ └── Tool use? → Execute tool(s) │
│ └── Send result back to LLM │
│ └── Go to step 4 │
└─────────────────────────────────────────┘
Step by Step
1. Receive Message
The gateway delivers a normalized message to the agent runtime. The message includes the text content, sender identity, channel source, and any attachments.
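A normalized message can be sketched as a small record; the field names below are illustrative, not the runtime's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class InboundMessage:
    # Hypothetical field names mirroring the prose above:
    # text content, sender identity, channel source, attachments.
    text: str
    sender: str
    channel: str
    attachments: list = field(default_factory=list)

msg = InboundMessage(text="What's on my calendar?", sender="user:42", channel="sms")
```

Normalization means every downstream step sees the same shape regardless of whether the message arrived via SMS, Slack, or another channel.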
2. Context Injection
Before sending to the LLM, ArgentOS injects contextual information:
- Timestamp envelope: [Wed 2026-02-12 10:30 America/Chicago], plus the elapsed time since the last message
- Memory bootstrap: relevant memories recalled from MemU based on the message content
- Plugin injections: Any active plugins can inject additional context
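The timestamp envelope could be assembled like this; the exact format string and wording are assumptions based on the example above:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def timestamp_envelope(now: datetime, last: datetime, tz: str = "America/Chicago") -> str:
    """Render a timestamp envelope plus elapsed time since the last
    message (format is illustrative, not ArgentOS's exact template)."""
    local = now.astimezone(ZoneInfo(tz))
    stamp = local.strftime("[%a %Y-%m-%d %H:%M ") + tz + "]"
    minutes = int((now - last).total_seconds() // 60)
    return f"{stamp} ({minutes}m since last message)"
```

Injecting the envelope as plain text in the prompt lets any model reason about local time and conversation gaps without tool calls.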
3. Send to LLM
The assembled prompt (system prompt + conversation history + new message) is sent to the LLM via the model router. The router selects the appropriate model tier based on complexity scoring.
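Tier selection might look like the following sketch; the thresholds and tier names are hypothetical, since the actual complexity-scoring rules aren't specified here:

```python
def select_tier(complexity: float) -> str:
    """Map a 0-1 complexity score to a model tier.
    Thresholds and names are assumptions for illustration."""
    if complexity < 0.3:
        return "fast"       # cheap, low-latency model for simple turns
    if complexity < 0.7:
        return "balanced"   # mid-tier default
    return "frontier"       # strongest model for hard requests
```

Routing by complexity keeps simple exchanges cheap while reserving the strongest model for requests that need it.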
4. Process Response
The LLM response can contain:
- Text blocks: Natural language responses streamed to the user
- Tool use blocks: Requests to execute tools (bash, memory, tasks, browser, etc.)
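Separating the two block types is a simple filter; the block shape below is assumed, loosely modeled on common LLM APIs:

```python
def split_blocks(response_blocks):
    """Partition a response into text blocks and tool-use blocks
    (dict shape is an assumption, not ArgentOS's actual wire format)."""
    text = [b["text"] for b in response_blocks if b["type"] == "text"]
    tools = [b for b in response_blocks if b["type"] == "tool_use"]
    return text, tools
```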
5. Tool Execution
When the model requests a tool, the agent runtime:
- Validates the tool call against the tool schema
- Checks tool policies (sandbox mode, approval requirements)
- Executes the tool handler
- Returns the tool result to the model
The model then processes the result and may request more tools or generate a final response.
This loop can iterate multiple times. A single user message might trigger several tool calls before the agent produces a final answer.
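The iteration described above can be sketched as a bounded loop; `call_llm` and `run_tool` are stand-ins, and the message/result shapes are assumptions:

```python
def agent_loop(call_llm, run_tool, messages, max_iterations=10):
    """Call the LLM repeatedly, executing requested tools, until it
    returns a text-only response (a sketch of the loop, not the
    actual runtime). Bounded to avoid infinite tool-use cycles."""
    for _ in range(max_iterations):
        response = call_llm(messages)
        tool_calls = [b for b in response if b.get("type") == "tool_use"]
        if not tool_calls:
            # Text only: this is the final answer.
            return [b["text"] for b in response if b.get("type") == "text"]
        # Record the assistant turn, run each tool, and feed results back.
        messages.append({"role": "assistant", "content": response})
        results = [
            {"type": "tool_result", "id": c["id"], "content": run_tool(c)}
            for c in tool_calls
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool-use iteration limit reached")
```

The iteration cap is a safety valve: a model stuck requesting tools forever fails loudly instead of looping.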
Streaming
Responses are streamed token-by-token back to the user through the gateway. This means the user sees the response being generated in real-time, similar to the ChatGPT or Claude.ai experience.
Tool execution happens between stream chunks. The user may see a pause while tools are running.
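From the consumer's side, streaming is just iterating text deltas as they arrive; the chunk types below are illustrative:

```python
def stream_text(chunks):
    """Yield text deltas from a chunk stream; non-text chunks (such as
    a tool running between stream segments) produce no output, which
    the user perceives as a pause. Chunk shape is an assumption."""
    for chunk in chunks:
        if chunk["type"] == "text_delta":
            yield chunk["text"]
```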
Error Handling
If a tool execution fails, the error is returned to the model as a tool result with an error flag. The model can then decide to retry, try a different approach, or inform the user about the failure.
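A tool result with an error flag might be shaped like this; the field names are assumptions:

```python
def tool_result(call_id, output=None, error=None):
    """Build a tool result to return to the model. The is_error flag
    lets the model decide whether to retry, change approach, or tell
    the user (field names are illustrative)."""
    return {
        "type": "tool_result",
        "id": call_id,
        "is_error": error is not None,
        "content": error if error is not None else output,
    }
```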
If the LLM API call fails (rate limit, network error), the agent runtime uses the failover system to retry with a different provider or profile.
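Failover reduces to trying providers in order until one succeeds; this is a simplified sketch, not the actual failover system:

```python
def call_with_failover(providers, prompt):
    """Try each (name, call) provider in order; on failure (rate limit,
    network error, etc.) fall through to the next. A minimal sketch of
    the failover behavior described above."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all providers failed: {errors}")
```

A fuller implementation would also distinguish retryable errors from permanent ones and back off between attempts.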