> ## Documentation Index
> Fetch the complete documentation index at: https://docs.argentos.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# RALF + ANGEL

> Response Accountability Llama Framework and the ANGEL verification loop.

## What Is RALF?

**RALF** (Response Accountability Llama Framework) is the heartbeat verification pipeline that ensures the agent actually does what it claims to do. Every heartbeat cycle, the agent receives a task contract, executes it, and produces a response. RALF then audits that response using a secondary model -- preferring a local Llama model via Ollama (free, fast) with a cloud fallback to Claude Haiku.

<Note>
  The agent does not control the verification. The harness owns it.
</Note>

## What Is ANGEL?

**ANGEL** ("The Angel on the Shoulder") is the verification sidecar within RALF. It takes the task contract + agent response and produces per-task verdicts: `verified`, `not_verified`, or `unclear`. The name comes from having an independent observer watching over the agent's shoulder, checking every claim.

## Architecture

```mermaid theme={null}
flowchart TD
    A[HEARTBEAT.md] --> B[Parse Tasks + Track Progress]
    C[Heartbeat Prompt] --> D[Inject Contract + Score + Retry Feedback]
    B --> D
    D --> E[Agent Executes]
    E --> F[Ground Truth Collection]
    E --> G[ANGEL Verifier]
    F --> H[Verdict Application]
    G --> H
    H --> I[Progress Tracker]
    H --> J[Score Engine]
    I --> K[Retry failed tasks]
    J --> L[Points +/- and target check]
```

## The Heartbeat Loop

The heartbeat runs on a configurable interval (default: 15 minutes). Each cycle:

<Steps>
  <Step title="Load contract">Parse `HEARTBEAT.md` into structured tasks</Step>
  <Step title="Initialize progress">Carry over retry state, reset verified/skipped tasks</Step>
  <Step title="Check score state">Load accountability score, determine penalty/reward levels</Step>
  <Step title="Build prompt">Inject contract tasks, retry feedback, and score section</Step>
  <Step title="Agent executes">Main agent processes the heartbeat, calls tools, produces response</Step>
  <Step title="Collect ground truth">Query real APIs (email, social, etc.) for actual system state</Step>
  <Step title="Run ANGEL">Send (contract + response + ground truth) to verification sidecar</Step>
  <Step title="Apply verdicts">Update progress tracker, record score, detect contradictions</Step>
  <Step title="Persist">Save progress and score state to disk</Step>
  <Step title="Schedule next">Score may override the interval (shorter for poor performance, longer for outstanding)</Step>
</Steps>

### Interval Overrides

| Score Level                       | Interval                     |
| --------------------------------- | ---------------------------- |
| Lockdown (critically low)         | 8 minutes                    |
| Escalated (negative score)        | 10 minutes                   |
| Tightened (low score)             | 12 minutes                   |
| Normal                            | Config default               |
| Outstanding (high score + streak) | 20 minutes (earned autonomy) |

## ANGEL: The Verification Sidecar

### Model Selection

ANGEL uses a two-tier model strategy:

1. **Local Ollama** (primary) -- Uses `qwen3:1.7b` by default (tiny, fast, free). Verification is binary classification so a small model works well.
2. **Claude Haiku** (fallback) -- If Ollama is unavailable, falls back to Haiku via the Anthropic API.

<Tip>
  The key insight: verification doesn't need a powerful model. It's checking "did the agent do what it said?" -- a classification task, not a creative one.
</Tip>

### Verdicts

| Verdict        | Meaning                                 | Score Impact                    |
| -------------- | --------------------------------------- | ------------------------------- |
| `verified`     | Clear evidence the task was completed   | +10 (required) or +5 (optional) |
| `not_verified` | No evidence, or agent only mentioned it | -15                             |
| `unclear`      | Ambiguous, partial evidence             | -2                              |

### Verification Standards

The verifier is instructed to be **strict**:

* "I'll check X" is NOT the same as having checked X
* Evidence means: tool calls, specific data retrieved, actions taken, content created
* Ground truth overrides the agent's self-reporting

## Ground Truth System

Before ANGEL runs, the heartbeat runner collects actual state from real APIs. This data is injected into the verification prompt so the verifier can catch fabrication.

**Example:** The agent says "Checked inbox, 0 new messages." Ground truth shows 3 unread emails. The verifier sees both claims and marks the task as `not_verified` with a ground truth contradiction.

<Danger>
  **Ground truth contradiction** carries a severe -30 point penalty (stacking with the -15 not\_verified penalty, for -45 total). This is the system's strongest anti-fabrication mechanism.
</Danger>

### API Key Resolution

Ground truth checks need API keys, resolved via:

1. `~/.argentos/service-keys.json` (dashboard-managed, primary)
2. `process.env` (gateway plist environment)
3. `argent.json env.vars` (config fallback)

<Info>
  If no key is available for a ground truth source, that check is silently skipped. The system degrades gracefully.
</Info>

## Configuration

```json theme={null}
{
  "agents": {
    "defaults": {
      "heartbeat": {
        "enabled": true,
        "every": "15m",
        "activeHours": {
          "start": "07:00",
          "end": "23:00",
          "timezone": "America/Chicago"
        },
        "verifier": {
          "enabled": true,
          "model": "qwen3:1.7b"
        }
      }
    }
  }
}
```

| Key                 | Default        | Description                         |
| ------------------- | -------------- | ----------------------------------- |
| `enabled`           | `true`         | Enable/disable heartbeat            |
| `every`             | `"15m"`        | Base interval between heartbeats    |
| `verifier.enabled`  | `true`         | Enable/disable ANGEL verification   |
| `verifier.model`    | `"qwen3:1.7b"` | Ollama model for local verification |
| `activeHours.start` | --             | When heartbeats start (HH:MM)       |
| `activeHours.end`   | --             | When heartbeats stop (HH:MM)        |

## Design Philosophy

1. **Trust but verify**: The agent has full autonomy to work, but every claim is independently checked
2. **Free first**: Local Llama handles verification at zero cost. Cloud is only a fallback.
3. **Ground truth over self-reporting**: Real API data always overrides what the agent says
4. **Consequences, not just monitoring**: Score impacts the agent's autonomy (interval, required tasks)
5. **Anti-gaming by design**: Moving target ratchet, ground truth checks, and strict verification make gaming futile
6. **Graceful degradation**: If Ollama is down, fall back to Haiku. If Haiku is down, skip verification. Nothing crashes.