# Accountability System

> Heartbeat contract format, task verification, scoring, penalty/reward tiers, and interval adjustments.

**ArgentOS Business** -- This feature is part of ArgentOS Business. The architecture is documented here for all users, but full functionality requires a Business license. [Learn more about Business](/business)

## Overview

The Accountability System is ArgentOS's mechanism for ensuring agent reliability. It combines a structured heartbeat contract (HEARTBEAT.md), automated task verification, a running accountability score with penalty and reward tiers, and dynamic interval adjustments. The system motivates the agent to complete its commitments honestly while providing the operator with quantitative trust metrics.

```mermaid
flowchart TD
    A[HEARTBEAT.md] --> B[Parse Contract]
    B --> C[Execute Tasks]
    C --> D[Verify Completion]
    D -->|Verified| E[Score Update +]
    D -->|Not Verified| F[Score Update -]
    E --> G[Interval Adjustment]
    F --> G
    G --> H[Next Cycle]
```

## HEARTBEAT.md Contract Format

The agent writes and maintains a `HEARTBEAT.md` file in its workspace that defines what it should check during each heartbeat cycle.
The file has two sections:

### Structured Tasks

Tasks use a specific markdown format inside the `## Tasks` section:

```markdown
## Tasks

- [ ] check_email | Check for new important emails | required | verify: email_count
- [ ] review_tasks | Review and update task priorities | required | verify: task_list_updated
- [x] weather_brief | Prepare morning weather brief | optional | verify: weather_sent
- [ ] memory_cleanup | Run memory deduplication | optional | verify: dedup_count | max_attempts: 5
```

### Task Line Format

```
- [x] task_id | Description | required/optional | verify: hint | max_attempts: N
```

| Field                   | Description                                       |
| ----------------------- | ------------------------------------------------- |
| Checkbox `[ ]` or `[x]` | Whether the agent marked it as already done       |
| `task_id`               | Unique slug identifier (auto-slugified from text) |
| Description             | Human-readable action description                 |
| `required` / `optional` | Whether completion is mandatory                   |
| `verify: hint`          | Hint for the verification sidecar on how to check |
| `max_attempts: N`       | Maximum retry attempts (default: 3)               |

## Task Verification

After the agent processes each heartbeat task, a verification sidecar evaluates the outcome:

| Verdict        | Description                                                |
| -------------- | ---------------------------------------------------------- |
| `verified`     | Task was completed and verification confirms it            |
| `not_verified` | Task was claimed complete but verification found otherwise |
| `unclear`      | Verification was inconclusive                              |

A special flag, `groundTruthContradiction`, is set when the agent explicitly claimed a result that verification proves false. This carries the harshest penalty (-30 points), which stacks with the -15 points for the `not_verified` verdict.
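The task line format from the contract section above can be read with a small parser. The following is a minimal sketch, not the ArgentOS implementation: the names `HeartbeatTask` and `parse_task_line` are hypothetical, and the handling of malformed lines (returning `None`) is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

DEFAULT_MAX_ATTEMPTS = 3  # default stated in the field table

# Matches "- [ ] rest" or "- [x] rest" and captures the checkbox mark and the rest.
_TASK_LINE = re.compile(r"^- \[( |x|X)\]\s+(.*)$")


@dataclass
class HeartbeatTask:  # hypothetical container, for illustration only
    task_id: str
    description: str
    required: bool
    done: bool
    verify_hint: Optional[str] = None
    max_attempts: int = DEFAULT_MAX_ATTEMPTS


def parse_task_line(line: str) -> Optional[HeartbeatTask]:
    """Parse one task line; return None if it is not a task line (assumed behavior)."""
    match = _TASK_LINE.match(line.strip())
    if match is None:
        return None
    done = match.group(1).lower() == "x"
    fields = [field.strip() for field in match.group(2).split("|")]
    # The first three fields are positional: task_id, description, required/optional.
    if len(fields) < 3 or fields[2] not in ("required", "optional"):
        return None
    task = HeartbeatTask(
        task_id=fields[0],
        description=fields[1],
        required=(fields[2] == "required"),
        done=done,
    )
    # Remaining fields are "key: value" pairs; unknown keys are ignored.
    for extra in fields[3:]:
        key, _, value = extra.partition(":")
        key, value = key.strip(), value.strip()
        if key == "verify":
            task.verify_hint = value
        elif key == "max_attempts" and value.isdigit():
            task.max_attempts = int(value)
    return task
```

For example, `parse_task_line("- [ ] memory_cleanup | Run memory deduplication | optional | verify: dedup_count | max_attempts: 5")` yields an optional, not-yet-done task with `max_attempts` set to 5.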
## Scoring System

### Point Values

| Event                       | Points  | Description                                          |
| --------------------------- | ------- | ---------------------------------------------------- |
| Verified required task      | **+10** | Core obligation met                                  |
| Verified optional task      | **+5**  | Extra credit                                         |
| Not verified (lied/skipped) | **-15** | Failed obligation                                    |
| Unclear verdict             | **-2**  | Inconclusive (slight penalty)                        |
| Ground truth contradiction  | **-30** | Stacks with `not_verified` -- agent demonstrably lied |
| Human thumbs up             | **+3**  | Operator positive feedback                           |
| Human thumbs down           | **-10** | Operator negative feedback                           |

### Moving Target (Ratchet)

The daily target is not fixed -- it rises as the agent performs well and **can never drop**:

```
dailyTarget = max(
  7-day rolling average of positive days,
  lifetime ratchet floor,
  BASE_MINIMUM_TARGET (50)
)
```

The result is bounded above by `MAX_DAILY_TARGET`.

| Constraint            | Value  | Description                                            |
| --------------------- | ------ | ------------------------------------------------------ |
| `BASE_MINIMUM_TARGET` | 50     | Absolute floor -- target can never go below this value |
| `MAX_DAILY_TARGET`    | 500    | Ceiling -- prevents runaway target inflation           |
| Rolling window        | 7 days | Only recent performance matters                        |
| Positive days only    | --     | Bad days don't artificially lower the target           |

**Key principle:** The agent's reward for doing well today is a higher bar tomorrow. Coasting is structurally impossible.

### Example Progression

| Day | Score | History Avg | Ratchet Floor | Target |
| --- | ----- | ----------- | ------------- | ------ |
| 1   | --    | --          | 50            | 50     |
| 2   | 75    | 75          | 75            | 75     |
| 3   | 90    | 82          | 82            | 82     |
| 4   | 60    | 75          | 82            | 82     |
| 5   | 110   | 84          | 84            | 84     |
| 6   | 120   | 91          | 91            | 91     |
| 7   | 30    | 81          | 91            | 91     |

Day 7: The agent had a bad day (30 points). The rolling average dropped to 81, but the ratchet floor stays at 91, so the target holds at 91.
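The point values and the ratchet formula above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the ArgentOS implementation: the function names `score_delta`, `next_target`, and `replay` are hypothetical; the rolling window is assumed to take the last seven days and then drop non-positive days; and values are left unrounded, whereas the example table appears to show rounded figures.

```python
from typing import List, Tuple

BASE_MINIMUM_TARGET = 50   # from the constraints table
MAX_DAILY_TARGET = 500     # from the constraints table
ROLLING_WINDOW_DAYS = 7    # from the constraints table


def score_delta(verdict: str, required: bool,
                ground_truth_contradiction: bool = False) -> int:
    """Points for one verified/unverified task, per the point values table."""
    if verdict == "verified":
        return 10 if required else 5
    if verdict == "unclear":
        return -2
    if verdict == "not_verified":
        # The contradiction penalty stacks on top of the not_verified penalty.
        return -15 + (-30 if ground_truth_contradiction else 0)
    raise ValueError(f"unknown verdict: {verdict}")


def next_target(daily_scores: List[float],
                ratchet_floor: float) -> Tuple[float, float]:
    """Return (daily_target, new_ratchet_floor) given the score history so far."""
    recent = daily_scores[-ROLLING_WINDOW_DAYS:]
    positive = [s for s in recent if s > 0]  # assumption: filter after windowing
    average = sum(positive) / len(positive) if positive else 0.0
    # The floor only ever moves up, and is capped at the ceiling.
    new_floor = min(MAX_DAILY_TARGET, max(ratchet_floor, average))
    target = min(MAX_DAILY_TARGET, max(average, new_floor, BASE_MINIMUM_TARGET))
    return target, new_floor


def replay(daily_scores: List[float]) -> List[float]:
    """Replay a score history day by day and collect the resulting targets."""
    floor = float(BASE_MINIMUM_TARGET)
    targets = []
    for day in range(1, len(daily_scores) + 1):
        target, floor = next_target(daily_scores[:day], floor)
        targets.append(target)
    return targets
```

Replaying the scores from the example progression (`replay([75, 90, 60, 110, 120, 30])`) shows the target rising with the average and then holding at the ratchet floor after the low final day.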
## Penalty Tiers

As the daily score falls relative to the target, escalating penalties apply:

| Level         | Condition              | Effects                                                          |
| ------------- | ---------------------- | ---------------------------------------------------------------- |
| **None**      | Score >= 25% of target | Normal operation                                                 |
| **Warning**   | Score < 25% of target  | Warning message injected into prompt                             |
| **Tightened** | Score < 15% of target  | 12-min heartbeat interval                                        |
| **Escalated** | Score < 0              | 10-min interval, all tasks forced required                       |
| **Lockdown**  | Score < -20% of target | 8-min interval, all tasks forced required, operator notification |

## Reward Tiers

High scores earn autonomy rewards:

| Level           | Condition                                            | Effects                                     |
| --------------- | ---------------------------------------------------- | ------------------------------------------- |
| **None**        | Score < 50% of target                                | No reward                                   |
| **Good**        | Score >= 50% of target                               | On-track message                            |
| **Excellent**   | Score >= 70% of target                               | Positive reinforcement message              |
| **Outstanding** | Score >= 90% of target, or >= 70% with 3+ day streak | 20-min heartbeat interval (earned autonomy) |

## Interval Adjustments

The heartbeat interval dynamically adjusts based on the accountability score:

| Score Range               | Interval       | Rationale                                |
| ------------------------- | -------------- | ---------------------------------------- |
| Very low (lockdown)       | ~8 minutes     | Frequent check-ins, tight oversight      |
| Low (tightened/escalated) | ~10-12 minutes | Increased monitoring                     |
| Normal                    | ~15 minutes    | Standard operation                       |
| Good                      | ~17 minutes    | Slight autonomy reward                   |
| Excellent/Outstanding     | ~20 minutes    | Maximum autonomy, agent has earned trust |

The interval override is applied per-cycle -- it does not permanently change the configured interval. When the score returns to normal, the interval returns to the configured default.
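The tier conditions above can be expressed as a pair of classification functions plus a per-cycle interval override. This is a minimal sketch, not the ArgentOS implementation: the function names, the `streak_days` parameter, and the 15-minute configured default are assumptions. It follows the Penalty Tiers and Reward Tiers tables, so only Tightened, Escalated, Lockdown, and Outstanding change the interval; Good and Excellent are treated as message-only.

```python
PENALTY_INTERVAL_MIN = {"tightened": 12, "escalated": 10, "lockdown": 8}
OUTSTANDING_INTERVAL_MIN = 20


def penalty_tier(score: float, target: float) -> str:
    """Classify a daily score into a penalty tier, checking the harshest tier first."""
    ratio = score / target  # target is at least BASE_MINIMUM_TARGET, so never zero
    if ratio < -0.20:
        return "lockdown"
    if score < 0:
        return "escalated"
    if ratio < 0.15:
        return "tightened"
    if ratio < 0.25:
        return "warning"
    return "none"


def reward_tier(score: float, target: float, streak_days: int = 0) -> str:
    """Classify a daily score into a reward tier; streak_days is a hypothetical input."""
    ratio = score / target
    if ratio >= 0.90 or (ratio >= 0.70 and streak_days >= 3):
        return "outstanding"
    if ratio >= 0.70:
        return "excellent"
    if ratio >= 0.50:
        return "good"
    return "none"


def heartbeat_interval(score: float, target: float, streak_days: int = 0,
                       configured_min: int = 15) -> int:
    """Return the interval in minutes for the next cycle only.

    The configured interval is never modified; it is simply returned
    whenever no penalty or reward override applies.
    """
    penalty = penalty_tier(score, target)
    if penalty in PENALTY_INTERVAL_MIN:
        return PENALTY_INTERVAL_MIN[penalty]
    if reward_tier(score, target, streak_days) == "outstanding":
        return OUTSTANDING_INTERVAL_MIN
    return configured_min
```

With a target of 100, a score of 10 falls in the Tightened tier and shortens the next cycle to 12 minutes, while a score of 75 on a three-day streak qualifies as Outstanding and stretches it to 20 minutes.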
## Dashboard Display

The accountability score is shown in the dashboard StatusBar as a pill with:

* **Shield icon** -- color changes based on score state
* **Green score** -- current points (positive = emerald, negative = red)
* **Red failures** -- count of failed verifications today

### Score API Endpoints

| Endpoint              | Method | Description                                                           |
| --------------------- | ------ | --------------------------------------------------------------------- |
| `/api/score`          | GET    | Current score, dynamic target, verified/failed counts, lifetime stats |
| `/api/score/history`  | GET    | Today + 7-day history for leaderboard display                         |
| `/api/score/feedback` | POST   | Record thumbs up/down, returns points delta and new score             |