Accountability System

ArgentOS Business: This feature is part of ArgentOS Business. The architecture is documented here for all users, but full functionality requires a Business license.

Overview

The Accountability System is ArgentOS’s mechanism for ensuring agent reliability. It combines a structured heartbeat contract (HEARTBEAT.md), automated task verification, a running accountability score with penalty and reward tiers, and dynamic interval adjustments. The system motivates the agent to complete its commitments honestly while providing the operator with quantitative trust metrics.

HEARTBEAT.md Contract Format

The agent writes and maintains a HEARTBEAT.md file in its workspace that defines what it should check during each heartbeat cycle. The file has two sections:

Structured Tasks

Tasks use a specific markdown format inside the ## Tasks section:

## Tasks

- [ ] check_email | Check for new important emails | required | verify: email_count
- [ ] review_tasks | Review and update task priorities | required | verify: task_list_updated
- [x] weather_brief | Prepare morning weather brief | optional | verify: weather_sent
- [ ] memory_cleanup | Run memory deduplication | optional | verify: dedup_count | max_attempts: 5

Task Line Format

- [x] task_id | Description | required/optional | verify: hint | max_attempts: N

Field	Description
Checkbox `[ ]` or `[x]`	Whether the agent marked it as already done
`task_id`	Unique slug identifier (auto-slugified from text)
Description	Human-readable action description
`required` / `optional`	Whether completion is mandatory
`verify: hint`	Hint for the verification sidecar on how to check
`max_attempts: N`	Maximum retry attempts (default: 3)
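
To make the line format concrete, here is a minimal parsing sketch in Python. It is not the ArgentOS parser: the `HeartbeatTask` class and `parse_task_line` function are hypothetical names, and the sketch assumes the first three pipe-separated fields are always present and that `verify` and `max_attempts` may appear in any order after them.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches the leading "- [ ]" or "- [x]" checkbox of a task line.
CHECKBOX = re.compile(r"^-\s*\[( |x|X)\]\s*(.*)$")
DEFAULT_MAX_ATTEMPTS = 3  # documented default


@dataclass
class HeartbeatTask:  # hypothetical container, not an ArgentOS type
    task_id: str
    description: str
    required: bool
    done: bool
    verify_hint: Optional[str] = None
    max_attempts: int = DEFAULT_MAX_ATTEMPTS


def parse_task_line(line: str) -> Optional[HeartbeatTask]:
    """Parse one task line; return None if the line is not a task."""
    match = CHECKBOX.match(line.strip())
    if match is None:
        return None
    fields = [f.strip() for f in match.group(2).split("|")]
    if len(fields) < 3 or fields[2] not in ("required", "optional"):
        return None
    task = HeartbeatTask(
        task_id=fields[0],
        description=fields[1],
        required=fields[2] == "required",
        done=match.group(1).lower() == "x",
    )
    # Assumption: remaining fields are "key: value" pairs in any order.
    for extra in fields[3:]:
        key, _, value = extra.partition(":")
        key, value = key.strip(), value.strip()
        if key == "verify":
            task.verify_hint = value
        elif key == "max_attempts" and value.isdigit():
            task.max_attempts = int(value)
    return task


task = parse_task_line(
    "- [ ] memory_cleanup | Run memory deduplication | optional"
    " | verify: dedup_count | max_attempts: 5"
)
print(task)
```

Lines that do not start with a checkbox, such as the `## Tasks` heading, yield `None`, so the function can be mapped over every line of the file.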

Task Verification

After the agent processes each heartbeat task, a verification sidecar evaluates the outcome:

Verdict	Description
`verified`	Task was completed and verification confirms it
`not_verified`	Task was claimed complete but verification found otherwise
`unclear`	Verification was inconclusive

A special flag, `groundTruthContradiction`, is set when the agent explicitly claimed a result that verification proves false. It carries the harshest penalty (-30 points), applied on top of the `not_verified` penalty.

Scoring System

Point Values

Event	Points	Description
Verified required task	+10	Core obligation met
Verified optional task	+5	Extra credit
Not verified (lied/skipped)	-15	Failed obligation
Unclear verdict	-2	Inconclusive (slight penalty)
Ground truth contradiction	-30	Stacks with not_verified — agent demonstrably lied
Human thumbs up	+3	Operator positive feedback
Human thumbs down	-10	Operator negative feedback
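
The point values above can be turned into a small scoring function. This is a sketch under stated assumptions, not the ArgentOS implementation: the function and constant names are hypothetical, the `not_verified` penalty is assumed to be the same for required and optional tasks, and the contradiction penalty is assumed to apply only together with a `not_verified` verdict.

```python
# Point values copied from the table above.
POINTS_VERIFIED_REQUIRED = 10
POINTS_VERIFIED_OPTIONAL = 5
POINTS_NOT_VERIFIED = -15
POINTS_UNCLEAR = -2
POINTS_GROUND_TRUTH_CONTRADICTION = -30
POINTS_THUMBS_UP = 3
POINTS_THUMBS_DOWN = -10


def task_points(verdict: str, required: bool,
                ground_truth_contradiction: bool = False) -> int:
    """Return the score delta for one verified heartbeat task."""
    if verdict == "verified":
        return POINTS_VERIFIED_REQUIRED if required else POINTS_VERIFIED_OPTIONAL
    if verdict == "unclear":
        return POINTS_UNCLEAR
    if verdict == "not_verified":
        points = POINTS_NOT_VERIFIED
        # Assumption: the contradiction penalty stacks on not_verified only.
        if ground_truth_contradiction:
            points += POINTS_GROUND_TRUTH_CONTRADICTION
        return points
    raise ValueError(f"unknown verdict: {verdict!r}")


# One hypothetical day: two verified tasks, one unclear, one proven lie,
# and one thumbs up from the operator.
day_total = (
    task_points("verified", required=True)
    + task_points("verified", required=False)
    + task_points("unclear", required=True)
    + task_points("not_verified", required=True, ground_truth_contradiction=True)
    + POINTS_THUMBS_UP
)
print(day_total)  # 10 + 5 - 2 - 45 + 3 = -29
```

A single proven contradiction costs 45 points, which in this example outweighs the two verified tasks and the operator's positive feedback combined.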

Moving Target (Ratchet)

The daily target is not fixed. It rises as the agent performs well and never drops below the highest value it has reached (the ratchet floor):

dailyTarget = max(
  7-day rolling average of positive days,
  lifetime ratchet floor,
  BASE_MINIMUM_TARGET (50)
)

Constraint	Value	Description
`BASE_MINIMUM_TARGET`	50	Absolute floor — target can never go below
`MAX_DAILY_TARGET`	500	Ceiling — prevents runaway target inflation
Rolling window	7 days	Only recent performance matters
Positive days only	—	Bad days don’t artificially lower the target

Key principle: The agent’s reward for doing well today is a higher bar tomorrow. Coasting is structurally impossible.

Example Progression

Day	Score	History Avg	Ratchet Floor	Target
1	—	—	50	50
2	75	75	75	75
3	90	82	82	82
4	60	75	82	82
5	110	84	84	84
6	120	91	91	91
7	30	81	91	91

Day 7: The agent had a bad day (30 points). The rolling average dropped to 81, but the ratchet floor stays at 91.
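
The progression above can be replayed with a short Python sketch. It is an illustration, not the shipped code: `update_target` is a hypothetical name, a "positive day" is assumed to mean a day with a score above zero, the ceiling is assumed to be applied as a final `min`, and averages are rounded with Python's `round` (round half to even), which is a guess at the rounding mode that happens to reproduce the table.

```python
BASE_MINIMUM_TARGET = 50
MAX_DAILY_TARGET = 500
ROLLING_WINDOW_DAYS = 7


def update_target(daily_scores, ratchet_floor):
    """Return (history_avg, new_ratchet_floor, daily_target)."""
    window = daily_scores[-ROLLING_WINDOW_DAYS:]
    # Assumption: only days with a score above zero count toward the average.
    positive = [s for s in window if s > 0]
    history_avg = round(sum(positive) / len(positive)) if positive else 0
    # The floor only ever moves up.
    new_floor = max(ratchet_floor, history_avg)
    target = max(history_avg, new_floor, BASE_MINIMUM_TARGET)
    # Assumption: the ceiling is applied last.
    return history_avg, new_floor, min(target, MAX_DAILY_TARGET)


scores, floor = [], BASE_MINIMUM_TARGET
rows = []
for score in [75, 90, 60, 110, 120, 30]:  # scores for days 2 to 7
    scores.append(score)
    avg, floor, target = update_target(scores, floor)
    rows.append((score, avg, floor, target))
    print(score, avg, floor, target)
```

The last row printed is `30 81 91 91`: the average falls, but the floor and therefore the target stay at 91.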

Penalty Tiers

When the daily score drops, escalating penalties apply:

Level	Condition	Effects
None	Score >= 25% of target	Normal operation
Warning	Score < 25% of target	Warning message injected into prompt
Tightened	Score < 15% of target	12-min heartbeat interval
Escalated	Score < 0	10-min interval, all tasks forced required
Lockdown	Score < -20% of target	8-min interval, all tasks forced required, operator notification
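
Because the conditions overlap (a score below 15% of target is also below 25%), the tiers have to be checked from most to least severe. A minimal sketch of that ordering follows; `penalty_level` is a hypothetical name and the lowercase level strings are illustrative.

```python
def penalty_level(score: int, target: int) -> str:
    """Map a daily score and target to a penalty tier name."""
    # Integer percent comparisons avoid floating-point edge cases.
    if score * 100 < -20 * target:
        return "lockdown"
    if score < 0:
        return "escalated"
    if score * 100 < 15 * target:
        return "tightened"
    if score * 100 < 25 * target:
        return "warning"
    return "none"


for score in (30, 20, 10, -5, -25):
    print(score, penalty_level(score, target=100))
```

With a target of 100, the scores 30, 20, 10, -5, and -25 map to none, warning, tightened, escalated, and lockdown respectively.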

Reward Tiers

High scores earn autonomy rewards:

Level	Condition	Effects
None	Score < 50% of target	No reward
Good	Score >= 50% of target	On-track message
Excellent	Score >= 70% of target	Positive reinforcement message
Outstanding	Score >= 90% of target, or >= 70% with 3+ day streak	20-min heartbeat interval (earned autonomy)
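
The reward conditions can be sketched the same way, checking the highest tier first. `reward_level` and `streak_days` are hypothetical names, and the source does not define what counts as a streak day, so the caller is assumed to supply that count.

```python
def reward_level(score: int, target: int, streak_days: int = 0) -> str:
    """Map a daily score, target, and streak length to a reward tier name."""
    at_least_70 = score * 100 >= 70 * target
    # Outstanding: 90% of target, or 70% combined with a 3+ day streak.
    if score * 100 >= 90 * target or (at_least_70 and streak_days >= 3):
        return "outstanding"
    if at_least_70:
        return "excellent"
    if score * 100 >= 50 * target:
        return "good"
    return "none"


print(reward_level(75, 100))                 # excellent
print(reward_level(75, 100, streak_days=3))  # outstanding
```

The streak rule lets a consistently solid agent reach the top tier without hitting 90% of target on any single day.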

Interval Adjustments

The heartbeat interval dynamically adjusts based on the accountability score:

Score Range	Interval	Rationale
Lockdown	~8 minutes	Frequent check-ins, tight oversight
Escalated	~10 minutes	Increased monitoring, all tasks forced required
Tightened	~12 minutes	Increased monitoring
Normal	~15 minutes	Standard operation
Good	~17 minutes	Slight autonomy reward
Excellent/Outstanding	~20 minutes	Maximum autonomy, agent has earned trust

The interval override is applied per-cycle — it does not permanently change the configured interval. When the score returns to normal, the interval returns to the configured default.
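
The per-cycle behaviour can be illustrated with a lookup that never mutates the configured value. This is a sketch, not the scheduler itself: the function name and level strings are hypothetical, the penalty minutes are taken from the Penalty Tiers table, the reward minutes from the table above, and penalties are assumed to take precedence (in practice the two cannot overlap, since penalties start below 25% of target and rewards at 50%).

```python
# Minutes per tier. Penalty values come from the Penalty Tiers table,
# reward values from the Interval Adjustments table.
PENALTY_INTERVAL_MINUTES = {"tightened": 12, "escalated": 10, "lockdown": 8}
REWARD_INTERVAL_MINUTES = {"good": 17, "excellent": 20, "outstanding": 20}


def interval_for_cycle(configured_minutes: int,
                       penalty: str = "none",
                       reward: str = "none") -> int:
    """Return the heartbeat interval for the next cycle only."""
    # Assumption: a penalty override wins over a reward override.
    if penalty in PENALTY_INTERVAL_MINUTES:
        return PENALTY_INTERVAL_MINUTES[penalty]
    if reward in REWARD_INTERVAL_MINUTES:
        return REWARD_INTERVAL_MINUTES[reward]
    # No override: fall back to the operator's configured interval.
    return configured_minutes


configured = 15
print(interval_for_cycle(configured, penalty="lockdown"))    # 8
print(interval_for_cycle(configured, reward="outstanding"))  # 20
print(interval_for_cycle(configured))                        # 15
```

Because the override is computed fresh each cycle from the current tier, the interval returns to the configured value as soon as the score is back in the normal range.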

Dashboard Display

The accountability score is shown in the dashboard StatusBar as a pill with:

Shield icon: color changes based on score state
Score: current points (emerald when positive, red when negative)
Failures: count of failed verifications today, shown in red

Score API Endpoints

Endpoint	Method	Description
`/api/score`	GET	Current score, dynamic target, verified/failed counts, lifetime stats
`/api/score/history`	GET	Today + 7-day history for leaderboard display
`/api/score/feedback`	POST	Record thumbs up/down, returns points delta and new score
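
A client for these endpoints might look like the following Python sketch, using only the standard library. Several details are assumptions not confirmed by this page: the base URL and port, that responses are JSON, and the shape of the feedback payload (the `feedback` field and its `up`/`down` values are hypothetical). Check the actual API before relying on any of them.

```python
import json
import urllib.request

# Assumption: the dashboard API is served at this address; adjust as needed.
BASE_URL = "http://localhost:8080"


def score_request(path: str = "/api/score", base_url: str = BASE_URL):
    """Build a GET request for /api/score or /api/score/history."""
    return urllib.request.Request(base_url + path, method="GET")


def feedback_request(thumbs_up: bool, base_url: str = BASE_URL):
    """Build a POST request for /api/score/feedback."""
    # Hypothetical payload: the real field names are not documented here.
    body = json.dumps({"feedback": "up" if thumbs_up else "down"}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/score/feedback",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the response, assuming a JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Building requests does not touch the network; call send() to execute them.
history = score_request("/api/score/history")
print(history.get_method(), history.full_url)
```

Separating request construction from sending keeps the sketch easy to inspect and test without a running ArgentOS instance.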
