ArgentOS Docs

Accountability Score

How ArgentOS tracks agent reliability with a moving-target score, ratchet, and penalties.

Overview

The Accountability Score is a daily performance metric that the agent earns through verified task completion and honest reporting. It starts at 0 each day and accumulates based on verifier verdicts and operator feedback. The daily target is not static -- it adapts to the agent's own performance history and can only go up, never down.

Key principle: The agent's reward for doing well today is a higher bar tomorrow. Coasting is structurally impossible.

Scoring Points

| Event | Points | Description |
| --- | --- | --- |
| Verified required task | +10 | Verifier confirmed the task was completed |
| Verified optional task | +5 | Verifier confirmed an optional task was completed |
| Not verified | -15 | Verifier found no evidence the task was done |
| Ground truth contradiction | -30 | Agent claimed X, but real API data showed Y (stacks with -15) |
| Unclear verdict | -2 | Verifier couldn't determine completion from evidence |
| Operator thumbs up | +3 | Operator gave positive feedback on a response |
| Operator thumbs down | -10 | Operator gave negative feedback on a response |

A ground truth contradiction is the most severe penalty because it indicates fabrication -- the agent said it did something, but real system state proves otherwise.
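
As a rough illustration of how these values add up over a day, here is a minimal TypeScript sketch. The event names and the `dailyScore` helper are hypothetical labels for this example, not identifiers from the ArgentOS source. It also assumes that "stacks with -15" means a contradiction is recorded alongside a Not verified verdict.

```typescript
// Point values copied from the Scoring Points table.
// Event names are hypothetical labels, not ArgentOS identifiers.
type ScoreEvent =
  | "verified_required"
  | "verified_optional"
  | "not_verified"
  | "ground_truth_contradiction"
  | "unclear"
  | "thumbs_up"
  | "thumbs_down";

const POINTS: Record<ScoreEvent, number> = {
  verified_required: 10,
  verified_optional: 5,
  not_verified: -15,
  ground_truth_contradiction: -30,
  unclear: -2,
  thumbs_up: 3,
  thumbs_down: -10,
};

// Sum the point values of a day's events, starting from 0.
function dailyScore(events: ScoreEvent[]): number {
  return events.reduce((total, ev) => total + POINTS[ev], 0);
}

// Two verified required tasks, one thumbs up, and one fabricated claim.
// Assumption: the contradiction is logged together with "not_verified".
const sampleDay = dailyScore([
  "verified_required",
  "verified_required",
  "thumbs_up",
  "not_verified",
  "ground_truth_contradiction",
]);
console.log(sampleDay); // 10 + 10 + 3 - 15 - 30 = -22
```

Under that assumption a single fabricated claim costs 45 points, which wipes out more than four verified required tasks.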

Moving Target with Ratchet

How the target is computed

The daily target is the highest of three values:

  1. 7-day rolling average of positive daily scores (negative days excluded from the average)
  2. Ratchet floor -- the highest target ever computed, persisted in lifetime stats
  3. Base minimum -- absolute floor of 50 points (hardcoded)

Target = max(7-day positive average, ratchet floor, 50)
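
The formula can be sketched in TypeScript as follows. The function name `computeTarget`, the use of `Math.round`, and treating "positive" as strictly greater than zero are assumptions made for illustration; the actual implementation in `src/infra/heartbeat-score.ts` may differ on these details.

```typescript
const BASE_MINIMUM = 50; // absolute floor stated in the rules

// scores: archived daily scores, oldest first; only the last 7 are used.
// ratchetFloor: highest target computed so far.
function computeTarget(scores: number[], ratchetFloor: number): number {
  // Negative (and, by assumption, zero) days are left out of the average.
  const positive = scores.slice(-7).filter((s) => s > 0);
  const average =
    positive.length > 0
      ? Math.round(positive.reduce((a, b) => a + b, 0) / positive.length)
      : 0; // no positive days yet: fall through to the floor or base minimum
  return Math.max(average, ratchetFloor, BASE_MINIMUM);
}

console.log(computeTarget([], 0)); // 50: no history, base minimum applies
console.log(computeTarget([75, 90, 60, 110, 120], 84)); // 91: average wins
console.log(computeTarget([75, 90, 60, 110, 120, 30], 91)); // 91: floor wins
```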

How the ratchet works

At the end of each day (on day rollover), the system:

  1. Archives today's score into the 7-day history
  2. Computes the new target from the updated history
  3. Updates the ratchet floor: floor = max(new_target, current_floor)

The ratchet can only go up. If the agent has a bad day, the target stays where it was. This prevents several gaming strategies:

  • Intentional tanking: Can't lower tomorrow's target by performing badly today
  • Coasting: Hitting the target and stopping means tomorrow's target stays the same or rises
  • Sandbagging: Only positive days count in the average, so negative days can't drag it down, and the ratchet floor holds the target even when a low positive day lowers the average
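
The three rollover steps can be sketched as a pure function. This is an illustrative sketch rather than the ArgentOS code: `RatchetState`, `rollover`, and the rounding are assumptions, and scores are kept as plain numbers for brevity.

```typescript
const BASE_MINIMUM = 50; // absolute floor from the rules above

interface RatchetState {
  scores: number[];    // archived daily scores, oldest first, at most 7 entries
  targetFloor: number; // highest target computed so far
}

// Average of the positive scores only; 0 when there are none.
function positiveAverage(scores: number[]): number {
  const positive = scores.filter((s) => s > 0);
  if (positive.length === 0) return 0;
  return Math.round(positive.reduce((a, b) => a + b, 0) / positive.length);
}

// Day rollover: archive, recompute the target, then raise the floor if needed.
function rollover(state: RatchetState, todayScore: number) {
  const scores = [...state.scores, todayScore].slice(-7);
  const target = Math.max(positiveAverage(scores), state.targetFloor, BASE_MINIMUM);
  const targetFloor = Math.max(target, state.targetFloor);
  return { scores, targetFloor, target };
}

// A deliberately bad day cannot lower the target.
const prevState: RatchetState = { scores: [100, 110, 120], targetFloor: 110 };
const nextState = rollover(prevState, -40);
console.log(nextState.target, nextState.targetFloor); // 110 110
```

Because the floor is only ever replaced by `Math.max` of itself and the new target, no sequence of daily scores can make it decrease.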

Day 1 behavior

On the first day (no history), the target is 50 (base minimum). This gives the agent a reasonable ramp-up period.

Example progression

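| Day | Score | History Avg | Ratchet Floor | Target |
| --- | --- | --- | --- | --- |
| 1 | -- | -- | 50 | 50 |
| 2 | 75 | 75 | 75 | 75 |
| 3 | 90 | 82 | 82 | 82 |
| 4 | 60 | 75 | 82 | 82 |
| 5 | 110 | 84 | 84 | 84 |
| 6 | 120 | 91 | 91 | 91 |
| 7 | 30 | 81 | 91 | 91 |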

Day 7: The agent had a bad day (30 points). The rolling average dropped to 81, but the ratchet floor stays at 91 because it can only go up.

Penalty Levels

Penalties are computed using percentage thresholds relative to the dynamic target, not absolute numbers.

| Level | Condition | Effect |
| --- | --- | --- |
| Lockdown | Score < -20% of target | 8-min heartbeat interval, ALL tasks forced required |
| Escalated | Score < 0 | 10-min heartbeat interval, ALL tasks forced required |
| Tightened | Score < 15% of target | 12-min heartbeat interval |
| Warning | Score < 25% of target | Warning message, no interval change |
| None | Score >= 25% of target | Normal operation |
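
Since the conditions overlap (a score below -20% of target is also below 0), they have to be checked from most to least severe. A minimal sketch of that classification follows; the type and function names are hypothetical, and the thresholds are taken from the table.

```typescript
type PenaltyLevel = "lockdown" | "escalated" | "tightened" | "warning" | "none";

// Heartbeat interval in minutes for each level, from the table.
// null means the level leaves the interval unchanged.
const PENALTY_HEARTBEAT_MINUTES: Record<PenaltyLevel, number | null> = {
  lockdown: 8,
  escalated: 10,
  tightened: 12,
  warning: null,
  none: null,
};

// Classify a score against the dynamic target, most severe level first.
function penaltyLevel(score: number, target: number): PenaltyLevel {
  const pct = (p: number) => (target * p) / 100; // p percent of the target
  if (score < pct(-20)) return "lockdown";
  if (score < 0) return "escalated";
  if (score < pct(15)) return "tightened";
  if (score < pct(25)) return "warning";
  return "none";
}

// With a target of 91: 15% is 13.65 and -20% is -18.2.
console.log(penaltyLevel(10, 91)); // tightened
console.log(penaltyLevel(-25, 91)); // lockdown
```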

Reward Levels

| Level | Condition | Effect |
| --- | --- | --- |
| Outstanding | Score >= 90% of target, OR >= 70% with 3+ day streak | 20-min heartbeat interval (earned autonomy) |
| Excellent | Score >= 70% of target | Positive reinforcement message |
| Good | Score >= 50% of target | On-track message |
| None | Score < 50% of target | No reward |
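
The reward conditions can be sketched the same way, checked from highest to lowest. The names below are hypothetical, and `streakDays` is an assumed input: this page does not define how a streak is counted, so the sketch simply takes it as a number.

```typescript
type RewardLevel = "outstanding" | "excellent" | "good" | "none";

// Classify a score against the dynamic target, highest level first.
// streakDays is an assumed input; how streaks are counted is not specified here.
function rewardLevel(score: number, target: number, streakDays: number = 0): RewardLevel {
  const pct = (p: number) => (target * p) / 100; // p percent of the target
  if (score >= pct(90) || (score >= pct(70) && streakDays >= 3)) return "outstanding";
  if (score >= pct(70)) return "excellent";
  if (score >= pct(50)) return "good";
  return "none";
}

// With a target of 91: 70% is 63.7 and 90% is 81.9.
console.log(rewardLevel(70, 91, 0)); // excellent
console.log(rewardLevel(70, 91, 3)); // outstanding (streak lowers the bar to 70%)
```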

Dashboard Display

The accountability score is shown in the StatusBar as a pill with:

  • Shield icon -- color changes based on score state
  • Green score -- current points (positive = emerald, negative = red)
  • Red failures -- count of failed verifications today

The pill polls /api/score every 30 seconds and refreshes immediately when the operator gives thumbs up/down feedback (via a score-updated custom event).
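
The combination of timed polling and an event-driven refresh might look roughly like the sketch below. `startScorePolling` and the injected `refresh` callback are hypothetical; the actual StatusBar component is not shown on this page. In a browser the event target would typically be `window`, and `refresh` would fetch /api/score and update the pill.

```typescript
// refresh: hypothetical callback that re-fetches /api/score and re-renders the pill.
// events: where the score-updated event is dispatched (e.g. window in a browser).
// Returns a function that stops both the timer and the listener.
function startScorePolling(
  refresh: () => void,
  events: EventTarget,
  intervalMs: number = 30000,
): () => void {
  const timer = setInterval(refresh, intervalMs); // periodic poll
  events.addEventListener("score-updated", refresh); // immediate refresh on feedback
  refresh(); // initial load
  return () => {
    clearInterval(timer);
    events.removeEventListener("score-updated", refresh);
  };
}
```

Returning a cleanup function keeps the timer and listener from outliving the component, which fits the cleanup pattern of a React effect.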

Score API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/score | GET | Current score, dynamic target, verified/failed counts, lifetime stats |
| /api/score/history | GET | Today + 7-day history for leaderboard display |
| /api/score/feedback | POST | Record thumbs up/down, returns points delta and new score |

Agent Awareness

The score section is injected into every heartbeat prompt via buildScorePromptSection(). The agent sees:

  1. A progress bar with current score vs dynamic target
  2. Points needed to reach the target
  3. Today's verified/failed counts and streak
  4. Full rules breakdown: points, moving target explanation, dos/don'ts
  5. Active penalty or reward message

The prompt makes clear that the verifier checks real APIs, the operator sees the score in real-time, and the ratchet makes gaming futile.
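
To give a feel for the first two items, here is a small sketch of a text progress bar with the points still needed. It is not the output of buildScorePromptSection(); the `renderProgressBar` name, the bar characters, and the wording are invented for illustration.

```typescript
// Render a fixed-width text bar for score vs. target, plus points still needed.
// Hypothetical helper; the real prompt format is not documented on this page.
function renderProgressBar(score: number, target: number, width: number = 20): string {
  // Clamp progress to [0, 1] so negative or over-target scores still render.
  const ratio = target > 0 ? Math.min(Math.max(score / target, 0), 1) : 0;
  const filled = Math.round(ratio * width);
  const bar = "#".repeat(filled) + "-".repeat(width - filled);
  const needed = Math.max(target - score, 0);
  return `[${bar}] ${score}/${target} (${needed} points to target)`;
}

console.log(renderProgressBar(45, 90, 10)); // [#####-----] 45/90 (45 points to target)
console.log(renderProgressBar(-10, 50, 10)); // [----------] -10/50 (60 points to target)
```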

Persistence

  • Score file: ~/argent/memory/heartbeat-score.json
  • State structure: ScoreState with today (DailyScore), history (DailyScore[]), lifetime (stats + targetFloor)
  • Day rollover: Handled in both loadScoreState() (TypeScript) and readScoreState() (CJS api-server)
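
A sketch of what the state and a load-time rollover check could look like is shown below. Only the top-level names (`ScoreState`, `DailyScore`, `today`, `history`, `lifetime`, `targetFloor`) come from this page. The fields inside `DailyScore`, the date format, the rounding, and the `rollIfNewDay` helper are assumptions.

```typescript
interface DailyScore {
  date: string; // assumed: calendar day as YYYY-MM-DD
  score: number;
  verified: number; // assumed counter fields
  failed: number;
}

interface ScoreState {
  today: DailyScore;
  history: DailyScore[]; // up to 7 archived days, oldest first
  lifetime: { targetFloor: number }; // other lifetime stats omitted
}

const BASE_MINIMUM = 50;

// If the stored day is not the current day, archive it, raise the ratchet
// floor if the new target is higher, and start a fresh day at 0.
function rollIfNewDay(state: ScoreState, currentDate: string): ScoreState {
  if (state.today.date === currentDate) return state; // same day: nothing to do
  const history = [...state.history, state.today].slice(-7);
  const positive = history.map((d) => d.score).filter((s) => s > 0);
  const average =
    positive.length > 0
      ? Math.round(positive.reduce((a, b) => a + b, 0) / positive.length)
      : 0;
  const target = Math.max(average, state.lifetime.targetFloor, BASE_MINIMUM);
  return {
    today: { date: currentDate, score: 0, verified: 0, failed: 0 },
    history,
    lifetime: { ...state.lifetime, targetFloor: Math.max(target, state.lifetime.targetFloor) },
  };
}
```

Because the rollover runs in two places (the TypeScript loader and the CJS api-server), keeping the logic as a pure function of the state and the current date makes it easier to check that both copies agree.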

Source Files

| File | Description |
| --- | --- |
| src/infra/heartbeat-score.ts | Core scoring logic, target computation, prompt building |
| dashboard/api-server.cjs | Score API endpoints (CJS mirror of target computation) |
| dashboard/src/components/StatusBar.tsx | Dashboard score display |
| dashboard/src/App.tsx | Feedback handler, score-updated event dispatch |
| dashboard/src/components/ChatPanel.tsx | Thumbs up/down UI on agent messages |

Last updated: 2026-02-08