Self-Improving System (SIS)

Overview

The SIS layer enables Argent to:

Observe — Track outcomes of actions
Evaluate — Assess what worked vs. what didn’t
Learn — Extract lessons and patterns
Apply — Use lessons to improve future decisions

Memory Bank Types

Lessons Learned

Mistakes, successes, workarounds, and discoveries from experience.

Patterns Detected

Temporal, sequential, failure, and success patterns.

Tool Knowledge

Rate limits, edge cases, best practices, common errors.

Model Feedback

Which models work best for which task types.

User Preferences

Likes, dislikes, work style, communication style.

Context Awareness

Project-specific and user-specific knowledge.

Lessons Learned

Things discovered through experience that should be remembered:

interface Lesson {
  id: string;
  type: 'mistake' | 'success' | 'workaround' | 'discovery';
  context: string;       // "Trying to send WhatsApp message to group"
  action: string;        // "Used sendMessage with group JID"
  outcome: string;       // "Failed - group JID format was wrong"
  lesson: string;        // "WhatsApp group JIDs must end with @g.us"
  correction?: string;   // "Always append @g.us to group IDs"
  confidence: number;    // 0-1, increases with repeated validation
  occurrences: number;   // How many times this came up
  tags: string[];        // ['whatsapp', 'groups', 'jid', 'format']
  relatedTools: string[]; // ['whatsapp_send', 'message']
}

Examples:

MISTAKE: "When user says 'remind me tomorrow', I should ask what time,
          not assume 9am. User prefers afternoon reminders."

SUCCESS: "Using bullet points instead of paragraphs for task summaries
          gets better user engagement."

WORKAROUND: "ElevenLabs API sometimes returns 429. Wait 2 seconds and
             retry up to 3 times before falling back to system TTS."

DISCOVERY: "User's calendar has recurring 'Focus Time' blocks - don't
            schedule interruptions during these."
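
The `confidence` and `occurrences` fields could be updated along these lines each time a lesson is validated again. This is a minimal sketch: the `reinforce` function, the `LessonStats` subset type, and the update rule (close 20% of the remaining gap to 1) are assumptions for illustration, not the documented behavior.

```typescript
// Subset of the Lesson interface; only the fields this sketch touches.
interface LessonStats {
  confidence: number;  // 0-1
  occurrences: number; // how many times this came up
}

// Hypothetical update rule: each validation closes a fixed fraction of the
// remaining gap to 1, so confidence rises quickly at first and never exceeds 1.
function reinforce(lesson: LessonStats, step: number = 0.2): LessonStats {
  const confidence = lesson.confidence + (1 - lesson.confidence) * step;
  return {
    confidence: Math.min(1, confidence),
    occurrences: lesson.occurrences + 1,
  };
}
```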

Patterns Detected

Recurring behaviors and correlations:

TEMPORAL: "User typically asks for weather between 7-8am.
           Proactively check at 6:55am."

SEQUENTIAL: "After 'check email' task, user often asks 'respond to X'.
             Prepare draft responses proactively."

FAILURE: "API calls to silver-prices.com fail on weekends.
          Use backup source (metals-api.com) on Sat/Sun."
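
A temporal pattern such as the weather example might be detected by counting requests per hour of day. The sketch below assumes the caller has already converted timestamps to local hours (0-23); the function name `detectPeakHour` is hypothetical, and the default threshold of 3 borrows the "3+ similar outcomes" rule from the evaluation engine.

```typescript
// Returns the hour of day (0-23) seen most often, or null when no hour
// reaches minCount. Bucketing by whole hours is an assumption of this sketch.
function detectPeakHour(requestHours: number[], minCount: number = 3): number | null {
  const counts: number[] = [];
  for (let h = 0; h < 24; h++) counts.push(0);

  for (const hour of requestHours) {
    // Ignore values that are not whole hours in range.
    if (hour >= 0 && hour < 24 && Math.floor(hour) === hour) {
      counts[hour] = (counts[hour] ?? 0) + 1;
    }
  }

  let best: number | null = null;
  let bestCount = 0;
  for (let h = 0; h < 24; h++) {
    const c = counts[h] ?? 0;
    if (c >= minCount && c > bestCount) {
      best = h;
      bestCount = c;
    }
  }
  return best;
}
```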

Feedback Loop Implementation

1. Observation Collection

After every action, the system records the outcome: the action type, tool used, result, latency, model used, and any user feedback (explicit or implicit).
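
One possible shape for such a record is sketched below. The field names, the `FeedbackSignal` values, and the in-memory `ObservationLog` class are assumptions for illustration; the source does not specify how observations are represented or stored.

```typescript
// Hypothetical feedback values: two explicit signals and three implicit ones.
type FeedbackSignal = 'positive' | 'negative' | 'continued' | 'abandoned' | 'corrected';

// Hypothetical record covering the fields listed in the text.
interface Observation {
  actionType: string;
  tool: string;
  success: boolean;
  result: string;
  latencyMs: number;
  model: string;
  feedback?: FeedbackSignal;
  timestamp: number; // epoch milliseconds
}

// Minimal in-memory log; a real system would persist these records.
class ObservationLog {
  private readonly entries: Observation[] = [];

  // Stamps the observation with the current time and stores it.
  record(obs: Omit<Observation, 'timestamp'>, now: number = Date.now()): Observation {
    const entry: Observation = { ...obs, timestamp: now };
    this.entries.push(entry);
    return entry;
  }

  // Returns every recorded observation for the given tool.
  forTool(tool: string): Observation[] {
    return this.entries.filter((o) => o.tool === tool);
  }
}
```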

2. Evaluation Engine

The evaluation engine checks for:

Explicit failures — action didn’t succeed
User corrections — user had to correct the agent
Repeated patterns — 3+ similar outcomes trigger pattern detection
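
The three checks could be sketched as a single function. The `Outcome` shape, the `signature` key used to group "similar" outcomes, and the `evaluate` name are hypothetical; only the three conditions and the threshold of 3 come from the list above.

```typescript
interface Outcome {
  tool: string;
  success: boolean;
  userCorrected: boolean;
  signature: string; // hypothetical key that groups similar outcomes
}

type Trigger = 'explicit_failure' | 'user_correction' | 'repeated_pattern';

const PATTERN_THRESHOLD = 3; // "3+ similar outcomes"

// Returns every reason the latest outcome deserves attention.
function evaluate(latest: Outcome, history: Outcome[]): Trigger[] {
  const triggers: Trigger[] = [];
  if (!latest.success) triggers.push('explicit_failure');
  if (latest.userCorrected) triggers.push('user_correction');

  // Count the latest outcome together with earlier ones that share its signature.
  const similar = history.filter((o) => o.signature === latest.signature).length + 1;
  if (similar >= PATTERN_THRESHOLD) triggers.push('repeated_pattern');

  return triggers;
}
```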

3. Lesson Extraction

A fast local model extracts lessons from outcomes. It runs on Ollama, so extraction is free and fast, and it focuses on what went wrong or right and how to handle the situation better next time.
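
The extraction step needs a prompt going in and a validated lesson coming out. The sketch below covers only those two pure pieces; the prompt wording, the JSON reply format, and both function names are assumptions, and the call to the local model itself is left out.

```typescript
type LessonType = 'mistake' | 'success' | 'workaround' | 'discovery';

interface ExtractedLesson {
  type: LessonType;
  lesson: string;
  correction?: string;
}

const LESSON_TYPES: readonly string[] = ['mistake', 'success', 'workaround', 'discovery'];

// Builds a hypothetical prompt asking the model for a JSON-only answer.
function buildExtractionPrompt(context: string, action: string, outcome: string): string {
  return [
    'Extract one lesson from this outcome.',
    `Context: ${context}`,
    `Action: ${action}`,
    `Outcome: ${outcome}`,
    'Reply with JSON only: {"type": "mistake|success|workaround|discovery", "lesson": "...", "correction": "..."}',
  ].join('\n');
}

// Parses the model reply; returns null for anything malformed so that a bad
// reply is dropped rather than stored.
function parseExtractedLesson(reply: string): ExtractedLesson | null {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;
  const rec = data as Record<string, unknown>;

  const type = rec['type'];
  const lesson = rec['lesson'];
  const correction = rec['correction'];
  if (typeof type !== 'string' || LESSON_TYPES.indexOf(type) === -1) return null;
  if (typeof lesson !== 'string' || lesson.trim() === '') return null;

  const result: ExtractedLesson = { type: type as LessonType, lesson: lesson.trim() };
  if (typeof correction === 'string' && correction.trim() !== '') {
    result.correction = correction.trim();
  }
  return result;
}
```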

4. Memory Bank Storage

Lessons are stored in SQLite, with FTS5 used for full-text search. The lessons table is defined as:

CREATE TABLE IF NOT EXISTS lessons (
  id TEXT PRIMARY KEY,
  type TEXT NOT NULL,
  context TEXT NOT NULL,
  action TEXT NOT NULL,
  outcome TEXT NOT NULL,
  lesson TEXT NOT NULL,
  correction TEXT,
  confidence REAL DEFAULT 0.5,
  occurrences INTEGER DEFAULT 1,
  last_seen INTEGER NOT NULL,
  tags TEXT,
  related_tools TEXT,
  created_at INTEGER NOT NULL
);
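
Mapping a `Lesson` object onto this table requires a few decisions the schema leaves open. The sketch below assumes that `tags` and `related_tools` are stored as JSON-encoded arrays and that `last_seen` and `created_at` hold epoch milliseconds; the `LessonRow` type and `toRow` function are hypothetical.

```typescript
interface Lesson {
  id: string;
  type: 'mistake' | 'success' | 'workaround' | 'discovery';
  context: string;
  action: string;
  outcome: string;
  lesson: string;
  correction?: string;
  confidence: number;
  occurrences: number;
  tags: string[];
  relatedTools: string[];
}

// One row of the lessons table; property names match the column names.
interface LessonRow {
  id: string;
  type: string;
  context: string;
  action: string;
  outcome: string;
  lesson: string;
  correction: string | null;
  confidence: number;
  occurrences: number;
  last_seen: number;
  tags: string;          // assumed: JSON-encoded string[]
  related_tools: string; // assumed: JSON-encoded string[]
  created_at: number;
}

// Converts a Lesson into a row ready to bind to an INSERT statement.
function toRow(lesson: Lesson, lastSeenMs: number, createdAtMs: number): LessonRow {
  return {
    id: lesson.id,
    type: lesson.type,
    context: lesson.context,
    action: lesson.action,
    outcome: lesson.outcome,
    lesson: lesson.lesson,
    correction: lesson.correction ?? null,
    confidence: lesson.confidence,
    occurrences: lesson.occurrences,
    last_seen: lastSeenMs,
    tags: JSON.stringify(lesson.tags),
    related_tools: JSON.stringify(lesson.relatedTools),
    created_at: createdAtMs,
  };
}
```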

5. Retrieval and Application

Before taking an action, the system looks up relevant lessons by tool name and semantic similarity. Matching lessons are injected into the prompt in this form:

## Lessons from Past Experience

- **MISTAKE**: WhatsApp group JIDs must end with @g.us
  -> Correction: Always append @g.us to group IDs
- **WORKAROUND**: ElevenLabs API sometimes returns 429. Wait 2 seconds and retry.

Apply these lessons to avoid repeating mistakes.
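
Selecting lessons and rendering the prompt section shown above might look like the following. This sketch filters by tool name only and leaves out semantic similarity; the function names are hypothetical, and the defaults of 0.5 and 5 mirror `minConfidenceToUse` and `maxLessonsInPrompt` from the configuration.

```typescript
interface PromptLesson {
  type: 'mistake' | 'success' | 'workaround' | 'discovery';
  lesson: string;
  correction?: string;
  confidence: number;
  relatedTools: string[];
}

// Keeps lessons tied to the tool and above the confidence floor,
// highest confidence first, capped at `max` entries.
function selectLessons(
  all: PromptLesson[],
  tool: string,
  minConfidence: number = 0.5,
  max: number = 5,
): PromptLesson[] {
  return all
    .filter((l) => l.relatedTools.indexOf(tool) !== -1 && l.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, max);
}

// Renders the selected lessons as a prompt section; empty input yields ''.
function formatLessons(lessons: PromptLesson[]): string {
  if (lessons.length === 0) return '';
  const lines: string[] = ['## Lessons from Past Experience', ''];
  for (const l of lessons) {
    lines.push(`- **${l.type.toUpperCase()}**: ${l.lesson}`);
    if (l.correction) lines.push(`  -> Correction: ${l.correction}`);
  }
  lines.push('', 'Apply these lessons to avoid repeating mistakes.');
  return lines.join('\n');
}
```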

Heartbeat Learning Review

During low-activity periods, Argent reviews and consolidates lessons:

Pattern detection — Review recent outcomes for recurring patterns
Lesson consolidation — Merge similar lessons
Decay — Reduce confidence on old, unvalidated lessons (30+ days)
Promotion — Boost frequently-validated lessons (5+ occurrences)
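
The decay and promotion rules can be sketched as one review pass per lesson. Several details here are assumptions: decay is applied multiplicatively using `decayRate`, promotion adds a fixed boost of 0.1, and a stale lesson is decayed rather than promoted even if it has many occurrences. The names `reviewLesson`, `ReviewOptions`, and `promoteBoost` are hypothetical.

```typescript
interface StoredLesson {
  id: string;
  confidence: number; // 0-1
  occurrences: number;
  lastSeen: number;   // epoch milliseconds
}

interface ReviewOptions {
  decayAfterDays: number;
  decayRate: number;
  promoteAtOccurrences: number;
  promoteBoost: number; // hypothetical; the source gives no boost size
}

const DAY_MS = 24 * 60 * 60 * 1000;

const DEFAULT_REVIEW: ReviewOptions = {
  decayAfterDays: 30,
  decayRate: 0.1,
  promoteAtOccurrences: 5,
  promoteBoost: 0.1,
};

// Returns a copy of the lesson with its confidence decayed or promoted.
function reviewLesson(
  lesson: StoredLesson,
  now: number,
  opts: ReviewOptions = DEFAULT_REVIEW,
): StoredLesson {
  const ageDays = (now - lesson.lastSeen) / DAY_MS;
  let confidence = lesson.confidence;

  if (ageDays >= opts.decayAfterDays) {
    confidence *= 1 - opts.decayRate;  // old and unvalidated: decay
  } else if (lesson.occurrences >= opts.promoteAtOccurrences) {
    confidence += opts.promoteBoost;   // recent and frequently validated: promote
  }

  // Clamp to the 0-1 range used by the Lesson interface.
  return { ...lesson, confidence: Math.min(1, Math.max(0, confidence)) };
}
```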

User Feedback Integration

Negative feedback triggers immediate lesson extraction with high severity
Positive feedback boosts confidence in lessons used during that action
Implicit feedback (user continued, abandoned, or corrected) is tracked automatically
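
A feedback handler along these lines could connect the three rules. In this sketch the boost size of 0.05 is invented, a user correction is treated like negative feedback, and the other implicit signals are only passed through without changing confidence; all names are hypothetical.

```typescript
type Feedback = 'positive' | 'negative' | 'continued' | 'abandoned' | 'corrected';

interface UsedLesson {
  id: string;
  confidence: number; // 0-1
}

interface FeedbackResult {
  lessons: UsedLesson[]; // lessons used during the action, after any boost
  extractNow: boolean;   // true when lesson extraction should run immediately
}

// Applies one feedback signal to the lessons that were used for the action.
function applyFeedback(
  feedback: Feedback,
  used: UsedLesson[],
  boost: number = 0.05, // hypothetical boost size
): FeedbackResult {
  // Assumption: a correction by the user counts as negative feedback.
  const extractNow = feedback === 'negative' || feedback === 'corrected';

  const lessons = used.map((l) =>
    feedback === 'positive'
      ? { ...l, confidence: Math.min(1, l.confidence + boost) }
      : { ...l },
  );

  return { lessons, extractNow };
}
```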

Configuration

{
  "sis": {
    "enabled": true,
    "lessonExtraction": {
      "model": "local",
      "minConfidenceToStore": 0.3,
      "maxLessonsPerDay": 50
    },
    "retrieval": {
      "maxLessonsInPrompt": 5,
      "minConfidenceToUse": 0.5,
      "recencyBias": 0.2
    },
    "maintenance": {
      "consolidationInterval": "6h",
      "decayAfterDays": 30,
      "decayRate": 0.1
    },
    "feedback": {
      "askForConfirmation": false,
      "trackImplicitFeedback": true
    }
  }
}
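
The `maintenance` values need to be interpreted before use, for example turning `"6h"` into milliseconds. The sketch below assumes intervals are a whole number followed by one of `s`, `m`, `h`, or `d`; the source shows only `"6h"`, so the other units and the `parseInterval` helper are hypothetical.

```typescript
// Typed view of the "maintenance" block from the configuration above.
interface SisMaintenanceConfig {
  consolidationInterval: string; // e.g. "6h"
  decayAfterDays: number;
  decayRate: number;
}

// Assumed unit suffixes and their length in milliseconds.
const UNIT_MS: { [unit: string]: number } = {
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Converts an interval such as "6h" into milliseconds; throws on bad input.
function parseInterval(text: string): number {
  const match = /^(\d+)([smhd])$/.exec(text.trim());
  if (match === null) {
    throw new Error(`Unsupported interval: ${text}`);
  }
  const digits = match[1] ?? '';
  const unit = match[2] ?? '';
  const unitMs = UNIT_MS[unit];
  if (unitMs === undefined) {
    throw new Error(`Unsupported interval unit: ${unit}`);
  }
  return Number(digits) * unitMs;
}

const maintenance: SisMaintenanceConfig = {
  consolidationInterval: '6h',
  decayAfterDays: 30,
  decayRate: 0.1,
};

// Interval between consolidation runs, in milliseconds.
const consolidationMs = parseInterval(maintenance.consolidationInterval);
```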

Summary

The SIS layer makes Argent a learning system that:

Remembers mistakes and doesn’t repeat them
Recognizes patterns in user behavior and external systems
Accumulates tool knowledge from experience
Improves model routing based on actual performance
Self-maintains through consolidation and decay

This transforms Argent from a stateless assistant into one that accumulates experience and becomes more effective over time.
