Overview
Profile Collapse is a memory hygiene system that automatically detects and merges duplicate operational profile snapshots in MemU. As the agent operates, heartbeat cycles, status checks, and system monitoring generate many profile-type memory items that contain ephemeral operational data: metrics, counts, timestamps, uptime reports. These accumulate rapidly and pollute memory recall with near-identical entries.
Profile Collapse identifies these duplicates through a three-phase grouping strategy (exact match, token match, fuzzy match) and merges them into canonical entries, transferring reinforcement counts to preserve signal strength.
What Gets Collapsed
Not all profile items are candidates for collapse. The system targets operational profile snapshots: items whose summaries contain operational keywords combined with numeric or datetime tokens.
Operational Detection
A profile item is classified as operational if its summary matches both:
- Operational keywords: status, snapshot, health, metric, count, queue, uptime, latency, ticket, alert, cron, heartbeat, service, gateway, dashboard, api, provider, model
- Ephemeral tokens: numeric values (42, 99.5%) or datetime stamps (2026-03-15)
Operational (will collapse):
- “Gateway health: 3 agents connected, latency 45ms, uptime 99.2%”
- “Heartbeat status 2026-03-15: 12 tasks verified, 2 failed, score 85”
- “Discord channel metrics: 47 messages today, 3 alerts pending”
Not operational (preserved): profile items whose summaries lack either an operational keyword or an ephemeral token.
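
The two-condition check described above might be sketched as follows. This is a minimal illustration, not the actual implementation: the keyword list comes from this page, while the function name `isOperationalSummary` and both regular expressions are assumptions.

```typescript
// Keyword list taken from the documentation above.
const OPERATIONAL_KEYWORDS = [
  "status", "snapshot", "health", "metric", "count", "queue", "uptime",
  "latency", "ticket", "alert", "cron", "heartbeat", "service", "gateway",
  "dashboard", "api", "provider", "model",
];

// Assumption: match whole words, case-insensitively, allowing a plural "s"
// so that "metrics" and "alerts" count as keyword hits.
const KEYWORD_RE = new RegExp(`\\b(?:${OPERATIONAL_KEYWORDS.join("|")})s?\\b`, "i");

// Assumption: any digit is enough to signal a number, percentage, or date.
const EPHEMERAL_RE = /\d/;

// Hypothetical helper: true only when BOTH conditions hold.
function isOperationalSummary(summary: string): boolean {
  return KEYWORD_RE.test(summary) && EPHEMERAL_RE.test(summary);
}
```

Under these assumptions, a summary such as "Gateway is the preferred entry point" is preserved because it has a keyword but no ephemeral token.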
Three-Phase Grouping
Phase 1: Exact Signature Match
The system normalizes operational summaries into signatures by replacing ephemeral tokens with the placeholders below. Items with identical signatures are exact duplicates.
| Token Type | Replacement | Example |
|---|---|---|
| Dates/timestamps | <datetime> | 2026-03-15T14:30:00Z becomes <datetime> |
| UUIDs and hex strings | <id> | a8f2b3c4-... becomes <id> |
| Request/run IDs | <id> | run-abc123 becomes <id> |
| Numbers | <num> | 42, 99.5% become <num> |
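
As a rough sketch of the normalization in the table above, the function below applies the replacements in order, most specific first. The placeholder names come from the table; the function name `toSignature`, the exact patterns, and the replacement order are assumptions, and the real implementation may recognize more formats.

```typescript
// Assumed patterns for each token type in the table.
const UUID_RE = /\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi;
const RUN_ID_RE = /\b(?:run|req|request)-[a-z0-9]+\b/gi;
const DATETIME_RE =
  /\b\d{4}-\d{2}-\d{2}(?:[T ]\d{2}:\d{2}(?::\d{2})?(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?)?/gi;
// Hex strings: 8+ hex characters containing at least one letter and one digit.
const HEX_RE = /\b(?=[0-9a-f]*[a-f])(?=[0-9a-f]*\d)[0-9a-f]{8,}\b/gi;
const NUMBER_RE = /\d+(?:\.\d+)?%?/g;

// Hypothetical helper: IDs and datetimes are replaced before plain numbers
// so their digits are not consumed piecemeal by the number pattern.
function toSignature(summary: string): string {
  return summary
    .replace(UUID_RE, "<id>")
    .replace(RUN_ID_RE, "<id>")
    .replace(DATETIME_RE, "<datetime>")
    .replace(HEX_RE, "<id>")
    .replace(NUMBER_RE, "<num>");
}
```

With this sketch, "Gateway health: 3 agents connected" and "Gateway health: 5 agents connected" produce the same signature and would fall into one exact-match group.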
Phase 2: Token Key Match
Items not caught by exact matching are further grouped using semantic token keys. The system:
- Generates the normalized signature
- Strips stopwords (the, is, are, was, this, that, etc.)
- Strips placeholder tokens (<id>, <num>, <datetime>)
- Applies basic stemming (plural stripping: services becomes service)
- Sorts remaining tokens alphabetically
- Joins as a token key
Items with identical token keys (and at least 3 semantic tokens) are token duplicates.
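
The steps above can be sketched as a small pipeline that takes an already-normalized signature and returns a token key. The names `semanticTokens` and `tokenKey` are hypothetical, the stopword list beyond the six words named above is an assumption, and so are lowercasing, de-duplicating tokens, and joining with a space.

```typescript
// The first six stopwords come from the text; the rest are illustrative guesses.
const STOPWORDS = new Set([
  "the", "is", "are", "was", "this", "that", "a", "an", "of", "to", "in", "and",
]);
const PLACEHOLDER_RE = /<(?:id|num|datetime)>/g;
const MIN_SEMANTIC_TOKENS = 3; // from the rule stated above

// Naive plural stripping; leaves "ss" and "us" endings (e.g. "status") alone.
function stem(token: string): string {
  const plural =
    token.length > 3 && token.endsWith("s") &&
    !token.endsWith("ss") && !token.endsWith("us");
  return plural ? token.slice(0, -1) : token;
}

// Hypothetical helper: signature -> sorted, de-duplicated semantic tokens.
function semanticTokens(signature: string): string[] {
  const words = signature
    .toLowerCase()
    .replace(PLACEHOLDER_RE, " ")
    .split(/[^a-z]+/)
    .filter((w) => w.length > 0 && !STOPWORDS.has(w))
    .map(stem);
  return Array.from(new Set(words)).sort();
}

// Returns null when too few semantic tokens remain to form a reliable key.
function tokenKey(signature: string): string | null {
  const tokens = semanticTokens(signature);
  return tokens.length >= MIN_SEMANTIC_TOKENS ? tokens.join(" ") : null;
}
```

Because the tokens are sorted, two signatures that use the same words in a different order map to the same key, which is what lets this phase catch duplicates that exact matching misses.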
Phase 3: Fuzzy Match (Optional)
When enabled, items not caught by exact or token matching are clustered using Jaccard similarity on their semantic token sets:
- Similarity threshold: 0.78 (78% token overlap)
- Minimum intersection: 4 shared tokens
- Clustering: Greedy nearest-cluster assignment, sorted by creation time
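
One way to read the fuzzy phase is sketched below. Jaccard similarity is the size of the intersection divided by the size of the union of two token sets. The thresholds are the ones listed above; the `FuzzyCandidate` shape, the function names, and the choice to compare each item against a cluster's oldest member are assumptions made for illustration.

```typescript
// Hypothetical input shape for this sketch.
interface FuzzyCandidate {
  id: string;
  createdAt: number; // epoch milliseconds
  tokens: Set<string>; // semantic tokens from the previous phase
}

const SIMILARITY_THRESHOLD = 0.78;
const MIN_SHARED_TOKENS = 4;

function sharedCount(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared += 1;
  });
  return shared;
}

function jaccard(a: Set<string>, b: Set<string>): number {
  const shared = sharedCount(a, b);
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Greedy assignment: visit items oldest first and attach each one to the
// most similar existing cluster that passes both thresholds.
function clusterFuzzy(items: FuzzyCandidate[]): FuzzyCandidate[][] {
  const clusters: FuzzyCandidate[][] = [];
  const ordered = items.slice().sort((x, y) => x.createdAt - y.createdAt);
  for (const item of ordered) {
    let best: FuzzyCandidate[] | null = null;
    let bestScore = 0;
    for (const cluster of clusters) {
      const seed = cluster[0]; // assumption: oldest member represents the cluster
      const score = jaccard(item.tokens, seed.tokens);
      const passes =
        score >= SIMILARITY_THRESHOLD &&
        sharedCount(item.tokens, seed.tokens) >= MIN_SHARED_TOKENS;
      if (passes && score > bestScore) {
        best = cluster;
        bestScore = score;
      }
    }
    if (best) best.push(item);
    else clusters.push([item]);
  }
  // Only clusters with more than one member represent duplicates.
  return clusters.filter((c) => c.length > 1);
}
```

The minimum-intersection rule matters for short summaries: two three-token sets can be identical (similarity 1.0) and still be rejected here because they share fewer than four tokens.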
Canonical Selection
When a group of duplicates is found, one item is selected as the canonical (keeper) entry. Selection priority:
- Highest significance: core > important > noteworthy > routine
- Highest reinforcement count: more reinforced = more important
- Oldest creation date: preserve the first observation
- Lexicographic ID: deterministic tiebreaker
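
The priority list above maps naturally onto a chained comparator. The sketch below is illustrative only: the field names on `ProfileItem` and the function names are assumptions (the real `MemoryItem` and `Significance` types live in `src/memory/memu-types.ts`), and it assumes the lexicographically smallest ID wins the final tiebreak.

```typescript
type Significance = "core" | "important" | "noteworthy" | "routine";

// Hypothetical item shape for this sketch.
interface ProfileItem {
  id: string;
  significance: Significance;
  reinforcementCount: number;
  createdAt: number; // epoch milliseconds
}

const SIGNIFICANCE_RANK: Record<Significance, number> = {
  core: 3,
  important: 2,
  noteworthy: 1,
  routine: 0,
};

// Negative result means `a` is the better canonical candidate.
function compareForCanonical(a: ProfileItem, b: ProfileItem): number {
  return (
    SIGNIFICANCE_RANK[b.significance] - SIGNIFICANCE_RANK[a.significance] || // higher first
    b.reinforcementCount - a.reinforcementCount || // more reinforced first
    a.createdAt - b.createdAt || // older first
    (a.id < b.id ? -1 : a.id > b.id ? 1 : 0) // deterministic tiebreak
  );
}

function selectCanonical(group: ProfileItem[]): ProfileItem {
  if (group.length === 0) throw new Error("cannot select from an empty group");
  return group.slice().sort(compareForCanonical)[0];
}
```

Each criterion only comes into play when all earlier ones tie, so a `core` item with one reinforcement still beats a `routine` item with fifty.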
Reinforcement Transfer
When duplicates are removed, their reinforcement counts are transferred to the canonical item. Each deleted duplicate contributes at least 1 reinforcement (or its actual count if higher). This preserves the signal that the observation was seen multiple times.
Running Profile Collapse
By default, collapse runs in dry-run mode: it reports what would happen without deleting anything. A live run, which actually deletes the duplicates, requires setting dryRun to false.
Options
| Option | Type | Default | Description |
|---|---|---|---|
| dryRun | boolean | true | Report only, no deletions |
| enableFuzzy | boolean | false | Enable Phase 3 fuzzy matching |
| maxSampleGroups | number | 20 | Max sample groups in report |
| batchSize | number | 500 | Pagination size for profile listing |
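
For reference, the options table could be modeled as a type with defaults, as in the sketch below. The option names and default values are taken from the table; the interface name `ProfileCollapseOptions` and the helper `resolveOptions` are hypothetical and may not match the actual exports of `profile-collapse.ts`.

```typescript
// Hypothetical type mirroring the options table.
interface ProfileCollapseOptions {
  dryRun?: boolean; // report only, no deletions
  enableFuzzy?: boolean; // enable Phase 3 fuzzy matching
  maxSampleGroups?: number; // max sample groups in report
  batchSize?: number; // pagination size for profile listing
}

// Defaults as documented in the table.
const DEFAULT_OPTIONS: Required<ProfileCollapseOptions> = {
  dryRun: true,
  enableFuzzy: false,
  maxSampleGroups: 20,
  batchSize: 500,
};

// Hypothetical helper: fill in any option the caller left out.
function resolveOptions(
  overrides: ProfileCollapseOptions = {},
): Required<ProfileCollapseOptions> {
  return {
    dryRun: overrides.dryRun ?? DEFAULT_OPTIONS.dryRun,
    enableFuzzy: overrides.enableFuzzy ?? DEFAULT_OPTIONS.enableFuzzy,
    maxSampleGroups: overrides.maxSampleGroups ?? DEFAULT_OPTIONS.maxSampleGroups,
    batchSize: overrides.batchSize ?? DEFAULT_OPTIONS.batchSize,
  };
}
```

Under this sketch, `resolveOptions({ dryRun: false, enableFuzzy: true })` would describe a live run with fuzzy matching enabled, while `resolveOptions()` keeps the safe dry-run default.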
Collapse Report
The function returns a detailed report, which includes up to maxSampleGroups sample groups.
Foreign Key Safety
When deleting duplicate items, the system handles foreign key constraints gracefully. If a delete fails due to FK constraints (junction tables like item_categories, category_items, item_entities), it cleans up the junction rows first and retries the deletion.
Integration Points
Profile collapse can be triggered:
- Manually via the memory hygiene tool
- Periodically as part of scheduled memory maintenance
- On demand when memory item counts exceed thresholds
Key Files
| File | LOC | Description |
|---|---|---|
| src/memory/hygiene/profile-collapse.ts | 469 | Core collapse implementation |
| src/memory/memu-store.ts | — | MemuStore (provides item listing, deletion, reinforcement) |
| src/memory/memu-types.ts | — | MemoryItem and Significance types |
