# Semantic Search

> How MemU searches memories using FTS5 queries, embedding similarity, and reranking.

## Overview

When the agent calls `memory_recall`, MemU performs a multi-stage search that combines full-text matching with semantic similarity to find the most relevant memories.

## Search Pipeline

```mermaid theme={null}
flowchart LR
  A["Query"] --> B["FTS5 Search"]
  B --> C["Embedding Similarity"]
  C --> D["Score Fusion"]
  D --> E["Rerank"]
  E --> F["Results"]
```

### Stage 1: FTS5 Full-Text Search

The query is run against the FTS5 index. This finds memories that contain matching terms:

```sql theme={null}
SELECT * FROM memory_fts WHERE memory_fts MATCH 'infrastructure specs'
ORDER BY rank
LIMIT 50
```

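By default, the FTS5 `rank` column is a BM25 score based on term frequency, inverse document frequency, and document length. Lower values indicate better matches, so `ORDER BY rank` returns the most relevant rows first.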
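To try the Stage 1 query outside MemU, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The `memory_fts` table name comes from the query above. The single `content` column, the sample rows, and the `fts_search` helper are assumptions made for illustration, and the snippet needs an SQLite build with FTS5 enabled.

```python
import sqlite3


def fts_search(conn, query, limit=50):
    """Return (rowid, content, rank) tuples, ordered by FTS5 rank."""
    return conn.execute(
        "SELECT rowid, content, rank FROM memory_fts "
        "WHERE memory_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical schema: one indexed text column. MemU's real schema may differ.
conn.execute("CREATE VIRTUAL TABLE memory_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO memory_fts (content) VALUES (?)",
    [
        ("Jason shared the infrastructure specs for the new cluster",),
        ("Meeting notes about the telegram channel setup",),
        ("Infrastructure budget review",),
    ],
)

# Two bare terms are an implicit AND, so only the first row matches.
rows = fts_search(conn, "infrastructure specs")
for rowid, content, rank in rows:
    print(rowid, content, rank)
```

Passing the query as a bound parameter keeps user input out of the SQL string itself, although the text is still parsed as FTS5 query syntax.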

### Stage 2: Embedding Similarity

The query is embedded into a vector and compared against stored memory embeddings using cosine similarity:

```
similarity = dot(query_embedding, memory_embedding) / (|query_embedding| * |memory_embedding|)
```

<Tip>
  This catches semantically related memories that may not share exact words with the query. For example, "computer setup" would match a memory about "infrastructure specs."
</Tip>
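The cosine formula above can be sketched in plain Python as follows. The `cosine_similarity` name and the zero-vector handling are assumptions for illustration; a real deployment would more likely use a vectorized library or a vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector carries no direction, so report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
query_embedding = [1.0, 2.0, 3.0]
memory_embedding = [2.0, 4.0, 6.0]
print(cosine_similarity(query_embedding, memory_embedding))
```

Because cosine similarity depends only on direction, the two toy vectors above score as a perfect match even though their magnitudes differ.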

### Stage 3: Score Fusion

FTS5 scores and embedding similarity scores are combined:

```
final_score = (fts5_score * fts5_weight) + (similarity² * embedding_weight) + significance_boost
```

Key details:

* Similarity is squared (`sim²`) to widen the gap between strong and weak matches: a similarity of 0.9 contributes 0.81, while 0.3 contributes only 0.09
* Significance adds a boost based on the memory's importance score
* Separately from search scoring, memory recall queries get a +0.25 boost to the model routing score, which ensures they are handled by at least the Haiku tier
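As a rough illustration of the fusion formula, the sketch below computes `final_score` for a strong and a weak semantic match. The `fuse_scores` name, the 0.5 default weights, and the assumption that the FTS5 score has already been normalized so that higher is better are all placeholders, not MemU's actual values.

```python
def fuse_scores(fts5_score, similarity, significance_boost=0.0,
                fts5_weight=0.5, embedding_weight=0.5):
    """Combine the Stage 1 and Stage 2 scores into one fusion score.

    Assumes fts5_score is normalized so that higher means a better match.
    The default weights are hypothetical placeholders.
    """
    return (
        (fts5_score * fts5_weight)
        + (similarity ** 2) * embedding_weight
        + significance_boost
    )


# Same keyword score and significance boost, different semantic similarity.
strong = fuse_scores(fts5_score=0.8, similarity=0.9, significance_boost=0.05)
weak = fuse_scores(fts5_score=0.8, similarity=0.3, significance_boost=0.05)
print(strong, weak)
```

With these placeholder weights the strong match scores roughly 0.855 and the weak one roughly 0.495, which shows how squaring shrinks the contribution of low-similarity matches.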

### Stage 4: Reranking

Results are reranked with additional signals:

* **Recency**: More recent memories get a slight boost
* **Reinforcement**: Memories that have been recalled frequently rank higher; the boost scales with log2 of the reinforcement count, so repeated recalls give diminishing returns
* **Entity matching**: Memories mentioning entities in the query get a boost

<Note>
  If the reranker encounters errors, it falls back safely to the fusion score ordering.
</Note>
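The reranking signals and the fallback behavior could be sketched as follows. Everything here is hypothetical: the field names (`recall_count`, `entities`), the boost sizes, and the 30-day recency decay are illustrative choices, not MemU's real parameters.

```python
import math
from datetime import datetime, timezone


def rerank_score(memory, query_entities, now):
    """Fusion score plus small boosts for the three reranking signals."""
    age_days = max((now - memory["created_at"]).total_seconds() / 86400.0, 0.0)
    recency = 0.05 / (1.0 + age_days / 30.0)  # slight boost that decays with age
    reinforcement = 0.02 * math.log2(1 + memory.get("recall_count", 0))
    shared = set(query_entities) & set(memory.get("entities", ()))
    entity = 0.1 if shared else 0.0
    return memory["score"] + recency + reinforcement + entity


def rerank(memories, query_entities, now=None):
    """Rerank memories, falling back to fusion-score order on any error."""
    now = now or datetime.now(timezone.utc)
    fused_order = sorted(memories, key=lambda m: m["score"], reverse=True)
    try:
        return sorted(
            memories,
            key=lambda m: rerank_score(m, query_entities, now),
            reverse=True,
        )
    except Exception:
        return fused_order


now = datetime(2026, 2, 1, tzinfo=timezone.utc)
memories = [
    {"id": 1, "score": 0.80, "created_at": now, "recall_count": 0, "entities": []},
    {"id": 2, "score": 0.75, "created_at": now, "recall_count": 7, "entities": ["Jason"]},
]
reranked = rerank(memories, {"Jason"}, now=now)
print([m["id"] for m in reranked])
```

In this toy data, memory 2 starts with the lower fusion score but overtakes memory 1 after the reinforcement and entity boosts are applied. A record that is missing `created_at` raises inside `rerank_score`, and the list comes back in plain fusion-score order.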

## Query Optimization

### Best Practices

```
Good: "Jason infrastructure"          # Specific keywords
Good: "telegram OR discord setup"     # OR for alternatives
Good: "meeting notes February"        # Scoped by time context

Bad:  "what did we talk about"        # Too vague
Bad:  "the thing from yesterday"      # No searchable terms
```

### FTS5 Syntax

| Syntax           | Example               | Meaning                |
| ---------------- | --------------------- | ---------------------- |
| `word`           | `infrastructure`      | Match this term        |
| `word1 word2`    | `Jason preferences`   | Match both terms (AND) |
| `word1 OR word2` | `telegram OR discord` | Match either term      |
| `"phrase"`       | `"model router"`      | Match exact phrase     |
| `word*`          | `config*`             | Prefix match           |
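To see how each syntax form in the table behaves, the following sketch runs all five against a tiny in-memory FTS5 table. The table layout and sample rows are invented for illustration, and the snippet needs an SQLite build with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-column table; rowids are assigned 1..5 in insert order.
conn.execute("CREATE VIRTUAL TABLE memory_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO memory_fts (content) VALUES (?)",
    [
        ("Jason preferences for infrastructure",),
        ("telegram bot setup",),
        ("discord bot setup",),
        ("model router configuration",),
        ("router for the model",),
    ],
)


def matching_rowids(query):
    """Return the rowids matching an FTS5 query, in ascending rowid order."""
    cursor = conn.execute(
        "SELECT rowid FROM memory_fts WHERE memory_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in cursor]


queries = [
    "infrastructure",        # single term
    "Jason preferences",     # implicit AND
    "telegram OR discord",   # either term
    '"model router"',        # exact phrase, so row 5 does not match
    "config*",               # prefix match, finds "configuration"
]
results = {q: matching_rowids(q) for q in queries}
for q, ids in results.items():
    print(q, ids)
```

The phrase query is the one that differs most from a plain keyword search: row 5 contains both words but not adjacent and in order, so only row 4 matches.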

## Result Format

Search results are returned to the agent as a structured list:

```json theme={null}
[
  {
    "id": 42,
    "content": "Jason runs 2x NVIDIA DGX Spark with 256GB unified memory",
    "significance": 8,
    "score": 0.92,
    "created_at": "2026-01-15T10:30:00Z"
  }
]
```
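For callers that consume this payload, a minimal parsing sketch might look like the following. The `MemoryResult` dataclass and `parse_results` helper are hypothetical names, and the field types are inferred from the single sample above rather than from a published schema.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MemoryResult:
    id: int
    content: str
    significance: int
    score: float
    created_at: datetime


def parse_results(payload):
    """Parse the JSON result list into MemoryResult objects."""
    return [
        MemoryResult(
            id=item["id"],
            content=item["content"],
            significance=item["significance"],
            score=float(item["score"]),
            # Swap the trailing "Z" for an explicit offset so that
            # datetime.fromisoformat also works on older Python versions.
            created_at=datetime.fromisoformat(
                item["created_at"].replace("Z", "+00:00")
            ),
        )
        for item in json.loads(payload)
    ]


payload = """
[
  {
    "id": 42,
    "content": "Jason runs 2x NVIDIA DGX Spark with 256GB unified memory",
    "significance": 8,
    "score": 0.92,
    "created_at": "2026-01-15T10:30:00Z"
  }
]
"""
results = parse_results(payload)
print(results[0].id, results[0].score, results[0].created_at.isoformat())
```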
