docs: expand README with search algorithm internals and secrets architecture

Add detailed Search Algorithm section covering FTS5/BM25 (SQLite) and
tsvector/ts_rank (PostgreSQL) backends, query construction, semantic
expansion at store time, and a comparison table. Also add Sensitive
Memory & Secrets section, Auto-Learn details, and Background Sync Engine
documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-03-16 13:23:15 +00:00
parent 673647710e
commit a18b94d310

README.md

@ -89,6 +89,149 @@ Claude Code Session
└──────────────────────┘ └──────────────────────┘
``` ```
## Search Algorithm
Memory recall uses two different full-text search backends depending on the operating mode. Both follow the same query building pattern: the `context` and `expanded_query` fields from the user's recall request are concatenated into a single search string, then processed into a backend-specific query.
### Query Construction
When `memory_recall` is called, the tool receives:
- **`context`** — the topic or question being asked about
- **`expanded_query`** — REQUIRED, minimum 5 space-separated semantically related terms generated by Claude
These are concatenated: `all_terms = f"{context} {expanded_query}"`, then split into individual words for the search backend.
### SQLite: FTS5 with BM25
The local SQLite database uses an [FTS5 virtual table](https://www.sqlite.org/fts5.html) for full-text search with the [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) ranking algorithm.
**Schema:**
```sql
CREATE VIRTUAL TABLE memories_fts USING fts5(
content, category, tags, expanded_keywords,
content='memories', content_rowid='id'
);
```
This is an [external content table](https://www.sqlite.org/fts5.html#contentless_and_content_external_tables) backed by the `memories` table: FTS5 reads column data from `memories` on demand rather than storing its own copy. Triggers on INSERT, UPDATE, and DELETE keep the FTS index synchronized:
```sql
-- Insert trigger
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;
-- Delete trigger (FTS5 special 'delete' command)
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, content, category, tags, expanded_keywords)
VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
END;
-- Update trigger (delete old + insert new)
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, ...)
VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;
```
**Query building:** Each word is quoted and joined with OR:
```python
words = all_terms.split()
fts_query = " OR ".join(f'"{w}"' for w in words if w)
# Example: "kubernetes" OR "deployment" OR "pod" OR "scaling" OR "replicas"
```
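The one-liner above produces an invalid MATCH expression if a term itself contains a double quote. A slightly more defensive sketch follows; the helper name `build_fts_query` is an assumption for illustration, not a function taken from the project. It relies on the FTS5 rule that a double quote inside a string is escaped by doubling it.

```python
def build_fts_query(context: str, expanded_query: str) -> str:
    """Build an FTS5 MATCH expression that ORs every term as a quoted string."""
    all_terms = f"{context} {expanded_query}"
    quoted = []
    for word in all_terms.split():
        # FTS5 strings escape an embedded double quote by doubling it.
        escaped = word.replace('"', '""')
        quoted.append(f'"{escaped}"')
    return " OR ".join(quoted)


print(build_fts_query("kubernetes deployment", "pod scaling replicas"))
# "kubernetes" OR "deployment" OR "pod" OR "scaling" OR "replicas"
```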
**Ranking:** Two sort modes:
- `sort_by="relevance"`: `ORDER BY bm25(memories_fts), importance DESC`. Ranks by FTS5's built-in BM25 score, with importance as the tiebreaker. `bm25()` returns lower values for better matches, so the default ascending order puts the best matches first.
- `sort_by="importance"` (default): `ORDER BY importance DESC, created_at DESC`. Ranks by the user-assigned importance score, with recency as the tiebreaker.
**Fallback:** If the FTS5 MATCH query fails (e.g. syntax error from special characters), the search falls back to a `LIKE %query%` scan against `content` and `tags` columns.
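The try-then-fall-back flow can be sketched as below. This is a minimal illustration, not the project's code: the function name `search_memories`, its parameters, and the exact SQL are assumptions, and it catches `sqlite3.OperationalError`, which is what Python's `sqlite3` module raises for FTS5 syntax errors.

```python
import sqlite3


def search_memories(conn: sqlite3.Connection, fts_query: str, raw_query: str, limit: int = 10):
    """Try an FTS5 MATCH first; fall back to a LIKE scan if the MATCH fails."""
    try:
        return conn.execute(
            "SELECT m.id, m.content FROM memories_fts "
            "JOIN memories m ON m.id = memories_fts.rowid "
            "WHERE memories_fts MATCH ? "
            "ORDER BY bm25(memories_fts), m.importance DESC LIMIT ?",
            (fts_query, limit),
        ).fetchall()
    except sqlite3.OperationalError:
        # Slower path: substring scan over content and tags, no ranking.
        pattern = f"%{raw_query}%"
        return conn.execute(
            "SELECT id, content FROM memories "
            "WHERE content LIKE ? OR tags LIKE ? "
            "ORDER BY importance DESC LIMIT ?",
            (pattern, pattern, limit),
        ).fetchall()
```

The LIKE path cannot use the FTS index, so it scans every row; that is acceptable for a small local database but is only meant as a safety net.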
### PostgreSQL: tsvector with Weighted Ranking
The API server uses PostgreSQL's native [full-text search](https://www.postgresql.org/docs/current/textsearch.html) with a generated `tsvector` column and weighted ranking.
**Schema** (from migration `001`):
```sql
CREATE TABLE memories (
id SERIAL PRIMARY KEY,
content TEXT NOT NULL,
category VARCHAR(50) DEFAULT 'facts',
tags TEXT DEFAULT '',
expanded_keywords TEXT DEFAULT '',
importance REAL DEFAULT 0.5,
-- ...
search_vector tsvector GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(content, '')), 'A') ||
setweight(to_tsvector('english', coalesce(expanded_keywords, '')), 'B') ||
setweight(to_tsvector('english', coalesce(tags, '')), 'C') ||
setweight(to_tsvector('english', coalesce(category, '')), 'D')
) STORED
);
CREATE INDEX idx_memories_search ON memories USING GIN(search_vector);
```
The `search_vector` is a [generated column](https://www.postgresql.org/docs/current/ddl-generated-columns.html) that PostgreSQL maintains automatically on every INSERT/UPDATE — no triggers needed. It combines four fields with different weights:
| Weight | Field | Purpose |
|--------|-------|---------|
| **A** (highest) | `content` | The actual memory text — most important for matching |
| **B** | `expanded_keywords` | Semantic search terms generated by Claude at store time |
| **C** | `tags` | User-provided comma-separated tags |
| **D** (lowest) | `category` | Category name (facts, preferences, etc.) |
A [GIN index](https://www.postgresql.org/docs/current/gin.html) on `search_vector` enables fast lookup without sequential scans.
**Query execution:**
```sql
SELECT id, content, category, tags, importance, is_sensitive,
ts_rank(search_vector, query) AS rank,
created_at, updated_at
FROM memories, plainto_tsquery('english', $2) query
WHERE user_id = $1
AND deleted_at IS NULL
AND (search_vector @@ query OR $2 = '')
ORDER BY importance DESC, ts_rank(search_vector, query) DESC
LIMIT $3
```
[`plainto_tsquery`](https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES) parses the concatenated query text with PostgreSQL's `english` text search configuration: it stems words (e.g. "running" → "run"), drops stop words, normalizes the rest, and joins the surviving lexemes with AND. The `@@` operator matches the resulting `tsquery` against the stored `tsvector`.
[`ts_rank`](https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING) scores each match considering the weight of the matched field — a match in `content` (weight A) scores higher than a match in `tags` (weight C).
**Sort modes:**
- `sort_by="importance"` (default): `ORDER BY importance DESC, ts_rank(search_vector, query) DESC`
- `sort_by="relevance"`: `ORDER BY ts_rank(search_vector, query) DESC`
- `sort_by="recency"`: `ORDER BY created_at DESC`
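Because an `ORDER BY` clause cannot be passed as a bound parameter, one common way to support several sort modes is a fixed lookup table. The sketch below assumes that approach; the source does not say how the API server selects the clause, and `ORDER_BY` and `order_by_clause` are illustrative names.

```python
# Fixed whitelist: user input selects a key and is never spliced into SQL.
ORDER_BY = {
    "importance": "importance DESC, ts_rank(search_vector, query) DESC",
    "relevance": "ts_rank(search_vector, query) DESC",
    "recency": "created_at DESC",
}


def order_by_clause(sort_by: str = "importance") -> str:
    """Return the ORDER BY clause for a sort mode, rejecting unknown modes."""
    try:
        return "ORDER BY " + ORDER_BY[sort_by]
    except KeyError:
        raise ValueError(f"unsupported sort_by: {sort_by!r}") from None


print(order_by_clause("recency"))
# ORDER BY created_at DESC
```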
### Semantic Expansion at Store Time
When Claude stores a memory via `memory_store`, it generates an `expanded_keywords` field — at least 5 space-separated semantically related terms. For example, storing "User prefers Svelte for frontends" might produce:
```
expanded_keywords: "svelte frontend framework ui web sveltekit javascript component reactive"
```
These keywords are indexed alongside the memory content. When a future `memory_recall` searches for "what framework should I use for the dashboard?", the expanded keywords create additional matching surface that pure content matching would miss. This is a form of document expansion: the LLM enriches the stored text at write time, complementing the query expansion it performs at recall time via `expanded_query`.
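A toy illustration of the effect, using plain set overlap rather than either search backend. The tokenizer and the tiny stop-word list are assumptions made for the example only.

```python
import re

# Tiny illustrative stop-word list; neither backend uses exactly this set.
STOP_WORDS = {"what", "should", "i", "use", "for", "the"}


def terms(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS}


def matching_terms(query: str, indexed_text: str) -> set[str]:
    return terms(query) & terms(indexed_text)


content = "User prefers Svelte for frontends"
expanded_keywords = "svelte frontend framework ui web sveltekit javascript component reactive"
query = "what framework should I use for the dashboard?"

print(matching_terms(query, content))            # set()
print(matching_terms(query, expanded_keywords))  # {'framework'}
```

The memory text alone shares no meaningful term with the question; the stored keywords supply the one that matches.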
### SQLite vs PostgreSQL Comparison
| Feature | SQLite (FTS5) | PostgreSQL (tsvector) |
|---------|---------------|----------------------|
| Tokenization | `unicode61` tokenizer (FTS5 default) | `english` text search configuration |
| Stemming | No (exact word matching) | Yes (Snowball English stemmer via `english` config) |
| Stop words | No | Yes (removes "the", "is", "at", etc.) |
| Ranking | BM25 (term frequency + inverse document frequency) | `ts_rank` (weighted field matching) |
| Field weighting | Not used (`bm25()` is called without per-column weights) | Yes (A/B/C/D weights per column) |
| Index type | Inverted index in FTS5 shadow tables | GIN (Generalized Inverted Index) |
| Query syntax | Quoted OR terms (`"word1" OR "word2"`) | `plainto_tsquery` with implicit AND |
| Fallback | LIKE scan on failure | None needed (robust parser) |
| Maintenance | Triggers sync content ↔ FTS table | Generated column, auto-maintained |
## Operating Modes
| Mode | When | What happens |
@ -207,6 +350,31 @@ Each user gets isolated memory storage. Users cannot see each other's memories.
3. Restart the API server
4. Share the key with the user
## Sensitive Memory & Secrets
Sensitive data is handled through a multi-layer system:
### Automatic Detection
`credential_detector.py` scans all incoming content for secrets using regex patterns with confidence scores (0.5–0.95):
| Confidence | Pattern Types |
|------------|--------------|
| 0.95 | Private keys (RSA/EC/DSA/OPENSSH), AWS access keys (`AKIA`/`ASIA`), GitHub tokens (`ghp_`/`gho_`/etc.) |
| 0.85–0.90 | Connection strings (postgres/mysql/mongodb/redis/amqp), Basic auth headers, Bearer tokens |
| 0.75–0.80 | `api_key=`, `password=`, `token=` assignments |
| 0.60–0.65 | Generic `secret=`/`credential=` patterns, hex keys |
Detected content is **redacted** in the main DB (e.g., `[REDACTED:private_key]`) and the original stored in Vault or encrypted storage.
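A minimal sketch of pattern-based detection with redaction. The regexes, labels, and the `detect_and_redact` function are assumptions written for this example; they are not the patterns in `credential_detector.py`, and only the confidence values and the `[REDACTED:...]` format come from the description above.

```python
import re

# (label, confidence, pattern): illustrative patterns only.
PATTERNS = [
    ("private_key", 0.95, re.compile(
        r"-----BEGIN ([A-Z ]*)PRIVATE KEY-----.*?-----END \1PRIVATE KEY-----",
        re.DOTALL)),
    ("aws_access_key", 0.95, re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
    ("github_token", 0.95, re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b")),
    ("password_assignment", 0.75, re.compile(r"\bpassword\s*=\s*\S+", re.IGNORECASE)),
]


def detect_and_redact(text: str, threshold: float = 0.5):
    """Return (redacted_text, findings) where findings is [(label, confidence)]."""
    findings = []
    redacted = text
    for label, confidence, pattern in PATTERNS:
        if confidence < threshold:
            continue
        redacted, count = pattern.subn(f"[REDACTED:{label}]", redacted)
        if count:
            findings.append((label, confidence))
    return redacted, findings


print(detect_and_redact("deploy with password=hunter2"))
# ('deploy with [REDACTED:password_assignment]', [('password_assignment', 0.75)])
```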
### Encryption at Rest
Three tiers, checked in order:
1. **HashiCorp Vault** (KV v2) — if `VAULT_ADDR` + `VAULT_TOKEN` set. Secrets stored at `claude-memory/{user_id}/mem-{id}`. Supports Kubernetes Service Account auto-authentication.
2. **AES-256-GCM** — if `MEMORY_ENCRYPTION_KEY` set. Key can be hex-encoded 32 bytes or a passphrase (derived via SHA-256). Stored as `nonce (12 bytes) + ciphertext + GCM tag (16 bytes)` in the `encrypted_content` column.
3. **Plaintext fallback** — content stored as-is (still redacted in main column if detected as sensitive).
Retrieve the original via `secret_get(id=<memory_id>)`.
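The tier check, key handling, and blob layout can be sketched with the standard library alone. The function names are hypothetical, the sketch ignores the Kubernetes Service Account path (where Vault may be reachable without `VAULT_TOKEN`), and it stops short of the actual AES-GCM call, which needs a cryptography library.

```python
import hashlib
import os
from typing import Mapping


def select_tier(env: Mapping[str, str] = os.environ) -> str:
    """Pick the storage tier in the documented order."""
    if env.get("VAULT_ADDR") and env.get("VAULT_TOKEN"):
        return "vault"
    if env.get("MEMORY_ENCRYPTION_KEY"):
        return "aes-256-gcm"
    return "plaintext"


def derive_key(raw: str) -> bytes:
    """Use hex-encoded 32 bytes directly; otherwise hash the passphrase."""
    try:
        key = bytes.fromhex(raw)
        if len(key) == 32:
            return key
    except ValueError:
        pass  # not valid hex, so treat it as a passphrase
    return hashlib.sha256(raw.encode("utf-8")).digest()


def split_blob(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split nonce (12 bytes) + ciphertext + GCM tag (16 bytes)."""
    return blob[:12], blob[12:-16], blob[-16:]
```

A single SHA-256 pass is fast to brute-force for weak passphrases, so a random hex-encoded key is the safer choice where it is practical.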
## Plugin Hooks
| Hook | Event | What it does |
@ -217,6 +385,17 @@ Each user gets isolated memory storage. Users cannot see each other's memories.
| `auto-learn.py` | Stop (async) | Runs haiku-as-judge on the last exchange to extract durable facts worth storing |
| `auto-allow-memory-tools.py` | PermissionRequest | Auto-approves `memory_store`, `memory_recall`, `memory_list`, `memory_delete`, `secret_get` |
### Auto-Learn Details
The auto-learn hook runs asynchronously after each Claude response. It operates in two modes:
- **Single-turn** (every turn) — analyzes just the last user/assistant exchange. Cheap and fast. Extracts corrections, preferences, decisions, and facts.
- **Deep multi-turn** (every 5 turns) — analyzes the last 5 exchanges as a window. Also extracts debugging insights, workarounds, and operational knowledge.
**Judge fallback chain:** Claude CLI (haiku model) → local Ollama (qwen2.5:3b/llama3.2:3b/gemma2:2b/phi3:mini) → heuristic pattern matching (keyword-based extraction from user messages).
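The chain amounts to trying each judge in order and keeping the first result. A sketch with hypothetical names; how the hook actually invokes the Claude CLI or Ollama is not shown in this README, so the judges here are plain callables.

```python
def first_successful(judges, exchange):
    """Return (name, result) from the first judge that does not raise.

    `judges` is an ordered list of (name, callable) pairs.
    """
    errors = []
    for name, judge in judges:
        try:
            return name, judge(exchange)
        except Exception as exc:  # sketch only: real code would catch narrower errors
            errors.append((name, exc))
    raise RuntimeError(f"all judges failed: {errors}")
```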
Extracted events are deduplicated via SHA-256 content hashing. Each event is stored to both the MCP memory database and a `~/.claude/projects/<project>/memory/auto-learned.md` markdown file.
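Hash-based deduplication can be sketched as follows. The whitespace and case normalization before hashing is an assumption; the README states only that SHA-256 content hashing is used.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(events, seen=None):
    """Return events whose hash is not in `seen`, updating `seen` in place."""
    seen = set() if seen is None else seen
    fresh = []
    for event in events:
        digest = content_hash(event)
        if digest not in seen:
            seen.add(digest)
            fresh.append(event)
    return fresh


print(dedupe(["User prefers Svelte", "user  prefers svelte", "Uses PostgreSQL"]))
# ['User prefers Svelte', 'Uses PostgreSQL']
```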
### Debug
```bash
@ -227,6 +406,18 @@ export DEBUG_CLAUDE_MEMORY_HOOKS=1
export DISABLE_CLAUDE_MEMORY_AUTO_APPROVE=1
```
## Background Sync Engine
In hybrid mode, a `SyncEngine` runs in a daemon thread with its own SQLite connection (thread-safe via `threading.Lock`).
**Push cycle:** Reads the `pending_ops` queue (SQLite table) and sends each operation to the API. On success, removes the operation from the queue and updates the local `server_id` mapping. On failure, the operation stays queued for the next cycle.
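A sketch of one push cycle. The `pending_ops` column layout (`id`, `local_id`, `payload`), the `send` callable, and the choice to skip a failed operation and continue with the rest are all assumptions made for illustration.

```python
import json
import sqlite3


def push_pending(conn: sqlite3.Connection, send) -> int:
    """Push queued operations oldest-first and return how many succeeded.

    `send` takes the decoded operation and returns the server-side id,
    raising an exception on failure.
    """
    rows = conn.execute(
        "SELECT id, local_id, payload FROM pending_ops ORDER BY id"
    ).fetchall()
    pushed = 0
    for op_id, local_id, payload in rows:
        try:
            server_id = send(json.loads(payload))
        except Exception:
            # Failure: leave the row queued so the next cycle retries it.
            continue
        conn.execute(
            "UPDATE memories SET server_id = ? WHERE id = ?", (server_id, local_id)
        )
        conn.execute("DELETE FROM pending_ops WHERE id = ?", (op_id,))
        pushed += 1
    conn.commit()
    return pushed
```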
**Pull cycle:** Calls `GET /api/memories/sync?since=<last_sync_ts>` for incremental changes. Server wins on conflicts — remote updates overwrite local rows (matched by `server_id`). Soft-deleted rows on the server are hard-deleted locally.
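The server-wins merge can be sketched like this. The shape of each change record (`id`, `content`, `updated_at`, `deleted_at`) and the reduced local schema are assumptions; the real sync response may carry more fields.

```python
import sqlite3


def apply_remote_changes(conn: sqlite3.Connection, changes: list[dict]) -> None:
    """Merge server rows into the local table; the server copy always wins."""
    for row in changes:
        if row.get("deleted_at"):
            # Soft-deleted on the server: hard-delete locally.
            conn.execute("DELETE FROM memories WHERE server_id = ?", (row["id"],))
            continue
        updated = conn.execute(
            "UPDATE memories SET content = ?, updated_at = ? WHERE server_id = ?",
            (row["content"], row["updated_at"], row["id"]),
        )
        if updated.rowcount == 0:
            # Unknown server_id: the row was created elsewhere, so insert it.
            conn.execute(
                "INSERT INTO memories (server_id, content, updated_at) VALUES (?, ?, ?)",
                (row["id"], row["content"], row["updated_at"]),
            )
    conn.commit()
```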
**Write path:** Every `memory_store` first writes to local SQLite (instant), then attempts an immediate sync to the API. If the immediate sync fails, the write is enqueued in `pending_ops` for the next background cycle.
**Sync interval:** Configurable via `MEMORY_SYNC_INTERVAL` (default: 60 seconds).
## API Reference
| Endpoint | Method | Description |
@ -267,6 +458,7 @@ Aliases `CLAUDE_MEMORY_API_URL` and `CLAUDE_MEMORY_API_KEY` are also supported.
| `API_KEYS` | Multi-user JSON map `{"user": "key"}` | None |
| `VAULT_ADDR` | Vault server address | None |
| `VAULT_TOKEN` | Vault authentication token | None |
| `VAULT_ROLE` | Kubernetes auth role name | `claude-memory` |
| `MEMORY_ENCRYPTION_KEY` | AES-256 key for non-Vault encryption | None |
## Database Migrations
@ -278,6 +470,13 @@ export DATABASE_URL="postgresql://user:pass@localhost:5432/claude_memory"
alembic upgrade head
``` ```
Three migrations:
1. `001` — Creates `memories` table with generated `tsvector` column and GIN index
2. `002` — Adds `user_id` (multi-user), `is_sensitive`, `vault_path`, `encrypted_content`
3. `003` — Adds `deleted_at` (soft delete) and `updated_at` index for incremental sync
All migrations are idempotent (check column/table existence before altering).
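The check-before-alter idea, shown with SQLite as a stand-in so the example runs without a server. The project's migrations are Alembic scripts against PostgreSQL, so the helper below is an illustration of the pattern, not their code.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, ddl_type: str) -> bool:
    """Add a column only if it is absent; return True if it was added.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
print(add_column_if_missing(conn, "memories", "deleted_at", "TEXT"))  # True
print(add_column_if_missing(conn, "memories", "deleted_at", "TEXT"))  # False
```

Running the same step twice is harmless, which is the property that makes re-running `alembic upgrade head` safe.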
## Development
```bash
@ -288,8 +487,11 @@ source .venv/bin/activate
pip install -e ".[api,dev]"
pytest tests/ -v
ruff check src/ tests/
mypy src/claude_memory/ --strict
```
The MCP server itself (`mcp_server.py` and `sync.py`) uses **stdlib only** — no pip install needed on the client side. The `[api]` extra adds FastAPI, asyncpg, uvicorn, etc. for the server.
## License
Apache License 2.0. See [LICENSE](LICENSE) for details.