docs: expand README with search algorithm internals and secrets architecture
Add detailed Search Algorithm section covering FTS5/BM25 (SQLite) and tsvector/ts_rank (PostgreSQL) backends, query construction, semantic expansion at store time, and a comparison table. Also add Sensitive Memory & Secrets section, Auto-Learn details, and Background Sync Engine documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 673647710e
commit a18b94d310
1 changed file with 202 additions and 0 deletions
README.md
@ -89,6 +89,149 @@ Claude Code Session
└──────────────────────┘
```

## Search Algorithm

Memory recall uses one of two full-text search backends, depending on the operating mode. Both build the query the same way: the `context` and `expanded_query` fields from the user's recall request are concatenated into a single search string, which is then converted into a backend-specific query.

### Query Construction

When `memory_recall` is called, the tool receives:
- **`context`**: the topic or question being asked about
- **`expanded_query`**: REQUIRED, a minimum of 5 space-separated, semantically related terms generated by Claude

These are concatenated as `all_terms = f"{context} {expanded_query}"`, then split into individual words for the search backend.

### SQLite: FTS5 with BM25

The local SQLite database uses an [FTS5 virtual table](https://www.sqlite.org/fts5.html) for full-text search with the [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) ranking algorithm.

**Schema:**
```sql
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, category, tags, expanded_keywords,
    content='memories', content_rowid='id'
);
```

This is an [external content table](https://www.sqlite.org/fts5.html#contentless_and_content_external_tables) backed by the `memories` table: FTS5 stores only the index and reads column data from `memories` on demand rather than duplicating it. FTS5 does not track changes to the content table by itself, so triggers on INSERT, UPDATE, and DELETE keep the FTS index synchronized:

```sql
-- Insert trigger
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
    VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;

-- Delete trigger (FTS5 special 'delete' command)
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, category, tags, expanded_keywords)
    VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
END;

-- Update trigger (delete old + insert new)
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, ...)
    VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
    INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
    VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;
```

**Query building:** Each word is wrapped in double quotes and the words are joined with OR:
```python
words = all_terms.split()
fts_query = " OR ".join(f'"{w}"' for w in words if w)
# Example: "kubernetes" OR "deployment" OR "pod" OR "scaling" OR "replicas"
```

**Ranking:** Two sort modes:
- `sort_by="relevance"`: `ORDER BY bm25(memories_fts), importance DESC`. This uses FTS5's built-in BM25 score (smaller values mean better matches, so the default ascending order puts the best match first), with importance as tiebreaker
- `sort_by="importance"` (default): `ORDER BY importance DESC, created_at DESC`. This uses the user-assigned importance score, with recency as tiebreaker

**Fallback:** If the FTS5 MATCH query fails (e.g. a syntax error caused by special characters), the search falls back to a `LIKE '%query%'` scan against the `content` and `tags` columns.
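
The MATCH-then-LIKE path described above can be sketched with the standard `sqlite3` module. This is a minimal illustration, not the project's code: the `recall` and `build_fts_query` helpers, the trimmed-down `memories` table, and the sample rows are assumptions, and the query orders by `bm25()` alone to keep it single-table. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3

def build_fts_query(context: str, expanded_query: str = "") -> str:
    """Quote each word and join with OR, as in the query-building step."""
    words = f"{context} {expanded_query}".split()
    return " OR ".join(f'"{w}"' for w in words)

def recall(conn, context, expanded_query="", limit=10):
    """Try FTS5 MATCH first; fall back to a LIKE scan if FTS5 rejects the query."""
    try:
        return conn.execute(
            "SELECT rowid, content FROM memories_fts "
            "WHERE memories_fts MATCH ? "
            "ORDER BY bm25(memories_fts) LIMIT ?",
            (build_fts_query(context, expanded_query), limit),
        ).fetchall()
    except sqlite3.OperationalError:
        like = f"%{context}%"
        return conn.execute(
            "SELECT id, content FROM memories "
            "WHERE content LIKE ? OR tags LIKE ? "
            "ORDER BY importance DESC LIMIT ?",
            (like, like, limit),
        ).fetchall()

# Hypothetical, reduced schema: only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (
    id INTEGER PRIMARY KEY, content TEXT NOT NULL,
    category TEXT DEFAULT 'facts', tags TEXT DEFAULT '',
    expanded_keywords TEXT DEFAULT '', importance REAL DEFAULT 0.5
);
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, category, tags, expanded_keywords,
    content='memories', content_rowid='id'
);
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
    VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;
""")
conn.execute(
    "INSERT INTO memories (content, tags, expanded_keywords) VALUES (?, ?, ?)",
    ("Deploy with kubectl apply", "k8s", "kubernetes deployment pod scaling replicas"),
)
conn.execute("INSERT INTO memories (content) VALUES (?)", ('Config key is named foo"bar',))

print(recall(conn, "kubernetes"))  # matched through expanded_keywords via FTS5
print(recall(conn, 'foo"bar'))     # stray quote breaks the FTS5 query, LIKE scan answers
```

The second call shows why the fallback exists: wrapping `foo"bar` in quotes produces an unterminated string in FTS5 query syntax, so the MATCH raises and the LIKE scan takes over.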

### PostgreSQL: tsvector with Weighted Ranking

The API server uses PostgreSQL's native [full-text search](https://www.postgresql.org/docs/current/textsearch.html) with a generated `tsvector` column and weighted ranking.

**Schema** (from migration `001`):
```sql
CREATE TABLE memories (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    category VARCHAR(50) DEFAULT 'facts',
    tags TEXT DEFAULT '',
    expanded_keywords TEXT DEFAULT '',
    importance REAL DEFAULT 0.5,
    -- ...
    search_vector tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(content, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(expanded_keywords, '')), 'B') ||
        setweight(to_tsvector('english', coalesce(tags, '')), 'C') ||
        setweight(to_tsvector('english', coalesce(category, '')), 'D')
    ) STORED
);

CREATE INDEX idx_memories_search ON memories USING GIN(search_vector);
```

The `search_vector` is a [generated column](https://www.postgresql.org/docs/current/ddl-generated-columns.html) that PostgreSQL maintains automatically on every INSERT/UPDATE, so no triggers are needed. It combines four fields with different weights:

| Weight | Field | Purpose |
|--------|-------|---------|
| **A** (highest) | `content` | The actual memory text; most important for matching |
| **B** | `expanded_keywords` | Semantic search terms generated by Claude at store time |
| **C** | `tags` | User-provided comma-separated tags |
| **D** (lowest) | `category` | Category name (facts, preferences, etc.) |

A [GIN index](https://www.postgresql.org/docs/current/gin.html) on `search_vector` enables fast lookup without sequential scans.

**Query execution:**
```sql
SELECT id, content, category, tags, importance, is_sensitive,
       ts_rank(search_vector, query) AS rank,
       created_at, updated_at
FROM memories, plainto_tsquery('english', $2) query
WHERE user_id = $1
  AND deleted_at IS NULL
  AND (search_vector @@ query OR $2 = '')
ORDER BY importance DESC, ts_rank(search_vector, query) DESC
LIMIT $3
```

[`plainto_tsquery`](https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES) processes the concatenated query text through PostgreSQL's `english` text search configuration: it handles stemming (e.g. "running" → "run"), stop word removal, and normalization, then combines the remaining terms with AND. The `@@` operator matches the resulting `tsquery` against the stored `tsvector`.

[`ts_rank`](https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING) scores each match, taking the weight of the matched field into account: a match in `content` (weight A) scores higher than a match in `tags` (weight C).

**Sort modes:**
- `sort_by="importance"` (default): `ORDER BY importance DESC, ts_rank(search_vector, query) DESC`
- `sort_by="relevance"`: `ORDER BY ts_rank(search_vector, query) DESC`
- `sort_by="recency"`: `ORDER BY created_at DESC`
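
One way to wire the three sort modes into the recall statement is a fixed lookup table, since an `ORDER BY` clause cannot be passed as a bind parameter. The sketch below is an assumption about how this could be done, not the API server's code; `ORDER_BY` and `build_recall_sql` are hypothetical names, while the SQL fragments are the ones listed above.

```python
# Hypothetical whitelist: sort_by value -> ORDER BY clause from the list above.
ORDER_BY = {
    "importance": "importance DESC, ts_rank(search_vector, query) DESC",
    "relevance": "ts_rank(search_vector, query) DESC",
    "recency": "created_at DESC",
}

def build_recall_sql(sort_by: str = "importance") -> str:
    """Return the recall statement for a sort mode; unknown modes raise ValueError."""
    if sort_by not in ORDER_BY:
        raise ValueError(f"unsupported sort_by: {sort_by!r}")
    return (
        "SELECT id, content, category, tags, importance, is_sensitive, "
        "ts_rank(search_vector, query) AS rank, created_at, updated_at "
        "FROM memories, plainto_tsquery('english', $2) query "
        "WHERE user_id = $1 AND deleted_at IS NULL "
        "AND (search_vector @@ query OR $2 = '') "
        f"ORDER BY {ORDER_BY[sort_by]} LIMIT $3"
    )

print(build_recall_sql("recency"))
```

Only values from the whitelist ever reach the SQL text, so user input stays confined to the `$1`..`$3` parameters.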

### Semantic Expansion at Store Time

When Claude stores a memory via `memory_store`, it generates an `expanded_keywords` field containing at least 5 space-separated, semantically related terms. For example, storing "User prefers Svelte for frontends" might produce:

```
expanded_keywords: "svelte frontend framework ui web sveltekit javascript component reactive"
```

These keywords are indexed alongside the memory content. When a future `memory_recall` searches for "what framework should I use for the dashboard?", the expanded keywords create additional matching surface that pure content matching would miss. This is a form of query expansion performed at write time by the LLM (sometimes called document expansion).
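
The effect can be illustrated with plain set overlap, using the Svelte example above. This is a toy model, not either backend's matching logic: the `terms` helper and the tiny stop-word list are assumptions made for the illustration, and no stemming is applied.

```python
import re

# Tiny illustrative stop-word list (an assumption, not PostgreSQL's list).
STOP_WORDS = {"what", "should", "i", "use", "for", "the", "a", "to"}

def terms(text: str) -> set[str]:
    """Lowercased word set with the stop words above removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}

content = "User prefers Svelte for frontends"
expanded_keywords = "svelte frontend framework ui web sveltekit javascript component reactive"
query = "what framework should I use for the dashboard?"

content_only = terms(query) & terms(content)
with_keywords = terms(query) & (terms(content) | terms(expanded_keywords))

print(content_only)   # set(): the stored text shares no meaningful word with the query
print(with_keywords)  # {'framework'}: the stored keyword provides the match
```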

### SQLite vs PostgreSQL Comparison

| Feature | SQLite (FTS5) | PostgreSQL (tsvector) |
|---------|---------------|----------------------|
| Tokenization | `unicode61` tokenizer (the FTS5 default) | `english` text search configuration |
| Stemming | No (exact word matching) | Yes (Snowball English stemmer via `english` config) |
| Stop words | No | Yes (removes "the", "is", "at", etc.) |
| Ranking | BM25 (term frequency + inverse document frequency) | `ts_rank` (weighted field matching) |
| Field weighting | No (all FTS5 columns weighted equally) | Yes (A/B/C/D weights per column) |
| Index type | Inverted index stored in FTS5 shadow tables | GIN (Generalized Inverted Index) |
| Query syntax | Quoted OR terms (`"word1" OR "word2"`) | `plainto_tsquery` with implicit AND |
| Fallback | LIKE scan on failure | None needed (robust parser) |
| Maintenance | Triggers sync content ↔ FTS table | Generated column, auto-maintained |

## Operating Modes

| Mode | When | What happens |
@ -207,6 +350,31 @@ Each user gets isolated memory storage. Users cannot see each other's memories.
3. Restart the API server
4. Share the key with the user

## Sensitive Memory & Secrets

Sensitive data is handled through a multi-layer system:

### Automatic Detection
`credential_detector.py` scans all incoming content for secrets using regex patterns with confidence scores (0.5–0.95):

| Confidence | Pattern Types |
|------------|--------------|
| 0.95 | Private keys (RSA/EC/DSA/OPENSSH), AWS access keys (`AKIA`/`ASIA`), GitHub tokens (`ghp_`/`gho_`/etc.) |
| 0.85–0.90 | Connection strings (postgres/mysql/mongodb/redis/amqp), Basic auth headers, Bearer tokens |
| 0.75–0.80 | `api_key=`, `password=`, `token=` assignments |
| 0.60–0.65 | Generic `secret=`/`credential=` patterns, hex keys |

Detected content is **redacted** in the main DB (e.g., `[REDACTED:private_key]`) and the original is stored in Vault or encrypted storage.
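
A detect-and-redact pass of this shape can be sketched as follows. The regexes, the `PATTERNS` list, and the `detect_and_redact` helper are illustrative assumptions, not the contents of `credential_detector.py`; only the confidence tiers and the `[REDACTED:<kind>]` placeholder format come from the description above.

```python
import re

# Illustrative patterns only; the real list lives in credential_detector.py.
PATTERNS = [
    ("private_key", 0.95, re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
        r"[\s\S]*?-----END (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----")),
    ("aws_access_key", 0.95, re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
    ("github_token", 0.95, re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b")),
    ("password_assignment", 0.75, re.compile(r"(?i)\bpassword\s*=\s*\S+")),
]

def detect_and_redact(text: str, min_confidence: float = 0.5):
    """Return (redacted_text, findings); findings are (kind, confidence, secret)."""
    findings = []
    redacted = text
    for kind, confidence, pattern in PATTERNS:
        if confidence < min_confidence:
            continue

        def _replace(match, kind=kind, confidence=confidence):
            # Record the original so it can go to Vault or encrypted storage.
            findings.append((kind, confidence, match.group(0)))
            return f"[REDACTED:{kind}]"

        redacted = pattern.sub(_replace, redacted)
    return redacted, findings

redacted, findings = detect_and_redact(
    "deploy with AKIAIOSFODNN7EXAMPLE and password=hunter2"
)
print(redacted)
print([kind for kind, _, _ in findings])
```

The redacted string is what would land in the main column, while the captured originals are what the storage tiers below would protect.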

### Encryption at Rest

Three tiers, checked in order:
1. **HashiCorp Vault** (KV v2): used if `VAULT_ADDR` and `VAULT_TOKEN` are set. Secrets are stored at `claude-memory/{user_id}/mem-{id}`. Supports Kubernetes Service Account auto-authentication.
2. **AES-256-GCM**: used if `MEMORY_ENCRYPTION_KEY` is set. The key can be hex-encoded 32 bytes or a passphrase (derived via SHA-256). Stored as `nonce (12 bytes) + ciphertext + GCM tag (16 bytes)` in the `encrypted_content` column.
3. **Plaintext fallback**: content is stored as-is (still redacted in the main column if detected as sensitive).

Retrieve the original via `secret_get(id=<memory_id>)`.
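
The key handling and blob layout of the AES-256-GCM tier can be sketched with the standard library alone. The helper names `derive_key` and `split_blob` are hypothetical, and the rule used to tell a hex key from a passphrase (64 hex characters decode to exactly 32 bytes) is an assumption. The AES-GCM encryption step itself needs a cryptography library and is left out.

```python
import hashlib

NONCE_LEN = 12  # bytes, per the layout above
TAG_LEN = 16    # bytes, GCM authentication tag

def derive_key(raw: str) -> bytes:
    """Decode a hex-encoded 32-byte key, or hash a passphrase with SHA-256."""
    try:
        key = bytes.fromhex(raw)
        if len(key) == 32:
            return key
    except ValueError:
        pass  # not valid hex: treat the value as a passphrase
    return hashlib.sha256(raw.encode("utf-8")).digest()

def split_blob(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split an `encrypted_content` value into (nonce, ciphertext, tag)."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("blob is shorter than nonce + tag")
    return blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]

print(len(derive_key("00" * 32)), len(derive_key("correct horse battery staple")))
```

Both inputs yield a 32-byte key, which is the size AES-256 requires.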

## Plugin Hooks

| Hook | Event | What it does |
@ -217,6 +385,17 @@ Each user gets isolated memory storage. Users cannot see each other's memories.
| `auto-learn.py` | Stop (async) | Runs haiku-as-judge on the last exchange to extract durable facts worth storing |
| `auto-allow-memory-tools.py` | PermissionRequest | Auto-approves `memory_store`, `memory_recall`, `memory_list`, `memory_delete`, `secret_get` |

### Auto-Learn Details

The auto-learn hook runs asynchronously after each Claude response. It operates in two modes:

- **Single-turn** (every turn): analyzes just the last user/assistant exchange. Cheap and fast. Extracts corrections, preferences, decisions, and facts.
- **Deep multi-turn** (every 5 turns): analyzes the last 5 exchanges as a window. Also extracts debugging insights, workarounds, and operational knowledge.

**Judge fallback chain:** Claude CLI (haiku model) → local Ollama (qwen2.5:3b/llama3.2:3b/gemma2:2b/phi3:mini) → heuristic pattern matching (keyword-based extraction from user messages).

Extracted events are deduplicated via SHA-256 content hashing. Each event is stored to both the MCP memory database and a `~/.claude/projects/<project>/memory/auto-learned.md` markdown file.
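
Hash-based deduplication of extracted events can be sketched as below. The `content_hash` and `dedupe` helpers are hypothetical, and the normalization (collapsing whitespace and lowercasing before hashing) is an assumption; the hook may hash the raw text instead.

```python
import hashlib

def content_hash(text: str) -> str:
    # Assumption: collapse whitespace and lowercase before hashing.
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def dedupe(events, seen=None):
    """Return events whose hash is not yet in `seen`; `seen` is updated in place."""
    seen = set() if seen is None else seen
    fresh = []
    for event in events:
        digest = content_hash(event)
        if digest not in seen:
            seen.add(digest)
            fresh.append(event)
    return fresh

print(dedupe(["User prefers Svelte", "user  prefers svelte", "Uses PostgreSQL 16"]))
```

Passing the same `seen` set across calls would carry the deduplication over from one turn to the next.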

### Debug

```bash
@ -227,6 +406,18 @@ export DEBUG_CLAUDE_MEMORY_HOOKS=1
export DISABLE_CLAUDE_MEMORY_AUTO_APPROVE=1
```

## Background Sync Engine

In hybrid mode, a `SyncEngine` runs in a daemon thread with its own SQLite connection (thread-safe via `threading.Lock`).

**Push cycle:** Reads the `pending_ops` queue (SQLite table) and sends each operation to the API. On success, removes the operation from the queue and updates the local `server_id` mapping. On failure, the operation stays queued for the next cycle.

**Pull cycle:** Calls `GET /api/memories/sync?since=<last_sync_ts>` for incremental changes. Server wins on conflicts: remote updates overwrite local rows (matched by `server_id`). Soft-deleted rows on the server are hard-deleted locally.

**Write path:** Every `memory_store` first writes to local SQLite (instant), then attempts an immediate sync to the API. If the immediate sync fails, the write is enqueued in `pending_ops` for the next background cycle.

**Sync interval:** Configurable via `MEMORY_SYNC_INTERVAL` (default: 60 seconds).
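
The push cycle can be sketched as a loop over the queue table. Everything specific here is an assumption made for illustration: the column layout of `pending_ops` and `memories`, the `push_cycle` name, and the `send` callable that stands in for the HTTP call and returns the server-side id.

```python
import json
import sqlite3

def push_cycle(conn: sqlite3.Connection, send) -> int:
    """Send each queued op; delete it on success, leave it queued on failure."""
    pushed = 0
    rows = conn.execute(
        "SELECT id, local_id, payload FROM pending_ops ORDER BY id"
    ).fetchall()
    for op_id, local_id, payload in rows:
        try:
            server_id = send(json.loads(payload))
        except OSError:
            continue  # network-level failure: retry on the next cycle
        with conn:  # commit the mapping update and the dequeue together
            conn.execute(
                "UPDATE memories SET server_id = ? WHERE id = ?",
                (server_id, local_id),
            )
            conn.execute("DELETE FROM pending_ops WHERE id = ?", (op_id,))
        pushed += 1
    return pushed

# Hypothetical minimal schema for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, server_id INTEGER);
CREATE TABLE pending_ops (id INTEGER PRIMARY KEY, local_id INTEGER, payload TEXT);
""")
conn.execute("INSERT INTO memories (id, content) VALUES (1, 'User prefers Svelte')")
conn.execute(
    "INSERT INTO pending_ops (local_id, payload) VALUES (1, ?)",
    (json.dumps({"op": "store", "content": "User prefers Svelte"}),),
)
```

Updating `server_id` and deleting the queue row in one transaction means an interrupted cycle cannot record the mapping while leaving the operation queued, or the reverse.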

## API Reference

| Endpoint | Method | Description |
@ -267,6 +458,7 @@ Aliases `CLAUDE_MEMORY_API_URL` and `CLAUDE_MEMORY_API_KEY` are also supported.
| `API_KEYS` | Multi-user JSON map `{"user": "key"}` | None |
| `VAULT_ADDR` | Vault server address | None |
| `VAULT_TOKEN` | Vault authentication token | None |
| `VAULT_ROLE` | Kubernetes auth role name | `claude-memory` |
| `MEMORY_ENCRYPTION_KEY` | AES-256 key for non-Vault encryption | None |

## Database Migrations
@ -278,6 +470,13 @@ export DATABASE_URL="postgresql://user:pass@localhost:5432/claude_memory"
alembic upgrade head
```

Three migrations:
1. `001`: Creates the `memories` table with a generated `tsvector` column and GIN index
2. `002`: Adds `user_id` (multi-user), `is_sensitive`, `vault_path`, `encrypted_content`
3. `003`: Adds `deleted_at` (soft delete) and an `updated_at` index for incremental sync

All migrations are idempotent: each checks for column/table existence before altering.

## Development

```bash
@ -288,8 +487,11 @@ source .venv/bin/activate
pip install -e ".[api,dev]"
pytest tests/ -v
ruff check src/ tests/
mypy src/claude_memory/ --strict
```

The MCP server itself (`mcp_server.py` and `sync.py`) uses **stdlib only**, so no pip install is needed on the client side. The `[api]` extra adds FastAPI, asyncpg, uvicorn, etc. for the server.

## License

Apache License 2.0. See [LICENSE](LICENSE) for details.