Claude Memory MCP

Give Claude persistent memory that survives across sessions, context compactions, and machines.

claude plugins install github:ViktorBarzin/claude-memory-mcp

That's it. Claude now remembers things.

What It Does

You:    "remember I prefer Svelte for all new frontend apps"
Claude: Stored. ✓

  ── 3 weeks later, new session, different machine ──

You:    "build me a dashboard for this API"
Claude: I'll set up a SvelteKit project since that's your preference...

No configuration needed for single-machine use. Memories are stored in a local SQLite database with full-text search. For multi-machine sync, point it at an API server (details below).

Features

Slash Commands

  • /remember <fact> — store a fact, preference, decision, or person detail
  • /recall <query> — search memories by topic

Automatic Behaviors (via hooks)

  • Auto-recall — before responding, Claude checks stored memories for relevant context (preferences, past corrections, decisions). Completely invisible to the user.
  • Auto-learn — after each response, a background process analyzes the conversation for corrections, preferences, and decisions worth remembering. Uses haiku-as-judge for conservative extraction.
  • Compaction survival — when Claude's context window compacts, key memories are saved to a marker file and re-injected on the next prompt. No knowledge is lost.
  • Auto-approve — memory tool calls are approved automatically, no permission prompts.

MCP Tools

Claude has direct access to these tools during conversation:

| Tool | Description |
| --- | --- |
| memory_store | Store a fact with category, tags, importance, and semantic search keywords |
| memory_recall | Search memories using full-text search with expanded query terms |
| memory_list | List recent memories, optionally filtered by category |
| memory_delete | Delete a memory by ID |
| secret_get | Retrieve the decrypted content of a sensitive memory |
| memory_count | Get memory counts by category and sync status diagnostics |
| memory_share | Share a memory with another user (read or write access) |
| memory_unshare | Revoke sharing of a memory from a user |
| memory_share_tag | Share all memories with a tag (live rule — future memories included) |
| memory_unshare_tag | Revoke tag-based sharing |
| memory_update | Update an existing memory's content, tags, or importance |
| memory_shared_with_me | List memories others have shared with you |
| memory_my_shares | List all sharing rules you've created |

Sharing tools require an API connection (not available in SQLite-only mode).

Memory Categories

Memories are organized into: facts, preferences, projects, people, decisions

Sensitive Memory Support

Mark memories as sensitive with force_sensitive: true. When Vault is configured, sensitive content is encrypted at rest and only decryptable via secret_get.
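A call using force_sensitive might look like the following sketch (the argument names other than force_sensitive are assumptions based on the schema described in this README, not a verbatim copy of the tool signature):

```python
# Hypothetical memory_store tool-call payload; only force_sensitive is
# confirmed by the docs above, the other argument names are assumptions.
store_request = {
    "name": "memory_store",
    "arguments": {
        "content": "Staging DB password is hunter2",
        "category": "facts",
        "tags": "staging,database",
        "importance": 0.8,
        "force_sensitive": True,  # always encrypt, regardless of auto-detection
    },
}
```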

Architecture

Claude Code Session
┌──────────────────────────────────────────────────────┐
│  Hooks                        MCP Server             │
│  ┌──────────────────┐        ┌────────────────────┐  │
│  │ auto-recall      │───────▶│ memory_store       │  │
│  │ auto-learn       │        │ memory_recall      │  │
│  │ compaction       │        │ memory_list        │  │
│  │ auto-approve     │        │ memory_delete      │  │
│  └──────────────────┘        │ secret_get         │  │
│                              │ memory_count       │  │
│                              └─────────┬──────────┘  │
│                                        │             │
│              ┌─────────────────────────┼──────────┐  │
│              │ Local SQLite            │          │  │
│              │ (cache + FTS5)    SyncEngine       │  │
│              │ ◄──── always ────(background, 60s) │  │
│              │       used       push queued writes │  │
│              │                  pull server changes│  │
│              │  pending_ops ──────────┘            │  │
│              │  (offline write queue)              │  │
│              └────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘
                                │
              optional, for multi-machine sync
                                │
                     ┌──────────▼───────────┐
                     │  API Server (FastAPI) │
                     │  Docker / Kubernetes  │
                     └──────────┬───────────┘
                                │
                     ┌──────────▼───────────┐
                     │    PostgreSQL         │
                     │  (authoritative)      │
                     └──────────────────────┘

Sync Resilience

The SyncEngine is designed to handle failures gracefully:

  • Decoupled push/pull — push failures never block pull. Remote changes from other agents always flow in.
  • Auth failure detection — on 401/403, the engine sets an auth-failed flag, logs a clear warning, and degrades to SQLite-only mode. A periodic health check detects when auth is restored.
  • Startup full resync — on MCP server start, a full cache replacement runs to catch drift from other agents, deleted records, and schema changes.
  • Periodic full resync — every ~10 minutes, a full resync replaces incremental sync to catch any drift.
  • Retry cap — individual pending ops are retried up to 5 times, then permanently skipped to prevent queue buildup.
  • Orphan reconciliation — local records that never synced are deduplicated against server content before push.
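The retry-cap behavior above can be sketched roughly as follows (illustrative only, not the actual SyncEngine code; the op shape and error type are assumptions):

```python
MAX_RETRIES = 5  # per the retry-cap bullet above

def process_pending_op(op: dict, push_fn) -> str:
    """Try to push one queued op; skip it permanently after MAX_RETRIES failures."""
    if op["retry_count"] >= MAX_RETRIES:
        return "skipped"            # permanently dropped to prevent queue buildup
    try:
        push_fn(op)
        return "pushed"             # success: op leaves the queue
    except ConnectionError:
        op["retry_count"] += 1      # failure: stays queued for the next cycle
        return "queued"
```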

Search Algorithm

Memory recall uses one of two full-text search backends, depending on the operating mode. Both follow the same query-building pattern: the context and expanded_query fields from the recall request are concatenated into a single search string, then processed into a backend-specific query.

Query Construction

When memory_recall is called, the tool receives:

  • context — the topic or question being asked about
  • expanded_query — REQUIRED, minimum 5 space-separated semantically related terms generated by Claude

These are concatenated: all_terms = f"{context} {expanded_query}", then split into individual words for the search backend.

SQLite: FTS5 with BM25

The local SQLite database uses an FTS5 virtual table for full-text search with the BM25 ranking algorithm.

Schema:

CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, category, tags, expanded_keywords,
    content='memories', content_rowid='id'
);

This is a content table backed by the memories table — FTS5 reads column data from memories on demand rather than duplicating it. Triggers on INSERT, UPDATE, and DELETE keep the FTS index synchronized:

-- Insert trigger
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
    VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;

-- Delete trigger (FTS5 special 'delete' command)
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, category, tags, expanded_keywords)
    VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
END;

-- Update trigger (delete old + insert new)
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, category, tags, expanded_keywords)
    VALUES ('delete', old.id, old.content, old.category, old.tags, old.expanded_keywords);
    INSERT INTO memories_fts(rowid, content, category, tags, expanded_keywords)
    VALUES (new.id, new.content, new.category, new.tags, new.expanded_keywords);
END;

Query building: Each word is quoted and joined with OR:

words = all_terms.split()
fts_query = " OR ".join(f'"{w}"' for w in words if w)
# Example: "kubernetes" OR "deployment" OR "pod" OR "scaling" OR "replicas"

Ranking: Two sort modes:

  • sort_by="relevance" — ORDER BY bm25(memories_fts), importance DESC — FTS5's built-in BM25 relevance score, with importance as tiebreaker
  • sort_by="importance" (default) — ORDER BY importance DESC, created_at DESC — user-assigned importance score, with recency as tiebreaker

Fallback: If the FTS5 MATCH query fails (e.g. syntax error from special characters), the search falls back to a LIKE %query% scan against content and tags columns.
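The query build plus fallback can be sketched as below (a simplified, standalone illustration; the real code operates on the trigger-maintained content table described above, and the function names here are assumptions):

```python
import sqlite3

def build_fts_query(all_terms: str) -> str:
    """Quote each word and join with OR, as described above."""
    return " OR ".join(f'"{w}"' for w in all_terms.split() if w)

def recall(conn: sqlite3.Connection, all_terms: str):
    """MATCH against the FTS index; fall back to a LIKE scan on any FTS error."""
    try:
        return conn.execute(
            "SELECT content FROM memories_fts WHERE memories_fts MATCH ?",
            (build_fts_query(all_terms),),
        ).fetchall()
    except sqlite3.OperationalError:
        like = f"%{all_terms}%"   # LIKE %query% fallback on content and tags
        return conn.execute(
            "SELECT content FROM memories WHERE content LIKE ? OR tags LIKE ?",
            (like, like),
        ).fetchall()
```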

PostgreSQL: tsvector with Weighted Ranking

The API server uses PostgreSQL's native full-text search with a generated tsvector column and weighted ranking.

Schema (from migration 001):

CREATE TABLE memories (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    category VARCHAR(50) DEFAULT 'facts',
    tags TEXT DEFAULT '',
    expanded_keywords TEXT DEFAULT '',
    importance REAL DEFAULT 0.5,
    -- ...
    search_vector tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(content, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(expanded_keywords, '')), 'B') ||
        setweight(to_tsvector('english', coalesce(tags, '')), 'C') ||
        setweight(to_tsvector('english', coalesce(category, '')), 'D')
    ) STORED
);

CREATE INDEX idx_memories_search ON memories USING GIN(search_vector);

The search_vector is a generated column that PostgreSQL maintains automatically on every INSERT/UPDATE — no triggers needed. It combines four fields with different weights:

| Weight | Field | Purpose |
| --- | --- | --- |
| A (highest) | content | The actual memory text — most important for matching |
| B | expanded_keywords | Semantic search terms generated by Claude at store time |
| C | tags | User-provided comma-separated tags |
| D (lowest) | category | Category name (facts, preferences, etc.) |

A GIN index on search_vector enables fast lookup without sequential scans.

Query execution:

SELECT id, content, category, tags, importance, is_sensitive,
       ts_rank(search_vector, query) AS rank,
       created_at, updated_at
FROM memories, plainto_tsquery('english', $2) query
WHERE user_id = $1
  AND deleted_at IS NULL
  AND (search_vector @@ query OR $2 = '')
ORDER BY importance DESC, ts_rank(search_vector, query) DESC
LIMIT $3

plainto_tsquery processes the concatenated query text through PostgreSQL's English text search dictionary — it handles stemming (e.g. "running" → "run"), stop word removal, and normalization. The @@ operator matches the resulting tsquery against the stored tsvector.

ts_rank scores each match considering the weight of the matched field — a match in content (weight A) scores higher than a match in tags (weight C).

Sort modes:

  • sort_by="importance" (default) — ORDER BY importance DESC, ts_rank(search_vector, query) DESC
  • sort_by="relevance" — ORDER BY ts_rank(search_vector, query) DESC
  • sort_by="recency" — ORDER BY created_at DESC

Semantic Expansion at Store Time

When Claude stores a memory via memory_store, it generates an expanded_keywords field — at least 5 space-separated semantically related terms. For example, storing "User prefers Svelte for frontends" might produce:

expanded_keywords: "svelte frontend framework ui web sveltekit javascript component reactive"

These keywords are indexed alongside the memory content. When a future memory_recall searches for "what framework should I use for the dashboard?", the expanded keywords create additional matching surface that pure content matching would miss. This is a form of query expansion performed at write time by the LLM.
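The effect can be shown with a naive word-overlap stand-in for the real FTS backends (illustrative only; actual matching uses FTS5/tsvector as described above):

```python
# Illustrative only: shows why expanded keywords widen the matching surface.
memory = {
    "content": "User prefers Svelte for frontends",
    "expanded_keywords": "svelte frontend framework ui web sveltekit javascript",
}

def matches(memory: dict, query: str) -> bool:
    """Naive word-overlap stand-in for the FTS backends described above."""
    searchable = f"{memory['content']} {memory['expanded_keywords']}".lower().split()
    return any(word in searchable for word in query.lower().split())

# "framework" appears only in the expanded keywords, not in the content,
# yet the memory is still recalled for a framework-related query.
```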

SQLite vs PostgreSQL Comparison

| Feature | SQLite (FTS5) | PostgreSQL (tsvector) |
| --- | --- | --- |
| Tokenization | ICU/unicode61 tokenizer | English text search dictionary |
| Stemming | No (exact word matching) | Yes (Porter stemmer via english config) |
| Stop words | No | Yes (removes "the", "is", "at", etc.) |
| Ranking | BM25 (term frequency + inverse document frequency) | ts_rank (weighted field matching) |
| Field weighting | No (all FTS5 columns equal) | Yes (A/B/C/D weights per column) |
| Index type | B-tree on FTS content | GIN (Generalized Inverted Index) |
| Query syntax | Quoted OR terms ("word1" OR "word2") | plainto_tsquery with implicit AND |
| Fallback | LIKE scan on failure | None needed (robust parser) |
| Maintenance | Triggers sync content ↔ FTS table | Generated column, auto-maintained |

Operating Modes

| Mode | When | What happens |
| --- | --- | --- |
| SQLite-only | No env vars set | Everything local. Zero config. Works offline. |
| Hybrid | MEMORY_API_KEY set | Local SQLite for reads + background sync to API. Writes queue if the API is down. |
| HTTP-only | MEMORY_API_KEY + MEMORY_SYNC_DISABLE=1 | Direct API calls, no local cache. Legacy mode. |
| Full | Any mode + Vault configured | Above + Vault for encrypting sensitive memories at rest. |
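The mode selection above amounts to a small decision tree; a sketch under the env-var semantics in the table (illustrative, not the actual startup code):

```python
def resolve_mode(env: dict) -> str:
    """Illustrative mode selection from the table above (not the actual code)."""
    if not env.get("MEMORY_API_KEY"):
        return "sqlite-only"            # zero config, fully offline
    if env.get("MEMORY_SYNC_DISABLE") == "1":
        return "http-only"              # legacy: direct API calls, no cache
    return "hybrid"                     # local reads + background sync
```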

Setup

Option 1: Plugin Install (Recommended)

claude plugins install github:ViktorBarzin/claude-memory-mcp

Works immediately in SQLite-only mode. To enable multi-machine sync:

export MEMORY_API_URL="https://your-server.example.com"
export MEMORY_API_KEY="your-api-key"

Option 2: Manual MCP Config

Add to ~/.claude.json under mcpServers:

{
  "mcpServers": {
    "claude_memory": {
      "type": "stdio",
      "command": "python3",
      "args": ["/path/to/claude-memory-mcp/src/claude_memory/mcp_server.py"],
      "env": {
        "PYTHONPATH": "/path/to/claude-memory-mcp/src",
        "MEMORY_API_URL": "https://your-server.example.com",
        "MEMORY_API_KEY": "your-api-key"
      }
    }
  }
}

Omit MEMORY_API_URL and MEMORY_API_KEY for SQLite-only mode.

Verify

You: /remember "I prefer dark mode in all applications"
You: /recall "UI preferences"

Running the API Server

Only needed for multi-machine sync. Skip this if you're using SQLite-only mode.

Docker Compose

cd docker
docker compose up -d

Starts the API server + PostgreSQL. Optionally add Vault:

docker compose --profile vault up -d

Manual

pip install claude-memory-mcp[api]
export DATABASE_URL="postgresql://user:pass@localhost:5432/claude_memory"
export API_KEY="your-secret-key"
alembic upgrade head
uvicorn claude_memory.api.app:app --host 0.0.0.0 --port 8000

Kubernetes

Designed for high availability: 2 replicas with pod anti-affinity, PodDisruptionBudget, and startup probes.

helm install claude-memory deploy/helm/claude-memory \
  --set env.DATABASE_URL="postgresql://user:pass@host:5432/db" \
  --set env.API_KEY="your-key" \
  --set ingress.host="claude-memory.yourdomain.com"

Or raw manifests:

kubectl apply -f deploy/kubernetes/namespace.yaml
kubectl create secret generic claude-memory-secrets \
  -n claude-memory \
  --from-literal=database-url="postgresql://user:pass@host:5432/db" \
  --from-literal=api-key="your-key"
kubectl apply -f deploy/kubernetes/

Multi-User Setup

For team deployments, use API_KEYS with a JSON mapping:

export API_KEYS='{"alice": "key-alice-xxx", "bob": "key-bob-yyy"}'

Each user gets isolated memory storage by default. Users can selectively share memories with each other.
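Resolving the calling user from a presented key is essentially an inverted map lookup; a hypothetical sketch (the real auth.py may differ):

```python
import json

def user_for_key(api_keys_json: str, presented_key: str):
    """Hypothetical lookup: invert the user-to-key map to resolve the caller."""
    by_key = {key: user for user, key in json.loads(api_keys_json).items()}
    return by_key.get(presented_key)  # None means reject as unauthenticated
```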

Memory Sharing

Users can share individual memories or entire tags with other users:

You: "share memory #42 with bob as read-only"
You: "share all memories tagged 'infra' with alice with write access"

Individual shares grant access to a specific memory. Tag shares are live rules — any future memory with that tag is automatically visible to the target user.

Permission levels:

  • read — can see the memory in recall results
  • write — can also update the memory's content, tags, and importance

Only the owner can delete a memory, even if others have write access. Shared memories appear in memory_recall results with shared_by and share_permission fields.
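The permission rules above boil down to two checks; a sketch (illustrative data shapes, not the actual permissions.py):

```python
def can_update(requester: str, memory: dict) -> bool:
    """Owner, or a user the memory is shared with at write level, may update."""
    return requester == memory["owner"] or memory["shares"].get(requester) == "write"

def can_delete(requester: str, memory: dict) -> bool:
    """Only the owner may delete, regardless of shares."""
    return requester == memory["owner"]
```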

Adding a user:

  1. Generate a key: openssl rand -base64 32
  2. Add to API_KEYS JSON
  3. Restart the API server
  4. Share the key with the user

Sensitive Memory & Secrets

Sensitive data is handled through a multi-layer system:

Automatic Detection

credential_detector.py scans all incoming content for secrets using regex patterns with confidence scores (0.5–0.95):

| Confidence | Pattern types |
| --- | --- |
| 0.95 | Private keys (RSA/EC/DSA/OPENSSH), AWS access keys (AKIA/ASIA), GitHub tokens (ghp_/gho_/etc.) |
| 0.85–0.90 | Connection strings (postgres/mysql/mongodb/redis/amqp), Basic auth headers, Bearer tokens |
| 0.75–0.80 | api_key=, password=, token= assignments |
| 0.60–0.65 | Generic secret=/credential= patterns, hex keys |

Detected content is redacted in the main DB (e.g., [REDACTED:private_key]) and the original is stored in Vault or encrypted storage.
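A minimal sketch of this detect-and-redact step (illustrative patterns only; the real credential_detector.py contains many more, and these regexes are assumptions based on the table above):

```python
import re

# Illustrative subset of the pattern table above. AWS access key IDs start
# with AKIA or ASIA followed by 16 uppercase alphanumerics.
PATTERNS = [
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "aws_access_key", 0.95),
    (re.compile(r"(?i)\bpassword\s*=\s*\S+"), "password_assignment", 0.80),
]

def redact(text: str):
    """Replace each detected secret with a [REDACTED:<type>] marker."""
    findings = []
    for pattern, kind, confidence in PATTERNS:
        if pattern.search(text):
            findings.append((kind, confidence))
            text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text, findings
```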

Encryption at Rest

Three tiers, checked in order:

  1. HashiCorp Vault (KV v2) — if VAULT_ADDR + VAULT_TOKEN set. Secrets stored at claude-memory/{user_id}/mem-{id}. Supports Kubernetes Service Account auto-authentication.
  2. AES-256-GCM — if MEMORY_ENCRYPTION_KEY set. Key can be hex-encoded 32 bytes or a passphrase (derived via SHA-256). Stored as nonce (12 bytes) + ciphertext + GCM tag (16 bytes) in the encrypted_content column.
  3. Plaintext fallback — content stored as-is (still redacted in main column if detected as sensitive).

Retrieve the original via secret_get(id=<memory_id>).
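The tier-2 key derivation and blob layout can be sketched with stdlib only (the actual GCM encryption itself is omitted here; the function names are assumptions, but the SHA-256 derivation and the nonce/ciphertext/tag layout follow the description above):

```python
import hashlib

NONCE_LEN, TAG_LEN = 12, 16  # layout from the AES-256-GCM tier above

def derive_key(passphrase: str) -> bytes:
    """Passphrase to 32-byte AES-256 key via SHA-256, as described above."""
    return hashlib.sha256(passphrase.encode()).digest()

def split_blob(blob: bytes):
    """Split a stored encrypted_content blob into (nonce, ciphertext, tag)."""
    return blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]
```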

Plugin Hooks

| Hook | Event | What it does |
| --- | --- | --- |
| pre-compact-backup.sh | PreCompact | Saves top 20 memories to a marker file before context compaction |
| post-compact-recovery.sh | UserPromptSubmit | Re-injects saved memories after compaction (one-time, then deletes marker) |
| user-prompt-recall.py | UserPromptSubmit | Tells Claude to check memory_recall before responding to each message |
| auto-learn.py | Stop (async) | Runs haiku-as-judge on the last exchange to extract durable facts worth storing |
| auto-allow-memory-tools.py | PermissionRequest | Auto-approves memory_store, memory_recall, memory_list, memory_delete, secret_get |

Auto-Learn Details

The auto-learn hook runs asynchronously after each Claude response. It operates in two modes:

  • Single-turn (every turn) — analyzes just the last user/assistant exchange. Cheap and fast. Extracts corrections, preferences, decisions, and facts.
  • Deep multi-turn (every 5 turns) — analyzes the last 5 exchanges as a window. Also extracts debugging insights, workarounds, and operational knowledge.

Judge fallback chain: Claude CLI (haiku model) → local Ollama (qwen2.5:3b/llama3.2:3b/gemma2:2b/phi3:mini) → heuristic pattern matching (keyword-based extraction from user messages).

Extracted events are deduplicated via SHA-256 content hashing. Each event is stored to the MCP memory database.
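The deduplication step can be sketched as follows (the content normalization before hashing is an assumption; the source only states SHA-256 content hashing):

```python
import hashlib

def event_fingerprint(content: str) -> str:
    """SHA-256 over normalized content (lowercased, whitespace collapsed)."""
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def dedupe(events: list) -> list:
    """Keep the first occurrence of each distinct fingerprint."""
    seen, unique = set(), []
    for event in events:
        fingerprint = event_fingerprint(event)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(event)
    return unique
```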

Debug

# See what hooks are doing
export DEBUG_CLAUDE_MEMORY_HOOKS=1

# Disable auto-approve to see permission prompts
export DISABLE_CLAUDE_MEMORY_AUTO_APPROVE=1

Background Sync Engine

In hybrid mode, a SyncEngine runs in a daemon thread with its own SQLite connection (thread-safe via threading.Lock).

Push cycle: Reads the pending_ops queue (SQLite table) and sends each operation to the API. On success, removes the operation from the queue and updates the local server_id mapping. On failure, the operation stays queued for the next cycle.

Pull cycle: Calls GET /api/memories/sync?since=<last_sync_ts> for incremental changes. Server wins on conflicts — remote updates overwrite local rows (matched by server_id). Soft-deleted rows on the server are hard-deleted locally.

Write path: Every memory_store first writes to local SQLite (instant), then attempts an immediate sync to the API. If the immediate sync fails, the write is enqueued in pending_ops for the next background cycle.

Sync interval: Configurable via MEMORY_SYNC_INTERVAL (default: 60 seconds).
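The write path above can be sketched as (illustrative, with lists standing in for SQLite tables; not the actual SyncEngine code):

```python
def store_memory(local_db: list, pending_ops: list, content: str, push_fn):
    """Write-path sketch: the local write always succeeds instantly;
    a failed immediate sync enqueues the op for the background cycle."""
    local_db.append(content)          # instant local SQLite write
    try:
        push_fn(content)              # immediate sync attempt to the API
    except ConnectionError:
        pending_ops.append({"op": "store", "content": content, "retry_count": 0})
```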

API Reference

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Health check (unauthenticated) |
| /api/auth-check | GET | Validate API key without side effects |
| /api/memories | POST | Store a memory |
| /api/memories | GET | List memories (?category=facts&limit=20) |
| /api/memories/recall | POST | Search memories by context and expanded query |
| /api/memories/{id} | PUT | Update a memory (owner or write-shared user) |
| /api/memories/{id} | DELETE | Soft-delete a memory (owner only) |
| /api/memories/{id}/share | POST | Share a memory with a user |
| /api/memories/{id}/share/{user_id} | DELETE | Revoke sharing from a user |
| /api/memories/share-tag | POST | Share all memories with a tag |
| /api/memories/share-tag | DELETE | Revoke tag-based sharing |
| /api/memories/shared-with-me | GET | List memories shared with current user |
| /api/memories/my-shares | GET | List sharing rules created by current user |
| /api/memories/sync | GET | Incremental sync (?since=ISO-timestamp) |
| /api/memories/{id}/secret | POST | Get decrypted sensitive memory content |
| /api/memories/migrate-secrets | POST | Re-encrypt existing secrets with current Vault config |
| /api/memories/import | POST | Bulk import memories (JSON array) |
All endpoints except /health require Authorization: Bearer <api-key>.

Environment Variables

MCP Server (client-side)

| Variable | Description | Default |
| --- | --- | --- |
| MEMORY_API_URL | API server URL | http://localhost:8080 |
| MEMORY_API_KEY | API key (enables hybrid mode) | None |
| MEMORY_HOME | Local storage directory | ~/.claude/claude-memory |
| MEMORY_DB | SQLite database path override | $MEMORY_HOME/memory/memory.db |
| MEMORY_SYNC_INTERVAL | Background sync interval in seconds | 60 |
| MEMORY_SYNC_DISABLE | Set to 1 for HTTP-only mode | None |

Aliases CLAUDE_MEMORY_API_URL and CLAUDE_MEMORY_API_KEY are also supported.

API Server

| Variable | Description | Default |
| --- | --- | --- |
| DATABASE_URL | PostgreSQL connection string | Required |
| API_KEY | Single-user API key | None |
| API_KEYS | Multi-user JSON map {"user": "key"} | None |
| VAULT_ADDR | Vault server address | None |
| VAULT_TOKEN | Vault authentication token | None |
| VAULT_ROLE | Kubernetes auth role name | claude-memory |
| MEMORY_ENCRYPTION_KEY | AES-256 key for non-Vault encryption | None |

Database Migrations

PostgreSQL (API Server)

Migrations run automatically on API server startup. To run manually:

export DATABASE_URL="postgresql://user:pass@localhost:5432/claude_memory"
alembic upgrade head

Four migrations:

  1. 001 — Creates memories table with generated tsvector column and GIN index
  2. 002 — Adds user_id (multi-user), is_sensitive, vault_path, encrypted_content
  3. 003 — Adds deleted_at (soft delete) and updated_at index for incremental sync
  4. 004 — Adds memory_shares and tag_shares tables for multi-user sharing

All migrations are idempotent (check column/table existence before altering).

SQLite (MCP Client)

SQLite schema is versioned via PRAGMA user_version and migrated automatically on startup. Current version: 2.

| Version | Migration |
| --- | --- |
| 1 | Add server_id column to memories table |
| 2 | Add retry_count column to pending_ops table |
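The user_version mechanism can be sketched as below (illustrative; the DDL strings mirror the table above but the surrounding code is an assumption):

```python
import sqlite3

MIGRATIONS = {  # version -> DDL, mirroring the migration table above
    1: "ALTER TABLE memories ADD COLUMN server_id INTEGER",
    2: "ALTER TABLE pending_ops ADD COLUMN retry_count INTEGER DEFAULT 0",
}

def migrate(conn: sqlite3.Connection):
    """Apply every migration above the current PRAGMA user_version, in order."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in sorted(MIGRATIONS):
        if version > current:
            conn.execute(MIGRATIONS[version])
            conn.execute(f"PRAGMA user_version = {version}")  # record progress
```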

Development

Prerequisites

  • Python 3.11+
  • For API server tests: httpx, pytest-asyncio

Quick Start

git clone https://github.com/ViktorBarzin/claude-memory-mcp.git
cd claude-memory-mcp
python -m venv .venv
source .venv/bin/activate
pip install -e ".[api,dev]"

Running Tests

# All tests
pytest tests/ -v

# Individual test suites
pytest tests/test_sync.py -v          # SyncEngine (client-side sync resilience)
pytest tests/test_mcp_server.py -v    # MCP server (SQLite, tools, protocol)
pytest tests/test_api.py -v           # API server (FastAPI endpoints)

# Linting
ruff check src/ tests/
mypy src/claude_memory/ --strict

The MCP server itself (mcp_server.py and sync.py) uses stdlib only — no pip install needed on the client side. The [api] extra adds FastAPI, asyncpg, uvicorn, etc. for the server.

Test Structure

| File | Tests | What it covers |
| --- | --- | --- |
| test_sync.py | SyncEngine unit tests | Push/pull, auth failure handling, retry caps, full resync, orphan reconciliation, decoupled push/pull, diagnostics |
| test_mcp_server.py | MCP server unit tests | SQLite CRUD, FTS search, tool dispatch, MCP protocol, memory_count, schema migration |
| test_api.py | API server integration tests | All REST endpoints, auth, user isolation, soft delete, sync, secrets, import |
| test_auth.py | Auth module tests | Single/multi-user auth, key mapping |
| test_credential_detector.py | Credential detection | Pattern matching for secrets |
| test_crypto.py | Encryption tests | AES-256 encrypt/decrypt |
| test_vault_client.py | Vault integration | Secret storage/retrieval |

Project Structure

claude-memory-mcp/
├── src/claude_memory/
│   ├── mcp_server.py          # MCP server entry point (stdio NDJSON)
│   ├── sync.py                # Background sync engine (SQLite ↔ API)
│   ├── credential_detector.py # Sensitive content detection
│   └── api/
│       ├── app.py             # FastAPI application
│       ├── auth.py            # API key authentication
│       ├── database.py        # asyncpg connection pool
│       ├── models.py          # Pydantic models
│       ├── permissions.py     # Sharing permission checks
│       └── vault_service.py   # HashiCorp Vault integration
├── tests/                     # pytest test suite
├── hooks/                     # Claude Code hooks (auto-recall, auto-learn, etc.)
├── docker/                    # Docker Compose for API server + PostgreSQL
├── deploy/                    # Kubernetes manifests and Helm chart
└── pyproject.toml             # Package config (hatchling)

Key Design Decisions

  • stdlib-only MCP server: The MCP server (mcp_server.py, sync.py) uses only Python stdlib — no pip install required for the client side. This ensures it works in any Claude Code environment without dependency management.
  • NDJSON transport: Claude Code uses NDJSON (one JSON per line) for stdio MCP, not Content-Length framing.
  • Non-blocking startup: MCP server startup must complete in ~15s or Claude Code times out. All network calls are deferred to background threads.
  • Suppress stderr: Any stderr output during MCP startup causes Claude Code to reject the server.
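The NDJSON transport point above can be sketched as a minimal serve loop (illustrative; the real mcp_server.py handles the full MCP protocol, and the function name here is an assumption):

```python
import io
import json

def serve_ndjson(stdin, stdout, handle):
    """Minimal NDJSON loop: one JSON request per line, one JSON response
    per line. No Content-Length framing, per the design note above."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle(json.loads(line))
        stdout.write(json.dumps(response) + "\n")  # newline-delimited output
```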

License

Apache License 2.0. See LICENSE for details.