claude-memory-mcp

Author	SHA1	Message	Date
Viktor Barzin	5151bbe0d5	fix(mcp): serialize local SQLite writes to end "database is locked" under concurrent stores Under heavy concurrent memory_store (many subagents/sessions writing close together) the local SQLite layer raced on the single SQLite writer and surfaced sqlite3.OperationalError: database is locked, which made memory tools slow and eventually dropped whole sessions. Two root causes: - The MCP server (mcp_server.py) and the background SyncEngine (sync.py) each opened a SEPARATE connection to the same SQLite file. WAL allows one writer; when the sync writer held the lock across a resync, a concurrent store blew past busy_timeout and raised "database is locked". - mcp_server's connection was opened WITHOUT check_same_thread=False, so the moment two requests were handled on different threads every local store/recall raised ProgrammingError "SQLite objects created in a thread...". Fix: a single process-wide serialized writer. - New LocalStore (local_store.py) owns ONE connection (check_same_thread=False) guarded by ONE re-entrant lock, keeps WAL, and wraps writes in transaction()/write() with bounded exponential-backoff retry on the rare residual lock (e.g. another OS process) instead of failing the call. - MemoryServer builds the LocalStore and SHARES it with the SyncEngine, so the sync writer no longer opens a second connection; the two-connections race is eliminated. All server reads/writes go through the shared lock; stores stay snappy (enqueue-local + async sync). - Bound the one genuinely slow path (remote semantic memory_recall) with an explicit RECALL_TIMEOUT and, on timeout/unreachable backend, return a clear "working / retry" signal instead of hanging silently or crashing. When a client supplies _meta.progressToken, emit one notifications/progress so the call shows life. Ships to users via the plugin (mcp/memory-mcp.json runs src/.../mcp_server.py); no server-side/API change needed. TDD: added concurrency tests (many threads + sync writer on one file) and recall progress/bounding tests; full gate green (ruff + mypy strict + 185 pytest). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 06:06:09 +00:00
Viktor Barzin	e47efee6b6	resilient memory sync: decouple push/pull, startup full resync, auth failure handling - Decouple push and pull in _sync_once() so pull always runs even if push fails - Add startup full resync to catch drift from other agents and schema changes - Add periodic full resync every ~10 minutes for continuous drift correction - Add auth failure detection (401/403) with graceful SQLite-only degradation - Add /api/auth-check endpoint for lightweight key validation - Add retry cap (5 attempts) on pending ops to prevent infinite queue buildup - Add orphan reconciliation: push local-only records with content dedup - Add memory_count MCP tool for sync diagnostics - Add version-based SQLite schema migration (PRAGMA user_version) - Fix API key in ~/.claude.json to match server - Update README with sync resilience docs, test structure, project layout - Add 30 new tests covering all new behaviors (155 total, all passing)	2026-03-16 18:37:59 +00:00
Viktor Barzin	d370855abf	fix mypy across all source files, remove || true from CI - Add type annotations to all FastAPI endpoints in api/app.py - Fix bare list/dict generics in sync.py and app.py - Fix no-any-return in vault_client.py and sync.py - Remove mypy || true from GitHub Actions CI; mypy is now clean	2026-03-15 23:36:34 +00:00
Viktor Barzin	e22a8f743a	fix: 404 retry loop on delete and URL-encode since param Three fixes for high 4xx alert rate: 1. Catch urllib.error.HTTPError (not just RuntimeError) for 404 on DELETE in pending_ops — was causing infinite retry loop for already-deleted memories 2. try_sync_delete: treat 404 as success instead of enqueuing a retry that will also 404 forever 3. URL-encode the `since` query param to prevent `+` in timezone offset being decoded to a space (the asyncpg-string-timestamp pattern applied to the sync client)	2026-03-15 15:28:19 +00:00
Viktor Barzin	a52afe050d	Fix MCP server startup for Claude Code compatibility - Suppress stderr output (Claude Code rejects servers with stderr) - Make SyncEngine.start() non-blocking (blocking sync caused 15s+ timeout) - Skip Content-Length header lines gracefully (NDJSON transport) - Silently ignore malformed JSON lines instead of sending error responses	2026-03-14 16:05:35 +00:00
Viktor Barzin	0a4aed9e1b	fix: resolve ruff lint errors (unused imports and variables)	2026-03-14 13:04:40 +00:00
Viktor Barzin	cd80a67dfa	feat: add local SQLite cache with background sync and HA deployment - Add SyncEngine for background sync between local SQLite cache and remote API with pending_ops queue for offline resilience - Refactor MCP server to support three modes: SQLite-only, hybrid (local cache + sync, new default), and HTTP-only (legacy) - Add GET /api/memories/sync endpoint for incremental sync - Change DELETE to soft delete (set deleted_at) for sync support - Add deleted_at IS NULL filters to all read queries - Scale API deployment to 2 replicas with pod anti-affinity, PDB, and startup probe for high availability - Add migration 003 for deleted_at column and updated_at index - Add comprehensive tests for sync engine and API sync endpoint	2026-03-14 12:42:39 +00:00
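
The third fix in commit e22a8f743a (URL-encoding the `since` query param) can be illustrated with a small sketch. The base URL and the `decoded_since` helper are hypothetical and not taken from the repository; only the `/api/memories/sync` path and the `since` parameter name appear in the log. It shows how an unencoded `+` in a timezone offset comes back as a space after standard query-string decoding, and how percent-encoding avoids that.

```python
from urllib.parse import parse_qs, quote, urlsplit

# ISO 8601 timestamp whose timezone offset contains a "+".
since = "2026-03-15T15:28:19+00:00"

# Hypothetical host, used only for illustration.
base = "https://memory.example.invalid/api/memories/sync"

# Unencoded: the "+" travels as-is and is read as a space by form-style decoders.
raw_url = f"{base}?since={since}"

# Encoded: quote(..., safe="") turns "+" into %2B (and ":" into %3A).
encoded_url = f"{base}?since={quote(since, safe='')}"


def decoded_since(url: str) -> str:
    """Return the `since` value as a form-style query decoder would see it."""
    return parse_qs(urlsplit(url).query)["since"][0]


print(decoded_since(raw_url))      # offset "+" has become a space
print(decoded_since(encoded_url))  # round-trips to the original timestamp
```

Whether the real sync client uses `quote` or `urlencode` is not stated in the log; either would produce the `%2B` needed here.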

7 commits
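
The serialized-writer design described in commit 5151bbe0d5 might look roughly like the sketch below. This is not the repository's `local_store.py`: the class name `SerializedStore`, its method signatures, and the retry parameters are assumptions made for illustration. It shows the three ingredients the commit message names: one shared connection opened with `check_same_thread=False`, one re-entrant lock around every access, and a bounded exponential-backoff retry when SQLite still reports a lock.

```python
import random
import sqlite3
import threading
import time
from contextlib import contextmanager


class SerializedStore:
    """Hypothetical stand-in for the LocalStore named in the commit message."""

    def __init__(self, path, max_retries=5, base_delay=0.05):
        # One shared connection. check_same_thread=False allows use from any
        # thread, so the lock below is what keeps access safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._lock = threading.RLock()  # re-entrant: nested calls do not deadlock
        self._max_retries = max_retries
        self._base_delay = base_delay

    @contextmanager
    def transaction(self):
        # Hold the lock for the whole transaction; commit on success,
        # roll back on any error.
        with self._lock:
            try:
                yield self._conn
                self._conn.commit()
            except BaseException:
                self._conn.rollback()
                raise

    def write(self, sql, params=()):
        # Retry only on "locked" errors (e.g. another OS process holding the
        # file), with exponentially growing delays and a hard attempt cap.
        for attempt in range(self._max_retries + 1):
            try:
                with self.transaction() as conn:
                    conn.execute(sql, params)
                return
            except sqlite3.OperationalError as exc:
                if "locked" not in str(exc) or attempt == self._max_retries:
                    raise
                delay = self._base_delay * (2 ** attempt)
                time.sleep(delay + random.uniform(0, 0.01))

    def read(self, sql, params=()):
        with self._lock:
            return self._conn.execute(sql, params).fetchall()
```

In this sketch a server and a background sync thread would both receive the same `SerializedStore` instance rather than each calling `sqlite3.connect`, which is the sharing arrangement the commit describes between MemoryServer and SyncEngine.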