Commit graph

4 commits

Author SHA1 Message Date
Viktor Barzin
5151bbe0d5 fix(mcp): serialize local SQLite writes to end "database is locked" under concurrent stores
Some checks failed
Build and Push / lint-and-test (push) Has been cancelled
Build and Push / build (push) Has been cancelled
Build and Push / deploy (push) Has been cancelled
Build and Push / notify-failure (push) Has been cancelled
Under heavy concurrent memory_store (many subagents/sessions writing close
together) the local SQLite layer raced on the single SQLite writer and surfaced
sqlite3.OperationalError: database is locked, which made memory tools slow and
eventually dropped whole sessions. Two root causes:

  - The MCP server (mcp_server.py) and the background SyncEngine (sync.py) each
    opened a SEPARATE connection to the same SQLite file. WAL allows one writer;
    when the sync writer held the lock across a resync, a concurrent store blew
    past busy_timeout and raised "database is locked".
  - mcp_server's connection was opened WITHOUT check_same_thread=False, so the
    moment two requests were handled on different threads every local store/recall
    raised ProgrammingError "SQLite objects created in a thread...".

Fix: a single process-wide serialized writer.

  - New LocalStore (local_store.py) owns ONE connection (check_same_thread=False)
    guarded by ONE re-entrant lock, keeps WAL, and wraps writes in
    transaction()/write() with bounded exponential-backoff retry on the rare
    residual lock (e.g. another OS process) instead of failing the call.
  - MemoryServer builds the LocalStore and SHARES it with the SyncEngine, so the
    sync writer no longer opens a second connection — the two-connections race is
    eliminated. All server reads/writes go through the shared lock; stores stay
    snappy (enqueue-local + async sync).
  - Bound the one genuinely slow path (remote semantic memory_recall) with an
    explicit RECALL_TIMEOUT and, on timeout/unreachable backend, return a clear
    "working / retry" signal instead of hanging silently or crashing. When a
    client supplies _meta.progressToken, emit one notifications/progress so the
    call shows life.

Ships to users via the plugin (mcp/memory-mcp.json runs src/.../mcp_server.py);
no server-side/API change needed. TDD: added concurrency tests (many threads +
sync writer on one file) and recall progress/bounding tests; full gate green
(ruff + mypy strict + 185 pytest).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 06:06:09 +00:00
Viktor Barzin
e47efee6b6
resilient memory sync: decouple push/pull, startup full resync, auth failure handling
- Decouple push and pull in _sync_once() so pull always runs even if push fails
- Add startup full resync to catch drift from other agents and schema changes
- Add periodic full resync every ~10 minutes for continuous drift correction
- Add auth failure detection (401/403) with graceful SQLite-only degradation
- Add /api/auth-check endpoint for lightweight key validation
- Add retry cap (5 attempts) on pending ops to prevent infinite queue buildup
- Add orphan reconciliation: push local-only records with content dedup
- Add memory_count MCP tool for sync diagnostics
- Add version-based SQLite schema migration (PRAGMA user_version)
- Fix API key in ~/.claude.json to match server
- Update README with sync resilience docs, test structure, project layout
- Add 30 new tests covering all new behaviors (155 total, all passing)
2026-03-16 18:37:59 +00:00
Viktor Barzin
eba5cf6a82
fix: resolve ruff lint errors (unused imports and variables) 2026-03-14 10:01:41 +00:00
Viktor Barzin
0ed5e1e016
feat: standalone claude-memory-mcp with multi-user support and Vault integration
Extracted from private infra repo into standalone open-source project.

Three operating modes:
- Local: SQLite + FTS5 (zero dependencies)
- Server: PostgreSQL via HTTP API with multi-user auth
- Full: PostgreSQL + HashiCorp Vault for secret management

Features:
- MCP stdio server with 5 tools (store/recall/list/delete/secret_get)
- FastAPI HTTP API with multi-user Bearer token auth (API_KEYS JSON map)
- Regex-based credential detection with auto-redaction
- AES-256-GCM encryption fallback for non-Vault deployments
- Vault KV v2 client (stdlib urllib, K8s SA auto-auth)
- Per-user data isolation (all queries scoped by user_id)
- Secret migration endpoint for existing plain-text credentials
- Backward-compatible env var aliases (CLAUDE_MEMORY_API_URL)

Infrastructure:
- Docker + docker-compose (API + PostgreSQL + optional Vault)
- Woodpecker CI (test → build → push → kubectl deploy)
- GitHub Actions CI (Python 3.11/3.12/3.13) + Release (GHCR + PyPI)
- Helm chart + raw Kubernetes manifests

96 tests passing across 6 test files.
2026-03-14 09:42:05 +00:00