• v0.1.1 5151bbe0d5

    fix(mcp): serialize local SQLite writes to end "database is locked" under concurrent stores

    Ghost released this 2026-06-19 06:06:09 +00:00 | 0 commits to master since this release

    Under heavy concurrent memory_store load (many subagents/sessions writing
    close together), the local SQLite layer contended for SQLite's single writer
    lock and surfaced sqlite3.OperationalError: database is locked. Memory tools
    slowed down and whole sessions were eventually dropped. Two root causes:

    • The MCP server (mcp_server.py) and the background SyncEngine (sync.py) each
      opened a SEPARATE connection to the same SQLite file. WAL allows only one
      writer at a time; when the sync writer held the write lock across a resync,
      a concurrent store waited longer than busy_timeout and raised
      "database is locked".
    • mcp_server's connection was opened WITHOUT check_same_thread=False, so as
      soon as two requests were handled on different threads, every local
      store/recall raised ProgrammingError "SQLite objects created in a thread...".
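
    The second root cause can be reproduced in isolation. The sketch below is
    not the project's code: the memories table and the use_from_other_thread
    helper are made up for illustration. It only shows how Python's sqlite3
    module treats a connection that is touched from a thread other than the one
    that created it.

```python
import sqlite3
import threading


def use_from_other_thread(conn: sqlite3.Connection) -> str:
    """Run one INSERT on a worker thread and report what happened."""
    outcome: list[str] = []

    def worker() -> None:
        try:
            conn.execute("INSERT INTO memories (body) VALUES ('x')")
            outcome.append("ok")
        except sqlite3.ProgrammingError as exc:
            outcome.append(f"ProgrammingError: {exc}")

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome[0]


# Default connection: usable only on the thread that created it.
strict = sqlite3.connect(":memory:")
strict.execute("CREATE TABLE memories (body TEXT)")
strict_result = use_from_other_thread(strict)

# check_same_thread=False removes that check (hypothetical table, same query).
shared = sqlite3.connect(":memory:", check_same_thread=False)
shared.execute("CREATE TABLE memories (body TEXT)")
shared_result = use_from_other_thread(shared)

strict.close()
shared.close()
```

    Note that check_same_thread=False only disables the check. Access to the
    shared connection still has to be serialized by the caller, which is what
    the lock in the fix below is for.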

    Fix: a single process-wide serialized writer.

    • New LocalStore (local_store.py) owns ONE connection (check_same_thread=False)
      guarded by ONE re-entrant lock, keeps WAL, and wraps writes in
      transaction()/write() with bounded exponential-backoff retry on the rare
      residual lock (e.g. another OS process) instead of failing the call.
    • MemoryServer builds the LocalStore and SHARES it with the SyncEngine, so the
      sync writer no longer opens a second connection and the two-connection race
      is eliminated. All server reads/writes go through the shared lock; stores
      stay fast (local enqueue plus asynchronous sync).
    • Bound the one genuinely slow path (remote semantic memory_recall) with an
      explicit RECALL_TIMEOUT and, on timeout or an unreachable backend, return a
      clear "working / retry" signal instead of hanging silently or crashing. When
      a client supplies _meta.progressToken, emit one notifications/progress
      notification so the client can see the call is still alive.

    Ships to users via the plugin (mcp/memory-mcp.json runs src/.../mcp_server.py);
    no server-side/API change needed. TDD: added concurrency tests (many threads +
    sync writer on one file) and recall progress/bounding tests; full gate green
    (ruff + mypy strict + 185 pytest tests).
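
    A concurrency test of the kind described above might be sketched as
    follows. TinyStore is a hypothetical stand-in for LocalStore, not the real
    class: it assumes one shared connection, one re-entrant lock, WAL, and a
    bounded exponential-backoff retry on "database is locked". The table name,
    retry count and delays are illustrative.

```python
import os
import sqlite3
import tempfile
import threading
import time
from contextlib import contextmanager
from typing import Iterator


class TinyStore:
    """Hypothetical stand-in for LocalStore: one connection, one RLock."""

    def __init__(self, path: str, retries: int = 5, base_delay: float = 0.01) -> None:
        self._lock = threading.RLock()
        self._retries = retries
        self._base_delay = base_delay
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute("CREATE TABLE IF NOT EXISTS memories (body TEXT)")
        self._conn.commit()

    @contextmanager
    def transaction(self) -> Iterator[sqlite3.Connection]:
        # Every statement on the shared connection runs under the same lock.
        with self._lock:
            try:
                yield self._conn
                self._conn.commit()
            except BaseException:
                self._conn.rollback()
                raise

    def write(self, sql: str, params: tuple[object, ...] = ()) -> None:
        # Retry only on lock contention, sleeping outside the lock.
        for attempt in range(self._retries + 1):
            try:
                with self.transaction() as conn:
                    conn.execute(sql, params)
                return
            except sqlite3.OperationalError as exc:
                if "locked" not in str(exc) or attempt == self._retries:
                    raise
                time.sleep(min(self._base_delay * 2**attempt, 0.5))

    def count(self) -> int:
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]

    def close(self) -> None:
        with self._lock:
            self._conn.close()


def run_concurrent_stores(threads: int = 8, per_thread: int = 25) -> int:
    """Write from many threads to one file and return the final row count."""
    with tempfile.TemporaryDirectory() as tmp:
        store = TinyStore(os.path.join(tmp, "memory.db"))

        def worker(name: str) -> None:
            for i in range(per_thread):
                store.write("INSERT INTO memories (body) VALUES (?)", (f"{name}-{i}",))

        workers = [threading.Thread(target=worker, args=(f"t{n}",)) for n in range(threads)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        total = store.count()
        store.close()
        return total
```

    If any thread hit an unhandled lock or thread-affinity error, its rows
    would be missing, so asserting on the final row count is enough to catch
    both failure modes described in this release.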

    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>