memory_recall was returning almost the entire store instead of a small
set of relevant matches. Two compounding causes, both fixed here:
1. Default limit was 10000 (commit d03a77a "effectively unlimited").
   recall_memories/MemoryRecall and the memory_recall MCP tool now
   default to 30 (ceiling stays 10000 for callers that opt in).
2. The OR-broadening fallback (fires when the precise AND-match is
   sparse) ordered by the importance hybrid and padded up to `limit`,
   so with limit=10000 it flooded results with high-importance but
   irrelevant memories. It now orders OR-matches by ts_rank(relevance)
   DESC and applies a minimum-rank floor (OR_BROADEN_MIN_RANK=0.01) to
   drop rows that merely contain a query word incidentally.
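A minimal in-memory sketch of the bounded broadening described in point 2, not the actual query. The constants come from this message; `Row`, `broaden`, the inclusive `>=` floor comparison, and padding whenever fewer than `limit` AND-matches exist are assumptions made for illustration (the real sparsity trigger is not stated here, and the real ranking is done by PostgreSQL).

```python
from dataclasses import dataclass

# Values taken from the commit text; everything else is hypothetical.
DEFAULT_LIMIT = 30
MAX_LIMIT = 10000
OR_BROADEN_MIN_RANK = 0.01


@dataclass
class Row:
    id: int
    rank: float  # stand-in for the ts_rank value computed by PostgreSQL


def broaden(and_rows, or_rows, limit=DEFAULT_LIMIT):
    """Pad AND-matches with relevance-ordered, rank-floored OR-matches."""
    limit = max(1, min(limit, MAX_LIMIT))  # clamp to the opt-in ceiling
    results = list(and_rows[:limit])
    seen = {r.id for r in results}
    # Keep only OR-matches at or above the floor, most relevant first.
    candidates = sorted(
        (r for r in or_rows
         if r.rank >= OR_BROADEN_MIN_RANK and r.id not in seen),
        key=lambda r: r.rank,
        reverse=True,
    )
    results.extend(candidates[: limit - len(results)])
    return results
```

With this shape, a row that only mentions a query word in passing (rank below the floor) never pads the result, regardless of how important it is.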
Tests: add test_recall_default_limit_is_capped (asserts 30 passed to
fetch) and test_recall_or_broadening_is_relevance_bounded (asserts the
OR query is relevance-ordered + rank-floored). Full suite 176 green,
ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
When content exceeds 500 chars, it's automatically split into multiple
memories on paragraph boundaries. Each chunk gets the same category,
tags (with part-N-of-M suffix), keywords, and importance. Removes the
old 800 char hard limit from the Pydantic model.
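A rough sketch of the splitting behaviour, assuming paragraphs are separated by blank lines and packed greedily up to the threshold. `split_content` and `chunk_tags` are hypothetical names, and appending one extra `part-N-of-M` tag (rather than suffixing each existing tag) is an assumption about what "suffix" means here.

```python
CHUNK_THRESHOLD = 500  # from the commit text


def split_content(content, max_chars=CHUNK_THRESHOLD):
    """Split on blank-line paragraph boundaries, packing paragraphs greedily."""
    if len(content) <= max_chars:
        return [content]
    chunks, current = [], ""
    for para in (p.strip() for p in content.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chunk_tags(tags, n, m):
    """Return the shared tags plus a part-N-of-M tag when content was split."""
    return list(tags) + ([f"part-{n}-of-{m}"] if m > 1 else [])
```

In this sketch a single paragraph longer than the threshold is kept whole as one oversized chunk, since there is no paragraph boundary to split on.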
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With 375 memories and a 1M context window, low limits just hide results.
Agents can still pass a smaller limit when they want fewer results.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New migration 004: memory_shares and tag_shares tables with indexes
- Share individual memories or entire tags with other users (read/write)
- Tag shares are live rules: future memories with a shared tag become visible automatically
- Recall query merges own + shared memories via UNION, returns shared_by field
- Owner-only delete enforcement (403 for non-owners, even with write access)
- PUT /api/memories/{id} update endpoint with permission checks
- 5 new MCP SSE tools: memory_share, memory_unshare, memory_share_tag,
  memory_unshare_tag, memory_update
- Permission helper checks ownership, individual shares, and tag shares
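One way the permission check could be laid out, as a sketch only. The dict-based `memory_shares` and `tag_shares` stand in for the tables added by migration 004, and `Memory`, `permission`, and `can_delete` are hypothetical names, not the helper's real signature.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: int
    owner: str
    tags: set = field(default_factory=set)


def permission(user, memory, memory_shares, tag_shares):
    """Return 'owner', 'write', 'read', or None for this user and memory.

    memory_shares: {(memory_id, user): 'read' | 'write'}
    tag_shares:    {(owner, tag, user): 'read' | 'write'}
    """
    if memory.owner == user:
        return "owner"
    levels = []
    if (memory.id, user) in memory_shares:
        levels.append(memory_shares[(memory.id, user)])
    # Tag shares are rules, so they are evaluated against the memory's tags.
    for tag in memory.tags:
        level = tag_shares.get((memory.owner, tag, user))
        if level:
            levels.append(level)
    if "write" in levels:
        return "write"
    return "read" if levels else None


def can_delete(user, memory, memory_shares, tag_shares):
    # Only the owner may delete; the API answers 403 for everyone else.
    return permission(user, memory, memory_shares, tag_shares) == "owner"
```

Because tag shares are looked up at check time rather than copied per memory, a memory tagged later is covered without any extra write.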
- Add MAX_MEMORY_CHARS=800 Pydantic validation on MemoryStore.content
- Update auto-learn judge prompts: "ONE topic per event", 100-500 chars,
  include the WHY not just the WHAT
- Split 9 mega-memories (800-2400ch) into 70 focused memories (100-500ch)
  via migration script
Before: median 331ch, 11 memories >800ch, recall wastes 84% of returned tokens
After: median 213ch, 2 memories >800ch (dense single-topic refs), recall returns
only the relevant knowledge
Trade-off research: PostgreSQL ts_rank does not normalize for document
length by default, so a 2400-char memory with 12 topics gets recalled for
any of its 12 topics but wastes context with the other 11. Focused
memories (100-500ch) give higher signal-to-noise per recall.
- Add SyncEngine for background sync between local SQLite cache and
  remote API with pending_ops queue for offline resilience
- Refactor MCP server to support three modes: SQLite-only, hybrid
  (local cache + sync, new default), and HTTP-only (legacy)
- Add GET /api/memories/sync endpoint for incremental sync
- Change DELETE to soft delete (set deleted_at) for sync support
- Add deleted_at IS NULL filters to all read queries
- Scale API deployment to 2 replicas with pod anti-affinity, PDB,
  and startup probe for high availability
- Add migration 003 for deleted_at column and updated_at index
- Add comprehensive tests for sync engine and API sync endpoint
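A small SQLite sketch of why soft delete matters for incremental sync. The table layout, the integer timestamps, the function names, and the choice to bump updated_at on delete are all assumptions for illustration; the real schema lives in migration 003.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    " id INTEGER PRIMARY KEY, content TEXT,"
    " updated_at INTEGER NOT NULL, deleted_at INTEGER)"
)
conn.execute("CREATE INDEX idx_memories_updated_at ON memories (updated_at)")


def store(mem_id, content, now):
    conn.execute(
        "INSERT INTO memories (id, content, updated_at) VALUES (?, ?, ?)",
        (mem_id, content, now),
    )


def soft_delete(mem_id, now):
    # Mark the row instead of removing it so peers can learn about the delete.
    conn.execute(
        "UPDATE memories SET deleted_at = ?, updated_at = ? WHERE id = ?",
        (now, now, mem_id),
    )


def read_all():
    # Normal reads hide soft-deleted rows.
    return conn.execute(
        "SELECT id, content FROM memories WHERE deleted_at IS NULL ORDER BY id"
    ).fetchall()


def changes_since(cursor_ts):
    # Incremental sync includes tombstones so clients can drop local copies.
    return conn.execute(
        "SELECT id, deleted_at IS NOT NULL FROM memories"
        " WHERE updated_at > ? ORDER BY updated_at, id",
        (cursor_ts,),
    ).fetchall()
```

A hard DELETE would leave `changes_since` with nothing to report, so a client that cached the row would keep it indefinitely.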