redis: revert 3-node Sentinel HA to single standalone instance [ci skip]

The redis-v2 Sentinel cluster split-brained: redis-v2-0 booted during a network partition, hit the init script's deterministic "pod-0 = bootstrap master" fallback, and became a SECOND master alongside the sentinel-elected redis-v2-2. HAProxy's `expect rstring role:master` matched both and round-robined client connections across the two diverging masters, so Immich enqueued BullMQ jobs on one while its workers blocked-popped on the other -> every queue wedged and new-upload thumbnails 404'd cluster-wide. Third Sentinel-class incident in ~6 weeks (after the 2026-04-19 PM quorum drift and 2026-04-22 flap cascade).

Revert to a single standalone instance: replicas=1; drop Sentinel + HAProxy + init bootstrap configmap + both PDBs; redis container only (+ exporter). maxmemory-policy allkeys-lru -> volatile-lru so one shared instance serves both workload classes correctly: evict only TTL'd cache keys, never TTL-less Immich BullMQ / Celery job keys. redis-master service name/DNS unchanged -> no consumer edits; collapsed onto redis-v2-0's existing dataset (queued jobs preserved).

Applied via tg (Tier 1 / PG-authoritative state); this commit syncs source + docs only, hence [ci skip].

Monitoring: drop RedisReplicationLagHigh + RedisReplicasMissing (no replicas now; the latter would false-fire), RedisMemoryPressure 85%->80% volatile-lru backstop.

Docs: rewrite databases.md Redis section (single-instance design + incident history); add post-mortem 2026-05-30-redis-split-brain.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
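
Illustration of the maxmemory-policy change described above, as a toy model rather than anything from this repo: capacity is counted in keys instead of bytes, LRU is exact (real Redis samples candidates), and the class name `TinyRedis` and the key names are hypothetical. It shows why allkeys-lru can drop a TTL-less job key, while volatile-lru drops only TTL'd cache keys and instead rejects the write once no TTL'd key is left to evict.

```python
from collections import OrderedDict


class TinyRedis:
    """Toy eviction model (hypothetical, not Redis code)."""

    def __init__(self, max_keys, policy):
        assert policy in ("allkeys-lru", "volatile-lru")
        self.max_keys = max_keys
        self.policy = policy
        # key -> has_ttl, ordered from least to most recently used
        self.data = OrderedDict()

    def _pick_victim(self):
        # Walk from least recently used; volatile-lru skips TTL-less keys.
        for key, has_ttl in self.data.items():
            if self.policy == "allkeys-lru" or has_ttl:
                return key
        return None

    def set(self, key, has_ttl):
        """Return True on success, False when the write is rejected."""
        if key in self.data:
            self.data[key] = has_ttl
            self.data.move_to_end(key)
            return True
        while len(self.data) >= self.max_keys:
            victim = self._pick_victim()
            if victim is None:
                return False  # stands in for Redis's OOM write error
            del self.data[victim]
        self.data[key] = has_ttl
        return True


def surviving_keys(policy):
    r = TinyRedis(max_keys=4, policy=policy)
    r.set("bull:thumbs:1", has_ttl=False)   # hypothetical job key
    r.set("celery:task:1", has_ttl=False)   # hypothetical job key
    r.set("cache:a", has_ttl=True)
    r.set("cache:b", has_ttl=True)
    r.set("cache:c", has_ttl=True)          # at capacity: forces one eviction
    return sorted(r.data)


def queue_write_when_full_of_jobs():
    r = TinyRedis(max_keys=2, policy="volatile-lru")
    r.set("job:1", has_ttl=False)
    r.set("job:2", has_ttl=False)
    return r.set("job:3", has_ttl=False)    # nothing evictable
```

The last helper models the failure mode the lowered RedisMemoryPressure threshold is meant to catch early: under volatile-lru, once only TTL-less keys remain, new queue writes fail instead of evicting anything.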
2026-05-30 17:49:43 +00:00 · 2026-05-30 17:49:43 +00:00 · e1ab23193d
commit e1ab23193d
parent 5bcb4525a4
4 changed files with 196 additions and 515 deletions
--- a/stacks/monitoring/modules/monitoring/prometheus_chart_values.tpl
+++ b/stacks/monitoring/modules/monitoring/prometheus_chart_values.tpl
@@ -1676,30 +1676,28 @@ serverFiles:
            labels:
              severity: critical
            annotations:
-              summary: "Redis has no ready replicas"
+              summary: "Redis is down — statefulset redis-v2 has no ready pod"
          - alert: RedisMemoryPressure
-            expr: redis_memory_used_bytes{namespace="redis"} / redis_memory_max_bytes{namespace="redis"} > 0.85
+            # Single instance, volatile-lru (2026-05-30): at maxmemory, TTL'd
+            # (cache) keys are evicted but TTL-less keys (Immich BullMQ + Celery
+            # jobs) are NOT — so once cache headroom is gone, queue writes start
+            # erroring. 80% is the backstop to intervene (bump maxmemory) first.
+            expr: redis_memory_used_bytes{namespace="redis"} / redis_memory_max_bytes{namespace="redis"} > 0.80
            for: 5m
            labels:
              severity: warning
            annotations:
-              summary: "Redis pod {{ $labels.pod }} using {{ $value | humanizePercentage }} of maxmemory — eviction imminent"
+              summary: "Redis pod {{ $labels.pod }} using {{ $value | humanizePercentage }} of maxmemory — volatile-lru evicting cache keys; queue writes at risk"
          - alert: RedisEvictions
-            # allkeys-lru is configured so evictions under cache pressure are
-            # expected, but sustained evictions mean we're thrashing — raise it.
+            # volatile-lru evicts only TTL'd (cache) keys under pressure — an
+            # occasional eviction is by design, but a sustained rate means we're
+            # near maxmemory and should raise it before queue writes error.
            expr: rate(redis_evicted_keys_total{namespace="redis"}[5m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
-              summary: "Redis pod {{ $labels.pod }} evicting keys ({{ $value }} keys/s)"
-          - alert: RedisReplicationLagHigh
-            expr: redis_connected_slave_lag_seconds{namespace="redis"} > 30
-            for: 3m
-            labels:
-              severity: warning
-            annotations:
-              summary: "Redis replica {{ $labels.slave_ip }} lagging {{ $value }}s behind master"
+              summary: "Redis pod {{ $labels.pod }} evicting keys ({{ $value }} keys/s) — near maxmemory"
          - alert: RedisForkLatencyHigh
            # latest_fork_usec > 500ms means BGSAVE fork is stalling the main
            # thread long enough to drop client requests. COW pressure or
@@ -1717,16 +1715,6 @@ serverFiles:
              severity: warning
            annotations:
              summary: "Redis pod {{ $labels.pod }} AOF rewrite running >10m — COW memory risk, investigate"
-          - alert: RedisReplicasMissing
-            # redis-v2 StatefulSet should always have 3 replicas connected to
-            # the master (2 replicas + itself). <2 connected_slaves means one
-            # replica is unreachable or still syncing.
-            expr: redis_connected_slaves{namespace="redis", pod=~"redis-v2-.*"} < 2 and redis_instance_info{namespace="redis", pod=~"redis-v2-.*", role="master"} == 1
-            for: 10m
-            labels:
-              severity: warning
-            annotations:
-              summary: "Redis master {{ $labels.pod }} has only {{ $value }} connected replicas (expected 2)"
          - alert: HeadscaleReplicasMismatch
            expr: (kube_deployment_status_replicas_available{namespace="headscale"} or on() vector(0)) < 1
            for: 5m