infra/docs/architecture
Viktor Barzin e55c549c9a [redis] Phase 7 step 2: remove Bitnami helm_release + orphan PVCs
Bringing the 2026-04-19 rework to its end-state. Cutover soaked for ~1h
with 0 alerts firing and 127 ops/sec on the v2 master — skipped the
nominal 24h rollback window per user direction.

 - Removed `helm_release.redis` (Bitnami chart v25.3.2) from TF. Helm
   destroy cleaned up the StatefulSet redis-node (already scaled to 0),
   ConfigMaps, ServiceAccount, RBAC, and the deprecated `redis` + `redis-headless`
   ClusterIP services that the chart owned.
 - Removed `null_resource.patch_redis_service` — the kubectl-patch hack
   that worked around the Bitnami chart's broken service selector. No
   Helm chart, no patch needed.
 - Removed the dead `depends_on = [helm_release.redis]` from the HAProxy
   deployment.
 - `kubectl delete pvc -n redis redis-data-redis-node-{0,1}` for the two
   orphan PVCs the StatefulSet template left behind (K8s doesn't cascade-delete).
 - Simplified the top-of-file comment and the redis-v2 architecture
   comment — they talked about the parallel-cluster migration state that
   no longer exists. Folded in the sentinel hostname gotcha, the redis
   8.x image requirement, and the BGSAVE+AOF-rewrite memory reasoning
   so the rationale survives in the code rather than only in beads.
 - `RedisDown` alert no longer matches `redis-node|redis-v2` — just
   `redis-v2` since that's the only StatefulSet now. Kept the `or on()
   vector(0)` so the alert fires when kube_state_metrics has no sample
   (e.g. after accidental delete).
 - `docs/architecture/databases.md` trimmed: no more "pending TF removal"
   or "cold rollback for 24h" language.

Verification after apply:
 - kubectl get all -n redis: redis-v2-{0,1,2} (3/3 Running) + redis-haproxy-*
   (3 pods, PDB minAvailable=2). Services: redis-master + redis-v2-headless only.
 - PVCs: data-redis-v2-{0,1,2} only (redis-data-redis-node-* deleted).
 - Sentinel: all 3 agree mymaster = redis-v2-0 hostname.
 - HAProxy: PING PONG, DBSIZE 92, 127 ops/sec on master.
 - Prometheus: 0 firing redis alerts.

Closes: code-v2b
Closes: code-2mw

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:32:14 +00:00
..
agent-task-tracking.md Add agent task tracking documentation 2026-04-15 17:11:26 +00:00
authentication.md fix: cluster healthcheck fixes + Authentik upgrade to 2026.2.2 2026-04-15 06:41:56 +00:00
automated-upgrades.md [docs] automated-upgrades: document long-lived OAuth + expiry monitoring 2026-04-18 13:00:07 +00:00
backup-dr.md [infra] Fix rewrite-body plugin + cleanup TrueNAS + version bumps 2026-04-17 05:51:52 +00:00
ci-cd.md docs: comprehensive audit and update of all architecture docs and runbooks [ci skip] 2026-04-06 13:21:05 +03:00
compute.md docs: comprehensive audit and update of all architecture docs and runbooks [ci skip] 2026-04-06 13:21:05 +03:00
databases.md [redis] Phase 7 step 2: remove Bitnami helm_release + orphan PVCs 2026-04-19 16:32:14 +00:00
dns.md [dns] Kea: multi-IP DHCP option 6 (10.0.10, 10.0.20) + TSIG-signed DDNS (WS E) 2026-04-19 16:12:23 +00:00
homepage.md add homepage auto-discovery documentation [ci skip] 2026-03-25 13:06:43 +02:00
incident-response.md [claude-agent-service] Migrate all pipelines from DevVM SSH to K8s HTTP 2026-04-18 10:12:02 +00:00
mailserver.md [docs] Mailserver architecture — richer diagrams + steady-state accuracy [ci skip] 2026-04-19 12:40:53 +00:00
monitoring.md [docs] Document external-monitor opt-out mechanism in monitoring.md 2026-04-19 15:19:06 +00:00
multi-tenancy.md add architecture documentation for all infrastructure subsystems [ci skip] 2026-03-24 00:55:25 +02:00
networking.md [dns] Kea: multi-IP DHCP option 6 (10.0.10, 10.0.20) + TSIG-signed DDNS (WS E) 2026-04-19 16:12:23 +00:00
overview.md deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip] 2026-04-13 14:42:07 +00:00
secrets.md docs: comprehensive audit and update of all architecture docs and runbooks [ci skip] 2026-04-06 13:21:05 +03:00
security.md [docs] Update anti-AI and rybbit docs after rewrite-body removal 2026-04-17 21:43:13 +00:00
storage.md docs(storage): add encrypted LVM documentation 2026-04-15 21:00:37 +00:00
vpn.md docs: update Technitium DNS docs after cache optimization 2026-04-12 18:29:25 +01:00