- Add local-path PVC for MySQL data (10Gi, actual usage ~27GB)
- Init container seeds data from NFS on first run (cp -a)
- NFS volume kept as read-only seed source in init container
- MySQL 9.2.0 running on local disk with proper fsync
- All dependent services verified running:
hackmd, speedtest, onlyoffice, paperless-ngx
- mysqldump backup taken before migration
- Existing daily mysqldump CronJob unchanged (writes to NFS)
- Switch from redis/redis-stack:latest to redis:7-alpine
(modules were completely unused — zero module commands in stats)
- Move data from NFS (/mnt/main/redis) to local-path PVC
(RDB saves: 39s on NFS → <1s on local disk)
- Start fresh (old RDB had redis-stack module data incompatible with plain redis;
all Redis data is transient — queues and caches rebuild automatically)
- Add hourly redis-backup CronJob: redis-cli --rdb to NFS for backup pipeline
- Remove RedisInsight UI ingress (port 8001, only in redis-stack)
- Add redis-backup to NFS exports
- 110 clients reconnected immediately after switchover
- Memory savings: ~100MB from dropping unused modules
New variant documents ghost Running pods with frozen processes after kured
rolling reboots. Key diagnostic: Running 1/1 but zero listening sockets
from ss -tlnp. Fix: force-delete pods to get fresh NFS mounts.
The BuildKit builder cannot push to the insecure HTTP registry at
registry.viktorbarzin.lan:5050 because buildkit_config is not being
applied by the plugin. Simplified to DockerHub-only push for now.
Private registry caching and push can be re-added once buildkit_config
issue is resolved.
Major milestone - shared PostgreSQL moved from NFS to CloudNativePG:
- CNPG cluster (pg-cluster) running in dbaas namespace on local-path storage
- PostGIS image (ghcr.io/cloudnative-pg/postgis:16) for dawarich compatibility
- All 20 databases and 19 roles restored from pg_dumpall backup
- postgresql.dbaas Service patched to point at CNPG primary
- Old PG deployment scaled to 0 (NFS data intact for rollback)
- All 12+ dependent services verified running:
authentik, n8n, dawarich, tandoor, linkwarden, netbox, woodpecker,
rybbit, affine, health, resume, trading-bot, atuin
- Authentik PgBouncer working through the switched endpoint
TODO: codify CNPG cluster in Terraform, add 2nd replica, update backup CronJob
The plugin-docker-buildx (Codeberg version) changed CacheFrom from
string to StringSlice, which causes urfave/cli to split on commas.
The cache_images setting properly handles registry refs by generating
both --cache-from and --cache-to flags automatically.
- cache_from/cache_to must be plain strings, not YAML lists — the
plugin-docker-buildx treats them as single string values and the
Woodpecker settings layer was splitting comma-separated list items
into separate --cache-from flags (type=registry and ref=... separately)
- caretta.tf: replace deprecated set{} blocks with values=[yamlencode()]
to fix Terraform plan error with newer Helm provider
Key changes from v1:
- Drop 3-instance replication → 2-instance CNPG, single Redis/MySQL
- Remove Headscale from PG migration (project discourages it)
- Remove MeshCentral from PG migration (NeDB, not SQLite)
- Replace Redis Sentinel with single redis:7 on local disk (modules unused)
- Add RAM overcommit warning and mitigation
- Add explicit single-host limitation acknowledgment
- Add per-component rollback plans
- Fix backup strategy (CNPG can't archive WAL to NFS natively)
- Reorder migration: low-risk services first, authentik last
- Add research gate before each service migration