docs(storage): record harden-half shipped (orphan cleanup + ghost-reconcile)
2a orphan cleanup (67 Released PVs + 475 LVs removed, VG pve 997->~410) + 2b csi-ghost-reconcile CronJob done — ghost-disk doom loop closed by construction, beads code-dfjn retireable. Cap kept at 28 (lowering would reverse the 2026-05-25 eviction-cascade post-mortem fix). Phase-1: insta2spotify migrated (noted its 3.26GB image re-pull blip on node reschedule). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
1b9d4f1233
commit
d808694af4
1 changed files with 6 additions and 1 deletions
|
|
@ -25,7 +25,12 @@ Both `StorageClass: nfs-truenas` and `StorageClass: nfs-proxmox` point to the Pr
|
|||
|
||||
**History (2026-04-13)**: TrueNAS (VM 9000, 10.0.10.15) fully decommissioned. NFS storage migrated to the Proxmox host (192.168.1.127). ZFS datasets under `/mnt/main/` and `/mnt/ssd/` moved to ext4 LVs at `/srv/nfs/` and `/srv/nfs-ssd/`. Legacy PVs referencing `/mnt/main/` paths still work (bind-mounted or symlinked on the Proxmox host); new PVs use `/srv/nfs/` and `/srv/nfs-ssd/`. TrueNAS VM still exists in stopped state on PVE pending user decision on deletion.
|
||||
|
||||
**History (2026-06-05) — Wave 2 NFS migration + strategy decision**: Decided to **keep proxmox-csi and harden it** (option ① — keeps PVC mobility, £0, no new hardware) rather than re-architect to TopoLVM (pins PVCs to a node) or Longhorn (2× write-amplification on the single shared sdc HDD). See `docs/plans/2026-06-05-block-storage-harden-nfs-design.md`. Migrated 5 non-DB, embedded-DB-free workloads off block to NFS to relieve the per-VM LUN cap: **tandoor** (media, PG-backed), **speedtest** (config, MySQL), **hackmd** (image uploads, MySQL — dropped LUKS for low-sensitivity images), **changedetection** (JSON datastore), **send** (upload blobs, Redis). Freed 5 SCSI-LUN slots (4 on the then-hot node6, 21→16). Each followed the scale-0 → busybox mover (`cp -a`) → swap `claim_name` → delete block PVC pattern. **Remaining structural work (the "harden" half) is open in beads `code-dfjn`**: ghost-disk-loop *prevention* (cap disks/node, raise QMP timeout, auto-reconcile) + orphan Released-PV/LV cleanup.
|
||||
**History (2026-06-05) — Wave 2 NFS migration + strategy decision**: Decided to **keep proxmox-csi and harden it** (option ① — keeps PVC mobility, £0, no new hardware) rather than re-architect to TopoLVM (pins PVCs to a node) or Longhorn (2× write-amplification on the single shared sdc HDD). See `docs/plans/2026-06-05-block-storage-harden-nfs-design.md`. Migrated 5 non-DB, embedded-DB-free workloads off block to NFS to relieve the per-VM LUN cap: **tandoor** (media, PG-backed), **speedtest** (config, MySQL), **hackmd** (image uploads, MySQL — dropped LUKS for low-sensitivity images), **changedetection** (JSON datastore), **send** (upload blobs, Redis). Freed 5 SCSI-LUN slots (4 on the then-hot node6, 21→16). Each followed the scale-0 → busybox mover (`cp -a`) → swap `claim_name` → delete block PVC pattern. (Phase-1 follow-on 2026-06-05: insta2spotify also migrated — note its reschedule re-pulled a 3.26 GB image, a ~6 min blip; large-image services incur a pull-delay when a migration moves the pod to a fresh node.)
|
||||
|
||||
**The "harden" half is now SHIPPED (2026-06-05):**
|
||||
- **Orphan cleanup** — removed 67 `Released` proxmox PVs + 475 orphan LVs/snapshots (VG `pve` 997 → ~410 LVs; thin pool freed). 1 LV left (`f127a41c`, stuck-open stale qemu fd — harmless, clears on node reboot; do not force `dmsetup remove`).
|
||||
- **Ghost-loop prevention** — `csi-ghost-reconcile` CronJob (`stacks/proxmox-csi/ghost-reconcile.tf`, every 15 min) compares each worker VM's real scsi disks (Proxmox API, scoped CSI token) against k8s VolumeAttachments and safely detaches ghosts (`PUT .../config delete=scsiN`); detection mirrors check #47, with a 60 s re-confirm + per-run cap-5. Verified live (66 VAs, 0 ghosts). This closes the doom loop by construction — **beads `code-dfjn` can be retired.**
|
||||
- **Cap deliberately kept at 28** (NOT lowered to 24): the labeler value (`stacks/proxmox-csi/.../main.tf` `node_labels`) was raised 24→28 per the 2026-05-25 eviction-cascade post-mortem; lowering it would reverse that fix. With auto-reconcile keeping drift at 0, the 28 cap is safe.
|
||||
|
||||
## Architecture Diagram
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue