infra/.claude/skills
Viktor Barzin b64d8d6168 cluster-health: add #47 ghost-disk drift check; fix immich_search set -e crash
Check #47 "Proxmox CSI — Ghost-Disk Drift": per node, compares the real
virtio-scsi CSI disk count in `qm config <vmid>` (SSH PVE) against the
attached proxmox-CSI VolumeAttachments k8s tracks. Catches orphaned "ghost"
disks left by failed detaches (query-pci QMP timeouts) that the scheduler's
28-LUN guard can't see — exactly the drift that wedged the MAM grabber on
node3 (13 tracked vs 23 real). PASS when the counts reconcile; WARN when
drift > 0 or the real count is 20-24; FAIL when the real count is ≥ 25 (near
the 28-LUN cap). Already flagging node6 at 21 disks.
Single `qm list` + one `qm config` per VM keeps it ~3s (the naive
once-per-VM version timed out the parallel runner).
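
The thresholds above can be sketched as a small shell function. This is a minimal illustration, not the skill's actual code: the name `classify_ghost_drift` and its two positional arguments (tracked count, real count) are assumptions, and a negative drift (more tracked than real) is treated as PASS only because the message does not say how that case is handled.

```shell
# Hypothetical helper, not taken from the cluster-health skill.
# $1 = proxmox-CSI VolumeAttachments tracked by k8s for the node
# $2 = real CSI disk count read from `qm config <vmid>`
classify_ghost_drift() {
  tracked=$1
  real=$2
  drift=$(( real - tracked ))
  if [ "$real" -ge 25 ]; then
    echo FAIL    # close to the 28-LUN cap
  elif [ "$drift" -gt 0 ] || [ "$real" -ge 20 ]; then
    echo WARN    # ghost disks present, or real count in the 20-24 band
  else
    echo PASS    # counts reconcile and the node is below 20 disks
  fi
}
```

With the figures quoted in the message, `classify_ghost_drift 13 23` prints `WARN` (the node3 case), and `classify_ghost_drift 21 21` also prints `WARN` (node6 at 21 disks).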

Also fixes a PRE-EXISTING set -e crash in #46 immich_search (introduced by
138894cd): `pct=$(kubectl exec … | tr -d ' ')` and the dur_ms probe were
unguarded, so with `set -o pipefail` a non-zero psql/exec propagated and
tripped `set -e`, killing the check before `json_add`. The check silently dropped
out of every parallel report and broke `--serial` entirely (the whole run aborted).
Guarded both substitutions with `|| true`; the existing `=~` numeric checks
already handle the empty case. immich_search now reports PASS/WARN instead
of vanishing.
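
The failure mode can be reproduced in isolation. In this sketch `false` stands in for the failing `kubectl exec … psql` probe, and `run_probe`, the two script strings, and the `n/a` fallback are hypothetical; the real check's handling of the empty value is not shown in the message.

```shell
# Unguarded: with pipefail the pipeline returns the status of `false`, the
# assignment inherits it, and `set -e` exits before anything is reported.
unguarded='
set -eo pipefail
pct=$(false | tr -d " ")
echo "reported pct=[${pct}]"
'

# Guarded: `|| true` absorbs the failure; pct is empty and the numeric
# check (an assumed fallback to "n/a") handles that case.
guarded='
set -eo pipefail
pct=$(false | tr -d " ") || true
[[ "$pct" =~ ^[0-9]+$ ]] || pct="n/a"
echo "reported pct=[${pct}]"
'

# Run each variant in its own bash process so the caller's shell options
# cannot mask or alter the errexit behaviour being demonstrated.
run_probe() { bash -c "$1" 2>&1 || echo "aborted with status $?"; }
```

`run_probe "$unguarded"` prints `aborted with status 1`, while `run_probe "$guarded"` prints `reported pct=[n/a]`.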
2026-06-05 09:19:10 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| add-user | docs(add-user): update skill with actual working flow (no auto TF apply) | 2026-03-18 00:28:46 +00:00 |
| archived | [claude-agent-service] Remove orphaned DevVM SSH key wiring | 2026-04-18 13:31:15 +00:00 |
| cluster-health | cluster-health: add #47 ghost-disk drift check; fix immich_search set -e crash | 2026-06-05 09:19:10 +00:00 |
| disk-wear | [skill] Add /disk-wear skill for periodic disk write analysis | 2026-04-17 11:15:26 +00:00 |
| extend-vm-storage | [ci skip] Import Claude skills into OpenClaw moltbot | 2026-02-17 21:09:12 +00:00 |
| home-assistant | docs: add ha-sofia Version Control add-on to HA skill [ci skip] | 2026-04-12 11:37:02 +01:00 |
| k8s-ndots-search-domain-nxdomain-flood | [ci skip] Add pfsense-dnsmasq-interface-binding skill, update ndots skill to v1.1.0 | 2026-02-16 22:30:57 +00:00 |
| pfsense | [ci skip] Add pfSense firewall management skill | 2026-02-14 12:42:10 +00:00 |
| post-mortem | feat: add incident management system with user reporting | 2026-04-14 20:00:31 +00:00 |
| setup-project | feat(setup-project): auto-PR working Dockerfiles back to upstream | 2026-04-17 18:12:13 +00:00 |
| upgrade-state | k8s-version-upgrade: switch detection cron from weekly to daily | 2026-05-18 18:29:08 +00:00 |
| uptime-kuma | [uptime-kuma] Codify MySQL monitor (id=663) via idempotent sync CronJob | 2026-04-18 12:04:17 +00:00 |