infra/.claude
Viktor Barzin bebe8fbd74
workflows: add read-only memory-overcommit + node-removal capacity analysis
Reusable Workflow script that audits whether the cluster is memory-overcommitted and whether a single k8s worker can be removed to return RAM to the PVE host without sacrificing N-1 failover. Read-only throughout: it gathers PVE host memory (qm config, free, and KSM via SSH), k8s per-node capacity plus the cluster's 30d peak working set, and per-workload right-sizing. It then models N-1 two ways (physical actual-usage and scheduling-by-request) and adversarially verifies the conclusion with 3 skeptics.
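
The two N-1 models can be illustrated with a minimal Python sketch. This is not the workflow script itself: the `Node` type, the `n_minus_1` helper, and all figures are hypothetical. It assumes "N-1 after removal" means that, once the chosen worker is gone, the cluster must still absorb the loss of its largest remaining node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    allocatable_gib: float  # schedulable memory on this worker


def n_minus_1(nodes, removed, cluster_peak_gib, total_requests_gib):
    """Headroom after removing `removed` and then losing the largest survivor."""
    remaining = [n for n in nodes if n.name != removed]
    # Worst-case failover: the biggest remaining node goes down.
    failed = max(remaining, key=lambda n: n.allocatable_gib)
    survivors = [n for n in remaining if n.name != failed.name]
    capacity = sum(n.allocatable_gib for n in survivors)
    return {
        "capacity_gib": capacity,
        # Physical view: does the observed peak working set still fit?
        "physical_headroom_gib": capacity - cluster_peak_gib,
        # Scheduling view: can the scheduler still place every request?
        "scheduling_headroom_gib": capacity - total_requests_gib,
    }


# Hypothetical cluster: four 16 GiB workers.
nodes = [Node(f"worker-{i}", 16.0) for i in range(1, 5)]
result = n_minus_1(nodes, removed="worker-4",
                   cluster_peak_gib=24.0, total_requests_gib=36.0)
print(result)
```

With these made-up numbers the physical view shows spare room while the scheduling view shows a shortfall, which is the kind of disagreement that inflated requests can produce.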

Sizes requests (scheduling reservation) and limits (OOM ceiling) as SEPARATE knobs — an earlier ad-hoc pass conflated them by sizing requests to 30d peak, which manufactured a false N-1 shortfall. Invoke via Workflow {scriptPath}, or by name when cwd is the infra repo.
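
A small sketch of treating the two knobs separately follows. The policy here is an assumption, not taken from the script: the request follows a percentile of observed usage, and the limit follows the observed peak plus a margin. The `right_size` name, its parameters, and the sample data are hypothetical.

```python
import math


def right_size(samples_mib, request_percentile=95, limit_margin=1.25):
    """Suggest a memory request and limit from working-set samples (MiB)."""
    ordered = sorted(samples_mib)
    # Nearest-rank percentile, computed with integer ceiling division.
    rank = max(1, -(-request_percentile * len(ordered) // 100))
    request = ordered[rank - 1]
    # The limit is the OOM ceiling, so it tracks the peak plus headroom.
    limit = math.ceil(ordered[-1] * limit_margin)
    return {"request_mib": request, "limit_mib": limit}


# Hypothetical workload: steady at 200 MiB with one short spike to 800 MiB.
samples = [200] * 19 + [800]
sizing = right_size(samples)
print(sizing)
```

Sizing the request to the 800 MiB peak instead would reserve four times the steady-state usage on every node, which is how summed requests can overstate what an N-1 scenario has to fit.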

Requested by Viktor: identify memory overcommit and whether deployment requests can be trimmed to free PVE host RAM by removing a node, without sacrificing service reliability.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 12:06:17 +00:00
Name | Last commit | Date (UTC)
agents | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
commands | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
reference | monitoring: consolidate all Slack alerting to #alerts, abandon #security | 2026-06-26 13:29:44 +00:00
scripts | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
skills | home-assistant skill: refresh ha-london map (HAOS 2026.5.2, Cowboy revived, Overview redesign) | 2026-06-24 22:03:15 +00:00
workflows | workflows: add read-only memory-overcommit + node-removal capacity analysis | 2026-06-29 12:06:17 +00:00
calendar-query.py | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
CLAUDE.md | dbaas: stretch CNPG checkpoint timer 5->15min + raise WAL size (cut sdc write IOPS) | 2026-06-29 11:41:09 +00:00
home-assistant-sofia.py | homelab CLI v0.7: add ha token + ha ssh for Home Assistant | 2026-06-20 23:46:09 +00:00
home-assistant.py | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
internet-mode-used_DO_NOT_REMOVE_MANUALLY_SECURITY_RISK | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
pfsense.py | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00
settings.json | fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] | 2026-06-09 08:45:33 +00:00