infra/scripts
Viktor Barzin abb15cd49d
All checks were successful
ci/woodpecker/push/default Pipeline was successful
devvm: personalize emo's cluster-health skill for ha-sofia
emo cares about ha-sofia + his Sofia smart-home devices (Tuya, the MPPT
ATS, the Барзини → Статус dashboard), and only about the cluster when it's
breaking those. Rewrite his vendored cluster-health into an ha-sofia-focused,
read-only variant:
- leads with ha-sofia's in-cluster dependency chain (tuya-bridge + the
  cloudflared/Traefik/DNS/TLS reachability path), all checkable read-only;
- fixes the script path to emo's own clone (/home/emo/code) — he can't read
  wizard's tree — and runs it --no-fix (he's cluster read-only);
- loads emo's own HA token (see below) so the ha-sofia checks (26-29, 45)
  actually run for him; documents the host-SSH/Vault checks that skip;
- triages: cluster FAIL/WARN matters only if on his chain; everything else is
  a one-line "admin's area"; escalate via /file-issue since he can't fix.

This snapshot copy is now an emo-specific variant, intentionally diverged
from the canonical 47-check admin skill — README updated to say "do not
re-sync from canonical".

Token: a dedicated long-lived HA token (client_name emo-cluster-health) was
minted on ha-sofia via the admin account and stored emo-readable at
/home/emo/.config/cluster-health/haos_token (600). It carries admin HA scope
(HA only mints tokens for the authenticating account); independently revocable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:03:14 +00:00
..
offinfra-templates offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
server_safe_poweroff fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-dispatch t3: session-auth detection for the gated nightly tracker (dispatch fallback logging + Loki alerts) 2026-06-16 09:56:55 +00:00
workstation devvm: personalize emo's cluster-health skill for ha-sofia 2026-06-26 16:03:14 +00:00
apply-mbps-caps.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
apply-mbps-caps.sh apply-mbps-caps: compare normalized option sets (true idempotency) + devvm I/O-stall post-mortem [ci skip] 2026-06-11 18:00:08 +00:00
apply-mbps-caps.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
breakglass-firewall.sh break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
check-ingress-auth-comments.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
claude-auth-sync@.service Add per-user Claude auth renewal 2026-06-20 20:10:40 +00:00
claude-auth-sync@.timer Add per-user Claude auth renewal 2026-06-20 20:10:40 +00:00
cluster_healthcheck.sh goldmane-trail: polish follow-ups #57/#59/#61/#62/#63 + digest→#alerts 2026-06-25 17:49:25 +00:00
cluster_manager.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
devvm-promtail.service t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
devvm-promtail.yaml t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
extend_vm_storage.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
fail2ban-breakglass-sshd.local break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
fan-control.env.example fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.service fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.sh fan-control: hold last command through transient HA losses (stop fan flapping) 2026-06-16 08:07:52 +00:00
forgejo-migrate-orphan-images.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-bulk-classify.js fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-inspect.mjs fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
gen_service_stacks.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
graceful-db-maintenance.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull_remote.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
k8s-apiserver-audit-policy.yaml fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
kill_ns.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate-state-to-pg fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate_service_state.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-change-tracker.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.sh nfs-mirror: exclude SQLite WAL/SHM sidecars + treat rsync exit 24 as success 2026-06-16 09:34:22 +00:00
nfs-mirror.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
node_registry_manager.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offinfra-onboard offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
offsite-sync-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
parse-postmortem-todos.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-haproxy-bootstrap.php pfsense: SNI-routed internal 443 — mail.viktorbarzin.me serves webmail everywhere 2026-06-10 18:41:07 +00:00
pfsense-nat-mailserver-haproxy-flip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-nat-mailserver-haproxy-unflip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
postmortem-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
provision-k8s-worker fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
publish-gate claude-agent-service image -> ghcr across all five consumer stacks (infra#19) 2026-06-13 01:47:54 +00:00
pve-nfs-exports fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pve-promtail.service pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
pve-promtail.yaml pve-host/dns: register loki.viktorbarzin.lan CNAME, drop the /etc/hosts pin 2026-06-10 22:55:20 +00:00
pve-snoopy.ini pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
renew_worker_certs.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-containerd-pullthrough.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-forgejo-containerd-mirror.sh dns: pfSense forward-zone for viktorbarzin.me, nodes fully stock [ci skip] 2026-06-10 08:32:34 +00:00
setup-task-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup_containerd_mirrors.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sshd-10-breakglass.conf break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
state-sync fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
stop_storage_services.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sudoers-t3-autopair fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-autoupdate.service t3: pin t3@0.0.24 + stop nightly auto-update (auth-outage fix) [ci skip] 2026-06-09 21:41:53 +00:00
t3-autoupdate.sh t3-autoupdate: source the shared safe-restart lib + record deferrals 2026-06-21 12:32:57 +00:00
t3-autoupdate.timer t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
t3-backup-state.service t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-backup-state.sh t3-backup-state: retention 14 -> 6 (bound devvm root fs) 2026-06-16 14:26:03 +00:00
t3-backup-state.timer t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-dispatch.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-migrate-idle.service t3-migrate-idle: systemd oneshot + overnight timer (01:00-05:40, /20) 2026-06-21 12:35:19 +00:00
t3-migrate-idle.sh t3-migrate-idle: drain deferral markers when safe 2026-06-21 12:34:44 +00:00
t3-migrate-idle.timer t3-migrate-idle: systemd oneshot + overnight timer (01:00-05:40, /20) 2026-06-21 12:35:19 +00:00
t3-mint fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.sh t3-provision-users: install_skills heals stale symlinks + owns ~/.agents 2026-06-23 09:27:31 +00:00
t3-provision-users.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-safe-restart.sh t3-safe-restart: extract shared safe-restart library from t3-autoupdate 2026-06-21 12:28:53 +00:00
t3-serve@.service t3-serve@: contain agent memory storms; survive child OOM kills 2026-06-10 21:00:06 +00:00
task-processor.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
test-claude-auth-sync.sh fix(workstation): claude-auth-sync must merge, not overwrite, the shared Vault path 2026-06-26 08:33:41 +00:00
test-fan-control.sh fan-control: thin actuator — HA computes the setpoint, host only applies it 2026-06-13 12:59:57 +00:00
test-vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
test_tg_lock_timeout.py ci: scripts/tg waits out a contended state lock (-lock-timeout) 2026-06-21 00:15:39 +00:00
tg ci: scripts/tg waits out a contended state lock (-lock-timeout) 2026-06-21 00:15:39 +00:00
tmux-persist-restore.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.timer workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist.sh tmux-persist: add single-user restore mode (restore [user]) 2026-06-10 21:08:57 +00:00
update-istio-injection.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
update_k8s.sh k8s-version-upgrade: ignore CoreDNS preflight on kubeadm upgrade plan too 2026-06-17 13:49:06 +00:00
update_node.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
upgrade_state.sh k8s-version-upgrade: move detection to nightly 23:00 UTC (overnight upgrades) 2026-06-17 18:16:32 +00:00
vault-kubeconfig fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vzdump-vms.service backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
vzdump-vms.sh backup: fix vzdump-vms exit code — EXIT-trap && short-circuit falsely failed OK runs 2026-06-09 21:41:54 +00:00
vzdump-vms.timer backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
woodpecker-register-forgejo-repo.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00