infra/scripts
Viktor Barzin 037a609f27
All checks were successful
ci/woodpecker/push/default Pipeline was successful
k8s-version-upgrade: unblock 1.34.9 — skip kubeadm CoreDNS addon + busybox-date fix
The 1.34.9 master upgrade hard-failed `kubeadm upgrade apply` preflight: CoreDNS
is at v1.12.4 (Keel auto-bumped it 1.12.1 -> 1.12.4 on 2026-05-26 via a stale
kube-system out-of-band annotation), and 1.12.4 is ahead of kubeadm 1.34.9's
bundled corefile-migration table ("start version not supported").

- scripts/update_k8s.sh: master `kubeadm upgrade apply` now runs with
  `--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins
  --skip-phases=addon/coredns`. A dry-run proved --ignore ALONE would overwrite
  our custom split-horizon Corefile with kubeadm's default AND downgrade the
  image; --skip-phases leaves CoreDNS 100% untouched while the control plane
  upgrades. CoreDNS is pinned off Keel (keel.sh/policy=never) to stop the drift.
- stacks/k8s-version-upgrade/scripts/upgrade-step.sh: fix the preflight
  quiet-baseline (settle-window) check, which silently no-op'd on the ghcr
  claude-agent-service image's busybox `date` (can't parse ISO8601). Now tries
  GNU then busybox `-D`, and warns+skips on parse failure (no silent fail-open).
- docs: runbook + architecture document the CoreDNS handling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 13:45:05 +00:00
..
offinfra-templates offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
server_safe_poweroff fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-dispatch t3: session-auth detection for the gated nightly tracker (dispatch fallback logging + Loki alerts) 2026-06-16 09:56:55 +00:00
workstation Merge remote-tracking branch 'origin/master' into wizard/reconcile-mirror 2026-06-16 22:32:43 +00:00
apply-mbps-caps.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
apply-mbps-caps.sh apply-mbps-caps: compare normalized option sets (true idempotency) + devvm I/O-stall post-mortem [ci skip] 2026-06-11 18:00:08 +00:00
apply-mbps-caps.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
breakglass-firewall.sh break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
check-ingress-auth-comments.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
cluster_healthcheck.sh cluster-health: helm check #18 catches pending/failed releases (helm list -a) 2026-06-16 15:39:06 +00:00
cluster_manager.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
devvm-promtail.service t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
devvm-promtail.yaml t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
extend_vm_storage.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
fail2ban-breakglass-sshd.local break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
fan-control.env.example fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.service fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.sh fan-control: hold last command through transient HA losses (stop fan flapping) 2026-06-16 08:07:52 +00:00
forgejo-migrate-orphan-images.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-bulk-classify.js fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-inspect.mjs fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
gen_service_stacks.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
graceful-db-maintenance.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull_remote.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
k8s-apiserver-audit-policy.yaml fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
kill_ns.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate-state-to-pg fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate_service_state.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-change-tracker.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.sh nfs-mirror: exclude SQLite WAL/SHM sidecars + treat rsync exit 24 as success 2026-06-16 09:34:22 +00:00
nfs-mirror.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
node_registry_manager.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offinfra-onboard offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
offsite-sync-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
parse-postmortem-todos.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-haproxy-bootstrap.php pfsense: SNI-routed internal 443 — mail.viktorbarzin.me serves webmail everywhere 2026-06-10 18:41:07 +00:00
pfsense-nat-mailserver-haproxy-flip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-nat-mailserver-haproxy-unflip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
postmortem-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
provision-k8s-worker fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
publish-gate claude-agent-service image -> ghcr across all five consumer stacks (infra#19) 2026-06-13 01:47:54 +00:00
pve-nfs-exports fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pve-promtail.service pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
pve-promtail.yaml pve-host/dns: register loki.viktorbarzin.lan CNAME, drop the /etc/hosts pin 2026-06-10 22:55:20 +00:00
pve-snoopy.ini pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
renew_worker_certs.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-containerd-pullthrough.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-forgejo-containerd-mirror.sh dns: pfSense forward-zone for viktorbarzin.me, nodes fully stock [ci skip] 2026-06-10 08:32:34 +00:00
setup-task-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup_containerd_mirrors.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sshd-10-breakglass.conf break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
state-sync fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
stop_storage_services.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sudoers-t3-autopair fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-autoupdate.service t3: pin t3@0.0.24 + stop nightly auto-update (auth-outage fix) [ci skip] 2026-06-09 21:41:53 +00:00
t3-autoupdate.sh t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
t3-autoupdate.timer t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
t3-backup-state.service t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-backup-state.sh t3-backup-state: retention 14 -> 6 (bound devvm root fs) 2026-06-16 14:26:03 +00:00
t3-backup-state.timer t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-dispatch.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-mint fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.sh Merge remote-tracking branch 'origin/master' into wizard/reconcile-mirror 2026-06-16 22:32:43 +00:00
t3-provision-users.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-serve@.service t3-serve@: contain agent memory storms; survive child OOM kills 2026-06-10 21:00:06 +00:00
task-processor.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
test-fan-control.sh fan-control: thin actuator — HA computes the setpoint, host only applies it 2026-06-13 12:59:57 +00:00
test-vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
tg fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
tmux-persist-restore.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.timer workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist.sh tmux-persist: add single-user restore mode (restore [user]) 2026-06-10 21:08:57 +00:00
update-istio-injection.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
update_k8s.sh k8s-version-upgrade: unblock 1.34.9 — skip kubeadm CoreDNS addon + busybox-date fix 2026-06-17 13:45:05 +00:00
update_node.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
upgrade_state.sh k8s-version-upgrade: retry failed phases + surface wedged chain (fix 5-day silent stall) 2026-06-17 13:07:36 +00:00
vault-kubeconfig fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vzdump-vms.service backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
vzdump-vms.sh backup: fix vzdump-vms exit code — EXIT-trap && short-circuit falsely failed OK runs 2026-06-09 21:41:54 +00:00
vzdump-vms.timer backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
woodpecker-register-forgejo-repo.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00