infra/scripts
Viktor Barzin 36521839fc t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent
Phase 2 of "track t3 nightly, accept the risk, but make sure session auth works
and revert if it breaks". Rewrites the daily t3-autoupdate from a pinned-version
enforcer into a NIGHTLY TRACKER that gates every bump so a bad build self-heals
instead of repeating the 2026-06-09 auth outage:

- follows the t3@nightly npm dist-tag (T3_TRACK; T3_PIN still works as a hard
  freeze; /etc/t3-autoupdate.freeze is the manual revert switch);
- downgrade-guard (the nightly tag is mutable — never move backward) + channel
  sanity (target must be a -nightly. build);
- pre-bump per-user state.sqlite backup (online VACUUM INTO) BEFORE install, so
  rollback is a restore, not SQLite surgery;
- health-check now SEEDS a throwaway instance with a COPY of a real POPULATED
  state.sqlite, exercising the forward MIGRATION (the actual 2026-06-09 failure
  class) + the real mint->exchange->t3_session pairing handshake before trusting
  a build. Scratch dir is on /var/tmp (disk), not the 2G tmpfs /tmp;
- canary rollout: restart idle instances ONE AT A TIME, verify pairing through
  the real dispatch after each, and on the first failure roll back (binary +
  that user's DB from the pre-bump backup) AND self-freeze so it can't re-flap
  onto bad builds. Active-agent instances are deferred, never killed. Rollback
  target is the recorded LAST-GOOD, not "whatever was installed";
- DRY_RUN mode (T3_DRY_RUN=1) previews the gate against a temp-prefix install —
  validated: 0.0.28-nightly.20260616.571 PASSES the populated-DB migration gate.
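
The downgrade guard and channel sanity check described in the list above can be sketched roughly as follows. This is an illustration only, not the contents of t3-autoupdate.sh: the function names `is_nightly`, `is_newer` and `gate_target` are hypothetical, the version pattern is inferred from the one version string quoted above, and `sort -V` (a GNU/BusyBox extension) is used as an approximation of proper semver ordering.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not taken from t3-autoupdate.sh.

# Channel sanity: accept only versions shaped like X.Y.Z-nightly.<digits>[.<digits>...]
is_nightly() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+-nightly\.[0-9]+(\.[0-9]+)*$'
}

# Downgrade guard: succeed only if target ($2) sorts strictly after current ($1).
# sort -V is a version sort, not strict semver; assumed good enough for nightlies.
is_newer() {
  [ "$1" != "$2" ] || return 1
  _gate_newest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)
  [ "$_gate_newest" = "$2" ]
}

# Combined gate: prints the decision, returns non-zero when the bump is refused.
gate_target() {
  if ! is_nightly "$2"; then
    echo "reject: $2 is not a -nightly. build" >&2
    return 1
  fi
  if ! is_newer "$1" "$2"; then
    echo "reject: $2 is not newer than $1" >&2
    return 1
  fi
  echo "ok: $1 -> $2"
}

# Example: moving from a pinned release to the nightly named in this commit.
gate_target "0.0.24" "0.0.28-nightly.20260616.571"
```

Because the dist-tag is mutable, the comparison is made against the currently installed version rather than trusting the tag to only move forward; an equal version is treated as "nothing to do" and refused like a downgrade.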

timer: drop Persistent=true (a missed 04:00 must not fire a real bump on boot
mid-day with users active — a 2026-06-09 contributing factor).
setup-devvm.sh: install t3@nightly on fresh boxes (no state to break), in sync with the tracker.
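
A small check along these lines could catch `Persistent=` creeping back into the timer unit. It is a hypothetical lint, not part of this commit: the function name `timer_is_persistent` is made up, and it only greps the unit file, so it ignores drop-ins and the systemd rule that the last assignment wins.

```shell
#!/bin/sh
# Hypothetical lint: succeeds (exit 0) if the given timer unit file enables
# Persistent=. systemd parses booleans case-insensitively (1, yes, y, true, t,
# on), hence the -i flag; note that -i also relaxes the case of the key name.
timer_is_persistent() {
  grep -Eiq \
    '^[[:space:]]*Persistent[[:space:]]*=[[:space:]]*(1|y|yes|t|true|on)[[:space:]]*$' \
    "$1"
}
```

Used as `timer_is_persistent t3-autoupdate.timer && echo "Persistent is still set"`, it flags a unit that would replay a missed run at the next boot.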

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 10:08:12 +00:00
offinfra-templates offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
server_safe_poweroff fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-dispatch t3: session-auth detection for the gated nightly tracker (dispatch fallback logging + Loki alerts) 2026-06-16 09:56:55 +00:00
workstation t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
apply-mbps-caps.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
apply-mbps-caps.sh apply-mbps-caps: compare normalized option sets (true idempotency) + devvm I/O-stall post-mortem [ci skip] 2026-06-11 18:00:08 +00:00
apply-mbps-caps.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
breakglass-firewall.sh break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
check-ingress-auth-comments.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
cluster_healthcheck.sh technitium: mirror mail-auth records into internal zone; fix redfish check [ci skip] 2026-06-10 17:46:37 +00:00
cluster_manager.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
daily-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
devvm-promtail.service t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
devvm-promtail.yaml t3: connection logging across the path for drop attribution 2026-06-11 13:48:10 +00:00
extend_vm_storage.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
fail2ban-breakglass-sshd.local break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
fan-control.env.example fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.service fan-control docs: sync runbook/env/service/design to the HA-actuator + anti-flap model 2026-06-16 08:11:48 +00:00
fan-control.sh fan-control: hold last command through transient HA losses (stop fan flapping) 2026-06-16 08:07:52 +00:00
forgejo-migrate-orphan-images.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-bulk-classify.js fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
frigate-inspect.mjs fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
gen_service_stacks.py fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
graceful-db-maintenance.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
image_pull_remote.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
k8s-apiserver-audit-policy.yaml fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
kill_ns.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
lvm-pvc-snapshot.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate-state-to-pg fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
migrate_service_state.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-change-tracker.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
nfs-mirror.sh nfs-mirror: exclude SQLite WAL/SHM sidecars + treat rsync exit 24 as success 2026-06-16 09:34:22 +00:00
nfs-mirror.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
node_registry_manager.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offinfra-onboard offinfra-onboard: --dockerfile flag for non-root Dockerfiles 2026-06-13 02:37:25 +00:00
offsite-sync-backup.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
offsite-sync-backup.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
parse-postmortem-todos.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-haproxy-bootstrap.php pfsense: SNI-routed internal 443 — mail.viktorbarzin.me serves webmail everywhere 2026-06-10 18:41:07 +00:00
pfsense-nat-mailserver-haproxy-flip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pfsense-nat-mailserver-haproxy-unflip.php fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
postmortem-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
provision-k8s-worker fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
publish-gate claude-agent-service image -> ghcr across all five consumer stacks (infra#19) 2026-06-13 01:47:54 +00:00
pve-nfs-exports fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
pve-promtail.service pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
pve-promtail.yaml pve-host/dns: register loki.viktorbarzin.lan CNAME, drop the /etc/hosts pin 2026-06-10 22:55:20 +00:00
pve-snoopy.ini pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH 2026-06-10 19:31:45 +00:00
renew_worker_certs.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-containerd-pullthrough.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup-forgejo-containerd-mirror.sh dns: pfSense forward-zone for viktorbarzin.me, nodes fully stock [ci skip] 2026-06-10 08:32:34 +00:00
setup-task-pipeline.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
setup_containerd_mirrors.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sshd-10-breakglass.conf break-glass SSH: drop port-knock for exposed key-only :52222; version host config 2026-06-11 18:23:39 +00:00
state-sync fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
stop_storage_services.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
sudoers-t3-autopair fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-autoupdate.service t3: pin t3@0.0.24 + stop nightly auto-update (auth-outage fix) [ci skip] 2026-06-09 21:41:53 +00:00
t3-autoupdate.sh t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
t3-autoupdate.timer t3: gated nightly tracker (replaces pinned enforcer) + drop timer Persistent 2026-06-16 10:08:12 +00:00
t3-backup-state.service t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-backup-state.sh t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-backup-state.timer t3: prepare to adopt 0.0.25 — version-agnostic dispatch + real pairing health-check + state backup [ci skip] 2026-06-09 21:41:53 +00:00
t3-dispatch.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-mint fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-provision-users.sh workstation: standardize on the native claude install (drop npm-global + npx) 2026-06-15 17:12:05 +00:00
t3-provision-users.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
t3-serve@.service t3-serve@: contain agent memory storms; survive child OOM kills 2026-06-10 21:00:06 +00:00
task-processor.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
test-fan-control.sh fan-control: thin actuator — HA computes the setpoint, host only applies it 2026-06-13 12:59:57 +00:00
test-vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
tg fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
tmux-persist-restore.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.service workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist-save.timer workstation: rename tmux persistence out of the t3 namespace [ci skip] 2026-06-10 17:42:52 +00:00
tmux-persist.sh tmux-persist: add single-user restore mode (restore [user]) 2026-06-10 21:08:57 +00:00
update-istio-injection.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
update_k8s.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
update_node.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
upgrade_state.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-kubeconfig fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.service fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vault-token-renew.timer fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00
vzdump-vms.service backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
vzdump-vms.sh backup: fix vzdump-vms exit code — EXIT-trap && short-circuit falsely failed OK runs 2026-06-09 21:41:54 +00:00
vzdump-vms.timer backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap 2026-06-09 21:41:54 +00:00
woodpecker-register-forgejo-repo.sh fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip] 2026-06-09 08:45:33 +00:00