t3: pin t3@0.0.24 + stop nightly auto-update (auth-outage fix) [ci skip]

The t3-autoupdate timer pulled t3@nightly 0.0.25 mid-day: the
provisioner's step 5b re-enabled it with `--now`, which on a Persistent
timer fires the missed daily job immediately. That build ran forward
schema migrations on every ~/.t3 state.sqlite
(auth_pairing_links/auth_sessions role->scopes, +proof_key_thumbprint)
AND changed the bootstrap API, breaking t3-mint/pairing for ALL devvm
users (they got the pair prompt but no session).
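
The catch-up behaviour described above can be modelled roughly as follows. This is a simplified sketch of how a Persistent timer decides whether to run on start, not systemd's actual logic; `would_fire_on_start` and its epoch-second arguments are hypothetical names used only for illustration.

```shell
# Simplified model of Persistent=true catch-up (illustrative assumption).
# $1: epoch seconds of the last recorded trigger
# $2: epoch seconds of the most recent scheduled elapse
# Returns 0 when starting the timer would run the job immediately.
would_fire_on_start() {
  local last_trigger="$1" last_scheduled="$2"
  [ "$last_trigger" -lt "$last_scheduled" ]
}

# The last trigger predates the most recent scheduled time, so the run was missed.
if would_fire_on_start 1000 2000; then
  echo "missed run: job fires as soon as the timer is started"
fi
```

Under this model, `enable --now` is risky because it also starts the timer, while a plain `enable` only arms it for the next boot or explicit start.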

- t3-autoupdate.sh: now a pinned-version ENFORCER (T3_PIN=0.0.24), not a
  nightly tracker -- re-asserts the pin (a no-op when correct).
- t3-provision-users.sh step 5b: drop `--now` (it triggered the
  immediate missed-job run that pulled the bad build).
- setup-devvm.sh: install pinned t3@0.0.24 at machine setup.
- unit Descriptions + service-catalog reflect the pin.
- post-mortem: 2026-06-09-t3-nightly-autoupdate-auth-outage.md.
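
The enforcer idea in the first bullet can be sketched as below. This is not the contents of t3-autoupdate.sh; `pin_action`, `enforce_pin`, and the probe/installer arguments are hypothetical, and only the `T3_PIN=0.0.24` value comes from this commit.

```shell
# Pin value taken from the commit message; overridable via the environment.
T3_PIN="${T3_PIN:-0.0.24}"

# Decide what the enforcer should do: prints "noop" when the installed
# version already matches the pin, "install" otherwise.
pin_action() {
  local installed="$1" pin="$2"
  if [ "$installed" = "$pin" ]; then
    echo noop
  else
    echo install
  fi
}

# Enforce the pin. The version probe and the installer are passed in as
# command names (both hypothetical), so the policy does not depend on how
# t3 is actually packaged on the host.
enforce_pin() {
  local probe="$1" installer="$2" installed
  # A failing probe is treated as "nothing installed".
  installed="$("$probe" || true)"
  if [ "$(pin_action "$installed" "$T3_PIN")" = "install" ]; then
    "$installer" "t3@${T3_PIN}"
  fi
}
```

Because the comparison is against a fixed version rather than a moving tag, a daily run changes nothing once the host is already on the pin.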

Host already reconciled out-of-band: rolled back to 0.0.24, re-enabled
the (now-pinned) enforcer, reset the 2 new users' disposable DBs,
surgically reverted wizard's auth tables to level-30 (96 threads + live
session preserved). All users verified 302 + t3_session.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-09 16:08:44 +00:00
parent 2125651aaa
commit 5ea238c707
7 changed files with 174 additions and 13 deletions

@@ -191,9 +191,12 @@ while IFS=$'\t' read -r os_user port; do
 id "$os_user" >/dev/null 2>&1 && run systemctl enable --now "t3-serve@$os_user.service" >/dev/null 2>&1 || true
 done < <(jq -r '.ports | to_entries[] | [.key, .value] | @tsv' "$desired_file")
-# 5b) machine-wide (once, not per-user): keep the t3 nightly auto-updater enabled so it
-# self-heals hourly — a `disabled` timer silently freezes every instance on an old build.
-run systemctl enable --now t3-autoupdate.timer >/dev/null 2>&1 || true
+# 5b) machine-wide (once, not per-user): keep the t3 pinned-version ENFORCER enabled (it
+# re-asserts T3_PIN daily; a no-op when already correct). NOT --now: with Persistent=true
+# a `--now` enable fires the missed daily job IMMEDIATELY, which on 2026-06-09 pulled a
+# breaking nightly mid-day and took out auth for everyone. `enable` (no --now) just arms
+# the 04:00 schedule; fresh boxes get t3 from setup-devvm.sh's pinned install, not here.
+run systemctl enable t3-autoupdate.timer >/dev/null 2>&1 || true
 # 6) regenerate /etc/ttyd-user-map + dispatch.json from the desired state (SSoT:
 # a roster entry removed here DISAPPEARS, which is what the offboarding cut relies on)