k8s-upgrade: reclaim+auto-prune kubeadm /etc/kubernetes/tmp leak; correct crash root cause to etcd IO (not OIDC)

Digging into "why did the apiserver crash" disproved the earlier OIDC
explanation. An isolated v1.35.6 apiserver repro with authentik reachable
initialises OIDC cleanly (oidc.go:313, no error) and runs fine — so the
--authentication-config -> --oidc-* revert is NOT what crashed it. etcd's
surviving crash-window log is the real cause: 1180 "apply request took too long"
warnings in 16 min, individual applies up to 4.3s (healthy <100ms) right as
kubeadm tried to bring up the new apiserver. That's etcd IO starvation on the
shared sdc HDD (beads code-oflt).
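
The warning count and worst-case apply latency quoted above can be pulled from a saved etcd log with a few lines of shell. This is a minimal sketch, not the tooling used in the investigation: it assumes etcd's JSON log lines carry a "took" field holding a duration in ms or s, and the function names are hypothetical.

```sh
# Count "apply request took too long" warnings in a saved etcd log file.
count_slow_applies() {
    # grep -c prints 0 and exits 1 when nothing matches, so mask the status.
    grep -c 'apply request took too long' "$1" || true
}

# Print the largest "took" value among those warnings, in milliseconds.
# Assumption: durations look like "4.3s" or "250.5ms"; other forms
# (for example "1m2s") are skipped by the sed pattern.
max_apply_ms() {
    grep 'apply request took too long' "$1" |
        sed -nE 's/.*"took":"([0-9.]+)(ms|s)".*/\1 \2/p' |
        awk '{ v = ($2 == "s") ? $1 * 1000 : $1; if (v > max) max = v }
             END { printf "%.0f\n", max }'
}
```

Usage would be along the lines of `count_slow_applies etcd-crash-window.log` followed by `max_apply_ms etcd-crash-window.log`, where the log file name is a placeholder.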

A big contributor + the reason master root fs sat at 73%: kubeadm dumps a full
~400MB etcd DB backup into /etc/kubernetes/tmp/kubeadm-backup-etcd-<ts>/ before
every etcd upgrade and never cleans it up — 145 dirs / 28GB had accumulated,
driving image-GC churn and extra write-IO onto etcd's spindle. Reclaimed the space
live (root fs 73% -> 23%) and added a preflight prune of backup dirs older than
3 days so they can't re-accumulate.
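
The preflight prune described above could look roughly like the following. This is a sketch under assumptions, not the script from this commit: the function name and its arguments are hypothetical, and it relies on the -mindepth/-maxdepth options found in GNU and BSD find.

```sh
# Remove kubeadm etcd backup dirs older than a given number of days.
# $1: directory to scan (defaults to /etc/kubernetes/tmp)
# $2: age threshold in days (defaults to 3)
prune_kubeadm_backups() {
    dir="${1:-/etc/kubernetes/tmp}"
    days="${2:-3}"
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    # -mtime +N matches entries whose age in whole 24h periods exceeds N,
    # so with N=3 a dir must be at least 4 full days old to be removed.
    # -maxdepth 1 keeps find from descending into dirs it is deleting.
    find "$dir" -mindepth 1 -maxdepth 1 -type d \
        -name 'kubeadm-backup-etcd-*' -mtime +"$days" \
        -exec rm -rf -- {} +
}
```

Matching only the kubeadm-backup-etcd-* name pattern keeps the prune from touching anything else kubeadm may place in the same directory.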

Also corrected the OIDC handling: the kubeadm-config drift is real but only
breaks dashboard/kubectl SSO AFTER a successful upgrade (recoverable via the
chain's restore.sh + the kubeadm-config reconciliation) — it does not crash the
apiserver. So the preflight check is now an ALERT, not a block (was added on the
wrong hypothesis). Post-mortem, runbook, and apiserver-oidc.tf header corrected.
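
An alert-only version of the preflight check could be sketched as below. The assumptions are loud ones: both function names are hypothetical, the alert is a plain stderr message standing in for whatever alerting the chain really uses, and the check assumes `kubeadm upgrade diff` prints unified-diff lines prefixed with "-" and "+".

```sh
# Read a unified diff on stdin; succeed only if a line mentioning
# --authentication-config is removed and none is added back.
auth_config_dropped() {
    diff_text=$(cat)
    removed=$(printf '%s\n' "$diff_text" | grep -c '^-.*--authentication-config' || true)
    added=$(printf '%s\n' "$diff_text" | grep -c '^+.*--authentication-config' || true)
    [ "$removed" -gt 0 ] && [ "$added" -eq 0 ]
}

# Alert-only preflight: warn about SSO drift but never fail the upgrade.
# $1: target Kubernetes version passed to kubeadm upgrade diff.
preflight_oidc_alert() {
    if kubeadm upgrade diff "$1" 2>/dev/null | auth_config_dropped; then
        echo "ALERT: upgrade would drop --authentication-config (SSO drift)" >&2
    fi
    return 0
}
```

Returning 0 unconditionally is what makes this an alert rather than a gate: the drift is reported, and the upgrade proceeds.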

Per Viktor: reclaim the disk and automate so the manual cleanup never recurs;
the durable IO fix remains code-oflt (etcd off the shared HDD).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-25 15:23:15 +00:00
parent 60a1cb9a25
commit 9c68d147e0
4 changed files with 112 additions and 87 deletions


@@ -18,13 +18,16 @@
 # 3. the kubeadm-config ClusterConfiguration CM: what kubeadm regenerates from
 # Originally only (1)+(2) were managed, so every kubeadm upgrade rewrote the
 # manifest from the STALE CM, reverting --authentication-config to single-issuer
-# --oidc-* flags. On k8s 1.35 that regenerated apiserver CRASH-LOOPED and stalled
-# the whole upgrade mid-flight (master cordoned, etcd already bumped); see
-# docs/post-mortems/2026-06-24-kubeadm-oidc-drift-apiserver-upgrade-stall.md. The
+# --oidc-* flags. The consequence is SSO breakage AFTER the upgrade: kubectl +
+# dashboard lose multi-issuer auth (the apiserver does NOT crash on this, verified
+# by an isolated repro 2026-06-24; the 2026-06-24 v1.35 upgrade *stall* was a
+# separate etcd IO-starvation issue, see
+# docs/post-mortems/2026-06-24-kubeadm-oidc-drift-apiserver-upgrade-stall.md). The
 # remote script below now ALSO reconciles (3) via `kubeadm init phase
 # upload-config`, so a future kubeadm upgrade regenerates a CORRECT manifest. The
-# k8s-version-upgrade chain additionally GATES on `kubeadm upgrade diff` in
-# preflight and blocks+alerts if --authentication-config would still be dropped.
+# k8s-version-upgrade chain additionally ALERTS (does not block, since SSO drift is
+# recoverable) via `kubeadm upgrade diff` in preflight if --authentication-config
+# would still be dropped.
 #
 # SAFETY: the remote script health-gates on /livez and AUTO-ROLLS-BACK the
 # manifest from a timestamped backup if the apiserver does not recover, so a