k8s-version-upgrade: unblock 1.34.9 — skip kubeadm CoreDNS addon + busybox-date fix

The 1.34.9 master upgrade hard-failed `kubeadm upgrade apply` preflight: CoreDNS
is at v1.12.4 (Keel auto-bumped it 1.12.1 -> 1.12.4 on 2026-05-26 via a stale
kube-system out-of-band annotation), and 1.12.4 is ahead of kubeadm 1.34.9's
bundled corefile-migration table ("start version not supported").
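
The version drift described above could be caught before an upgrade with a check along these lines. This is a minimal sketch, not part of the commit: the helper names `coredns_version_from_image` and `version_gt` are hypothetical, the Deployment name `coredns` in `kube-system` is assumed, and `MAX_KNOWN` stands in for whatever ceiling the operator wants to enforce (the actual upper bound of kubeadm's migration table is not stated in this commit). It also assumes a `sort` that supports `-V`.

```shell
#!/usr/bin/env bash

# Extract the version from an image reference such as
# "registry.k8s.io/coredns/coredns:v1.12.4" (assumes the reference carries a tag).
coredns_version_from_image() {
  local image="${1%%@*}"   # drop any "@sha256:..." digest suffix
  local tag="${image##*:}" # keep everything after the last colon
  printf '%s\n' "${tag#v}" # strip a leading "v"
}

# Return 0 when version $1 is strictly newer than version $2.
version_gt() {
  [[ "$1" != "$2" ]] &&
    [[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n1)" == "$1" ]]
}

# Usage against a live cluster (requires kubectl access; MAX_KNOWN is a placeholder):
#   image=$(kubectl -n kube-system get deployment coredns \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
#   running=$(coredns_version_from_image "$image")
#   if version_gt "$running" "$MAX_KNOWN"; then
#     echo "WARN: CoreDNS $running is newer than $MAX_KNOWN" >&2
#   fi
```

Comparing with `sort -V` rather than plain string comparison matters here because `1.9.x` would otherwise sort after `1.12.x`.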

- scripts/update_k8s.sh: master `kubeadm upgrade apply` now runs with
  `--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins
  --skip-phases=addon/coredns`. A dry-run proved --ignore ALONE would overwrite
  our custom split-horizon Corefile with kubeadm's default AND downgrade the
  image; --skip-phases leaves CoreDNS 100% untouched while the control plane
  upgrades. CoreDNS is pinned off Keel (keel.sh/policy=never) to stop the drift.
- stacks/k8s-version-upgrade/scripts/upgrade-step.sh: fix the preflight
  quiet-baseline (settle-window) check, which silently no-op'd on the ghcr
  claude-agent-service image's busybox `date` (can't parse ISO8601). Now tries
  GNU then busybox `-D`, and warns+skips on parse failure (no silent fail-open).
- docs: runbook + architecture document the CoreDNS handling.
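
The GNU-then-busybox fallback described for `upgrade-step.sh` can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the function names `iso_to_epoch` and `check_quiet_window`, the exact timestamp format (`YYYY-MM-DDTHH:MM:SSZ`), and the return-code convention are all hypothetical.

```shell
#!/usr/bin/env bash

# Convert an ISO 8601 UTC timestamp (e.g. 2026-06-17T13:45:05Z) to epoch seconds.
# Prints the epoch and returns 0 on success; returns 1 if neither date flavour parses it.
iso_to_epoch() {
  local ts="$1" epoch
  # GNU date parses ISO 8601 directly with -d.
  if epoch=$(date -u -d "$ts" +%s 2>/dev/null); then
    printf '%s\n' "$epoch"
    return 0
  fi
  # busybox date needs the input format spelled out with -D (GNU date rejects -D).
  if epoch=$(date -u -D '%Y-%m-%dT%H:%M:%SZ' -d "$ts" +%s 2>/dev/null); then
    printf '%s\n' "$epoch"
    return 0
  fi
  return 1
}

# Return 0 if at least WINDOW seconds passed between LAST_EVENT and NOW,
# 1 if not, and 2 (with a warning) if LAST_EVENT cannot be parsed.
check_quiet_window() {
  local last_event="$1" window_secs="$2" now="$3" event_epoch
  if ! event_epoch=$(iso_to_epoch "$last_event"); then
    echo "WARN: cannot parse timestamp '$last_event'; skipping settle-window check" >&2
    return 2
  fi
  (( now - event_epoch >= window_secs ))
}
```

Returning a distinct code for the parse failure lets the caller decide explicitly whether "skipped" should block the upgrade, instead of the check quietly passing as it did before the fix.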

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-17 13:45:05 +00:00
parent 042d1ce1ac
commit 037a609f27
4 changed files with 50 additions and 5 deletions

@@ -98,7 +98,20 @@ if [[ "$ROLE" == "master" ]]; then
 # right version (which is the only case where this timeout fires).
 attempt=1
 extra_flags=""
-while ! sudo kubeadm upgrade apply "v$RELEASE" -y $extra_flags; do
+# CoreDNS is managed OUTSIDE kubeadm on this cluster: the Corefile is a
+# custom split-horizon config owned by the technitium stack, and the image
+# is intentionally tracked separately. kubeadm's bundled corefile-migration
+# library rejects CoreDNS versions it doesn't know (e.g. 1.12.4 -> "start
+# version not supported"), which HARD-FAILS `upgrade apply` at preflight.
+# Forcing past preflight with --ignore alone is NOT enough — kubeadm would
+# then overwrite our custom Corefile with its default AND downgrade the
+# image (verified via `kubeadm upgrade apply --dry-run`, 2026-06-17). So we
+# also skip the coredns addon phase entirely: kubeadm leaves CoreDNS 100%
+# untouched and only upgrades the control-plane components. (Root fix: keep
+# CoreDNS off Keel — keel.sh/policy=never — so it stops drifting ahead of
+# kubeadm's migration table.)
+coredns_flags="--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins --skip-phases=addon/coredns"
+while ! sudo kubeadm upgrade apply "v$RELEASE" -y $coredns_flags $extra_flags; do
 if (( attempt >= 3 )); then
 echo "ERROR: kubeadm upgrade apply failed after 3 attempts" >&2
 exit 1
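
The hunk above passes its flags as unquoted strings and relies on word splitting, inside a loop capped at three attempts. One possible alternative shape, shown only as a sketch, keeps the flags in a bash array and factors the bounded retry into a helper. The `retry` function and the `RETRY_DELAY` variable are hypothetical and not part of the commit; the real loop may also adjust `extra_flags` between attempts, which this sketch does not model.

```shell
#!/usr/bin/env bash

# retry MAX CMD [ARGS...]: run CMD until it succeeds or MAX attempts are used up.
# Sleeps RETRY_DELAY seconds (default 5) between attempts.
retry() {
  local max="$1"
  shift
  local attempt=1
  while ! "$@"; do
    if (( attempt >= max )); then
      echo "ERROR: '$1' failed after $max attempts" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Flags from the commit, held as array elements so each one stays a single argument.
coredns_flags=(
  "--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins"
  "--skip-phases=addon/coredns"
)

# Intended call site (not executed here; needs a kubeadm control-plane node):
#   retry 3 sudo kubeadm upgrade apply "v$RELEASE" -y "${coredns_flags[@]}"
```

The array form behaves the same as the string form for these two flags; it mainly guards against surprises if a later flag value ever contains whitespace or glob characters.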