k8s-version-upgrade: unblock 1.34.9 — skip kubeadm CoreDNS addon + busybox-date fix
All checks were successful
ci/woodpecker/push/default Pipeline was successful
All checks were successful
ci/woodpecker/push/default Pipeline was successful
The 1.34.9 master upgrade hard-failed `kubeadm upgrade apply` preflight: CoreDNS
is at v1.12.4 (Keel auto-bumped it 1.12.1 -> 1.12.4 on 2026-05-26 via a stale
kube-system out-of-band annotation), and 1.12.4 is ahead of kubeadm 1.34.9's
bundled corefile-migration table ("start version not supported").
- scripts/update_k8s.sh: master `kubeadm upgrade apply` now runs with
`--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins
--skip-phases=addon/coredns`. A dry-run proved --ignore ALONE would overwrite
our custom split-horizon Corefile with kubeadm's default AND downgrade the
image; --skip-phases leaves CoreDNS 100% untouched while the control plane
upgrades. CoreDNS is pinned off Keel (keel.sh/policy=never) to stop the drift.
- stacks/k8s-version-upgrade/scripts/upgrade-step.sh: fix the preflight
quiet-baseline (settle-window) check, which silently no-op'd on the ghcr
claude-agent-service image's busybox `date` (can't parse ISO8601). Now tries
GNU then busybox `-D`, and warns+skips on parse failure (no silent fail-open).
- docs: runbook + architecture document the CoreDNS handling.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
042d1ce1ac
commit
037a609f27
4 changed files with 50 additions and 5 deletions
|
|
@ -309,7 +309,11 @@ each Job's pod and its drain target are always different nodes.
|
|||
ConfigMap, and a `template` ConfigMap into each Job pod.
|
||||
- **Per-node script**: `infra/scripts/update_k8s.sh`. Caller passes
|
||||
`--role master|worker --release X.Y.Z`. Piped via SSH into each node by
|
||||
upgrade-step.sh.
|
||||
upgrade-step.sh. The master path runs `kubeadm upgrade apply` with
|
||||
`--ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins
|
||||
--skip-phases=addon/coredns` so kubeadm never touches CoreDNS (custom Corefile
|
||||
+ separately-tracked image; CoreDNS is pinned off Keel via `keel.sh/policy=never`).
|
||||
See the runbook's "CoreDNS is NOT upgraded by kubeadm here".
|
||||
- **Four Upgrade Gates alerts**:
|
||||
- `K8sVersionSkew` — kubelet/apiserver `gitVersion` count >1 for 30m. Catches a half-done rollout.
|
||||
- `EtcdPreUpgradeSnapshotMissing` — `k8s_upgrade_in_flight==1 && k8s_upgrade_snapshot_taken==0` for 10m. Catches preflight failing silently.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue