k8s-version-upgrade: preflight skips kubeadm-plan gate when master already at target

The autonomous 1.34.9 version-upgrade chain has been failing its preflight every
night. A prior run left k8s-master + k8s-node1 on 1.34.9 while node2-6 stayed on
1.34.8, and preflight's gate-4 runs `kubeadm upgrade plan` on master. On an
already-at-target master, kubeadm prints no "kubeadm upgrade apply vX.Y.Z" line,
so the parsed target came back empty and the `!= requested` check aborted the
whole chain before any worker was touched. The failure was deterministic: the
failed preflight self-cleaned and re-failed identically each night, so it would
have failed again tonight, leaving node2-6 stuck on the old patch.
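
The empty parse can be reproduced offline. This is a minimal sketch, assuming a hypothetical helper `parse_plan_target` that wraps the same grep pipeline preflight uses; the sample plan text is illustrative only, not captured kubeadm output:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): read `kubeadm upgrade plan` text on
# stdin and print X.Y.Z from the first "kubeadm upgrade apply vX.Y.Z" line.
parse_plan_target() {
  grep -oE 'kubeadm upgrade apply v[0-9]+\.[0-9]+\.[0-9]+' \
    | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1 | tr -d v
}

# Illustrative input: a plan that still offers an upgrade.
pending=$(printf '%s\n' \
  'You can now apply the upgrade by executing the following command:' \
  '        kubeadm upgrade apply v1.34.9' | parse_plan_target)

# Illustrative input: any plan text without an "upgrade apply" line,
# standing in for what an at-target master prints.
at_target=$(printf '%s\n' 'placeholder text with no apply line' | parse_plan_target)

echo "pending='$pending' at_target='$at_target'"
# prints: pending='1.34.9' at_target=''
```

With an empty `at_target`, the `!= requested` comparison can never succeed, which matches the nightly abort described above.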

Fix: skip the kubeadm-plan-target gate when master is already on TARGET_VERSION,
mirroring the at-target self-skip that phase_master and phase_worker already do.
The remaining workers are still validated by their own per-node phases, and the
detector already confirmed the target is installable via apt-cache. This lets
tonight's unattended chain resume and finish node2-6 -> 1.34.9.
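
The gate's decision table can be sketched in isolation. This assumes a hypothetical pure function `plan_gate` that takes the three versions as arguments; the real `phase_preflight` obtains them via kubectl and SSH and reports to Slack, none of which is modelled here:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo).
# Args: $1 = master kubelet version, $2 = requested target,
#       $3 = target parsed from `kubeadm upgrade plan` (may be empty).
# Prints skip|pass|abort; returns 1 only on abort.
plan_gate() {
  local master_v=$1 target=$2 plan_target=$3
  if [ "$master_v" = "$target" ]; then
    echo "skip"    # master already at target: partial-chain resume
    return 0
  fi
  if [ "$plan_target" != "$target" ]; then
    echo "abort"   # plan offers something other than the requested version
    return 1
  fi
  echo "pass"
}

plan_gate 1.34.9 1.34.9 ""          # prints: skip
plan_gate 1.34.8 1.34.9 1.34.9      # prints: pass
plan_gate 1.34.8 1.34.9 "" || true  # prints: abort
```

Checking the master version first means the empty plan target is never compared in the resume case, while a fresh chain still gets the full check.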

Runbook updated: node count 5 -> 7, the gate skip note, and a Past Incidents
writeup (incl. the collateral apiserver OIDC wipe, restored via the rbac stack).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-18 09:17:46 +00:00
parent 8787d361dc
commit 70e217db24
2 changed files with 36 additions and 12 deletions

@@ -2,9 +2,9 @@
## Overview
-Kubernetes component versions (`kubeadm`/`kubelet`/`kubectl`) on the 5 K8s
-VMs are upgraded automatically by a weekly detection CronJob that seeds a
-chain of small phase Jobs. Each Job is **pinned to a node that is NOT its
+Kubernetes component versions (`kubeadm`/`kubelet`/`kubectl`) on the 7 K8s
+nodes (k8s-master + k8s-node1..6) are upgraded automatically by a nightly
+detection CronJob that seeds a chain of small phase Jobs. Each Job is **pinned to a node that is NOT its
drain target** — so no pod in the chain can preempt itself.
The chain (23:00 UTC nightly):
@@ -39,11 +39,11 @@ Job 0 — preflight (pinned: k8s-node1)
├── All nodes Ready + no Mem/Disk pressure
├── halt-on-alert (kured-style ignore-list)
├── 24h-quiet baseline (no Ready transitions <24h ago)
-├── kubeadm upgrade plan matches target
+├── kubeadm upgrade plan matches target (skipped when master already at target — partial-resume)
├── Push k8s_upgrade_in_flight=1, k8s_upgrade_started_timestamp=$(date +%s)
├── Trigger backup-etcd Job, wait, verify snapshot byte count
├── SSH master: containerd skew fix (if master < workers)
-├── SSH all 5 nodes: apt repo URL rewrite (only kind=minor)
+├── SSH all 7 nodes: apt repo URL rewrite (only kind=minor)
└── spawn_next → k8s-upgrade-master-<target_version>
@@ -356,6 +356,13 @@ kill %1
## Past Incidents
+### 2026-06-18 — Preflight gate-4 wedged a partial (master-ahead) chain
+- A prior 1.34.9 run upgraded k8s-master + k8s-node1, then stopped; node2-6 stayed on 1.34.8.
+- Every nightly preflight then aborted at the **kubeadm-plan-target gate**: `kubeadm upgrade plan` runs on k8s-master, already on 1.34.9, so it emitted no `kubeadm upgrade apply vX.Y.Z` line → empty `plan_target` → `'' != '1.34.9'` → `exit 1`. Deterministic, not transient (gates 1-3 all green; no critical alert was firing). The failed preflight self-cleaned each night (2026-06-17 retry-on-failure) but re-failed identically.
+- The two `in_flight`-based alerts stayed blind (preflight aborts pre-metric); `K8sUpgradeChainJobFailed` (warning) surfaced it.
+- **Collateral**: the earlier master bump had also dropped apiserver `--authentication-config` (SSO broke); restored separately via the `rbac` stack's `apiserver_oidc_config`.
+- **Mitigation**: `phase_preflight` now **skips the kubeadm-plan-target gate when k8s-master is already on TARGET_VERSION** (mirrors the at-target self-skip already in `phase_master`/`phase_worker`). Remaining workers are validated by their own phases; the detector's apt-cache probe already confirmed the target is installable.
### 2026-05-11 — Self-preemption (agent → Job-chain rewrite)
- The v1 agent ran inside the `claude-agent-service` Deployment (replicas=1, no nodeSelector) and was scheduled to k8s-node4.
- During Stage 6 (first worker drain) the agent ran `kubectl drain k8s-node4` — evicting itself.

@@ -356,13 +356,30 @@ phase_preflight() {
# on a Keel-drifted CoreDNS (start version unsupported) and, under pipefail,
# aborts this whole check. Ignore the two CoreDNS checks here too so plan
# still emits its "kubeadm upgrade apply vX.Y.Z" line. (See update_k8s.sh.)
-local plan_target
-plan_target=$(ssh "${SSH_OPTS[@]}" "$(ssh_target k8s-master)" 'sudo kubeadm upgrade plan --ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins' \
-  | grep -oE 'kubeadm upgrade apply v[0-9]+\.[0-9]+\.[0-9]+' \
-  | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1 | tr -d v)
-if [ "$plan_target" != "$TARGET_VERSION" ]; then
-  slack "ABORT preflight — kubeadm plan target $plan_target ≠ requested $TARGET_VERSION"
-  exit 1
+#
+# SKIP this gate when k8s-master is ALREADY on TARGET_VERSION — a partial-chain
+# resume (master + earlier workers done, later workers still pending). `kubeadm
+# upgrade plan` run on an at-target master prints NO "kubeadm upgrade apply
+# vX.Y.Z" line, so the parse below yields an EMPTY plan_target and the `!=`
+# check aborts every run — even though the chain just needs to finish the
+# remaining workers (phase_master self-skips an at-target master the same way,
+# below). Confirmed root cause of the 1.34.9 preflight aborts (2026-06-18):
+# master was already on 1.34.9 while node2-6 lagged on 1.34.8, so every nightly
+# preflight died here with an empty `plan target ≠ requested 1.34.9`.
+local master_kubelet_v
+master_kubelet_v=$($KUBECTL get node k8s-master -o jsonpath='{.status.nodeInfo.kubeletVersion}' 2>/dev/null | tr -d v)
+if [ "$master_kubelet_v" = "$TARGET_VERSION" ]; then
+  slack "preflight — k8s-master already on v$TARGET_VERSION; skipping kubeadm-plan-target gate (workers still pending)"
+  echo "k8s-master already on v$TARGET_VERSION — skipping kubeadm-plan-target gate"
+else
+  local plan_target
+  plan_target=$(ssh "${SSH_OPTS[@]}" "$(ssh_target k8s-master)" 'sudo kubeadm upgrade plan --ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins' \
+    | grep -oE 'kubeadm upgrade apply v[0-9]+\.[0-9]+\.[0-9]+' \
+    | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1 | tr -d v)
+  if [ "$plan_target" != "$TARGET_VERSION" ]; then
+    slack "ABORT preflight — kubeadm plan target $plan_target ≠ requested $TARGET_VERSION"
+    exit 1
+  fi
fi
# 5. Push in-flight + started_timestamp metrics + ns annotations