k8s-version-upgrade: version-check uses oldest kubelet, not master
Previous version-check read RUNNING from .items[0].status.nodeInfo.kubeletVersion, which is just k8s-master. If master is upgraded but workers aren't (e.g. a chain that completed master phase but failed mid-worker), the version-check sees v1.34.8, decides "no upgrade needed", and never spawns the resume phase. Workers stay behind forever.

Today's chain hit exactly this: master + node4 upgraded to v1.34.8, worker-node4 Failed mid-soak (alert sensitivity, since loosened), chain dead. Re-triggering the version-check looked at master only, decided the cluster was "done", and refused to resume the worker chain.

Fix: read all node kubelet versions, sort -V, take head -1 (oldest). A partial chain now correctly reports the un-upgraded version and the chain resumes.

Trivial change; tested live: chain now correctly reports v1.34.7 (workers' version) and spawns preflight → master → worker chain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
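A minimal sketch of the selection logic described in the fix, runnable without a cluster. The `sample_versions` function and its values are hypothetical stand-ins for the kubectl jsonpath output (one kubeletVersion per line), and `oldest_kubelet` is an illustrative name, not something defined in the repository. It assumes a `sort` that supports `-V`, such as GNU coreutils.

```shell
# Hypothetical sample standing in for the kubectl output:
# master and one node already upgraded, two workers still behind.
sample_versions() {
  printf '%s\n' v1.34.8 v1.34.7 v1.34.8 v1.34.7
}

# Same pipeline as the fix: strip the "v", version-sort ascending,
# keep the first (lowest) entry.
oldest_kubelet() {
  tr -d v | sort -V | head -1
}

RUNNING=$(sample_versions | oldest_kubelet)
RUNNING_MINOR=$(echo "$RUNNING" | awk -F. '{print $1"."$2}')
echo "Running version (oldest kubelet): v$RUNNING (minor $RUNNING_MINOR)"
```

With this sample the reported version is the workers' version, so a check built on it would still see pending work after a master-only upgrade.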
parent 68f8514e61
commit a0f3e15562
1 changed file with 9 additions and 3 deletions
@@ -370,11 +370,17 @@ resource "kubernetes_cron_job_v1" "k8s_version_check" {
   exit 0
 fi
 
-# 1. Detect running version
+# 1. Detect running version — use the OLDEST kubelet across
+# all nodes so partial chains (e.g. master upgraded but
+# workers still pending) don't trick the chain into
+# thinking the upgrade is complete. Was `.items[0]` (master
+# only) which made the chain skip when workers were behind.
+# Fixed 2026-05-23 after node4-only chain failure.
 RUNNING=$(/usr/local/bin/kubectl get nodes \
-  -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}' | tr -d v)
+  -o jsonpath='{range .items[*]}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
+  | tr -d v | sort -V | head -1)
 RUNNING_MINOR=$(echo "$RUNNING" | awk -F. '{print $1"."$2}')
-echo "Running version: v$RUNNING (minor $RUNNING_MINOR)"
+echo "Running version (oldest kubelet): v$RUNNING (minor $RUNNING_MINOR)"
 
 # 2. Latest patch within current minor (refresh master's apt cache)
 LATEST_PATCH=$($SSH wizard@k8s-master.viktorbarzin.lan \
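The change relies on `sort -V` rather than plain `sort`. A short sketch of why that matters once a patch number reaches two digits; the version list here is hypothetical, and `-V` is assumed to be available (it is a GNU coreutils extension).

```shell
# Hypothetical version list: one node on patch 10, one on patch 9.
versions() {
  printf '%s\n' 1.34.10 1.34.9
}

# Byte-wise sort compares character by character, so "1.34.10"
# sorts before "1.34.9" ('1' < '9') and head -1 picks the NEWER one.
LEXICAL_OLDEST=$(versions | LC_ALL=C sort | head -1)

# Version sort compares the numeric components, so 9 < 10
# and head -1 picks the genuinely older release.
VERSION_OLDEST=$(versions | sort -V | head -1)

echo "plain sort picks: $LEXICAL_OLDEST"
echo "sort -V picks:    $VERSION_OLDEST"
```

With plain `sort`, the oldest-kubelet check could report the upgraded version and skip the chain again, which is the failure this commit is meant to remove.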