k8s-upgrade: preflight kubeadm-plan gate must pass explicit target (minor-upgrade fix)

Last night's 1.34.9->1.35.6 run passed the ESO/kyverno compat gate (the migration
worked!) but ABORTED at the kubeadm-plan-target gate: it ran `kubeadm upgrade plan`
with NO version, so master's old 1.34.9 kubeadm auto-proposed only the current
minor (Loki: "falling back to stable-1.34") and plan_target != 1.35.6 -> abort.
That gate worked for patch upgrades but never for minors. Fix: pass the explicit
`v$TARGET_VERSION` (verified on master: `kubeadm upgrade plan v1.35.6` emits
"kubeadm upgrade apply v1.35.6"). Works for patches too. Applied live to the
ConfigMap before tonight's run; deleted the failed preflight-1-35-6 job.
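
The parsing half of this gate can be exercised without a cluster. A minimal sketch, assuming the plan text arrives on stdin; `extract_plan_target` and `plan_matches_target` are hypothetical helper names, not functions in the upgrade script, and the grep patterns are copied from the gate itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubeadm upgrade plan` output on stdin and prints
# the first proposed version without the leading "v" (empty if none proposed).
extract_plan_target() {
  grep -oE 'kubeadm upgrade apply v[0-9]+\.[0-9]+\.[0-9]+' \
    | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1 | tr -d v
}

# Hypothetical gate wrapper: succeeds only when the plan proposes exactly the
# requested target. An empty result (no proposal at all) also fails.
plan_matches_target() {
  local target="$1" plan_target
  plan_target=$(extract_plan_target)
  [ "$plan_target" = "$target" ]
}
```

For example, `echo 'kubeadm upgrade apply v1.35.6' | plan_matches_target 1.35.6` succeeds, while a plan that only proposes a 1.34.x patch, or proposes nothing, fails the comparison, which is the abort path described above.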

Also: ESO 2.x took SSA ownership of .spec.refreshInterval, so terraform's apply of
the k8s-upgrade-creds ExternalSecret hit a field-manager conflict. Added
field_manager.force_conflicts=true (benign — interval is semantically identical).
This pattern affects all 104 migrated ESs fleet-wide (follow-up).
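
For the fleet-wide follow-up, it may help to check which field manager currently owns `.spec.refreshInterval` on a given ExternalSecret. A rough sketch, assuming `jq` is installed and the object is fetched with managed fields included; `refresh_interval_owners` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one object as JSON on stdin (fetched with
# --show-managed-fields) and prints every manager whose managedFields entry
# claims .spec.refreshInterval.
refresh_interval_owners() {
  jq -r '.metadata.managedFields[]
         | select(.fieldsV1["f:spec"]["f:refreshInterval"] != null)
         | .manager'
}
```

Possible usage, where `$NAMESPACE` stands in for the ExternalSecret's namespace (not stated here): `kubectl get externalsecret k8s-upgrade-creds -n "$NAMESPACE" -o json --show-managed-fields | refresh_interval_owners`. If a manager other than terraform's is listed, a terraform apply that sets that field would be expected to hit the same conflict unless conflicts are forced.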

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-24 06:06:14 +00:00
parent 98d2b89614
commit 566447a698
2 changed files with 18 additions and 1 deletions


@@ -131,6 +131,15 @@ resource "kubernetes_manifest" "external_secret" {
 ]
 }
 }
+# ESO 2.x took SSA ownership of .spec.refreshInterval (it normalizes the value),
+# which conflicts with terraform's apply of this ExternalSecret. force_conflicts
+# lets terraform reassert its spec; the interval is semantically identical, so
+# this is benign. Surfaced after the ESO 0.12->2.6 migration (2026-06-24); the
+# same pattern affects all migrated ExternalSecrets fleet-wide.
+field_manager {
+  force_conflicts = true
+}
 }
 # --- Unified ServiceAccount + RBAC ---


@@ -398,8 +398,16 @@ phase_preflight() {
 slack "preflight — k8s-master already on v$TARGET_VERSION; skipping kubeadm-plan-target gate (workers still pending)"
 echo "k8s-master already on v$TARGET_VERSION — skipping kubeadm-plan-target gate"
 else
+# Pass the EXPLICIT target version. Without it, `kubeadm upgrade plan` (run by
+# the OLD kubeadm still on master) auto-proposes only the current minor's
+# latest patch — fine for a patch upgrade, but for a MINOR upgrade it never
+# proposes the next minor, so plan_target came back as 1.34.x (or empty,
+# "falling back to stable-1.34") and this gate ABORTED every 1.35 attempt
+# (blocked last night's 1.34.9->1.35.6 run even though the compat gate passed,
+# 2026-06-24). `kubeadm upgrade plan v1.35.6` correctly emits the
+# "kubeadm upgrade apply v1.35.6" line (verified on master). Works for patches too.
 local plan_target
-plan_target=$(ssh "${SSH_OPTS[@]}" "$(ssh_target k8s-master)" 'sudo kubeadm upgrade plan --ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins' \
+plan_target=$(ssh "${SSH_OPTS[@]}" "$(ssh_target k8s-master)" "sudo kubeadm upgrade plan v$TARGET_VERSION --ignore-preflight-errors=CoreDNSMigration,CoreDNSUnsupportedPlugins" \
 | grep -oE 'kubeadm upgrade apply v[0-9]+\.[0-9]+\.[0-9]+' \
 | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1 | tr -d v)
 if [ "$plan_target" != "$TARGET_VERSION" ]; then