infra/stacks/k8s-version-upgrade
Viktor Barzin 8aff0ba1a2 k8s-version-upgrade: fix two more grep-pipefail bugs
Same `grep -v` / `set -o pipefail` interaction as commit 10b261d2,
in two more callsites the previous fix didn't cover:

  Line 354 (phase_master): control-plane Running check.
    `grep -v Running | wc -l` exits with status 1 when all pods are
    Running (the happy path), because `grep -v` selects no lines and
    pipefail propagates its exit status. This aborts the chain right
    after the master upgrade.

  Line 419 (phase_postflight): on-target node check.
    `grep -v ":v$TARGET_VERSION$" | wc -l` exits with status 1 when all
    nodes are on the target version (the happy path, exactly when
    postflight should succeed), aborting at the moment of victory.

Forensics on yesterday's master Job failure (see commit message of
10b261d2 for context): the master Job spawned 16s after the previous
fix's TF apply, before configmap propagation completed on the kubelet.
With those two latent bugs also looming, the chain would have died
post-master-upgrade and again at postflight even if propagation had
been timely.

Fix: wrap each grep in `{ ... || true; }` so that a no-match result
returns success instead of failing the pipeline.
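
A minimal sketch of the interaction and the guard, for reference. The pod
lists and the function names `count_unguarded` / `count_guarded` are
hypothetical stand-ins, not the actual code at lines 354 and 419; the real
script feeds kubectl output into the pipeline.

```bash
#!/usr/bin/env bash
set -o pipefail

# Hypothetical stand-ins for `kubectl get pods` output.
all_running=$'kube-apiserver-master Running\netcd-master Running'
one_pending=$'kube-apiserver-master Running\netcd-master Pending'

# Unguarded form: on the happy path grep -v selects no lines and exits 1,
# and pipefail makes that the exit status of the whole pipeline.
count_unguarded() {
  printf '%s\n' "$1" | grep -v Running | wc -l
}

# Guarded form: the brace group maps grep's non-zero status to 0,
# so only the count printed by wc -l decides what happens next.
count_guarded() {
  printf '%s\n' "$1" | { grep -v Running || true; } | wc -l
}

count_unguarded "$all_running" >/dev/null || echo "unguarded: exit status $?"
echo "guarded, all Running: $(count_guarded "$all_running")"
echo "guarded, one Pending: $(count_guarded "$one_pending")"
```

Note that `|| true` also hides genuine grep errors (exit status 2, e.g. a
bad pattern). A stricter guard such as `|| [ $? -eq 1 ]` would accept only
the no-match case; that variant is a suggestion, not what this commit does.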

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 14:17:00 +00:00
scripts k8s-version-upgrade: fix two more grep-pipefail bugs 2026-05-22 14:17:00 +00:00
job-template.yaml k8s-version-upgrade: decompose into Job chain to fix self-preemption 2026-05-22 14:16:45 +00:00
main.tf k8s-version-upgrade: switch detection cron from weekly to daily 2026-05-22 14:16:57 +00:00
terragrunt.hcl k8s-version-upgrade: automated kubeadm/kubelet/kubectl upgrade pipeline 2026-05-22 14:16:42 +00:00