Make k8s upgrades (patch AND minor) autonomous without being reckless: the chain
attempts every upgrade but refuses unless it can prove the target is safe. A
refusal is a BLOCK (not a crash): it halts the chain and signals for attention.
- compat-gate.py: read-only preflight check. Blocks if (a) a critical addon's
running version doesn't support the target k8s minor, (b) an in-use deprecated
API (apiserver_requested_deprecated_apis) is removed at/before the target, or
(c) a node's containerd is below the target's floor. Validated against the live
cluster: correctly blocks 1.35/1.36 today on Calico 3.26 / ESO 0.12 / Kyverno
1.16 (none supports those minors yet), which is exactly the auto-halt we want
until they're bumped.
- addon-compat.json: curated addon -> max-supported-k8s matrix (Calico, ESO,
Kyverno, gpu-operator + containerd floor), sourced from each project's compat
docs as of 2026-06-19. This is the keystone data the gate reads; keep it
current.
- upgrade-step.sh: phase_preflight runs the gate FIRST (before any mutation);
block() pushes k8s_upgrade_blocked=1 + Slacks the reasons + halts.
- main.tf: detector minor-probe fix (curl -sILo, so the 302 from pkgs.k8s.io
is followed to its final 200; before the fix, minors were never detected).
Detection sits behind the compat gate above, so enabling it can't roll an
unsafe minor.
Not pushed yet: deploys with the K8sUpgradeBlocked alert + deeper postflight +
runbook (next commit) so the detector fix only goes live with the full net.
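
For reviewers, the three block conditions (a)-(c) in compat-gate.py can be
sketched roughly as below. This is an illustration, not the shipped script:
check_target, the COMPAT layout, the addon/API/node names and every version
number are hypothetical placeholders, and the inputs (running addon versions,
in-use deprecated APIs with their removal release, per-node containerd
versions) are assumed to be collected already. Missing matrix entries block,
on the reading that the gate refuses unless it can prove safety.

```python
def parse_version(text):
    """Turn 'v1.35.2' or '1.35' into a tuple of ints, e.g. (1, 35, 2)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def minor_key(text):
    """Reduce a version to its 'major.minor' string, e.g. '3.26.4' -> '3.26'."""
    return ".".join(str(n) for n in parse_version(text)[:2])


def check_target(target, compat, addons, deprecated_in_use, node_containerd):
    """Return a list of block reasons; an empty list means the gate passes."""
    reasons = []
    target_minor = parse_version(target)[:2]

    # (a) each critical addon's running series must support the target minor.
    for name, running in sorted(addons.items()):
        max_k8s = compat["addons"].get(name, {}).get(minor_key(running))
        if max_k8s is None:
            # Unknown addon or series: cannot prove safety, so block.
            reasons.append(f"{name} {running}: not in compat matrix")
        elif parse_version(max_k8s)[:2] < target_minor:
            reasons.append(f"{name} {running}: supports k8s <= {max_k8s}")

    # (b) no in-use deprecated API may be removed at or before the target.
    for api, removed_in in sorted(deprecated_in_use.items()):
        if parse_version(removed_in)[:2] <= target_minor:
            reasons.append(f"{api}: removed in k8s {removed_in}")

    # (c) every node's containerd must meet the target's floor.
    floor = compat["containerd_floor"].get(minor_key(target))
    if floor is None:
        reasons.append(f"no containerd floor for k8s {minor_key(target)}")
    else:
        for node, version in sorted(node_containerd.items()):
            if parse_version(version) < parse_version(floor):
                reasons.append(f"{node}: containerd {version} < floor {floor}")

    return reasons


# Placeholder matrix: invented names and numbers, NOT real compatibility data.
COMPAT = {
    "addons": {"example-cni": {"3.26": "1.28", "3.30": "1.33"}},
    "containerd_floor": {"1.33": "1.7.0"},
}

if __name__ == "__main__":
    for reason in check_target(
        "1.33.1",
        COMPAT,
        addons={"example-cni": "3.26.4"},
        deprecated_in_use={"example.io/v1beta1 widgets": "1.32"},
        node_containerd={"node-a": "1.6.28"},
    ):
        print("BLOCK:", reason)
```

The sketch only compares plain dotted numeric versions; pre-release suffixes
such as "-rc.1" would need extra handling.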