k8s-version-upgrade: make tigera-operator restore crash-safe (EXIT trap)
All checks were successful
ci/woodpecker/push/default Pipeline was successful
phase_master quiesces tigera-operator (Calico's config reconciler) to 0 replicas
around the master upgrade so it can't crashloop during the apiserver blip and
I/O-storm kubeadm's static-pod-hash watch (which would roll the upgrade back).
The restore was a plain line at the end of the phase, so any abort AFTER
quiescing left the operator at 0, and the idempotent retry then skipped the
already-on-target master phase and never restored it. Observed 2026-06-17: a
post-upgrade gate aborted the master attempt; the operator sat scaled to 0 for
~1.5h (data plane fine, since calico-node keeps running, but no Calico
reconciliation).
Fix:
- Drain first (drain doesn't blip the apiserver), THEN quiesce right before
  `kubeadm upgrade apply`, and install an EXIT trap that restores the operator
  no matter how the phase exits (gate abort, set -e on ssh/kubeadm, success).
  The trap is set AFTER drain_node so drain_node's own EXIT trap can't clobber
  it, and is cleared after the explicit happy-path restore.
- postflight also force-restores replicas=1 as a final guarantee (covers the
  skip-on-retry path that never quiesces or restores).
Long-term fix remains an HA control plane (apiserver never goes down); see bead code-n0ow.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
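The quiesce/restore guard described in the fix can be sketched in isolation. This is a minimal sketch, not the script's actual code: scale_operator, replicas, and run_master_phase are hypothetical names, a temp file stands in for the Deployment's replica count so the sketch runs without a cluster, and a subshell stands in for one run of the upgrade script.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for:
#   $KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=N
# It records the replica count in a temp file instead of talking to a cluster.
STATE_FILE=$(mktemp)
scale_operator() { echo "$1" > "$STATE_FILE"; }
replicas() { cat "$STATE_FILE"; }

# One run of the master phase. The subshell plays the role of the script
# process, so its EXIT trap fires when the "script" ends, however it ends.
run_master_phase() {
  local outcome=$1
  (
    set -e
    scale_operator 0                        # quiesce right before the risky window
    trap 'scale_operator 1 || true' EXIT    # safety net from here on
    if [ "$outcome" = "abort" ]; then
      exit 1                                # e.g. a post-upgrade gate fires
    fi
    scale_operator 1                        # explicit happy-path restore
    trap - EXIT                             # safety net no longer needed
  )
}

scale_operator 1
run_master_phase abort || echo "phase aborted, replicas now: $(replicas)"
run_master_phase ok && echo "phase succeeded, replicas now: $(replicas)"
```

Both runs end with the replica count back at 1; on the abort path it gets there only through the trap.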
parent c04efa3d3a
commit b931d9fb20
1 changed file with 20 additions and 8 deletions
@@ -466,17 +466,22 @@ phase_master() {
 # keeps running unchanged — only the OPERATOR (a config reconciler) goes away
 # briefly. Restored at the end of the phase below.
 #
-# If the chain dies between quiesce and restore (e.g. kubeadm fails),
-# manually restore with:
-# kubectl -n tigera-operator scale deploy tigera-operator --replicas=1
-#
 # Long-term fix: HA control plane (3 masters) so apiserver never goes down
 # — see docs/plans/2026-05-21-ha-control-plane-{design,plan}.md (beads code-n0ow).
-echo "Quiescing tigera-operator before master upgrade (it crashes on apiserver outage)"
-$KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=0 2>&1 || true
-
 drain_node k8s-master

+# Quiesce tigera-operator ONLY for the kubeadm window. Drain happens FIRST (it
+# doesn't blip the apiserver — only the static-pod swaps in `kubeadm upgrade
+# apply` do), then quiesce right before that. The EXIT trap GUARANTEES the
+# operator is restored even if any step below aborts: on 2026-06-17 the master
+# run aborted on a post-upgrade gate AFTER quiescing, the idempotent retry then
+# skipped the (now already-on-target) master phase, and the operator sat at 0
+# for ~1.5h. The trap is set AFTER drain_node so drain_node's own EXIT trap
+# (background predrain-watcher cleanup) can't clobber it.
+echo "Quiescing tigera-operator for the kubeadm window (it crashes on apiserver outage)"
+$KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=0 2>&1 || true
+trap '$KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=1 >/dev/null 2>&1 || true' EXIT
+
 slack "Running update_k8s.sh on k8s-master (--role master --release $TARGET_VERSION)"
 ssh "${SSH_OPTS[@]}" "$(ssh_target k8s-master)" 'bash -s' \
 < "$UPDATE_K8S_SH" -- --role master --release "$TARGET_VERSION"
@@ -499,9 +504,10 @@ phase_master() {
 alerts=$(halt_on_alert_query "RecentNodeReboot|IngressTTFBCritical")
 [ -n "$alerts" ] && { slack "ABORT master — alerts firing post-upgrade: $alerts"; exit 1; }

-# Restore tigera-operator (quiesced before drain). It reconciles in seconds.
+# Restore tigera-operator (happy path) + clear the safety-net EXIT trap.
 echo "Restoring tigera-operator"
 $KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=1 2>&1 || true
+trap - EXIT

 slack "Master on v$TARGET_VERSION, control-plane Running. Dispatching worker chain."
 }
@@ -568,6 +574,12 @@ phase_worker() {
 phase_postflight() {
 slack "Running postflight"

+# Belt-and-suspenders: ensure tigera-operator is back to 1 at chain end. The
+# master phase's EXIT trap already restores it, but a master phase that was
+# skipped-on-retry (master already on target) never quiesces OR restores, so
+# if an earlier aborted attempt left it down this is the final guarantee.
+$KUBECTL -n tigera-operator scale deploy tigera-operator --replicas=1 >/dev/null 2>&1 || true
+
 # All nodes at target
 local versions wrong
 versions=$($KUBECTL get nodes -o jsonpath='{range .items[*]}{.metadata.name}:{.status.nodeInfo.kubeletVersion}{"\n"}{end}')
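The ordering note in phase_master (install the restore trap only after drain_node) follows from bash keeping a single EXIT handler per shell: a later `trap ... EXIT` replaces the earlier one rather than stacking on top of it. A small sketch of that behavior, where helper_with_trap is a hypothetical stand-in for drain_node and the echo strings stand in for the real cleanup and restore commands:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for drain_node, which (per the comment in phase_master)
# installs its own EXIT trap for watcher cleanup.
helper_with_trap() { trap 'echo "helper cleanup"' EXIT; }

# Wrong order: the guard is installed first, then the helper replaces it.
wrong_order=$(
  trap 'echo "restore operator"' EXIT
  helper_with_trap
  exit 1
)

# Order used in the fix: helper first, guard last, so the guard is what runs.
right_order=$(
  helper_with_trap
  trap 'echo "restore operator"' EXIT
  exit 1
)

echo "wrong order ran: $wrong_order"
echo "right order ran: $right_order"
```

The flip side is that in the second ordering the helper's handler is the one that gets replaced. That is only safe if the helper's cleanup is no longer needed by then, which is an assumption about drain_node rather than something the diff shows.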