k8s-version-upgrade: decompose into Job chain to fix self-preemption
The agent-based v1 ran inside claude-agent-service (replicas=1, no
nodeSelector) and evicted itself when it tried to drain its own host
(k8s-node4, on 2026-05-11). The cluster was left half-upgraded (master
v1.34.7, workers v1.34.2) until manual recovery.
Rewrite the pipeline as a chain of nodeSelector-pinned Jobs:
  preflight (k8s-node1)
  → master (k8s-node1) drains k8s-master
  → worker × 3 (k8s-node1) drains k8s-node{4,3,2}, one node per Job
  → worker (k8s-master + control-plane toleration) drains k8s-node1
  → postflight (no pinning)
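The chain above can be read as a small state machine: each step knows which step follows it and where that step must be pinned. A minimal sketch, assuming the order listed above; `next_step` is a hypothetical helper and is not taken from upgrade-step.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the Job chain; not the commit's own code.
# Usage: next_step <phase> [target-node]
# Prints "<next phase> <next target node> <node the next Job is pinned to>",
# with "-" meaning "none".
next_step() {
  case "$1:${2:-}" in
    preflight:*)      echo "master k8s-master k8s-node1" ;;
    master:*)         echo "worker k8s-node4 k8s-node1" ;;
    worker:k8s-node4) echo "worker k8s-node3 k8s-node1" ;;
    worker:k8s-node3) echo "worker k8s-node2 k8s-node1" ;;
    # k8s-node1 hosts the earlier Jobs, so its own drain runs from k8s-master.
    worker:k8s-node2) echo "worker k8s-node1 k8s-master" ;;
    worker:k8s-node1) echo "postflight - -" ;;
    *)                echo "unknown step: $1 ${2:-}" >&2; return 1 ;;
  esac
}
```

The point of the design is that no Job ever drains the node it is running on.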
Each Job runs scripts/upgrade-step.sh (case-on-$PHASE) and ends by
envsubst-ing job-template.yaml into the next Job. Deterministic names
(k8s-upgrade-<phase>-<target_version>[-<node>]) make `kubectl apply`
idempotent: a failed Job can be re-created without duplicating
downstream Jobs.
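A minimal sketch of how such a deterministic name could be built. `job_name` is a hypothetical helper, and replacing the dots in the version with dashes is an assumption (it mirrors the X-Y-Z label form); the real script may keep X.Y.Z in the name:

```shell
#!/usr/bin/env bash
# Hypothetical helper; the actual naming code in upgrade-step.sh may differ.
# Usage: job_name <phase> <target-version> [node]
# Builds k8s-upgrade-<phase>-<target_version>[-<node>].
job_name() {
  # Assumption: dots become dashes, as in the X-Y-Z label form.
  printf 'k8s-upgrade-%s-%s' "$1" "$(printf '%s' "$2" | tr '.' '-')"
  # The node suffix is only added for per-node phases.
  if [ -n "${3:-}" ]; then
    printf '%s' "-$3"
  fi
  printf '\n'
}
```

Because the same inputs always yield the same name, re-running a step addresses the same Job object instead of creating a second one.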
Also lands `predrain_unstick`: deletes pods on the target node whose PDB
has 0 disruptionsAllowed. Without this, drain loops indefinitely on
single-replica deployments (e.g. every Anubis instance — discovered the
hard way during 2026-05-11 manual recovery of k8s-node3).
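A rough sketch of what a `predrain_unstick`-style step might look like. This is not the commit's implementation: it assumes `kubectl` is on PATH, that the blocking PDBs select pods through `matchLabels` only, and that deleting the pod outright (which bypasses the eviction API and therefore the PDB) is acceptable for these workloads:

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real predrain_unstick may be written differently.
# Usage: predrain_unstick <node>
predrain_unstick() {
  # Print "<namespace> <name>" for every PDB that currently allows 0 disruptions.
  kubectl get pdb --all-namespaces \
    -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' |
  while read -r ns pdb; do
    [ -n "$pdb" ] || continue
    # Turn the PDB's matchLabels into a "k=v,k=v" label selector.
    selector=$(kubectl get pdb "$pdb" -n "$ns" \
      -o go-template='{{range $k, $v := .spec.selector.matchLabels}}{{$k}}={{$v}},{{end}}')
    selector="${selector%,}"
    # Skip PDBs without matchLabels rather than risk matching every pod.
    [ -n "$selector" ] || continue
    # Delete only the matching pods that are scheduled on the node being drained.
    kubectl delete pod -n "$ns" -l "$selector" \
      --field-selector "spec.nodeName=$1" --wait=false </dev/null
  done
}
```

Running it just before `kubectl drain` lets the owning controller reschedule the pod elsewhere, so the drain no longer waits on an eviction the PDB can never allow.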
Adds the K8sUpgradeStalled alert, which fires when an upgrade is
in_flight and its started_timestamp is more than 90 min old.
Deprecates the agent prompt (renamed to *.deprecated.md with a header
pointer to the new code).
Apply order: k8s-version-upgrade first (consumes new SA + ConfigMaps),
then monitoring (loads the new alert). Both applied 2026-05-11.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 8e13f1528e
commit 448bc0c0f6
7 changed files with 1063 additions and 394 deletions
88
stacks/k8s-version-upgrade/job-template.yaml
Normal file
@@ -0,0 +1,88 @@
# k8s-upgrade-chain Job template.
#
# Rendered by `envsubst` inside upgrade-step.sh (and the detection CronJob)
# before `kubectl apply`. All ${VAR} placeholders are envsubst-side; this file
# is NOT processed by Terraform.
#
# Required environment for envsubst:
#   JOB_NAME              unique-per-(phase, target_version[, target_node])
#   PHASE_NEXT            phase the Job runs (preflight|master|worker|postflight)
#   TARGET_NODE_NEXT      node the Job operates on (empty for preflight/postflight)
#   TARGET_VERSION        X.Y.Z
#   TARGET_VERSION_LABEL  X-Y-Z (label-safe)
#   KIND                  patch | minor
#   IMAGE                 container image to run upgrade-step.sh
#   SCHEDULING_BLOCK      YAML fragment with nodeSelector/tolerations (may be empty)
#
# Idempotency: name is deterministic per (phase, target_version[, target_node])
# so `kubectl apply` reconciles to a single Job per run.
apiVersion: batch/v1
kind: Job
metadata:
  name: ${JOB_NAME}
  namespace: k8s-upgrade
  labels:
    app: k8s-upgrade-chain
    phase: ${PHASE_NEXT}
    target-version: "${TARGET_VERSION_LABEL}"
spec:
  ttlSecondsAfterFinished: 604800 # 7 days for postmortem review
  backoffLimit: 1
  template:
    metadata:
      labels:
        app: k8s-upgrade-chain
        phase: ${PHASE_NEXT}
    spec:
      serviceAccountName: k8s-upgrade-job
      restartPolicy: Never
      ${SCHEDULING_BLOCK}
      imagePullSecrets:
        - name: registry-credentials
      containers:
        - name: upgrade-step
          image: ${IMAGE}
          env:
            - name: PHASE
              value: "${PHASE_NEXT}"
            - name: TARGET_NODE
              value: "${TARGET_NODE_NEXT}"
            - name: TARGET_VERSION
              value: "${TARGET_VERSION}"
            - name: KIND
              value: "${KIND}"
            - name: IMAGE
              value: "${IMAGE}"
            - name: HOME
              value: "/tmp"
          command: ["/bin/bash", "/scripts/upgrade-step.sh"]
          volumeMounts:
            - name: creds
              mountPath: /secrets/k8s-upgrade
              readOnly: true
            - name: scripts
              mountPath: /scripts
              readOnly: true
            - name: template
              mountPath: /template
              readOnly: true
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              memory: "512Mi"
      volumes:
        - name: creds
          secret:
            secretName: k8s-upgrade-creds
            # 0444 so the non-root container can read; upgrade-step.sh copies
            # the SSH key to /tmp/ssh_key with mode 0400 for openssh.
            defaultMode: 0444
        - name: scripts
          configMap:
            name: k8s-upgrade-scripts
            defaultMode: 0755
        - name: template
          configMap:
            name: k8s-upgrade-job-template
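The template's header says it is rendered with `envsubst` and that SCHEDULING_BLOCK carries the nodeSelector/tolerations fragment. A minimal sketch of that rendering step follows. Both helpers are hypothetical, the file name under /template is an assumption (the ConfigMap key is not shown here), and the fragment's indentation assumes the placeholder sits at the pod spec level with six leading spaces:

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not taken from upgrade-step.sh.

# scheduling_block <pin-node> [control-plane]
# Prints the text substituted for ${SCHEDULING_BLOCK}. The first line carries no
# leading spaces because the template supplies them; the following lines assume
# the placeholder is indented by 6 spaces.
scheduling_block() {
  # No pin node (e.g. postflight): print nothing so the placeholder renders empty.
  if [ -z "${1:-}" ]; then
    return 0
  fi
  printf 'nodeSelector:\n'
  printf '        kubernetes.io/hostname: %s\n' "$1"
  if [ "${2:-}" = "control-plane" ]; then
    # Tolerate the standard kubeadm control-plane taint.
    printf '      tolerations:\n'
    printf '        - key: node-role.kubernetes.io/control-plane\n'
    printf '          operator: Exists\n'
    printf '          effect: NoSchedule\n'
  fi
}

# render_job [template-path]
# Substitutes only the placeholders the template documents, so any other "$"
# in the file passes through untouched. The variables must be exported first.
render_job() {
  envsubst '${JOB_NAME} ${PHASE_NEXT} ${TARGET_NODE_NEXT} ${TARGET_VERSION} ${TARGET_VERSION_LABEL} ${KIND} ${IMAGE} ${SCHEDULING_BLOCK}' \
    < "${1:-/template/job-template.yaml}"
}
```

A step would then export the eight variables, for example with SCHEDULING_BLOCK set to the output of `scheduling_block k8s-master control-plane`, and pipe `render_job` into `kubectl apply -f -`.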