k8s-version-upgrade: switch detection cron from weekly to daily

Was `0 12 * * 0` (Sun 12:00 UTC), so patch releases waited up to 6 days
before the chain picked them up. Now `0 12 * * *` (daily 12:00 UTC, still
outside kured's 02:00-06:00 London window).

Concurrency is bounded by Forbid + deterministic job-name idempotency (the
detection job exits early if a preflight Job for the same target already
exists), so back-to-back days can't pile up parallel runs.

- stacks/k8s-version-upgrade/main.tf: var.schedule default + rationale comment
- scripts/upgrade_state.sh: rename next_sunday_noon_utc -> next_daily_noon_utc (now returns "Tue 2026-05-19 12:00 UTC" form); change "(Sun cron)" label to "(daily cron)"
- .claude/skills/upgrade-state/SKILL.md: cadence column + frontmatter

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
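
The idempotency guard in the commit message can be sketched as follows: derive the Job name from the target version alone, then skip creation when a Job with that name already exists. The `k8s-upgrade-preflight-` prefix, the `k8s-upgrade` namespace, and both function names are hypothetical. The commit does not show the real naming scheme or manifests, so they may differ.

```bash
# Hypothetical sketch, not the repo's actual detection script.
# Assumption: Job names depend only on the target version, so two detection
# runs for the same target always compute the same name.
preflight_job_name() {
  # Lowercase and swap dots for dashes so the name stays a simple DNS label,
  # e.g. v1.34.2 -> k8s-upgrade-preflight-v1-34-2 (the prefix is made up).
  printf 'k8s-upgrade-preflight-%s\n' \
    "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '.' '-')"
}

spawn_preflight_once() (
  target="$1"
  ns="${2:-k8s-upgrade}"   # hypothetical namespace
  job="$(preflight_job_name "$target")"
  # `kubectl get job NAME` exits non-zero when the Job does not exist
  # (and also on API errors, which a real script should tell apart).
  if kubectl -n "$ns" get job "$job" >/dev/null 2>&1; then
    echo "preflight Job $job already exists; exiting early"
  else
    # A real script would create the Job here instead of printing.
    echo "would create preflight Job $job"
  fi
)
```

With a guard of this shape, a second daily run for the same target is a no-op, while Forbid (presumably the CronJob's `concurrencyPolicy`) keeps two detection runs from overlapping in the first place.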
parent 018ef3790f
commit 3d43d96a5e
3 changed files with 22 additions and 21 deletions
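
For reference, the renamed `next_daily_noon_utc` helper mentioned in the commit message could be implemented roughly as below. This is a sketch assuming GNU `date`. The optional epoch argument is an addition for testability and is not taken from the commit, and the real function in `scripts/upgrade_state.sh` may be written differently.

```bash
# Hypothetical sketch; the real helper in scripts/upgrade_state.sh may differ.
# Assumes GNU date (for -d). The optional epoch argument is an addition for
# testing and defaults to the current time.
next_daily_noon_utc() (
  now="${1:-$(date -u +%s)}"
  # Calendar day (UTC) that contains "now", then 12:00 UTC on that day.
  today="$(date -u -d "@${now}" +%F)"
  noon="$(date -u -d "${today} 12:00:00" +%s)"
  # If today's noon has already passed (or is right now), use tomorrow's.
  if [ "$now" -ge "$noon" ]; then
    noon=$(( noon + 86400 ))   # UTC has no DST, so a day is always 86400 s
  fi
  # LC_ALL=C keeps the weekday abbreviation in English.
  LC_ALL=C date -u -d "@${noon}" '+%a %F 12:00 UTC'
)
```

Called with the epoch for 2026-05-18 13:00 UTC, this prints `Tue 2026-05-19 12:00 UTC`, matching the form quoted in the commit message.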
@@ -8,7 +8,7 @@ description: |
 (2) User asks "what's pending upgrade" or "what's the upgrade state",
 (3) User asks if Keel / kured / k8s-version-check is healthy,
 (4) User asks about kept-back / held packages or pending reboots,
-(5) Before the Sunday `k8s-version-check` CronJob fires (weekly survey).
+(5) Periodic survey before the next `k8s-version-check` daily run.
 Read-only — no `--fix`. Exits 0 healthy / 1 attention / 2 stalled.
 author: Claude Code
 version: 1.0.0
@@ -51,7 +51,7 @@ Exit codes: `0` healthy, `1` attention warranted, `2` stalled / broken.
 |---|---|---|---|
 | **Apps** | Keel polls every watched Deployment's container registry; rolls on new digest | hourly | Prom (`pending_approvals`, `registries_scanned_total`), Keel pod logs |
 | **OS** | `unattended-upgrades` in-release patching; `kured` reboots when `/var/run/reboot-required` is set | daily 02:00-06:00 London | SSH fan-out to all 5 nodes |
-| **K8s** | `k8s-version-check` CronJob detects new kubeadm patch/minor; spawns the Job-chain that drains+upgrades node-by-node | Sun 12:00 UTC | Pushgateway (`k8s_upgrade_*`), `kubectl get nodes` |
+| **K8s** | `k8s-version-check` CronJob detects new kubeadm patch/minor; spawns the Job-chain that drains+upgrades node-by-node | daily 12:00 UTC | Pushgateway (`k8s_upgrade_*`), `kubectl get nodes` |
 
 The K8s pipeline pushes a small set of gauges to the Prometheus
 Pushgateway (`prometheus-prometheus-pushgateway.monitoring:9091`):
@@ -138,8 +138,9 @@ kubectl -n kured get pods -l name=kured-sentinel-gate
 
 ### K8s `→` — patch/minor available
 
-Detection ran, target identified, chain NOT started. This is normal
-between Sun 12:00 UTC detection and the next Job chain.
+Detection ran, target identified, chain NOT started. The chain spawns
+on the same daily detection cycle — typically within ~24h of the
+target first being detected.
 
 ```bash
 # Inspect Pushgateway state