kured: drop Mon-Fri restriction, reboot any day

The weekday-only schedule was a 2026-03-16-incident-era guardrail when the rest of the safety net was thin. Today's gates make weekend reboots safe and just clear the backlog faster: halt-on-alert, sentinel-gate Check 4 (24h soak via node Ready transitions), the K8sUpgradeStalled alert, drainTimeout=30m, concurrency=1, and the sentinel-path fix from earlier today.

Effect: 5 pending node reboots clear in 5 calendar days instead of queueing up over weekends.

The K8s version-upgrade detection at Sun 12:00 UTC self-defers if a Sunday-morning kured reboot fires (the RecentNodeReboot alert is in the Upgrade Gates ignore-less list for the version-upgrade preflight; this is the same mechanism kured uses).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

parent 8cbfa6856c
commit 48abb7c520
4 changed files with 17 additions and 8 deletions
@@ -31,7 +31,7 @@ kured-sentinel-gate checks:
 ▼ all pass
 touch /var/run/gated-reboot-required
 │
-▼ (kured polls every 1h within 02:00-06:00 London Mon-Fri window)
+▼ (kured polls every 1h within 02:00-06:00 London, any day of the week)
 kured checks Prometheus before draining:
 │ http://prometheus-server.monitoring.svc.cluster.local:80/api/v1/alerts
 │ ANY firing alert (except ignore-list) blocks the drain
@@ -57,7 +57,7 @@ kured uncordons + posts Slack notification (configuration.notifyUrl)
 ### kured (Helm release)
 - **Stack**: `infra/stacks/kured/main.tf`
 - **Helm chart**: `kured-5.11.0` (image `ghcr.io/kubereboot/kured:1.21.0`)
-- **Window**: Mon-Fri 02:00-06:00 Europe/London, period=1h, concurrency=1
+- **Window**: 02:00-06:00 Europe/London, every day of the week (was Mon-Fri until 2026-05-16), period=1h, concurrency=1
 - **Sentinel**: `/sentinel/gated-reboot-required` (created by sentinel-gate DaemonSet)
 - **Slack hook**: Vault `secret/kured` → `slack_kured_webhook`
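
Reviewer note: the window change above can be sanity-checked with a small sketch. This is not kured's code or configuration. The function name `in_reboot_window`, the constants, and the half-open treatment of the 06:00 boundary are all assumptions made for illustration; only the 02:00-06:00 Europe/London window and the Mon-Fri versus every-day distinction come from the diff.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

# Values taken from the documented window; the names are hypothetical.
LONDON = ZoneInfo("Europe/London")
WINDOW_START = time(2, 0)
WINDOW_END = time(6, 0)

# datetime.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS_ONLY = frozenset(range(5))  # old schedule: Mon-Fri
ALL_DAYS = frozenset(range(7))       # new schedule: any day


def in_reboot_window(moment: datetime, allowed_days=ALL_DAYS) -> bool:
    """Return True if a timezone-aware moment falls inside the window.

    Assumption: the window is half-open, [02:00, 06:00) London time.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(LONDON)
    on_allowed_day = local.weekday() in allowed_days
    in_hours = WINDOW_START <= local.time() < WINDOW_END
    return on_allowed_day and in_hours


if __name__ == "__main__":
    # 2026-05-16 is a Saturday: blocked under Mon-Fri, allowed now.
    saturday = datetime(2026, 5, 16, 3, 0, tzinfo=LONDON)
    print(in_reboot_window(saturday, WEEKDAYS_ONLY))  # False
    print(in_reboot_window(saturday, ALL_DAYS))       # True

    # Inputs in UTC are converted first: 01:30 UTC is 02:30 BST in May.
    monday_utc = datetime(2026, 5, 18, 1, 30, tzinfo=timezone.utc)
    print(in_reboot_window(monday_utc))               # True
```

Converting to Europe/London before comparing keeps the check correct across the GMT/BST switch, which matters because the documented window is expressed in local London time rather than UTC.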