kured: drop Mon-Fri restriction, reboot any day
The weekday-only schedule was a guardrail from the era of the 2026-03-16 incident, when the rest of the safety net was thin. The gates in place today make weekend reboots safe and clear the backlog faster: halt-on-alert, sentinel-gate Check 4 (24h soak via node Ready transitions), the K8sUpgradeStalled alert, drainTimeout=30m, concurrency=1, and the sentinel-path fix from earlier today.

Effect: 5 pending node reboots clear in 5 calendar days instead of queueing up over weekends.

The K8s version-upgrade detection at Sun 12:00 UTC self-defers if a Sunday-morning kured reboot fires (the RecentNodeReboot alert is in the Upgrade Gates ignore-less list for the version-upgrade preflight, the same mechanism kured uses).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 2e52583abd
commit 411524a10d
4 changed files with 17 additions and 8 deletions
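The backlog claim in the commit message (5 pending reboots clearing in 5 calendar days) can be checked with a minimal sketch. It assumes at most one node reboots per calendar day, which is an inference from concurrency=1 plus the 24h soak rather than something the commit states, and `days_to_clear` is a hypothetical helper, not part of kured.

```python
# Hypothetical model, not kured's actual scheduler: at most one node
# reboots per calendar day, and only on days listed in rebootDays.
DAYS = ["mo", "tu", "we", "th", "fr", "sa", "su"]


def days_to_clear(pending, reboot_days, start_day):
    """Count calendar days until `pending` reboots finish, starting on `start_day`."""
    if pending > 0 and not reboot_days:
        raise ValueError("no reboot days allowed; the backlog never clears")
    index = DAYS.index(start_day)
    elapsed = 0
    while pending > 0:
        if DAYS[index % 7] in reboot_days:
            pending -= 1  # one node rebooted on this allowed day
        index += 1
        elapsed += 1
    return elapsed


weekdays_only = ["mo", "tu", "we", "th", "fr"]
every_day = list(DAYS)

print(days_to_clear(5, weekdays_only, "th"))  # 7: the weekend adds two idle days
print(days_to_clear(5, every_day, "th"))      # 5
```

Under this model the weekday-only schedule only costs extra days when the backlog spans a weekend; a backlog that starts on a Monday clears in 5 days either way.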
@@ -57,7 +57,13 @@ resource "helm_release" "kured" {
 timeZone = "Europe/London"
 startTime = "02:00"
 endTime = "06:00"
-rebootDays = ["mo", "tu", "we", "th", "fr"]
+# All 7 days — operator decision 2026-05-16. The Mon–Fri restriction
+# was a 2026-03-16-era guardrail (overlapping with weekend on-call
+# response). Today the rest of the safety net (halt-on-alert,
+# sentinel-gate Check 4 = 24h soak, single-concurrency, the
+# K8sUpgradeStalled alert) is strong enough to operate any day; the
+# weekday-only schedule was just slowing the backlog down.
+rebootDays = ["mo", "tu", "we", "th", "fr", "sa", "su"]
 # IMPORTANT: must match where kured-sentinel-gate writes (below):
 # `touch /host/var-run/gated-reboot-required` → host
 # `/var/run/gated-reboot-required`. The kured chart derives the host
@@ -90,7 +96,7 @@ resource "helm_release" "kured" {
 alertFiringOnly = true
 alertFilterMatchOnly = false
 }
-reboot_days = "mon,tue,wed,thu,fri"
+reboot_days = "mon,tue,wed,thu,fri,sat,sun"
 window_end = "06:00"
 window_start = "22:00"
 service = {