Reverses the March 2026 outage mitigation that disabled unattended-
upgrades cluster-wide. Now re-enables it on the k8s template VM with:
- Allowed-Origins limited to security/updates pockets
- Package-Blacklist for k8s/containerd/runc/calico-node (apt-mark
hold on the cluster-critical components)
- Automatic-Reboot disabled — kured drives the actual reboots
- Compatible with the existing kured + sentinel-gate flow

kured side:
- rebootDelay 30s, concurrency 1
- Sentinel cool-down stretched 30m → 24h (aligns with the 24h soak
window from the post-mortem)
- prometheusUrl + alertFilterRegexp wired so any firing non-ignored
alert halts the rollout. The ignore-list covers the self-referential
alerts (Watchdog/RebootRequired/KuredNodeWasNotDrained/
InfoInhibitor) that would otherwise deadlock kured.
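For illustration only, a minimal Python sketch of the alert gate
described above. It is not kured's implementation. The names
IGNORED_ALERTS, blocking_alerts and reboot_allowed are hypothetical,
and the pattern is an assumed alternation of the four alert names
listed in this message; the alertFilterRegexp value actually deployed
may differ.

```python
import re

# Assumption: the ignore pattern is an anchored alternation of the four
# self-referential alert names. The deployed alertFilterRegexp may differ.
IGNORED_ALERTS = re.compile(
    r"^(Watchdog|RebootRequired|KuredNodeWasNotDrained|InfoInhibitor)$"
)


def blocking_alerts(firing, ignore=IGNORED_ALERTS):
    """Return the firing alert names that are not on the ignore-list."""
    return [name for name in firing if not ignore.match(name)]


def reboot_allowed(firing, ignore=IGNORED_ALERTS):
    """A reboot may proceed only when no non-ignored alert is firing."""
    return not blocking_alerts(firing, ignore)
```

With this pattern, reboot_allowed(["Watchdog", "RebootRequired"]) is
true, while any other firing alert blocks the reboot. The ^...$ anchors
are a choice made in this sketch so that a longer alert name that
merely contains an ignored name is not skipped by accident.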

Prometheus side (already partly landed in 6c4e0966, the "Upgrade
Gates" rule group):
- Refine `KubeQuotaAlmostFull` to include the resourcequota label in
both the on-clause and the summary, so multi-quota namespaces
(authentik, beads-server, frigate) report the quota name correctly.

grafana.tf: terraform fmt whitespace only.

Together with the 2026-03-22 post-mortem (memory id=390), the loop is
closed: unattended-upgrades runs again and kernel-class updates can
land, but only when cluster health is green and the reboot window is
open.

The kured sentinel gate DaemonSet requires /sentinel to exist on
all nodes. Without it, kured pods get stuck in ContainerCreating
with a hostPath mount failure. /sentinel was previously created
manually; it is now provisioned automatically for new nodes.
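As a rough sketch of the idempotent check a node bootstrap step could
perform, written in Python purely for illustration: it assumes
/sentinel is a directory (this message does not say whether the
hostPath is a file or a directory), and ensure_sentinel_dir is a
hypothetical helper, not the provisioning mechanism this change uses.

```python
from pathlib import Path


def ensure_sentinel_dir(path: Path) -> bool:
    """Create the directory if it is missing.

    Returns True if the directory was created, False if it already
    existed. Raises FileExistsError if the path exists but is not a
    directory.
    """
    if path.is_dir():
        return False
    path.mkdir(parents=True, exist_ok=True)
    return True
```

On a node this would be called as ensure_sentinel_dir(Path("/sentinel"))
with sufficient privileges. Running it again is a no-op, so it is safe
to repeat on every boot.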