infra/stacks
Viktor Barzin ad9f6c8f41 k8s-version-upgrade: halt_on_alert allowlist (severity=critical only)
Refactored halt_on_alert_query from denylist ("ignore these noisy alerts")
to an allowlist ("only halt on severity=critical"). Today's blocking
alerts were all warning/info-level and not actual upgrade blockers:
  - PodCrashLooping (gpu-operator on the GPU node, code-8vr0, long-standing)
  - IngressTTFBHigh (Traefik latency, transient)
  - NodeHighIOWait (chicken-and-egg with our own upgrade I/O)
  - RecentNodeReboot (chain causes this itself)

severity=critical filtering is more robust than maintaining a denylist
of every noisy alert that crops up. extra_ignore parameter kept for
backwards compatibility but is rarely needed now (critical alerts are
the only ones that should actually halt the chain).

Tested end-to-end this session — master successfully upgraded to v1.34.8
via the autonomous chain after the apiserver state-repair (apiserver
manifest had been pinned at v1.34.2 from a previous month's rollback;
required a one-time manual edit + kubelet reload to bring back to v1.34.7,
after which the chain ran cleanly).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 09:14:39 +00:00
..
_template ingress_factory: replace protected bool with auth enum + audit pass across 100 stacks 2026-05-10 18:53:49 +00:00
actualbudget recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
affine recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
authentik authentik: worker replicas 3 -> 2 2026-05-21 09:14:35 +00:00
beads-server beads-server: codify Keel annotations on Dolt deployment (drift cleanup) 2026-05-17 22:22:40 +00:00
blog final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
broker-sync broker-sync(imap): fix command name + add fsGroup for sync.db writes 2026-05-22 14:41:54 +00:00
calico security(wave1): W1.6 expand observation from recruiter-responder pilot → tier 3+4 (82 namespaces) 2026-05-19 22:14:16 +00:00
changedetection enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
chrome-service recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
city-guesser enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
claude-agent-service recruiter-triage: AI culture & tooling section + warm-engage AI ask 2026-05-16 13:14:27 +00:00
claude-memory recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
cloudflared mailserver: decommission SendGrid 2026-05-22 20:08:38 +00:00
cnpg cnpg: bump webhook-cert renewal threshold 7d -> 30d 2026-05-22 15:00:41 +00:00
coturn enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
crowdsec keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
cyberchef final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
dashy enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
dawarich enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
dbaas dbaas: opt MySQL out of Keel + add do-not-bump warning 2026-05-19 13:21:03 +00:00
descheduler keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
diun enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
ebook2audiobook enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
ebooks enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
echo enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
excalidraw enrolled-patch stacks: ignore image drift from Keel auto-update 2026-05-16 13:24:16 +00:00
external-secrets recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
f1-stream final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
fire-planner fire-planner: COL refresh CronJob + Grafana Cost-of-Living dashboard 2026-05-22 14:15:38 +00:00
foolery recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
forgejo forgejo: disable source archive ZIP/TAR downloads 2026-05-21 09:12:20 +00:00
freedify recruiter-responder: bump image_tag to 189ef901 2026-05-16 12:41:05 +00:00
freshrss infra: add kubectl + authentik providers across 6 stacks 2026-05-21 08:07:22 +00:00
frigate ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
grampsweb ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
hackmd ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
headscale keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
health ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
hermes-agent ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
homepage final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
immich final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
infra [forgejo] Phases 3+4+5: cutover, decommission, docs sweep 2026-05-07 18:30:02 +00:00
infra-maintenance [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
insta2spotify ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
instagram-poster Bucket A retrigger + Bucket D enrollment (5 module-nested stacks) 2026-05-16 23:10:38 +00:00
isponsorblocktv ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
job-hunter ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
jsoncrack final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
k8s-dashboard final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
k8s-portal Bucket A retrigger + Bucket D enrollment (5 module-nested stacks) 2026-05-16 23:10:38 +00:00
k8s-version-upgrade k8s-version-upgrade: halt_on_alert allowlist (severity=critical only) 2026-05-23 09:14:39 +00:00
keel upgrade-state: skill + script + Keel scrape for periodic three-pipeline audit 2026-05-18 10:50:43 +00:00
kms final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
kured ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
kyverno kyverno: allowlist woodpeckerci/* for CI step pods 2026-05-23 08:52:48 +00:00
linkwarden infra: add kubectl + authentik providers across 6 stacks 2026-05-21 08:07:22 +00:00
llama-cpp Bucket C: enroll 5 raw-deploy stacks in Keel auto-update 2026-05-16 23:14:43 +00:00
local-path final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
mailserver mailserver: decommission SendGrid 2026-05-22 20:08:38 +00:00
matrix ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
meshcentral ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
metallb keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
metrics-server keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
monitoring technitium: add viktorbarzin.me apex DNS drift probe + alerts 2026-05-23 08:41:14 +00:00
n8n ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
navidrome infra: add kubectl + authentik providers across 6 stacks 2026-05-21 08:07:22 +00:00
netbox ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
networking-toolbox ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
nextcloud ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
nfs-csi keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
nodelocal-dns [dns] NodeLocal DNSCache — deploy DaemonSet to all nodes (WS C) 2026-04-19 15:46:41 +00:00
novelapp Woodpecker CI deploy [CI SKIP] 2026-05-16 23:17:44 +00:00
ntfy ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
nvidia nvidia: bump driver container memory limit 128Mi → 2Gi 2026-05-17 11:23:52 +00:00
onlyoffice ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
openclaw openclaw: switch primary to nim/meta/llama-3.1-70b-instruct 2026-05-22 15:23:17 +00:00
osm_routing final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
owntracks ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
paperless-mcp paperless-mcp: deploy MCP for AI document search 2026-05-17 11:14:35 +00:00
paperless-ngx ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
payslip-ingest ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
phpipam ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
platform [infra] Add Cloudflare provider to all stack lock files and generated providers 2026-04-16 16:31:36 +00:00
plotting-book Woodpecker CI deploy [CI SKIP] 2026-05-16 23:17:44 +00:00
poison-fountain ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
postiz postiz: disable unused providers + pin temporal vs Keel force-policy 2026-05-21 10:04:22 +00:00
priority-pass ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
privatebin ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
proxmox-csi keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
pvc-autoresizer [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
rbac [infra] Migrate Terraform state from local SOPS to PostgreSQL backend 2026-04-16 19:33:12 +00:00
real-estate-crawler realestate-crawler: dockerhub pull-secret + lift image-pin on ui/api 2026-05-18 19:11:43 +00:00
recruiter-responder openclaw: enable recruiter-api plugin (allowlist + manifest contracts) 2026-05-20 21:56:11 +00:00
redis keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
reloader keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
resume ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
reverse-proxy keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
rybbit ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
sealed-secrets keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
send ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
servarr keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
shadowsocks ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
speedtest ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
status-page [infra] Establish KYVERNO_LIFECYCLE_V1 drift-suppression convention [ci skip] 2026-04-18 14:15:51 +00:00
stirling-pdf ci: retrigger v2 — apply pending Keel-enrolled stacks (#697 was cancelled by #698) 2026-05-16 13:47:13 +00:00
tandoor infra: add kubectl + authentik providers across 6 stacks 2026-05-21 08:07:22 +00:00
technitium technitium: add viktorbarzin.me apex DNS drift probe + alerts 2026-05-23 08:41:14 +00:00
terminal terminal: probe + alerts after Traefik replica routing-table skew 2026-05-17 10:04:26 +00:00
tor-proxy ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00
trading-bot trading-bot: pin Meet Kevin LLM model to claude-haiku-4-5 2026-05-22 20:43:05 +00:00
traefik traefik: bump auth-proxy nginx header buffers to handle Authentik cookie pile 2026-05-23 08:34:33 +00:00
travel_blog final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
tuya-bridge ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00
uptime-kuma Bucket A retrigger + Bucket D enrollment (5 module-nested stacks) 2026-05-16 23:10:38 +00:00
url ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00
vault trading-bot: revive K8s stack + add meet-kevin-watcher 2026-05-22 11:23:30 +00:00
vaultwarden Bucket A retrigger + Bucket D enrollment (5 module-nested stacks) 2026-05-16 23:10:38 +00:00
vpa keel: enroll 11 more namespaces (operators + critical infra) 2026-05-17 20:59:14 +00:00
wealthfolio Woodpecker CI deploy [CI SKIP] 2026-05-16 13:45:45 +00:00
webhook_handler final wave: enroll immich + status-page, retrigger 17 pending Bucket A 2026-05-16 23:19:20 +00:00
whisper ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00
wireguard keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
woodpecker ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00
xray keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
ytdlp ci: retrigger v3 — apply remaining 22 Keel-enrolled stacks 2026-05-16 14:06:39 +00:00