From 0a6b2489f751c472c31a97f66820f433961fec9d Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Sat, 16 May 2026 13:13:16 +0000
Subject: [PATCH] =?UTF-8?q?keel:=20default=20policy=20=E2=86=92=20never=20?=
 =?UTF-8?q?(post-incident=20safe=20default)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

2026-05-16 incident: Keel's `force` policy switched semver-pinned images
(affine 0.26.6 → :nightly-latest, calico v3.26.1 → :master) instead of
digest-tracking. Force is documented as "always update to the newest tag
in the registry" — only safe on already-mutable tags like :latest.

Changing the cluster-wide default in inject-keel-annotations to `never`.
The namespace enrollment label + V2 lifecycle suppression stay in place
so opt-in is one annotation per Deployment, but no service auto-updates
until explicitly approved.

To opt in a workload now:
1. Verify the Deployment image is on a mutable tag (:latest, :,
   or a vendor "stable" tag) — change in Terraform first if needed.
2. Add to the Deployment's metadata.annotations:
     "keel.sh/policy" = "force"   (digest tracking)
   OR
     "keel.sh/policy" = "patch"   (semver patch bumps — also requires
                                   ignore_changes on the image)

Live policy already updated via kubectl apply + per-workload override
(force → never).

Co-Authored-By: Claude Opus 4.7
---
 .../modules/kyverno/keel-annotations.tf       | 25 ++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/stacks/kyverno/modules/kyverno/keel-annotations.tf b/stacks/kyverno/modules/kyverno/keel-annotations.tf
index 9b833a4e..5675ac9e 100644
--- a/stacks/kyverno/modules/kyverno/keel-annotations.tf
+++ b/stacks/kyverno/modules/kyverno/keel-annotations.tf
@@ -68,7 +68,30 @@ resource "kubernetes_manifest" "policy_inject_keel_annotations" {
               metadata = {
                 annotations = {
                   # `+(...)` only adds if not present; per-workload overrides win.
-                  "+(keel.sh/policy)"       = "force"
+                  #
+                  # DEFAULT IS `never` — Keel ignores the workload.
+                  #
+                  # Rationale (post 2026-05-16 incident): Keel's `force` policy
+                  # is documented as "always update to the newest tag in the
+                  # registry," not "watch current tag for digest changes." On
+                  # services pinned to semver (e.g. calico/node:v3.26.1,
+                  # affine:0.26.6), force triggers a tag REWRITE — Keel switched
+                  # affine → :nightly-latest and calico → :master. Calico was
+                  # auto-healed by tigera-operator; affine had to be rolled back.
+                  #
+                  # Safe enablement now requires per-WORKLOAD opt-in:
+                  #   (a) ensure the Deployment's image is on a MUTABLE tag —
+                  #       `:latest` (force works), `:` like `:16`/`:7`,
+                  #       or a vendor "stable" tag.
+                  #   (b) override THIS default by setting the Deployment's
+                  #       metadata.annotations["keel.sh/policy"] to `force`
+                  #       (digest tracking on the mutable tag) or `patch`/`minor`
+                  #       (semver bumps, requires `ignore_changes` on image).
+                  #
+                  # The namespace enrollment label + V2 lifecycle remain in
+                  # place so opt-in is a one-line annotation per Deployment,
+                  # without touching the namespace or refactoring lifecycle.
+                  "+(keel.sh/policy)"       = "never"
                   "+(keel.sh/trigger)"      = "poll"
                   "+(keel.sh/pollSchedule)" = "@every 1h"
                 }
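
The per-workload opt-in described in the commit message could look roughly like the sketch below. It is illustrative only and not part of this patch: the resource name `example`, the namespace `example-ns`, the labels, and the image `example/app:latest` are hypothetical, and it assumes the workload is declared with the hashicorp/kubernetes provider's `kubernetes_deployment` resource, which this diff does not show.

```hcl
# Hypothetical workload; every name and the image below are placeholders.
resource "kubernetes_deployment" "example" {
  metadata {
    name      = "example"
    namespace = "example-ns"

    annotations = {
      # Explicit per-workload value. The Kyverno rule uses `+(...)`, which
      # only adds the annotation when absent, so this wins over the
      # cluster-wide `never` default.
      # Use "patch" instead for semver patch bumps on a semver-pinned tag.
      "keel.sh/policy" = "force"
    }
  }

  spec {
    selector {
      match_labels = { app = "example" }
    }

    template {
      metadata {
        labels = { app = "example" }
      }

      spec {
        container {
          name = "app"
          # Step 1 of the opt-in: the tag must already be mutable.
          image = "example/app:latest"
        }
      }
    }
  }

  lifecycle {
    # The commit message states that `patch` (and `minor`) require
    # ignore_changes on the image, so that Terraform does not revert the
    # tag Keel writes. Shown here as one way to express that.
    ignore_changes = [
      spec[0].template[0].spec[0].container[0].image,
    ]
  }
}
```

Setting the annotation on the Deployment itself, rather than changing the injected default, keeps the opt-in scoped to a single workload, which is the behaviour the commit message describes.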