infra/stacks/metrics-server/modules/metrics-server/main.tf
Viktor Barzin 1b340ef531 keel: enroll 15 critical-path namespaces for digest-only auto-update
Per user decision today: monitoring, mailserver, vault, descheduler,
metrics-server, traefik, technitium, crowdsec, redis, reverse-proxy,
reloader, headscale, wireguard, xray, cloudflared now participate in
the same `force + match-tag` regime as the rest of the cluster — Keel
watches the deployment's CURRENT tag for digest changes only and rolls
on push, never rewriting tag strings.

Two-part change:

stacks/kyverno/modules/kyverno/keel-annotations.tf
  Trim the policy-level namespace exclude list from 31 → 16. The 16
  remaining exclusions are the irreducible cluster-operator + state-
  coupled set: keel itself, calico-system + tigera-operator (operator
  loop), authentik (2026-05-17 pgbouncer incident bite), cnpg-system +
  dbaas (state-coupled), kyverno, metallb-system, external-secrets,
  proxmox-csi + nfs-csi + nvidia (just stabilized today, chart-pinned),
  kube-system, vpa, sealed-secrets, infra-maintenance.

stacks/<each-of-15>/.../main.tf
  Add `"keel.sh/enrolled" = "true"` label to the `kubernetes_namespace`
  resource so the Kyverno mutate policy can target the workloads via
  its namespaceSelector matchLabels.

Note on the apply path: the live ClusterPolicy was patched via
`kubectl patch` because the hashicorp/kubernetes provider v3.1.0 panics
during state refresh on Kyverno ClusterPolicy schemas with deeply
nested optional `context.celPreconditions` / `imageRegistry` fields
(see crash dump). The TF source above has the desired state, so any
clean future apply on a fixed provider version will be a no-op against
the live cluster.

Floating-tag workloads in the newly-enrolled set (will roll on every
upstream digest update — acceptable risk per user):
  - wireguard: sclevine/wg:latest (image fixed today via iptables-nft
    postStart shim)
  - xray: teddysun/xray
  - crowdsec-web: viktorbarzin/crowdsec_web
  - monitoring: prompve/prometheus-pve-exporter:latest, prom/snmp-exporter
  - traefik: nginx:1-alpine, openresty/openresty:alpine,
    ghcr.io/tarampampam/error-pages:3
  - redis: haproxy:3.1-alpine, redis:8-alpine

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 12:13:22 +00:00

34 lines
1 KiB
HCL

variable "tls_secret_name" {}
variable "tier" { type = string }
resource "kubernetes_namespace" "metrics-server" {
metadata {
name = "metrics-server"
labels = {
tier = var.tier
"keel.sh/enrolled" = "true"
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
}
module "tls_secret" {
source = "../../../../modules/kubernetes/setup_tls_secret"
namespace = kubernetes_namespace.metrics-server.metadata[0].name
tls_secret_name = var.tls_secret_name
}
resource "helm_release" "metrics-server" {
namespace = kubernetes_namespace.metrics-server.metadata[0].name
create_namespace = false
name = "metrics-server"
atomic = true
repository = "https://kubernetes-sigs.github.io/metrics-server/"
chart = "metrics-server"
values = [templatefile("${path.module}/values.yaml", {})]
}