infra/stacks/speedtest/main.tf
Viktor Barzin 327ce215b9 [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]
## Context

Wave 3A (commit c9d221d5) added the `# KYVERNO_LIFECYCLE_V1` marker to the
27 pre-existing `ignore_changes = [...dns_config]` sites so they could be
grepped and audited. It did NOT address pod-owning resources that were
simply missing the suppression entirely. Post-Wave-3A sampling (2026-04-18)
found that navidrome, f1-stream, frigate, servarr, monitoring, crowdsec,
and many other stacks showed perpetual `dns_config` drift every plan
because their `kubernetes_deployment` / `kubernetes_stateful_set` /
`kubernetes_cron_job_v1` resources had no `lifecycle {}` block at all.

Root cause (same as Wave 3A): Kyverno's admission webhook stamps
`dns_config { option { name = "ndots"; value = "2" } }` on every pod's
`spec.template.spec.dns_config` to prevent NxDomain search-domain flooding
(see `k8s-ndots-search-domain-nxdomain-flood` skill). Without `ignore_changes`
on every Terraform-managed pod-owner, Terraform repeatedly tries to strip
the injected field.

## This change

Extends the Wave 3A convention by sweeping EVERY `kubernetes_deployment`,
`kubernetes_stateful_set`, `kubernetes_daemon_set`, `kubernetes_cron_job_v1`,
`kubernetes_job_v1` (+ their `_v1` variants) in the repo and ensuring each
carries the right `ignore_changes` path:

- **kubernetes_deployment / stateful_set / daemon_set / job_v1**:
  `spec[0].template[0].spec[0].dns_config`
- **kubernetes_cron_job_v1**:
  `spec[0].job_template[0].spec[0].template[0].spec[0].dns_config`
  (extra `job_template[0]` nesting — the CronJob's PodTemplateSpec is
  one level deeper)

Each injection / extension is tagged `# KYVERNO_LIFECYCLE_V1: Kyverno
admission webhook mutates dns_config with ndots=2` inline so the
suppression is discoverable via `rg 'KYVERNO_LIFECYCLE_V1' stacks/`.

Two insertion paths are handled by a Python pass (`/tmp/add_dns_config_ignore.py`):

1. **No existing `lifecycle {}`**: inject a brand-new block just before the
   resource's closing `}`. 108 new blocks on 93 files.
2. **Existing `lifecycle {}` (usually for `DRIFT_WORKAROUND: CI owns image tag`
   from Wave 4, commit a62b43d1)**: extend its `ignore_changes` list with the
   dns_config path. Handles both inline (`= [x]`) and multiline
   (`= [\n  x,\n]`) forms; ensures the last pre-existing list item carries
   a trailing comma so the extended list is valid HCL. 34 extensions.

The script skips anything already mentioning `dns_config` inside an
`ignore_changes`, so re-running is a no-op.

## Scale

- 142 total lifecycle injections/extensions
- 93 `.tf` files touched
- 108 brand-new `lifecycle {}` blocks + 34 extensions of existing ones
- Every Tier 0 and Tier 1 stack with a pod-owning resource is covered
- Together with Wave 3A's 27 pre-existing markers → **169 greppable
  `KYVERNO_LIFECYCLE_V1` dns_config sites across the repo**

## What is NOT in this change

- `stacks/trading-bot/main.tf` — entirely commented-out block (`/* … */`).
  Python script touched the file, reverted manually.
- `_template/main.tf.example` skeleton — kept minimal on purpose; any
  future stack created from it should either inherit the Wave 3A one-line
  form or add its own on first `kubernetes_deployment`.
- `terraform fmt` fixes to pre-existing alignment issues in meshcentral,
  nvidia/modules/nvidia, vault — unrelated to this commit. Left for a
  separate fmt-only pass.
- Non-pod resources (`kubernetes_service`, `kubernetes_secret`,
  `kubernetes_manifest`, etc.) — they don't own pods so they don't get
  Kyverno dns_config mutation.

## Verification

Random sample post-commit:
```
$ cd stacks/navidrome && ../../scripts/tg plan  → No changes.
$ cd stacks/f1-stream && ../../scripts/tg plan  → No changes.
$ cd stacks/frigate && ../../scripts/tg plan    → No changes.

$ rg -c 'KYVERNO_LIFECYCLE_V1' stacks/ --include='*.tf' --include='*.tf.example' \
    | awk -F: '{s+=$2} END {print s}'
169
```

## Reproduce locally
1. `git pull`
2. `rg 'KYVERNO_LIFECYCLE_V1' stacks/ | wc -l` → 169+
3. `cd stacks/navidrome && ../../scripts/tg plan` → expect 0 drift on
   the deployment's dns_config field.

Refs: code-seq (Wave 3B dns_config class closed; kubernetes_manifest
annotation class handled separately in 8d94688d for tls_secret)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:19:48 +00:00

250 lines
6.2 KiB
HCL
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

variable "tls_secret_name" {
type = string
sensitive = true
}
variable "nfs_server" { type = string }
variable "mysql_host" { type = string }
resource "kubernetes_namespace" "speedtest" {
metadata {
name = "speedtest"
labels = {
tier = local.tiers.aux
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
}
resource "kubernetes_manifest" "external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "speedtest-secrets"
namespace = "speedtest"
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-database"
kind = "ClusterSecretStore"
}
target = {
name = "speedtest-secrets"
}
data = [{
secretKey = "db_password"
remoteRef = {
key = "static-creds/mysql-speedtest"
property = "password"
}
}]
}
}
depends_on = [kubernetes_namespace.speedtest]
}
module "tls_secret" {
source = "../../modules/kubernetes/setup_tls_secret"
namespace = kubernetes_namespace.speedtest.metadata[0].name
tls_secret_name = var.tls_secret_name
}
resource "random_id" "secret_key" {
byte_length = 32 # 32 bytes × 2 hex chars = 64 hex characters
}
resource "kubernetes_persistent_volume_claim" "config_proxmox" {
wait_until_bound = false
metadata {
name = "speedtest-config-proxmox"
namespace = kubernetes_namespace.speedtest.metadata[0].name
annotations = {
"resize.topolvm.io/threshold" = "80%"
"resize.topolvm.io/increase" = "100%"
"resize.topolvm.io/storage_limit" = "5Gi"
}
}
spec {
access_modes = ["ReadWriteOnce"]
storage_class_name = "proxmox-lvm"
resources {
requests = {
storage = "1Gi"
}
}
}
}
resource "kubernetes_deployment" "speedtest" {
metadata {
name = "speedtest"
namespace = kubernetes_namespace.speedtest.metadata[0].name
labels = {
app = "speedtest"
tier = local.tiers.aux
}
annotations = {
"reloader.stakater.com/auto" = "true"
}
}
spec {
replicas = 1
strategy {
type = "Recreate"
}
selector {
match_labels = {
app = "speedtest"
}
}
template {
metadata {
labels = {
app = "speedtest"
}
annotations = {
"diun.enable" = "true"
"diun.include_tags" = "^\\d+\\.\\d+\\.\\d+$"
"dependency.kyverno.io/wait-for" = "mysql.dbaas:3306"
}
}
spec {
container {
image = "lscr.io/linuxserver/speedtest-tracker:0.24.1"
name = "speedtest"
port {
container_port = 80
}
env {
name = "PUID"
value = 1000
}
env {
name = "PGID"
value = 1000
}
env {
name = "APP_KEY"
value = "base64:${random_id.secret_key.b64_std}"
}
env {
name = "SPEEDTEST_SCHEDULE"
value = "0 * * * *"
}
# env {
# name = "SPEEDTEST_SERVERS"
# # Sofia speedtest servers - https://c.speedtest.net/speedtest-servers-static.php
# value = "7617,17787,11348,37980,54640,27843,57118,10754,20191,29617"
# }
env {
name = "APP_URL"
value = "https://speedtest.viktorbarzin.me"
}
env {
name = "DB_CONNECTION"
value = "mysql"
}
env {
name = "DB_HOST"
value = var.mysql_host
}
env {
name = "DB_DATABASE"
value = "speedtest"
}
env {
name = "DB_USERNAME"
value = "speedtest"
}
env {
name = "DB_PASSWORD"
value_from {
secret_key_ref {
name = "speedtest-secrets"
key = "db_password"
}
}
}
env {
name = "DB_PORT"
value = "3306"
}
env {
name = "APP_TIMEZONE"
value = "Europe/Sofia"
}
resources {
requests = {
cpu = "25m"
memory = "128Mi"
}
limits = {
memory = "512Mi"
}
}
volume_mount {
name = "config"
mount_path = "/config"
}
}
volume {
name = "config"
persistent_volume_claim {
claim_name = kubernetes_persistent_volume_claim.config_proxmox.metadata[0].name
}
}
}
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
}
}
resource "kubernetes_service" "speedtest" {
metadata {
name = "speedtest"
namespace = kubernetes_namespace.speedtest.metadata[0].name
labels = {
"app" = "speedtest"
}
annotations = {
"prometheus.io/scrape" = "false"
}
}
spec {
selector = {
app = "speedtest"
}
port {
name = "http"
port = 80
target_port = 80
}
}
}
module "ingress" {
source = "../../modules/kubernetes/ingress_factory"
dns_type = "proxied"
namespace = kubernetes_namespace.speedtest.metadata[0].name
name = "speedtest"
tls_secret_name = var.tls_secret_name
protected = true
extra_annotations = {
"gethomepage.dev/enabled" = "true"
"gethomepage.dev/name" = "Speedtest"
"gethomepage.dev/description" = "Internet speed tracker"
"gethomepage.dev/icon" = "speedtest-tracker.png"
"gethomepage.dev/group" = "Infrastructure"
"gethomepage.dev/widget.type" = "speedtest"
"gethomepage.dev/widget.url" = "http://speedtest.speedtest.svc.cluster.local"
"gethomepage.dev/pod-selector" = ""
}
}