infra/stacks/dashy/main.tf

variable "tls_secret_name" {
  type      = string
  sensitive = true
}

module "tls_secret" {
  source          = "../../modules/kubernetes/setup_tls_secret"
  namespace       = kubernetes_namespace.dashy.metadata[0].name
  tls_secret_name = var.tls_secret_name
}

resource "kubernetes_namespace" "dashy" {
  metadata {
    name = "dashy"
    labels = {
      "istio-injection" = "disabled"
      tier              = local.tiers.aux
    }
  }
}

resource "kubernetes_config_map" "config" {
  metadata {
    name      = "config"
    namespace = kubernetes_namespace.dashy.metadata[0].name

    annotations = {
      "reloader.stakater.com/match" = "true"
    }
  }

  data = {
    "conf.yml" = file("${path.module}/conf.yml")
  }
}

resource "kubernetes_deployment" "dashy" {
  metadata {
    name      = "dashy"
    namespace = kubernetes_namespace.dashy.metadata[0].name
    labels = {
      app  = "dashy"
      tier = local.tiers.aux
    }
    annotations = {
      "reloader.stakater.com/search" = "true"
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        app = "dashy"
      }
    }
    template {
      metadata {
        annotations = {
          # "diun.enable" = "true"
        }
        labels = {
          app = "dashy"
        }
      }
      spec {
        container {
          image = "lissy93/dashy:latest"
          name  = "dashy"

          resources {
            requests = {
              cpu    = "50m"
              memory = "512Mi"
            }
            limits = {
              cpu    = "500m"
              memory = "1Gi"
            }
          }
          port {
            container_port = 8080
          }
          volume_mount {
            name       = "config"
            mount_path = "/app/user-data/"
          }
        }
        volume {
          name = "config"
          config_map {
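            # Reference the ConfigMap resource so Terraform orders creation correctly.
            name = kubernetes_config_map.config.metadata[0].name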
          }
        }
      }
    }
  }
}

resource "kubernetes_service" "dashy" {
  metadata {
    name      = "dashy"
    namespace = kubernetes_namespace.dashy.metadata[0].name
    labels = {
      app = "dashy"
    }
  }

  spec {
    selector = {
      app = "dashy"
    }
    port {
      name        = "http"
      port        = 80
      target_port = 8080
    }
  }
}

module "ingress" {
  source          = "../../modules/kubernetes/ingress_factory"
  namespace       = kubernetes_namespace.dashy.metadata[0].name
  name            = "dashy"
  tls_secret_name = var.tls_secret_name
  protected       = true # hidden as we use homepage now
}