infra/terragrunt.hcl
Viktor Barzin 1082cba0fb kyverno(wave1): swap kubernetes_manifest → kubectl_manifest + flip 3 security policies to Enforce
## Resolves code-e2dp (Kyverno TF apply blocked)
Root cause: terraform-provider-kubernetes v3.1.0 panics on plan/refresh of
kubernetes_manifest resources holding Kyverno ClusterPolicy CRDs (large
CEL/foreach schemas). Workaround: swap to gavinbunney/kubectl_manifest which
treats manifests as opaque YAML strings.

## Migration mechanics
- Root terragrunt.hcl: added gavinbunney/kubectl provider declaration so all
  stacks get it generated in providers.tf.
- stacks/kyverno/modules/kyverno/versions.tf (new): module-level provider source
  declaration (required for kubectl_manifest in a child module).
- Converted 17 kubernetes_manifest resources across 7 files to kubectl_manifest
  with yaml_body = yamlencode({...}). depends_on chains preserved.
- terraform state rm for all 17 old kubernetes_manifest entries.
- stacks/kyverno/imports.tf (new): TF 1.5+ import blocks mapping each
  kubectl_manifest to its live cluster resource by apiVersion//Kind//name ID.
- One resource (policy_inject_keel_annotations) needed kubectl delete + recreate
  because the kubectl provider couldn't patch it cleanly (resourceVersion=0
  invalid for update — gotcha when adopting a resource previously
  kubernetes_manifest-owned).

## W1.4 — security policies Audit → Enforce (LIVE)
Three policies flipped: deny-privileged-containers, deny-host-namespaces,
restrict-sys-admin. Verified live via kubectl. failurePolicy=Ignore preserved.

## Shared exclude list (35 namespaces)
local.security_policy_exclude_namespaces in security-policies.tf.
- 31 critical from memory id=1970 (Keel rollout list)
- + frigate (camera HW transcoding needs host access)
- + kured (privileged DaemonSet for node reboots)
- + default (etcd backup/defrag CronJobs use hostNetwork)
- + changedetection (uses SYS_ADMIN for chromium sandbox)

## W1.5 — require-trusted-registries stays Audit
Pattern */* allows anything-with-a-slash; Enforce would be a no-op for supply
chain. Tracked under beads code-8ywc as follow-up.

## TF import-blocks
The imports.tf file should be removed in a follow-up cleanup commit once
verified — TF doesn't auto-clean these.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes: code-e2dp
2026-05-22 14:16:58 +00:00

141 lines
3.6 KiB
HCL

# Root Terragrunt configuration
# Provides DRY provider, backend, and variable loading for all stacks.
# Two-tier state backend:
# Tier 0 (bootstrap): local state, SOPS-encrypted in git — must exist before PG is reachable.
# Tier 1 (everything else): PG backend on CNPG cluster, native pg_advisory_lock.
locals {
tier0_stacks = ["infra", "platform", "cnpg", "vault", "dbaas", "external-secrets"]
stack_name = replace(path_relative_to_include(), "stacks/", "")
is_tier0 = contains(local.tier0_stacks, local.stack_name)
}
remote_state {
backend = local.is_tier0 ? "local" : "pg"
generate = {
path = "backend.tf"
if_exists = "overwrite_terragrunt"
}
config = local.is_tier0 ? {
path = "${get_repo_root()}/state/${path_relative_to_include()}/terraform.tfstate"
} : {
conn_str = get_env("PG_CONN_STR", "")
schema_name = local.stack_name
}
}
# Load config.tfvars (plaintext). Secrets come from Vault KV — authenticate via `vault login -method=oidc`.
terraform {
extra_arguments "common_vars" {
commands = get_terraform_commands_that_need_vars()
required_var_files = [
"${get_repo_root()}/config.tfvars"
]
}
extra_arguments "no_backup" {
commands = ["apply", "plan", "destroy", "import"]
arguments = ["-backup=-"]
}
extra_arguments "kube_config" {
commands = get_terraform_commands_that_need_vars()
arguments = [
"-var", "kube_config_path=${get_repo_root()}/config"
]
}
}
# Generate kubernetes + helm + cloudflare providers for all stacks.
# The infra stack overrides this to add the proxmox provider.
generate "k8s_providers" {
path = "providers.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
terraform {
required_providers {
vault = {
source = "hashicorp/vault"
version = "~> 4.0"
}
cloudflare = {
source = "cloudflare/cloudflare"
version = "~> 4"
}
authentik = {
source = "goauthentik/authentik"
version = "~> 2024.10"
}
# kubectl (gavinbunney) — workaround for hashicorp/kubernetes
# `kubernetes_manifest` panics on Kyverno CRDs. See beads code-e2dp.
# Declared for all stacks but only used where opted-in.
kubectl = {
source = "gavinbunney/kubectl"
version = "~> 1.14"
}
}
}
variable "kube_config_path" {
type = string
default = "~/.kube/config"
}
provider "kubernetes" {
config_path = var.kube_config_path
}
provider "helm" {
kubernetes = {
config_path = var.kube_config_path
}
}
provider "vault" {
address = "https://vault.viktorbarzin.me"
skip_child_token = true
}
provider "kubectl" {
config_path = var.kube_config_path
load_config_file = true
}
EOF
}
# Generate Cloudflare provider config (separate file to avoid conflicts
# with stacks that override providers.tf, e.g. infra stack).
# DNS records are created per-service via ingress_factory's dns_type param.
generate "cloudflare_provider" {
path = "cloudflare_provider.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
data "vault_kv_secret_v2" "cf_platform" {
mount = "secret"
name = "platform"
}
provider "cloudflare" {
api_key = data.vault_kv_secret_v2.cf_platform.data["cloudflare_api_key"]
email = "vbarzin@gmail.com"
}
EOF
}
# Generate shared tiers locals for all stacks.
# Previously duplicated in 67+ stacks; now defined once here.
generate "tiers" {
path = "tiers.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
locals {
tiers = {
core = "0-core"
cluster = "1-cluster"
gpu = "2-gpu"
edge = "3-edge"
aux = "4-aux"
}
}
EOF
}