# Terragrunt Migration Implementation Plan > **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. **Goal:** Migrate the monolithic Terraform setup (857 resources, 15MB state) to Terragrunt with per-service state isolation, proper DAG dependencies, and changed-stack CI/CD detection. **Architecture:** Flat stacks under `stacks/` with thin `main.tf` wrappers calling existing modules. Root `terragrunt.hcl` provides DRY provider/backend config. Platform stack groups ~20 core services and exports outputs (redis_host, postgresql_host, etc.) consumed by ~65 per-service stacks via Terragrunt `dependency` blocks. **Tech Stack:** Terragrunt, Terraform 1.14.x, local state backend, Drone CI **Design Doc:** `docs/plans/2026-02-22-terragrunt-migration-design.md` --- ## Task 1: Install Terragrunt and Create Directory Skeleton **Files:** - Create: `stacks/` directory - Create: `state/` directory - Create: `.gitignore` updates **Step 1: Install Terragrunt** Run: ```bash brew install terragrunt ``` Expected: Terragrunt available at `terragrunt --version` **Step 2: Create directory skeleton** Run: ```bash mkdir -p stacks/{infra,platform} mkdir -p state ``` **Step 3: Update `.gitignore`** Add to `.gitignore`: ``` # Terragrunt .terragrunt-cache/ state/ ``` The `state/` directory contains per-stack terraform state files. These are local-only and should not be committed (they contain resource IDs and potentially sensitive data, same as the current `terraform.tfstate`). **Step 4: Commit** ```bash git add stacks/ .gitignore git commit -m "[ci skip] Add Terragrunt directory skeleton" ``` --- ## Task 2: Create Root Terragrunt Configuration **Files:** - Create: `terragrunt.hcl` **Step 1: Write root terragrunt.hcl** ```hcl # Root Terragrunt configuration # Provides DRY provider, backend, and variable loading for all stacks. # Each stack gets its own local state file under state// remote_state { backend = "local" generate = { path = "backend.tf" if_exists = "overwrite_terragrunt" } config = { path = "${get_repo_root()}/state/${path_relative_to_include()}/terraform.tfstate" } } # Load terraform.tfvars for all stacks. # Variables not declared by a stack are silently ignored (Terraform 1.x behavior). terraform { extra_arguments "common_vars" { commands = get_terraform_commands_that_need_vars() required_var_files = [ "${get_repo_root()}/terraform.tfvars" ] } extra_arguments "kube_config" { commands = get_terraform_commands_that_need_vars() arguments = [ "-var", "kube_config_path=${get_repo_root()}/config" ] } } # Generate kubernetes + helm providers for K8s stacks. # The infra stack overrides this to add the proxmox provider. generate "k8s_providers" { path = "providers.tf" if_exists = "overwrite_terragrunt" contents = <&1 | head -5 ``` Expected: No parse errors (may show warnings about missing main.tf, that's fine) **Step 3: Commit** ```bash git add terragrunt.hcl git commit -m "[ci skip] Add root Terragrunt configuration" ``` --- ## Task 3: Create Infra Stack (Proxmox VMs) **Files:** - Create: `stacks/infra/terragrunt.hcl` - Create: `stacks/infra/main.tf` **Step 1: Write infra terragrunt.hcl** This stack needs the proxmox provider instead of (or in addition to) the default k8s providers. ```hcl # stacks/infra/terragrunt.hcl include "root" { path = find_in_parent_folders() } # Override provider generation to include proxmox generate "providers" { path = "providers.tf" if_exists = "overwrite_terragrunt" contents = < /etc/containerd/certs.d/docker.io/hosts.toml mkdir -p /etc/containerd/certs.d/ghcr.io printf 'server = "https://ghcr.io"\n\n[host."http://10.0.20.10:5010"]\n capabilities = ["pull", "resolve"]\n' > /etc/containerd/certs.d/ghcr.io/hosts.toml mkdir -p /etc/containerd/certs.d/quay.io printf 'server = "https://quay.io"\n\n[host."http://10.0.20.10:5020"]\n capabilities = ["pull", "resolve"]\n' > /etc/containerd/certs.d/quay.io/hosts.toml mkdir -p /etc/containerd/certs.d/registry.k8s.io printf 'server = "https://registry.k8s.io"\n\n[host."http://10.0.20.10:5030"]\n capabilities = ["pull", "resolve"]\n' > /etc/containerd/certs.d/registry.k8s.io/hosts.toml mkdir -p /etc/containerd/certs.d/reg.kyverno.io printf 'server = "https://reg.kyverno.io"\n\n[host."http://10.0.20.10:5040"]\n capabilities = ["pull", "resolve"]\n' > /etc/containerd/certs.d/reg.kyverno.io/hosts.toml sed -i 's/.*max_concurrent_downloads = 3/max_concurrent_downloads = 20/g' /etc/containerd/config.toml sudo sed -i '/serializeImagePulls:/d' /var/lib/kubelet/config.yaml && \ sudo sed -i '/maxParallelImagePulls:/d' /var/lib/kubelet/config.yaml && \ echo -e 'serializeImagePulls: false\nmaxParallelImagePulls: 50' | sudo tee -a /var/lib/kubelet/config.yaml EOF k8s_join_command = var.k8s_join_command } module "non-k8s-node-template" { source = "../../modules/create-template-vm" proxmox_host = var.proxmox_host proxmox_user = "root" ssh_private_key = var.ssh_private_key ssh_public_key = var.ssh_public_key cloud_image_url = local.cloud_init_image_url image_path = local.non_k8s_cloud_init_image_path template_id = 1000 template_name = local.non_k8s_vm_template user_passwd = var.vm_wizard_password is_k8s_template = false snippet_name = local.non_k8s_cloud_init_snippet_name } module "docker-registry-template" { source = "../../modules/create-template-vm" proxmox_host = var.proxmox_host proxmox_user = "root" ssh_private_key = var.ssh_private_key ssh_public_key = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDHLhYDfyx237eJgOGVoJRECpUS95+7rEBS9vacsIxtx devvm" cloud_image_url = local.cloud_init_image_url image_path = local.non_k8s_cloud_init_image_path template_id = 1001 template_name = "docker-registry-template" user_passwd = var.vm_wizard_password is_k8s_template = false snippet_name = "docker-registry.yaml" provision_cmds = [ "mkdir -p /etc/docker-registry", format("echo %s | base64 -d > /etc/docker-registry/config.yml", base64encode( templatefile("${path.root}/../../modules/docker-registry/config.yaml", { password = var.dockerhub_registry_password }) ) ), # ... (copy remaining provision_cmds from main.tf lines 305-371) ] } module "docker-registry-vm" { source = "../../modules/create-vm" vmid = 220 vm_cpus = 4 vm_mem_mb = 4196 vm_disk_size = "64G" template_name = "docker-registry-template" vm_name = "docker-registry" cisnippet_name = "docker-registry.yaml" vm_mac_address = "DE:AD:BE:EF:22:22" bridge = "vmbr1" vlan_tag = "20" ipconfig0 = "ip=10.0.20.10/24,gw=10.0.20.1" } ``` **Note:** The `provision_cmds` for docker-registry-template is long (~60 lines). Copy it exactly from the current `main.tf` lines 296-371. The only change is `templatefile` paths: prefix with `${path.root}/../../` since the working directory is now `stacks/infra/`. **Step 3: Verify with init (do NOT apply yet)** Run: ```bash cd stacks/infra && terragrunt init ``` Expected: Successful init, providers downloaded **Step 4: Commit** ```bash git add stacks/infra/ git commit -m "[ci skip] Add infra stack (Proxmox VMs)" ``` --- ## Task 4: Migrate Infra Stack State **CRITICAL: This task modifies live state. Take a backup first.** **Step 1: Backup current state** Run: ```bash cp terraform.tfstate terraform.tfstate.backup-pre-terragrunt ``` **Step 2: List current infra resources in state** Run: ```bash terraform state list | grep -E '^module\.(k8s-node-template|non-k8s-node-template|docker-registry-template|docker-registry-vm)\.' ``` Expected: List of ~10 resources belonging to these 4 modules **Step 3: Move resources to new state file** For each resource listed in step 2, run: ```bash mkdir -p state/infra terraform state mv \ -state=terraform.tfstate \ -state-out=state/infra/terraform.tfstate \ 'module.k8s-node-template' 'module.k8s-node-template' terraform state mv \ -state=terraform.tfstate \ -state-out=state/infra/terraform.tfstate \ 'module.non-k8s-node-template' 'module.non-k8s-node-template' terraform state mv \ -state=terraform.tfstate \ -state-out=state/infra/terraform.tfstate \ 'module.docker-registry-template' 'module.docker-registry-template' terraform state mv \ -state=terraform.tfstate \ -state-out=state/infra/terraform.tfstate \ 'module.docker-registry-vm' 'module.docker-registry-vm' ``` **Step 4: Verify no changes in new state** Run: ```bash cd stacks/infra && terragrunt plan ``` Expected: `No changes. Infrastructure is up-to-date.` If there are changes, something went wrong — restore from backup and investigate. **Step 5: Remove infra modules from root main.tf** Remove (or comment out) the `module.k8s-node-template`, `module.non-k8s-node-template`, `module.docker-registry-template`, and `module.docker-registry-vm` blocks from `main.tf` (lines 208-400). Also remove the corresponding `locals` block (lines 196-206) since they're now in `stacks/infra/main.tf`. **Step 6: Verify legacy state is clean** Run: ```bash terraform plan -var="kube_config_path=$(pwd)/config" ``` Expected: No changes (the moved resources are gone from this state but also from main.tf) **Step 7: Commit** ```bash git add main.tf stacks/infra/ git commit -m "[ci skip] Migrate infra stack (VMs) to Terragrunt" ``` --- ## Task 5: Create Platform Stack **Files:** - Create: `stacks/platform/terragrunt.hcl` - Create: `stacks/platform/main.tf` This is the largest task — it groups ~20 core services into one stack. **Step 1: Write platform terragrunt.hcl** ```hcl # stacks/platform/terragrunt.hcl include "root" { path = find_in_parent_folders() } dependency "infra" { config_path = "../infra" skip_outputs = true } ``` **Step 2: Write platform main.tf** This file contains all core/cluster service module calls. Copy each from `modules/kubernetes/main.tf`, adjusting `source` paths from `"./"` to `"../../modules/kubernetes/"`. Remove `for_each` conditionals (core services are always present). Remove `depends_on = [null_resource.core_services]`. Platform services (from `modules/kubernetes/main.tf`): ```hcl # stacks/platform/main.tf # Variables — declare all variables needed by platform services variable "kube_config_path" { default = "~/.kube/config" } variable "tls_secret_name" {} variable "prod" { default = false } # dbaas vars variable "dbaas_root_password" {} variable "dbaas_postgresql_root_password" {} variable "dbaas_pgadmin_password" {} # traefik vars variable "ingress_crowdsec_api_key" {} # technitium vars variable "technitium_db_password" {} variable "homepage_credentials" { type = map(any) } # headscale vars variable "headscale_config" {} variable "headscale_acl" {} # authentik vars variable "authentik_secret_key" {} variable "authentik_postgres_password" {} variable "k8s_users" { type = map(any); default = {} } variable "ssh_private_key" { type = string; default = ""; sensitive = true } # crowdsec vars variable "crowdsec_enroll_key" { type = string } variable "crowdsec_db_password" { type = string } variable "crowdsec_dash_api_key" { type = string } variable "crowdsec_dash_machine_id" { type = string } variable "crowdsec_dash_machine_password" { type = string } variable "alertmanager_slack_api_url" {} # cloudflared vars variable "cloudflare_api_key" {} variable "cloudflare_email" {} variable "cloudflare_account_id" {} variable "cloudflare_zone_id" {} variable "cloudflare_tunnel_id" {} variable "public_ip" {} variable "cloudflare_proxied_names" {} variable "cloudflare_non_proxied_names" {} variable "cloudflare_tunnel_token" {} # monitoring vars variable "alertmanager_account_password" {} variable "idrac_username" { default = "" } variable "idrac_password" { default = "" } variable "tiny_tuya_service_secret" { type = string } variable "haos_api_token" { type = string } variable "pve_password" { type = string } variable "grafana_db_password" { type = string } variable "grafana_admin_password" { type = string } # vaultwarden vars variable "vaultwarden_smtp_password" {} # reverse-proxy vars (homepage tokens are in homepage_credentials) # wireguard vars variable "wireguard_wg_0_conf" {} variable "wireguard_wg_0_key" {} variable "wireguard_firewall_sh" {} # xray vars variable "xray_reality_clients" { type = list(map(string)) } variable "xray_reality_private_key" { type = string } variable "xray_reality_short_ids" { type = list(string) } # nvidia vars (none beyond tls_secret_name + tier) # mailserver vars variable "mailserver_accounts" {} variable "mailserver_aliases" {} variable "mailserver_opendkim_key" {} variable "mailserver_sasl_passwd" {} variable "mailserver_roundcubemail_db_password" { type = string } # infra-maintenance vars variable "webhook_handler_git_user" {} variable "webhook_handler_git_token" {} variable "technitium_username" {} variable "technitium_password" {} # uptime-kuma (no extra vars) # metrics-server (no extra vars) # kyverno (no extra vars) locals { tiers = { core = "0-core" cluster = "1-cluster" gpu = "2-gpu" edge = "3-edge" aux = "4-aux" } } # --- Core Services (no dependencies, deployed first) --- module "metallb" { source = "../../modules/kubernetes/metallb" tier = local.tiers.core } module "dbaas" { source = "../../modules/kubernetes/dbaas" prod = var.prod tls_secret_name = var.tls_secret_name dbaas_root_password = var.dbaas_root_password postgresql_root_password = var.dbaas_postgresql_root_password pgadmin_password = var.dbaas_pgadmin_password tier = local.tiers.cluster } module "redis" { source = "../../modules/kubernetes/redis" tls_secret_name = var.tls_secret_name tier = local.tiers.cluster } module "traefik" { source = "../../modules/kubernetes/traefik" tier = local.tiers.core crowdsec_api_key = var.ingress_crowdsec_api_key tls_secret_name = var.tls_secret_name } module "technitium" { source = "../../modules/kubernetes/technitium" tls_secret_name = var.tls_secret_name homepage_token = var.homepage_credentials["technitium"]["token"] technitium_db_password = var.technitium_db_password tier = local.tiers.core } module "headscale" { source = "../../modules/kubernetes/headscale" tls_secret_name = var.tls_secret_name headscale_config = var.headscale_config headscale_acl = var.headscale_acl tier = local.tiers.core } module "authentik" { source = "../../modules/kubernetes/authentik" tier = local.tiers.cluster tls_secret_name = var.tls_secret_name secret_key = var.authentik_secret_key postgres_password = var.authentik_postgres_password } module "rbac" { source = "../../modules/kubernetes/rbac" tier = local.tiers.cluster tls_secret_name = var.tls_secret_name k8s_users = var.k8s_users ssh_private_key = var.ssh_private_key } module "k8s-portal" { source = "../../modules/kubernetes/k8s-portal" tier = local.tiers.edge tls_secret_name = var.tls_secret_name } module "crowdsec" { source = "../../modules/kubernetes/crowdsec" tier = local.tiers.cluster tls_secret_name = var.tls_secret_name homepage_username = var.homepage_credentials["crowdsec"]["username"] homepage_password = var.homepage_credentials["crowdsec"]["password"] enroll_key = var.crowdsec_enroll_key db_password = var.crowdsec_db_password crowdsec_dash_api_key = var.crowdsec_dash_api_key crowdsec_dash_machine_id = var.crowdsec_dash_machine_id crowdsec_dash_machine_password = var.crowdsec_dash_machine_password slack_webhook_url = var.alertmanager_slack_api_url } module "cloudflared" { source = "../../modules/kubernetes/cloudflared" tier = local.tiers.core tls_secret_name = var.tls_secret_name cloudflare_api_key = var.cloudflare_api_key cloudflare_email = var.cloudflare_email cloudflare_account_id = var.cloudflare_account_id cloudflare_zone_id = var.cloudflare_zone_id cloudflare_tunnel_id = var.cloudflare_tunnel_id public_ip = var.public_ip cloudflare_proxied_names = var.cloudflare_proxied_names cloudflare_non_proxied_names = var.cloudflare_non_proxied_names cloudflare_tunnel_token = var.cloudflare_tunnel_token } module "monitoring" { source = "../../modules/kubernetes/monitoring" tls_secret_name = var.tls_secret_name alertmanager_account_password = var.alertmanager_account_password idrac_username = var.idrac_username idrac_password = var.idrac_password alertmanager_slack_api_url = var.alertmanager_slack_api_url tiny_tuya_service_secret = var.tiny_tuya_service_secret haos_api_token = var.haos_api_token pve_password = var.pve_password grafana_db_password = var.grafana_db_password grafana_admin_password = var.grafana_admin_password tier = local.tiers.cluster } module "vaultwarden" { source = "../../modules/kubernetes/vaultwarden" tls_secret_name = var.tls_secret_name smtp_password = var.vaultwarden_smtp_password tier = local.tiers.edge } module "reverse-proxy" { source = "../../modules/kubernetes/reverse_proxy" tls_secret_name = var.tls_secret_name truenas_homepage_token = var.homepage_credentials["reverse_proxy"]["truenas_token"] pfsense_homepage_token = var.homepage_credentials["reverse_proxy"]["pfsense_token"] } module "metrics-server" { source = "../../modules/kubernetes/metrics-server" tier = local.tiers.cluster tls_secret_name = var.tls_secret_name } module "nvidia" { source = "../../modules/kubernetes/nvidia" tls_secret_name = var.tls_secret_name tier = local.tiers.gpu } module "kyverno" { source = "../../modules/kubernetes/kyverno" } module "uptime-kuma" { source = "../../modules/kubernetes/uptime-kuma" tls_secret_name = var.tls_secret_name tier = local.tiers.cluster } module "wireguard" { source = "../../modules/kubernetes/wireguard" tls_secret_name = var.tls_secret_name wg_0_conf = var.wireguard_wg_0_conf wg_0_key = var.wireguard_wg_0_key firewall_sh = var.wireguard_firewall_sh tier = local.tiers.core } module "xray" { source = "../../modules/kubernetes/xray" tls_secret_name = var.tls_secret_name tier = local.tiers.core xray_reality_clients = var.xray_reality_clients xray_reality_private_key = var.xray_reality_private_key xray_reality_short_ids = var.xray_reality_short_ids } module "mailserver" { source = "../../modules/kubernetes/mailserver" tls_secret_name = var.tls_secret_name mailserver_accounts = var.mailserver_accounts postfix_account_aliases = var.mailserver_aliases opendkim_key = var.mailserver_opendkim_key sasl_passwd = var.mailserver_sasl_passwd roundcube_db_password = var.mailserver_roundcubemail_db_password tier = local.tiers.edge } module "infra-maintenance" { source = "../../modules/kubernetes/infra-maintenance" git_user = var.webhook_handler_git_user git_token = var.webhook_handler_git_token technitium_username = var.technitium_username technitium_password = var.technitium_password } # --- OUTPUTS (consumed by service stacks via Terragrunt dependency) --- output "tls_secret_name" { value = var.tls_secret_name } output "redis_host" { value = "redis.redis.svc.cluster.local" } output "postgresql_host" { value = "postgresql.dbaas.svc.cluster.local" } output "postgresql_port" { value = 5432 } output "mysql_host" { value = "mysql.dbaas.svc.cluster.local" } output "mysql_port" { value = 3306 } output "smtp_host" { value = "mail.viktorbarzin.me" } output "smtp_port" { value = 587 } ``` **Step 3: Verify init succeeds** Run: ```bash cd stacks/platform && terragrunt init ``` **Step 4: Commit** ```bash git add stacks/platform/ git commit -m "[ci skip] Add platform stack (core services)" ``` --- ## Task 6: Migrate Platform Stack State **CRITICAL: Largest state migration. Backup first.** **Step 1: Backup** ```bash cp terraform.tfstate terraform.tfstate.backup-pre-platform ``` **Step 2: Move core service resources** The resources are currently at `module.kubernetes_cluster.module.[""]` (the `for_each` key). Services without `for_each` are at `module.kubernetes_cluster.module.`. Run state mv for each platform service. Example pattern: ```bash # Services WITH for_each (note the ["key"] suffix): terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.redis["redis"]' \ 'module.redis' terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.traefik["traefik"]' \ 'module.traefik' # Services WITHOUT for_each: terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.metallb' \ 'module.metallb' terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.dbaas' \ 'module.dbaas' terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.cloudflared' \ 'module.cloudflared' terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.infra-maintenance' \ 'module.infra-maintenance' terraform state mv \ -state=terraform.tfstate \ -state-out=state/platform/terraform.tfstate \ 'module.kubernetes_cluster.module.reverse-proxy["reverse-proxy"]' \ 'module.reverse-proxy' ``` Repeat for all platform services. Check whether each has `for_each` by looking at the state list: ```bash terraform state list | grep 'module.kubernetes_cluster.module' | sort ``` Services with `for_each` have `["key"]` suffix; those without don't. **Step 3: Also move null_resource.core_services** ```bash # This resource can be dropped — don't move it, just remove it terraform state rm 'module.kubernetes_cluster.null_resource.core_services' ``` **Step 4: Verify platform state** Run: ```bash cd stacks/platform && terragrunt plan ``` Expected: `No changes.` (or only expected diffs from removed for_each wrappers) **Step 5: Remove platform services from modules/kubernetes/main.tf** Remove the module blocks for all services that moved to the platform stack. Also remove `null_resource.core_services` and the `defcon_modules`/`active_modules` locals that reference these modules. **Step 6: Verify legacy state** Run: ```bash terraform plan -var="kube_config_path=$(pwd)/config" ``` Expected: No changes for remaining services **Step 7: Commit** ```bash git add main.tf modules/kubernetes/main.tf stacks/platform/ git commit -m "[ci skip] Migrate platform stack (core services) to Terragrunt" ``` --- ## Task 7: Create Simple Service Stack Template + Migrate First Service (blog) **Files:** - Create: `stacks/blog/terragrunt.hcl` - Create: `stacks/blog/main.tf` **Step 1: Write blog terragrunt.hcl** ```hcl # stacks/blog/terragrunt.hcl include "root" { path = find_in_parent_folders() } dependency "platform" { config_path = "../platform" } inputs = { tls_secret_name = dependency.platform.outputs.tls_secret_name } ``` **Step 2: Write blog main.tf** ```hcl # stacks/blog/main.tf variable "tls_secret_name" {} variable "kube_config_path" { default = "~/.kube/config" } module "blog" { source = "../../modules/kubernetes/blog" tls_secret_name = var.tls_secret_name tier = "4-aux" } ``` **Step 3: Move blog state** ```bash terraform state mv \ -state=terraform.tfstate \ -state-out=state/blog/terraform.tfstate \ 'module.kubernetes_cluster.module.blog["blog"]' \ 'module.blog' ``` **Step 4: Verify** ```bash cd stacks/blog && terragrunt plan ``` Expected: `No changes.` **Step 5: Remove blog from modules/kubernetes/main.tf** Delete the `module "blog" { ... }` block (lines 197-205). **Step 6: Commit** ```bash git add stacks/blog/ modules/kubernetes/main.tf git commit -m "[ci skip] Migrate blog to Terragrunt stack" ``` --- ## Task 8: Batch-Migrate Remaining Simple Services Simple services only need `tls_secret_name` (and possibly a few non-DB variables). These follow the exact same pattern as blog. **Simple services to migrate** (one stack each): - echo, privatebin, excalidraw, city-guesser, dashy, travel_blog, jsoncrack, cyberchef, stirling-pdf, networking-toolbox, meshcentral, ntfy, plotting-book, reloader, descheduler, homepage, tor-proxy, forgejo, freshrss, navidrome, audiobookshelf, ebook2audiobook, whisper, frigate, matrix, changedetection, isponsorblocktv **Services with a few extra variables** (still no DB host refs): - shadowsocks (password), kms, hackmd (db_password), drone (github creds, rpc_secret), diun (nfty_token, slack_url), calibre (homepage creds), owntracks (credentials), webhook_handler (many tokens), coturn (turn_secret, public_ip), wealthfolio (password_hash), actualbudget (credentials), servarr (aiostreams), onlyoffice (db_password, jwt_token), xray (reality vars), tuya-bridge (api keys), openclaw (ssh_key, api keys), f1-stream (turn_secret), paperless-ngx (db_password), freedify (credentials), netbox For each service, create: 1. `stacks//terragrunt.hcl` — include root, dependency on platform, inputs from platform outputs 2. `stacks//main.tf` — variable declarations + module call with `source = "../../modules/kubernetes/"` 3. `terraform state mv` from legacy state 4. Remove module block from `modules/kubernetes/main.tf` 5. Verify with `terragrunt plan` **Automation script** (run for each simple service): ```bash #!/bin/bash # Usage: ./migrate-service.sh # Example: ./migrate-service.sh echo echo echo 3-edge SERVICE=$1 SOURCE_DIR=${2:-$1} FOR_EACH_KEY=${3:-$1} TIER=${4:-4-aux} mkdir -p stacks/$SERVICE cat > stacks/$SERVICE/terragrunt.hcl <<'TGEOF' include "root" { path = find_in_parent_folders() } dependency "platform" { config_path = "../platform" } inputs = { tls_secret_name = dependency.platform.outputs.tls_secret_name } TGEOF cat > stacks/$SERVICE/main.tf < to Terragrunt stacks" ``` --- ## Task 9: Modify Service Modules to Accept Host Variables **~20 modules need modification** to replace hardcoded DNS names with variables. For each module, the change is mechanical: 1. Add `variable "redis_host" { type = string }` (and/or postgresql_host, etc.) 2. Replace the hardcoded string with `var.redis_host` **Modules to modify and their needed variables:** | Module | Add variables | Replace in | |--------|-------------|-----------| | affine | redis_host, postgresql_host, postgresql_port, smtp_host, smtp_port | main.tf:25,29,50-64 | | immich | redis_host, postgresql_host | main.tf:80,96 | | nextcloud | redis_host, mysql_host | chart_values.yaml:31,37 | | grampsweb | redis_host, smtp_host, smtp_port | main.tf:37,41,45,57 | | dawarich | redis_host, postgresql_host | main.tf:75,79,147 | | send | redis_host | main.tf:75 | | linkwarden | postgresql_host, postgresql_port | main.tf:67 | | n8n | postgresql_host | main.tf:56 | | health | postgresql_host, postgresql_port | main.tf:54 | | tandoor | postgresql_host, smtp_host, smtp_port | main.tf:66,98 | | rybbit | postgresql_host | main.tf:162 | | netbox | postgresql_host | main.tf:73 | | speedtest | mysql_host | main.tf:85 | | real-estate-crawler | redis_host, mysql_host | main.tf:140,153,157,301,305,309,401,405,409 | | ytdlp | redis_host, ollama_host | main.tf:241,255 | | resume | smtp_host, smtp_port | main.tf:186 | | monitoring | mysql_host, smtp_host | grafana_chart_values.yaml:51, prometheus_chart_values.tpl:35,37 | **Example modification for affine:** In `modules/kubernetes/affine/main.tf`, add variables: ```hcl variable "redis_host" { type = string } variable "postgresql_host" { type = string } variable "postgresql_port" { type = number } variable "smtp_host" { type = string } variable "smtp_port" { type = number } ``` Replace: ```hcl # Before: DATABASE_URL = "postgresql://postgres:${var.postgresql_password}@postgresql.dbaas.svc.cluster.local:5432/affine" # After: DATABASE_URL = "postgresql://postgres:${var.postgresql_password}@${var.postgresql_host}:${var.postgresql_port}/affine" # Before: REDIS_SERVER_HOST = "redis.redis.svc.cluster.local" # After: REDIS_SERVER_HOST = var.redis_host ``` **Step: Commit each module modification** ```bash git add modules/kubernetes// git commit -m "[ci skip] Accept host variables in module" ``` --- ## Task 10: Migrate Database-Backed Services to Terragrunt Stacks After modules are modified (Task 9), create stacks that wire platform outputs to module inputs. **Example: stacks/affine/terragrunt.hcl** ```hcl include "root" { path = find_in_parent_folders() } dependency "platform" { config_path = "../platform" } inputs = { tls_secret_name = dependency.platform.outputs.tls_secret_name redis_host = dependency.platform.outputs.redis_host postgresql_host = dependency.platform.outputs.postgresql_host postgresql_port = dependency.platform.outputs.postgresql_port smtp_host = dependency.platform.outputs.smtp_host smtp_port = dependency.platform.outputs.smtp_port } ``` **stacks/affine/main.tf:** ```hcl variable "tls_secret_name" {} variable "kube_config_path" { default = "~/.kube/config" } variable "affine_postgresql_password" {} variable "redis_host" { type = string } variable "postgresql_host" { type = string } variable "postgresql_port" { type = number } variable "smtp_host" { type = string } variable "smtp_port" { type = number } module "affine" { source = "../../modules/kubernetes/affine" tls_secret_name = var.tls_secret_name postgresql_password = var.affine_postgresql_password redis_host = var.redis_host postgresql_host = var.postgresql_host postgresql_port = var.postgresql_port smtp_host = var.smtp_host smtp_port = var.smtp_port tier = "4-aux" } ``` State migration follows the same pattern as Task 7. Repeat for all DB-backed services from the table in Task 9. --- ## Task 11: Migrate Service-to-Service Dependencies Services that depend on other non-platform services need multi-dependency stacks. **Step 1: Create ollama stack with outputs** ```hcl # stacks/ollama/main.tf variable "tls_secret_name" {} variable "kube_config_path" { default = "~/.kube/config" } variable "ollama_api_credentials" {} module "ollama" { source = "../../modules/kubernetes/ollama" tls_secret_name = var.tls_secret_name tier = "2-gpu" ollama_api_credentials = var.ollama_api_credentials } output "ollama_host" { value = "ollama.ollama.svc.cluster.local" } ``` **Step 2: Create openclaw stack with ollama dependency** ```hcl # stacks/openclaw/terragrunt.hcl include "root" { path = find_in_parent_folders() } dependency "platform" { config_path = "../platform" } dependency "ollama" { config_path = "../ollama" } inputs = { tls_secret_name = dependency.platform.outputs.tls_secret_name ollama_host = dependency.ollama.outputs.ollama_host } ``` **Step 3: Similarly for coturn → f1-stream and osm-routing → real-estate-crawler** --- ## Task 12: Final Cleanup **Step 1: Remove legacy modules/kubernetes/main.tf** After all services are migrated, this file should be empty (or contain only commented-out blocks). Delete it. **Step 2: Remove kubernetes_cluster module call from root main.tf** The root `main.tf` should now only contain provider blocks (which can also be removed since Terragrunt generates them) and the `variable` declarations for `terraform.tfvars` loading. **Step 3: Archive legacy state** ```bash mv terraform.tfstate terraform.tfstate.legacy mv terraform.tfstate.backup-* state/backups/ ``` **Step 4: Verify full DAG** ```bash cd stacks && terragrunt run-all plan ``` Expected: All stacks show `No changes.` **Step 5: Update CLAUDE.md** Update the knowledge file to reflect the new Terragrunt architecture, commands, and workflow. **Step 6: Final commit** ```bash git add -A git commit -m "[ci skip] Complete Terragrunt migration — remove legacy monolith" ``` --- ## Task 13: Update CI/CD (Drone Pipeline) **Files:** - Modify: `.drone.yml` Create a Drone pipeline that: 1. Detects changed files 2. Maps to affected stacks 3. Runs `terragrunt plan` (on PR) or `terragrunt apply` (on master merge) See design doc section "CI/CD: Changed-Stack Detection" for the pipeline logic. --- ## Execution Order Summary | Task | Phase | Risk | Reversible | |------|-------|------|-----------| | 1. Install + skeleton | 0 | None | Yes (delete dirs) | | 2. Root terragrunt.hcl | 0 | None | Yes (delete file) | | 3. Infra stack files | 1 | None | Yes (delete stack) | | 4. Infra state migration | 1 | Medium | Yes (state mv back) | | 5. Platform stack files | 2 | None | Yes (delete stack) | | 6. Platform state migration | 2 | High | Yes (state mv back) | | 7. First simple service (blog) | 3 | Low | Yes (state mv back) | | 8. Batch simple services | 3 | Low | Yes (state mv back) | | 9. Module host variable mods | 4 | Low | Yes (revert changes) | | 10. DB service stacks | 4 | Low | Yes (state mv back) | | 11. Service-to-service deps | 5 | Low | Yes (state mv back) | | 12. Final cleanup | 6 | Medium | Harder to reverse | | 13. CI/CD update | 6 | Low | Yes (revert .drone.yml) |