## Context
Wave 6a of the state-drift consolidation plan. The Domain wide catch all
Proxy Provider (pk=5) + its wrapping Application (slug=domain-wide-catch-all)
+ the embedded outpost (uuid 0eecac07-97c7-443c-8925-05f2f4fe3e47) have
run for a year as pure UI-created state. When the 2026-04-18 outpost SEV2
hit, it was harder to reason about the config than it should have been —
the only source of truth was the Authentik admin UI. Bringing the provider
+ application under Terraform means future changes are reviewable in PRs
and recoverable from git if the admin UI misbehaves.
## This change
Adds the `goauthentik/authentik` provider to the repo's central
`terragrunt.hcl` `required_providers` (side-effect: every stack can now
declare authentik resources; this stack is the only current consumer).
Stack-local `stacks/authentik/authentik_provider.tf` holds the provider
instance configuration + API token wiring + two resources + their flow
data-source lookups.
### Auth
- API token stored in Vault at `secret/authentik/tf_api_token`, identifier
`terraform-infra-stack`, intent=API, user=akadmin, no expiry. Rotatable
by rewriting the Vault KV + any running TF apply picks it up on next
plan.
### Imports (both landed zero-diff)
- `authentik_application.catchall` ← id `domain-wide-catch-all`
- `authentik_provider_proxy.catchall` ← id `5`
### Flow references
Authorization + invalidation flows are looked up via `data
"authentik_flow"` by slug (`default-provider-authorization-implicit-consent`
+ `default-provider-invalidation-flow`). Keeping them as data sources
rather than hardcoded UUIDs means a flow recreation (slug unchanged)
doesn't require an HCL edit.
### `lifecycle { ignore_changes }` scope
On `authentik_provider_proxy.catchall`:
- `property_mappings` (5 UUIDs), `jwt_federation_sources` (1 UUID) — the
live state references complex many-to-many relations that are easier
to manage from the Authentik UI than to serialise in HCL. Drift
suppressed.
- `skip_path_regex`, `internal_host`, all `basic_auth_*`,
`intercept_header_auth`, `access_token_validity` — either defaults or
UI-only tuning knobs that aren't part of Terraform's concern for this
catch-all provider.
On `authentik_application.catchall`:
- `meta_description`, `meta_launch_url`, `meta_icon`, `group`,
`backchannel_providers`, `policy_engine_mode`, `open_in_new_tab` —
cosmetic/non-functional attributes; the Authentik UI is the right
place to edit these and drift on them isn't interesting.
## What is NOT in this change
- Outpost-binding resource — the embedded outpost's provider list is a
single-row many-to-many that the Authentik UI manages cleanly; adding
TF there would fight the UI without reducing drift.
- Property mappings and JWT federation source — managed via UI, drift
suppressed. A future wave can bring them in when someone actually
wants to edit them through code review.
- Other Authentik entities (Flows, Stages, Groups, RBAC policies) —
same rationale: UI is the natural editing surface. Adopt incrementally
as they become interesting to code-review.
## Verification
```
$ cd stacks/authentik && ../../scripts/tg plan | grep Plan:
Plan: 0 to add, 1 to change, 0 to destroy.
# module.authentik.kubernetes_deployment.pgbouncer — pre-existing drift,
# unrelated to this commit (image_pull_policy Always -> IfNotPresent)
$ ../../scripts/tg state list | grep authentik_
authentik_application.catchall
authentik_provider_proxy.catchall
data.authentik_flow.default_authorization_implicit_consent
data.authentik_flow.default_provider_invalidation
```
## Reproduce locally
1. `git pull && cd stacks/authentik && ../../scripts/tg init`
2. Terraform pulls goauthentik/authentik provider (first time).
3. `tg plan` — expect only pgbouncer drift; authentik resources read-only.
Refs: Wave 6a of the state-drift consolidation (code-hl1)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-tier state architecture:
- Tier 0 (infra, platform, cnpg, vault, dbaas, external-secrets): local
state with SOPS encryption in git — unchanged, required for bootstrap.
- Tier 1 (105 app stacks): PostgreSQL backend on CNPG cluster at
10.0.20.200:5432/terraform_state with native pg_advisory_lock.
Motivation: multi-operator friction (every workstation needed SOPS + age +
git-crypt), bootstrap complexity for new operators, and headless agents/CI
needing the full encryption toolchain just to read state.
Changes:
- terragrunt.hcl: conditional backend (local vs pg) based on tier0 list
- scripts/tg: tier detection, auto-fetch PG creds from Vault for Tier 1,
skip SOPS and Vault KV locking for Tier 1 stacks
- scripts/state-sync: tier-aware encrypt/decrypt (skips Tier 1)
- scripts/migrate-state-to-pg: one-shot migration script (idempotent)
- stacks/vault/main.tf: pg-terraform-state static role + K8s auth role
for claude-agent namespace
- stacks/dbaas: terraform_state DB creation + MetalLB LoadBalancer
service on shared IP 10.0.20.200
- Deleted 107 .tfstate.enc files for migrated Tier 1 stacks
- Cleaned up per-stack tiers.tf (now generated by root terragrunt.hcl)
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Context
Deploying new services required manually adding hostnames to
cloudflare_proxied_names/cloudflare_non_proxied_names in config.tfvars —
a separate file from the service stack. This was frequently forgotten,
leaving services unreachable externally.
## This change:
- Add `dns_type` parameter to `ingress_factory` and `reverse_proxy/factory`
modules. Setting `dns_type = "proxied"` or `"non-proxied"` auto-creates
the Cloudflare DNS record (CNAME to tunnel or A/AAAA to public IP).
- Simplify cloudflared tunnel from 100 per-hostname rules to wildcard
`*.viktorbarzin.me → Traefik`. Traefik still handles host-based routing.
- Add global Cloudflare provider via terragrunt.hcl (separate
cloudflare_provider.tf with Vault-sourced API key).
- Migrate 118 hostnames from centralized config.tfvars to per-service
dns_type. 17 hostnames remain centrally managed (Helm ingresses,
special cases).
- Update docs, AGENTS.md, CLAUDE.md, dns.md runbook.
```
BEFORE AFTER
config.tfvars (manual list) stacks/<svc>/main.tf
| module "ingress" {
v dns_type = "proxied"
stacks/cloudflared/ }
for_each = list |
cloudflare_record auto-creates
tunnel per-hostname cloudflare_record + annotation
```
## What is NOT in this change:
- Uptime Kuma monitor migration (still reads from config.tfvars)
- 17 remaining centrally-managed hostnames (Helm, special cases)
- Removal of allow_overwrite (keep until migration confirmed stable)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Vault is now the sole source of truth for secrets. SOPS pipeline
removed entirely — auth via `vault login -method=oidc`.
Part A: SOPS removal
- vault/main.tf: delete 990 lines (93 vars + 43 KV write resources),
add self-read data source for OIDC creds from secret/vault
- terragrunt.hcl: remove SOPS var loading, vault_root_token, check_secrets hook
- scripts/tg: remove SOPS decryption, keep -auto-approve logic
- .woodpecker/default.yml: replace SOPS with Vault K8s auth via curl
- Delete secrets.sops.json, .sops.yaml
Part B: External Secrets Operator
- New stack stacks/external-secrets/ with Helm chart + 2 ClusterSecretStores
(vault-kv for KV v2, vault-database for DB engine)
Part C: Database secrets engine (in vault/main.tf)
- MySQL + PostgreSQL connections with static role rotation (24h)
- 6 MySQL roles (speedtest, wrongmove, codimd, nextcloud, shlink, grafana)
- 6 PostgreSQL roles (trading, health, linkwarden, affine, woodpecker, claude_memory)
Part D: Kubernetes secrets engine (in vault/main.tf)
- RBAC for Vault SA to manage K8s tokens
- Roles: dashboard-admin, ci-deployer, openclaw, local-admin
- New scripts/vault-kubeconfig helper for dynamic kubeconfig
K8s auth method with scoped policies for CI, ESO, OpenClaw, Woodpecker sync.
- Add vault provider to root terragrunt.hcl (generated providers.tf)
- Delete stacks/vault/vault_provider.tf (now in generated providers.tf)
- Add 124 variable declarations + 43 vault_kv_secret_v2 resources to
vault/main.tf to populate Vault KV at secret/<stack-name>
- Migrate 43 consuming stacks to read secrets from Vault KV via
data "vault_kv_secret_v2" instead of SOPS var-file
- Add dependency "vault" to all migrated stacks' terragrunt.hcl
- Complex types (maps/lists) stored as JSON strings, decoded with
jsondecode() in locals blocks
Bootstrap secrets (vault_root_token, vault_authentik_client_id,
vault_authentik_client_secret) remain in SOPS permanently.
Apply order: vault stack first (populates KV), then all others.
terragrunt.hcl now loads:
- config.tfvars (required, plaintext)
- terraform.tfvars (optional, git-crypt — backward compat)
- secrets.auto.tfvars.json (optional, SOPS-decrypted)
before_hook checks that at least one secrets source exists.
Use `scripts/tg` wrapper for SOPS-based workflow.
Old terraform.tfvars kept for reference and backward compatibility.