state(dbaas): update encrypted state

Viktor Barzin 2026-05-10 21:00:00 +00:00
parent ee47197f3b
commit 7e69951cb9
9 changed files with 1672 additions and 1628 deletions


@@ -28,7 +28,7 @@ Violations cause state drift, which causes future applies to break or silently r
- **Apply**: Authenticate via `vault login -method=oidc`, then use `scripts/tg` (preferred — handles state decrypt/encrypt) or `terragrunt` directly. `scripts/tg` adds `-auto-approve` for `--non-interactive` applies.
- **New services need CI/CD** and **monitoring** (Prometheus/Uptime Kuma)
- **New service**: Use `setup-project` skill for full workflow
-- **Ingress**: `ingress_factory` module. Auth: `protected = true`. Anti-AI: on by default. **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created, so there is no need to edit `config.tfvars`.
+- **Ingress**: `ingress_factory` module. **Auth** (`auth` string enum, default `"required"`, fail-closed): `auth = "required"` (Authentik login required; the standard tier, equivalent to the legacy `protected = true` semantics), `auth = "public"` (auto-bind anonymous requests to the `guest` Authentik user via the dedicated `public` outpost; logged-in users keep their real identity in `X-authentik-username`; routed via `traefik-authentik-forward-auth-public` middleware → `ak-outpost-public.authentik.svc:9000`), `auth = "none"` (no Authentik middleware at all; for Anubis-fronted content, native-client APIs like Git/`/v2/`/WebDAV/CalDAV, webhook receivers, OAuth callbacks, and the Authentik outposts themselves). **`auth = "public"` only works for top-level browser navigation**: it requires a 302+cookie dance on first visit to set the guest session cookie, which CORS preflight rejects on XHR/fetch() and which automation scripts can't replay. Use `auth = "none"` for XHR APIs / webhooks / native clients. **Anti-AI**: on by default ONLY when `auth = "none"` (Authentik gating already discourages bots; redundant on `auth = "required"` and `"public"`). **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created, so there is no need to edit `config.tfvars`. Smoke-test target: `echo.viktorbarzin.me` (auth=public, header-reflecting backend).
- **Anubis PoW challenge** (`modules/kubernetes/anubis_instance/`): per-site reverse proxy that issues a 30-day JWT cookie after a tiny PoW solve. Use for **public, content-bearing sites without app-level auth** (blog, docs, wikis, static landing pages). Pattern: declare `module "anubis" { source = "../../modules/kubernetes/anubis_instance"; name = "X"; namespace = ...; target_url = "http://<backend>.<ns>.svc.cluster.local" }`, then in `ingress_factory` set `service_name = module.anubis.service_name`, `port = module.anubis.service_port`, `anti_ai_scraping = false`. Shared ed25519 key in Vault `secret/viktor` -> `anubis_ed25519_key`; cookie scoped to `viktorbarzin.me` so one solve covers all Anubis-fronted subdomains. **DO NOT put Anubis in front of Git/API/WebDAV/CLI endpoints** — clients without JS can't solve PoW. **Replicas default to 1** because Anubis stores in-flight challenges in process memory; a challenge issued by pod A and solved against pod B errors with `store: key not found` (HTTP 500). Bumping replicas requires wiring a shared Redis store (TODO). For path-level carve-outs (e.g. wrongmove has `/` behind Anubis but `/api` direct), declare a second `ingress_factory` with `ingress_path = ["/api"]` pointing at the bare backend service. Active on: blog, www, kms, travel, f1, cc, json, pb (privatebin), home (homepage), wrongmove (UI only). See `.claude/reference/patterns.md` "Anti-AI Scraping" for full layering.
- **Docker images**: Always build for `linux/amd64`. Use 8-char git SHA tags — `:latest` causes stale pull-through cache.
- **Private registry**: `forgejo.viktorbarzin.me/viktor/<name>` (Forgejo packages, OAuth-style PAT auth). Use `image: forgejo.viktorbarzin.me/viktor/<name>:<tag>` + `imagePullSecrets: [{name: registry-credentials}]`. Kyverno auto-syncs the Secret to all namespaces. Containerd `hosts.toml` on every node redirects to in-cluster Traefik LB `10.0.20.200` to avoid hairpin NAT. Push-side: viktor PAT in Vault `secret/ci/global/forgejo_push_token` (Forgejo container packages are scoped per-user; only the package owner can push, ci-pusher cannot write to viktor/*). Pull-side: cluster-puller PAT in Vault `secret/viktor/forgejo_pull_token`. Retention CronJob (`forgejo-cleanup` in `forgejo` ns, daily 04:00) keeps newest 10 versions + always `:latest`; integrity probed every 15min by `forgejo-integrity-probe` in `monitoring` ns (catalog walk + manifest HEAD on every blob). See `docs/plans/2026-05-07-forgejo-registry-consolidation-{design,plan}.md` for the migration history. Pull-through caches for upstream registries (DockerHub, GHCR, Quay, k8s.gcr, Kyverno) stay on the registry VM at `10.0.20.10` ports 5000/5010/5020/5030/5040 — the old port-5050 R/W private registry was decommissioned 2026-05-07.
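
The `auth` tiers and the Anubis path carve-out described in the bullets above can be sketched in HCL roughly as follows. This is an illustrative sketch, not configuration taken from the repository: the service name `example`, its namespace, and the module labels are hypothetical, only arguments named in the bullets are used, and any further arguments that `ingress_factory` may require are omitted.

```hcl
# Hypothetical service "example"; all names and the namespace are placeholders.

# Standard tier: Authentik login required (the fail-closed default).
module "ingress_private" {
  source    = "../../modules/kubernetes/ingress_factory"
  auth      = "required"
  dns_type  = "proxied"
  namespace = "example"
  name      = "example-private"
}

# Guest tier: anonymous browser visitors are bound to the `guest` user.
# Suitable only for top-level browser navigation, not XHR or webhooks.
module "ingress_guest" {
  source    = "../../modules/kubernetes/ingress_factory"
  auth      = "public"
  dns_type  = "proxied"
  namespace = "example"
  name      = "example-guest"
}

# Anubis in front of the UI: the PoW proxy sits between Traefik and the backend.
module "anubis" {
  source     = "../../modules/kubernetes/anubis_instance"
  name       = "example"
  namespace  = "example"
  target_url = "http://example.example.svc.cluster.local"
}

# UI ingress points at the Anubis service, so no Authentik middleware and
# no extra anti-AI layer (Anubis already plays that role).
module "ingress_ui" {
  source           = "../../modules/kubernetes/ingress_factory"
  auth             = "none"
  dns_type         = "proxied"
  namespace        = "example"
  name             = "example"
  service_name     = module.anubis.service_name
  port             = module.anubis.service_port
  anti_ai_scraping = false
}

# Path-level carve-out: /api bypasses Anubis and goes to the bare backend,
# because API clients without JavaScript cannot solve the PoW challenge.
module "ingress_api" {
  source       = "../../modules/kubernetes/ingress_factory"
  auth         = "none"
  dns_type     = "proxied"
  namespace    = "example"
  name         = "example-api"
  service_name = "example"
  ingress_path = ["/api"]
}
```

How `name` maps to a hostname, and whether two ingresses for the same host need matching or distinct names, is not shown in this diff, so treat the `name` values as assumptions to check against the module's variables.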


@@ -64,8 +64,7 @@ resource "authentik_group" "public_guests" {
# don't propagate, since policy request.context is not the same dict as
# flow_plan.context.
resource "authentik_policy_expression" "set_guest_user" {
-name = "set-public-guest-user"
-execution_logging = true
+name = "set-public-guest-user"
expression = trimspace(<<-EOT
request.context["flow_plan"].context["pending_user"] = ak_user_by(username="guest")
return True


@@ -101,8 +101,11 @@ resource "kubernetes_service" "echo" {
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# echo is a header-reflecting diagnostic service. It is public so that it
+# is reachable for forward-auth smoke-testing. Anyone visiting
+# echo.viktorbarzin.me sees exactly which X-authentik-* headers Traefik
+# forwarded to backends.
+auth = "public"
dns_type = "proxied"
namespace = kubernetes_namespace.echo.metadata[0].name
name = "echo"


@@ -228,8 +228,11 @@ resource "kubernetes_service" "navidrome" {
}
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# Subsonic API at /rest/* is consumed by mobile clients (DSub, Symfonium,
+# play:sub) which can't follow Authentik forward-auth 302s. Navidrome's
+# own user/password auth still gates everything.
+auth = "none"
dns_type = "proxied"
namespace = kubernetes_namespace.navidrome.metadata[0].name
name = "navidrome"


@@ -249,8 +249,14 @@ resource "kubernetes_service" "onlyoffice" {
}
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# Iframe-loaded by Nextcloud with JWT-signed session tokens; OnlyOffice
+# validates the JWT itself. Authentik forward-auth would 302 the iframe
+# load if the browser doesn't already have an Authentik cookie, and
+# OnlyOffice's server-to-server callbacks (Nextcloud <-> OnlyOffice,
+# for save events, etc.) run outside any browser session entirely.
+# The JWT is the auth gate.
+auth = "none"
dns_type = "proxied"
namespace = kubernetes_namespace.onlyoffice.metadata[0].name
name = "onlyoffice"


@@ -253,8 +253,11 @@ resource "kubernetes_service" "paperless-ngx" {
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# Paperless has a mobile app (`Paperless`) that uses /api/* with token
+# auth. The app can't follow Authentik 302s. Paperless's own login
+# gates the web UI.
+auth = "none"
namespace = kubernetes_namespace.paperless-ngx.metadata[0].name
name = "paperless-ngx"
service_name = "paperless-ngx"


@@ -353,8 +353,11 @@ resource "kubernetes_service" "resume" {
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# Public-facing resume page for HR/recruiters, who don't have Authentik
+# accounts. `auth = "public"` auto-binds them to guest, so the page renders
+# without a login prompt while visits are still audited in Authentik's
+# event log.
+auth = "public"
dns_type = "proxied"
namespace = kubernetes_namespace.resume.metadata[0].name
name = "resume"


@@ -179,8 +179,11 @@ resource "kubernetes_service" "tuya-bridge" {
}
module "ingress" {
-source = "../../modules/kubernetes/ingress_factory"
-auth = "required"
+source = "../../modules/kubernetes/ingress_factory"
+# Smart-home automation HTTP API. Home Assistant and other automations
+# call this with SERVICE_API_KEY in headers. Programmatic clients can't
+# follow Authentik 302s.
+auth = "none"
dns_type = "proxied"
namespace = kubernetes_namespace.tuya-bridge.metadata[0].name
name = "tuya-bridge"

File diff suppressed because one or more lines are too long