diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 2bde8934..a4541789 100755 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -28,7 +28,14 @@ Violations cause state drift, which causes future applies to break or silently r - **Apply**: Authenticate via `vault login -method=oidc`, then use `scripts/tg` (preferred — handles state decrypt/encrypt) or `terragrunt` directly. `scripts/tg` adds `-auto-approve` for `--non-interactive` applies. - **New services need CI/CD** and **monitoring** (Prometheus/Uptime Kuma) - **New service**: Use `setup-project` skill for full workflow -- **Ingress**: `ingress_factory` module. **Auth** (`auth` string enum, default `"required"` — fail-closed): `auth = "required"` (Authentik login required — the standard tier; legacy `protected = true` semantics), `auth = "public"` (auto-bind anonymous requests to the `guest` Authentik user via the dedicated `public` outpost; logged-in users keep their real identity in `X-authentik-username`; routed via `traefik-authentik-forward-auth-public` middleware → `ak-outpost-public.authentik.svc:9000`), `auth = "none"` (no Authentik middleware at all — for Anubis-fronted content, native-client APIs like Git/`/v2/`/WebDAV/CalDAV, webhook receivers, OAuth callbacks, and the Authentik outposts themselves). **`auth = "public"` only works for top-level browser navigation** — it requires a 302+cookie dance on first visit to set the guest session cookie, which CORS preflight rejects on XHR/fetch() and which automation scripts can't replay. Use `auth = "none"` for XHR APIs / webhooks / native clients. **Anti-AI**: on by default ONLY when `auth = "none"` (Authentik gating already discourages bots; redundant on `auth = "required"` and `"public"`). **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created — no need to edit `config.tfvars`. Smoke-test target: `echo.viktorbarzin.me` (auth=public, header-reflecting backend). 
+- **Ingress**: `ingress_factory` module. **Auth** (`auth` string enum, default `"required"` — fail-closed). Pick by asking "what gates the app?": + - `auth = "required"` — Authentik forward-auth gates every request. Use when the backend has **no built-in user auth** and Authentik is the only thing standing between strangers and the app (prowlarr, qbittorrent, netbox, phpipam, k8s-dashboard, foolery, any admin UI shipped without its own login). + - `auth = "app"` — the backend handles its own user authentication (NextAuth, Django, OAuth, bearer-token API, etc.); Authentik would only break it. No middleware attached; the app's own login is the gate. Examples: immich, linkwarden, tandoor, freshrss, affine, actualbudget, audiobookshelf, novelapp. **Functionally identical to `"none"`** — the distinct name exists to record intent at the call site. + - `auth = "public"` — Authentik anonymous binding via the dedicated `public` outpost (routes via `traefik-authentik-forward-auth-public` → `ak-outpost-public.authentik.svc:9000`). Strangers auto-bound to `guest`; logged-in users keep their identity in `X-authentik-username`. **Only works for top-level browser navigation** — CORS preflight rejects XHR/fetch and automation can't replay the cookie dance. Audit trail, not a gate. + - `auth = "none"` — no Authentik, no own-auth claim. Use for Anubis-fronted content (Anubis is the gate), native-client APIs (Git, `/v2/`, WebDAV/CalDAV, CardDAV), webhook receivers, OAuth callbacks, and Authentik outposts themselves. + - **Anti-exposure rule** (the reason `"app"` exists): only pick `"app"` or `"none"` AFTER you've verified the app has its own user auth (`"app"`) OR the endpoint is intentionally public (`"none"`). Default is `"required"` so accidental omission fails closed. **Convention**: when using `"app"` or `"none"`, add a comment line above the `auth = "..."` line stating what gates the app or why it's public. 
+ - **Anti-AI**: on by default when `auth = "none"` or `auth = "app"` (no Authentik to discourage bots); redundant on `"required"` and `"public"`. + - **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created — no need to edit `config.tfvars`. Smoke-test target: `echo.viktorbarzin.me` (auth=public, header-reflecting backend). - **Anubis PoW challenge** (`modules/kubernetes/anubis_instance/`): per-site reverse proxy that issues a 30-day JWT cookie after a tiny PoW solve. Use for **public, content-bearing sites without app-level auth** (blog, docs, wikis, static landing pages). Pattern: declare `module "anubis" { source = "../../modules/kubernetes/anubis_instance"; name = "X"; namespace = ...; target_url = "http://..svc.cluster.local" }`, then in `ingress_factory` set `service_name = module.anubis.service_name`, `port = module.anubis.service_port`, `anti_ai_scraping = false`. Shared ed25519 key in Vault `secret/viktor` -> `anubis_ed25519_key`; cookie scoped to `viktorbarzin.me` so one solve covers all Anubis-fronted subdomains. **DO NOT put Anubis in front of Git/API/WebDAV/CLI endpoints** — clients without JS can't solve PoW. **Replicas default to 1** because Anubis stores in-flight challenges in process memory; a challenge issued by pod A and solved against pod B errors with `store: key not found` (HTTP 500). Bumping replicas requires wiring a shared Redis store (TODO). For path-level carve-outs (e.g. wrongmove has `/` behind Anubis but `/api` direct), declare a second `ingress_factory` with `ingress_path = ["/api"]` pointing at the bare backend service. Active on: blog, www, kms, travel, f1, cc, json, pb (privatebin), home (homepage), wrongmove (UI only). See `.claude/reference/patterns.md` "Anti-AI Scraping" for full layering. - **Docker images**: Always build for `linux/amd64`. Use 8-char git SHA tags — `:latest` causes stale pull-through cache. 
- **Private registry**: `forgejo.viktorbarzin.me/viktor/` (Forgejo packages, OAuth-style PAT auth). Use `image: forgejo.viktorbarzin.me/viktor/:` + `imagePullSecrets: [{name: registry-credentials}]`. Kyverno auto-syncs the Secret to all namespaces. Containerd `hosts.toml` on every node redirects to in-cluster Traefik LB `10.0.20.200` to avoid hairpin NAT. Push-side: viktor PAT in Vault `secret/ci/global/forgejo_push_token` (Forgejo container packages are scoped per-user; only the package owner can push, ci-pusher cannot write to viktor/*). Pull-side: cluster-puller PAT in Vault `secret/viktor/forgejo_pull_token`. Retention CronJob (`forgejo-cleanup` in `forgejo` ns, daily 04:00) keeps newest 10 versions + always `:latest`; integrity probed every 15min by `forgejo-integrity-probe` in `monitoring` ns (catalog walk + manifest HEAD on every blob). See `docs/plans/2026-05-07-forgejo-registry-consolidation-{design,plan}.md` for the migration history. Pull-through caches for upstream registries (DockerHub, GHCR, Quay, k8s.gcr, Kyverno) stay on the registry VM at `10.0.20.10` ports 5000/5010/5020/5030/5040 — the old port-5050 R/W private registry was decommissioned 2026-05-07. diff --git a/modules/kubernetes/ingress_factory/main.tf b/modules/kubernetes/ingress_factory/main.tf index a66f5a47..0f239fb4 100644 --- a/modules/kubernetes/ingress_factory/main.tf +++ b/modules/kubernetes/ingress_factory/main.tf @@ -35,23 +35,48 @@ variable "auth" { type = string default = "required" description = <<-EOT - Authentik auth posture for this ingress: - * "required" (default): standard Authentik forward-auth — login required. - Catches the legacy `protected = true` semantics. - * "public": public-tier — auto-bind anonymous requests to the `guest` - Authentik user (no UI prompt), audited but not gated. Logged-in - users keep their real identity in X-authentik-username. - * "none": no Authentik forward-auth middleware at all. 
Use for - Anubis-fronted content sites, native-client APIs (Git, /v2/, WebDAV), - webhook receivers, and the Authentik outpost itself. Anti-AI - headers are auto-enabled when auth = "none" unless overridden. + Auth posture for this ingress. Pick by asking "what gates the app?": - Defaulting to "required" enforces "every ingress must have an explicit - auth decision recorded by Authentik" — accidental omission fails closed. + * "required" (default, fail-closed): Authentik forward-auth gates every + request. Pick this when the backend has NO built-in user auth and + Authentik is the only thing standing between strangers and the app. + Examples: prowlarr, qbittorrent, netbox, phpipam, k8s-dashboard, any + admin UI shipped without its own login. + + * "app": the backend handles its own user authentication (NextAuth, + Django sessions, OAuth, bearer-token API, etc.) and Authentik would + only get in the way. No Authentik middleware is attached; the app's + own login is the gate. Examples: immich, linkwarden, tandoor, + freshrss, affine, actualbudget, audiobookshelf, novelapp. + **Functionally identical to "none"** — the distinct name exists to + record intent at the call site so future readers don't have to guess. + + * "public": Authentik anonymous binding via the `public` outpost. + Strangers are auto-bound to the `guest` Authentik user; logged-in + users keep their identity in X-authentik-username. Only works for + top-level browser navigation — CORS preflight rejects XHR/fetch and + automation can't replay the cookie dance. Audit trail, not a gate. + + * "none": no Authentik middleware, no own-auth claim — explicitly + public or unauthenticated-by-design. Use for: Anubis-fronted content + sites (where Anubis is the gate), native-client APIs that auth + themselves (Git, /v2/, WebDAV/CalDAV, CardDAV), webhook receivers, + OAuth callbacks, and Authentik outposts themselves. 
+ + **Anti-exposure rule** (the reason "app" exists as a distinct mode): + only pick "app" or "none" AFTER you have verified the app has its own + user auth (for "app") OR the endpoint is intentionally public (for + "none"). Picking either of these on a naked admin UI exposes it to the + internet. The default is "required" specifically so accidental omission + fails closed. + + **Convention**: when using "app" or "none", add a comment line above + the `auth = "..."` line stating what gates the app or why it's public. + Future-you reads the call site, not the module description. EOT validation { - condition = contains(["required", "public", "none"], var.auth) - error_message = "auth must be one of: required, public, none." + condition = contains(["required", "app", "public", "none"], var.auth) + error_message = "auth must be one of: required, app, public, none." } } variable "ingress_path" { @@ -162,13 +187,17 @@ variable "homepage_enabled" { locals { effective_host = var.full_host != null ? var.full_host : "${var.host != null ? var.host : var.name}.${var.root_domain}" - # Anti-AI default: ON only when no Authentik auth is in front of the ingress - # (i.e. auth = "none" — public Anubis-fronted content sites, etc.). When - # Authentik gates the request (required/public), the auth flow already - # discourages bots, so anti-AI noise is redundant. - effective_anti_ai = var.anti_ai_scraping != null ? var.anti_ai_scraping : (var.auth == "none") + # Anti-AI default: ON when no Authentik auth fronts the ingress (auth = + # "none" or auth = "app" — either the app gates users itself or the site + # is intentionally public). When Authentik gates the request + # (required/public), the auth flow already discourages bots. + effective_anti_ai = var.anti_ai_scraping != null ? var.anti_ai_scraping : (var.auth == "none" || var.auth == "app") - # Auth middleware selection. "none" attaches no Authentik middleware at all. + # Auth middleware selection. 
"app" and "none" both attach no Authentik + # middleware — "app" signals "the backend has its own user auth", "none" + # signals "intentionally public / native-client API / webhook". The + # distinction lives at the call site for human readers; the runtime + # effect is identical. auth_middleware = ( var.auth == "required" ? "traefik-authentik-forward-auth@kubernetescrd" : var.auth == "public" ? "traefik-authentik-forward-auth-public@kubernetescrd" : diff --git a/stacks/actualbudget/factory/main.tf b/stacks/actualbudget/factory/main.tf index 869f9036..a72ee0f4 100644 --- a/stacks/actualbudget/factory/main.tf +++ b/stacks/actualbudget/factory/main.tf @@ -156,10 +156,10 @@ resource "kubernetes_service" "actualbudget" { module "ingress" { source = "../../../modules/kubernetes/ingress_factory" - # auth = "none": Actual Budget enforces a server password + per-user login + # auth = "app": Actual Budget enforces a server password + per-user login # on its own sync API. Authentik forward-auth was 302-ing the mobile/web # sync clients; Actual's own auth gates users. - auth = "none" + auth = "app" namespace = "actualbudget" name = "budget-${var.name}" tls_secret_name = var.tls_secret_name diff --git a/stacks/affine/main.tf b/stacks/affine/main.tf index 055924b6..22cc0fa1 100644 --- a/stacks/affine/main.tf +++ b/stacks/affine/main.tf @@ -359,10 +359,10 @@ resource "kubernetes_service" "affine" { module "ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": AFFiNE has its own workspace auth + bearer-token API + # auth = "app": AFFiNE has its own workspace auth + bearer-token API # used by desktop/mobile sync clients. Authentik forward-auth was 302-ing # those API callers; AFFiNE's own auth gates users. 
- auth = "none" + auth = "app" dns_type = "non-proxied" namespace = kubernetes_namespace.affine.metadata[0].name name = "affine" diff --git a/stacks/ebooks/main.tf b/stacks/ebooks/main.tf index 29a6dd63..b9b16389 100644 --- a/stacks/ebooks/main.tf +++ b/stacks/ebooks/main.tf @@ -662,10 +662,10 @@ resource "kubernetes_service" "audiobookshelf" { module "audiobookshelf_ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": Audiobookshelf has its own user/password login + API + # auth = "app": Audiobookshelf has its own user/password login + API # tokens used by the iOS/Android Audiobookshelf app. Authentik forward-auth # was 302-ing the mobile clients; ABS's own auth gates users. - auth = "none" + auth = "app" dns_type = "non-proxied" namespace = kubernetes_namespace.ebooks.metadata[0].name name = "audiobookshelf" diff --git a/stacks/freshrss/main.tf b/stacks/freshrss/main.tf index 8b40b9cd..35a22953 100644 --- a/stacks/freshrss/main.tf +++ b/stacks/freshrss/main.tf @@ -229,10 +229,10 @@ resource "kubernetes_service" "freshrss" { } module "ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": FreshRSS has built-in user login and exposes Fever + + # auth = "app": FreshRSS has built-in user login and exposes Fever + # GReader APIs (/api/fever.php, /api/greader.php) used by mobile RSS # readers like Reeder/FeedMe. Authentik forward-auth was 302-ing those. - auth = "none" + auth = "app" dns_type = "proxied" namespace = "freshrss" name = "rss" diff --git a/stacks/immich/main.tf b/stacks/immich/main.tf index a50d3641..a1849524 100644 --- a/stacks/immich/main.tf +++ b/stacks/immich/main.tf @@ -738,10 +738,10 @@ resource "kubernetes_service" "immich-machine-learning" { module "ingress-immich" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": Immich has its own user auth + bearer-token API. Authentik + # auth = "app": Immich has its own user auth + bearer-token API. 
Authentik # forward-auth on `/api/*` was 302-ing the iOS/Android Immich app and any # external API consumer. App-level auth is the gate now. - auth = "none" + auth = "app" dns_type = "non-proxied" namespace = kubernetes_namespace.immich.metadata[0].name name = "immich" diff --git a/stacks/linkwarden/main.tf b/stacks/linkwarden/main.tf index 0cd201ef..bcebbd61 100644 --- a/stacks/linkwarden/main.tf +++ b/stacks/linkwarden/main.tf @@ -229,10 +229,10 @@ resource "kubernetes_service" "linkwarden" { module "ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": Linkwarden uses NextAuth (NEXTAUTH_SECRET/URL set above) + # auth = "app": Linkwarden uses NextAuth (NEXTAUTH_SECRET/URL set above) # and exposes /api/* for its mobile clients. Authentik forward-auth would # 302 those callers; app-level NextAuth gates users. - auth = "none" + auth = "app" dns_type = "proxied" namespace = kubernetes_namespace.linkwarden.metadata[0].name name = "linkwarden" diff --git a/stacks/novelapp/main.tf b/stacks/novelapp/main.tf index 644c43d6..1c66030b 100644 --- a/stacks/novelapp/main.tf +++ b/stacks/novelapp/main.tf @@ -224,11 +224,11 @@ resource "kubernetes_service" "novelapp" { module "ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": novelapp handles its own auth via NextAuth + Google OAuth + # auth = "app": novelapp handles its own auth via NextAuth + Google OAuth # (AUTH_URL/AUTH_SECRET/GOOGLE_CLIENT_{ID,SECRET} env vars above). Putting # Authentik forward-auth in front double-gates the app and breaks iOS/Android # webview clients that can't complete the Authentik 302/cookie dance. 
- auth = "none" + auth = "app" dns_type = "non-proxied" namespace = kubernetes_namespace.novelapp.metadata[0].name name = "novelapp" diff --git a/stacks/tandoor/main.tf b/stacks/tandoor/main.tf index e265431b..94242a8f 100644 --- a/stacks/tandoor/main.tf +++ b/stacks/tandoor/main.tf @@ -259,10 +259,10 @@ resource "kubernetes_service" "tandoor" { module "ingress" { source = "../../modules/kubernetes/ingress_factory" - # auth = "none": Tandoor uses Django auth (SECRET_KEY set above) and exposes + # auth = "app": Tandoor uses Django auth (SECRET_KEY set above) and exposes # /api/* with token auth for its mobile clients. Authentik forward-auth was # 302-ing those callers; Django session/token auth gates users. - auth = "none" + auth = "app" dns_type = "proxied" namespace = kubernetes_namespace.tandoor.metadata[0].name name = "tandoor"
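
The call-site convention from the `auth` variable description (a comment above the `auth = "..."` line stating what gates the app or why it is public) could look like the sketch below. This is an illustration under stated assumptions, not code from this change: the `exampleapp`, `examplehook`, and `exampleadmin` names are hypothetical, `var.tls_secret_name` is assumed to be declared in the calling stack, and the argument list is limited to inputs that already appear in the call sites above, so a real call may need more.

```hcl
# Hypothetical stacks for illustration only; none of these names exist in the repo.

# Case 1: the backend has its own user auth, so the app is the gate.
module "ingress_app" {
  source = "../../modules/kubernetes/ingress_factory"
  # auth = "app": exampleapp has built-in user login and a token-authenticated
  # /api/* used by its mobile client. Authentik forward-auth would 302 those
  # callers; the app's own login gates users.
  auth            = "app"
  dns_type        = "proxied"
  namespace       = "exampleapp"
  name            = "exampleapp"
  tls_secret_name = var.tls_secret_name
}

# Case 2: intentionally unauthenticated endpoint (webhook receiver).
module "ingress_hook" {
  source = "../../modules/kubernetes/ingress_factory"
  # auth = "none": webhook receiver called by an external service that cannot
  # complete a browser login. Intentionally public.
  auth            = "none"
  dns_type        = "proxied"
  namespace       = "examplehook"
  name            = "examplehook"
  tls_secret_name = var.tls_secret_name
}

# Case 3: admin UI with no login of its own. `auth` is omitted on purpose:
# the module default is "required", so Authentik forward-auth gates every request.
module "ingress_admin" {
  source          = "../../modules/kubernetes/ingress_factory"
  dns_type        = "non-proxied"
  namespace       = "exampleadmin"
  name            = "exampleadmin"
  tls_secret_name = var.tls_secret_name
}
```

Cases 1 and 2 attach no Authentik middleware and differ only in the intent recorded in the comment, which is why the comment matters at review time. Any value outside `required`, `app`, `public`, `none` is rejected by the module's `validation` block with "auth must be one of: required, app, public, none."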