docs(k8s-dashboard): SSO via Authentik oauth2-proxy — design [ci skip]

Design for letting namespace-owner users (e.g. gheorghe/vabbit81) open the
K8s Dashboard with their Authentik account, mapped to their per-user RBAC.
oauth2-proxy fronts kong-proxy, runs the OIDC code-flow, and injects the
user's id_token as Bearer so the apiserver applies existing namespace-owner
bindings. Additive + one ingress repoint; multi-audience scope mapping
keeps the CLI flow untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-04 01:59:49 +00:00
parent f201e4573e
commit 8b72eaebb0


@@ -0,0 +1,268 @@
# K8s Dashboard SSO via Authentik (oauth2-proxy) — Design
**Date:** 2026-06-04
**Status:** Approved (design)
**Author:** Viktor + Claude
**Scope:** Let namespace-owner users (e.g. gheorghe / `vabbit81`) open the
Kubernetes Dashboard at `https://k8s.viktorbarzin.me`, authenticate once with
their Authentik account, and manage their own namespace (inspect/edit
Deployments, read logs) with read-only visibility elsewhere — no second login,
no manually-pasted token.
---
## 1. Goal & Non-Goals
### Goal
A user in an Authentik `kubernetes-*` group browses to the dashboard, completes
the normal Authentik SSO redirect, and lands in a dashboard session whose
permissions are **their own** K8s RBAC (scoped by the OIDC `email`/`groups`
claims the apiserver already trusts). gheorghe gets full control of namespace
`vabbit81` and read-only cluster visibility.
### Non-Goals
- No change to the kube-apiserver OIDC flags (already configured).
- No change to the RBAC module or the `k8s_users` schema (gheorghe is already
onboarded; his RoleBindings already exist).
- No change to the existing **CLI** OIDC flow (kubelogin against the public
`kubernetes` client) used by viktor/anca — it must keep working untouched.
- Not replacing the dashboard (Headlamp et al. were considered and rejected).
- Not hardening/removing the existing static cluster-admin SA (tracked as an
optional future item in §9, deliberately out of scope here to keep blast
radius minimal and break-glass intact).
---
## 2. Current State (verified)
| Fact | Evidence |
|---|---|
| Dashboard deployed via Helm chart `kubernetes-dashboard` 7.12.0 in ns `kubernetes-dashboard`; fronted by `kubernetes-dashboard-kong-proxy` (HTTPS 443) | `stacks/k8s-dashboard/main.tf` |
| Ingress `k8s.viktorbarzin.me` is currently `auth = "required"` (Authentik forward-auth) → kong-proxy. After auth, the dashboard has only a **static cluster-admin SA token** for talking to the apiserver, so every authenticated user is effectively cluster-admin | `stacks/k8s-dashboard/main.tf` (ingress + `admin-user` CRB) |
| kube-apiserver already OIDC-configured: `--oidc-issuer-url=https://authentik.viktorbarzin.me/application/o/kubernetes/`, `--oidc-client-id=kubernetes`, `--oidc-username-claim=email`, `--oidc-groups-claim=groups` | `stacks/rbac/modules/rbac/apiserver-oidc.tf` |
| Per-user RBAC already created from `k8s_users` (Vault `secret/platform`): namespace-owners get a `RoleBinding` to ClusterRole `admin` in their namespace + a cluster-wide read-only ClusterRoleBinding, keyed on **email** | `stacks/rbac/modules/rbac/main.tf` |
| gheorghe = `vabbit81`, email `vabbit81@gmail.com`, namespace `vabbit81`, role `namespace-owner`. Bindings `namespace-owner-vabbit81` + `oidc-ns-owner-readonly-vabbit81` exist | Vault `secret/platform → k8s_users` |
| Authentik is Terraform-managed (provider adopted 2026-04-18, Wave 6a). Proxy providers/outposts/guest flow live in `stacks/authentik/*.tf`. The `goauthentik/authentik` provider is available to every stack via central `terragrunt.hcl` | `stacks/authentik/authentik_provider.tf`, `guest.tf` |
| The existing `kubernetes` OIDC **application** is still UI-managed (no `client_id="kubernetes"` in the repo) — must not be disturbed | repo grep (no match) |
| `ingress_factory` `auth` enum + comment-convention guard (`scripts/check-ingress-auth-comments.py`) require a `# auth = "<tier>": …` comment above any `auth = "none"/"app"` line | `modules/kubernetes/ingress_factory/main.tf`, `infra/.claude/CLAUDE.md` |
The missing link is purely **token injection**: nothing today gives the
dashboard the *user's own* OIDC id_token, so the apiserver can't apply the
per-user RBAC that already exists.
---
## 3. Architecture
```
Browser
→ Cloudflare (proxied)
→ Traefik (ingress auth = "none"; oauth2-proxy is now the gate)
→ oauth2-proxy.kubernetes-dashboard.svc:4180
├─ no session → 302 → Authentik OIDC code-flow (+PKCE) → /oauth2/callback
│ gated by a group policy: only kubernetes-{admins,power-users,namespace-owners}
└─ session valid → proxies upstream + sets `Authorization: Bearer <id_token>`
→ kubernetes-dashboard-kong-proxy.svc:443 (UNCHANGED)
→ dashboard `api` → kube-apiserver (Bearer token)
→ OIDC validates: iss ✓, aud ⊇ {kubernetes} ✓, sig ✓
→ username = email, groups = groups claim
→ RBAC: namespace-owner-<user> (admin in their ns) + cluster read-only
```
**The entire change is additive + one ingress repoint.** New objects:
oauth2-proxy Deployment/Service, an Authentik OIDC application/provider + scope
mapping + group policy, and an ESO-synced secret. The ingress backend flips
from kong-proxy → oauth2-proxy and `auth` flips `required → none`.
---
## 4. Components
### 4.1 Authentik (Terraform, in `stacks/k8s-dashboard/`)
Follows the `stacks/authentik/guest.tf` pattern (provider block reads
`secret/authentik → tf_api_token` from Vault).
- `authentik_provider_oauth2 "k8s_dashboard"`:
- `client_type = "confidential"`, `client_id = "k8s-dashboard"`,
`client_secret` from Vault.
- `allowed_redirect_uris = [{ matching_mode="strict",
url="https://k8s.viktorbarzin.me/oauth2/callback" }]`.
- `authorization_flow` = default implicit-consent (data source);
`invalidation_flow` = default (data source).
- `access_token_validity = "hours=1"`, `refresh_token_validity = "days=30"`.
- `include_claims_in_id_token = true` (so the id_token carries `email` +
`groups`; the apiserver reads the `email` claim for username regardless of
`sub_mode`).
- `property_mappings` = the default OIDC scope mappings (`openid`, `email`,
`profile`) **plus** the goauthentik `groups` scope mapping **plus** the
custom audience mapping below. (Resolved via `data
authentik_property_mapping_provider_scope` lookups so we don't drop the
standard claims.)
- `authentik_property_mapping_provider_scope "k8s_dashboard_aud"`:
- `scope_name = "k8s-dashboard-audience"`,
`expression = return {"aud": ["kubernetes", "k8s-dashboard"]}`.
- Emits **both** audiences so the apiserver (`kubernetes`) and oauth2-proxy
(`k8s-dashboard`) each find their own client id in `aud`.
- `authentik_application "k8s_dashboard"`: slug `k8s-dashboard`,
`protocol_provider` = the oauth2 provider, `policy_engine_mode = "any"`.
- Group gate: `authentik_policy_expression "k8s_dashboard_groups"`
`return ak_is_group_member(request.user, name="kubernetes-admins") or
ak_is_group_member(request.user, name="kubernetes-power-users") or
ak_is_group_member(request.user, name="kubernetes-namespace-owners")`,
bound to the application via `authentik_policy_binding`.
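A minimal stand-alone sketch of the group gate, for reasoning about who passes it. It models the policy expression above outside Authentik: `passes_group_gate` is a hypothetical helper, not an Authentik API, and the sample group names in the demo calls are made up. Note that the policy lists three groups explicitly, so another `kubernetes-`prefixed group would not pass.

```python
# Stand-alone model of the Authentik group policy; not the Authentik runtime.
ALLOWED_GROUPS = frozenset({
    "kubernetes-admins",
    "kubernetes-power-users",
    "kubernetes-namespace-owners",
})


def passes_group_gate(user_groups):
    """Return True if the user belongs to at least one allowed group."""
    return not ALLOWED_GROUPS.isdisjoint(user_groups)


if __name__ == "__main__":
    # Hypothetical memberships, for illustration only.
    print(passes_group_gate(["kubernetes-namespace-owners", "family"]))
    print(passes_group_gate(["kubernetes-viewers"]))
```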
### 4.2 Vault + ESO
- Vault `secret/k8s-dashboard` (new): `oauth2_proxy_client_id`,
`oauth2_proxy_client_secret`, `oauth2_proxy_cookie_secret` (32 random bytes,
base64). The Authentik provider reads the same client id/secret so the two
sides match.
- `ExternalSecret` in `stacks/k8s-dashboard/main.tf` → K8s Secret
`oauth2-proxy` in ns `kubernetes-dashboard`. First-apply: target the
ExternalSecret before the full apply (documented plan-time gotcha).
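One way to produce the `oauth2_proxy_cookie_secret` value described above is sketched here. The URL-safe base64 alphabet is an assumption (the text only says "base64"), and `new_cookie_secret` is a hypothetical helper name; storing the result in Vault is left out.

```python
import base64
import secrets


def new_cookie_secret() -> str:
    """Return 32 cryptographically random bytes, URL-safe base64 encoded."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).decode("ascii")


if __name__ == "__main__":
    # Prints a fresh secret each run; paste it into Vault, never into the repo.
    print(new_cookie_secret())
```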
### 4.3 oauth2-proxy (Terraform, in `stacks/k8s-dashboard/`)
- Image `quay.io/oauth2-proxy/oauth2-proxy:v7.x` (pin a concrete tag; SHA-tag
convention). `linux/amd64`.
- Deployment: 2 replicas (HA path) on the default rolling-update strategy
(stateless, so `Recreate` is not needed), readiness probe on `/ping`,
requests 25m CPU / 64Mi memory, memory limit 128Mi. Standard `dns_config`
`ignore_changes` (KYVERNO_LIFECYCLE_V1).
- Config (env `OAUTH2_PROXY_*` or args):
- `provider=oidc`,
`oidc_issuer_url=https://authentik.viktorbarzin.me/application/o/k8s-dashboard/`
- `client_id`/`client_secret`/`cookie_secret` from the `oauth2-proxy` Secret
- `redirect_url=https://k8s.viktorbarzin.me/oauth2/callback`
- `upstreams=https://kubernetes-dashboard-kong-proxy.kubernetes-dashboard.svc.cluster.local:443`
- `ssl_upstream_insecure_skip_verify=true` (kong self-signed cert)
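- `pass_authorization_header=true` (forwards the id_token to the upstream as
`Authorization: Bearer`; this is the setting that does the work here)
- `set_authorization_header=true` (belt-and-suspenders; it sets the Bearer
header on oauth2-proxy's response and mainly matters in auth_request-style
setups)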
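- `oidc_extra_audience=kubernetes` (accept the apiserver audience too; the
flag is `--oidc-extra-audience`, while the config-file and env forms are
plural, `oidc_extra_audiences`, so check the spelling against the pinned
version)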
- `scope=openid email profile groups offline_access`
- `email_domains=*` (the Authentik group policy is the real gate)
- `cookie_secure=true`, `cookie_domains=k8s.viktorbarzin.me`,
`whitelist_domains=k8s.viktorbarzin.me`, `cookie_refresh=30m`,
`cookie_expire=168h`
- `reverse_proxy=true`, `skip_provider_button=true`
- Service `oauth2-proxy` (port 4180).
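If the env-var route is taken, the names follow oauth2-proxy's documented convention: prefix `OAUTH2_PROXY_`, upper-case the option, and replace hyphens with underscores. The sketch below renders a subset of the settings above that way; `to_env_name` and `render_env` are hypothetical helpers, and the resulting names should still be checked against the documentation for the pinned image tag.

```python
def to_env_name(option: str) -> str:
    """Map an oauth2-proxy option name to its environment-variable form."""
    return "OAUTH2_PROXY_" + option.upper().replace("-", "_")


def render_env(settings: dict) -> dict:
    """Return a new dict keyed by environment-variable names."""
    return {to_env_name(key): value for key, value in settings.items()}


# A subset of the non-secret settings listed in this section.
SETTINGS = {
    "provider": "oidc",
    "oidc_issuer_url": "https://authentik.viktorbarzin.me/application/o/k8s-dashboard/",
    "redirect_url": "https://k8s.viktorbarzin.me/oauth2/callback",
    "pass_authorization_header": "true",
    "cookie_secure": "true",
    "reverse_proxy": "true",
}

if __name__ == "__main__":
    for name, value in render_env(SETTINGS).items():
        print(f"{name}={value}")
```

Secrets (`client_secret`, `cookie_secret`) are deliberately absent from the dict; they would come from the `oauth2-proxy` Secret rather than from literal values.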
### 4.4 Ingress (edit existing in `stacks/k8s-dashboard/main.tf`)
- `service_name = "oauth2-proxy"`, `backend_protocol = "HTTP"`, `port = 4180`.
- `auth = "none"` with the mandatory comment:
`# auth = "none": oauth2-proxy is the gate — it runs the Authentik OIDC
code-flow and injects the user's id_token as Bearer for dashboard→apiserver
auth. Group policy on the Authentik app restricts to kubernetes-* groups.`
- Keep `dns_type = "proxied"` and the homepage annotations.
---
## 5. The Audience Strategy (the crux) + Fallback
The apiserver is pinned to a single legacy `--oidc-client-id=kubernetes`; the
CLI uses the public `kubernetes` client. oauth2-proxy must be its own
confidential client (`k8s-dashboard`). A token with `aud=k8s-dashboard` alone
would be rejected by the apiserver; a token with `aud=kubernetes` alone would
be rejected by oauth2-proxy. **Resolution:** the Authentik scope mapping emits
`aud = ["kubernetes", "k8s-dashboard"]`; both validators do a membership check
and each finds its own id. `oidc_extra_audience=kubernetes` on oauth2-proxy is
an extra safety net in case Authentik's scope mapping *overrides* `aud` to a
single value rather than appending.
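The membership check both validators rely on can be modelled in a few lines. This is a simplified sketch, not the apiserver's or oauth2-proxy's actual code: `accepts` and `audience_list` are hypothetical names, and `extra_audiences` stands in for the extra-audience setting on oauth2-proxy.

```python
def audience_list(aud_claim):
    """Normalise `aud`, which may be a single string or a list of strings."""
    if isinstance(aud_claim, str):
        return [aud_claim]
    return list(aud_claim)


def accepts(aud_claim, client_id, extra_audiences=()):
    """True if the token's audience contains the validator's own client id
    or one of its explicitly allowed extra audiences."""
    allowed = {client_id, *extra_audiences}
    return any(aud in allowed for aud in audience_list(aud_claim))


if __name__ == "__main__":
    multi = ["kubernetes", "k8s-dashboard"]
    print(accepts(multi, "kubernetes"))            # apiserver view
    print(accepts(multi, "k8s-dashboard"))         # oauth2-proxy view
    print(accepts("k8s-dashboard", "kubernetes"))  # single-aud token at the apiserver
```

The safety net covers only the oauth2-proxy side: a token carrying `kubernetes` alone passes there once `kubernetes` is an allowed extra audience, but a token carrying `k8s-dashboard` alone still fails at the apiserver.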
**Apply-time verification (blocking):** decode a freshly issued id_token and
assert `aud` contains both `kubernetes` and `k8s-dashboard`, and that `groups`
is present. If Authentik refuses to emit a multi-valued `aud` via scope
mapping, **fallback** (documented, not preferred): repoint oauth2-proxy at the
existing `kubernetes` client made confidential and add `--oidc-client-secret`
to the kubelogin setup script — this unifies the audience at the cost of
touching the CLI flow. We try the additive multi-aud path first precisely to
avoid that.
---
## 6. Data Flow — gheorghe end-to-end
1. `https://k8s.viktorbarzin.me` → Cloudflare → Traefik (auth=none) →
oauth2-proxy.
2. No session → 302 to Authentik (`k8s-dashboard` client, code+PKCE).
3. Authentik runs the group policy: `vabbit81 ∈ kubernetes-namespace-owners` ✓.
Issues id_token: `email=vabbit81@gmail.com`,
`groups=[…,kubernetes-namespace-owners]`, `aud=[kubernetes,k8s-dashboard]`.
4. oauth2-proxy validates token (`k8s-dashboard ∈ aud`), sets session cookie,
proxies to kong-proxy with `Authorization: Bearer <id_token>`.
5. kong-proxy → dashboard `api` → kube-apiserver with the Bearer token.
6. apiserver OIDC: `username=vabbit81@gmail.com`, groups from claim.
7. RBAC: `RoleBinding namespace-owner-vabbit81` (ClusterRole `admin`) in ns
`vabbit81` + `ClusterRoleBinding oidc-ns-owner-readonly-vabbit81`.
8. Dashboard shows full control of `vabbit81`, read-only elsewhere — the goal.
No RBAC changes required; bindings already key on his email.
---
## 7. Testing
1. **Token shape (blocking):** decode an issued id_token; assert
`aud ⊇ {kubernetes,k8s-dashboard}` and `groups` present.
2. **Admin:** viktor logs in → sees/edits everything (cluster-admin group).
3. **Namespace-owner:** gheorghe logs in → can edit Deployments / read logs in
`vabbit81`; gets Forbidden creating resources in other namespaces; can view
(read-only) cluster resources.
4. **No regression:** viktor/anca CLI `kubectl` (kubelogin → public
`kubernetes` client) still works.
5. Browser checks driven via Playwright MCP; screenshot on failure.
---
## 8. Rollback
Single-commit revert of the ingress edit restores `service_name=kong-proxy`,
`backend_protocol=HTTPS`, `port=443`, `auth=required` → instant return to
Authentik forward-auth gating. oauth2-proxy + Authentik objects are additive
and inert once the ingress no longer points at them; they can be destroyed in a
follow-up. No apiserver, RBAC, data, or CLI changes to unwind.
---
## 9. Security Notes & Out-of-Scope Hardening
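- The id_token never reaches the browser in readable form: it travels as a
Bearer header only on the oauth2-proxy→kong hop. The browser holds the
oauth2-proxy session cookie (secure/httponly/samesite-lax, scoped to
`k8s.viktorbarzin.me`). With oauth2-proxy's default cookie session store that
cookie carries the session, tokens included, encrypted with `cookie_secret`,
so it is opaque to the client but is not purely server-side state.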
- Two gates: Authentik **group policy** (only `kubernetes-*` groups complete
the flow) and apiserver **RBAC** (per-user, by email). Defense in depth.
- The `authentik_walloff` blackbox guard is for `auth=none` carve-outs that
must NOT redirect to Authentik. The dashboard intentionally **does** redirect
(via oauth2-proxy), so it is **not** added to that guard.
- **Out of scope (optional future hardening):** the existing static
`admin-user` cluster-admin ClusterRoleBinding + SA remain. They predate this
change and provide break-glass. Removing them (admins would rely on SSO +
Vault `kubernetes/creds/dashboard-admin` + kubelogin) is a separate, reversible
security decision the user can request later. Not done here to keep blast
radius minimal.
---
## 10. Monitoring & Docs
- Uptime-Kuma external monitor is auto-created: with `dns_type=proxied`,
`ingress_factory` adds the `external-monitor` label. No manual monitor.
- oauth2-proxy readiness probe on `/ping`.
- Docs updated in the **same commit**: `docs/architecture/authentication.md`
(new OIDC app + dashboard SSO flow), `docs/architecture/multi-tenancy.md`
(dashboard access path for namespace-owners),
`.claude/reference/authentik-state.md` (new app/provider/scope-mapping),
`.claude/reference/service-catalog.md` (k8s-dashboard auth posture), and the
companion `2026-06-04-k8s-dashboard-sso-plan.md`.
---
## 11. Open Risks
| Risk | Mitigation |
|---|---|
| Authentik scope mapping overrides rather than appends `aud` | `oidc_extra_audience=kubernetes` + blocking apply-time token decode; fallback in §5 |
| Dashboard v7 ignores a pre-set Authorization header (known friction: kubernetes/dashboard #5105, #1213) | `pass_authorization_header` + `set_authorization_header`; validate in §7; kong forwards headers by default |
| ESO first-apply ordering | `terragrunt apply -target` the ExternalSecret first (documented plan-time pattern) |
| Single-master apiserver assumption (memory id=2484) | We don't touch apiserver flags; no new exposure |