From ad3432d68528fe541811abd3a0b771660d037b61 Mon Sep 17 00:00:00 2001 From: Viktor Barzin Date: Thu, 4 Jun 2026 02:58:27 +0000 Subject: [PATCH] docs(k8s-dashboard): dashboard SSO as-built (Option B multi-issuer apiserver) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Update authentication.md (structured multi-issuer AuthenticationConfiguration + dashboard SSO flow), multi-tenancy.md (web dashboard access), authentik-state (new k8s-dashboard app + gheorghe groups), service-catalog (dashboard auth), and the k8s-version-upgrade runbook (kubeadm wipes --authentication-config → re-apply rbac post-upgrade). Design/plan addenda record the issuer-constraint pivot from the original dual-aud approach. [ci skip] Co-Authored-By: Claude Opus 4.8 --- .claude/reference/authentik-state.md | 13 ++++- .claude/reference/service-catalog.md | 2 +- docs/architecture/authentication.md | 51 ++++++++++++++---- docs/architecture/multi-tenancy.md | 12 +++++ .../2026-06-04-k8s-dashboard-sso-design.md | 52 +++++++++++++++++++ .../2026-06-04-k8s-dashboard-sso-plan.md | 8 +++ docs/runbooks/k8s-version-upgrade.md | 22 ++++++++ 7 files changed, 147 insertions(+), 13 deletions(-) diff --git a/.claude/reference/authentik-state.md b/.claude/reference/authentik-state.md index 30094776..75efc08a 100644 --- a/.claude/reference/authentik-state.md +++ b/.claude/reference/authentik-state.md @@ -2,7 +2,7 @@ > Snapshot of applications, groups, users, and flows. Use `authentik` skill for management tasks. 
-## Applications (10) +## Applications (11) | Application | Provider Type | Auth Flow | |-------------|--------------|-----------| | Cloudflare Access | OAuth2/OIDC | explicit consent | @@ -12,10 +12,19 @@ | Headscale | OAuth2/OIDC | explicit consent | | Immich | OAuth2/OIDC | explicit consent | | Kubernetes | OAuth2/OIDC (public) | implicit consent | +| Kubernetes Dashboard | OAuth2/OIDC (confidential) | implicit consent | | linkwarden | OAuth2/OIDC | explicit consent | | Matrix | OAuth2/OIDC | implicit consent | | wrongmove | OAuth2/OIDC | implicit consent | +> **Kubernetes Dashboard** (TF-managed in `stacks/k8s-dashboard/authentik.tf`): +> confidential client `k8s-dashboard` consumed by oauth2-proxy in front of the +> web dashboard. Has a custom scope mapping `k8s-dashboard audience` (scope +> `k8s-dashboard-audience`) emitting `aud=[kubernetes,k8s-dashboard]`, plus a +> group-access policy restricting login to `kubernetes-admins` / +> `kubernetes-power-users` / `kubernetes-namespace-owners`. The apiserver trusts +> this app's issuer via the `rbac` stack structured `AuthenticationConfiguration`. 
+ ## Groups (9) | Group | Parent | Superuser | Purpose | |-------|--------|-----------|---------| @@ -36,7 +45,7 @@ | vbarzin@gmail.com | Viktor Barzin | internal | authentik Admins, Home Server Admins, Wrongmove Users, Headscale Users | | emil.barzin@gmail.com | Emil Barzin | internal | Home Server Admins, Headscale Users | | ancaelena98@gmail.com | Anca Milea | external | Wrongmove Users, Headscale Users | -| vabbit81@gmail.com | GHEORGHE Milea | external | Headscale Users | +| vabbit81@gmail.com | GHEORGHE Milea | external | Headscale Users, kubernetes-namespace-owners, sops-vabbit81 | | valentinakolevabarzina@gmail.com | Valentina | internal | Headscale Users | | anca.r.cristian10@gmail.com | -- | internal | Wrongmove Users | | kadir.tugan@gmail.com | Kadir | internal | Wrongmove Users | diff --git a/.claude/reference/service-catalog.md b/.claude/reference/service-catalog.md index c9a7638c..90ee7367 100644 --- a/.claude/reference/service-catalog.md +++ b/.claude/reference/service-catalog.md @@ -30,7 +30,7 @@ ## Admin | Service | Description | Stack | |---------|-------------|-------| -| k8s-dashboard | Kubernetes dashboard | k8s-dashboard | +| k8s-dashboard | Kubernetes dashboard at `k8s.viktorbarzin.me`. Authentik SSO via **oauth2-proxy** (`auth=none`; oauth2-proxy injects the user's OIDC id_token from the `k8s-dashboard` confidential client as Bearer → per-user RBAC at the apiserver). Multi-issuer apiserver auth in `stacks/rbac`. | k8s-dashboard | | reverse-proxy | Generic reverse proxy | reverse-proxy | | t3code | Multi-user coding-agent GUI at t3.viktorbarzin.me. `auth=required` (Authentik) → DevVM `t3-dispatch` service (`10.0.10.10:3780`, unprivileged user) maps `X-authentik-username` → that user's own `t3-serve@` instance (file perms enforced by uid; wizard→:3773, emo→:3774; unmapped→403) and **auto-injects the t3 session on first visit** (mints via the root `t3-mint` wrapper, scoped sudoers → `/api/auth/bootstrap` `t3_session` cookie). 
Source of truth `/etc/ttyd-user-map`; `t3-provision-users` reconcile (systemd timer) turns map entries into `t3-serve@` instances + `dispatch.json`. **Add a user:** one line in `/etc/ttyd-user-map` (must already be an OS account + Authentik identity) → reconcile. DevVM artifacts versioned in `infra/scripts/` (`t3-serve@.service`, `t3-provision-users`, `t3-dispatch/`, `t3-mint`, `sudoers-t3-autopair`, `t3-autoupdate.*`); TF (`stacks/t3code`) owns only the ingress + Endpoints→:3780. **t3 binary tracks `nightly`** via `t3-autoupdate` (daily systemd timer; health-check + auto-rollback on a bad build; restarts only idle instances) — so new models (e.g. Opus 4.8) land as t3 ships them. Native app/app.t3.codes unsupported (cross-origin) — deferred until published. Design: `docs/plans/2026-06-01-t3-auto-provision-*`. | t3code | diff --git a/docs/architecture/authentication.md b/docs/architecture/authentication.md index 361352a9..85a9cf78 100644 --- a/docs/architecture/authentication.md +++ b/docs/architecture/authentication.md @@ -99,24 +99,55 @@ Authentik provides OIDC for 10 applications: | Grafana | OIDC | Monitoring dashboard SSO | | Headscale | OIDC | Tailscale control plane auth | | Immich | OIDC | Photo management SSO | -| Kubernetes | OIDC (public client) | K8s API authentication | +| Kubernetes | OIDC (public client) | K8s API authentication (kubectl / kubelogin CLI) | +| Kubernetes Dashboard | OIDC (confidential, via oauth2-proxy) | Web dashboard SSO with per-user RBAC | | Linkwarden | OIDC | Bookmark manager SSO | | Matrix | OIDC | Matrix homeserver SSO | | Wrongmove | OIDC | Real estate app SSO | ### Kubernetes RBAC via OIDC -Kubernetes API server is configured with OIDC issuer: `https://authentik.viktorbarzin.me/application/o/kubernetes/` +The kube-apiserver uses a **structured `AuthenticationConfiguration`** +(`apiserver.config.k8s.io/v1`, file `/etc/kubernetes/pki/auth-config.yaml`, +flag `--authentication-config`) that trusts **two** Authentik issuers — 
managed +by `stacks/rbac/modules/rbac/apiserver-oidc.tf`: -The public client flow: +| Issuer (Authentik app) | Audience | Used by | +|---|---|---| +| `…/application/o/kubernetes/` | `kubernetes` | `kubectl` / kubelogin CLI (public client) | +| `…/application/o/k8s-dashboard/` | `k8s-dashboard` | oauth2-proxy in front of the web Dashboard (confidential client) | -1. User authenticates to Authentik via `kubectl` plugin or dashboard -2. Receives OIDC token with group claims -3. K8s API validates token against Authentik JWKS endpoint -4. Maps groups to ClusterRoleBindings: - - `kubernetes-admins` → `cluster-admin` (full cluster access) - - `kubernetes-power-users` → custom ClusterRole (read-mostly, limited write) - - `kubernetes-namespace-owners` → RoleBindings per namespace (namespace-scoped admin) +Both map `username <- email` and `groups <- groups` with **empty prefixes** (so +tokens map to RBAC subjects `kind: User, name: <email>` and verbatim group +names). This replaced the legacy single `--oidc-*` flags (one issuer only), +which a kubeadm upgrade had silently wiped. + +The flow: + +1. User authenticates to Authentik (via the `kubectl` plugin, or via oauth2-proxy + for the web Dashboard). +2. Receives an OIDC id_token with `email` + `groups` claims. +3. K8s API validates the token against the matching issuer's Authentik JWKS. +4.
RBAC binds the user (by email) to roles — see `stacks/rbac/modules/rbac/main.tf`: + - `admin` role users → `cluster-admin` + - `power-user` role → custom ClusterRole (read-mostly, limited write) + - `namespace-owner` role → `admin` RoleBinding in their namespace(s) + cluster read-only + +> **Web Dashboard SSO:** the `k8s.viktorbarzin.me` ingress points at +> **oauth2-proxy** (`stacks/k8s-dashboard/oauth2_proxy.tf`, `auth = "none"` — +> oauth2-proxy is the gate), which runs the Authentik OIDC code-flow against the +> `k8s-dashboard` confidential client and injects the user's id_token as +> `Authorization: Bearer` upstream to the Dashboard's Kong proxy. The Dashboard +> then talks to the apiserver **as the user**, so per-user RBAC applies (a +> namespace-owner manages only their namespace; admins see everything). A group +> policy on the Authentik app restricts login to the `kubernetes-*` groups. +> Replaced the prior forward-auth + static cluster-admin ServiceAccount (which +> made every authenticated user cluster-admin). Design: +> `docs/plans/2026-06-04-k8s-dashboard-sso-design.md`. + +> **Upgrade caveat:** `--authentication-config` lives in the kube-apiserver +> static-pod manifest, which `kubeadm upgrade` regenerates — **re-apply the +> `rbac` stack after any control-plane upgrade** to restore apiserver OIDC. ### Authentik Groups diff --git a/docs/architecture/multi-tenancy.md b/docs/architecture/multi-tenancy.md index e3e0b5c7..d4fb8015 100644 --- a/docs/architecture/multi-tenancy.md +++ b/docs/architecture/multi-tenancy.md @@ -171,6 +171,18 @@ Each user receives: ``` 6. User can now run `kubectl` commands +### Web Dashboard (no CLI needed) + +Namespace-owners can also manage their namespace from the **Kubernetes +Dashboard** at `https://k8s.viktorbarzin.me` using their Authentik account — no +kubectl, no token paste.
oauth2-proxy runs the SSO flow and injects the user's +OIDC id_token, so the dashboard talks to the apiserver **as the user**: a +namespace-owner gets full control of their namespace(s) and read-only +visibility elsewhere; admins see everything. Login is restricted (Authentik +group policy) to the `kubernetes-*` groups. See +`docs/architecture/authentication.md` → "Kubernetes RBAC via OIDC" and +`docs/plans/2026-06-04-k8s-dashboard-sso-design.md`. + ### RBAC Groups | Group | ClusterRole | Scope | Members | diff --git a/docs/plans/2026-06-04-k8s-dashboard-sso-design.md b/docs/plans/2026-06-04-k8s-dashboard-sso-design.md index f090e55a..61c2d8ed 100644 --- a/docs/plans/2026-06-04-k8s-dashboard-sso-design.md +++ b/docs/plans/2026-06-04-k8s-dashboard-sso-design.md @@ -266,3 +266,55 @@ follow-up. No apiserver, RBAC, data, or CLI changes to unwind. | Dashboard v7 ignores a pre-set Authorization header (known friction: kubernetes/dashboard #5105, #1213) | `pass_authorization_header` + `set_authorization_header`; validate in §7; kong forwards headers by default | | ESO first-apply ordering | `terragrunt apply -target` the ExternalSecret first (documented plan-time pattern) | | Single-master apiserver assumption (memory id=2484) | We don't touch apiserver flags; no new exposure | + +--- + +## 12. ADDENDUM (2026-06-04) — As-built pivoted to Option B (apiserver multi-issuer) + +Sections 4–5 above describe the *original* plan: a separate `k8s-dashboard` +confidential client whose token carries a dual `aud` so the apiserver (pinned +to `--oidc-client-id=kubernetes`) would accept it **without** an apiserver +change. **That approach does not work**, for a reason discovered during +implementation: + +1. **The issuer is the binding constraint, not the audience.** Every Authentik + OAuth2 application has its own per-slug issuer. 
A token from the + `k8s-dashboard` app has `iss=…/o/k8s-dashboard/`, but the apiserver does an + **exact issuer-string match** against its single configured issuer + (`…/o/kubernetes/`). The dual-`aud` scope mapping is irrelevant — the token + is rejected on issuer before audience is even considered. + +2. **Apiserver OIDC was already silently broken.** Inspecting the live + `kube-apiserver` static-pod manifest showed **no `--oidc-*` flags at all** — + the kubeadm v1.34 upgrade had regenerated the manifest and dropped the + flags the `rbac` stack's `null_resource` had injected (its content-hash + trigger never re-fired). So OIDC apiserver auth was off cluster-wide. + +3. **Reusing the `kubernetes` app (make it confidential) — rejected.** It would + force distributing the now-confidential client secret to every CLI user via + the **public** k8s-portal `/setup/script` endpoint (a leak), plus + re-onboarding existing CLI users. Too invasive. + +**As-built = Option B: structured `AuthenticationConfiguration` on the +apiserver trusting BOTH issuers.** `stacks/rbac/modules/rbac/apiserver-oidc.tf` +now writes `/etc/kubernetes/pki/auth-config.yaml` +(`apiserver.config.k8s.io/v1`) with two `jwt` issuers — `kubernetes` +(audience `kubernetes`, for the kubelogin CLI) and `k8s-dashboard` (audience +`k8s-dashboard`, for oauth2-proxy) — each mapping `username<-email` and +`groups<-groups` with empty prefixes (to match existing RBAC subjects). The +legacy `--oidc-*` flags are replaced by `--authentication-config=…`. The remote +script health-gates `/livez` and **auto-rolls-back** the manifest if the +single-master apiserver doesn't recover. The oauth2-proxy + `k8s-dashboard` +Authentik app from §4 are reused unchanged (the dual-`aud` mapping is now +harmless — issuer2 only requires `k8s-dashboard ∈ aud`). + +This keeps the CLI flow 100% untouched (its own `kubernetes` issuer is one of +the two trusted issuers) and restores the apiserver OIDC that the kubeadm +upgrade had broken. 
+ +**Known drift (carried forward):** a future `kubeadm upgrade` will again +regenerate the manifest and drop `--authentication-config`. The +content-hash trigger won't auto-detect this. **Operational mitigation: +re-apply the `rbac` stack after every k8s control-plane upgrade** (add to the +upgrade runbook). The `rbac` provisioner needs `TF_VAR_ssh_private_key` (an SSH +key authorized on the master) — it is not wired from Vault yet. diff --git a/docs/plans/2026-06-04-k8s-dashboard-sso-plan.md b/docs/plans/2026-06-04-k8s-dashboard-sso-plan.md index 3a45b21f..fafca32e 100644 --- a/docs/plans/2026-06-04-k8s-dashboard-sso-plan.md +++ b/docs/plans/2026-06-04-k8s-dashboard-sso-plan.md @@ -1,5 +1,13 @@ # K8s Dashboard SSO via Authentik (oauth2-proxy) — Implementation Plan +> **⚠️ AS-BUILT DIVERGED (2026-06-04).** Tasks 2–3 (oauth2-proxy + `k8s-dashboard` +> Authentik app) shipped as written, but the audience strategy here is WRONG: the +> apiserver matches the token **issuer** exactly, and a separate app has a +> different per-slug issuer — so the dual-`aud` trick can't avoid an apiserver +> change. The implementation pivoted to **Option B**: a structured multi-issuer +> `AuthenticationConfiguration` on the apiserver (`stacks/rbac`). See the +> **ADDENDUM (§12)** in `2026-06-04-k8s-dashboard-sso-design.md` for the as-built. + > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Let namespace-owner users (e.g. gheorghe / `vabbit81`) open `https://k8s.viktorbarzin.me`, log in once with Authentik, and manage their own namespace in the Kubernetes Dashboard under their existing per-user RBAC. 
diff --git a/docs/runbooks/k8s-version-upgrade.md b/docs/runbooks/k8s-version-upgrade.md index d3d0a8d1..847d2462 100644 --- a/docs/runbooks/k8s-version-upgrade.md +++ b/docs/runbooks/k8s-version-upgrade.md @@ -127,6 +127,28 @@ Exposed in K8s via ExternalSecret `k8s-upgrade-creds` in the `k8s-upgrade` names ## Common Operations +### Post-upgrade: restore apiserver OIDC (REQUIRED after any control-plane bump) + +`kubeadm upgrade apply` **regenerates `/etc/kubernetes/manifests/kube-apiserver.yaml` +and drops the `--authentication-config` flag**, silently disabling apiserver +OIDC (kubectl/kubelogin CLI **and** the web dashboard SSO break — tokens get +401). This is not auto-detected (the `rbac` stack's `null_resource` trigger is a +content hash that doesn't change). After any control-plane upgrade, re-apply: + +```bash +cd stacks/rbac +TF_VAR_ssh_private_key="$(cat ~/.ssh/id_ed25519)" \ + VAULT_ADDR=https://vault.viktorbarzin.me ../../scripts/tg apply \ + --non-interactive -target=module.rbac.null_resource.apiserver_oidc_config +``` + +(`ssh_private_key` must be a key authorized for `wizard@`; it is not yet +wired from Vault.) The provisioner re-writes `/etc/kubernetes/pki/auth-config.yaml` +(both `kubernetes` + `k8s-dashboard` issuers), re-adds the flag, and +health-gates `/livez` with auto-rollback. Verify: `curl -sk +https://localhost:6443/livez` on the master = `ok`, and the apiserver manifest +contains `--authentication-config`. See `docs/plans/2026-06-04-k8s-dashboard-sso-design.md`. + ### Verify the pipeline is healthy ```bash # CronJob present + not suspended