From 41d3358cc1cb1cf77082b65129b585a9145008bd Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Tue, 17 Feb 2026 22:56:03 +0000
Subject: [PATCH] [ci skip] Add skills: authentik-oidc-kubernetes, kubelet-static-pod-manifest-update

Two skills extracted from multi-user k8s access implementation:
- authentik-oidc-kubernetes: 6 gotchas for Authentik OIDC + kube-apiserver
- kubelet-static-pod-manifest-update: full restart cycle for static pod changes
---
 .../skills/authentik-oidc-kubernetes/SKILL.md | 170 ++++++++++++++++++
 .../SKILL.md                                  | 109 +++++++++++
 2 files changed, 279 insertions(+)
 create mode 100644 .claude/skills/authentik-oidc-kubernetes/SKILL.md
 create mode 100644 .claude/skills/kubelet-static-pod-manifest-update/SKILL.md

diff --git a/.claude/skills/authentik-oidc-kubernetes/SKILL.md b/.claude/skills/authentik-oidc-kubernetes/SKILL.md
new file mode 100644
index 00000000..cee033f7
--- /dev/null
+++ b/.claude/skills/authentik-oidc-kubernetes/SKILL.md
@@ -0,0 +1,170 @@
+---
+name: authentik-oidc-kubernetes
+description: |
+  Configure Authentik as OIDC provider for Kubernetes API server authentication.
+  Use when: (1) setting up OIDC auth for kubectl with Authentik, (2) kube-apiserver
+  rejects OIDC tokens with "oidc: email not verified", (3) JWKS endpoint returns
+  empty {} despite provider being configured, (4) kubelogin fails with "claim not
+  present" for email, (5) redirect_uri mismatch errors during kubelogin browser auth,
+  (6) kube-apiserver static pod manifest changes don't take effect after restart.
+  Covers all gotchas discovered when integrating Authentik 2025.10.x with Kubernetes
+  1.34.x using kubelogin (int128/kubelogin).
+author: Claude Code
+version: 1.0.0
+date: 2026-02-17
+---
+
+# Authentik OIDC for Kubernetes API Authentication
+
+## Problem
+Setting up Authentik as an OIDC identity provider for Kubernetes kubectl access
+involves multiple non-obvious pitfalls that cause silent failures at different
+stages of the authentication flow.
+
+## Context / Trigger Conditions
+- Setting up multi-user kubectl access with OIDC
+- Using Authentik as the identity provider and kubelogin (int128/kubelogin) as the kubectl plugin
+- Any of these errors:
+  - `oidc: email not verified`
+  - `oidc: parse username claims "email": claim not present`
+  - `The request fails due to a missing, invalid, or mismatching redirection URI`
+  - JWKS endpoint (`/application/o//jwks/`) returns `{}`
+  - `Unauthorized` after successful browser login
+
+## Solution
+
+### Gotcha 1: Signing Key Must Be Assigned
+
+Authentik's OAuth2 provider does NOT assign a signing key by default. Without it,
+the JWKS endpoint returns `{}` and kube-apiserver can't validate tokens.
+
+**Fix:** Assign a signing key (e.g., "authentik Self-signed Certificate") to the
+OAuth2 provider:
+```python
+# Via Django shell (kubectl exec into authentik server pod)
+from authentik.providers.oauth2.models import OAuth2Provider
+from authentik.crypto.models import CertificateKeyPair
+
+provider = OAuth2Provider.objects.get(name='kubernetes')
+cert = CertificateKeyPair.objects.filter(name='authentik Self-signed Certificate').first()
+provider.signing_key = cert
+provider.save()
+```
+
+Or via API:
+```bash
+curl -X PATCH -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
+  "$AUTHENTIK_URL/api/v3/providers/oauth2//" \
+  -d '{"signing_key": ""}'
+```
+
+### Gotcha 2: Default Email Mapping Sets `email_verified: False`
+
+Authentik's built-in email scope mapping hardcodes `email_verified: False`:
+```python
+return {
+    "email": request.user.email,
+    "email_verified": False  # <-- This causes kube-apiserver to reject the token
+}
+```
+
+With `--oidc-username-claim=email`, kube-apiserver rejects any token whose `email_verified` claim is present but not `true`.
+
+**Fix:** Create a custom scope mapping with `email_verified: True` and assign it
+to the provider instead of the default:
+```python
+from authentik.providers.oauth2.models import OAuth2Provider, ScopeMapping
+
+# Create custom mapping
+mapping, _ = ScopeMapping.objects.get_or_create(
+    name='Kubernetes Email (verified)',
+    defaults={
+        'scope_name': 'email',
+        'expression': 'return {"email": request.user.email, "email_verified": True}'
+    }
+)
+
+# Replace default email mapping on the provider
+provider = OAuth2Provider.objects.get(name='kubernetes')
+default_email = ScopeMapping.objects.filter(
+    managed='goauthentik.io/providers/oauth2/scope-email'
+).first()
+if default_email:
+    provider.property_mappings.remove(default_email)
+provider.property_mappings.add(mapping)
+```
+
+### Gotcha 3: kubelogin Needs Extra Scopes
+
+By default, kubelogin only requests the `openid` scope. The token will lack
+`email` and `groups` claims, causing:
+```
+oidc: parse username claims "email": claim not present
+```
+
+**Fix:** Add `--oidc-extra-scope` flags to the kubeconfig exec plugin:
+```yaml
+users:
+- name: oidc-user
+  user:
+    exec:
+      command: kubectl
+      args:
+      - oidc-login
+      - get-token
+      - --oidc-issuer-url=https://authentik.example.com/application/o/kubernetes/
+      - --oidc-client-id=kubernetes
+      - --oidc-extra-scope=email  # Required!
+      - --oidc-extra-scope=profile
+      - --oidc-extra-scope=groups
+```
+
+### Gotcha 4: Redirect URIs Must Use Regex Mode
+
+kubelogin picks a random available port (tries 8000, 18000, then random).
+Strict redirect URI matching like `http://localhost:8000/callback` will fail
+when kubelogin uses a different port.
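
The mismatch can be reproduced outside Authentik. The sketch below compares exact string comparison with a pattern match over the whole URI. The function names are hypothetical, and the use of full-match semantics is an assumption made for illustration, not a statement about how Authentik evaluates patterns internally.

```python
import re


def strict_match(allowed: str, redirect_uri: str) -> bool:
    """Exact comparison: any difference in port or path is a mismatch."""
    return allowed == redirect_uri


def regex_match(pattern: str, redirect_uri: str) -> bool:
    """Pattern comparison; assumes the entire URI must match the pattern."""
    return re.fullmatch(pattern, redirect_uri) is not None


if __name__ == "__main__":
    allowed = "http://localhost:8000/callback"
    # Same port as configured: accepted.
    print(strict_match(allowed, "http://localhost:8000/callback"))
    # kubelogin fell back to another port: rejected by strict matching.
    print(strict_match(allowed, "http://localhost:18000/callback"))
    # A port-agnostic pattern accepts both.
    print(regex_match(r"http://localhost:.*", "http://localhost:18000/callback"))
```

A pattern ending in `.*` accepts any port and path on that host, so it should stay limited to loopback addresses.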
+
+**Fix:** Use regex matching in the Authentik provider:
+```json
+{
+  "redirect_uris": [
+    {"matching_mode": "regex", "url": "http://localhost:.*"},
+    {"matching_mode": "regex", "url": "http://127\\.0\\.0\\.1:.*"}
+  ]
+}
+```
+
+### Gotcha 5: Property Mappings API Endpoint Changed
+
+In Authentik 2025.10.x, scope mappings are at:
+- `propertymappings/provider/scope/` (new, correct)
+- NOT `propertymappings/scope/` (old, returns 405 Method Not Allowed on POST)
+
+### Gotcha 6: Static Pod Manifest Changes Need Full Cycle
+
+See skill: `kubelet-static-pod-manifest-update` for the full restart procedure.
+
+## Verification
+
+After all fixes:
+```bash
+# 1. JWKS has a key
+curl -s https://authentik.example.com/application/o/kubernetes/jwks/ | jq '.keys | length'
+# Expected: 1 (or more)
+
+# 2. Test auth
+KUBECONFIG=/path/to/oidc-kubeconfig kubectl get namespaces
+# Expected: browser opens, login, namespaces returned
+
+# 3. Check API server logs for success (kubectl does not expand wildcards in pod names; the label assumes a kubeadm cluster)
+ssh master "sudo kubectl logs -n kube-system -l component=kube-apiserver --tail=200 | grep oidc | tail -5"
+# Expected: no "Unable to authenticate" errors
+```
+
+## Notes
+- The OAuth2 provider should use `client_type: public` (no client secret needed for kubelogin)
+- Set `sub_mode: user_email` so the OIDC subject matches the RBAC binding
+- Set `include_claims_in_id_token: true` for the token to contain claims directly
+- Use `issuer_mode: per_provider` for a clean issuer URL
+- RBAC ClusterRoleBindings should match on the user's email (the `--oidc-username-claim=email` value)
diff --git a/.claude/skills/kubelet-static-pod-manifest-update/SKILL.md b/.claude/skills/kubelet-static-pod-manifest-update/SKILL.md
new file mode 100644
index 00000000..ae9699a3
--- /dev/null
+++ b/.claude/skills/kubelet-static-pod-manifest-update/SKILL.md
@@ -0,0 +1,109 @@
+---
+name: kubelet-static-pod-manifest-update
+description: |
+  Force kubelet to pick up changes to static pod manifests in /etc/kubernetes/manifests/.
+  Use when: (1) edited kube-apiserver.yaml but the running process still has old flags,
+  (2) kubelet restart doesn't pick up manifest changes, (3) touching the manifest file
+  doesn't trigger pod recreation, (4) killing the API server process results in the
+  same old args on restart, (5) the pod's config.hash annotation doesn't match the
+  file's hash. Requires a full cycle: remove manifest, stop kubelet, remove containers,
+  re-add manifest, start kubelet.
+author: Claude Code
+version: 1.0.0
+date: 2026-02-17
+---
+
+# Kubelet Static Pod Manifest Update
+
+## Problem
+After editing a static pod manifest (e.g., `/etc/kubernetes/manifests/kube-apiserver.yaml`
+to add OIDC or audit flags), kubelet continues running the pod with the old configuration.
+Standard approaches like `touch`, `systemctl restart kubelet`, or `kubectl delete pod`
+do not force kubelet to reconcile the new manifest.
+
+## Context / Trigger Conditions
+- Edited `/etc/kubernetes/manifests/kube-apiserver.yaml` (or other static pod manifests)
+- The running process (`ps aux | grep kube-apiserver`) shows old flags
+- `kubectl get pod -n kube-system kube-apiserver-$(hostname) -o jsonpath='{.metadata.annotations.kubernetes\.io/config\.hash}'` returns a stale hash
+- Any of these actions failed to apply the changes:
+  - `touch /etc/kubernetes/manifests/kube-apiserver.yaml`
+  - `systemctl restart kubelet`
+  - `kubectl delete pod kube-apiserver-*`
+  - Killing the API server process directly
+
+## Root Cause
+Kubelet maintains an internal cache of static pod specs keyed by a hash of the manifest.
+When the manifest changes, kubelet should detect the new hash and recreate the pod.
+However, in practice (observed on Kubernetes 1.34.x), kubelet can get stuck with the
+old hash if:
+- The pod's mirror object in the API server still exists with the old hash
+- Kubelet's internal pod cache wasn't cleared between restarts
+- The container runtime (containerd) still has the old container running
+
+## Solution
+
+Full restart cycle on the master node:
+
+```bash
+# 1. Back up the manifest (already edited with your changes)
+sudo cp /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/kube-apiserver.yaml.bak
+
+# 2. Remove the manifest (kubelet will stop the pod)
+sudo rm /etc/kubernetes/manifests/kube-apiserver.yaml
+
+# 3. Stop kubelet
+sudo systemctl stop kubelet
+
+# 4. Wait for the API server container to stop
+sleep 5
+
+# 5. Force-remove any remaining API server containers
+sudo crictl rm -f $(sudo crictl ps -aq --name kube-apiserver 2>/dev/null) 2>/dev/null
+
+# 6. Re-add the manifest (with your changes)
+sudo cp /tmp/kube-apiserver.yaml.bak /etc/kubernetes/manifests/kube-apiserver.yaml
+
+# 7. Start kubelet
+sudo systemctl start kubelet
+
+# 8. Wait for API server to come up (30-60 seconds)
+sleep 45
+
+# 9. Verify new flags are active
+sudo cat /proc/$(pgrep -f 'kube-apiserver --' | head -1)/cmdline | tr '\0' '\n' | grep 'your-new-flag'
+```
+
+**Critical:** The order matters. Removing the manifest BEFORE stopping kubelet ensures
+kubelet processes the removal. Then clearing containers ensures no stale state. Finally,
+re-adding the manifest and then starting kubelet triggers a fresh pod creation.
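
Steps 8 and 9 rely on a fixed `sleep 45`, which may be too short or needlessly long. As an alternative, the check can be polled. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the caller supplies the PID (for example from the `pgrep` call in step 9), and the script must run with enough privileges to read `/proc/<pid>/cmdline`.

```python
import time
from pathlib import Path
from typing import Callable, List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    return [arg.decode() for arg in raw.split(b"\0") if arg]


def read_cmdline(pid: int) -> List[str]:
    """Read the argument list of a running process from procfs (Linux only)."""
    return parse_cmdline(Path(f"/proc/{pid}/cmdline").read_bytes())


def wait_for_flag(
    get_args: Callable[[], List[str]],
    flag_prefix: str,
    timeout: float = 90.0,
    interval: float = 3.0,
) -> bool:
    """Poll get_args() until some argument starts with flag_prefix or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if any(arg.startswith(flag_prefix) for arg in get_args()):
                return True
        except OSError:
            # The process is not running (yet); keep polling until the deadline.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_flag(lambda: read_cmdline(pid), "--oidc-issuer-url")` returns `True` as soon as the restarted process shows the flag, and `False` if it never appears within the timeout.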
+
+## What Does NOT Work
+
+| Approach | Why it fails |
+|----------|-------------|
+| `touch manifest.yaml` | Kubelet may not detect mtime-only changes |
+| `systemctl restart kubelet` | Kubelet reuses cached pod spec if hash matches |
+| `kubectl delete pod` | Deletes mirror pod but kubelet recreates from cached spec |
+| `kill ` | Container runtime restarts the same container with old args |
+| Moving manifest away and back without stopping kubelet | Kubelet may cache the old spec in memory |
+
+## Verification
+
+```bash
+# Check the running process has new flags
+ps aux | grep kube-apiserver | grep -v grep | grep 'your-new-flag'
+
+# Check the config hash changed
+kubectl get pod -n kube-system kube-apiserver-$(hostname) \
+  -o jsonpath='{.metadata.annotations.kubernetes\.io/config\.hash}'
+
+# Check API server logs for successful startup
+kubectl logs -n kube-system kube-apiserver-$(hostname) | tail -5
+```
+
+## Notes
+- This applies to ALL static pods, not just kube-apiserver (etcd, controller-manager, scheduler)
+- The cluster will be briefly unavailable during the restart (30-60 seconds)
+- On single-master clusters, kubectl commands will fail during the restart; use `sudo kubectl --kubeconfig=/etc/kubernetes/admin.conf` from the master
+- Always validate the YAML before removing the manifest: `python3 -c "import yaml; yaml.safe_load(open('/etc/kubernetes/manifests/kube-apiserver.yaml'))"`
+- See also: `authentik-oidc-kubernetes` skill for the full OIDC setup context