infra/.claude/skills/add-user/SKILL.md

name: add-user
description: Add a new namespace-owner to the Kubernetes cluster. Use when: (1) "add user", "onboard user", "create user", "new namespace-owner", (2) someone new needs their own namespace and CI access, (3) user asks to set up cluster access for a person. Interactive: asks questions, updates Vault KV, applies stacks.

Add User

Add a new namespace-owner to the cluster. Two modes: automated (preferred) and manual (fallback).

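SOPS state encryption access is provisioned automatically by the vault stack: per-stack Transit keys, policies, identity groups, and group aliases are all created from the k8s_users map. No manual SOPS setup is required in Vault. The matching sops-<username> group in Authentik is created by the provisioning pipeline in the automated flow and must be confirmed by hand in the manual flow (see Step 1).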

Automated Flow (Preferred)

Admin creates an Authentik invite → user signs up → provisioning happens automatically.

Steps

  1. Create Authentik Invitation

    • Go to Authentik Admin
    • Create a new invitation
    • Pre-assign the user to the kubernetes-namespace-owners group
    • Copy the invite link
  2. Send Invite Link to User

    • The user clicks the link and signs up
  3. Automatic Provisioning (Vault KV + Authentik)

    • Authentik fires a webhook to webhook.viktorbarzin.me/authentik/provision
    • The webhook handler validates the event and triggers the Woodpecker provision-user pipeline
    • Pipeline automatically:
      • Adds the user to Vault KV (secret/platform, key k8s_users) with convention defaults
      • Creates sops-<username> group in Authentik and assigns the user
      • Sends Slack notification with manual apply instructions
  4. Convention Defaults (applied automatically)

    • Namespace: username
    • Quota: CPU 2, Memory 4Gi requests / 8Gi limits, 20 pods
    • Domains: none (user can request later)
  5. Manual Apply (admin receives Slack notification)

    • The vault stack requires TLS certs (git-crypt) and can't run in CI. Apply manually:
    cd /Users/viktorbarzin/code/infra
    cd stacks/vault && ../../scripts/tg apply --non-interactive && cd ../..
    cd stacks/rbac && ../../scripts/tg apply --non-interactive && cd ../..
    cd stacks/woodpecker && ../../scripts/tg apply --non-interactive && cd ../..
    
  6. Post-Provisioning

    • Send user the onboarding link: https://k8s-portal.viktorbarzin.me/onboarding?role=namespace-owner
    • If custom quota/domains needed, update Vault KV manually and re-apply stacks
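
The convention defaults in step 4 can be read as one k8s_users entry. The sketch below prints that entry for a given username and email. It assumes the pipeline writes the same fields as the jq command in the manual flow; `default_user_entry` is a hypothetical helper, not something that exists in the repo.

```shell
#!/usr/bin/env bash
# Sketch only: default_user_entry is a hypothetical helper.
# Field names are taken from the manual flow's jq merge; whether the pipeline
# writes exactly this shape is an assumption.
default_user_entry() {
  local user="$1" email="$2"
  # Namespace defaults to the username; domains default to an empty list.
  # No JSON escaping is done, so this only suits plain usernames and emails.
  printf '{"%s":{"role":"namespace-owner","email":"%s","namespaces":["%s"],"domains":[],"quota":{"cpu_requests":"2","memory_requests":"4Gi","memory_limits":"8Gi","pods":"20"}}}\n' \
    "$user" "$email" "$user"
}
```

Calling `default_user_entry USERNAME EMAIL` gives a quick reference for what to expect under the new key in k8s_users before deciding whether a custom quota or domain is needed.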

Monitoring the Pipeline

Watch the pipeline at: https://ci.viktorbarzin.me → infra repo → provision-user pipeline

Manual Flow (Fallback)

Use when the automated flow isn't available or custom configuration is needed.

Step 1: Collect Information

Ask the user for ALL of the following before proceeding:

| Field | Question | Default |
|---|---|---|
| username | Username (must match Forgejo username for CI) | |
| email | Email address (used for OIDC identity) | |
| namespaces | Namespace name(s) to create | [username] |
| domains | Subdomain(s) under viktorbarzin.me for their apps | [] |
| cpu_requests | CPU request quota | "2" |
| memory_requests | Memory request quota | "4Gi" |
| memory_limits | Memory limit quota | "8Gi" |
| pods | Max pods | "20" |

Also confirm:

  • Has the user been added to the kubernetes-namespace-owners group in Authentik? (Manual step — admin must do this in the UI)
  • Has the user been added to the sops-USERNAME group in Authentik? (Required for terraform state decrypt — the vault stack creates the Vault external group, but the Authentik group must exist and the user must be in it)
  • Does the user need VPN access? If yes, also add to Headscale Users group in Authentik.

Do NOT proceed until the Authentik group assignments are confirmed.
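
Because the namespace defaults to the username, it can help to check up front that the name is a legal Kubernetes namespace name (an RFC 1123 DNS label). The following is a minimal sketch of such a check; `is_valid_namespace` is a hypothetical helper, not part of the repo.

```shell
#!/usr/bin/env bash
# Sketch only: is_valid_namespace is a hypothetical helper.
# Kubernetes namespace names must be RFC 1123 DNS labels: at most 63
# characters, lowercase alphanumerics or '-', starting and ending with an
# alphanumeric character.
is_valid_namespace() {
  local name="$1"
  # Force the C locale so [a-z] cannot match uppercase letters.
  local LC_ALL=C
  [ "${#name}" -le 63 ] && [[ "$name" =~ ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ ]]
}
```

If a Forgejo username fails this check (for example because it contains uppercase letters or underscores), the namespaces field probably needs an explicit value instead of the [username] default.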

Step 2: Update Vault KV

Read the current k8s_users JSON from Vault, add the new entry, and write it back.

# Ensure authenticated
vault login -method=oidc

# Read current value
vault kv get -format=json secret/platform | jq -r '.data.data.k8s_users' > /tmp/k8s_users.json

# Add the new user entry (use jq to merge)
jq --arg user "USERNAME" \
   --arg email "EMAIL" \
   --argjson ns '["NAMESPACE"]' \
   --argjson domains '["DOMAIN1"]' \
   --argjson quota '{"cpu_requests":"2","memory_requests":"4Gi","memory_limits":"8Gi","pods":"20"}' \
   '. + {($user): {"role":"namespace-owner","email":$email,"namespaces":$ns,"domains":$domains,"quota":$quota}}' \
   /tmp/k8s_users.json > /tmp/k8s_users_updated.json

# Write back — must write the entire platform secret, not just k8s_users
# First get all current keys
vault kv get -format=json secret/platform | jq -r '.data.data' > /tmp/platform_secret.json

# Update k8s_users key with new JSON (as a string, since complex types are stored as JSON strings)
jq --arg users "$(cat /tmp/k8s_users_updated.json)" '.k8s_users = $users' /tmp/platform_secret.json > /tmp/platform_updated.json

# Write back
vault kv put secret/platform @/tmp/platform_updated.json

# Clean up
rm -f /tmp/k8s_users.json /tmp/k8s_users_updated.json /tmp/platform_secret.json /tmp/platform_updated.json

Verify the write:

vault kv get -field=k8s_users secret/platform | jq '.["USERNAME"]'
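
One thing to watch in this step: jq's `+` on objects lets the right-hand side win, so merging a username that already exists silently replaces the old entry. A guard along these lines could run against /tmp/k8s_users.json before the merge. It is a sketch that assumes jq is installed; `user_exists` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: user_exists is a hypothetical helper; requires jq.
# Returns 0 if the username is already a top-level key in the k8s_users
# JSON file, non-zero otherwise.
user_exists() {
  local user="$1" file="$2"
  # jq -e exits non-zero when the last output is false or null.
  jq -e --arg user "$user" 'has($user)' "$file" >/dev/null
}
```

For example, `user_exists USERNAME /tmp/k8s_users.json && echo "USERNAME already present" >&2` flags the collision before anything is written back to Vault.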

Step 3: Apply Stacks

Apply in order. Use the scripts/tg wrapper.

cd /Users/viktorbarzin/code/infra

# 1. Vault stack — creates namespace, Vault policy, identity entity, deployer role,
#    SOPS Transit key, SOPS policy, SOPS identity group + alias
cd stacks/vault && ../../scripts/tg apply --non-interactive
cd ../..

# 2. RBAC stack — creates RBAC bindings, ResourceQuota, TLS secret
cd stacks/rbac && ../../scripts/tg apply --non-interactive
cd ../..

# 3. Woodpecker stack — adds user to Woodpecker admin list
cd stacks/woodpecker && ../../scripts/tg apply --non-interactive
cd ../..
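
The ordered applies above can also be written as a loop that stops at the first failure. This is a sketch, not a script from the repo: `apply_stacks` and the `TG` override variable are hypothetical names, and it assumes it is called from the repo root.

```shell
#!/usr/bin/env bash
# Sketch only: apply_stacks and the TG override are hypothetical names.
# Run from the repo root. Applies each stack in the order given and stops
# at the first failure.
apply_stacks() {
  local tg="${TG:-$PWD/scripts/tg}" stack
  for stack in "$@"; do
    echo "==> applying stacks/${stack}"
    # The subshell keeps the caller's working directory unchanged.
    if ! ( cd "stacks/${stack}" && "$tg" apply --non-interactive ); then
      echo "apply failed for ${stack}; remaining stacks skipped" >&2
      return 1
    fi
  done
}
```

With that defined, `apply_stacks vault rbac woodpecker` runs the three applies in the documented order.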

Step 4: Verify

# Namespace exists
kubectl get namespace USERNAME_NAMESPACE

# ResourceQuota applied
kubectl describe resourcequota -n USERNAME_NAMESPACE

# Vault policy exists (namespace-owner + SOPS)
vault policy read namespace-owner-USERNAME
vault policy read sops-user-USERNAME

# Vault identity entity exists (with both policies)
vault read identity/entity/name/USERNAME

# SOPS group exists
vault read identity/group/name/sops-USERNAME

# K8s deployer role works
vault write kubernetes/creds/NAMESPACE-deployer kubernetes_namespace=NAMESPACE

# SOPS Transit key exists
vault read transit/keys/sops-state-NAMESPACE
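
Run one by one, the checks above stop being useful as soon as one fails in a `set -e` script. A small wrapper can run them all and report each result. This is a sketch; `check` and `FAILED` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: check and FAILED are hypothetical names.
# Runs one verification command, prints PASS or FAIL with a label, and keeps
# going on failure so a single run shows every missing resource.
FAILED=0
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  ${label}"
  else
    echo "FAIL  ${label}"
    FAILED=$((FAILED + 1))
  fi
}
```

Usage would look like `check "namespace" kubectl get namespace NAMESPACE` followed by `check "transit key" vault read transit/keys/sops-state-NAMESPACE`, then a final test on `$FAILED`. Note that the deployer-role check issues real credentials each time it runs, so it may be better left as a manual step.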

Step 5: Notify User

Tell the requesting admin to share these onboarding instructions with the new user:

  • K8s Portal: https://k8s-portal.viktorbarzin.me/onboarding?role=namespace-owner
  • README: https://github.com/ViktorBarzin/infra#new-user-onboarding

Web dashboard access (auto-login, no token paste): the rbac stack auto-creates a dashboard-<user> SA + token for every namespace-owner (dashboard-sa.tf), and the k8s-dashboard stack's token-injector maps the user's Authentik identity → that token (dashboard_injector.tf, auto-derived from k8s_users). The new user just logs into https://k8s.viktorbarzin.me and lands in the dashboard scoped to their namespace (admin on their namespace + read-only on the namespace list & nodes for nav — no cross-tenant resource reads).

Apply order for a new namespace-owner: after the vault/rbac/woodpecker applies above, ALSO cd stacks/k8s-dashboard && ../../scripts/tg apply so the injector map picks up the new user. (Manual token fallback: kubectl -n NAMESPACE get secret dashboard-USERNAME-token -o jsonpath='{.data.token}' | base64 -d.) Seamless OIDC SSO is built but blocked — see docs/plans/2026-06-04-k8s-dashboard-sso-design.md §12.

Auto-login works only for the user's k8s_users HOME namespace. The dashboard injects the user's dashboard-<user> SA token, which the rbac stack binds to admin on their home namespace only. If their workload lives in a DIFFERENT / pre-existing namespace (e.g. gheorghe's app is in novelapp, not his home vabbit81), that namespace's stack must ALSO grant their dashboard SA (kind: ServiceAccount, name: dashboard-<user>, namespace: <home-ns>), not just their OIDC User email, because the dashboard uses the SA and apiserver OIDC is blocked. See stacks/novelapp/main.tf novelapp_owner_vabbit81 for the pattern (two subjects: User + SA). Best practice: set the user's k8s_users namespace to where their workload actually runs, so the home-ns auto-path covers them with no extra binding.
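
To make the two-subject pattern concrete, the sketch below prints a RoleBinding manifest with both a User and a ServiceAccount subject. The repo manages this binding in Terraform, so this only illustrates the shape. `extra_ns_binding`, the binding name, and the use of the built-in `admin` ClusterRole are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: extra_ns_binding is a hypothetical helper. The binding name
# and the "admin" ClusterRole are assumptions; the real binding is defined
# in Terraform (see stacks/novelapp/main.tf).
extra_ns_binding() {
  local user="$1" email="$2" home_ns="$3" target_ns="$4"
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ${target_ns}-owner-${user}
  namespace: ${target_ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: ${email}
  - kind: ServiceAccount
    name: dashboard-${user}
    namespace: ${home_ns}
EOF
}
```

Piping the output through `kubectl apply --dry-run=client -f -` checks the manifest client-side without changing the cluster. The key detail is that the ServiceAccount subject names the home namespace, while the binding itself lives in the namespace where the workload runs.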

The user can decrypt their stack's state with:

vault login -method=oidc   # authenticates via Authentik SSO
scripts/state-sync decrypt NAMESPACE   # decrypts only their stack

What Gets Auto-Generated

| Resource | Stack | Driven by |
|---|---|---|
| Kubernetes namespace | vault | namespaces list |
| Vault policy (namespace-owner-{user}) | vault | user key |
| Vault identity entity + OIDC alias | vault | user email |
| K8s deployer Role + Vault K8s role | vault | namespaces list |
| SOPS Transit key (sops-state-{ns}) | vault | namespaces list |
| SOPS Vault policy (sops-user-{user}) | vault | user key + namespaces |
| SOPS identity group (sops-{user}) | vault | user key |
| SOPS group alias (maps Authentik group) | vault | user key |
| RBAC RoleBinding (namespace admin) | rbac | namespaces list |
| RBAC ClusterRoleBinding (cluster read-only) | rbac | user role |
| ResourceQuota | rbac | quota object |
| TLS secret in namespace | rbac | namespaces list |
| Cloudflare DNS records | cloudflared | domains list |
| Woodpecker admin access | woodpecker | user key |
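
The naming conventions that appear in this document can be spelled out for one user and one namespace as follows. This is a sketch for quick lookup: `derived_names` and the key names on the left of each `=` are hypothetical, while the name patterns themselves are the ones used in the commands and table above.

```shell
#!/usr/bin/env bash
# Sketch only: derived_names is a hypothetical helper. The output keys are
# made up for illustration; the name patterns come from this document.
derived_names() {
  local user="$1" ns="$2"
  printf 'vault_policy=namespace-owner-%s\n' "$user"
  printf 'sops_policy=sops-user-%s\n' "$user"
  printf 'sops_group=sops-%s\n' "$user"
  printf 'transit_key=sops-state-%s\n' "$ns"
  printf 'deployer_role=%s-deployer\n' "$ns"
  printf 'dashboard_sa=dashboard-%s\n' "$user"
}
```

Note which names follow the user key and which follow the namespace: a user with a namespace that differs from their username gets a Transit key and deployer role named after the namespace, but policies and groups named after the user.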

Checklist (Manual Flow)

  • Authentik: user added to kubernetes-namespace-owners group
  • Authentik: user added to sops-USERNAME group (for SOPS state decrypt)
  • Authentik: user added to Headscale Users group (if VPN needed)
  • Vault KV: k8s_users entry added to secret/platform
  • Vault stack applied — namespace + policy + identity + deployer role + SOPS Transit key + SOPS policy + SOPS group created
  • RBAC stack applied — RBAC + quota + TLS created
  • Woodpecker stack applied — admin list updated
  • Verification: namespace, quota, policies (namespace-owner + sops-user), deployer role, Transit key all confirmed
  • User notified with onboarding link