workstation: docs — multi-tenancy Workstation section + offboard runbook + service-catalog fix [ci skip]
multi-tenancy.md: new DevVM Workstation section (roster SSoT, tiers, config inheritance, locked clone, built-vs-gated status).
service-catalog.md: t3code row corrected; the stale 'source of truth = /etc/ttyd-user-map' is now roster.yaml (the map/dispatch are GENERATED).
offboard-user.md: written (was a referenced-but-missing dead link); staged reversible-cut-then-gated-destructive for both cluster + workstation surfaces.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 08bf1e0a3a
commit c611ecf84d
3 changed files with 87 additions and 1 deletions
|
|
@ -533,6 +533,20 @@ cd stacks/actualbudget/factory
|
|||
terragrunt apply
|
||||
```
|
||||
|
||||
## DevVM Workstation (Claude Code multi-user)
|
||||
|
||||
Separate from the in-cluster namespace-owner model above, the **devvm** (`10.0.10.10`, VMID 102) hosts per-user **Claude Code Workstations** behind `t3.viktorbarzin.me`. It reuses the same identity backbone — the Vault `k8s_users` map and Authentik — but adds a devvm-side layer. Authoritative design + phased plan: `docs/plans/2026-06-07-multi-user-workstation-{design,plan}.md` (PRD: ViktorBarzin/infra#9).
|
||||
|
||||
**Single source of truth:** `infra/scripts/workstation/roster.yaml` (`os_user → authentik_user / k8s_user / tier / namespaces`). `roster_engine.py` (pytest-covered pure core) derives desired state; `t3-provision-users` (hourly timer) applies it — **additive-only** for existing users (never strips a group, replaces a home, or re-locks an account). `/etc/ttyd-user-map` + `dispatch.json` are **generated** from the roster (do not hand-edit).
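The additive-only rule can be sketched as a pure planning function. This is a minimal illustration, not the code in `roster_engine.py`: the `RosterEntry` shape, the `plan_additive` name, and the group names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RosterEntry:
    # Field names mirror the roster mapping described above; the exact
    # YAML schema is an assumption.
    os_user: str
    authentik_user: str
    k8s_user: str
    tier: str
    namespaces: tuple = ()


def plan_additive(entry, existing_groups, desired_groups):
    """Return only the actions that ADD state for one user.

    existing_groups is None when the OS account does not exist yet.
    Groups the user already has but the roster no longer implies are
    left alone: the plan never contains a removal.
    """
    actions = []
    if existing_groups is None:
        actions.append(("create_user", entry.os_user))
        existing_groups = set()
    for group in sorted(set(desired_groups) - set(existing_groups)):
        actions.append(("add_group", entry.os_user, group))
    return actions


emo = RosterEntry("emo", "emo", "emo", "power-user")
# Hypothetical group names, for illustration only.
plan = plan_additive(
    emo,
    existing_groups={"emo", "legacy"},
    desired_groups={"emo", "t3-users"},
)
print(plan)
```

Because the plan is computed as a set difference in one direction only, the extra `legacy` group stays on the account even though the roster no longer implies it.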
|
||||
|
||||
**RBAC tiers:** `admin` (Viktor — cluster-admin, unlocked tree, secrets) · `power-user` (cluster-wide read-only, NO Secrets, via a dedicated `oidc-power-user-readonly` ClusterRole) · `namespace-owner` (admin in own namespace only). Each session acts as the user's **own** OIDC identity (kubelogin), never the admin's.
|
||||
|
||||
**Config inheritance (live):** wizard authors the base (his chezmoi-versioned `~/.claude`). Two native layers carry it to every user — the enforced org `claudeMd` in `/etc/claude-code/managed-settings.json` (top precedence, all sessions) and per-user `~/.claude/{skills,rules,…}` **symlinks** to the base (seeded via `/etc/skel`; edits propagate live). Secrets stay per-user at mode 600, never symlinked.
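The symlink layer can be illustrated with a small, idempotent seeding routine. This is a sketch under assumptions: the real seeding happens via `/etc/skel`, and the helper name `link_shared_config` and the temporary directory layout are invented for the example.

```python
import os
import tempfile

# Subset named in the text; the full list is elided there.
SHARED = ("skills", "rules")


def link_shared_config(base_dir, user_dir, names=SHARED):
    """Symlink each shared entry from the base into the user's directory.

    Existing paths are left untouched, so a re-run never clobbers user
    files. Returns the names that were newly linked.
    """
    os.makedirs(user_dir, exist_ok=True)
    linked = []
    for name in names:
        target = os.path.join(base_dir, name)
        link = os.path.join(user_dir, name)
        if os.path.lexists(link):
            continue
        os.symlink(target, link)
        linked.append(name)
    return linked


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "base", ".claude")
    user = os.path.join(tmp, "emo", ".claude")
    for name in SHARED:
        os.makedirs(os.path.join(base, name))
    first = link_shared_config(base, user)
    second = link_shared_config(base, user)
    # A file added to the base is visible through the user's symlink.
    with open(os.path.join(base, "rules", "style.md"), "w") as fh:
        fh.write("be concise\n")
    visible = os.path.exists(os.path.join(user, "rules", "style.md"))

print(first, second, visible)
```

The second call links nothing, and the file written to the base after linking is readable through the user's path, which is the "edits propagate live" behaviour described above.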
|
||||
|
||||
**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` — code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. Changes are ungated (push ≠ apply); the real boundary is apply-time (`scripts/tg apply` needs an admin Vault token + cluster RBAC).
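One way to confirm that a clone is still locked is to check secret files for the git-crypt header: git-crypt prefixes every encrypted blob with the bytes `\0GITCRYPT\0`. The sketch below is illustrative only; the helper names and the demo file names are assumptions, not part of the repo.

```python
import os
import tempfile

# git-crypt prefixes every encrypted blob with this 10-byte header.
GITCRYPT_MAGIC = b"\x00GITCRYPT\x00"


def is_gitcrypt_ciphertext(path):
    """Return True if the file at `path` starts with the git-crypt header."""
    with open(path, "rb") as fh:
        return fh.read(len(GITCRYPT_MAGIC)) == GITCRYPT_MAGIC


def unlocked_secret_files(paths):
    """Return the paths that are NOT ciphertext, i.e. readable plaintext."""
    return [p for p in paths if not is_gitcrypt_ciphertext(p)]


# Self-contained demo: two throwaway files stand in for a clone.
with tempfile.TemporaryDirectory() as tmp:
    locked = os.path.join(tmp, "prod.tfvars")
    leaked = os.path.join(tmp, "dev.tfvars")
    with open(locked, "wb") as fh:
        fh.write(GITCRYPT_MAGIC + b"\x01" * 32)
    with open(leaked, "wb") as fh:
        fh.write(b'token = "plaintext"\n')
    exposed = [os.path.basename(p) for p in unlocked_secret_files([locked, leaked])]

print(exposed)
```

In a real clone the input list would come from the paths covered by the repo's git-crypt filter; an empty result means every secret file is still ciphertext.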
|
||||
|
||||
**Status (2026-06-08):** built + verified on the live host — capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone. **Gated / pending:** per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry, the Authentik `T3 Users` edge gate, the emo cutover (Phase 5), and the offboarding apply-side (Phase 7). See `../runbooks/offboard-user.md` for deprovisioning.
|
||||
|
||||
## Related
|
||||
|
||||
- [CI/CD Pipeline](./ci-cd.md) — Per-user Woodpecker pipelines
|
||||
|
|
|
|||
72
docs/runbooks/offboard-user.md
Normal file
|
|
@ -0,0 +1,72 @@
|
|||
# Runbook: Offboard a User
|
||||
|
||||
Removing a user can span two surfaces: the **in-cluster** namespace-owner model
(Vault `k8s_users` / RBAC / namespace) and the **devvm Workstation** (roster /
OS account / t3 instance). Both are **staged**: a *reversible cut* (revoke access,
delete nothing) first, then an explicit, gated *destructive removal*. Do the
reversible cut immediately; do the destructive step only after a deliberate
decision that the user's account and data are no longer needed.
|
||||
|
||||
> Architecture: `../architecture/multi-tenancy.md`. Workstation design:
> `../plans/2026-06-07-multi-user-workstation-design.md`.
|
||||
|
||||
---
|
||||
|
||||
## Part A — DevVM Workstation offboarding
|
||||
|
||||
Driven by removing the user's entry from `infra/scripts/workstation/roster.yaml`.
`roster_engine.py offboard_plan` computes the staged actions (reversible cut vs the
gated `userdel_archive`, which is **never** auto-applied).
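The staged plan can be pictured as a list of actions tagged by reversibility. The sketch below is hypothetical: only `userdel_archive` is named in this runbook, while the `Action` type, the other action names, and `offboard_plan_sketch` are assumptions and do not reflect the actual `offboard_plan` output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool
    auto_apply: bool


def offboard_plan_sketch(os_user):
    """Hypothetical staged plan for one removed roster entry."""
    return [
        # Regenerating the map/dispatch files is the automated cut.
        Action(f"regenerate_maps_without:{os_user}", reversible=True, auto_apply=True),
        # Manual today, per the runbook steps.
        Action(f"disable_instance:{os_user}", reversible=True, auto_apply=False),
        Action(f"lock_login:{os_user}", reversible=True, auto_apply=False),
        # Destructive and gated: never applied automatically.
        Action(f"userdel_archive:{os_user}", reversible=False, auto_apply=False),
    ]


def auto_applicable(plan):
    """Actions a reconcile may run unattended: reversible AND flagged auto."""
    return [a.name for a in plan if a.reversible and a.auto_apply]


plan = offboard_plan_sketch("emo")
print(auto_applicable(plan))
```

Keeping `reversible` and `auto_apply` as separate flags makes the gate explicit: an irreversible action can never pass the filter, whatever its other settings.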
|
||||
|
||||
### A1. Reversible cut (revoke access; delete nothing)
|
||||
|
||||
1. **Delete the user's entry** from `roster.yaml`; commit + push.
2. **Reconcile** (`sudo /usr/local/bin/t3-provision-users`, or wait for the hourly
   timer). This **regenerates** `/etc/ttyd-user-map` + `dispatch.json` *without* the
   user, so `t3-dispatch` now returns **403** for them. *(Automated.)*
3. **Disable their instance + lock login** *(manual today; Phase 7 will fold this into
   the reconcile):*
|
||||
```bash
|
||||
sudo systemctl disable --now t3-serve@<os_user>.service
|
||||
sudo passwd -l <os_user>
|
||||
```
|
||||
4. **Verify:** they can no longer reach `t3.viktorbarzin.me` (302 → Authentik, then
   denied once removed from the `T3 Users` group; see Part C) and cannot log in. Nothing
   is deleted; re-adding the roster entry + reconcile fully restores them.
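The "cannot log in" check can be scripted by parsing the status line printed by `passwd -S <os_user>`. The sketch assumes the second whitespace-separated field is the status flag and that a locked account reports `L` (or `LK` on some builds); the sample lines are illustrative, not captured from the live host.

```python
def is_locked(passwd_status_line):
    """Parse one line of `passwd -S <user>` output.

    Assumes the second whitespace-separated field is the status flag and
    that a locked account reports "L" or "LK", depending on the build.
    """
    fields = passwd_status_line.split()
    if len(fields) < 2:
        raise ValueError(f"unexpected passwd -S output: {passwd_status_line!r}")
    return fields[1] in {"L", "LK"}


# Sample lines are illustrative only.
locked_result = is_locked("emo L 06/08/2026 0 99999 7 -1")
usable_result = is_locked("emo P 06/08/2026 0 99999 7 -1")
print(locked_result, usable_result)
```

Note that `passwd -l` locks only the password. According to the `passwd` man page, other authentication methods such as SSH keys are unaffected, so the verification may also need to cover those.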
|
||||
|
||||
### A2. Destructive removal (explicit, gated — NEVER automatic)
|
||||
|
||||
Only after the reversible cut and a deliberate decision:
|
||||
```bash
|
||||
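# Archive the home directory first; do not continue if this command fails.
sudo tar -czf /mnt/backup/offboard/<os_user>-$(date +%Y%m%d).tar.gz -C /home <os_user>
# Only after the archive is confirmed: removes home + mail spool (IRREVERSIBLE).
sudo userdel -r <os_user>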
|
||||
```
|
||||
Rollback before this step: re-add the roster entry + reconcile. After it: restore
from the archive.
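Since the archive is the only rollback path after `userdel -r`, it is worth confirming that it opens and contains the user's tree before deleting anything. A minimal sketch, assuming a gzip tar; the helper name `archive_is_restorable` and the demo paths are hypothetical.

```python
import os
import tarfile
import tempfile


def archive_is_restorable(archive_path, os_user):
    """Return True if the gzip tar opens cleanly and holds the user's tree.

    The check looks for `os_user` as a path component, so it works whether
    members are stored as "home/<user>/..." or "<user>/...".
    """
    try:
        with tarfile.open(archive_path, "r:gz") as tar:
            names = tar.getnames()
    except (tarfile.TarError, OSError):
        return False
    return any(os_user in name.split("/") for name in names)


# Self-contained demo: one valid archive and one corrupt file.
with tempfile.TemporaryDirectory() as tmp:
    home = os.path.join(tmp, "emo")
    os.mkdir(home)
    with open(os.path.join(home, "notes.txt"), "w") as fh:
        fh.write("keep me\n")
    good = os.path.join(tmp, "emo-20260608.tar.gz")
    with tarfile.open(good, "w:gz") as tar:
        tar.add(home, arcname="emo")
    bad = os.path.join(tmp, "truncated.tar.gz")
    with open(bad, "wb") as fh:
        fh.write(b"not a tarball")
    good_ok = archive_is_restorable(good, "emo")
    bad_ok = archive_is_restorable(bad, "emo")

print(good_ok, bad_ok)
```

Listing members proves only that the archive is readable; a test restore into a scratch directory gives stronger assurance.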
|
||||
|
||||
---
|
||||
|
||||
## Part B — In-cluster (namespace-owner) offboarding
|
||||
|
||||
1. **Reversible cut:** remove the user's Authentik group membership (edge/RBAC blocked)
   and their entry from the Vault `k8s_users` map (`secret/platform`).
2. **Apply:** `scripts/tg apply` the `vault` → `platform` → `woodpecker` stacks (drops the
   RBAC binding, Vault identity/policy, and per-user CI). Their OIDC kubeconfig stops
   authorizing as soon as the apply completes.
3. **Destructive (gated):** deleting their namespace(s) removes all their workloads +
   data, so back up first (PVCs, DBs), then delete only on explicit decision.
|
||||
|
||||
---
|
||||
|
||||
## Part C — Authentik (both surfaces)
|
||||
|
||||
Remove the user from the relevant Authentik group(s): `kubernetes-namespace-owners`
(cluster) and/or `T3 Users` (workstation edge gate). This is the edge revocation; do
it as part of the reversible cut so they're locked out at the front door.
|
||||
|
||||
---
|
||||
|
||||
## Order of operations
|
||||
|
||||
Reversible cut on **all** relevant surfaces first (Authentik group → roster removal +
reconcile → `k8s_users` removal + apply) → verify access is gone → only then the gated
destructive steps (`userdel -r`, namespace deletion), each after its own archive.
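The ordering above amounts to a precondition check before any destructive step. The sketch below encodes it as a checklist; the step identifiers and the function name are invented for illustration and do not correspond to any existing tool.

```python
# Hypothetical step identifiers, in the order given above.
REVERSIBLE_CUTS = (
    "authentik_group_removed",
    "roster_removed_and_reconciled",
    "k8s_users_removed_and_applied",
)
GATES = ("access_verified_gone", "archive_taken")


def may_run_destructive(done):
    """Return (allowed, missing) for the set of completed step names."""
    missing = [step for step in REVERSIBLE_CUTS + GATES if step not in done]
    return (not missing, missing)


partial = may_run_destructive({"authentik_group_removed", "archive_taken"})
complete = may_run_destructive(set(REVERSIBLE_CUTS) | set(GATES))
print(partial)
print(complete)
```

Returning the missing steps with the verdict tells the operator exactly what is still outstanding. A user who exists on only one surface would need a shorter list of reversible cuts than the one assumed here.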
|
||||