workstation: per-user code_layout — workspace puts project repos under ~/code (ancamilea + tripit)

Viktor asked to restructure Anca's setup: her ~/code WAS the infra clone
itself; he wants ~/code to be the directory where all her project repos
(tripit etc.) live side by side, with infra moved to a subdirectory.

- roster.yaml gains per-user 'code_layout: single|workspace' + 'repos',
  validated + derived by roster_engine.py (12 new tests, 40 total).
- t3-provision-users reconcile: auto-migrates a single-layout ~/code to
  ~/code/infra (running processes follow the moved inode), hoists nested
  project clones to the workspace root, clones roster repos from Forgejo
  AS the user (their PAT makes private repos work), and wires the
  documented forgejo remote + forgejo/master upstream into clones that
  predate that contract.
- Fixed a latent TSV bug: empty jq @tsv fields collapse under tab-IFS
  read, shifting later fields left (groups was only safe by being the
  last field) — emit '-' sentinels instead.
- start-claude.sh session freshen is layout-aware (freshens each repo
  under ~/code for workspace users).
- managed claudeMd + AGENTS.md non-admin recipe + multi-tenancy.md
  updated in the same change.
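
The TSV fix in the list above can be illustrated with a small sketch. It is written in Python rather than bash, so `tab_ifs_read` is only an approximation of `IFS=$'\t' read -r` (a hypothetical helper for illustration, not the provisioner's code): tab counts as IFS whitespace, so a run of tabs acts as one delimiter. The account values are made up.

```python
from __future__ import annotations

SENTINEL = "-"  # placeholder emitted for an empty field, as in the commit


def tab_ifs_read(line: str, nvars: int) -> list[str]:
    """Approximate bash `IFS=$'\\t' read -r v1 .. vN` for a single line.

    Tab is IFS whitespace, so consecutive tabs collapse into one delimiter
    and empty fields vanish. Missing trailing variables are set to "".
    (Hypothetical helper; it ignores the remainder-into-last-variable rule.)
    """
    fields = [f for f in line.split("\t") if f]
    return (fields + [""] * nvars)[:nvars]


def encode_row(fields: list[str]) -> str:
    """Join fields with tabs, replacing empty ones with the sentinel."""
    return "\t".join(f if f else SENTINEL for f in fields)


def decode_row(line: str, nvars: int) -> list[str]:
    """Read a sentinel-encoded line and map the sentinel back to ""."""
    return ["" if f == SENTINEL else f for f in tab_ifs_read(line, nvars)]


# Hypothetical account row: os_user, tier, shell, groups, code_layout, repos
row = ["alice", "power-user", "/bin/bash", "", "workspace", "tripit"]

naive = tab_ifs_read("\t".join(row), 6)   # empty groups: later fields shift left
fixed = decode_row(encode_row(row), 6)    # sentinel keeps every column in place

print(naive)
print(fixed)
```

The sketch assumes "-" is never a legitimate value for the encoded fields; a real group or repo named "-" would be read back as empty.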

Applied live: ancamilea = workspace (infra at ~/code/infra, her existing
tripit clone hoisted to ~/code/tripit, master upstream switched to
forgejo/master); emo stays single layout, untouched. [ci skip]
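
The migration and hoist applied above amount to a few renames. A minimal Python sketch under stated assumptions: the real logic lives in the bash reconcile, the function bodies here are an illustration only, the scratch path name is made up, and ownership handling plus the check that the infra repo does not track the nested directory are left out.

```python
import os
import tempfile
from pathlib import Path


def migrate_to_workspace(home: Path) -> None:
    """Turn a single-layout ~/code (a git clone) into ~/code/infra."""
    code = home / "code"
    if not (code / ".git").is_dir():
        return  # already a workspace directory, or nothing to migrate
    tmp = home / ".code-migrate.tmp"  # hypothetical scratch name
    code.rename(tmp)            # move the whole clone aside
    code.mkdir()                # ~/code becomes a plain directory
    tmp.rename(code / "infra")  # the clone lands at ~/code/infra


def hoist_nested_repo(home: Path, repo: str) -> None:
    """Move a nested clone ~/code/infra/<repo> up to ~/code/<repo>."""
    src = home / "code" / "infra" / repo
    dst = home / "code" / repo
    if (src / ".git").is_dir() and not dst.exists():
        src.rename(dst)


with tempfile.TemporaryDirectory() as d:
    home = Path(d)
    # Fake single-layout home: ~/code is a clone with tripit nested inside.
    (home / "code" / ".git").mkdir(parents=True)
    (home / "code" / "tripit" / ".git").mkdir(parents=True)
    inode_before = os.stat(home / "code").st_ino

    migrate_to_workspace(home)
    hoist_nested_repo(home, "tripit")

    inode_after = os.stat(home / "code" / "infra").st_ino
    layout = sorted(p.name for p in (home / "code").iterdir())
    same_inode = inode_before == inode_after
    print(layout, same_inode)
```

A rename within one filesystem only changes directory entries, which is why a process whose working directory was the old ~/code keeps working inside ~/code/infra.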
Viktor Barzin 2026-06-10 18:05:31 +00:00
parent 3b6a5c6737
commit 2825cb1703
8 changed files with 306 additions and 44 deletions

@ -231,10 +231,18 @@ Per-workload opt-out: add the label `keel.sh/policy: never` on the Deployment me
Non-admin devvm users (power-user / namespace-owner tiers) may not know git at
all. Their agent handles every version-control step silently — never ask them
to commit, push, pull, or open a PR, and never surface git jargon at them.
Their `~/code` clone arrives preconfigured: git identity, a `forgejo` remote
Their infra clone arrives preconfigured: git identity, a `forgejo` remote
authenticated via `~/.git-credentials`, and `master` tracking `forgejo/master`
(auto-freshened hourly and at session launch, fast-forward only).
Two per-user layouts exist (`code_layout` in
`scripts/workstation/roster.yaml`): `single` (the default) — `~/code` IS the
locked infra clone — and `workspace`, where `~/code` is a plain directory of
per-project clones: the infra clone at `~/code/infra`, plus each roster
`repos` entry (e.g. `~/code/tripit`) cloned from Forgejo `viktor/<name>` with
the user's own PAT. The reconcile auto-migrates a single-layout `~/code` when
a user is flipped to `workspace`, and keeps every clone fresh either way.
The model is **allow-then-audit** (Viktor, 2026-06-10): whitelisted users (emo)
push straight to `master` — no PR gate — and the record of *what changed and
why* is what matters. Force-push is disabled for everyone, so master history

@ -543,19 +543,19 @@ Separate from the in-cluster namespace-owner model above, the **devvm** (`10.0.1
**Config inheritance (live):** wizard authors the base (his chezmoi-versioned `~/.claude`). Two native layers carry it to every user — the enforced org `claudeMd` in `/etc/claude-code/managed-settings.json` (top precedence, all sessions) and per-user `~/.claude/{skills,rules,…}` **symlinks** to the base (seeded via `/etc/skel`; edits propagate live). Secrets stay per-user at mode 600, never symlinked. **The managed config self-deploys from the repo** (2026-06-10): the hourly reconcile's `sync_managed_config` installs `scripts/workstation/managed-settings.json` to `/etc/claude-code/` whenever the repo copy changes — so editing the claudeMd = edit + commit, no manual install — and `refresh_codex_mirror` regenerates each user's `~/.codex/AGENTS.md` (a static mirror of the claudeMd; only files carrying the mirror header are touched, user-customized ones are left alone). Repo-level guidance (`.claude/CLAUDE.md`, `AGENTS.md`, `CONTEXT.md` in the infra repo) reaches non-admins through their auto-freshened clones — commit + push and every user has it within the hour.
**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` — code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. The provisioner clones anonymously from the public GitHub mirror; **contribute access is wired per-user on top** (see below). The apply boundary still holds (`scripts/tg apply` needs an admin Vault token + cluster RBAC), but **pushing `master` is NOT inert** — the Forgejo→Woodpecker webhook fires `.woodpecker/default.yml` (`event: push, branch: master`, `require_approval: forks` only), which terragrunt-applies changed stacks. `master` is **branch-protected on Forgejo** (force-push disabled for everyone — history is append-only; push + merge whitelists = `viktor` + explicitly granted users, deploy keys allowed). **Allow-then-audit (Viktor, 2026-06-10):** `ebarzin` (emo) is on the whitelist and pushes straight to `master` — no PR gate. The tracking burden moves to: (a) **commit messages that record what + why** (the agent instructions in AGENTS.md and the managed claudeMd require the body to paraphrase the user's request), (b) the **`notify-nonadmin-push` Slack audit step** in `.woodpecker/default.yml` — every master push by a non-admin author is posted to Slack (admin pushes are not), and (c) non-admins **never use `[ci skip]`** so every change fires the pipeline (and thus the audit feed). Users NOT on the whitelist fall back to `<user>/<topic>` branches + PRs. **Clones stay fresh automatically** (2026-06-10): the hourly `t3-provision-users` reconcile runs `refresh_locked_clone` (fetch all remotes + fast-forward `master`, ONLY when on master with a clean tree and an upstream — dirty trees and local commits are left alone with a WARN), and `start-claude.sh` does the same freshen at session launch (15s-capped so an offline remote never stalls the session).
**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo — code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. Its location depends on the per-user `code_layout` in `roster.yaml`: `single` (default) puts the clone AT `~/code`; `workspace` makes `~/code` a plain directory of per-project clones — the infra clone at `~/code/infra` plus each roster `repos` entry cloned from Forgejo `viktor/<name>` **as the user** (their PAT authenticates, so private repos work; clone failures WARN and retry next hour). Flipping a user to `workspace` auto-migrates their existing `~/code` clone to `~/code/infra` (local branches/dirty state survive; running processes follow the moved inode). ancamilea = workspace + `tripit` since 2026-06-10. The provisioner clones infra anonymously from the public GitHub mirror; **contribute access is wired per-user on top** (see below). The apply boundary still holds (`scripts/tg apply` needs an admin Vault token + cluster RBAC), but **pushing `master` is NOT inert** — the Forgejo→Woodpecker webhook fires `.woodpecker/default.yml` (`event: push, branch: master`, `require_approval: forks` only), which terragrunt-applies changed stacks. `master` is **branch-protected on Forgejo** (force-push disabled for everyone — history is append-only; push + merge whitelists = `viktor` + explicitly granted users, deploy keys allowed). **Allow-then-audit (Viktor, 2026-06-10):** `ebarzin` (emo) is on the whitelist and pushes straight to `master` — no PR gate. 
The tracking burden moves to: (a) **commit messages that record what + why** (the agent instructions in AGENTS.md and the managed claudeMd require the body to paraphrase the user's request), (b) the **`notify-nonadmin-push` Slack audit step** in `.woodpecker/default.yml` — every master push by a non-admin author is posted to Slack (admin pushes are not), and (c) non-admins **never use `[ci skip]`** so every change fires the pipeline (and thus the audit feed). Users NOT on the whitelist fall back to `<user>/<topic>` branches + PRs. **Clones stay fresh automatically** (2026-06-10): the hourly `t3-provision-users` reconcile runs `refresh_user_clone` over every managed clone — the infra clone and any workspace repos (fetch all remotes + fast-forward `master`, ONLY when on master with a clean tree and an upstream — dirty trees and local commits are left alone with a WARN) — and also `wire_forgejo_remote`, which idempotently adds the documented `forgejo` remote + `forgejo/master` upstream to infra clones that predate that contract. `start-claude.sh` does the same freshen at session launch (10s fetch cap per repo so an offline remote never stalls the session; workspace layouts freshen each repo under `~/code`).
**Contribute access (per non-admin, manual — the anca/tripit PAT precedent):**
1. Add their Forgejo user as a **write** collaborator on `viktor/infra` (`PUT /api/v1/repos/viktor/infra/collaborators/<login>`).
2. Mint a PAT — the admin REST endpoint 404s here, use the in-pod CLI: `kubectl -n forgejo exec deploy/forgejo -- su -s /bin/sh git -c "forgejo admin user generate-access-token --username <login> --token-name devvm-infra-git --scopes 'write:repository'"`.
3. Install it in their `~/.git-credentials` (`https://<login>:<token>@forgejo.viktorbarzin.me`, mode 600) + `git config --global credential.helper store`, set `user.name`/`user.email`.
4. In their clone: `git remote add forgejo https://forgejo.viktorbarzin.me/viktor/infra.git` and `git branch --set-upstream-to=forgejo/master master` (origin stays the anonymous GitHub mirror).
4. The reconcile wires the clone side automatically (`wire_forgejo_remote`): `forgejo` remote + `master` tracking `forgejo/master` on every non-admin infra clone (origin stays the anonymous GitHub mirror). No manual step since 2026-06-10.
5. (Optional — Viktor's call per user) Grant direct master push: add their login to the `master` branch-protection push + merge whitelists (`PATCH /api/v1/repos/viktor/infra/branch_protections/master`). Done for `ebarzin` 2026-06-10.
6. Verify: branch push succeeds; a `master` push succeeds for whitelisted users and is rejected with `Not allowed to push to protected branch` otherwise.
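
Step 3 of the list above can be sketched in Python. This is an illustrative sketch, not the procedure the repo uses: `credential_line` and `write_git_credentials` are hypothetical helpers, the login and token are placeholders, and the sketch assumes the `store` helper's one-URL-per-line file format with percent-encoded user and password.

```python
import os
import stat
import tempfile
from urllib.parse import quote

FORGEJO_HOST = "forgejo.viktorbarzin.me"


def credential_line(login: str, token: str, host: str = FORGEJO_HOST) -> str:
    """One credential-store entry; user and password are percent-encoded."""
    return f"https://{quote(login, safe='')}:{quote(token, safe='')}@{host}"


def write_git_credentials(path: str, login: str, token: str) -> None:
    """Write a single-entry credentials file that is mode 600 from creation."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(credential_line(login, token) + "\n")
    os.chmod(path, 0o600)  # tighten the mode if the file already existed


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, ".git-credentials")
    write_git_credentials(path, "example-login", "example-token")  # placeholders
    mode = stat.S_IMODE(os.stat(path).st_mode)
    with open(path) as fh:
        content = fh.read()
    print(oct(mode), content.strip())
```

Note that this sketch overwrites the file; a user who already has other stored credentials would need an append-or-replace variant instead.
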
**Web-terminal session persistence (2026-06-10):** the tmux-based web terminal's named sessions (each running one Claude conversation) survive devvm reboots — `tmux-persist-save.timer` (5-min) snapshots every terminal user's sessions (name, cwd, conversation uuid from argv or the cwd-slug transcript dir) to `/var/lib/tmux-persist/<user>.tsv`, and `tmux-persist-restore.service` recreates missing sessions at boot with `claude --resume <uuid>` (per-session idempotent; also handles partial loss). This is a **tmux/terminal-surface** feature, deliberately outside the t3 namespace: the t3 chat surface persists its own threads (`~/.t3` state, plus the daily `t3-backup-state` dump), and Claude conversations themselves were always durable (`~/.claude/projects/`) — what this adds is the volatile tmux wiring.
**Status (2026-06-10):** built + verified on the live host — capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone, per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry (applied + impersonation-verified), the Authentik `T3 Users` edge gate, **the emo Phase-5 cutover (own clone + launcher repoint + `code-shared` removal, completed 2026-06-10) and emo's contribute access (`ebarzin` write collaborator + PAT + protected `master`)**. Per the live `/etc/skel` design, non-admin `~/.claude/{rules,skills}` symlinks into the admin base are **kept** (they ARE the shared-base delivery mechanism — the plan's step to remove them is obsolete). **Remaining (held / future):** the offboarding apply-side (Phase 7), per-user MCP/auth injection, and roster-reconciled `T3 Users` membership. See `../runbooks/offboard-user.md` for deprovisioning.
**Status (2026-06-10):** built + verified on the live host — capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone, per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry (applied + impersonation-verified), the Authentik `T3 Users` edge gate, **the emo Phase-5 cutover (own clone + launcher repoint + `code-shared` removal, completed 2026-06-10) and emo's contribute access (`ebarzin` write collaborator + PAT + protected `master`)**, and **per-user `code_layout` with the ancamilea workspace cutover (infra → `~/code/infra`, `tripit` alongside, 2026-06-10)**. Per the live `/etc/skel` design, non-admin `~/.claude/{rules,skills}` symlinks into the admin base are **kept** (they ARE the shared-base delivery mechanism — the plan's step to remove them is obsolete). **Remaining (held / future):** the offboarding apply-side (Phase 7), per-user MCP/auth injection, and roster-reconciled `T3 Users` membership. See `../runbooks/offboard-user.md` for deprovisioning.
## Related

@ -20,6 +20,12 @@ MAP=/etc/ttyd-user-map
DRY_RUN="${DRY_RUN:-0}"
# Public infra repo for the locked clone (no auth; the monorepo has no remote).
INFRA_REMOTE="${INFRA_REMOTE:-https://github.com/ViktorBarzin/infra.git}"
# Canonical push target for non-admin infra clones (AGENTS.md "Non-admin
# workstation users"), and the base URL for workspace-layout `repos` entries —
# those clone AS the user so their ~/.git-credentials PAT authenticates
# against private Forgejo repos.
FORGEJO_INFRA_REMOTE="${FORGEJO_INFRA_REMOTE:-https://forgejo.viktorbarzin.me/viktor/infra.git}"
REPO_REMOTE_BASE="${REPO_REMOTE_BASE:-https://forgejo.viktorbarzin.me/viktor}"
# Per-user OIDC kubeconfig (kubelogin/PKCE; cluster server+CA copied from the admin kubeconfig).
OIDC_ISSUER="${OIDC_ISSUER:-https://authentik.viktorbarzin.me/application/o/kubernetes/}"
ADMIN_KUBECONFIG="${ADMIN_KUBECONFIG:-/home/wizard/.kube/config}"
@ -27,22 +33,24 @@ ADMIN_KUBECONFIG="${ADMIN_KUBECONFIG:-/home/wizard/.kube/config}"
log() { echo "[t3-provision] $*"; }
run() { if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] $*"; else "$@"; fi; }
# Per-non-admin writable, git-crypt-LOCKED infra clone at ~/code. Keyless +
# Per-non-admin writable, git-crypt-LOCKED infra clone at ~/<subpath>. Keyless +
# filter=cat ⇒ code/docs are plaintext, git-crypt'd secret files stay ciphertext.
# Writable + ungated (push != apply; applies are admin-only). NEVER touches an
# existing ~/code (so emo's symlink survives until the gated cutover).
# existing target (so emo's symlink survives until the gated cutover). subpath
# is "code" (single layout) or "code/infra" (workspace layout).
install_locked_clone() {
local user="$1" home
local user="$1" sub="$2" home dst
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -z "$home" ]] && return 0
[[ -e "$home/code" || -L "$home/code" ]] && return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] locked infra clone -> $user:$home/code"; return 0; fi
log "clone locked infra -> $user:~/code"
runuser -u "$user" -- git clone --quiet --no-checkout "$INFRA_REMOTE" "$home/code"
runuser -u "$user" -- git -C "$home/code" config filter.git-crypt.smudge cat
runuser -u "$user" -- git -C "$home/code" config filter.git-crypt.clean cat
runuser -u "$user" -- git -C "$home/code" config filter.git-crypt.required false
runuser -u "$user" -- git -C "$home/code" checkout --quiet master
dst="$home/$sub"
[[ -e "$dst" || -L "$dst" ]] && return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] locked infra clone -> $user:$dst"; return 0; fi
log "clone locked infra -> $user:~/$sub"
runuser -u "$user" -- git clone --quiet --no-checkout "$INFRA_REMOTE" "$dst"
runuser -u "$user" -- git -C "$dst" config filter.git-crypt.smudge cat
runuser -u "$user" -- git -C "$dst" config filter.git-crypt.clean cat
runuser -u "$user" -- git -C "$dst" config filter.git-crypt.required false
runuser -u "$user" -- git -C "$dst" checkout --quiet master
}
# Keep an EXISTING non-admin clone fresh (the admin's tree is never touched): fetch
@ -50,18 +58,98 @@ install_locked_clone() {
# clean tree, upstream configured. Never rebases/merges; a non-ff master (local
# commits) is the user's to reconcile and is only WARNed about. Fetch failures
# (offline, missing credentials) are non-fatal: freshness is best-effort.
refresh_locked_clone() {
local user="$1" home
refresh_user_clone() {
local user="$1" sub="$2" home dir
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -n "$home" && -d "$home/code/.git" ]] || return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] refresh clone -> $user:$home/code"; return 0; fi
runuser -u "$user" -- env GIT_TERMINAL_PROMPT=0 git -C "$home/code" fetch --all --prune --quiet 2>/dev/null \
|| { log "WARN: clone fetch failed for $user (offline/credentials?) — skipped"; return 0; }
[[ "$(runuser -u "$user" -- git -C "$home/code" symbolic-ref --short -q HEAD)" == master ]] || return 0
[[ -z "$(runuser -u "$user" -- git -C "$home/code" status --porcelain)" ]] || return 0
runuser -u "$user" -- git -C "$home/code" rev-parse --verify -q 'master@{upstream}' >/dev/null || return 0
runuser -u "$user" -- git -C "$home/code" merge --ff-only 'master@{upstream}' >/dev/null 2>&1 \
|| log "WARN: $user master not fast-forwardable (local commits?) — left as-is"
dir="$home/$sub"
[[ -n "$home" && -d "$dir/.git" ]] || return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] refresh clone -> $user:$dir"; return 0; fi
runuser -u "$user" -- env GIT_TERMINAL_PROMPT=0 git -C "$dir" fetch --all --prune --quiet 2>/dev/null \
|| { log "WARN: fetch failed for $user:$sub (offline/credentials?) — skipped"; return 0; }
[[ "$(runuser -u "$user" -- git -C "$dir" symbolic-ref --short -q HEAD)" == master ]] || return 0
[[ -z "$(runuser -u "$user" -- git -C "$dir" status --porcelain)" ]] || return 0
runuser -u "$user" -- git -C "$dir" rev-parse --verify -q 'master@{upstream}' >/dev/null || return 0
runuser -u "$user" -- git -C "$dir" merge --ff-only 'master@{upstream}' >/dev/null 2>&1 \
|| log "WARN: $user:$sub master not fast-forwardable (local commits?) — left as-is"
}
# Non-admin infra clones are documented to carry a `forgejo` remote (the
# canonical push target) with master tracking forgejo/master — see AGENTS.md
# "Non-admin workstation users". Clones made before that contract only have
# the GitHub origin; wire the remote + upstream idempotently. Best-effort: an
# offline fetch leaves the upstream as-is.
wire_forgejo_remote() {
local user="$1" sub="$2" home dir
home="$(getent passwd "$user" | cut -d: -f6)"
dir="$home/$sub"
[[ -n "$home" && -d "$dir/.git" ]] || return 0
if ! runuser -u "$user" -- git -C "$dir" remote get-url forgejo >/dev/null 2>&1; then
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] add forgejo remote -> $user:$sub"; return 0; fi
log "add forgejo remote -> $user:~/$sub"
runuser -u "$user" -- git -C "$dir" remote add forgejo "$FORGEJO_INFRA_REMOTE"
fi
[[ "$DRY_RUN" == 1 ]] && return 0
[[ "$(runuser -u "$user" -- git -C "$dir" rev-parse --abbrev-ref -q 'master@{upstream}' 2>/dev/null)" == forgejo/master ]] && return 0
runuser -u "$user" -- env GIT_TERMINAL_PROMPT=0 git -C "$dir" fetch --quiet forgejo 2>/dev/null \
|| { log "WARN: forgejo fetch failed for $user — upstream left as-is"; return 0; }
runuser -u "$user" -- git -C "$dir" branch --set-upstream-to=forgejo/master master >/dev/null 2>&1 \
&& log "set $user:~/$sub master upstream -> forgejo/master" \
|| log "WARN: could not set $user:~/$sub master upstream to forgejo/master"
}
# Workspace layout: ~/code is a plain directory of per-project clones. A user
# still on the single layout (~/code IS the infra clone) is migrated by moving
# the whole clone — local branches, dirty files, untracked state all survive —
# to ~/code/infra. Running processes follow the moved inode, so live sessions
# keep working (their cwd lands inside ~/code/infra).
ensure_workspace_layout() {
local user="$1" home tmp
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -z "$home" ]] && return 0
if [[ -d "$home/code/.git" ]]; then
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] migrate $user:~/code (single clone) -> ~/code/infra"; return 0; fi
log "migrate $user: ~/code (single infra clone) -> ~/code/infra"
tmp="$home/.code-workspace-migrate.$$"
mv "$home/code" "$tmp"
install -d -o "$user" -g "$user" -m 0755 "$home/code"
mv "$tmp" "$home/code/infra"
elif [[ ! -e "$home/code" ]]; then
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] create workspace dir $user:~/code"; return 0; fi
install -d -o "$user" -g "$user" -m 0755 "$home/code"
fi
}
# Single-layout clones often accumulated nested project clones (the old layout
# gave users nowhere else to put them — e.g. ancamilea's tripit inside ~/code).
# After migration such a clone would sit buried at ~/code/infra/<repo>; hoist a
# roster repo to its workspace home instead of stranding it + cloning fresh.
# Only untracked git dirs move — content the infra repo tracks is never touched.
hoist_nested_repo() {
local user="$1" repo="$2" home src dst
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -z "$home" ]] && return 0
src="$home/code/infra/$repo"; dst="$home/code/$repo"
[[ -d "$src/.git" && ! -e "$dst" ]] || return 0
runuser -u "$user" -- git -C "$home/code/infra" ls-files --error-unmatch "$repo" >/dev/null 2>&1 && return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] hoist nested $repo -> $user:$dst"; return 0; fi
log "hoist nested $repo clone -> $user:~/code/$repo"
mv "$src" "$dst"
}
# Extra per-project repos for workspace-layout users, cloned from Forgejo AS
# the user (their ~/.git-credentials PAT authenticates against private repos).
# A failed clone (no access yet, offline) is a WARN — the reconcile must never
# abort over a single repo; the next hourly run retries.
install_user_repo() {
local user="$1" repo="$2" home dst
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -z "$home" ]] && return 0
dst="$home/code/$repo"
[[ -e "$dst" || -L "$dst" ]] && return 0
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] clone $REPO_REMOTE_BASE/$repo.git -> $user:$dst"; return 0; fi
log "clone $repo -> $user:~/code/$repo"
runuser -u "$user" -- env GIT_TERMINAL_PROMPT=0 git clone --quiet "$REPO_REMOTE_BASE/$repo.git" "$dst" 2>/dev/null \
|| log "WARN: clone of $repo failed for $user (access/offline?) — skipped"
}
# Machine-wide Claude managed config: the repo file (in the admin tree, like the
@ -218,7 +306,11 @@ jq -e . "$desired_file" >/dev/null || { echo "[t3-provision] derive produced inv
sync_managed_config
# 4) per-account: create-if-absent + ADDITIVE tier groups (never strip) + locked clone
while IFS=$'\t' read -r os_user tier shell groups_csv; do
# NB: empty @tsv fields collapse under tab-IFS read (tab is IFS whitespace), so
# the jq below emits "-" for empty groups/repos and we map it back here.
while IFS=$'\t' read -r os_user tier shell groups_csv code_layout repos_csv; do
[[ "$groups_csv" == "-" ]] && groups_csv=""
[[ "$repos_csv" == "-" ]] && repos_csv=""
if ! id "$os_user" >/dev/null 2>&1; then
log "create account: $os_user (shell $shell)"
run useradd -m -s "$shell" "$os_user"
@ -234,14 +326,29 @@ while IFS=$'\t' read -r os_user tier shell groups_csv; do
log "add $os_user -> group $g"; run gpasswd -a "$os_user" "$g" >/dev/null
done
fi
if [[ "$tier" != admin ]]; then # non-admins: locked clone (kept fresh) + kubeconfig + shared Claude token
install_locked_clone "$os_user"
refresh_locked_clone "$os_user"
if [[ "$tier" != admin ]]; then # non-admins: locked clone(s) (kept fresh) + kubeconfig + shared Claude token
if [[ "$code_layout" == workspace ]]; then
ensure_workspace_layout "$os_user"
install_locked_clone "$os_user" code/infra
wire_forgejo_remote "$os_user" code/infra # before refresh: ff targets the canonical upstream same-pass
refresh_user_clone "$os_user" code/infra
IFS=',' read -ra extra_repos <<< "$repos_csv"
for repo in "${extra_repos[@]}"; do
[[ -n "$repo" ]] || continue
hoist_nested_repo "$os_user" "$repo"
install_user_repo "$os_user" "$repo"
refresh_user_clone "$os_user" "code/$repo"
done
else
install_locked_clone "$os_user" code
wire_forgejo_remote "$os_user" code # before refresh: ff targets the canonical upstream same-pass
refresh_user_clone "$os_user" code
fi
install_user_kubeconfig "$os_user"
install_user_claude_token "$os_user"
fi
refresh_codex_mirror "$os_user" # all tiers — mirror of the managed claudeMd
done < <(jq -r '.accounts[] | [.os_user, .tier, .shell, (.groups|join(","))] | @tsv' "$desired_file")
done < <(jq -r '.accounts[] | [.os_user, .tier, .shell, (if (.groups|length)==0 then "-" else (.groups|join(",")) end), .code_layout, (if (.repos|length)==0 then "-" else (.repos|join(",")) end)] | @tsv' "$desired_file")
# 5) per-user .env (sticky port) + enable t3-serve@
while IFS=$'\t' read -r os_user port; do

@ -1,4 +1,4 @@
{
"claudeMd": "# Viktor Barzin homelab — shared multi-user Claude Code Workstation (devvm)\n\nYou are running as a specific OS user on a SHARED devvm Workstation, not as the admin. These org-wide rules apply to EVERY user and sit at the top of settings precedence (they cannot be overridden by a user's own config):\n\n- Respect your permission tier. Your kubectl, Vault, and infra access are scoped to your RBAC tier (admin / power-user / namespace-owner). Do not attempt to escalate privileges or reach another user's resources.\n- Secrets are per-user. Never read another user's home directory, credentials, tokens, or ~/.claude secrets. Your own secrets live in your home at mode 600.\n- Infrastructure changes go through Terraform/Terragrunt — never direct kubectl apply/edit/patch. Committed stack changes are auto-applied by CI on push to master; you can verify the live result with your read-only kubectl.\n- The AGENT does ALL git mechanics silently — the user may not know git, so never ask them to commit, push, pull, or open anything, and never surface git jargon. Feature-sized work is done in an isolated git worktree (`.worktrees/<topic>`, branch `<os-user>/<topic>`) and merged into master when finished, so several agents can work the same project at once — full lifecycle in ~/.claude/rules/execution.md §3; trivial single-commit fixes may go straight to master. When you finish a change in ~/code: commit it ON master and push to the forgejo remote. THE COMMIT MESSAGE IS THE AUDIT TRAIL — subject says WHAT changed; body says WHY in plain words (paraphrase the user's actual request) — this matters more than the change itself. Never use [ci skip] as a non-admin (it would hide the change from the audit feed; harmless no-op applies are fine). If the push is rejected non-fast-forward, git pull --rebase forgejo master and push again. 
If it is rejected by branch protection (user not whitelisted), fall back to a <os-user>/<topic> branch + PR via the Forgejo API (token = password field in ~/.git-credentials). Keep ~/code on a clean master when done so background auto-refresh keeps working. Tell the user in plain words what happened ('done — your change is live/recorded'). Full recipe: AGENTS.md → 'Non-admin workstation users' in ~/code.\n- Follow the engineering rules in ~/.claude/rules/ (execution, planning, quality) and every CLAUDE.md in the repo tree.\n- The monorepo is at ~/code. Non-admins get a git-crypt-LOCKED clone: secret files read as ciphertext — that is expected, not an error.",
"claudeMd": "# Viktor Barzin homelab — shared multi-user Claude Code Workstation (devvm)\n\nYou are running as a specific OS user on a SHARED devvm Workstation, not as the admin. These org-wide rules apply to EVERY user and sit at the top of settings precedence (they cannot be overridden by a user's own config):\n\n- Respect your permission tier. Your kubectl, Vault, and infra access are scoped to your RBAC tier (admin / power-user / namespace-owner). Do not attempt to escalate privileges or reach another user's resources.\n- Secrets are per-user. Never read another user's home directory, credentials, tokens, or ~/.claude secrets. Your own secrets live in your home at mode 600.\n- Infrastructure changes go through Terraform/Terragrunt — never direct kubectl apply/edit/patch. Committed stack changes are auto-applied by CI on push to master; you can verify the live result with your read-only kubectl.\n- The AGENT does ALL git mechanics silently — the user may not know git, so never ask them to commit, push, pull, or open anything, and never surface git jargon. Feature-sized work is done in an isolated git worktree (`.worktrees/<topic>`, branch `<os-user>/<topic>`) and merged into master when finished, so several agents can work the same project at once — full lifecycle in ~/.claude/rules/execution.md §3; trivial single-commit fixes may go straight to master. When you finish a change in a repo under ~/code (or ~/code itself when it IS the clone): commit it ON master and push to the forgejo remote. THE COMMIT MESSAGE IS THE AUDIT TRAIL — subject says WHAT changed; body says WHY in plain words (paraphrase the user's actual request) — this matters more than the change itself. Never use [ci skip] as a non-admin (it would hide the change from the audit feed; harmless no-op applies are fine). If the push is rejected non-fast-forward, git pull --rebase forgejo master and push again. 
If it is rejected by branch protection (user not whitelisted), fall back to a <os-user>/<topic> branch + PR via the Forgejo API (token = password field in ~/.git-credentials). Keep every clone on a clean master when done so background auto-refresh keeps working. Tell the user in plain words what happened ('done — your change is live/recorded'). Full recipe: AGENTS.md → 'Non-admin workstation users' in your infra clone.\n- Follow the engineering rules in ~/.claude/rules/ (execution, planning, quality) and every CLAUDE.md in the repo tree.\n- Code lives under ~/code, in one of two per-user layouts: either ~/code IS the git-crypt-LOCKED infra clone (single layout), or ~/code is a workspace directory of per-project clones — the locked infra clone at ~/code/infra plus other project repos alongside it (e.g. ~/code/tripit). [ -d ~/code/.git ] means single. In locked infra clones secret files read as ciphertext — that is expected, not an error.",
"model": "claude-fable-5"
}
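
The layout test quoted in the claudeMd (`[ -d ~/code/.git ]` means single) can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes any home without `~/code/.git` is a workspace, which also covers a home that has no `~/code` yet.

```python
import tempfile
from pathlib import Path


def detect_layout(home: Path) -> str:
    """Classify ~/code: a .git directory directly inside it means 'single'."""
    return "single" if (home / "code" / ".git").is_dir() else "workspace"


def infra_clone(home: Path) -> Path:
    """Path of the locked infra clone for either layout."""
    code = home / "code"
    return code if detect_layout(home) == "single" else code / "infra"


with tempfile.TemporaryDirectory() as d:
    # Two fake homes, one per layout.
    single_home = Path(d) / "emo"
    (single_home / "code" / ".git").mkdir(parents=True)
    workspace_home = Path(d) / "ancamilea"
    (workspace_home / "code" / "infra" / ".git").mkdir(parents=True)

    single = (
        detect_layout(single_home),
        infra_clone(single_home).relative_to(single_home).as_posix(),
    )
    workspace = (
        detect_layout(workspace_home),
        infra_clone(workspace_home).relative_to(workspace_home).as_posix(),
    )
    print(single, workspace)
```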

@ -10,6 +10,14 @@
# power-user - cluster-wide READ (no Secrets) via oidc-power-user-readonly; locked clone
# namespace-owner - admin in their own namespace(s) only; locked clone
#
# Optional per-user code layout (non-admins):
# code_layout: single (default) - ~/code IS the locked infra clone
# code_layout: workspace - ~/code is a directory of per-project clones:
# the locked infra clone at ~/code/infra, plus
# each `repos` entry cloned from Forgejo
# viktor/<name> with the user's own PAT.
# A single-layout ~/code is auto-migrated.
#
# wizard IS listed (as admin): the reconcile REGENERATES /etc/ttyd-user-map +
# dispatch.json from this file, so omitting him would drop his t3 instance. The
# provisioner skips account/group/clone mutations for already-existing users, so
@ -17,5 +25,5 @@
users:
  wizard: {authentik_user: vbarzin, k8s_user: wizard, tier: admin} # base config author + cluster-admin
  emo: {authentik_user: emil.barzin, k8s_user: emo, tier: power-user} # NET-NEW k8s_users entry (add as power-user before provisioning)
  ancamilea: {authentik_user: ancaelena98, k8s_user: anca, tier: namespace-owner, namespaces: [plotting-book]} # ALREADY provisioned in-cluster -- assert, don't re-create
  ancamilea: {authentik_user: ancaelena98, k8s_user: anca, tier: namespace-owner, namespaces: [plotting-book], code_layout: workspace, repos: [tripit]} # ALREADY provisioned in-cluster -- assert, don't re-create
  # gheorghe: {authentik_user: vabbit81, k8s_user: vabbit81, tier: namespace-owner, namespaces: [vabbit81]} # already a cluster ns-owner; uncomment to give him a devvm workstation
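
How the `code_layout` and `repos` keys above are accepted or rejected can be sketched in a few lines of Python. This is a standalone approximation of the checks in roster_engine.py, not a copy of it: `parse_layout` and `is_valid` are hypothetical names, `RosterError` is a stand-in class, and the sketch uses `fullmatch` plus a string type check, which are assumptions of this example.

```python
from __future__ import annotations

import re

VALID_CODE_LAYOUTS = ("single", "workspace")
# Leading alphanumeric, then alphanumerics, dot, underscore or hyphen.
REPO_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


class RosterError(ValueError):
    """Stand-in for the engine's own error type."""


def parse_layout(os_user: str, spec: dict) -> tuple[str, tuple[str, ...]]:
    """Return (code_layout, repos) for one roster entry, or raise RosterError."""
    layout = spec.get("code_layout", "single")
    if layout not in VALID_CODE_LAYOUTS:
        raise RosterError(f"user {os_user!r}: unknown code_layout {layout!r}")
    repos = tuple(spec.get("repos") or ())
    if repos and layout != "workspace":
        raise RosterError(f"user {os_user!r}: repos require code_layout: workspace")
    for repo in repos:
        # fullmatch (an assumption of this sketch) also rejects a trailing newline
        if not isinstance(repo, str) or not REPO_NAME_RE.fullmatch(repo):
            raise RosterError(f"user {os_user!r}: unsafe repo name {repo!r}")
    return layout, repos


def is_valid(spec: dict) -> bool:
    """True when the entry passes validation."""
    try:
        parse_layout("example", spec)
        return True
    except RosterError:
        return False


print(parse_layout("ancamilea", {"code_layout": "workspace", "repos": ["tripit"]}))
print(is_valid({"repos": ["tripit"]}))  # repos without workspace layout
```

Names such as `../etc`, `-rf` or `a/b` fail the pattern, which matters because repo names end up in clone and move paths run by the reconcile.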

@@ -13,6 +13,7 @@ per person and are recorded explicitly (no email->username derivation).
 from __future__ import annotations
 import json
+import re
 import sys
 from dataclasses import dataclass, field
 from typing import Iterable
@@ -21,6 +22,13 @@ import yaml
 BASE_PORT = 3773
 VALID_TIERS = ("admin", "power-user", "namespace-owner")
+# single    - ~/code IS the locked infra clone (the original non-admin layout)
+# workspace - ~/code is a plain directory of per-project clones; the locked
+#             infra clone lives at ~/code/infra and `repos` clone alongside it
+VALID_CODE_LAYOUTS = ("single", "workspace")
+# Repo names become root-executed clone/mv paths under ~/code — plain
+# leading-alphanumeric names only (no separators, dotfiles, or option-like names).
+_REPO_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")
 # Tier -> supplementary groups the reconcile ENSURES (additive-only; never stripped).
 TIER_GROUPS: dict[str, tuple[str, ...]] = {
     "admin": ("code-shared", "docker", "sudo"),
@@ -48,6 +56,8 @@ class User:
     k8s_user: str
     tier: str
     namespaces: tuple[str, ...] = ()
+    code_layout: str = "single"
+    repos: tuple[str, ...] = ()


 @dataclass(frozen=True)
@@ -62,6 +72,8 @@ class Account:
     shell: str
     login_locked: bool
     groups: tuple[str, ...]
+    code_layout: str = "single"
+    repos: tuple[str, ...] = ()


 @dataclass(frozen=True)
@@ -98,7 +110,31 @@ def _parse_user(os_user: str, spec: dict) -> User:
         raise RosterError(f"user {os_user!r}: namespace-owner requires namespaces")
     if tier != "namespace-owner" and namespaces:
         raise RosterError(f"user {os_user!r}: only namespace-owner may set namespaces")
-    return User(os_user, spec["authentik_user"], spec["k8s_user"], tier, namespaces)
+    code_layout = spec.get("code_layout", "single")
+    if code_layout not in VALID_CODE_LAYOUTS:
+        raise RosterError(
+            f"user {os_user!r}: unknown code_layout {code_layout!r} "
+            f"(valid: {list(VALID_CODE_LAYOUTS)})"
+        )
+    repos = tuple(spec.get("repos") or ())
+    if repos and code_layout != "workspace":
+        raise RosterError(f"user {os_user!r}: repos require code_layout: workspace")
+    for repo in repos:
+        if not _REPO_NAME_RE.fullmatch(repo):
+            raise RosterError(f"user {os_user!r}: unsafe repo name {repo!r}")
+    if "infra" in repos:
+        raise RosterError(
+            f"user {os_user!r}: infra is implicit at ~/code/infra — drop it from repos"
+        )
+    return User(
+        os_user,
+        spec["authentik_user"],
+        spec["k8s_user"],
+        tier,
+        namespaces,
+        code_layout,
+        repos,
+    )


 def load_roster(text: str) -> Roster:
@@ -205,6 +241,8 @@ def derive_desired_state(
             shell=DEFAULT_SHELL,
             login_locked=True,
             groups=TIER_GROUPS[u.tier],
+            code_layout=u.code_layout,
+            repos=u.repos,
         )
         for u in roster.users.values()
     }
@@ -257,6 +295,8 @@ def _desired_state_to_dict(ds: DesiredState) -> dict:
                 "shell": a.shell,
                 "login_locked": a.login_locked,
                 "groups": list(a.groups),
+                "code_layout": a.code_layout,
+                "repos": list(a.repos),
             }
             for name, a in ds.accounts.items()
         },
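
The `code_layout` and `repos` rules added to `_parse_user` can be exercised in isolation with a standalone sketch like the one below. It is not the engine's code: `parse_layout` is a hypothetical helper, it raises `ValueError` rather than `RosterError`, and it is deliberately a little stricter than the diff in two places (both marked as assumptions in the comments).

```python
import re

VALID_CODE_LAYOUTS = ("single", "workspace")
REPO_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_layout(spec: dict) -> tuple[str, tuple[str, ...]]:
    """Validate one roster entry and return (code_layout, repos)."""
    layout = spec.get("code_layout", "single")
    if layout not in VALID_CODE_LAYOUTS:
        raise ValueError(f"unknown code_layout {layout!r}")
    raw = spec.get("repos") or []
    # Assumption (stricter than the diff): tuple("tripit") would yield six
    # one-letter names that each pass the regex, so require a real list.
    if not isinstance(raw, (list, tuple)):
        raise ValueError("repos must be a list")
    repos = tuple(raw)
    if repos and layout != "workspace":
        raise ValueError("repos require code_layout: workspace")
    for repo in repos:
        # Assumption: fullmatch() instead of a `$`-anchored match(), because
        # `$` also matches just before a trailing newline ("tripit\n").
        if not isinstance(repo, str) or not REPO_NAME_RE.fullmatch(repo):
            raise ValueError(f"unsafe repo name {repo!r}")
    if "infra" in repos:
        raise ValueError("infra is implicit at ~/code/infra")
    return layout, repos
```

Calling `parse_layout({"code_layout": "workspace", "repos": ["tripit"]})` returns `("workspace", ("tripit",))`, while an entry with `repos` but no `code_layout` is rejected, matching the behaviour the tests in this commit assert.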

View file

@@ -19,17 +19,25 @@ fi
 cd "$HOME/code" 2>/dev/null || cd "$HOME"

-# Freshen ~/code at session start so the user begins on current upstream state
-# (the hourly t3-provision-users reconcile does the same in the background).
-# Fast-forward only, and only when safe (on master + clean tree); hard 15s cap so
-# an offline remote never stalls the launch. No-op for repos without remotes.
-if [ -d "$HOME/code/.git" ]; then
-  GIT_TERMINAL_PROMPT=0 timeout 15 git -C "$HOME/code" fetch --all --prune --quiet 2>/dev/null || true
-  if [ "$(git -C "$HOME/code" symbolic-ref --short -q HEAD)" = master ] \
-     && [ -z "$(git -C "$HOME/code" status --porcelain 2>/dev/null)" ] \
-     && git -C "$HOME/code" rev-parse --verify -q 'master@{upstream}' >/dev/null 2>&1; then
-    git -C "$HOME/code" merge --ff-only 'master@{upstream}' >/dev/null 2>&1 || true
+# Freshen the user's clone(s) at session start so they begin on current upstream
+# state (the hourly t3-provision-users reconcile does the same in the background).
+# Single layout freshens ~/code itself; workspace layout freshens each repo under
+# ~/code. Fast-forward only, and only when safe (on master + clean tree); hard
+# 10s fetch cap per repo so an offline remote never stalls the launch.
+freshen_repo() {
+  GIT_TERMINAL_PROMPT=0 timeout 10 git -C "$1" fetch --all --prune --quiet 2>/dev/null || true
+  if [ "$(git -C "$1" symbolic-ref --short -q HEAD)" = master ] \
+     && [ -z "$(git -C "$1" status --porcelain 2>/dev/null)" ] \
+     && git -C "$1" rev-parse --verify -q 'master@{upstream}' >/dev/null 2>&1; then
+    git -C "$1" merge --ff-only 'master@{upstream}' >/dev/null 2>&1 || true
   fi
+}
+if [ -d "$HOME/code/.git" ]; then
+  freshen_repo "$HOME/code"
+else
+  for repo_git in "$HOME"/code/*/.git; do
+    [ -d "$repo_git" ] && freshen_repo "${repo_git%/.git}"
+  done
 fi

 # Prefer the system-wide `claude` (installed by setup-devvm.sh); fall back to npx.
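
For readers who prefer to reason about the freshen step outside the shell, the guard logic of `freshen_repo` can be restated as a Python sketch. This is a hypothetical translation, not part of the commit: the function names are invented here, and `subprocess.run(..., timeout=10)` only approximates the `timeout 10` wrapper (it kills the child rather than sending SIGTERM).

```python
from __future__ import annotations

import os
import subprocess

UPSTREAM = "master@{upstream}"


def safe_to_fast_forward(branch: str, porcelain: str, has_upstream: bool) -> bool:
    """Mirror the three shell guards: on master, clean tree, upstream configured."""
    return branch == "master" and porcelain == "" and has_upstream


def _git(repo: str, *args: str, timeout: float | None = None) -> subprocess.CompletedProcess:
    """Run `git -C repo ...` with prompts disabled and stderr discarded."""
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    return subprocess.run(
        ["git", "-C", repo, *args],
        env=env,
        timeout=timeout,
        text=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )


def freshen_repo(repo: str) -> bool:
    """Fetch with a 10 s cap, then fast-forward master only when it is safe.

    Returns True only if a fast-forward merge ran and succeeded.
    """
    try:
        _git(repo, "fetch", "--all", "--prune", "--quiet", timeout=10)
    except subprocess.TimeoutExpired:
        pass  # offline remote: continue with whatever was already fetched
    branch = _git(repo, "symbolic-ref", "--short", "-q", "HEAD").stdout.strip()
    porcelain = _git(repo, "status", "--porcelain").stdout.strip()
    has_upstream = _git(repo, "rev-parse", "--verify", "-q", UPSTREAM).returncode == 0
    if not safe_to_fast_forward(branch, porcelain, has_upstream):
        return False
    return _git(repo, "merge", "--ff-only", UPSTREAM).returncode == 0
```

Keeping the decision in a pure function (`safe_to_fast_forward`) makes the "only when safe" rule testable without a real repository; the git calls themselves stay thin wrappers.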

View file

@@ -86,6 +86,97 @@ def test_missing_users_key_is_valid_empty():
     assert _roster("{}").users == {}


+# --------------------------------------------------------------------------
+# code_layout + repos: per-user workspace layout (~/code/<repo> clones)
+# --------------------------------------------------------------------------
+def test_code_layout_defaults_to_single_with_no_repos():
+    r = _roster("users: {emo: {authentik_user: e, k8s_user: emo, tier: power-user}}")
+    assert r.users["emo"].code_layout == "single"
+    assert r.users["emo"].repos == ()
+
+
+def test_workspace_layout_carries_repos():
+    r = _roster(
+        """
+        users:
+          ancamilea: {authentik_user: ancaelena98, k8s_user: anca,
+                      tier: namespace-owner, namespaces: [plotting-book],
+                      code_layout: workspace, repos: [tripit]}
+        """
+    )
+    u = r.users["ancamilea"]
+    assert u.code_layout == "workspace"
+    assert u.repos == ("tripit",)
+
+
+def test_rejects_unknown_code_layout():
+    with pytest.raises(eng.RosterError, match="code_layout"):
+        _roster(
+            "users: {bob: {authentik_user: b, k8s_user: b, tier: power-user, "
+            "code_layout: flat}}"
+        )
+
+
+def test_repos_require_workspace_layout():
+    # repos clone to ~/code/<name>, which only exists under the workspace layout.
+    with pytest.raises(eng.RosterError, match="workspace"):
+        _roster(
+            "users: {bob: {authentik_user: b, k8s_user: b, tier: power-user, "
+            "repos: [tripit]}}"
+        )
+
+
+@pytest.mark.parametrize("bad", ["../evil", "a/b", "", ".hidden", "-flag"])
+def test_rejects_path_unsafe_repo_name(bad):
+    # Repo names become root-executed clone/mv paths — reject anything that
+    # isn't a plain leading-alphanumeric name.
+    with pytest.raises(eng.RosterError, match="repo"):
+        _roster(
+            "users: {bob: {authentik_user: b, k8s_user: b, tier: power-user, "
+            f"code_layout: workspace, repos: ['{bad}']" "}}"
+        )
+
+
+def test_rejects_infra_in_repos():
+    # The infra clone is implicit at ~/code/infra for workspace users.
+    with pytest.raises(eng.RosterError, match="implicit"):
+        _roster(
+            "users: {bob: {authentik_user: b, k8s_user: b, tier: power-user, "
+            "code_layout: workspace, repos: [infra]}}"
+        )
+
+
+def test_derive_accounts_carry_code_layout_and_repos():
+    r = _roster(
+        """
+        users:
+          emo: {authentik_user: e, k8s_user: emo, tier: power-user}
+          ancamilea: {authentik_user: a, k8s_user: anca, tier: namespace-owner,
+                      namespaces: [plotting-book], code_layout: workspace,
+                      repos: [tripit]}
+        """
+    )
+    ds = eng.derive_desired_state(r, {})
+    assert ds.accounts["emo"].code_layout == "single"
+    assert ds.accounts["emo"].repos == ()
+    assert ds.accounts["ancamilea"].code_layout == "workspace"
+    assert ds.accounts["ancamilea"].repos == ("tripit",)
+
+
+def test_desired_state_dict_includes_code_layout_and_repos():
+    # The JSON adapter is the contract the bash provisioner consumes via jq.
+    r = _roster(
+        "users: {ancamilea: {authentik_user: a, k8s_user: anca, "
+        "tier: namespace-owner, namespaces: [plotting-book], "
+        "code_layout: workspace, repos: [tripit]}}"
+    )
+    d = eng._desired_state_to_dict(eng.derive_desired_state(r, {}))
+    assert d["accounts"]["ancamilea"]["code_layout"] == "workspace"
+    assert d["accounts"]["ancamilea"]["repos"] == ["tripit"]
+
+
 # --------------------------------------------------------------------------
 # validate_tiers: roster tier vs live k8s_users (fail-loud, module #1)
 # --------------------------------------------------------------------------