From a49d1eadf6aa525f6c4b6dc8b4b36fb7eedeeb40 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Wed, 10 Jun 2026 14:53:43 +0000
Subject: [PATCH] =?UTF-8?q?workstation:=20emo=20direct=20master=20push=20?=
 =?UTF-8?q?=E2=80=94=20allow-then-audit=20[ci=20skip]?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Viktor: emo may make any change; what matters is tracking what changed
and why. ebarzin added to master push+merge whitelists (force-push stays
disabled - append-only history).

Tracking enforced three ways:
- agent instructions (managed claudeMd + AGENTS.md): commit body MUST
  carry the user's plain-language intent; commits land on master
  directly; [ci skip] forbidden for non-admins
- new notify-nonadmin-push step in .woodpecker/default.yml: Slack
  message for every non-admin master push (admin pushes silent)
- PR flow remains the fallback for non-whitelisted users

Accepted consequence (informed): emo's pushes auto-apply changed stacks
via CI. Offboard runbook gains whitelist-removal step.

Co-Authored-By: Claude Fable 5
---
 .woodpecker/default.yml                       | 20 +++++++++
 AGENTS.md                                     | 44 +++++++++++++------
 docs/architecture/multi-tenancy.md            |  5 ++-
 ...026-06-07-multi-user-workstation-design.md |  1 +
 docs/runbooks/offboard-user.md                |  5 +++
 scripts/workstation/managed-settings.json     |  2 +-
 6 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/.woodpecker/default.yml b/.woodpecker/default.yml
index 5661bccd..0df7500e 100644
--- a/.woodpecker/default.yml
+++ b/.woodpecker/default.yml
@@ -24,6 +24,26 @@ clone:
     backoff: 10s
 
 steps:
+  # Audit feed for the allow-then-audit contribution model: any master push by
+  # a NON-admin author is surfaced in Slack (Viktor's own pushes are not).
+  # Runs before apply and never blocks it. Note: [ci skip] commits never reach
+  # this step (Woodpecker skips the whole pipeline) - hence the rule that
+  # non-admins must not use [ci skip].
+  - name: notify-nonadmin-push
+    image: curlimages/curl
+    environment:
+      SLACK_WEBHOOK:
+        from_secret: slack_webhook
+    commands:
+      - |
+        case "$CI_COMMIT_AUTHOR" in
+          viktor|ViktorBarzin|wizard) echo "admin push - no notify"; exit 0 ;;
+        esac
+        SUBJECT=$(echo "$CI_COMMIT_MESSAGE" | head -1 | tr -d '"\\')
+        curl -s -X POST -H 'Content-type: application/json' \
+          --data "{\"text\":\"infra master push by *$CI_COMMIT_AUTHOR*: $SUBJECT\n$CI_REPO_URL/commit/$CI_COMMIT_SHA\"}" \
+          "$SLACK_WEBHOOK" || true
+
   - name: apply
     image: forgejo.viktorbarzin.me/viktor/infra-ci:latest
     pull: true
diff --git a/AGENTS.md b/AGENTS.md
index 585b3da7..7559d276 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -235,23 +235,39 @@ Their `~/code` clone arrives preconfigured: git identity, a `forgejo` remote
 authenticated via `~/.git-credentials`, and `master` tracking `forgejo/master`
 (auto-freshened hourly and at session launch, fast-forward only).
 
+The model is **allow-then-audit** (Viktor, 2026-06-10): whitelisted users (emo)
+push straight to `master` - no PR gate - and the record of *what changed and
+why* is what matters. Force-push is disabled for everyone, so master history
+is append-only.
+
 To land a finished change from such a clone:
 
-1. `git checkout -b <os-user>/<short-topic> master` - always branch off fresh master
-2. Commit with a clear message (identity is preconfigured)
-3. `git push forgejo <os-user>/<short-topic>`
-4. Open the PR with the user's own PAT (`write:repository` suffices - verified 2026-06-10):
-   ```bash
-   TOK=$(sed -E 's#https://[^:]+:([^@]+)@.*#\1#' ~/.git-credentials)
-   curl -X POST -H "Authorization: token $TOK" -H 'Content-Type: application/json' \
-     https://forgejo.viktorbarzin.me/api/v1/repos/viktor/infra/pulls \
-     -d '{"title":"<title>","head":"<os-user>/<short-topic>","base":"master","body":"<what + why>"}'
-   ```
-5. `git checkout master` - leave the clone clean so auto-refresh keeps working
-6. Tell the user in plain language that the change is submitted for Viktor's review
+1. Commit on `master`. **The commit message is the audit trail** - this matters
+   more than the change itself:
+   - subject: what changed, specific ("ha-sofia: lower fan curve bias to -5")
+   - body: WHY, in plain words - paraphrase the user's actual request and any
+     reasoning ("Emil asked for quieter fans in the evening; curve was
+     overshooting after the 2026-06-08 redesign")
+2. `git push forgejo master`. If rejected non-fast-forward: `git pull --rebase
+   forgejo master` and push again.
+3. **Never use `[ci skip]`** as a non-admin - it hides the change from the
+   Slack audit feed; a no-op CI apply on a docs-only commit is harmless.
+4. Leave the clone on clean `master` so auto-refresh keeps working.
+5. Tell the user in plain language what happened. Stack changes are
+   auto-applied by CI - verify the live result with the user's read-only
+   kubectl before saying "it's live".
 
-Direct pushes to `master` are rejected by branch protection; merging (and the
-CI apply a master push triggers) is admin-only.
+If a push to `master` is rejected by branch protection (user not on the
+whitelist - e.g. new users before Viktor grants it), fall back to a
+`<os-user>/<short-topic>` branch + PR with the user's own PAT
+(`write:repository` suffices - verified 2026-06-10):
+
+```bash
+TOK=$(sed -E 's#https://[^:]+:([^@]+)@.*#\1#' ~/.git-credentials)
+curl -X POST -H "Authorization: token $TOK" -H 'Content-Type: application/json' \
+  https://forgejo.viktorbarzin.me/api/v1/repos/viktor/infra/pulls \
+  -d '{"title":"<title>","head":"<os-user>/<short-topic>","base":"master","body":"<what + why>"}'
+```
 
 ## Common Operations
 - **Deploy new service**: Use `stacks/<existing-service>/` as template. Create stack, add DNS in tfvars, apply platform then service.
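
Reviewer note: the commit-message rules in the AGENTS.md hunk above (a specific subject, a body that records the WHY, no `[ci skip]` for non-admins) could be checked mechanically before a push. The following is a minimal sketch under stated assumptions: `check_audit_message` is a hypothetical helper that is not part of this patch, and it only verifies that a body exists, not that the body faithfully paraphrases the user's request.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of the patch): checks a commit message against
# the allow-then-audit rules. Prints "ok" and returns 0 when the message
# passes; otherwise prints a short reason and returns 1.
check_audit_message() {
  msg=$1

  # Rule: non-admins must never use [ci skip] (it would bypass the audit feed).
  case "$msg" in
    *"[ci skip]"*) echo "contains [ci skip]"; return 1 ;;
  esac

  # The subject is the first line and must not be empty.
  subject=$(printf '%s\n' "$msg" | sed -n '1p')
  if [ -z "$subject" ]; then
    echo "missing subject"
    return 1
  fi

  # The body is everything after the first line; it needs at least one
  # non-blank line (the WHY). "|| true" keeps an empty grep result non-fatal.
  body=$(printf '%s\n' "$msg" | sed '1d' | grep -v '^[[:space:]]*$' || true)
  if [ -z "$body" ]; then
    echo "missing body"
    return 1
  fi

  echo "ok"
}
```

For instance, `check_audit_message "$(git log -1 --format=%B)"` could run just before `git push forgejo master`; whether a failure should block the push or only warn is left open here.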
diff --git a/docs/architecture/multi-tenancy.md b/docs/architecture/multi-tenancy.md
index 387611b7..067cdcaf 100644
--- a/docs/architecture/multi-tenancy.md
+++ b/docs/architecture/multi-tenancy.md
@@ -543,14 +543,15 @@ Separate from the in-cluster namespace-owner model above, the **devvm** (`10.0.1
 
 **Config inheritance (live):** wizard authors the base (his chezmoi-versioned `~/.claude`). Two native layers carry it to every user - the enforced org `claudeMd` in `/etc/claude-code/managed-settings.json` (top precedence, all sessions) and per-user `~/.claude/{skills,rules,…}` **symlinks** to the base (seeded via `/etc/skel`; edits propagate live). Secrets stay per-user at mode 600, never symlinked.
 
-**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` - code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. The provisioner clones anonymously from the public GitHub mirror; **contribute access is wired per-user on top** (see below). The apply boundary still holds (`scripts/tg apply` needs an admin Vault token + cluster RBAC), but **pushing `master` is NOT inert** - the Forgejo→Woodpecker webhook fires `.woodpecker/default.yml` (`event: push, branch: master`, `require_approval: forks` only), which terragrunt-applies changed stacks. `master` is therefore **branch-protected on Forgejo** (push + merge whitelists = `viktor`, deploy keys allowed): non-admins contribute via `<user>/<topic>` branches + PRs, and only an admin merge lands (and thus applies) their change. **Clones stay fresh automatically** (2026-06-10): the hourly `t3-provision-users` reconcile runs `refresh_locked_clone` (fetch all remotes + fast-forward `master`, ONLY when on master with a clean tree and an upstream - dirty trees and local commits are left alone with a WARN), and `start-claude.sh` does the same freshen at session launch (15s-capped so an offline remote never stalls the session).
+**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` - code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. The provisioner clones anonymously from the public GitHub mirror; **contribute access is wired per-user on top** (see below). The apply boundary still holds (`scripts/tg apply` needs an admin Vault token + cluster RBAC), but **pushing `master` is NOT inert** - the Forgejo→Woodpecker webhook fires `.woodpecker/default.yml` (`event: push, branch: master`, `require_approval: forks` only), which terragrunt-applies changed stacks. `master` is **branch-protected on Forgejo** (force-push disabled for everyone - history is append-only; push + merge whitelists = `viktor` + explicitly granted users, deploy keys allowed). **Allow-then-audit (Viktor, 2026-06-10):** `ebarzin` (emo) is on the whitelist and pushes straight to `master` - no PR gate. The tracking burden moves to: (a) **commit messages that record what + why** (the agent instructions in AGENTS.md and the managed claudeMd require the body to paraphrase the user's request), (b) the **`notify-nonadmin-push` Slack audit step** in `.woodpecker/default.yml` - every master push by a non-admin author is posted to Slack (admin pushes are not), and (c) non-admins **never use `[ci skip]`** so every change fires the pipeline (and thus the audit feed). Users NOT on the whitelist fall back to `<user>/<topic>` branches + PRs. **Clones stay fresh automatically** (2026-06-10): the hourly `t3-provision-users` reconcile runs `refresh_locked_clone` (fetch all remotes + fast-forward `master`, ONLY when on master with a clean tree and an upstream - dirty trees and local commits are left alone with a WARN), and `start-claude.sh` does the same freshen at session launch (15s-capped so an offline remote never stalls the session).
 
 **Contribute access (per non-admin, manual - the anca/tripit PAT precedent):**
 1. Add their Forgejo user as a **write** collaborator on `viktor/infra` (`PUT /api/v1/repos/viktor/infra/collaborators/<login>`).
 2. Mint a PAT - the admin REST endpoint 404s here, use the in-pod CLI: `kubectl -n forgejo exec deploy/forgejo -- su -s /bin/sh git -c "forgejo admin user generate-access-token --username <login> --token-name devvm-infra-git --scopes 'write:repository'"`.
 3. Install it in their `~/.git-credentials` (`https://<login>:<token>@forgejo.viktorbarzin.me`, mode 600) + `git config --global credential.helper store`, set `user.name`/`user.email`.
 4. In their clone: `git remote add forgejo https://forgejo.viktorbarzin.me/viktor/infra.git` and `git branch --set-upstream-to=forgejo/master master` (origin stays the anonymous GitHub mirror).
-5. Verify: branch push succeeds; a push to `master` is rejected with `Not allowed to push to protected branch`.
+5. (Optional - Viktor's call per user) Grant direct master push: add their login to the `master` branch-protection push + merge whitelists (`PATCH /api/v1/repos/viktor/infra/branch_protections/master`). Done for `ebarzin` 2026-06-10.
+6. Verify: branch push succeeds; a `master` push succeeds for whitelisted users and is rejected with `Not allowed to push to protected branch` otherwise.
 
 **Status (2026-06-10):** built + verified on the live host - capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone, per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry (applied + impersonation-verified), the Authentik `T3 Users` edge gate, **the emo Phase-5 cutover (own clone + launcher repoint + `code-shared` removal, completed 2026-06-10) and emo's contribute access (`ebarzin` write collaborator + PAT + protected `master`)**. Per the live `/etc/skel` design, non-admin `~/.claude/{rules,skills}` symlinks into the admin base are **kept** (they ARE the shared-base delivery mechanism - the plan's step to remove them is obsolete).
 **Remaining (held / future):** the offboarding apply-side (Phase 7), per-user MCP/auth injection, and roster-reconciled `T3 Users` membership. See `../runbooks/offboard-user.md` for deprovisioning.
diff --git a/docs/plans/2026-06-07-multi-user-workstation-design.md b/docs/plans/2026-06-07-multi-user-workstation-design.md
index 7267cc8c..8e54fa95 100644
--- a/docs/plans/2026-06-07-multi-user-workstation-design.md
+++ b/docs/plans/2026-06-07-multi-user-workstation-design.md
@@ -167,6 +167,7 @@ Design principle: **every bit of devvm setup is an idempotent git script** - n
 - **ADR-0003 - Config inheritance via native machine-wide layers + per-user override.** Rejected: periodic sync, OverlayFS (no live lowerdir edits), Nix (rebuild not live).
 - **ADR-0004 - Infra access via per-user writable git-crypt-locked clones (changes ungated).** Each non-admin gets their own writable, keyless (locked) clone - read + edit + push freely, no PR gate. Safe because infra apply is manual + admin-only (push ≠ apply, id=4355) and the clone can't decrypt secrets. Rejected: the shared read-only mirror (gated changes) and the shared unlocked tree (secret leak + commit entanglement). Trade: repo-local CLAUDE.md updates via pull, not live (global config inheritance stays live via §4).
   - **AMENDED 2026-06-10 - the "push ≠ apply" premise was WRONG.** The Forgejo→Woodpecker webhook on `viktor/infra` fires `.woodpecker/default.yml` on `push` to `master` (`require_approval: forks` only), which terragrunt-applies changed stacks - so an ungated master push IS a deploy. Enforcement added instead of dropping the ADR: Forgejo **branch protection on `master`** (push + merge whitelists = `viktor`, deploy keys allowed). Non-admins keep free branch pushes + PRs; only admin merges land on master. "No PR gate" is thereby reversed for non-admins; the rest of the ADR (per-user locked clones) stands. As-built: `../architecture/multi-tenancy.md` → "Contribute access".
+  - **AMENDED AGAIN 2026-06-10 (later) - allow-then-audit.** Viktor granted emo (`ebarzin`) direct master push ("he's allowed to make any change; what matters is tracking what changed and why"). The PR gate is dropped FOR WHITELISTED USERS; tracking is enforced instead: agent-written commit messages must carry the user's plain-language intent (the WHY), a `notify-nonadmin-push` Slack step in `.woodpecker/default.yml` surfaces every non-admin master push, `[ci skip]` is forbidden for non-admins, and force-push stays disabled (append-only history). Accepted consequence: emo's pushes auto-apply changed stacks via CI. Branch protection + the PR fallback remain for non-whitelisted users.
 - **ADR-0005 - Power-user = cluster-wide read-only (no Secrets), via a NEW dedicated ClusterRole.** Re-widens cross-tenant READ for the trusted power-user tier only - but via a NEW `oidc-power-user-readonly` ClusterRole (get/list/watch, NO `secrets`), NOT the existing `oidc-power-user` (which grants read+write+Secrets and is unbound). Bound to the user's OIDC identity (kubelogin) - the apiserver accepts Authentik OIDC for the `kubernetes` audience; the dashboard's SA-token pattern is for the dashboard UI only.
 - **ADR-0006 - The roster is the single source of truth for the FULL lifecycle.** `roster.yaml` drives onboard *and* offboard; `/etc/ttyd-user-map`, `dispatch.json`, and Authentik `T3 Users` membership are *derived* from it, and tier is *validated* against `k8s_users` (fail-loud on mismatch). Rejected: hand-maintaining the four membership lists in parallel (guaranteed drift). Offboarding is first-class + staged (reversible cut → cluster revoke → gated `userdel`), not an afterthought.
 - **ADR-0007 - Add swap + a capacity budget to the devvm before onboarding active users.** A shared 24 GB / **0-swap** host OOM-kills live sessions under multi-user load (wizard alone runs ~20). Swap + a max-concurrent ceiling are prerequisites, not follow-ups.
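
Reviewer note: the audit path named in the amendment above depends on the `notify-nonadmin-push` step telling admin authors from everyone else and then posting a JSON body to Slack. The sketch below pulls that logic into two functions so it can be exercised outside CI. The function names are hypothetical and not part of the patch. Stripping quotes and backslashes from the author as well as the subject is an extra precaution that the patch itself applies only to the subject. The sketch does not handle control characters such as tabs, which would still produce invalid JSON.

```shell
#!/bin/sh
# Hypothetical helpers (NOT part of the patch) mirroring the decision and the
# payload construction of the notify-nonadmin-push step.

# Returns 0 for the admin authors listed in the patch, 1 for anyone else.
is_admin_author() {
  case "$1" in
    viktor|ViktorBarzin|wizard) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints a Slack JSON payload for: author, full commit message, commit URL.
# Only the first line of the message (the subject) is used. Double quotes and
# backslashes are removed so the values cannot terminate the JSON string.
slack_payload() {
  author=$(printf '%s' "$1" | tr -d '"\\')
  subject=$(printf '%s\n' "$2" | head -n 1 | tr -d '"\\')
  url=$3
  # "\\n" in the format emits a literal backslash-n, the JSON newline escape.
  printf '{"text":"infra master push by *%s*: %s\\n%s"}\n' \
    "$author" "$subject" "$url"
}
```

A caller would skip the post when `is_admin_author "$CI_COMMIT_AUTHOR"` succeeds and otherwise pass the output of `slack_payload` to `curl --data`, as the step in the patch does inline.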
diff --git a/docs/runbooks/offboard-user.md b/docs/runbooks/offboard-user.md
index 269e2f1a..05c0c5a8 100644
--- a/docs/runbooks/offboard-user.md
+++ b/docs/runbooks/offboard-user.md
@@ -36,6 +36,11 @@ gated `userdel_archive`, which is **never** auto-applied).
    # drop write access to the infra repo
    curl -X DELETE -H "Authorization: token <admin_pat>" \
      https://forgejo.viktorbarzin.me/api/v1/repos/viktor/infra/collaborators/<forgejo_login>
+   # if they were whitelisted for direct master push, remove them from the
+   # branch-protection whitelists (PATCH with the remaining usernames)
+   curl -X PATCH -H "Authorization: token <admin_pat>" -H 'Content-Type: application/json' \
+     https://forgejo.viktorbarzin.me/api/v1/repos/viktor/infra/branch_protections/master \
+     -d '{"push_whitelist_usernames":["viktor"],"merge_whitelist_usernames":["viktor"]}'
    # revoke their devvm git PAT (token name: devvm-infra-git; admin PAT may
    # manage other users' tokens - verified 2026-06-10; the CLI has no delete)
    curl -X DELETE -H "Authorization: token <admin_pat>" \
diff --git a/scripts/workstation/managed-settings.json b/scripts/workstation/managed-settings.json
index 77bfdbab..d9a7ccaf 100644
--- a/scripts/workstation/managed-settings.json
+++ b/scripts/workstation/managed-settings.json
@@ -1,4 +1,4 @@
 {
-  "claudeMd": "# Viktor Barzin homelab - shared multi-user Claude Code Workstation (devvm)\n\nYou are running as a specific OS user on a SHARED devvm Workstation, not as the admin. These org-wide rules apply to EVERY user and sit at the top of settings precedence (they cannot be overridden by a user's own config):\n\n- Respect your permission tier. Your kubectl, Vault, and infra access are scoped to your RBAC tier (admin / power-user / namespace-owner). Do not attempt to escalate privileges or reach another user's resources.\n- Secrets are per-user. Never read another user's home directory, credentials, tokens, or ~/.claude secrets. Your own secrets live in your home at mode 600.\n- Infrastructure changes go through Terraform/Terragrunt (scripts/tg apply) - never direct kubectl apply/edit/patch. Applies are manual and admin-gated: a non-admin's edits cannot take effect without an admin merge + apply.\n- Non-admins land changes via branch + PR, and the AGENT does ALL git/PR mechanics silently - the user may not know git, so never ask them to commit, push, pull, or open anything. When you finish a change in ~/code: commit it, push to a branch named <os-user>/<short-topic> on the forgejo remote, open a PR to master via the Forgejo API (token = the password field in ~/.git-credentials), then check out clean master again so background auto-refresh keeps working. Tell the user in plain words that the change is submitted for Viktor's review. Direct pushes to master are rejected (branch protection). Full recipe: AGENTS.md → 'Non-admin workstation users' in ~/code.\n- Follow the engineering rules in ~/.claude/rules/ (execution, planning, quality) and every CLAUDE.md in the repo tree.\n- The monorepo is at ~/code. Non-admins get a git-crypt-LOCKED clone: secret files read as ciphertext - that is expected, not an error.",
+  "claudeMd": "# Viktor Barzin homelab - shared multi-user Claude Code Workstation (devvm)\n\nYou are running as a specific OS user on a SHARED devvm Workstation, not as the admin. These org-wide rules apply to EVERY user and sit at the top of settings precedence (they cannot be overridden by a user's own config):\n\n- Respect your permission tier. Your kubectl, Vault, and infra access are scoped to your RBAC tier (admin / power-user / namespace-owner). Do not attempt to escalate privileges or reach another user's resources.\n- Secrets are per-user. Never read another user's home directory, credentials, tokens, or ~/.claude secrets. Your own secrets live in your home at mode 600.\n- Infrastructure changes go through Terraform/Terragrunt - never direct kubectl apply/edit/patch. Committed stack changes are auto-applied by CI on push to master; you can verify the live result with your read-only kubectl.\n- The AGENT does ALL git mechanics silently - the user may not know git, so never ask them to commit, push, pull, or open anything, and never surface git jargon. When you finish a change in ~/code: commit it ON master and push to the forgejo remote. THE COMMIT MESSAGE IS THE AUDIT TRAIL - subject says WHAT changed; body says WHY in plain words (paraphrase the user's actual request) - this matters more than the change itself. Never use [ci skip] as a non-admin (it would hide the change from the audit feed; harmless no-op applies are fine). If the push is rejected non-fast-forward, git pull --rebase forgejo master and push again. If it is rejected by branch protection (user not whitelisted), fall back to a <os-user>/<topic> branch + PR via the Forgejo API (token = password field in ~/.git-credentials). Keep ~/code on a clean master when done so background auto-refresh keeps working. Tell the user in plain words what happened ('done - your change is live/recorded'). Full recipe: AGENTS.md → 'Non-admin workstation users' in ~/code.\n- Follow the engineering rules in ~/.claude/rules/ (execution, planning, quality) and every CLAUDE.md in the repo tree.\n- The monorepo is at ~/code. Non-admins get a git-crypt-LOCKED clone: secret files read as ciphertext - that is expected, not an error.",
   "model": "claude-fable-5"
 }
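
Reviewer note: the offboard runbook hunk hard-codes `["viktor"]` in the PATCH body while its comment says to send the remaining usernames. Once more than one non-admin is whitelisted, the body has to be derived from the current list. A minimal sketch follows, assuming the current whitelist is already available as a space-separated string and that logins contain no spaces or quotes. Both function names are hypothetical and not part of the patch.

```shell
#!/bin/sh
# Hypothetical helpers (NOT part of the patch) for the whitelist-removal step.

# Prints a JSON array of the logins left after removing one login from a
# space-separated whitelist. Assumes logins contain no spaces or quotes.
remaining_whitelist_json() {
  current=$1
  remove=$2
  out=""
  for login in $current; do
    # Skip the login being offboarded.
    [ "$login" = "$remove" ] && continue
    if [ -z "$out" ]; then
      out="\"$login\""
    else
      out="$out,\"$login\""
    fi
  done
  printf '[%s]\n' "$out"
}

# Prints the PATCH body with the same remaining list for both whitelists,
# using the two field names that appear in the runbook's curl call.
whitelist_patch_body() {
  users=$(remaining_whitelist_json "$1" "$2")
  printf '{"push_whitelist_usernames":%s,"merge_whitelist_usernames":%s}\n' \
    "$users" "$users"
}
```

With a whitelist of `viktor ebarzin`, `whitelist_patch_body "viktor ebarzin" ebarzin` yields the same body the runbook sends; fetching the current whitelist from Forgejo first is left out of this sketch.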