Merge forgejo/master: reconcile diverged lineages [ci skip]
Local checkout carried the 2026-06-10 DNS/registry architecture series (pfSense forward-zone, CoreDNS viktorbarzin.me:53 carve-out, nodes stock) + vzdump/nfs-mirror/workstation-rebuild commits that never reached the canonical remote, while forgejo master received the emo-access series via isolated worktrees. Viktor asked to merge. Conflict resolutions (newest iteration wins in each file): - stacks/forgejo/cleanup.tf: LOCAL — dry_run=true (2026-06-10 revert after live retention orphaned OCI indexes; remote had 06-09 enable) - .claude/CLAUDE.md, docs/architecture/backup-dr.md: LOCAL — final registry/DNS architecture + implemented vzdump alerts - scripts/workstation/setup-devvm.sh: LOCAL — pinned-version, reproducible-rebuild refactor (kubelogin pin, restructured staging) - scripts/workstation/managed-settings.json: FORGEJO — the allow-then-audit claudeMd (matches /etc deployment byte-for-byte) - scripts/t3-provision-users.sh: FORGEJO comment; refresh_locked_clone intact [ci skip]: all stack changes in the local lineage were applied live this morning — CI would re-walk 100+ stacks via the modules/ fallback for zero state change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
commit
8cfd0e5e5c
10 changed files with 160 additions and 5 deletions
|
|
@ -197,6 +197,34 @@ steps:
|
|||
- Keeps Woodpecker global secrets in sync with Vault
|
||||
- Runs in `woodpecker` namespace
|
||||
|
||||
## Infra repo CI (Woodpecker repo 82 — Forgejo forge)
|
||||
|
||||
The infra repo itself runs on Woodpecker via the **Forgejo** forge (repo id 82,
|
||||
registered 2026-06-08; the GitHub-side repo id 1 also remains registered).
|
||||
Pushes to `master` fire `.woodpecker/default.yml` (changed-stacks terragrunt
|
||||
apply) plus the `notify-nonadmin-push` Slack audit step (allow-then-audit
|
||||
contribution model — see `multi-tenancy.md`). Operational facts (2026-06-10):
|
||||
|
||||
- **Webhook URL is the IN-CLUSTER service**: `http://woodpecker-server.woodpecker.svc.cluster.local/api/hook?...`
|
||||
(PATCHed via the Forgejo API). The Woodpecker-generated default
|
||||
(`https://ci.viktorbarzin.me/...`) resolves to the non-proxied public A
|
||||
record from pods → NAT hairpin → intermittent `context deadline exceeded`,
|
||||
silently dropping push events (found when a push produced no pipeline).
|
||||
If Woodpecker ever "repairs" the repo it will rewrite the hook back to
|
||||
`ci.viktorbarzin.me` — re-apply the in-cluster URL (or pin `ci.viktorbarzin.me`
|
||||
in the CoreDNS pod carve-out alongside forgejo).
|
||||
- **Repo-scoped secrets must exist on BOTH repos**: pipelines reference
|
||||
repo-level secrets (`registry_ssh_key`, `pve_ssh_key`, `CLOUDFLARE_TOKEN`,
|
||||
…). Repo 82 was registered without them and every all-workflow compile
|
||||
errored with `secret "registry_ssh_key" not found`. Fixed by cloning repo-1
|
||||
rows to repo 82 in the Woodpecker DB (`insert into secrets … select … where
|
||||
repo_id=1`). When registering a new forge repo for infra, clone the secret
|
||||
set too.
|
||||
- **Empty commits defeat path filters**: a commit with no changed files makes
|
||||
Woodpecker include ALL workflow files (path conditions can't exclude), so
|
||||
every repo secret must resolve. Normal commits with real files only compile
|
||||
the matching workflows.
|
||||
|
||||
## Decisions & Rationale
|
||||
|
||||
### Why GitHub Actions + Woodpecker?
|
||||
|
|
|
|||
|
|
@ -543,9 +543,17 @@ Separate from the in-cluster namespace-owner model above, the **devvm** (`10.0.1
|
|||
|
||||
**Config inheritance (live):** wizard authors the base (his chezmoi-versioned `~/.claude`). Two native layers carry it to every user — the enforced org `claudeMd` in `/etc/claude-code/managed-settings.json` (top precedence, all sessions) and per-user `~/.claude/{skills,rules,…}` **symlinks** to the base (seeded via `/etc/skel`; edits propagate live). Secrets stay per-user at mode 600, never symlinked.
|
||||
|
||||
**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` — code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. Changes are ungated (push ≠ apply); the real boundary is apply-time (`scripts/tg apply` needs an admin Vault token + cluster RBAC).
|
||||
**Infra access:** non-admins get their own **writable, git-crypt-LOCKED** clone of the (public) infra repo at `~/code` — code/docs plaintext, secret files (`*.tfvars`, `secrets/**`) stay ciphertext. The provisioner clones anonymously from the public GitHub mirror; **contribute access is wired per-user on top** (see below). The apply boundary still holds (`scripts/tg apply` needs an admin Vault token + cluster RBAC), but **pushing `master` is NOT inert** — the Forgejo→Woodpecker webhook fires `.woodpecker/default.yml` (`event: push, branch: master`, `require_approval: forks` only), which terragrunt-applies changed stacks. `master` is **branch-protected on Forgejo** (force-push disabled for everyone — history is append-only; push + merge whitelists = `viktor` + explicitly granted users, deploy keys allowed). **Allow-then-audit (Viktor, 2026-06-10):** `ebarzin` (emo) is on the whitelist and pushes straight to `master` — no PR gate. The tracking burden moves to: (a) **commit messages that record what + why** (the agent instructions in AGENTS.md and the managed claudeMd require the body to paraphrase the user's request), (b) the **`notify-nonadmin-push` Slack audit step** in `.woodpecker/default.yml` — every master push by a non-admin author is posted to Slack (admin pushes are not), and (c) non-admins **never use `[ci skip]`** so every change fires the pipeline (and thus the audit feed). Users NOT on the whitelist fall back to `<user>/<topic>` branches + PRs. **Clones stay fresh automatically** (2026-06-10): the hourly `t3-provision-users` reconcile runs `refresh_locked_clone` (fetch all remotes + fast-forward `master`, ONLY when on master with a clean tree and an upstream — dirty trees and local commits are left alone with a WARN), and `start-claude.sh` does the same freshen at session launch (15s-capped so an offline remote never stalls the session).
|
||||
|
||||
**Status (2026-06-08):** built + verified on the live host — capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone, **per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry (applied + impersonation-verified), and the Authentik `T3 Users` edge gate (applied + verified)**. **Remaining (held / future):** the emo cutover to his own locked clone (Phase 5), the offboarding apply-side (Phase 7), per-user MCP/auth injection, and roster-reconciled `T3 Users` membership. See `../runbooks/offboard-user.md` for deprovisioning.
|
||||
**Contribute access (per non-admin, manual — the anca/tripit PAT precedent):**
|
||||
1. Add their Forgejo user as a **write** collaborator on `viktor/infra` (`PUT /api/v1/repos/viktor/infra/collaborators/<login>`).
|
||||
2. Mint a PAT — the admin REST endpoint 404s here, use the in-pod CLI: `kubectl -n forgejo exec deploy/forgejo -- su -s /bin/sh git -c "forgejo admin user generate-access-token --username <login> --token-name devvm-infra-git --scopes 'write:repository'"`.
|
||||
3. Install it in their `~/.git-credentials` (`https://<login>:<token>@forgejo.viktorbarzin.me`, mode 600) + `git config --global credential.helper store`, set `user.name`/`user.email`.
|
||||
4. In their clone: `git remote add forgejo https://forgejo.viktorbarzin.me/viktor/infra.git` and `git branch --set-upstream-to=forgejo/master master` (origin stays the anonymous GitHub mirror).
|
||||
5. (Optional — Viktor's call per user) Grant direct master push: add their login to the `master` branch-protection push + merge whitelists (`PATCH /api/v1/repos/viktor/infra/branch_protections/master`). Done for `ebarzin` 2026-06-10.
|
||||
6. Verify: branch push succeeds; a `master` push succeeds for whitelisted users and is rejected with `Not allowed to push to protected branch` otherwise.
|
||||
|
||||
**Status (2026-06-10):** built + verified on the live host — capacity (8 GiB swap), config inheritance, roster-driven provisioner, per-user locked clone, per-user OIDC kubeconfig + the `oidc-power-user-readonly` ClusterRole + emo's `k8s_users` entry (applied + impersonation-verified), the Authentik `T3 Users` edge gate, **the emo Phase-5 cutover (own clone + launcher repoint + `code-shared` removal, completed 2026-06-10) and emo's contribute access (`ebarzin` write collaborator + PAT + protected `master`)**. Per the live `/etc/skel` design, non-admin `~/.claude/{rules,skills}` symlinks into the admin base are **kept** (they ARE the shared-base delivery mechanism — the plan's step to remove them is obsolete). **Remaining (held / future):** the offboarding apply-side (Phase 7), per-user MCP/auth injection, and roster-reconciled `T3 Users` membership. See `../runbooks/offboard-user.md` for deprovisioning.
|
||||
|
||||
## Related
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue