devvm multi-user Claude Code workstation: role-driven profiles (admin/power-user/namespace-owner) off one git roster (single source of truth, full onboard->offboard lifecycle); config inheritance via Claude's native machine-wide managed layer; per-user writable git-crypt-locked infra clone (ungated, apply-time is the boundary); per-tier OIDC kubectl; per-user secrets/auth (memory isolation deferred); incremental, emo-safe migration; capacity prereqs. Folds in gap-analysis findings verified live 2026-06-08. Designed, not yet implemented. [ci skip] Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
213 lines
19 KiB
Markdown
213 lines
19 KiB
Markdown
# Infra
|
||
|
||
Terragrunt-managed homelab declaring a 7-node Kubernetes cluster (1 control plane + 6 workers) on a single Proxmox host. Vault is the secrets source of truth; everything else flows from this repo via `scripts/tg apply`.
|
||
|
||
## Language
|
||
|
||
### Code organization
|
||
|
||
**Service**:
|
||
The deployed app as a domain concept — one logical thing that runs in the cluster (e.g. immich, technitium, freshrss). Defined by exactly one **Stack**.
|
||
_Avoid_: bare "app" without the Service definition; "deployment" (collides with K8s `Deployment`).
|
||
|
||
**Stack**:
|
||
The HCL directory under `stacks/<name>/` that defines a Service, applied independently with `scripts/tg apply`. A Stack is the unit of Terraform organisation; a Service is the running thing. They are 1:1 but not synonyms. A Stack is either **flat** (resources declared directly in its own `.tf` files — the majority, ~94, e.g. immich) or wraps a **Stack-local module** (~31, the larger/older ones).
|
||
_Avoid_: using "Stack" when you mean the running Service.
|
||
|
||
**Module**:
|
||
A unit of HCL consumed via `source =`. Two homes, two purposes: **shared** modules under the top-level `modules/` tree (reused across many Stacks) and **Stack-local** modules nested under `stacks/<name>/modules/` (one Stack only). Bare "Module" means the shared kind.
|
||
_Avoid_: "library", "package".
|
||
|
||
**Factory module**:
|
||
A shared **Module** that hides convention (defaults, drift handling, secret wiring) behind a small input surface. `modules/kubernetes/` holds exactly four, all factories: `ingress_factory` (103 Stacks), `setup_tls_secret` (93), `nfs_volume` (41), `anubis_instance` (8).
|
||
_Avoid_: "wrapper"; citing `k8s_app` / `helm_app` / `postgres_app` (these never existed in the repo).
|
||
|
||
**Stack-local module**:
|
||
A single Stack's implementation factored into a nested `stacks/<name>/modules/<name>/`, sourced by that one Stack only — organisation, not reuse. ~31 Stacks (authentik, kyverno, dbaas, mailserver, metallb, cloudflared, technitium, …). The alternative to a **flat** Stack.
|
||
_Avoid_: calling it a "Module" unqualified (it isn't reusable); "submodule".
|
||
|
||
**State tier**:
|
||
Terraform state-backend partition. **Tier 0** = bootstrap Stacks (`infra`, `platform`, `cnpg`, `vault`, `dbaas`, `external-secrets`) on local SOPS-encrypted state. **Tier 1** = every other Stack, on PG-backed state.
|
||
_Avoid_: "phase", "bootstrap stack" — say Tier 0 explicitly.
|
||
|
||
### Cluster
|
||
|
||
**Node**:
|
||
A K8s cluster VM — `k8s-master` (control plane) plus `k8s-node1..6` (workers). Default reading of the bare word "node" in this repo.
|
||
_Avoid_: "k8s node" (redundant), "host" (ambiguous).
|
||
|
||
**PVE node** / **PVE host**:
|
||
The single physical Dell R730 running Proxmox; sole hypervisor and sole NFS server. There is exactly one.
|
||
_Avoid_: "server", "hypervisor", "Proxmox" alone when you mean the host.
|
||
|
||
**Namespace tier**:
|
||
A namespace-prefix partition (`0-core-*`, `1-cluster-*`, `2-gpu-*`, `3-edge-*`, `4-aux-*`) driving PriorityClass, default resources, and ResourceQuota — generated by **Kyverno policy** from the namespace name. Orthogonal to **State tier**.
|
||
_Avoid_: "Service tier" (the partition is on the namespace, not the Service); collapsing Namespace tier with State tier — they are different axes.
|
||
|
||
**Kyverno policy**:
|
||
The convention engine of the cluster — a ClusterPolicy or Policy resource that mutates/generates/validates on admission. Owns Namespace tier limits/quotas, `dns_config` injection on every pod-owning workload, Forgejo pull-credential sync across namespaces, TLS-secret replication. When the repo says "this happens automatically", a Kyverno policy is usually the actor.
|
||
_Avoid_: bare "policy" (overloaded with Vault, RBAC, NetworkPolicy).
|
||
|
||
**Critical-path Service**:
|
||
One of {Traefik, Authentik, CrowdSec LAPI, PgBouncer, Cloudflared} — replicas ≥3, PDB enforced, monitored independently.
|
||
_Avoid_: "core service" (collides with the `0-core-*` Namespace tier name).
|
||
|
||
**Namespace-owner**:
|
||
A non-admin identity declared in `secret/platform → k8s_users` (JSON map). Owns one or more namespaces and one or more public subdomains. Also drives a **Workstation profile** (an identity has both a cluster facet and a workstation facet).
|
||
_Avoid_: bare "user", "tenant".
|
||
|
||
### Workstation (multi-user devvm)
|
||
|
||
**devvm**:
|
||
The dev VM (`10.0.10.10`), a non-cluster VM on the **PVE host** that hosts each person's Claude Code coding environment (the `t3-serve@<user>` and terminal-lobby sessions). Not a **Node** (it isn't in the cluster).
|
||
_Avoid_: calling it a "Node"; "host" (reserved for the PVE host).
|
||
|
||
**Workstation**:
|
||
A person's identity-scoped Claude Code environment on the **devvm** — one OS account, their session runs as that uid. The same human may also be a **Namespace-owner**; the cluster identity and the Workstation are two facets of one person.
|
||
_Avoid_: "t3 instance" (only one surface of a Workstation); bare "user".
|
||
|
||
**RBAC tier**:
|
||
The role band that governs a person everywhere — `kubernetes-admins` (Viktor; cluster-admin, secrets, apply), `kubernetes-power-users` (infra-aware, broad read, no destructive change), `kubernetes-namespace-owners` (own-namespace app dev). The single axis that keys both cluster RBAC **and** the **Workstation profile**.
|
||
_Avoid_: inventing per-service roles; conflating with **Namespace tier** / **State tier** (those are not identity).
|
||
|
||
**Workstation profile**:
|
||
The **RBAC tier**-keyed bundle a **Workstation** receives: **Config inheritance** (identical for everyone) plus the person's **Infra visibility** and cluster scope (varies by tier). Never hand-tuned per person — one identity decision (Authentik group + `k8s_users`) provisions the cluster facet and the Workstation together.
|
||
_Avoid_: per-person bespoke setup (the rejected "stitched-together" status quo).
|
||
|
||
**Config inheritance**:
|
||
The universal half of every **Workstation profile** — Viktor's *static* Claude config (skills, rules, agents, commands, `CLAUDE.md`, hooks) **live-extends** from a **Config base**, it is NOT copied: each person's `~/.claude` draws these from the shared base, so an edit Viktor makes appears in every Workstation immediately, with no seed/copy/sync step. Users may layer their own items on top (rarely do). **RBAC tier**-independent. Per-user *mutable* state (`~/.claude.json`, `.credentials.json`, `projects/`, sessions) is never shared — local only.
|
||
_Avoid_: a periodic copy/seed/sync of `~/.claude` (rejected — inheritance must be live); sharing `~/.claude.json` / `.credentials.json` (per-user, secret-bearing, corrupts under concurrent writes — see emo's multi-session profile).
|
||
|
||
**Config base**:
|
||
The shared, secret-free, version-controlled source of truth for the *static* Claude config that every **Workstation** live-extends (see **Config inheritance**). Viktor's authoring surface — when he edits a skill/rule, he edits the base; the chezmoi dotfiles repo is its versioned form (commit = audit/rollback, NOT a push to users). Holds only skills/rules/agents/commands/`CLAUDE.md`/hooks — never secrets or per-user mutable state.
|
||
_Avoid_: treating it as a per-user seed target (it is a live shared source, not a copy); putting secrets in it.
|
||
|
||
**Infra visibility**:
|
||
What a non-admin **Workstation** may SEE of the infra: the public repo **code** and the person's own **RBAC**-scoped view of the live cluster (kubectl / dashboard within their namespaces). Explicitly excludes the **git-crypt** secrets (`terraform.tfvars`, `secrets/`) and any out-of-scope mutation. The boundary that "respect their permissions" enforces — violated today because `~/code` is one git-crypt-*unlocked* tree shared via the `code-shared` group.
|
||
_Avoid_: reading "see the infra" as access to secrets or apply rights.
|
||
|
||
### Networking
|
||
|
||
**Public domain**:
|
||
`viktorbarzin.me`, served through Cloudflare. DNS records are either **proxied** (Cloudflare CDN/WAF in front) or **non-proxied** (direct A/AAAA reachable via Cloudflared Tunnel).
|
||
_Avoid_: "external", "outside".
|
||
|
||
**Internal domain**:
|
||
`viktorbarzin.lan`, served by Technitium DNS. Resolves only inside the homelab network.
|
||
_Avoid_: bare "lan", "private", "intranet".
|
||
|
||
**Ingress auth**:
|
||
The `auth = "..."` parameter on `ingress_factory` — a discrete *mode*, not a ranked tier — one of `required` (Authentik forward-auth gates every request), `app` (the backend owns its login), `public` (anonymous Authentik binding for audit only), or `none` (Anubis-fronted content, or native-client API). Default `required` (fail-closed).
|
||
_Avoid_: "auth tier" / "auth mode" — refer to it by the canonical key, `auth` (e.g. `auth = "required"`). "tier" is reserved for State tier and Namespace tier.
|
||
|
||
**Authentik outpost**:
|
||
A standalone Authentik deployment that terminates the proxy/auth flow for a specific binding model. The repo runs two distinct ones: the default outpost (used by `auth = "required"`) and the `public` outpost (anonymous binding, used by `auth = "public"`).
|
||
_Avoid_: conflating outpost with Authentik core; "Authentik instance".
|
||
|
||
**Cloudflared Tunnel**:
|
||
The channel by which non-proxied **public domain** traffic reaches the cluster, terminating at Traefik. Backs every `dns_type = "non-proxied"` record and is the fallback path for the wildcard `*.viktorbarzin.me`.
|
||
_Avoid_: "the tunnel" without "Cloudflared" (could mean Headscale).
|
||
|
||
**Ingress chain**:
|
||
The opinionated stack of Traefik middlewares that `ingress_factory` layers onto every Ingress. Slots, in order: forward-auth (per **Ingress auth**) → anti-AI scraping (default-on when no Authentik is in the path) → CrowdSec bouncer (fail-open) → retry (2× / 100ms) → rate-limit (429, not 503). Adding or removing a middleware is a Stack-level choice, but the chain order is convention.
|
||
_Avoid_: "middleware list", "Traefik chain". The Anubis PoW gate is upstream of this chain, not inside it.
|
||
|
||
**MetalLB / LB IP**:
|
||
The bare-metal load-balancer that assigns external IPs to `type=LoadBalancer` Services. Two IPs matter: the **shared LB IP** `10.0.20.200` (~10 services — PG state-backend, headscale, wireguard, coturn, xray… — all `externalTrafficPolicy: Cluster`) and **Traefik's dedicated LB IP** `10.0.20.203` (`externalTrafficPolicy: Local`). Traefik runs on its own IP because ETP:Local preserves the **real client IP** (for CrowdSec) and enables QUIC, and MetalLB forbids mixed ETP on one shared IP.
|
||
_Avoid_: calling `.200` "the cluster IP" or assuming all ingress shares one LB IP.
|
||
|
||
**Calico**:
|
||
The cluster CNI and **NetworkPolicy** engine (also GlobalNetworkPolicy + flow logs). Egress lockdown follows an **observe-then-enforce** rollout — flow logs build an empirical allowlist, then default-deny egress is enforced per-namespace, tier by tier (wave 1 began at `recruiter-responder`; Tier 0/1/2 deferred).
|
||
_Avoid_: "firewall" (it's pod-level policy, not a perimeter); conflating a Calico **NetworkPolicy** (enforced in the data path) with a **Kyverno policy** (enforced at admission) — different layers.
|
||
|
||
### Storage
|
||
|
||
**proxmox-lvm-encrypted**:
|
||
Default StorageClass for any workload holding sensitive data (databases, auth, password managers, email, financial data). LUKS2 over a Proxmox LVM-thin LV.
|
||
_Avoid_: bare "encrypted PVC" — name the StorageClass.
|
||
|
||
**proxmox-lvm**:
|
||
Block StorageClass for non-sensitive workloads (caches, monitoring data, indexes, app state without secrets).
|
||
|
||
**NFS volume**:
|
||
RWX file storage for shared media libraries, large datasets, or anything that needs to be inspected from outside K8s. Provisioned via the `nfs_volume` Module.
|
||
_Avoid_: "shared storage" (ambiguous).
|
||
|
||
**nfs-truenas StorageClass**:
|
||
A historical SC name retained only because StorageClass strings are immutable on bound PVs. The underlying server is the **PVE host**, not TrueNAS; TrueNAS is decommissioned.
|
||
_Avoid_: assuming this means TrueNAS.
|
||
|
||
**local-path**:
|
||
The cluster's Kubernetes default StorageClass (`rancher.io/local-path`) — node-local hostpath, **non-replicated**, no CSI snapshots, outside the backup pipeline. A PVC that omits `storageClassName` silently binds here, pinned to one Node's disk. Always set an explicit `storageClassName`; reach for local-path only for genuinely throwaway, node-pinned data.
|
||
_Avoid_: relying on the default. Note the two senses of "default": local-path is the *cluster default SC* (what an unspecified PVC gets); proxmox-lvm-encrypted is the *default choice* for sensitive data. Different things.
|
||
|
||
**3-2-1 backup**:
|
||
The named posture of where data lives: **Copy 1** = live on the PVE thin pool (sdc), **Copy 2** = sda backup disk (`/mnt/backup`), **Copy 3** = offsite Synology NAS. Per-PVC file-level rsync from LVM thin snapshots; databases additionally dump to NFS for per-DB restore.
|
||
_Avoid_: bare "backup" without saying which copy you mean (a service is "backed up" only once it's on Copy 2; Copy 3 is the disaster floor).
|
||
|
||
### Data
|
||
|
||
**CNPG** / **pg-cluster**:
|
||
**CNPG** is the CloudNativePG operator; **`pg-cluster`** is the Postgres cluster it manages — the shared Postgres substrate. Backs Tier-1 Terraform state (`pg-cluster-rw.dbaas.svc.cluster.local:5432/terraform_state`) and ~12 application databases, reached through **PgBouncer** (a **critical-path Service**) for connection pooling; app credentials rotate via the `vault-database` ClusterSecretStore.
|
||
_Avoid_: "the database" (many DBs share one cluster); the legacy `postgresql.dbaas` Service (no endpoints — dead); conflating the CNPG operator with the `pg-cluster` it manages.
|
||
|
||
### Secrets
|
||
|
||
**Vault path**:
|
||
Convention: `secret/<service>` for Service-owned secrets, `secret/viktor` for personal/global, `secret/platform` for cluster-wide maps (`k8s_users`, `homepage_credentials`).
|
||
_Avoid_: conflating Vault path (e.g. `secret/viktor`) with Vault field (e.g. `forgejo_pull_token`).
|
||
|
||
**ExternalSecret** / **ESO**:
|
||
A K8s manifest that materialises a Vault KV value as a K8s Secret. Two ClusterSecretStores: `vault-kv` (KV engine) and `vault-database` (rotating DB creds).
|
||
|
||
**Plan-time secret**:
|
||
A secret value read in Terraform via `data "kubernetes_secret"` (i.e. via the ESO-created K8s Secret) at plan time, with no Vault provider call. Distinct from a **vault data source** read (`data "vault_kv_secret_v2"`), which still goes through the Vault provider. A few Stacks remain hybrid (plan-time for env vars, vault data source for module inputs).
|
||
|
||
**Sealed Secret**:
|
||
A user-managed secret committed to a Stack directory as `sealed-*.yaml`. Distinct from ExternalSecret — Sealed Secrets carry their own bytes, ExternalSecrets reference Vault.
|
||
|
||
### CI/CD
|
||
|
||
**GHA build + Woodpecker deploy**:
|
||
The split where Docker images are built+pushed by GitHub Actions and Woodpecker only runs `kubectl set image` on a deploy-only pipeline. Repos that can't fit GHA limits stay on Woodpecker for build too.
|
||
_Avoid_: bare "Woodpecker pipeline" — say "build" or "deploy".
|
||
|
||
**Keel**:
|
||
The **poll-driven** rollout orchestrator — watches registries for new image tags and rolls the matching Deployments automatically. The actor behind "auto-upgrade" for upstream images, and a redundant net for owned apps (already rolled on push by **Woodpecker deploy**).
|
||
_Avoid_: conflating with **Woodpecker deploy** (push-driven, fires on commit) or **Diun** (watches but only notifies). Never point Keel / `set image` at operator-managed StatefulSets.
|
||
|
||
**Diun**:
|
||
**Notify-only** image-update monitoring — reports that a newer image exists, never rolls anything (contrast **Keel**, which acts). Disabled on pinned images (MySQL, PostgreSQL, Redis) so version pins aren't nagged.
|
||
_Avoid_: expecting Diun to deploy; conflating with **Keel**.
|
||
|
||
**Anubis**:
|
||
A PoW reverse-proxy issuing a 30-day JWT cookie, used in front of public content-bearing sites without app-level auth (blog, wiki, landing pages). Never in front of Git, WebDAV, CalDAV, or API endpoints (clients can't solve PoW).
|
||
|
||
## Relationships
|
||
|
||
- A **Service** is defined by exactly one **Stack** — **flat** or wrapping a **Stack-local module** — which sources zero or more shared **Factory modules** and resolves to one or more K8s workloads.
|
||
- A **Namespace-owner** owns one or more namespaces and one or more public subdomains.
|
||
- A **Service** owns its **Vault path** at `secret/<service>`, surfaces values through **ExternalSecrets**, and reads them at plan time via **plan-time secrets**.
|
||
- An **Ingress** picks exactly one **Ingress auth** mode; the choice defines how strangers reach the backend.
|
||
- A **proxmox-lvm-encrypted** PVC binds to one Node at a time (RWO) and requires a Service-level backup CronJob; an **NFS volume** is RWX and is backed up at the host level via rsync.
|
||
- **State tier** and **Namespace tier** are orthogonal — a Tier 0 Stack can deploy a Service into any Namespace tier and vice versa.
|
||
- A **Service**'s image reaches the cluster via **Woodpecker deploy** (push-driven, on commit) or **Keel** (poll-driven, on a new registry tag); **Diun** only notifies. Operator-managed StatefulSets are rolled by neither.
|
||
- Tier-1 **State tier** state and ~12 app databases share one **CNPG** `pg-cluster`, reached through **PgBouncer**; their credentials rotate via the `vault-database` store.
|
||
|
||
## Example dialogue
|
||
|
||
> **Dev:** "I'm adding a new **Service** — FastAPI backend with its own JWT login. Do I need Authentik?"
|
||
> **Domain expert:** "If the FastAPI login is the gate, set `auth = "app"` on the ingress. That records the intent that you _chose_ not to layer Authentik — leave a one-line comment above stating what gates the Service, or `scripts/tg` will refuse the apply."
|
||
> **Dev:** "And storage?"
|
||
> **Domain expert:** "Does it hold user data? If yes, `proxmox-lvm-encrypted` — that's the default for anything sensitive. Add a backup CronJob writing to `/mnt/main/<service>-backup/`. If the data is just caches, plain `proxmox-lvm` is fine."
|
||
> **Dev:** "What about a Secret with the JWT signing key?"
|
||
> **Domain expert:** "Put the key in `secret/<service>` in Vault, then declare an **ExternalSecret** to materialise it as a K8s Secret. Read it at plan time with `data "kubernetes_secret"` — that keeps Vault out of the plan path."
|
||
|
||
## Flagged ambiguities
|
||
|
||
- **"tier"** has exactly two senses — always qualify which: *State tier* (Tier 0 / Tier 1, Terraform backend partition) and *Namespace tier* (`0-core`..`4-aux`, scheduling priority/quota). They are orthogonal axes. Do **not** coin new "tier"s: **Ingress auth** is a *mode* (not a tier), and storage speed (SSD vs HDD) is *not* a "tier" either.
|
||
- **"node"** can mean a K8s Node (default) or a PVE node. For Proxmox-level statements, say **PVE node** explicitly.
|
||
- **"service"** spans two distinct concepts: the deployed app (capitalised **Service**, this repo's domain noun) and the K8s `Service` object (in backticks or qualified "K8s Service"). Lowercase "service" in prose is fine when context disambiguates; flag it when it doesn't.
|
||
- **"secret"** spans Vault entries, K8s Secret objects, **ExternalSecrets**, and **Sealed Secrets**. Always specify which.
|
||
- **"proxied"** / **"non-proxied"** refer to Cloudflare's CDN posture for a DNS record, _not_ Anubis or forward-auth layering.
|
||
- **"policy"** spans **Kyverno policy** (admission-time mutate/generate/validate), **Calico NetworkPolicy** (data-path ingress/egress), Vault policy (KV access), and K8s RBAC. Always qualify which engine.
|