From ea2342b8e25c1faf5d6c98ef90cbbc639ee22f15 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Sat, 16 May 2026 11:48:19 +0000
Subject: [PATCH] docs: add CONTEXT.md domain glossary [ci skip]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the per-repo domain glossary that engineering skills (diagnose, tdd, improve-codebase-architecture, grill-with-docs) read before working in this repo. Terms only — no implementation detail. Six clusters (code organization, cluster, networking, storage, secrets, CI/CD), 22 terms, plus relationships, an example dialogue, and five flagged ambiguities.

Co-Authored-By: Claude Opus 4.7
---
 CONTEXT.md | 150 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100644 CONTEXT.md

diff --git a/CONTEXT.md b/CONTEXT.md
new file mode 100644
index 00000000..ffdab01f
--- /dev/null
+++ b/CONTEXT.md
@@ -0,0 +1,150 @@
+# Infra
+
+Terragrunt-managed homelab declaring a 5-node Kubernetes cluster on a single Proxmox host. Vault is the secrets source of truth; everything else flows from this repo via `scripts/tg apply`.
+
+## Language
+
+### Code organization
+
+**Service**:
+The deployed app as a domain concept — one logical thing that runs in the cluster (e.g. immich, technitium, freshrss). Defined by exactly one **Stack**.
+_Avoid_: bare "app" without the Service definition; "deployment" (collides with K8s `Deployment`).
+
+**Stack**:
+The HCL directory under `stacks//` that defines a Service, applied independently with `scripts/tg apply`. A Stack is the unit of Terraform organisation; a Service is the running thing. They are 1:1 but not synonyms.
+_Avoid_: using "Stack" when you mean the running Service.
+
+**Module**:
+A reusable HCL primitive under `modules/`, consumed by Stacks via `source =`.
+_Avoid_: "library", "package".
+
+**Factory module**:
+A Module that hides convention (defaults, drift handling, secret wiring) behind a small input surface. Canonical examples: `ingress_factory`, `nfs_volume`, `k8s_app`, `helm_app`, `postgres_app`.
+_Avoid_: "wrapper".
+
+**State tier**:
+Terraform state-backend partition. **Tier 0** = bootstrap Stacks (`infra`, `platform`, `cnpg`, `vault`, `dbaas`, `external-secrets`) on local SOPS-encrypted state. **Tier 1** = every other Stack, on PG-backed state.
+_Avoid_: "phase", "bootstrap stack" — say Tier 0 explicitly.
+
+### Cluster
+
+**Node**:
+A K8s worker VM (`k8s-master`, `k8s-node1..4`). Default reading of the bare word "node" in this repo.
+_Avoid_: "k8s node" (redundant), "host" (ambiguous).
+
+**PVE node** / **PVE host**:
+The single physical Dell R730 running Proxmox; sole hypervisor and sole NFS server. There is exactly one.
+_Avoid_: "server", "hypervisor", "Proxmox" alone when you mean the host.
+
+**Namespace tier**:
+A namespace-prefix partition (`0-core-*`, `1-cluster-*`, `2-gpu-*`, `3-edge-*`, `4-aux-*`) driving PriorityClass, default resources, and ResourceQuota — generated by **Kyverno policy** from the namespace name. Orthogonal to **State tier**.
+_Avoid_: "Service tier" (the partition is on the namespace, not the Service); collapsing Namespace tier with State tier — they are different axes.
+
+**Kyverno policy**:
+The convention engine of the cluster — a ClusterPolicy or Policy resource that mutates/generates/validates on admission. Owns Namespace tier limits/quotas, `dns_config` injection on every pod-owning workload, Forgejo pull-credential sync across namespaces, TLS-secret replication. When the repo says "this happens automatically", a Kyverno policy is usually the actor.
+_Avoid_: bare "policy" (overloaded with Vault, RBAC, NetworkPolicy).
+
+**Critical-path Service**:
+One of {Traefik, Authentik, CrowdSec LAPI, PgBouncer, Cloudflared} — replicas ≥3, PDB enforced, monitored independently.
+_Avoid_: "core service" (collides with the `0-core-*` Namespace tier name).
+
+**Namespace-owner**:
+A non-admin identity declared in `secret/platform → k8s_users` (JSON map). Owns one or more namespaces and one or more public subdomains.
+_Avoid_: bare "user", "tenant".
+
+### Networking
+
+**Public domain**:
+`viktorbarzin.me`, served through Cloudflare. DNS records are either **proxied** (Cloudflare CDN/WAF in front) or **non-proxied** (direct A/AAAA reachable via Cloudflared Tunnel).
+_Avoid_: "external", "outside".
+
+**Internal domain**:
+`viktorbarzin.lan`, served by Technitium DNS. Resolves only inside the homelab network.
+_Avoid_: bare "lan", "private", "intranet".
+
+**Ingress auth tier**:
+The `auth = "..."` parameter on `ingress_factory`, one of `required` (Authentik forward-auth gates every request), `app` (the backend owns its login), `public` (anonymous Authentik binding for audit only), or `none` (Anubis-fronted content, or native-client API).
+_Avoid_: "auth mode" — the canonical key is `auth`.
+
+**Authentik outpost**:
+A standalone Authentik deployment that terminates the proxy/auth flow for a specific binding model. The repo runs two distinct ones: the default outpost (used by `auth = "required"`) and the `public` outpost (anonymous binding, used by `auth = "public"`).
+_Avoid_: conflating outpost with Authentik core; "Authentik instance".
+
+**Cloudflared Tunnel**:
+The channel by which non-proxied **public domain** traffic reaches the cluster, terminating at Traefik. Backs every `dns_type = "non-proxied"` record and is the fallback path for the wildcard `*.viktorbarzin.me`.
+_Avoid_: "the tunnel" without "Cloudflared" (could mean Headscale).
+
+**Ingress chain**:
+The opinionated stack of Traefik middlewares that `ingress_factory` layers onto every Ingress. Slots, in order: forward-auth (per **Ingress auth tier**) → anti-AI scraping (default-on when no Authentik is in the path) → CrowdSec bouncer (fail-open) → retry (2× / 100ms) → rate-limit (429, not 503). Adding or removing a middleware is a Stack-level choice, but the chain order is convention.
+_Avoid_: "middleware list", "Traefik chain". The Anubis PoW gate is upstream of this chain, not inside it.
+
+### Storage
+
+**proxmox-lvm-encrypted**:
+Default StorageClass for any workload holding sensitive data (databases, auth, password managers, email, financial data). LUKS2 over a Proxmox LVM-thin LV.
+_Avoid_: bare "encrypted PVC" — name the StorageClass.
+
+**proxmox-lvm**:
+Block StorageClass for non-sensitive workloads (caches, monitoring data, indexes, app state without secrets).
+
+**NFS volume**:
+RWX file storage for shared media libraries, large datasets, or anything that needs to be inspected from outside K8s. Provisioned via the `nfs_volume` Module.
+_Avoid_: "shared storage" (ambiguous).
+
+**nfs-truenas StorageClass**:
+A historical SC name retained only because StorageClass strings are immutable on bound PVs. The underlying server is the **PVE host**, not TrueNAS; TrueNAS is decommissioned.
+_Avoid_: assuming this means TrueNAS.
+
+**3-2-1 backup**:
+The named posture of where data lives: **Copy 1** = live on the PVE thin pool (sdc), **Copy 2** = sda backup disk (`/mnt/backup`), **Copy 3** = offsite Synology NAS. Per-PVC file-level rsync from LVM thin snapshots; databases additionally dump to NFS for per-DB restore.
+_Avoid_: bare "backup" without saying which copy you mean (a service is "backed up" only once it's on Copy 2; Copy 3 is the disaster floor).
+
+### Secrets
+
+**Vault path**:
+Convention: `secret/` for Service-owned secrets, `secret/viktor` for personal/global, `secret/platform` for cluster-wide maps (`k8s_users`, `homepage_credentials`).
+_Avoid_: conflating Vault path (e.g. `secret/viktor`) with Vault field (e.g. `forgejo_pull_token`).
+
+**ExternalSecret** / **ESO**:
+A K8s manifest that materialises a Vault KV value as a K8s Secret. Two ClusterSecretStores: `vault-kv` (KV engine) and `vault-database` (rotating DB creds).
+
+**Plan-time secret**:
+A secret value read in Terraform via `data "kubernetes_secret"` (i.e. via the ESO-created K8s Secret) at plan time, with no Vault provider call. Distinct from a **vault data source** read (`data "vault_kv_secret_v2"`), which still goes through the Vault provider. A few Stacks remain hybrid (plan-time for env vars, vault data source for module inputs).
+
+**Sealed Secret**:
+A user-managed secret committed to a Stack directory as `sealed-*.yaml`. Distinct from ExternalSecret — Sealed Secrets carry their own bytes, ExternalSecrets reference Vault.
+
+### CI/CD
+
+**GHA build + Woodpecker deploy**:
+The split where Docker images are built+pushed by GitHub Actions and Woodpecker only runs `kubectl set image` on a deploy-only pipeline. Repos that can't fit GHA limits stay on Woodpecker for build too.
+_Avoid_: bare "Woodpecker pipeline" — say "build" or "deploy".
+
+**Anubis**:
+A PoW reverse-proxy issuing a 30-day JWT cookie, used in front of public content-bearing sites without app-level auth (blog, wiki, landing pages). Never in front of Git, WebDAV, CalDAV, or API endpoints (clients can't solve PoW).
+
+## Relationships
+
+- A **Service** is defined by exactly one **Stack**, which declares zero or more **Modules** and resolves to one or more K8s workloads.
+- A **Namespace-owner** owns one or more namespaces and one or more public subdomains.
+- A **Service** owns its **Vault path** at `secret/`, surfaces values through **ExternalSecrets**, and reads them at plan time via **plan-time secrets**.
+- An **Ingress** picks exactly one **Ingress auth tier**; the choice defines how strangers reach the backend.
+- A **proxmox-lvm-encrypted** PVC binds to one Node at a time (RWO) and requires a Service-level backup CronJob; an **NFS volume** is RWX and is backed up at the host level via rsync.
+- **State tier** and **Namespace tier** are orthogonal — a Tier 0 Stack can deploy a Service into any Namespace tier and vice versa.
+
+## Example dialogue
+
+> **Dev:** "I'm adding a new **Service** — FastAPI backend with its own JWT login. Do I need Authentik?"
+> **Domain expert:** "If the FastAPI login is the gate, set `auth = "app"` on the ingress. That records the intent that you _chose_ not to layer Authentik — leave a one-line comment above stating what gates the Service, or `scripts/tg` will refuse the apply."
+> **Dev:** "And storage?"
+> **Domain expert:** "Does it hold user data? If yes, `proxmox-lvm-encrypted` — that's the default for anything sensitive. Add a backup CronJob writing to `/mnt/main/-backup/`. If the data is just caches, plain `proxmox-lvm` is fine."
+> **Dev:** "What about a Secret with the JWT signing key?"
+> **Domain expert:** "Put the key in `secret/` in Vault, then declare an **ExternalSecret** to materialise it as a K8s Secret. Read it at plan time with `data "kubernetes_secret"` — that keeps Vault out of the plan path."
+
+## Flagged ambiguities
+
+- **"tier"** is overloaded — *Namespace tier* (`0-core`..`4-aux`, scheduling priority) is distinct from *State tier* (Tier 0 / Tier 1, Terraform backend partition). Always qualify which axis.
+- **"node"** can mean a K8s Node (default) or a PVE node. For Proxmox-level statements, say **PVE node** explicitly.
+- **"service"** spans two distinct concepts: the deployed app (capitalised **Service**, this repo's domain noun) and the K8s `Service` object (in backticks or qualified "K8s Service"). Lowercase "service" in prose is fine when context disambiguates; flag it when it doesn't.
+- **"secret"** spans Vault entries, K8s Secret objects, **ExternalSecrets**, and **Sealed Secrets**. Always specify which.
+- **"proxied"** / **"non-proxied"** refer to Cloudflare's CDN posture for a DNS record, _not_ Anubis or forward-auth layering.
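
Reviewer note (not part of the patch): the two "tier" axes flagged in the glossary can be sketched as two independent lookups. This is a minimal illustration, not code from the repo. The function names, the constant names, and the sample namespace `0-core-traefik` are hypothetical; only the five namespace prefixes and the six Tier 0 Stack names come from the glossary text. In the cluster itself the Namespace tier is derived by a Kyverno policy, not by Python.

```python
from typing import Optional

# Prefixes taken from the glossary's Namespace tier entry.
NAMESPACE_TIER_PREFIXES = ("0-core", "1-cluster", "2-gpu", "3-edge", "4-aux")

# Stack names taken from the glossary's State tier entry (Tier 0 = bootstrap Stacks).
TIER0_STACKS = frozenset(
    {"infra", "platform", "cnpg", "vault", "dbaas", "external-secrets"}
)


def namespace_tier(namespace: str) -> Optional[str]:
    """Return the Namespace tier prefix for a namespace name, or None if untiered.

    Hypothetical helper: matches the `<prefix>-*` patterns listed in the glossary.
    """
    for prefix in NAMESPACE_TIER_PREFIXES:
        if namespace.startswith(prefix + "-"):
            return prefix
    return None


def state_tier(stack: str) -> int:
    """Return 0 for bootstrap Stacks and 1 for every other Stack.

    Hypothetical helper: the lookup key is the Stack name, never the namespace.
    """
    return 0 if stack in TIER0_STACKS else 1
```

The two functions take different inputs (a namespace name and a Stack name), which is the point of the "always qualify which axis" rule: knowing one tier says nothing about the other.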