docs(context): freshen infra glossary (modules, tiers, new concepts) [ci skip]

Refresh CONTEXT.md against current repo + cluster reality (grill-with-docs):

- Module taxonomy rewrite: drop fictional k8s_app/helm_app/postgres_app
  factory modules (never existed); name the real four (ingress_factory,
  nfs_volume, anubis_instance, setup_tls_secret) + the shared / Stack-local
  / flat distinction; flag vestigial modules/kubernetes/<app> dirs.
- Rename "Ingress auth tier" -> "Ingress auth" (discrete modes, not tiers);
  reserve "tier" for State tier + Namespace tier only.
- Add local-path entry (cluster default SC; node-local footgun warning).
- Add concepts: Keel, Diun, CNPG/pg-cluster, MetalLB LB-IP split, Calico.
- Add "policy" ambiguity flag (Kyverno vs Calico NetworkPolicy vs Vault/RBAC).
- Fix node count 5 -> 7 (k8s-master + k8s-node1..6).

Doc-sync (same commit per repo rules):
- overview.md: replace fictional factory modules with the real shared
  modules + the flat/stack-local pattern.
- .claude/CLAUDE.md: drop dead nfs-proxmox column from the storage decision
  table + stale cross-reference (vault migrated off it 2026-04-25).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-05 19:34:49 +00:00
parent aa948be581
commit 52f5de905d
3 changed files with 60 additions and 27 deletions

@@ -189,20 +189,20 @@ Plan in `docs/architecture/security.md` + response playbook in `docs/runbooks/se
 Choose storage class based on workload type:
-| Use **proxmox-lvm-encrypted** when | Use **proxmox-lvm** when | Use **NFS** (`nfs_volume` module) when | Use **nfs-proxmox** SC when |
-|------------------------------------|--------------------------|----------------------------------------|-----------------------------|
-| **Any service storing sensitive data** | Non-sensitive app state (configs, caches) | Shared data across multiple pods (RWX) | Dynamic provisioning on Proxmox host NFS |
-| Databases (user data, credentials) | Media indexes, search caches | Media libraries (music, ebooks, photos) | Vault (dynamic PVC creation) |
-| Auth/identity services | Monitoring data (Prometheus) | Backup destinations (cloud sync picks up from NFS) | |
-| Password managers, email, git repos | Tools with no user secrets | Large datasets (>10Gi) where snapshots matter | |
-| Health/financial data | | Data you want to browse/inspect from outside k8s | |
+| Use **proxmox-lvm-encrypted** when | Use **proxmox-lvm** when | Use **NFS** (`nfs_volume` module) when |
+|------------------------------------|--------------------------|----------------------------------------|
+| **Any service storing sensitive data** | Non-sensitive app state (configs, caches) | Shared data across multiple pods (RWX) |
+| Databases (user data, credentials) | Media indexes, search caches | Media libraries (music, ebooks, photos) |
+| Auth/identity services | Monitoring data (Prometheus) | Backup destinations (cloud sync picks up from NFS) |
+| Password managers, email, git repos | Tools with no user secrets | Large datasets (>10Gi) where snapshots matter |
+| Health/financial data | | Data you want to browse/inspect from outside k8s |
 **Default for sensitive data is proxmox-lvm-encrypted.** Use plain `proxmox-lvm` only for non-sensitive workloads. Use NFS when you need RWX, backup pipeline integration, or it's a large shared media library.
 **NFS server:**
 - **Proxmox host** (192.168.1.127): Sole NFS for all workloads. HDD at `/srv/nfs` (ext4 thin LV `pve/nfs-data`, 3 TB). SSD at `/srv/nfs-ssd` (ext4 LV `ssd/nfs-ssd-data`, 100GB). Exports use `async,insecure` options (`async` — safe with UPS + Vault Raft replication + databases on block storage; `insecure` — pfSense NATs source ports >1024 between VLANs).
 - **Nextcloud as NFS browser**: Nextcloud (`nextcloud.viktorbarzin.me`) mounts the PVE NFS roots (`/srv/nfs`, `/srv/nfs-ssd`) inside the NC pod at `/mnt/pve-nfs` + `/mnt/pve-nfs-ssd`. Surfaced to users via two ACL patterns: (1) admin-only root browsers `PVE NFS Pool` + `PVE NFS-SSD Pool` (scoped to NC group `admin`); (2) per-archive mounts (e.g. `/anca-elements`) with `applicable_users` set to the owners. ACL is at the mount level via `occ files_external:applicable` — Files Access Control is NOT used (NC 30/31's workflow engine lacks FilePath / UserId checks). Manifest lives in `kubernetes_config_map_v1.nextcloud_external_storage_manifest` (`stacks/nextcloud/external_storage.tf`); a one-shot K8s Job applies it idempotently.
-- **`nfs-truenas` StorageClass**: Historical name retained only because SC names are immutable on PVs (48 bound PVs reference it — renaming would require mass PV churn, not worth it). Now points to the Proxmox host, identical to `nfs-proxmox`. TrueNAS (VM 9000, 10.0.10.15) operationally decommissioned 2026-04-13; VM still exists in stopped state on PVE pending user decision on deletion.
+- **`nfs-truenas` StorageClass**: Historical name retained only because SC names are immutable on PVs (48 bound PVs reference it — renaming would require mass PV churn, not worth it). Now points to the Proxmox host (`nfs.csi.k8s.io` dynamic provisioning on `192.168.1.127:/srv/nfs`). TrueNAS (VM 9000, 10.0.10.15) operationally decommissioned 2026-04-13; VM still exists in stopped state on PVE pending user decision on deletion.
 **Migration note**: CSI PV `volumeAttributes` are immutable — cannot update NFS server in place. New PV/PVC pairs required (convention: append `-host` to PV name).
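
Reviewer sketch (not part of the commit): the migration note implies that moving a volume to the Proxmox host means building a fresh PV object, not patching the old one. The Python below is a minimal illustration of that idea, operating on plain manifest dicts. The function name `host_pv_from`, the example PV `media-pv`, its share path, and the choice to reuse the new PV name as `volumeHandle` are all assumptions made for illustration; only the `-host` suffix, the `nfs.csi.k8s.io` driver, and the two server addresses come from the text above.

```python
import copy


def host_pv_from(old_pv: dict, server: str, share: str) -> dict:
    """Build a replacement PV manifest named '<old name>-host'.

    The old manifest is left untouched, because volumeAttributes on an
    existing CSI PV cannot be edited in place.
    """
    new_pv = copy.deepcopy(old_pv)
    new_name = old_pv["metadata"]["name"] + "-host"

    # Start from clean metadata so server-assigned fields (uid, etc.) are dropped.
    new_pv["metadata"] = {"name": new_name}

    csi = new_pv["spec"]["csi"]
    csi["volumeAttributes"] = {
        **csi.get("volumeAttributes", {}),
        "server": server,
        "share": share,
    }
    # Assumption: reuse the PV name as a unique volumeHandle.
    csi["volumeHandle"] = new_name

    # Drop the binding and status so a new PVC can claim the new PV.
    new_pv["spec"].pop("claimRef", None)
    new_pv.pop("status", None)
    return new_pv


# Hypothetical existing PV that still points at the old TrueNAS address.
old_pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "media-pv", "uid": "hypothetical-uid"},
    "spec": {
        "capacity": {"storage": "10Gi"},
        "accessModes": ["ReadWriteMany"],
        "storageClassName": "nfs-truenas",
        "claimRef": {"namespace": "media", "name": "media-pvc"},
        "csi": {
            "driver": "nfs.csi.k8s.io",
            "volumeHandle": "media-pv",
            "volumeAttributes": {"server": "10.0.10.15", "share": "/hypothetical/media"},
        },
    },
}

new_pv = host_pv_from(old_pv, server="192.168.1.127", share="/srv/nfs")
```

A matching PVC would still need to be created and the workload repointed at it; the sketch covers only the PV half of the pair.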