diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 28b91818..0ad70344 100755
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -143,15 +143,15 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
 Choose storage class based on workload type:
-| Use **proxmox-lvm** when | Use **NFS** (`nfs_volume` module) when | Use **nfs-proxmox** SC when |
-|--------------------------|----------------------------------------|-----------------------------|
-| Database files (SQLite, embedded DBs) | Shared data across multiple pods (RWX) | Dynamic provisioning on Proxmox host NFS |
-| Write-heavy / fsync-heavy workloads | Media libraries (music, ebooks, photos) | Vault (dynamic PVC creation) |
-| Single-pod app state (RWO is fine) | Backup destinations (cloud sync picks up from NFS) | |
-| Latency-sensitive data | Large datasets (>10Gi) where snapshots matter | |
-| Any new service by default | Data you want to browse/inspect from outside k8s | |
+| Use **proxmox-lvm-encrypted** when | Use **proxmox-lvm** when | Use **NFS** (`nfs_volume` module) when | Use **nfs-proxmox** SC when |
+|------------------------------------|--------------------------|----------------------------------------|-----------------------------|
+| **Any service storing sensitive data** | Non-sensitive app state (configs, caches) | Shared data across multiple pods (RWX) | Dynamic provisioning on Proxmox host NFS |
+| Databases (user data, credentials) | Media indexes, search caches | Media libraries (music, ebooks, photos) | Vault (dynamic PVC creation) |
+| Auth/identity services | Monitoring data (Prometheus) | Backup destinations (cloud sync picks up from NFS) | |
+| Password managers, email, git repos | Tools with no user secrets | Large datasets (>10Gi) where snapshots matter | |
+| Health/financial data | | Data you want to browse/inspect from outside k8s | |
-**Default is proxmox-lvm.** Only use NFS when you need RWX, backup pipeline integration, or it's a large shared media library.
+**Default for sensitive data is proxmox-lvm-encrypted.** Use plain `proxmox-lvm` only for non-sensitive workloads. Use NFS when you need RWX, backup pipeline integration, or it's a large shared media library.
 **NFS servers:**
 - **Proxmox host** (192.168.1.127): Primary NFS for all workloads. HDD at `/srv/nfs` (ext4 thin LV `pve/nfs-data`, 1TB). SSD at `/srv/nfs-ssd` (ext4 LV `ssd/nfs-ssd-data`, 100GB). Exports use `async,insecure` options (`async` — safe with UPS + Vault Raft replication + databases on block storage; `insecure` — pfSense NATs source ports >1024 between VLANs).
@@ -193,6 +193,35 @@ resource "kubernetes_persistent_volume_claim" "data_proxmox" {
 - Autoresizer annotations are **required** on all proxmox-lvm PVCs
 - Every proxmox-lvm app **MUST** add a backup CronJob writing to NFS `/mnt/main/-backup/`
+**proxmox-lvm-encrypted PVC template** (Terraform) — use for all sensitive data:
+```hcl
+resource "kubernetes_persistent_volume_claim" "data_encrypted" {
+  wait_until_bound = false
+  metadata {
+    name      = "-data-encrypted"
+    namespace = kubernetes_namespace..metadata[0].name
+    annotations = {
+      "resize.topolvm.io/threshold"     = "80%"
+      "resize.topolvm.io/increase"      = "100%"
+      "resize.topolvm.io/storage_limit" = "5Gi"
+    }
+  }
+  spec {
+    access_modes       = ["ReadWriteOnce"]
+    storage_class_name = "proxmox-lvm-encrypted"
+    resources {
+      requests = { storage = "1Gi" }
+    }
+  }
+}
+```
+- Same rules as `proxmox-lvm` (wait_until_bound, Recreate strategy, autoresizer, backup CronJob)
+- Uses LUKS2 encryption with Argon2id key derivation via Proxmox CSI plugin
+- Encryption passphrase stored in Vault KV (`secret/viktor/proxmox_csi_encryption_passphrase`), synced to K8s Secret `proxmox-csi-encryption` in `kube-system` via ExternalSecret
+- Backup key at `/root/.luks-backup-key` on PVE host (chmod 600)
+- CSI node plugin needs 1280Mi memory limit for LUKS operations (`node.plugin.resources` in Helm values)
+- Convention: PVC names end in `-encrypted` (not `-proxmox`)
+
 ### 3-2-1 Backup Strategy
 **Copy 1**: Live data on sdc thin pool (65 PVCs + VMs)
 **Copy 2**: sda backup disk (`/mnt/backup`, 1.1TB ext4, VG `backup`)
diff --git a/AGENTS.md b/AGENTS.md
index e7d8d753..92325edd 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -60,7 +60,8 @@ Terragrunt-based homelab managing a Kubernetes cluster (5 nodes, v1.34.2) on Pro
 ## Storage
 - **NFS** (`nfs-proxmox` StorageClass): For app data. Use the `nfs_volume` module, never inline `nfs {}` blocks.
-- **proxmox-lvm** (`proxmox-lvm` StorageClass): For databases (PostgreSQL, MySQL). Proxmox CSI driver.
+- **proxmox-lvm-encrypted** (`proxmox-lvm-encrypted` StorageClass): **Default for all sensitive data** — databases, auth, email, passwords, git repos, health data. LUKS2 encryption via Proxmox CSI. Passphrase in Vault, backup key on PVE host.
+- **proxmox-lvm** (`proxmox-lvm` StorageClass): For non-sensitive stateful apps (configs, caches, tools). Proxmox CSI driver.
 - **NFS server**: Proxmox host at 192.168.1.127. HDD NFS at `/srv/nfs` (2TB ext4 LV `pve/nfs-data`), SSD NFS at `/srv/nfs-ssd` (100GB ext4 LV `ssd/nfs-ssd-data`). Exports use `async` mode (safe with UPS + databases on block storage). TrueNAS (10.0.10.15) decommissioned.
 - **SQLite on NFS is unreliable** (fsync issues) — always use proxmox-lvm or local disk for databases.
 - **NFS mount options**: Always `soft,timeo=30,retrans=3` to prevent uninterruptible sleep (D state).
diff --git a/docs/architecture/storage.md b/docs/architecture/storage.md
index ed2254ee..f7d0b656 100644
--- a/docs/architecture/storage.md
+++ b/docs/architecture/storage.md
@@ -1,12 +1,16 @@
 # Storage Architecture
-Last updated: 2026-04-13
+Last updated: 2026-04-15
 ## Overview
 The cluster uses two storage backends: **Proxmox CSI** for database block storage and **Proxmox NFS** for application data.
-**Block storage (Proxmox CSI)**: 65 PVCs for databases and stateful apps (CNPG PostgreSQL, MySQL InnoDB, Redis, Vaultwarden, Prometheus, Nextcloud, Calibre-Web, Forgejo, FreshRSS, ActualBudget, NovelApp, Headscale, Uptime Kuma, etc.) use `StorageClass: proxmox-lvm`, which provisions thin LVs directly from the Proxmox host's `local-lvm` storage (sdc, 10.7TB RAID1 HDD thin pool). This eliminates the previous double-CoW (ZFS + LVM-thin) path that caused 56 ZFS checksum errors.
+**Block storage (Proxmox CSI)**: ~95 PVCs for databases and stateful apps use two StorageClasses provisioned from the same `local-lvm` thin pool (sdc, 10.7TB RAID1 HDD):
+- **`proxmox-lvm`**: Unencrypted block storage for non-sensitive workloads (~67 PVCs)
+- **`proxmox-lvm-encrypted`**: LUKS2-encrypted block storage for all sensitive data (~28 PVCs) — databases, auth, email, password managers, git repos, health data, etc. Uses Argon2id key derivation with passphrase from Vault KV.
+
+All services storing sensitive data were migrated to `proxmox-lvm-encrypted` on 2026-04-15. This eliminates the previous double-CoW (ZFS + LVM-thin) path and ensures data-at-rest encryption.
 **NFS storage (Proxmox host)**: ~100 NFS shares for media libraries (Immich, audiobookshelf, servarr, navidrome), backup targets (`*-backup/` directories), and app data are served directly from the Proxmox host at `192.168.1.127`. Two NFS export roots exist:
 - **HDD NFS**: `/srv/nfs` on ext4 LV `pve/nfs-data` (2TB) — bulk media and backup targets
@@ -25,7 +29,7 @@ Both `StorageClass: nfs-truenas` (name kept for compatibility) and `StorageClass
 ```mermaid
 graph TB
     subgraph Proxmox["Proxmox Host (192.168.1.127)"]
-        sdc["sdc: 10.7TB RAID1 HDD<br/>VG pve, LV data (thin pool)<br/>65 proxmox-lvm PVCs"]
+        sdc["sdc: 10.7TB RAID1 HDD<br/>VG pve, LV data (thin pool)<br/>~67 proxmox-lvm PVCs<br/>~28 proxmox-lvm-encrypted PVCs"]
         sda["sda: 1.1TB RAID1 SAS<br/>VG backup, LV data (ext4)<br/>/mnt/backup"]
         NFS_HDD["LV pve/nfs-data (2TB ext4)<br/>/srv/nfs<br/>~100 NFS shares<br/>Media + backup targets"]
         NFS_SSD["LV ssd/nfs-ssd-data (100GB ext4)<br/>/srv/nfs-ssd<br/>High-performance data<br/>(Immich ML)"]
@@ -36,10 +40,11 @@ graph TB
     subgraph K8s["Kubernetes Cluster"]
         CSI_NFS["nfs-csi driver<br/>StorageClass: nfs-truenas / nfs-proxmox<br/>soft,timeo=30,retrans=3"]
-        CSI_PVE["Proxmox CSI plugin<br/>StorageClass: proxmox-lvm"]
+        CSI_PVE["Proxmox CSI plugin<br/>StorageClass: proxmox-lvm<br/>StorageClass: proxmox-lvm-encrypted"]
         NFS_PV["NFS PersistentVolumes<br/>RWX, ~100 volumes"]
-        Block_PV["Block PersistentVolumes<br/>RWO, 65 PVCs"]
+        Block_PV["Block PersistentVolumes<br/>RWO, ~67 PVCs (unencrypted)"]
+        Enc_PV["Encrypted Block PVs<br/>RWO, ~28 PVCs (LUKS2)"]
         Pods["Application Pods"]
         DBPods["Database Pods<br/>PostgreSQL CNPG<br/>MySQL InnoDB"]
@@ -50,9 +55,11 @@ graph TB
     CSI_NFS --> NFS_PV
     CSI_PVE --> Block_PV
+    CSI_PVE --> Enc_PV
     NFS_PV --> Pods
-    Block_PV --> DBPods
+    Block_PV --> Pods
+    Enc_PV --> DBPods
     style Proxmox fill:#e1f5ff
     style K8s fill:#fff4e1
@@ -65,7 +72,8 @@ graph TB
 | Component | Version/Config | Location | Purpose |
 |-----------|---------------|----------|---------|
 | **Proxmox CSI plugin** | Helm chart | Namespace: proxmox-csi | Block storage via LVM-thin hotplug |
-| **StorageClass `proxmox-lvm`** | RWO, WaitForFirstConsumer | Cluster-wide | Databases and stateful apps |
+| **StorageClass `proxmox-lvm`** | RWO, WaitForFirstConsumer | Cluster-wide | Non-sensitive stateful apps |
+| **StorageClass `proxmox-lvm-encrypted`** | RWO, WaitForFirstConsumer, LUKS2 | Cluster-wide | **All sensitive data** (databases, auth, email, passwords, git) |
 | Proxmox NFS (HDD) | LV `pve/nfs-data`, 2TB ext4 | 192.168.1.127:/srv/nfs | Bulk NFS data for all services |
 | Proxmox NFS (SSD) | LV `ssd/nfs-ssd-data`, 100GB ext4 | 192.168.1.127:/srv/nfs-ssd | High-performance data (Immich ML) |
 | nfs-csi | Helm chart | Namespace: nfs-csi | NFS CSI driver |
@@ -112,6 +120,21 @@ graph TB
 **Proxmox API token**: `csi@pve!csi-token` with CSI role (`VM.Audit VM.Config.Disk Datastore.Allocate Datastore.AllocateSpace Datastore.Audit`). Stored in Vault at `secret/viktor`.
+### Encrypted Block Storage Flow (proxmox-lvm-encrypted) — 2026-04-15
+
+1. **PVC creation**: Pod requests a PVC with `storageClass: proxmox-lvm-encrypted`
+2. **CSI provisioning**: Same as `proxmox-lvm` — thin LV created in `local-lvm`
+3. **LUKS encryption**: CSI node plugin reads the encryption passphrase from K8s Secret `proxmox-csi-encryption` (namespace `kube-system`), formats the disk with LUKS2 (Argon2id key derivation), then creates ext4 on top
+4. **Transparent mounting**: Application sees a normal ext4 filesystem — encryption/decryption is handled by dm-crypt in the kernel
+5. **Passphrase management**: ExternalSecret syncs passphrase from Vault KV (`secret/viktor/proxmox_csi_encryption_passphrase`) → K8s Secret. Backup key at `/root/.luks-backup-key` on PVE host.
+
+**Services on encrypted storage (2026-04-15 migration):**
+vaultwarden, dbaas (mysql+pg+pgadmin), mailserver, nextcloud, forgejo, matrix, n8n, affine, health, hackmd, redis, headscale, frigate, meshcentral, technitium, actualbudget, grampsweb, owntracks, paperless-ngx, wealthfolio, monitoring (alertmanager)
+
+**CSI node plugin memory**: Requires 1280Mi limit for LUKS2 Argon2id key derivation (~1GiB). Set via `node.plugin.resources` in Helm values (not `node.resources`).
+
+**Terraform stack**: `stacks/proxmox-csi/` manages both StorageClasses, the ExternalSecret, and CSI plugin resources.
+
 ### iSCSI Storage Flow (DEPRECATED — replaced 2026-04-02)
 > **This section is historical.** All iSCSI PVCs have been migrated to Proxmox CSI (`proxmox-lvm`). The democratic-csi iSCSI driver is pending removal.
@@ -149,12 +172,13 @@ SQLite uses `fsync()` to guarantee durability. NFS's soft mount + async semantic
 | Path | Contents |
 |------|----------|
+| `secret/viktor/proxmox_csi_encryption_passphrase` | LUKS2 encryption passphrase for `proxmox-lvm-encrypted` StorageClass |
 | ~~`secret/viktor/truenas_ssh_key`~~ | **LEGACY** — was SSH key for democratic-csi SSH driver (TrueNAS decommissioned) |
 | ~~`secret/viktor/truenas_root_password`~~ | **LEGACY** — was TrueNAS root password (TrueNAS decommissioned) |
 ### Terraform Stacks
-- **`stacks/proxmox-csi/`**: Deploys Proxmox CSI plugin + `proxmox-lvm` and `proxmox-lvm-encrypted` StorageClasses + ExternalSecret for encryption passphrase + node topology labels
 - **`stacks/nfs-csi/`**: Deploys NFS CSI driver + StorageClasses for Proxmox NFS
 - All application stacks reference NFS volumes via `module "nfs_"` calls
 - Database PVCs use `storageClass: proxmox-lvm` (CNPG, MySQL Helm VCT, Redis Helm, standalone PVCs)