feat(storage): migrate all sensitive services to proxmox-lvm-encrypted

Reconcile Terraform with cluster state after manual encrypted PVC migrations
and complete the remaining unfinished migrations. All services storing
sensitive data now use LUKS2-encrypted block storage via the Proxmox CSI
plugin.

## Context

Only Technitium DNS was using encrypted storage in Terraform. Many services
had been manually migrated to encrypted PVCs in the cluster, but Terraform
was never updated. The resulting state drift was dangerous: a `tg apply`
could recreate unencrypted PVCs.

## This change

Phase 0 — Infrastructure:
- Add `proxmox-lvm-encrypted` StorageClass to Helm values (extraParameters)
- Add ExternalSecret for LUKS encryption passphrase to Terraform
- Fix CSI node plugin memory: set the limit under `node.plugin.resources`
  (not `node.resources`) to 1280Mi to accommodate LUKS2 Argon2id key derivation

Phase 1 — TF state reconciliation (zero downtime):
- Health, Matrix, N8N, Forgejo, Vaultwarden, Mailserver: state rm + import
- Redis, DBAAS MySQL, DBAAS PostgreSQL: Helm/CNPG value updates
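
The `state rm` + import step above can be sketched as a small dry-run script. This is an illustration under assumptions, not a record of the commands that were run: it assumes `tg` forwards `state rm` and `import` to Terraform, and both resource addresses are hypothetical. The `<namespace>/<name>` import ID is the format the Kubernetes provider uses for PVCs.

```shell
#!/bin/sh
# Sketch: re-point Terraform state at a PVC that already exists in the cluster.
# Assumptions (not from the commit): `tg` passes `state rm` / `import` through
# to Terraform, and the resource addresses used below are hypothetical.
# DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# reconcile_pvc OLD_ADDRESS NEW_ADDRESS NAMESPACE PVC_NAME
reconcile_pvc() {
  old_addr="$1"; new_addr="$2"; ns="$3"; pvc="$4"
  # Forget the stale entry; this only edits state, not the cluster object.
  run tg state rm "$old_addr"
  # Adopt the existing encrypted PVC under the new address.
  run tg import "$new_addr" "$ns/$pvc"
}

reconcile_pvc \
  kubernetes_persistent_volume_claim.data_proxmox \
  kubernetes_persistent_volume_claim.data_encrypted \
  vaultwarden vaultwarden-data-encrypted
```

With `DRY_RUN=0` the commands execute instead of being printed. Neither command modifies the PVC itself, which is consistent with this phase being zero downtime.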

Phase 2 — Data migration (encrypted PVCs existed but unused):
- Headscale, Frigate, MeshCentral: rsync + switchover
- Nextcloud (20Gi): rsync + chart_values update

Phase 3 — New encrypted PVCs:
- Roundcube HTML, HackMD, Affine, DBAAS pgadmin: create + rsync + switchover

Phase 4 — Cleanup:
- Delete 5 orphaned unencrypted PVCs

## Services migrated (21 PVCs across 14 namespaces)

```
vaultwarden     → vaultwarden-data-encrypted
dbaas           → datadir-mysql-cluster-0, pg-cluster-{1,2}, dbaas-pgadmin-encrypted
mailserver      → mailserver-data-encrypted, roundcubemail-{enigma,html}-encrypted
nextcloud       → nextcloud-data-encrypted
forgejo         → forgejo-data-encrypted
matrix          → matrix-data-encrypted
n8n             → n8n-data-encrypted
affine          → affine-data-encrypted
health          → health-uploads-encrypted
hackmd          → hackmd-data-encrypted
redis           → redis-data-redis-node-{0,1}
headscale       → headscale-data-encrypted
frigate         → frigate-config-encrypted
meshcentral     → meshcentral-{data,files}-encrypted
```
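
One way to spot-check the list above is to filter PVCs by StorageClass. The helper below is a sketch: `list_unencrypted` is a hypothetical name, and the sample rows it is fed are illustrative rather than real cluster output.

```shell
#!/bin/sh
# list_unencrypted: read "namespace name storageclass" rows on stdin and print
# every PVC whose StorageClass is not proxmox-lvm-encrypted.
list_unencrypted() {
  awk '$3 != "proxmox-lvm-encrypted" { print $1 "/" $2 " (" $3 ")" }'
}

# Illustrative rows only; the second PVC is made up for the example.
printf '%s\n' \
  'vaultwarden vaultwarden-data-encrypted proxmox-lvm-encrypted' \
  'example example-data proxmox-lvm' | list_unencrypted
# prints: example/example-data (proxmox-lvm)

# Against a live cluster (not run here):
# kubectl get pvc -A --no-headers \
#   -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,SC:.spec.storageClassName \
#   | list_unencrypted
```

Any line printed for one of the namespaces above would indicate a PVC still on an unencrypted class.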

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-04-15 20:15:30 +00:00
parent aafb7eea34
commit 8b004c4c94
16 changed files with 151 additions and 96 deletions

@@ -197,7 +197,7 @@ resource "helm_release" "mysql_cluster" {
}
datadirVolumeClaimTemplate = {
-storageClassName = "proxmox-lvm"
+storageClassName = "proxmox-lvm-encrypted"
metadata = {
annotations = {
"resize.topolvm.io/threshold" = "80%"
@@ -353,13 +353,13 @@ resource "helm_release" "mysql_cluster" {
]
}
-# MySQL Router - explicitly set resources (chart does not expose router.resources)
-# VPA shows 100Mi upper bound, setting to 128Mi
-# Note: This requires manual kubectl patch after helm release:
-# kubectl patch deployment mysql-cluster-router -n dbaas --type=json -p='[
-# {"op": "replace", "path": "/spec/template/spec/containers/0/resources",
-# "value": {"requests": {"cpu": "25m", "memory": "128Mi"}, "limits": {"memory": "128Mi"}}}]'
-# TODO: migrate to mysql-operator fork or wait for upstream router.resources support
+# MySQL Router - explicitly set resources (chart does not expose router.resources)
+# VPA shows 100Mi upper bound, setting to 128Mi
+# Note: This requires manual kubectl patch after helm release:
+# kubectl patch deployment mysql-cluster-router -n dbaas --type=json -p='[
+# {"op": "replace", "path": "/spec/template/spec/containers/0/resources",
+# "value": {"requests": {"cpu": "25m", "memory": "128Mi"}, "limits": {"memory": "128Mi"}}}]'
+# TODO: migrate to mysql-operator fork or wait for upstream router.resources support
})]
@@ -398,10 +398,10 @@ module "nfs_mysql_backup_host" {
nfs_path = "/srv/nfs/mysql-backup"
}
-resource "kubernetes_persistent_volume_claim" "pgadmin_proxmox" {
+resource "kubernetes_persistent_volume_claim" "pgadmin_encrypted" {
wait_until_bound = false
metadata {
-name = "dbaas-pgadmin-proxmox"
+name = "dbaas-pgadmin-encrypted"
namespace = kubernetes_namespace.dbaas.metadata[0].name
annotations = {
"resize.topolvm.io/threshold" = "80%"
@@ -411,7 +411,7 @@ resource "kubernetes_persistent_volume_claim" "pgadmin_proxmox" {
}
spec {
access_modes = ["ReadWriteOnce"]
-storage_class_name = "proxmox-lvm"
+storage_class_name = "proxmox-lvm-encrypted"
resources {
requests = {
storage = "1Gi"
@@ -523,9 +523,9 @@ resource "kubernetes_cron_job_v1" "mysql-backup-per-db" {
namespace = kubernetes_namespace.dbaas.metadata[0].name
}
spec {
-concurrency_policy = "Replace"
-failed_jobs_history_limit = 3
-schedule = "45 0 * * *"
+concurrency_policy = "Replace"
+failed_jobs_history_limit = 3
+schedule = "45 0 * * *"
starting_deadline_seconds = 10
successful_jobs_history_limit = 3
job_template {
@@ -1093,7 +1093,7 @@ resource "null_resource" "pg_cluster" {
instances = "2"
image = "ghcr.io/cloudnative-pg/postgis:16"
storage_size = "20Gi"
-storage_class = "proxmox-lvm"
+storage_class = "proxmox-lvm-encrypted"
memory_limit = "2Gi"
pg_params = "v2-shared512-walcomp-workmem16"
}
@@ -1127,7 +1127,7 @@ resource "null_resource" "pg_cluster" {
resize.topolvm.io/storage_limit: "100Gi"
storage:
size: 20Gi
-storageClass: proxmox-lvm
+storageClass: proxmox-lvm-encrypted
resources:
requests:
cpu: "50m"
@@ -1257,7 +1257,7 @@ resource "kubernetes_deployment" "pgadmin" {
# name = "pgadmin-config"
# }
persistent_volume_claim {
-claim_name = kubernetes_persistent_volume_claim.pgadmin_proxmox.metadata[0].name
+claim_name = kubernetes_persistent_volume_claim.pgadmin_encrypted.metadata[0].name
}
}
dns_config {
@@ -1386,9 +1386,9 @@ resource "kubernetes_cron_job_v1" "postgresql-backup-per-db" {
namespace = kubernetes_namespace.dbaas.metadata[0].name
}
spec {
-concurrency_policy = "Replace"
-failed_jobs_history_limit = 3
-schedule = "15 0 * * *"
+concurrency_policy = "Replace"
+failed_jobs_history_limit = 3
+schedule = "15 0 * * *"
starting_deadline_seconds = 10
successful_jobs_history_limit = 3
job_template {