diff --git a/docs/architecture/storage.md b/docs/architecture/storage.md
index 22ddf9a0..856c2d2d 100644
--- a/docs/architecture/storage.md
+++ b/docs/architecture/storage.md
@@ -1,6 +1,6 @@
# Storage Architecture
-Last updated: 2026-05-09
+Last updated: 2026-05-24
## Overview
@@ -13,7 +13,7 @@ The cluster uses two storage backends: **Proxmox CSI** for database block storag
All services storing sensitive data were migrated to `proxmox-lvm-encrypted` on 2026-04-15. This eliminates the previous double-CoW (ZFS + LVM-thin) path and ensures data-at-rest encryption.
**NFS storage (Proxmox host)**: ~100 NFS shares for media libraries (Immich, audiobookshelf, servarr, navidrome), backup targets (`*-backup/` directories), and app data are served directly from the Proxmox host at `192.168.1.127`. Two NFS export roots exist:
-- **HDD NFS**: `/srv/nfs` on ext4 LV `pve/nfs-data` (3TB) — bulk media and backup targets
+- **HDD NFS**: `/srv/nfs` on ext4 LV `pve/nfs-data` (4TB) — bulk media and backup targets
- **SSD NFS**: `/srv/nfs-ssd` on ext4 LV `ssd/nfs-ssd-data` (100GB) — high-performance data (Immich ML)
Both `StorageClass: nfs-truenas` and `StorageClass: nfs-proxmox` point to the Proxmox host and are functionally identical. The `nfs-truenas` name is historical — it was retained because StorageClass names are immutable on bound PVs (48 PVs reference it) and renaming would force mass PV churn across the cluster.
@@ -31,7 +31,7 @@ graph TB
subgraph Proxmox["Proxmox Host (192.168.1.127)"]
sdc["sdc: 10.7TB RAID1 HDD
VG pve, LV data (thin pool)
~67 proxmox-lvm PVCs
~28 proxmox-lvm-encrypted PVCs"]
sda["sda: 1.1TB RAID1 SAS
VG backup, LV data (ext4)
/mnt/backup"]
- NFS_HDD["LV pve/nfs-data (3TB ext4)
/srv/nfs
~100 NFS shares
Media + backup targets"]
+ NFS_HDD["LV pve/nfs-data (4TB ext4)
/srv/nfs
~100 NFS shares
Media + backup targets"]
NFS_SSD["LV ssd/nfs-ssd-data (100GB ext4)
/srv/nfs-ssd
High-performance data
(Immich ML)"]
NFS_Exports["NFS Exports
managed by /etc/exports"]
NFS_HDD --> NFS_Exports
@@ -74,7 +74,7 @@ graph TB
| **Proxmox CSI plugin** | Helm chart | Namespace: proxmox-csi | Block storage via LVM-thin hotplug |
| **StorageClass `proxmox-lvm`** | RWO, WaitForFirstConsumer | Cluster-wide | Non-sensitive stateful apps |
| **StorageClass `proxmox-lvm-encrypted`** | RWO, WaitForFirstConsumer, LUKS2 | Cluster-wide | **All sensitive data** (databases, auth, email, passwords, git) |
-| Proxmox NFS (HDD) | LV `pve/nfs-data`, 3TB ext4 | 192.168.1.127:/srv/nfs | Bulk NFS data for all services |
+| Proxmox NFS (HDD) | LV `pve/nfs-data`, 4TB ext4 | 192.168.1.127:/srv/nfs | Bulk NFS data for all services |
| Proxmox NFS (SSD) | LV `ssd/nfs-ssd-data`, 100GB ext4 | 192.168.1.127:/srv/nfs-ssd | High-performance data (Immich ML) |
| nfs-csi | Helm chart | Namespace: nfs-csi | NFS CSI driver |
| StorageClass `nfs-proxmox` | RWX, soft mount | Cluster-wide | NFS storage, points to Proxmox host |
diff --git a/docs/runbooks/grow-pve-nfs-lv.md b/docs/runbooks/grow-pve-nfs-lv.md
new file mode 100644
index 00000000..156ebe84
--- /dev/null
+++ b/docs/runbooks/grow-pve-nfs-lv.md
@@ -0,0 +1,47 @@
+# Runbook: Grow `/srv/nfs` LV (`pve/nfs-data`)
+
+Use when `/srv/nfs` on the PVE host is filling up and the workloads writing to it cannot be slimmed down. The LV sits on the LVM-thin pool `pve/data` (10.54 TB total). Thin-pool free space is the real gate — confirm before extending.
+
+## When to use
+
+- `df -h /srv/nfs` shows usage > ~85 % and projected growth exceeds free space within a backup retention window.
+- An upcoming bulk write (media import, restore) needs headroom that the current free space won't absorb.
+
+## Steps
+
+1. **Check thin-pool headroom on PVE host:**
+
+ ```bash
+ ssh root@192.168.1.127 'lvs pve/data; lvs pve/nfs-data; df -h /srv/nfs'
+ ```
+
+ The `pve/data` thin pool's `Data%` should leave room for the extension (target `Data%` after extend < 90 %).
+
+2. **Extend the LV and online-resize ext4:**
+
+ ```bash
+ ssh root@192.168.1.127 '
+ lvextend -L +1T pve/nfs-data &&
+ resize2fs /dev/pve/nfs-data
+ '
+ ```
+
+ Both commands are safe online: `lvextend` only grows allocation, `resize2fs` extends ext4 while mounted.
+
+3. **Verify:**
+
+ ```bash
+ ssh root@192.168.1.127 'lvs pve/nfs-data; df -h /srv/nfs'
+ ```
+
+ `df` should show the new size; `Use%` should drop proportionally.
+
+## Notes
+
+- **Not Terraform-managed.** PVE host LVs live outside the IaC tree (no `infra/stacks/pve-host/`). Record the new size in `docs/architecture/storage.md` (the "HDD NFS" line, the diagram label, and the "Proxmox NFS (HDD)" table row) in the same commit.
+- **Thin-pool overcommit warning** from `lvextend` is informational — it reports the sum of all thin volume virtual sizes (currently ~12 TiB) vs. the physical pool (10.7 TiB). Real fill is `pve/data` `Data%`; ignore the overcommit warning unless `Data%` itself is climbing toward 100 %.
+- **`/srv/nfs-ssd`** lives on a separate LV (`ssd/nfs-ssd-data`) backed by SSDs — the same `lvextend`/`resize2fs` pattern applies, but the source pool is `ssd/data`.
+
+## Backout
+
+ext4 cannot be shrunk while mounted, and an offline shrink means unmounting `/srv/nfs` under active workloads. Don't try to shrink `pve/nfs-data` in place; restore from snapshot, or migrate the data out and rebuild the LV instead.
diff --git a/stacks/immich/main.tf b/stacks/immich/main.tf
index ae9ea310..5a3d2413 100644
--- a/stacks/immich/main.tf
+++ b/stacks/immich/main.tf
@@ -124,6 +124,19 @@ module "nfs_ml_cache_host" {
nfs_path = "/srv/nfs-ssd/immich/machine-learning"
}
+# Read-only source for one-shot bulk imports into individual users' accounts
+# (currently: Anca's WD Elements dump, mirrored to /srv/nfs/anca-elements from
+# her Synology). Consumed only by the import Job below — NOT mounted into the
+# immich-server Deployment. PVC stays after the Job is removed so videos can
+# follow in batch 2.
+module "nfs_anca_elements_host" {
+ source = "../../modules/kubernetes/nfs_volume"
+ name = "immich-anca-elements-host"
+ namespace = kubernetes_namespace.immich.metadata[0].name
+ nfs_server = var.proxmox_host
+ nfs_path = "/srv/nfs/anca-elements"
+}
+
resource "kubernetes_namespace" "immich" {
metadata {
name = "immich"
@@ -865,6 +878,123 @@ resource "kubernetes_cron_job_v1" "postgresql-backup" {
}
}
+# One-shot bulk import of Anca's WD Elements photo archive (Synology mirror) into her
+# Immich account. Reads /srv/nfs/anca-elements via the RO PVC above and posts
+# assets to immich-server in-cluster (bypasses ingress + CrowdSec entirely).
+#
+# Auth: Anca's personal Immich API key. Add to Vault `secret/immich` under key
+# `anca_api_key`, then force-refresh the existing `immich-secrets` ExternalSecret:
+# kubectl annotate externalsecret immich-secrets -n immich \
+# force-sync=$(date +%s) --overwrite
+#
+# After successful completion: REMOVE this resource block + apply again. The
+# PVC stays for a videos batch later. Filters target a photo-only subset of
+# the dump (videos / installers / docs / courses banned); EXIF is preserved
+# end-to-end since immich-go uploads originals byte-for-byte.
+resource "kubernetes_job_v1" "anca_elements_import" {
+ metadata {
+ name = "anca-elements-import"
+ namespace = kubernetes_namespace.immich.metadata[0].name
+ labels = {
+ app = "anca-elements-import"
+ tier = local.tiers.gpu
+ }
+ }
+
+ # Don't block `terragrunt apply` on the multi-hour upload — TF returns once
+ # the Job is created; monitor via `kubectl logs -n immich -f job/...`.
+ wait_for_completion = false
+
+ spec {
+ backoff_limit = 2
+ ttl_seconds_after_finished = 604800
+ template {
+ metadata {
+ labels = {
+ app = "anca-elements-import"
+ }
+ }
+ spec {
+ restart_policy = "OnFailure"
+ container {
+ name = "immich-go"
+ image = "alpine:3.20"
+ command = [
+ "/bin/sh",
+ "-c",
+ <<-EOT
+ set -eu
+ apk add --no-cache curl tar ca-certificates >/dev/null
+
+ IMMICH_GO_VERSION="v0.31.0"
+ cd /tmp
+ echo "Downloading immich-go $${IMMICH_GO_VERSION}…"
+ curl -fsSL "https://github.com/simulot/immich-go/releases/download/$${IMMICH_GO_VERSION}/immich-go_Linux_x86_64.tar.gz" \
+ | tar -xz
+ chmod +x ./immich-go
+
+ echo "Starting upload from /data → http://immich-server.immich.svc.cluster.local:2283 …"
+ exec ./immich-go upload from-folder /data \
+ --server http://immich-server.immich.svc.cluster.local:2283 \
+ --api-key "$${IMMICH_API_KEY}" \
+ --include-extensions .jpg,.jpeg,.png,.heic,.heif,.gif,.tif,.tiff,.webp,.nef,.cr2,.dng,.raw \
+ --into-album "Poze (Elements)" \
+ --ban-file "filme/" --ban-file "Music/" --ban-file "carti/" \
+ --ban-file "cursuri/" --ban-file "Adobe.*/" \
+ --ban-file "Fullstack Web Development*/" \
+ --ban-file "Contracte and CV/" --ban-file "Cv/" \
+ --ban-file "docum/" --ban-file "finance/" \
+ --ban-file "download/" --ban-file "kit/" \
+ --ban-file "csp/" --ban-file "KOREAN/" \
+ --ban-file "System Volume Information/" \
+ --pause-immich-jobs=false \
+ --concurrent-tasks 8 \
+ --client-timeout 1h \
+ --no-ui \
+ --on-errors continue
+ EOT
+ ]
+ env {
+ name = "IMMICH_API_KEY"
+ value_from {
+ secret_key_ref {
+ name = "immich-secrets"
+ key = "anca_api_key"
+ }
+ }
+ }
+ volume_mount {
+ name = "anca-elements"
+ mount_path = "/data"
+ read_only = true
+ }
+ resources {
+ requests = {
+ cpu = "500m"
+ memory = "1Gi"
+ }
+ limits = {
+ memory = "1Gi"
+ }
+ }
+ }
+ volume {
+ name = "anca-elements"
+ persistent_volume_claim {
+ claim_name = module.nfs_anca_elements_host.claim_name
+ read_only = true
+ }
+ }
+ }
+ }
+ }
+ lifecycle {
+ # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
+ ignore_changes = [spec[0].template[0].spec[0].dns_config]
+ }
+ depends_on = [kubernetes_manifest.external_secret]
+}
+
# POWER TOOLS
# resource "kubernetes_deployment" "powertools" {