## Context `modules/kubernetes/nfs_volume` creates the K8s PV but NOT the underlying directory on the Proxmox NFS host (`192.168.1.127:/srv/nfs/<subdir>`). The first time a new consumer is added, the mount fails with `mount.nfs: … No such file or directory` and the pod hangs in ContainerCreating. This bit us twice during the Wave 1/2 rollout — once for the mailserver backup (code-z26) and again for the Roundcube backup (code-1f6). Both times the fix was `ssh root@192.168.1.127 'mkdir -p /srv/nfs/<subdir>'`. Rather than automate the SSH dependency into the module (which would break hermeticity and fail for operators without host SSH), this runbook documents the manual bootstrap step and the rationale. Addresses bd code-yo4. ## This change New file: `docs/runbooks/nfs-prerequisites.md`. Lists known consumers, gives the copy-paste SSH command, and explains why auto-creation was rejected (two options, neither worth the churn). ## What is NOT in this change - Any automation of the bootstrap — runbook only - Migration to `nfs-subdir-external-provisioner` — explicitly out of scope ## Test Plan ### Automated ``` $ cat docs/runbooks/nfs-prerequisites.md | head -5 # NFS Prerequisites for `modules/kubernetes/nfs_volume` The `nfs_volume` Terraform module creates a `PersistentVolume` pointing at a path on the Proxmox NFS server (`192.168.1.127`). It does **not** create the underlying directory on the server. ``` ### Manual Verification Before the next stack adds a new `nfs_volume` consumer, read the runbook and run the `ssh root@192.168.1.127 'mkdir -p ...'` step. First pod reaches Ready within a minute of the PV creation. Closes: code-yo4 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2.8 KiB
NFS Prerequisites for modules/kubernetes/nfs_volume
The nfs_volume Terraform module creates a PersistentVolume pointing at a
path on the Proxmox NFS server (192.168.1.127). It does not create the
underlying directory on the server.
If the path does not exist, the first pod that tries to mount the resulting
PVC gets stuck in ContainerCreating with the kubelet event:
MountVolume.SetUp failed for volume "<name>" : mount failed: exit status 32
mount.nfs: mounting 192.168.1.127:/srv/nfs/<path> failed, reason given by
server: No such file or directory
Bootstrap before first apply
Before adding a new nfs_volume consumer (backup CronJob, data PV, etc.),
create the export root on the PVE host:
# Replace <app> with the backup stack name, e.g. mailserver-backup,
# roundcube-backup, immich-backup, etc.
ssh root@192.168.1.127 'mkdir -p /srv/nfs/<app> && chmod 755 /srv/nfs/<app>'
# Confirm exports are live (no change to /etc/exports needed — `/srv/nfs`
# is already exported via the root entry in pve-nfs-exports).
ssh root@192.168.1.127 exportfs -v | grep '/srv/nfs\b'
/srv/nfs is exported with the root entry. Subdirectories inherit the
export automatically; they just have to exist on disk.
Known consumers
| Consumer | NFS path | Owning stack |
|---|---|---|
mailserver-backup |
/srv/nfs/mailserver-backup |
stacks/mailserver/ |
roundcube-backup |
/srv/nfs/roundcube-backup |
stacks/mailserver/ |
mysql-backup |
/srv/nfs/mysql-backup |
stacks/dbaas/ |
postgresql-backup |
/srv/nfs/postgresql-backup |
stacks/dbaas/ |
vaultwarden-backup |
/srv/nfs/vaultwarden-backup |
stacks/vaultwarden/ |
Use grep -rn 'nfs_volume' infra/stacks/ to find all active consumers.
Why not auto-create?
Two options were considered for automating this:
null_resource+local-execSSHmkdirin thenfs_volumemodule — works but adds an SSH dependency to every Terraform run, makes the module non-hermetic, and fails if the operator does not have SSH to the PVE host.nfs-subdir-external-provisioner— handles subdirs automatically but changes the PV/PVC shape and would require migrating all existing consumers.
Neither is worth the churn for a one-time operation per new backup stack. Document + checklist is the current call; re-evaluate if we start adding one NFS consumer per week.
Related tasks
code-yo4— this runbookcode-z26— mailserver backup CronJob (first-time setup hit this)code-1f6— Roundcube backup CronJob (also hit this)