immich: bulk-import Anca's Elements photo archive into her account
Grows pve/nfs-data 3T → 4T (online lvextend + resize2fs) to absorb ~340 GB of
new originals landing under /srv/nfs/immich/upload during the import.

Adds:
- module "nfs_anca_elements_host" — RO PVC over /srv/nfs/anca-elements,
  consumed only by the import Job (not mounted in immich-server).
- kubernetes_job_v1.anca_elements_import — immich-go v0.31.0 uploader posting
  to immich-server.immich.svc:2283 with Anca's API key (synced via the
  existing immich-secrets ExternalSecret from secret/immich.anca_api_key).
  Filters to image extensions, bans the non-photo top-level dirs (filme/,
  Music/, carti/, courses, installers, docs, etc.), puts every asset in the
  album "Poze (Elements)". Default `--pause-immich-jobs` is disabled —
  non-admin keys can't pause jobs.
- docs/architecture/storage.md — note the new 4 TB size in 3 places.
- docs/runbooks/grow-pve-nfs-lv.md — captures the one-shot lvextend
  procedure (no pve-host TF stack exists for this).

Job is removed in the follow-up cleanup commit once the upload completes; the
PVC stays for a videos batch later.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 4d756be4f5
commit d6590612b2
3 changed files with 181 additions and 4 deletions
@@ -1,6 +1,6 @@
 # Storage Architecture

-Last updated: 2026-05-09
+Last updated: 2026-05-24

 ## Overview

@@ -13,7 +13,7 @@ The cluster uses two storage backends: **Proxmox CSI** for database block storag
 All services storing sensitive data were migrated to `proxmox-lvm-encrypted` on 2026-04-15. This eliminates the previous double-CoW (ZFS + LVM-thin) path and ensures data-at-rest encryption.

 **NFS storage (Proxmox host)**: ~100 NFS shares for media libraries (Immich, audiobookshelf, servarr, navidrome), backup targets (`*-backup/` directories), and app data are served directly from the Proxmox host at `192.168.1.127`. Two NFS export roots exist:
-- **HDD NFS**: `/srv/nfs` on ext4 LV `pve/nfs-data` (3TB) — bulk media and backup targets
+- **HDD NFS**: `/srv/nfs` on ext4 LV `pve/nfs-data` (4TB) — bulk media and backup targets
 - **SSD NFS**: `/srv/nfs-ssd` on ext4 LV `ssd/nfs-ssd-data` (100GB) — high-performance data (Immich ML)

 Both `StorageClass: nfs-truenas` and `StorageClass: nfs-proxmox` point to the Proxmox host and are functionally identical. The `nfs-truenas` name is historical — it was retained because StorageClass names are immutable on bound PVs (48 PVs reference it) and renaming would force mass PV churn across the cluster.
@@ -31,7 +31,7 @@ graph TB
     subgraph Proxmox["Proxmox Host (192.168.1.127)"]
         sdc["sdc: 10.7TB RAID1 HDD<br/>VG pve, LV data (thin pool)<br/>~67 proxmox-lvm PVCs<br/>~28 proxmox-lvm-encrypted PVCs"]
         sda["sda: 1.1TB RAID1 SAS<br/>VG backup, LV data (ext4)<br/>/mnt/backup"]
-        NFS_HDD["LV pve/nfs-data (3TB ext4)<br/>/srv/nfs<br/>~100 NFS shares<br/>Media + backup targets"]
+        NFS_HDD["LV pve/nfs-data (4TB ext4)<br/>/srv/nfs<br/>~100 NFS shares<br/>Media + backup targets"]
         NFS_SSD["LV ssd/nfs-ssd-data (100GB ext4)<br/>/srv/nfs-ssd<br/>High-performance data<br/>(Immich ML)"]
         NFS_Exports["NFS Exports<br/>managed by /etc/exports"]
         NFS_HDD --> NFS_Exports
@@ -74,7 +74,7 @@ graph TB
 | **Proxmox CSI plugin** | Helm chart | Namespace: proxmox-csi | Block storage via LVM-thin hotplug |
 | **StorageClass `proxmox-lvm`** | RWO, WaitForFirstConsumer | Cluster-wide | Non-sensitive stateful apps |
 | **StorageClass `proxmox-lvm-encrypted`** | RWO, WaitForFirstConsumer, LUKS2 | Cluster-wide | **All sensitive data** (databases, auth, email, passwords, git) |
-| Proxmox NFS (HDD) | LV `pve/nfs-data`, 3TB ext4 | 192.168.1.127:/srv/nfs | Bulk NFS data for all services |
+| Proxmox NFS (HDD) | LV `pve/nfs-data`, 4TB ext4 | 192.168.1.127:/srv/nfs | Bulk NFS data for all services |
 | Proxmox NFS (SSD) | LV `ssd/nfs-ssd-data`, 100GB ext4 | 192.168.1.127:/srv/nfs-ssd | High-performance data (Immich ML) |
 | nfs-csi | Helm chart | Namespace: nfs-csi | NFS CSI driver |
 | StorageClass `nfs-proxmox` | RWX, soft mount | Cluster-wide | NFS storage, points to Proxmox host |
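The commit message says the new size is noted in three places in `docs/architecture/storage.md`, and a spot like that is easy to miss on the next resize. A minimal sketch of a check that lists any line still pairing the LV name with the old size. `stale_size_refs` is a hypothetical helper name, not something in the repo, and it assumes the size is written as a literal such as `3TB` on the same line as `pve/nfs-data`:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of FILE that mention the LV together
# with the OLD size string, so a doc update that missed one spot is visible.
#   $1 = file to scan, $2 = old size literal (for example 3TB)
stale_size_refs() {
  file=$1
  old_size=$2
  # First grep keeps lines naming the LV (with line numbers); the second
  # keeps only those that still carry the old size as a fixed string.
  grep -n 'pve/nfs-data' "$file" | grep -F "$old_size" || true
}
```

Run as `stale_size_refs docs/architecture/storage.md 3TB`; empty output means no line still carries the old size next to the LV name.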
47
docs/runbooks/grow-pve-nfs-lv.md
Normal file
@@ -0,0 +1,47 @@
+# Runbook: Grow `/srv/nfs` LV (`pve/nfs-data`)
+
+Use when `/srv/nfs` on the PVE host is filling up and the workloads writing to it cannot be slimmed down. The LV sits on the LVM-thin pool `pve/data` (10.54 TB total). Thin-pool free space is the real gate — confirm before extending.
+
+## When to use
+
+- `df -h /srv/nfs` shows usage > ~85 % and projected growth exceeds free space within a backup retention window.
+- An upcoming bulk write (media import, restore) needs headroom that the current free space won't absorb.
+
+## Steps
+
+1. **Check thin-pool headroom on PVE host:**
+
+   ```bash
+   ssh root@192.168.1.127 'lvs pve/data; lvs pve/nfs-data; df -h /srv/nfs'
+   ```
+
+   The `pve/data` thin pool's `Data%` should leave room for the extension (target `Data%` after extend < 90 %).
+
+2. **Extend the LV and online-resize ext4:**
+
+   ```bash
+   ssh root@192.168.1.127 '
+     lvextend -L +1T pve/nfs-data &&
+     resize2fs /dev/pve/nfs-data
+   '
+   ```
+
+   Both commands are safe online: `lvextend` only grows allocation, `resize2fs` extends ext4 while mounted.
+
+3. **Verify:**
+
+   ```bash
+   ssh root@192.168.1.127 'lvs pve/nfs-data; df -h /srv/nfs'
+   ```
+
+   `df` should show the new size; `Use%` should drop proportionally.
+
+## Notes
+
+- **Not Terraform-managed.** PVE host LVs live outside the IaC tree (no `infra/stacks/pve-host/`). Record the new size in `docs/architecture/storage.md` (the "HDD NFS" line, the diagram label, and the "Proxmox NFS (HDD)" table row) in the same commit.
+- **Thin-pool overcommit warning** from `lvextend` is informational — it reports the sum of all thin volume virtual sizes (currently ~12 TiB) vs. the physical pool (10.7 TiB). Real fill is `pve/data` `Data%`; ignore the overcommit warning unless `Data%` itself is climbing toward 100 %.
+- **`/srv/nfs-ssd`** lives on a separate LV (`ssd/nfs-ssd-data`) backed by SSDs — the same `lvextend`/`resize2fs` pattern applies, but the source pool is `ssd/data`.
+
+## Backout
+
+ext4 cannot be shrunk while mounted, and an offline shrink would take every workload on `/srv/nfs` down for the duration. Don't try to shrink `pve/nfs-data` in place; restore from snapshot or migrate data out and rebuild the LV instead.
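Step 1 of the runbook sets a target of `Data%` below 90 % after the extension. Extending a thin LV only grows its virtual size; pool blocks are consumed as data is written, so the figure worth checking is the worst case where the whole extension eventually fills. A minimal sketch of that arithmetic, assuming the pool size and current `Data%` are read off `lvs pve/data` by hand. `projected_data_pct` is a hypothetical helper, and the numbers in the call are illustrative, not measured on the host:

```shell
#!/bin/sh
# Hypothetical helper: worst-case thin-pool Data% if every byte of the
# extension is eventually written.
#   $1 = pool size in GiB
#   $2 = current Data% as reported by `lvs`
#   $3 = GiB being added to the LV
projected_data_pct() {
  awk -v pool="$1" -v pct="$2" -v add="$3" \
    'BEGIN { printf "%.1f\n", (pool * pct / 100 + add) / pool * 100 }'
}

# Illustrative numbers only: a 10000 GiB pool at 60 % plus a 1024 GiB extension.
projected_data_pct 10000 60 1024
```

If the printed value is at or above 90, the runbook's own threshold says to free pool space before running `lvextend`.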
@@ -124,6 +124,19 @@ module "nfs_ml_cache_host" {
   nfs_path   = "/srv/nfs-ssd/immich/machine-learning"
 }

+# Read-only source for one-shot bulk imports into individual users' accounts
+# (currently: Anca's WD Elements dump, mirrored to /srv/nfs/anca-elements from
+# her Synology). Consumed only by the import Job below — NOT mounted into the
+# immich-server Deployment. PVC stays after the Job is removed so videos can
+# follow in batch 2.
+module "nfs_anca_elements_host" {
+  source     = "../../modules/kubernetes/nfs_volume"
+  name       = "immich-anca-elements-host"
+  namespace  = kubernetes_namespace.immich.metadata[0].name
+  nfs_server = var.proxmox_host
+  nfs_path   = "/srv/nfs/anca-elements"
+}
+
 resource "kubernetes_namespace" "immich" {
   metadata {
     name = "immich"
@@ -865,6 +878,123 @@ resource "kubernetes_cron_job_v1" "postgresql-backup" {
   }
 }

+# One-shot bulk import of Anca's Synology Elements photo archive into her
+# Immich account. Reads /srv/nfs/anca-elements via the RO PVC above and posts
+# assets to immich-server in-cluster (bypasses ingress + CrowdSec entirely).
+#
+# Auth: Anca's personal Immich API key. Add to Vault `secret/immich` under key
+# `anca_api_key`, then force-refresh the existing `immich-secrets` ExternalSecret:
+#   kubectl annotate externalsecret immich-secrets -n immich \
+#     force-sync=$(date +%s) --overwrite
+#
+# After successful completion: REMOVE this resource block + apply again. The
+# PVC stays for a videos batch later. Filters target a photo-only subset of
+# the dump (videos / installers / docs / courses banned); EXIF is preserved
+# end-to-end since immich-go uploads originals byte-for-byte.
+resource "kubernetes_job_v1" "anca_elements_import" {
+  metadata {
+    name      = "anca-elements-import"
+    namespace = kubernetes_namespace.immich.metadata[0].name
+    labels = {
+      app  = "anca-elements-import"
+      tier = local.tiers.gpu
+    }
+  }
+
+  # Don't block `terragrunt apply` on the multi-hour upload — TF returns once
+  # the Job is created; monitor via `kubectl logs -n immich -f job/...`.
+  wait_for_completion = false
+
+  spec {
+    backoff_limit              = 2
+    ttl_seconds_after_finished = 604800
+    template {
+      metadata {
+        labels = {
+          app = "anca-elements-import"
+        }
+      }
+      spec {
+        restart_policy = "OnFailure"
+        container {
+          name  = "immich-go"
+          image = "alpine:3.20"
+          command = [
+            "/bin/sh",
+            "-c",
+            <<-EOT
+              set -eu
+              apk add --no-cache curl tar ca-certificates >/dev/null
+
+              IMMICH_GO_VERSION="v0.31.0"
+              cd /tmp
+              echo "Downloading immich-go $${IMMICH_GO_VERSION}…"
+              curl -sL "https://github.com/simulot/immich-go/releases/download/$${IMMICH_GO_VERSION}/immich-go_Linux_x86_64.tar.gz" \
+                | tar -xz
+              chmod +x ./immich-go
+
+              echo "Starting upload from /data → http://immich-server.immich.svc.cluster.local:2283 …"
+              exec ./immich-go upload from-folder /data \
+                --server http://immich-server.immich.svc.cluster.local:2283 \
+                --api-key "$${IMMICH_API_KEY}" \
+                --include-extensions .jpg,.jpeg,.png,.heic,.heif,.gif,.tif,.tiff,.webp,.nef,.cr2,.dng,.raw \
+                --into-album "Poze (Elements)" \
+                --ban-file "filme/" --ban-file "Music/" --ban-file "carti/" \
+                --ban-file "cursuri/" --ban-file "Adobe.*/" \
+                --ban-file "Fullstack Web Development*/" \
+                --ban-file "Contracte and CV/" --ban-file "Cv/" \
+                --ban-file "docum/" --ban-file "finance/" \
+                --ban-file "download/" --ban-file "kit/" \
+                --ban-file "csp/" --ban-file "KOREAN/" \
+                --ban-file "System Volume Information/" \
+                --pause-immich-jobs=false \
+                --concurrent-tasks 8 \
+                --client-timeout 1h \
+                --no-ui \
+                --on-errors continue
+            EOT
+          ]
+          env {
+            name = "IMMICH_API_KEY"
+            value_from {
+              secret_key_ref {
+                name = "immich-secrets"
+                key  = "anca_api_key"
+              }
+            }
+          }
+          volume_mount {
+            name       = "anca-elements"
+            mount_path = "/data"
+            read_only  = true
+          }
+          resources {
+            requests = {
+              cpu    = "500m"
+              memory = "1Gi"
+            }
+            limits = {
+              memory = "1Gi"
+            }
+          }
+        }
+        volume {
+          name = "anca-elements"
+          persistent_volume_claim {
+            claim_name = module.nfs_anca_elements_host.claim_name
+            read_only  = true
+          }
+        }
+      }
+    }
+  }
+  lifecycle {
+    # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
+    ignore_changes = [spec[0].template[0].spec[0].dns_config]
+  }
+  depends_on = [kubernetes_manifest.external_secret]
+}
+
 # POWER TOOLS

 # resource "kubernetes_deployment" "powertools" {
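The Job above combines an extension allow-list with a set of banned directories, and the size of the resulting upload is what the extra 1 TB was provisioned for. A rough pre-flight count can be taken on the source tree before applying. This is only a sketch under stated assumptions: `count_candidate_photos` is a hypothetical helper, it prunes just three of the banned names and only at the top level, it checks a subset of the extensions, and immich-go's own matching rules may differ from `find`'s:

```shell
#!/bin/sh
# Hypothetical pre-flight helper: count files under DIR whose extension is in
# a photo allow-list, skipping a few top-level directories. This approximates
# the Job's filters; it does not reproduce immich-go's matching exactly.
#   $1 = root of the source tree (for example /srv/nfs/anca-elements)
count_candidate_photos() {
  dir=$1
  find "$dir" \
    \( -path "$dir/filme" -o -path "$dir/Music" -o -path "$dir/carti" \) -prune \
    -o -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \
                  -o -iname '*.heic' -o -iname '*.tif' -o -iname '*.tiff' \) \
    -print | wc -l | tr -d ' '
}
```

Comparing this count with the number of assets that end up in the "Poze (Elements)" album gives a coarse sanity check once the Job finishes; differences are expected wherever the approximated filters and the real ones diverge.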