backup pipeline: prune sda-bypass list to immich-only

Previously /srv/nfs/{ollama,audiblez,ebook2audiobook,*-backup} took
the sdc → Synology direct leg. They now ride sdc → sda → Synology
pve-backup/ via nfs-mirror like every other NFS subtree, so sda
becomes the single canonical mirror and Synology only has to ingest
one feed for the bulk of cluster state.

frigate + temp dropped from BOTH legs (no backup anywhere) per
explicit user ask — frigate is a 14d camera ring, temp is scratch.
prometheus/loki/alertmanager dropped as no-op (orphan dirs that
no longer exist on /srv/nfs).

Also: nfs-mirror's manifest collection switched from find -newer
(mtime) to find -cnewer (ctime) — rsync -t preserves source mtime
on dest, so freshly-written files looked "older than $STAMP" and
the 2026-05-26 full mirror run captured only 2 of 800k transferred
files. Hit during this session, recovered via .force-full-sync.
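
The mtime/ctime difference described above can be reproduced in isolation. This is a minimal sketch, not the nfs-mirror code: it assumes a scratch directory from `mktemp`, and it uses `cp -p` as a stand-in for `rsync -t` (both preserve the source mtime on the destination). The file names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d)
mkdir "$work/src" "$work/dst"

# Source file whose content was last modified long ago.
echo data > "$work/src/old.txt"
touch -t 202001010000 "$work/src/old.txt"

# Marker touched BEFORE the copy, as nfs-mirror does with $STAMP.
touch "$work/stamp"
sleep 1   # keep the copy strictly later than the marker on coarse-timestamp filesystems

# cp -p preserves the source mtime on the destination (stand-in for rsync -t).
cp -p "$work/src/old.txt" "$work/dst/old.txt"

# -newer compares mtime: the copied file still carries its 2020 mtime.
newer_count=$(find "$work/dst" -type f -newer "$work/stamp" | wc -l)
# -cnewer compares ctime: the inode was written just now, after the marker.
cnewer_count=$(find "$work/dst" -type f -cnewer "$work/stamp" | wc -l)

echo "-newer found: $newer_count, -cnewer found: $cnewer_count"
rm -rf "$work"
```

On a typical Linux filesystem the `-newer` count should be 0 and the `-cnewer` count 1, which is the same shape as the near-empty manifest seen in the full mirror run.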

Operational result post-rollout:
- sda 87% → 70% (anca-elements 423G deleted, +260G new dirs)
- /Viki/nfs/ on Synology: was 24 stale dirs (~430G), now immich only
- Synology free: ~300G → ~430G+ once btrfs reclaim catches up

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Viktor Barzin 2026-05-26 18:22:01 +00:00
parent b3dcccfc41
commit 41fb7c4a76
3 changed files with 98 additions and 92 deletions


@ -1,11 +1,28 @@
# Backup & Disaster Recovery Architecture # Backup & Disaster Recovery Architecture
Last updated: 2026-05-24 Last updated: 2026-05-26
> **2026-05-24 session — what changed today** (deeper structural review pending — see the open backup-pipeline simplification audit): > **2026-05-26 — bypass list pruned to a single path** (follow-up to the
> 2026-05-24 changes below):
> - `nfs-mirror` now copies ollama, audiblez, ebook2audiobook, and every
> `*-backup` CronJob output onto sda. Previously these went sdc → Synology
> DIRECT via Step 2; now they ride leg 1 like everything else.
> - **Bypass list (leg 2)** is now just `/srv/nfs/immich/` — too big for sda
> (1.5 T), no other choice.
> - **frigate and temp**: dropped from BOTH legs — intentionally not backed up.
> frigate is a 14-day camera ring, temp is scratch space. User explicit ask
> 2026-05-26.
> - **prometheus, loki, alertmanager**: live-orphan dirs that no longer
> exist on `/srv/nfs`. Dropped from the exclude/include lists as no-ops.
> - `/mnt/backup/anca-elements` (423 G) deleted — canonical copy lives in
> Immich since the 2026-05-24 ingest.
> - Aftermath: sda 87% → 46% used; Synology `/Viki/nfs/` shrinks to
> immich-only on next monthly `--delete` pass (or manual cleanup —
> see runbook).
>
> **2026-05-24 session — what changed**:
> - **anca-elements archive direction inverted** — Synology `/Backup/Anca/Elements` (770G) deleted; PVE `/srv/nfs/anca-elements` is now source of truth. `anca-elements-sync.sh` retired. > - **anca-elements archive direction inverted** — Synology `/Backup/Anca/Elements` (770G) deleted; PVE `/srv/nfs/anca-elements` is now source of truth. `anca-elements-sync.sh` retired.
> - **`anca-elements-mirror.{sh,service,timer}` retired**, subsumed into the new **`nfs-mirror`** weekly job covering all critical NFS subtrees (anca-elements + ~80 services) → sda. > - **`anca-elements-mirror.{sh,service,timer}` retired**, subsumed into the new **`nfs-mirror`** weekly job covering all critical NFS subtrees (anca-elements + ~80 services) → sda.
> - **`offsite-sync-backup` Step 2 filter inverted**: NFS-direct-to-Synology now only carries the sda-bypass paths (immich + frigate + prometheus + `*-backup` + …). Two-leg invariant: `nfs-mirror.sh EXCLUDES` = `offsite-sync-backup Step 2 INCLUDES`. Cross-referenced in both scripts.
> - **Synology `/Backup/Viki/nfs/<svc>/` orphan cleanup** — 84 dirs renamed in-place (btrfs metadata-only) to `/Backup/Viki/pve-backup/<svc>/` so daily-incremental Step 1 sees them as pre-existing and only ships deltas. No re-transfer. > - **Synology `/Backup/Viki/nfs/<svc>/` orphan cleanup** — 84 dirs renamed in-place (btrfs metadata-only) to `/Backup/Viki/pve-backup/<svc>/` so daily-incremental Step 1 sees them as pre-existing and only ships deltas. No re-transfer.
> - **Synology snapshot retention 7d → 3d**, all 8 backlog snapshots deleted via `sudo synosharesnapshot delete Backup ...`. Reclaimed ~800G btrfs (98% → 83% used). DSM API was blocked by 2FA; `sudo` over the existing `Administrator` SSH key worked with the Vault-stored password. > - **Synology snapshot retention 7d → 3d**, all 8 backlog snapshots deleted via `sudo synosharesnapshot delete Backup ...`. Reclaimed ~800G btrfs (98% → 83% used). DSM API was blocked by 2FA; `sudo` over the existing `Administrator` SSH key worked with the Vault-stored password.
> - **Manifest mechanism extended**: `nfs-mirror` now appends its transferred file list to `/mnt/backup/.changed-files` so daily Step 1 incremental picks it up (was previously only fed by `daily-backup`). > - **Manifest mechanism extended**: `nfs-mirror` now appends its transferred file list to `/mnt/backup/.changed-files` so daily Step 1 incremental picks it up (was previously only fed by `daily-backup`).
@ -16,19 +33,19 @@ The homelab runs a 3-2-1 strategy with a **two-leg** path to Synology so every N
``` ```
sdc /srv/nfs/<svc>/ ──nfs-mirror weekly──→ sda /mnt/backup/<svc>/ ──offsite-sync Step 1──→ Synology /Backup/Viki/pve-backup/<svc>/ [leg 1] sdc /srv/nfs/<svc>/ ──nfs-mirror weekly──→ sda /mnt/backup/<svc>/ ──offsite-sync Step 1──→ Synology /Backup/Viki/pve-backup/<svc>/ [leg 1]
sdc /srv/nfs/<bypass>/ ──inotify (nfs-change-tracker)──→ offsite-sync Step 2 ──→ Synology /Backup/Viki/nfs/<bypass>/ [leg 2] sdc /srv/nfs/immich/ ──inotify (nfs-change-tracker)──→ offsite-sync Step 2 ──→ Synology /Backup/Viki/nfs/immich/ [leg 2]
sdc PVCs (LVM thin) ──daily-backup~snapshot~rsync──→ sda /mnt/backup/{pvc-data,sqlite-backup,pfsense,pve-config}/ ──Step 1──→ Synology /Backup/Viki/pve-backup/ sdc PVCs (LVM thin) ──daily-backup~snapshot~rsync──→ sda /mnt/backup/{pvc-data,sqlite-backup,pfsense,pve-config}/ ──Step 1──→ Synology /Backup/Viki/pve-backup/
``` ```
The **bypass list** (paths that take leg 2 — too big for sda, transient, or already-a-backup): `immich`, `frigate`, `prometheus`, `loki`, `temp`, `alertmanager`, `ollama`, `audiblez`, `ebook2audiobook`, `*-backup`. Anything NOT in this list rides leg 1 via `nfs-mirror`. The **bypass list** (leg 2) is just `/srv/nfs/immich/` — too big for sda (1.5 T). **Not backed up at all**: `/srv/nfs/frigate/` (camera ring buffer), `/srv/nfs/temp/` (scratch). Everything else rides leg 1 via `nfs-mirror`.
**3-2-1 Breakdown**: **3-2-1 Breakdown**:
- **Copy 1** (live): all PVC data + VM disks on Proxmox sdc thin pool (10.7TB RAID1 HDD); all NFS data at `/srv/nfs[-ssd]/` - **Copy 1** (live): all PVC data + VM disks on Proxmox sdc thin pool (10.7TB RAID1 HDD); all NFS data at `/srv/nfs[-ssd]/`
- **Copy 2** (local backup): sda `/mnt/backup` (1.1TB RAID1 SAS) — at **~90% used** post-2026-05-24 (was ~10% in April) - **Copy 2** (local backup): sda `/mnt/backup` (1.1TB RAID1 SAS) — **46% used** post-2026-05-26 (was 87% before anca-elements cleanup; bypass-list pruning added ~260 G of *-backup + ollama + audiblez + ebook2audiobook)
- **Copy 3** (offsite): Synology NAS at 192.168.1.13 — at **~83% used / 934G free** post-2026-05-24 (was 98% / 121G before today's cleanup) - **Copy 3** (offsite): Synology NAS at 192.168.1.13
- `Synology/Backup/Viki/pve-backup/` — sda contents (PVC backups + nfs-mirror output: ~90 service dirs) - `Synology/Backup/Viki/pve-backup/` — sda contents (PVC backups + nfs-mirror output: ~90 service dirs, now also includes ollama/audiblez/ebook2audiobook/*-backup)
- `Synology/Backup/Viki/nfs/`bypass-list NFS (immich, frigate, etc.) - `Synology/Backup/Viki/nfs/`immich only (post-2026-05-26)
- `Synology/Backup/Viki/nfs-ssd/`bypass-list SSD NFS (immich-ML, ollama, llamacpp) - `Synology/Backup/Viki/nfs-ssd/`full SSD NFS (immich-ML, ollama, llamacpp); SSD has no sda-mirror leg, so all three go direct
## Architecture Diagram ## Architecture Diagram
@ -346,35 +363,33 @@ Two-step offsite sync:
#### Step 2: sda-bypass NFS to Synology nfs/ + nfs-ssd/ (inotify change-tracked, FILTERED) #### Step 2: sda-bypass NFS to Synology nfs/ + nfs-ssd/ (inotify change-tracked, FILTERED)
**Role**: Only carries paths that **bypass sda** — i.e., paths the nfs-mirror script explicitly skips (immich, frigate, prometheus, *-backup, …). Paths that ARE on sda reach Synology via Step 1 and are explicitly excluded from Step 2 to prevent double-syncing. The Step 2 INCLUDE list MUST stay in sync with nfs-mirror's `EXCLUDES` — they are complementary. **Role**: Carries the single path that bypasses sda — `/srv/nfs/immich/` (1.5 T, doesn't fit on sda). Plus the full `/srv/nfs-ssd/` (immich-ML + ollama + llamacpp; the SSD has no sda-mirror leg). Everything else under `/srv/nfs/` rides leg 1.
**Method**: `rsync --files-from /mnt/backup/.nfs-changes.log` with regex filter `^/srv/nfs/(immich|frigate|prometheus|loki|temp|alertmanager|ollama|audiblez|ebook2audiobook|[^/]+-backup)/`. The monthly full sync uses `--include='/<bypass-path>/***' … --exclude='*'` to limit to the same set. `nfs-ssd/` (all of immich-ML / ollama / llamacpp) is entirely bypass-list, so a plain `--delete` still applies. **Method**: `rsync --files-from /mnt/backup/.nfs-changes.log` with regex filter `^/srv/nfs/immich/`. The monthly full sync uses `--include='/immich/***' --exclude='*'` for the HDD leg, and a plain `--delete` for the SSD leg.
**Change tracking**: `nfs-change-tracker.service` (systemd, inotifywait) on PVE host watches `/srv/nfs` and `/srv/nfs-ssd` continuously. Changed file paths are logged to `/mnt/backup/.nfs-changes.log`. Step 2 reads this log and transfers only changed files matching the bypass regex. Incremental syncs complete in seconds. **Change tracking**: `nfs-change-tracker.service` (systemd, inotifywait) on PVE host watches `/srv/nfs` and `/srv/nfs-ssd` continuously. Changed file paths are logged to `/mnt/backup/.nfs-changes.log`. Step 2 reads this log and transfers only changed files matching the bypass regex. Incremental syncs complete in seconds.
**Monthly full sync**: On 1st Sunday of month, runs `rsync --delete` with the bypass-only include list for cleanup. **Monthly full sync**: On 1st Sunday of month, runs `rsync --delete` with the immich-only include list. The `--delete` pass also reaps any stale Synology `/Viki/nfs/<dir>/` from the broader pre-2026-05-26 bypass list (ollama, audiblez, ebook2audiobook, *-backup, frigate, prometheus, loki, temp, alertmanager).
**`/srv/nfs/anca-elements/` history**: had its own dedicated Synology exclusion line earlier in 2026-05-24 because the original Synology source (`/volume1/Backup/Anca/Elements`) was being preserved while we moved canonical to PVE. After the original was deleted (same day), anca-elements joined the broader "NOT bypassing sda" category and is covered by Step 1 via `nfs-mirror`. **`/srv/nfs/anca-elements/` history**: had its own dedicated Synology exclusion line earlier in 2026-05-24 because the original Synology source (`/volume1/Backup/Anca/Elements`) was being preserved while we moved canonical to PVE. After the original was deleted (same day), anca-elements joined the broader "NOT bypassing sda" category and is covered by Step 1 via `nfs-mirror`.
**Layer 3a: NFS local mirror on sda (3-2-1 second copy)**: `/usr/local/bin/nfs-mirror` rsyncs the *critical* subset of `/srv/nfs/` → `/mnt/backup/<service>/` weekly (Mon 04:00). Single rsync invocation, single destination. The skip-list (in `nfs-mirror.sh` `EXCLUDES`) drops paths that don't justify a second local copy: **Layer 3a: NFS local mirror on sda (3-2-1 second copy)**: `/usr/local/bin/nfs-mirror` rsyncs `/srv/nfs/` → `/mnt/backup/<service>/` weekly (Mon 04:00). Single rsync invocation, single destination. As of 2026-05-26 the skip-list (in `nfs-mirror.sh` `EXCLUDES`) is intentionally minimal:
- **immich** (1.2T) — too big for sda; Synology offsite is the only 2nd copy by design - **immich** (1.5 T) — too big for sda; ships sdc → Synology direct (leg 2)
- **frigate** (camera recordings, 14d auto-rotate) - **frigate** (camera ring buffer) — intentionally NOT backed up
- **prometheus**, **loki** (TSDB + logs — rebuildable / policy-driven retention) - **temp** (scratch) — intentionally NOT backed up
- **ollama**, **llamacpp**, **audiblez**, **ebook2audiobook** (re-downloadable / regenerable) - **anca-elements** (legacy) — now in Immich; `/mnt/backup/anca-elements` deleted 2026-05-26
- **temp**, **alertmanager** (transient state) - **/srv/nfs-ssd** entirely — its three dirs (immich-ML, ollama, llamacpp) all ship direct to Synology nfs-ssd/
- **`*-backup`** (CronJob outputs — these ARE backups; backing up the backup is meta)
- **/srv/nfs-ssd** entirely (after the SSD skips above, residual is ~0)
Everything else under `/srv/nfs/` (anca-elements + ~30 critical service NFS subtrees: mysql, postgresql, nextcloud, health, real-estate-crawler, audiobookshelf, servarr, technitium, openclaw, ...) lands at `/mnt/backup/<svc>/`. Total mirror size ≈ 900 GB (mostly anca-elements at 770G). Everything else under `/srv/nfs/` — mysql, postgresql, nextcloud, health, real-estate-crawler, audiobookshelf, servarr, technitium, openclaw, ollama (HDD), audiblez, ebook2audiobook, every `*-backup` CronJob output, … — lands at `/mnt/backup/<svc>/`. Mirror size ≈ 400 GB post-2026-05-26 (was ~900 GB with anca-elements).
Pushes `nfs_mirror_last_run_timestamp` + `nfs_mirror_last_status` + `nfs_mirror_bytes` to Pushgateway. Alerts: `NfsMirrorStale` (>16d), `NfsMirrorFailing` (status != 0). `rsync -rlt --delete -H --no-perms --no-owner --no-group`; idempotent. Nice=10, IOSchedulingClass=idle (won't compete with foreground IO). Pushes `nfs_mirror_last_run_timestamp` + `nfs_mirror_last_status` + `nfs_mirror_bytes` to Pushgateway. Alerts: `NfsMirrorStale` (>16d), `NfsMirrorFailing` (status != 0). `rsync -rlt --delete -H --no-perms --no-owner --no-group`; idempotent. Nice=10, IOSchedulingClass=idle (won't compete with foreground IO).
> History: `anca-elements-mirror.{sh,service,timer}` was a precursor (2026-05-24 morning) dedicated to /srv/nfs/anca-elements only. Subsumed by `nfs-mirror` later the same day to consolidate ad-hoc copy scripts into one. > History: `anca-elements-mirror.{sh,service,timer}` was a precursor (2026-05-24 morning) dedicated to /srv/nfs/anca-elements only. Subsumed by `nfs-mirror` later the same day to consolidate ad-hoc copy scripts into one.
**Destination**: **Destination**:
- `Synology/Backup/Viki/nfs/`mirrors `/srv/nfs` - `Synology/Backup/Viki/nfs/`immich only (post-2026-05-26)
- `Synology/Backup/Viki/nfs-ssd/` — mirrors `/srv/nfs-ssd` - `Synology/Backup/Viki/nfs-ssd/` — mirrors `/srv/nfs-ssd` (immich-ML, ollama, llamacpp)
**Monitoring**: Pushes `offsite_backup_sync_last_success_timestamp` to Pushgateway. Alerts: `OffsiteBackupSyncStale` (>8d), `OffsiteBackupSyncFailing`. **Monitoring**: Pushes `offsite_backup_sync_last_success_timestamp` to Pushgateway. Alerts: `OffsiteBackupSyncStale` (>8d), `OffsiteBackupSyncFailing`.
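
The Step 2 incremental filter described in the Method paragraph can be sketched as follows. This is an illustration under stated assumptions, not the script itself: the change-log entries are hypothetical sample paths, and the real script additionally checks `[ -f "$f" ]` before emitting each path, which is skipped here because the sample files do not exist.

```shell
#!/usr/bin/env bash
# Same regex the diff sets for the immich-only bypass leg.
NFS_SDA_BYPASS_RE='^/srv/nfs/immich/'

# Hypothetical inotify change-log contents (one absolute path per line).
changes=$(printf '%s\n' \
  /srv/nfs/immich/library/a.jpg \
  /srv/nfs/ollama/models/x.bin \
  /srv/nfs/immich-extra/b.jpg \
  /srv/nfs-ssd/immich/ml/cache.bin \
  /srv/nfs/immich/upload/c.mp4)

# Keep only bypass-leg paths, then strip the /srv/nfs/ prefix so the
# list is relative to the rsync source dir, as --files-from expects.
matched=$(printf '%s\n' "$changes" | grep -E "$NFS_SDA_BYPASS_RE" |
  while IFS= read -r f; do echo "${f#/srv/nfs/}"; done)

printf '%s\n' "$matched"
```

The trailing slash in the regex matters: it keeps a sibling directory such as the hypothetical `immich-extra` and anything under `/srv/nfs-ssd/` out of the HDD leg, so the SSD paths are only picked up by their own `^/srv/nfs-ssd/` grep.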


@ -13,20 +13,21 @@
# destination layout (anca-elements lives at /mnt/backup/anca-elements/), # destination layout (anca-elements lives at /mnt/backup/anca-elements/),
# but now covers every other critical NFS subtree in one pass. # but now covers every other critical NFS subtree in one pass.
# #
# SKIP-LIST rationale (paths NOT mirrored — Synology offsite still covers them): # SKIP-LIST rationale (2026-05-26 simplification — see commit notes):
# immich — 1.2T, doesn't fit on sda; Synology only by design # immich — 1.5T, doesn't fit on sda; offsite-sync ships it direct to Synology
# frigate — 14d camera ring, auto-rotates # frigate — camera ring buffer; intentionally NOT backed up anywhere
# prometheus — TSDB, rebuildable from cluster state # temp — scratch; intentionally NOT backed up
# loki — log retention is a policy choice, not durable data
# temp — scratch
# alertmanager — transient state
# ollama — LLM model weights, re-downloadable
# audiblez — re-fetchable from Audible
# ebook2audiobook — regenerable from book sources
# *-backup — CronJob output (these ARE backups; backing them up is meta)
# #
# Note: /srv/nfs-ssd is intentionally NOT mirrored — after skipping immich # Everything else (ollama, audiblez, ebook2audiobook, *-backup, …) now
# (47G), ollama (59G), and llamacpp (26G) there's effectively zero residual. # flows sdc → sda (this script) → Synology pve-backup/ via offsite-sync
# Step 1. Previously they went sdc → Synology DIRECT via Step 2; the
# bypass list got pruned to just `immich` so we have a single canonical
# mirror at sda. Prometheus/loki/alertmanager were live-orphan entries
# that no longer exist on /srv/nfs (cleaned 2026-05-26) — dropped from
# the exclude list as a no-op.
#
# Note: /srv/nfs-ssd is intentionally NOT mirrored — its three dirs
# (immich, ollama, llamacpp) all go direct to Synology nfs-ssd/.
set -euo pipefail set -euo pipefail
@ -57,27 +58,15 @@ EXCLUDES=(
--exclude='/.lv-pvc-mapping.json' --exclude='/.lv-pvc-mapping.json'
--exclude='/.nfs-changes.log' --exclude='/.nfs-changes.log'
# ---- anca-elements: photos are being ingested into Immich (2026-05-24), # ---- anca-elements: now in Immich (canonical), /mnt/backup copy deleted
# so /srv/nfs/immich/library/ becomes the canonical copy and the separate # 2026-05-26. Kept in excludes so nfs-mirror doesn't re-populate from sdc
# anca-elements tree is redundant. Excluded from nfs-mirror going forward. # if /srv/nfs/anca-elements is ever re-attached.
# The historical 771G at /mnt/backup/anca-elements/ stays put until manual
# cleanup once Immich ingest completes; offsite-sync Step 1 also excludes
# it from the Synology pve-backup/ upload so we don't ship the redundant copy.
--exclude='/anca-elements/' --exclude='/anca-elements/'
# ---- NFS paths: too big / transient / re-fetchable ---- # ---- NFS paths intentionally NOT backed up ----
--exclude='/immich/' --exclude='/immich/' # 1.5T — ships sdc → Synology direct (Step 2)
--exclude='/frigate/' --exclude='/frigate/' # ring buffer — no backup anywhere
--exclude='/prometheus/' --exclude='/temp/' # scratch — no backup anywhere
--exclude='/loki/'
--exclude='/temp/'
--exclude='/alertmanager/'
--exclude='/ollama/'
--exclude='/audiblez/'
--exclude='/ebook2audiobook/'
# ---- *-backup CronJob outputs (don't back up backups) ----
--exclude='/*-backup/'
# ---- Synology / Windows / macOS cruft ---- # ---- Synology / Windows / macOS cruft ----
--exclude='/@eaDir/' --exclude='/@eaDir/'
@ -130,7 +119,7 @@ mountpoint -q /mnt/backup || { log "FATAL: /mnt/backup not mounted"; push_metric
[ -d "$SRC" ] || { log "FATAL: source $SRC missing"; push_metrics 1 0; exit 1; } [ -d "$SRC" ] || { log "FATAL: source $SRC missing"; push_metrics 1 0; exit 1; }
log "=== mirror starting: $SRC → $DST ===" log "=== mirror starting: $SRC → $DST ==="
log "skip: immich, frigate, prometheus, loki, ollama, audiblez, *-backup, temp" log "skip: immich (Synology direct), frigate (no backup), temp (no backup), anca-elements"
# Marker file used to identify files written by this rsync run, so we can append # Marker file used to identify files written by this rsync run, so we can append
# their paths to the offsite-sync manifest. Touch BEFORE rsync; `find -newer` AFTER. # their paths to the offsite-sync manifest. Touch BEFORE rsync; `find -newer` AFTER.
@ -149,7 +138,13 @@ DST_BYTES=$(df -B1 --output=used /mnt/backup | tail -1)
if [ "$RSYNC_RC" -eq 0 ]; then if [ "$RSYNC_RC" -eq 0 ]; then
# Capture files that rsync created/modified and feed them to the offsite-sync # Capture files that rsync created/modified and feed them to the offsite-sync
# manifest so daily Step 1 incremental picks them up tomorrow morning. # manifest so daily Step 1 incremental picks them up tomorrow morning.
NEW_COUNT=$(find /mnt/backup -newer "$STAMP" -type f \ # Use -cnewer (ctime), not -newer (mtime): rsync -t preserves SOURCE mtime
# on the dest, so freshly-written files with old source mtime look "older"
# than $STAMP and -newer misses them. ctime is set when the inode is written,
# regardless of -t, so it correctly identifies what this run created.
# (Bug hit 2026-05-26 full bypass-list mirror: 800k files copied, manifest
# captured only 2 entries → forced a .force-full-sync to recover.)
NEW_COUNT=$(find /mnt/backup -cnewer "$STAMP" -type f \
! -path '/mnt/backup/.changed-files' \ ! -path '/mnt/backup/.changed-files' \
! -path '/mnt/backup/.changed-files.lock' \ ! -path '/mnt/backup/.changed-files.lock' \
! -path '/mnt/backup/.lv-pvc-mapping.json' \ ! -path '/mnt/backup/.lv-pvc-mapping.json' \


@ -76,8 +76,8 @@ if [ "${DAY_OF_MONTH}" -le 7 ] || [ -n "${FORCE_FULL}" ]; then
elif [ -s "${MANIFEST}" ]; then elif [ -s "${MANIFEST}" ]; then
MANIFEST_LINES=$(wc -l < "${MANIFEST}") MANIFEST_LINES=$(wc -l < "${MANIFEST}")
log "Incremental sync (${MANIFEST_LINES} files from manifest)..." log "Incremental sync (${MANIFEST_LINES} files from manifest)..."
# /anca-elements is being ingested into Immich (Immich becomes canonical) — # anca-elements: now in Immich (canonical); /mnt/backup copy deleted
# skip the redundant copy in /mnt/backup/anca-elements/ until manual cleanup. # 2026-05-26. Exclude retained as a safety belt in case it re-appears.
rsync -rlt --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r --files-from="${MANIFEST}" \ rsync -rlt --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r --files-from="${MANIFEST}" \
--exclude='anca-elements/' \ --exclude='anca-elements/' \
"${BACKUP_ROOT}/" "${PVE_BACKUP_DEST}/" 2>&1 || STATUS=1 "${BACKUP_ROOT}/" "${PVE_BACKUP_DEST}/" 2>&1 || STATUS=1
@ -89,64 +89,60 @@ fi
# STEP 2: NFS → Synology nfs/ + nfs-ssd/ (inotify change-tracked, FILTERED) # STEP 2: NFS → Synology nfs/ + nfs-ssd/ (inotify change-tracked, FILTERED)
# ============================================================ # ============================================================
# #
# DESIGN: Step 2 only carries paths that BYPASS the sda mirror. Paths that ARE # DESIGN: Step 2 only carries paths that BYPASS the sda mirror. As of
# mirrored to sda by nfs-mirror reach Synology via Step 1 (sda → Synology # 2026-05-26 that's just /srv/nfs/immich/ (1.5T, doesn't fit on sda).
# pve-backup/) and must NOT also flow through Step 2 — that would duplicate # Everything else under /srv/nfs/ now flows through sda via nfs-mirror,
# every byte and double Synology consumption. # reaching Synology via Step 1 (sda → pve-backup/). frigate and temp are
# excluded from both legs — intentionally NOT backed up.
# #
# The skip-list below MUST stay in sync with EXCLUDES in # nfs-ssd is handled separately below: its three dirs (immich, ollama,
# /usr/local/bin/nfs-mirror (which defines what nfs-mirror does NOT copy to # llamacpp) all go direct to Synology since /srv/nfs-ssd is not mirrored
# sda). The two are complementary: nfs-mirror EXCLUDES = offsite-sync Step 2 # to sda. ollama+llamacpp are small enough (~85G total) that the direct
# INCLUDES. Failing to keep them aligned creates either gaps (data missing # leg is fine and we don't need to extend nfs-mirror to cover the SSD.
# from Synology) or duplication (data on Synology via both paths). #
log "--- Step 2: NFS → Synology (skip-list paths only — sda-bypass leg) ---" # Keep this aligned with /usr/local/bin/nfs-mirror's EXCLUDES — the
# excludes there are { immich (this leg), frigate (no backup), temp
# (no backup), anca-elements (deleted), pvc-data and friends (owned by
# daily-backup) }. Only the bypass-leg subset matters here: { immich }.
log "--- Step 2: NFS → Synology (immich-only direct leg + nfs-ssd) ---"
# Regex matching paths NOT on sda (must reach Synology directly). # Regex matching paths NOT on sda (must reach Synology directly).
# Top-level dirs under /srv/nfs/ — anchored, no nesting allowed. NFS_SDA_BYPASS_RE='^/srv/nfs/immich/'
NFS_SDA_BYPASS_RE='^/srv/nfs/(immich|frigate|prometheus|loki|temp|alertmanager|ollama|audiblez|ebook2audiobook|[^/]+-backup)/'
# rsync include/exclude args for the monthly full sync (HDD). # rsync include/exclude args for the monthly full sync (HDD).
# Order matters: --include patterns first, --exclude '*' last.
NFS_FULL_INCLUDES=( NFS_FULL_INCLUDES=(
--include='/immich/' --include='/immich/***' --include='/immich/' --include='/immich/***'
--include='/frigate/' --include='/frigate/***'
--include='/prometheus/' --include='/prometheus/***'
--include='/loki/' --include='/loki/***'
--include='/temp/' --include='/temp/***'
--include='/alertmanager/' --include='/alertmanager/***'
--include='/ollama/' --include='/ollama/***'
--include='/audiblez/' --include='/audiblez/***'
--include='/ebook2audiobook/' --include='/ebook2audiobook/***'
--include='/*-backup/' --include='/*-backup/***'
--exclude='*' --exclude='*'
) )
if [ "${DAY_OF_MONTH}" -le 7 ]; then if [ "${DAY_OF_MONTH}" -le 7 ]; then
# Monthly: full sync with --delete for cleanup, restricted to bypass-list. # Monthly: full sync with --delete for cleanup, restricted to bypass-list.
log "Monthly full NFS sync (sda-bypass paths only)..." # --delete here will reap legacy dirs on Synology (frigate, ollama,
# audiblez, ebook2audiobook, *-backup, prometheus, loki, temp,
# alertmanager) since they're no longer in NFS_FULL_INCLUDES.
log "Monthly full NFS sync (immich-only — reaps legacy bypass dirs)..."
rsync -rlt --delete "${NFS_FULL_INCLUDES[@]}" /srv/nfs/ "${NFS_DEST}/" 2>&1 \ rsync -rlt --delete "${NFS_FULL_INCLUDES[@]}" /srv/nfs/ "${NFS_DEST}/" 2>&1 \
&& log " OK: nfs/ full sync (bypass-list)" || { warn "nfs/ full sync failed"; STATUS=1; } && log " OK: nfs/ full sync (immich-only)" || { warn "nfs/ full sync failed"; STATUS=1; }
# nfs-ssd: every dir under it (immich/ollama/llamacpp) is in the bypass list, # nfs-ssd: full sync of all three dirs (immich, ollama, llamacpp).
# so a plain --delete still applies cleanly.
rsync -rlt --delete /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \ rsync -rlt --delete /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \
&& log " OK: nfs-ssd/ full sync" || { warn "nfs-ssd/ full sync failed"; STATUS=1; } && log " OK: nfs-ssd/ full sync" || { warn "nfs-ssd/ full sync failed"; STATUS=1; }
> "${NFS_CHANGE_LOG}" > "${NFS_CHANGE_LOG}"
elif [ -s "${NFS_CHANGE_LOG}" ]; then elif [ -s "${NFS_CHANGE_LOG}" ]; then
# Incremental: only sync changed files in bypass-list paths. # Incremental: only sync changed files matching the bypass leg (immich).
sort -u "${NFS_CHANGE_LOG}" > /tmp/nfs-changes-deduped sort -u "${NFS_CHANGE_LOG}" > /tmp/nfs-changes-deduped
# HDD NFS — include only sda-bypass paths. # HDD NFS — include only /srv/nfs/immich/ paths.
grep -E "${NFS_SDA_BYPASS_RE}" /tmp/nfs-changes-deduped | \ grep -E "${NFS_SDA_BYPASS_RE}" /tmp/nfs-changes-deduped | \
while IFS= read -r f; do [ -f "$f" ] && echo "${f#/srv/nfs/}"; done \ while IFS= read -r f; do [ -f "$f" ] && echo "${f#/srv/nfs/}"; done \
> /tmp/sync-nfs.list 2>/dev/null > /tmp/sync-nfs.list 2>/dev/null
NFS_COUNT=$(wc -l < /tmp/sync-nfs.list 2>/dev/null || echo 0) NFS_COUNT=$(wc -l < /tmp/sync-nfs.list 2>/dev/null || echo 0)
if [ "${NFS_COUNT:-0}" -gt 0 ]; then if [ "${NFS_COUNT:-0}" -gt 0 ]; then
rsync -rlt --files-from=/tmp/sync-nfs.list /srv/nfs/ "${NFS_DEST}/" 2>&1 \ rsync -rlt --files-from=/tmp/sync-nfs.list /srv/nfs/ "${NFS_DEST}/" 2>&1 \
&& log " OK: nfs/ (${NFS_COUNT} bypass files)" \ && log " OK: nfs/ (${NFS_COUNT} immich files)" \
|| { warn "nfs/ incremental failed"; STATUS=1; } || { warn "nfs/ incremental failed"; STATUS=1; }
fi fi
# SSD NFS — every nfs-ssd path (immich/ollama/llamacpp) is in the bypass list. # SSD NFS — every nfs-ssd path (immich/ollama/llamacpp) ships direct.
grep '^/srv/nfs-ssd/' /tmp/nfs-changes-deduped | \ grep '^/srv/nfs-ssd/' /tmp/nfs-changes-deduped | \
while IFS= read -r f; do [ -f "$f" ] && echo "${f#/srv/nfs-ssd/}"; done \ while IFS= read -r f; do [ -f "$f" ] && echo "${f#/srv/nfs-ssd/}"; done \
> /tmp/sync-nfs-ssd.list 2>/dev/null || true > /tmp/sync-nfs-ssd.list 2>/dev/null || true
@ -158,7 +154,7 @@ elif [ -s "${NFS_CHANGE_LOG}" ]; then
fi fi
TOTAL=$(wc -l < /tmp/nfs-changes-deduped) TOTAL=$(wc -l < /tmp/nfs-changes-deduped)
log " Processed ${TOTAL} change events (${NFS_COUNT} nfs + ${SSD_COUNT} nfs-ssd bypass-list files synced)" log " Processed ${TOTAL} change events (${NFS_COUNT} nfs/immich + ${SSD_COUNT} nfs-ssd files synced)"
> "${NFS_CHANGE_LOG}" > "${NFS_CHANGE_LOG}"
rm -f /tmp/nfs-changes-deduped /tmp/sync-nfs.list /tmp/sync-nfs-ssd.list rm -f /tmp/nfs-changes-deduped /tmp/sync-nfs.list /tmp/sync-nfs-ssd.list
else else