Compare commits
5 commits
9277d71d81...5cdac421c2
| Author | SHA1 | Date | |
|---|---|---|---|
| | 5cdac421c2 | | |
| | 5a0e4b3dac | | |
| | d5f73ce109 | | |
| | c948dc0dbe | | |
| | 4798583db7 | | |
9 changed files with 347 additions and 167 deletions
|
|
@ -1,18 +1,34 @@
|
|||
# Backup & Disaster Recovery Architecture
|
||||
|
||||
Last updated: 2026-04-13
|
||||
Last updated: 2026-05-24
|
||||
|
||||
> **2026-05-24 session — what changed today** (deeper structural review pending — see the open backup-pipeline simplification audit):
|
||||
> - **anca-elements archive direction inverted** — Synology `/Backup/Anca/Elements` (770G) deleted; PVE `/srv/nfs/anca-elements` is now source of truth. `anca-elements-sync.sh` retired.
|
||||
> - **`anca-elements-mirror.{sh,service,timer}` retired**, subsumed into the new **`nfs-mirror`** weekly job covering all critical NFS subtrees (anca-elements + ~80 services) → sda.
|
||||
> - **`offsite-sync-backup` Step 2 filter inverted**: NFS-direct-to-Synology now only carries the sda-bypass paths (immich + frigate + prometheus + `*-backup` + …). Two-leg invariant: `nfs-mirror.sh EXCLUDES` ≡ `offsite-sync-backup Step 2 INCLUDES`. Cross-referenced in both scripts.
|
||||
> - **Synology `/Backup/Viki/nfs/<svc>/` orphan cleanup** — 84 dirs renamed in-place (btrfs metadata-only) to `/Backup/Viki/pve-backup/<svc>/` so daily-incremental Step 1 sees them as pre-existing and only ships deltas. No re-transfer.
|
||||
> - **Synology snapshot retention 7d → 3d**, all 8 backlog snapshots deleted via `sudo synosharesnapshot delete Backup ...`. Reclaimed ~800G btrfs (98% → 83% used). DSM API was blocked by 2FA; `sudo` over the existing `Administrator` SSH key worked with the Vault-stored password.
|
||||
> - **Manifest mechanism extended**: `nfs-mirror` now appends its transferred file list to `/mnt/backup/.changed-files` so daily Step 1 incremental picks it up (was previously only fed by `daily-backup`).
|
||||
|
||||
## Overview
|
||||
|
||||
The homelab uses a defense-in-depth 3-2-1 backup strategy: **3 copies** (live PVCs on sdc, weekly backups on sda, offsite on Synology), **2 media types** (SSD thin LVM, HDD), **1 offsite copy** (Synology NAS). This architecture provides <1s RPO for recent changes (via 7-day LVM snapshots), <7d RPO for file-level recovery, and <30min RTO for most services.
|
||||
The homelab runs a 3-2-1 strategy with a **two-leg** path to Synology so every NFS byte takes exactly one route to offsite (no duplication, no gaps):
|
||||
|
||||
```
sdc /srv/nfs/<svc>/ ──nfs-mirror weekly──→ sda /mnt/backup/<svc>/ ──offsite-sync Step 1──→ Synology /Backup/Viki/pve-backup/<svc>/ [leg 1]
sdc /srv/nfs/<bypass>/ ──inotify (nfs-change-tracker)──→ offsite-sync Step 2 ──→ Synology /Backup/Viki/nfs/<bypass>/ [leg 2]
sdc PVCs (LVM thin) ──daily-backup~snapshot~rsync──→ sda /mnt/backup/{pvc-data,sqlite-backup,pfsense,pve-config}/ ──Step 1──→ Synology /Backup/Viki/pve-backup/
```
|
||||
|
||||
The **bypass list** (paths that take leg 2 — too big for sda, transient, or already-a-backup): `immich`, `frigate`, `prometheus`, `loki`, `temp`, `alertmanager`, `ollama`, `audiblez`, `ebook2audiobook`, `*-backup`. Anything NOT in this list rides leg 1 via `nfs-mirror`.
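The routing rule can be sketched as follows. This is a minimal illustration, not the scripts' actual logic: the `BYPASS` tuple copies the list from the paragraph above, while `leg_for`, `check_partition`, the glob matching of `*-backup`, and the sample directory names are assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Bypass list copied from the paragraph above; `*-backup` is treated as a glob.
BYPASS = (
    "immich", "frigate", "prometheus", "loki", "temp", "alertmanager",
    "ollama", "audiblez", "ebook2audiobook", "*-backup",
)


def leg_for(service_dir: str) -> int:
    """Return 2 if the top-level NFS directory is on the bypass list, else 1."""
    return 2 if any(fnmatchcase(service_dir, pat) for pat in BYPASS) else 1


def check_partition(service_dirs: list[str]) -> dict[int, list[str]]:
    """Split directories by leg; each name lands in exactly one bucket."""
    legs: dict[int, list[str]] = {1: [], 2: []}
    for name in service_dirs:
        legs[leg_for(name)].append(name)
    return legs


if __name__ == "__main__":
    # Hypothetical directory names, used only to exercise the function.
    print(check_partition(["immich", "vaultwarden", "mysql-backup", "frigate"]))
```

Because `leg_for` returns a single value per directory, no directory can be shipped twice or skipped, which is the property the two-leg design relies on.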
|
||||
|
||||
**3-2-1 Breakdown**:
|
||||
- **Copy 1** (live): All PVC data + VM disks on Proxmox sdc thin pool (10.7TB RAID1 HDD)
|
||||
- **Copy 2** (local backup): Weekly file-level backup to sda `/mnt/backup` (1.1TB RAID1 SAS)
|
||||
- **Copy 3** (offsite): Synology NAS at 192.168.1.13:
|
||||
- `Synology/Backup/Viki/pve-backup/` — PVC snapshots, pfSense, PVE config (rsync from sda weekly)
|
||||
- `Synology/Backup/Viki/nfs/` — NFS HDD data (inotify change-tracked rsync from `/srv/nfs`)
|
||||
- `Synology/Backup/Viki/nfs-ssd/` — NFS SSD data (inotify change-tracked rsync from `/srv/nfs-ssd`)
|
||||
- **Copy 1** (live): all PVC data + VM disks on Proxmox sdc thin pool (10.7TB RAID1 HDD); all NFS data at `/srv/nfs[-ssd]/`
|
||||
- **Copy 2** (local backup): sda `/mnt/backup` (1.1TB RAID1 SAS) — at **~90% used** post-2026-05-24 (was ~10% in April)
|
||||
- **Copy 3** (offsite): Synology NAS at 192.168.1.13 — at **~83% used / 934G free** post-2026-05-24 (was 98% / 121G before today's cleanup)
|
||||
- `Synology/Backup/Viki/pve-backup/` — sda contents (PVC backups + nfs-mirror output: ~90 service dirs)
|
||||
- `Synology/Backup/Viki/nfs/` — bypass-list NFS (immich, frigate, etc.)
|
||||
- `Synology/Backup/Viki/nfs-ssd/` — bypass-list SSD NFS (immich-ML, ollama, llamacpp)
|
||||
|
||||
## Architecture Diagram
|
||||
|
||||
|
|
@ -366,6 +382,38 @@ Pushes `nfs_mirror_last_run_timestamp` + `nfs_mirror_last_status` + `nfs_mirror_
|
|||
|
||||
> TrueNAS Cloud Sync was decommissioned along with TrueNAS (2026-04-13). The current offsite path is inotify-change-tracked rsync from the Proxmox host NFS (`/srv/nfs`, `/srv/nfs-ssd`) to Synology.
|
||||
|
||||
### Synology snapshot management
|
||||
|
||||
Synology DSM keeps daily btrfs snapshots of every shared folder (the `Backup` share most importantly). Retention is configured per-share in DSM's Snapshot Replication app, and persists in `synosharesnapshot shareconf`.
|
||||
|
||||
**Current settings** (`Backup` share, 2026-05-24): daily at 02:00, **`snap_auto_remove_keep_days=3`** (tightened from 7 to reduce the window where deleted data continues to consume space).
|
||||
|
||||
Snapshots are CoW — deleting a file from the live filesystem does NOT free its blocks while any retained snapshot references them. Reclaim only happens after ALL referencing snapshots roll off.
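A toy model of that rule, assuming each snapshot is simply the set of block IDs it references. The names `reclaimable_blocks`, `deleted`, and `snapshots` are illustrative and are not Synology or btrfs APIs.

```python
def reclaimable_blocks(deleted: set[int], snapshots: dict[str, set[int]]) -> set[int]:
    """Blocks of deleted files that no retained snapshot still references."""
    still_referenced: set[int] = set().union(*snapshots.values())
    return deleted - still_referenced


# Blocks 1-3 belonged to a file that was deleted from the live filesystem.
deleted = {1, 2, 3}
snapshots = {"day-1": {1, 2, 3, 4}, "day-2": {2, 3, 4}, "day-3": {4}}

print(reclaimable_blocks(deleted, snapshots))  # nothing can be freed yet
del snapshots["day-1"]
print(reclaimable_blocks(deleted, snapshots))  # only block 1 is unreferenced
del snapshots["day-2"]
print(reclaimable_blocks(deleted, snapshots))  # all three blocks are unreferenced
```

Under this model, a shorter retention window shortens the wait between deleting data and getting its space back, which is the motivation given for the 7 to 3 day change.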
|
||||
|
||||
**DSM Web API is gated by 2FA (FIDO/OTP)** — programmatic snapshot management has to go via SSH + sudo instead:
|
||||
|
||||
```bash
# Password is in Vault: secret/viktor → synology_admin_password
PASS=$(VAULT_ADDR=https://vault.viktorbarzin.me vault kv get -field=synology_admin_password secret/viktor)

# List snapshots on the Backup share
ssh Administrator@192.168.1.13 "echo '$PASS' | sudo -S /usr/syno/sbin/synosharesnapshot list Backup"

# Bulk delete ALL snapshots (reclaims everything once btrfs cleaner runs)
ssh Administrator@192.168.1.13 "
  SNAPS=\$(echo '$PASS' | sudo -S /usr/syno/sbin/synosharesnapshot list Backup 2>/dev/null \
    | grep -oE 'GMT-[0-9]+\.[0-9]+\.[0-9]+-[0-9]+\.[0-9]+\.[0-9]+' | sort -u)
  echo '$PASS' | sudo -S /usr/syno/sbin/synosharesnapshot delete Backup \$SNAPS
"

# Tighten retention
ssh Administrator@192.168.1.13 "echo '$PASS' | sudo -S /usr/syno/sbin/synosharesnapshot shareconf set Backup snap_auto_remove_keep_days=3"
```
|
||||
|
||||
The btrfs cleaner thread reclaims space asynchronously, so `df` may lag the snapshot deletion by minutes (reclaim rate observed 2026-05-24: ~300 MB/s sustained, with bursts of 800 GB in 2 minutes).
|
||||
|
||||
> Memory: id=2673-2676 (Synology snapshot retention gotcha — deletion vs reclaim timing).
|
||||
|
||||
## Configuration
|
||||
|
||||
### Key Files
|
||||
|
|
@ -387,6 +435,8 @@ Pushes `nfs_mirror_last_run_timestamp` + `nfs_mirror_last_status` + `nfs_mirror_
|
|||
| `stacks/vault/` | Terraform: Vault backup CronJob |
| `stacks/vaultwarden/` | Terraform: Vaultwarden backup + integrity CronJobs |
| `stacks/monitoring/` | Terraform: Prometheus alerts |
| `synology:Administrator@192.168.1.13` | Synology SSH; sudo password = Vault `secret/viktor` `synology_admin_password`; DSM API itself gated by 2FA |
| `/usr/syno/sbin/synosharesnapshot` | Synology: btrfs snapshot CLI; must run as root via sudo |
|
||||
|
||||
### Vault Paths
|
||||
|
||||
|
|
|
|||
|
|
@ -20,6 +20,34 @@ log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; }
|
|||
warn() { log "WARN: $*" >&2; }
|
||||
die() { log "FATAL: $*" >&2; push_metrics 1 0; exit 1; }
|
||||
|
||||
# --- Manifest append helper ---
|
||||
# Both daily-backup and nfs-mirror append to /mnt/backup/.changed-files.
|
||||
# If their runs overlap (e.g. nfs-mirror Mon 04:11 still running when
|
||||
# daily-backup starts Mon 05:00) the appends can interleave mid-line.
|
||||
# `flock -x` on a sibling lock file makes appends atomic across processes.
|
||||
MANIFEST_LOCK="${MANIFEST}.lock"
|
||||
manifest_append() {
|
||||
(
|
||||
flock -x 200
|
||||
cat >> "${MANIFEST}"
|
||||
) 200>"${MANIFEST_LOCK}"
|
||||
}
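For comparison, the same locking pattern can be sketched in Python. This is an illustrative equivalent, not code from these scripts: `append_to_manifest` is a hypothetical helper, and the sample paths in the demo are made up. As in the shell helper above, the lock is taken on a sibling `.lock` file, not on the manifest itself.

```python
import fcntl
import os
import tempfile


def append_to_manifest(manifest: str, rel_paths: list[str]) -> None:
    """Append one path per line while holding an exclusive lock on a sibling file."""
    with open(manifest + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until other writers are done
        with open(manifest, "a") as fh:
            fh.writelines(path + "\n" for path in rel_paths)
    # Closing the lock file releases the lock.


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        manifest = os.path.join(tmp, ".changed-files")
        # Hypothetical entries, standing in for two different writers.
        append_to_manifest(manifest, ["pvc-data/2026-21/app/db.sqlite"])
        append_to_manifest(manifest, ["pfsense/config-2026-05-24.xml"])
        with open(manifest) as fh:
            print(fh.read(), end="")
```

Each writer's batch is written while the lock is held, so lines from overlapping runs cannot interleave mid-line.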
|
||||
|
||||
# Cap manifest size to prevent unbounded growth (e.g. Synology unreachable
|
||||
# for many days, every daily-backup keeps appending). At >500k lines,
|
||||
# `--files-from=` rsync becomes pathological — fall back to a full Step 1
|
||||
# sync by signalling offsite-sync to ignore the manifest this round.
|
||||
MANIFEST_MAX_LINES=500000
|
||||
check_manifest_size() {
|
||||
[ -f "${MANIFEST}" ] || return 0
|
||||
local lines
|
||||
lines=$(wc -l < "${MANIFEST}" 2>/dev/null || echo 0)
|
||||
if [ "${lines:-0}" -gt "${MANIFEST_MAX_LINES}" ]; then
|
||||
warn "manifest at ${lines} lines (>${MANIFEST_MAX_LINES}) — flagging next offsite-sync as full"
|
||||
touch "${BACKUP_ROOT}/.force-full-sync"
|
||||
fi
|
||||
}
|
||||
|
||||
# --- Locking ---
|
||||
# Track whether we got SIGTERM/SIGINT so cleanup can push a non-success metric.
|
||||
# Without this, a systemd timeout-kill leaves WeeklyBackupFailing alerts blind:
|
||||
|
|
@ -123,7 +151,7 @@ check_nfs_exports() {
|
|||
}
|
||||
|
||||
# --- Main ---
|
||||
log "=== Weekly backup starting ==="
|
||||
log "=== daily-backup starting ==="
|
||||
|
||||
if ! mountpoint -q "${BACKUP_ROOT}"; then
|
||||
die "${BACKUP_ROOT} is not mounted"
|
||||
|
|
@ -138,16 +166,25 @@ check_nfs_exports || {
|
|||
STATUS=0
|
||||
TOTAL_BYTES=0
|
||||
|
||||
# Clear manifest for this run
|
||||
> "${MANIFEST}"
|
||||
# DO NOT truncate the manifest here.
|
||||
#
|
||||
# Truncation lives in offsite-sync-backup (only on successful sync). If
|
||||
# offsite-sync failed yesterday — Synology unreachable, transient error —
|
||||
# the manifest holds yesterday's unconsumed file list. Truncating at the
|
||||
# start of today's daily-backup would silently lose those entries; they'd
|
||||
# only reach Synology on the next monthly full sync.
|
||||
#
|
||||
# Appending duplicates across multiple runs is harmless — rsync transfers
|
||||
# each file once. If the manifest grows pathologically (Synology down for
|
||||
# weeks), the OffsiteBackupSync{Stale,Failing} alerts catch it.
|
||||
|
||||
# NFS data is synced directly to Synology via inotifywait + offsite-sync-backup.sh
|
||||
# No NFS mirror step on sda — saves 53GB and eliminates duplication.
|
||||
# NFS data is synced to Synology via two paths: nfs-mirror → sda → Step 1
|
||||
# for the curated subset, and inotify + Step 2 for the sda-bypass list.
|
||||
|
||||
# ============================================================
|
||||
# STEP 1: PVC file-level copy from LVM thin snapshots
|
||||
# ============================================================
|
||||
log "--- Step 2: PVC file copy from snapshots ---"
|
||||
log "--- Step 1: PVC file copy from snapshots ---"
|
||||
WEEK=$(date +%Y-%W)
|
||||
PREV=$(ls -1d "${BACKUP_ROOT}/pvc-data"/????-?? 2>/dev/null | tail -1 || true)
|
||||
|
||||
|
|
@ -215,7 +252,7 @@ else
|
|||
# (immich-postgres ~10 GiB, ~3 min on local ext4) and well
|
||||
# below the unit-level budget so we still have headroom to
|
||||
# finish the rest.
|
||||
timeout 1800 rsync -az --delete \
|
||||
timeout 1800 rsync -a --delete \
|
||||
${PREV:+--link-dest="${PREV}/${ns_pvc}/"} \
|
||||
"${PVC_MOUNT}/" "${dst}/" 2>&1 || rsync_rc=$?
|
||||
if [ "$rsync_rc" -eq 0 ]; then
|
||||
|
|
@ -274,10 +311,10 @@ else
|
|||
log " PVC copy: ${PVC_COUNT} OK, ${PVC_FAIL} failed"
|
||||
[ "${PVC_FAIL}" -gt 0 ] && STATUS=1
|
||||
|
||||
# Add PVC files to manifest
|
||||
# Add PVC files to manifest (locked append)
|
||||
if [ -d "${BACKUP_ROOT}/pvc-data/${WEEK}" ]; then
|
||||
find "${BACKUP_ROOT}/pvc-data/${WEEK}" -type f 2>/dev/null | \
|
||||
sed "s|^${BACKUP_ROOT}/||" >> "${MANIFEST}"
|
||||
sed "s|^${BACKUP_ROOT}/||" | manifest_append
|
||||
fi
|
||||
|
||||
# Prune old weekly versions (keep 4)
|
||||
|
|
@ -301,23 +338,31 @@ if timeout 10 ssh -o BatchMode=yes -o ConnectTimeout=5 root@10.0.20.1 true 2>/de
|
|||
# config.xml — primary restore artifact
|
||||
if scp -o ConnectTimeout=10 root@10.0.20.1:/cf/conf/config.xml "${PFSENSE_DEST}/config-${DATE}.xml" 2>/dev/null; then
|
||||
log " OK: config.xml"
|
||||
echo "pfsense/config-${DATE}.xml" >> "${MANIFEST}"
|
||||
echo "pfsense/config-${DATE}.xml" | manifest_append
|
||||
else
|
||||
warn "Failed to copy pfsense config.xml"
|
||||
STATUS=1
|
||||
PFSENSE_STATUS=1
|
||||
fi
|
||||
|
||||
# Full filesystem tar
|
||||
if ssh -o ConnectTimeout=10 root@10.0.20.1 \
|
||||
"tar czf - --exclude=/dev --exclude=/proc --exclude=/tmp --exclude=/var/run /" \
|
||||
> "${PFSENSE_DEST}/pfsense-full-${DATE}.tar.gz" 2>/dev/null; then
|
||||
log " OK: full tar ($(du -sh "${PFSENSE_DEST}/pfsense-full-${DATE}.tar.gz" | cut -f1))"
|
||||
echo "pfsense/pfsense-full-${DATE}.tar.gz" >> "${MANIFEST}"
|
||||
# Full filesystem tar — Sundays only (weekly).
|
||||
# config.xml is the primary restore artifact and runs daily above; the
|
||||
# full filesystem tar is for forensic / package-state recovery only and
|
||||
# rarely-needed. Re-tarring 100M+ daily writes ~3G/month to sda + Synology
|
||||
# for unchanged content. Keep one fresh tarball per week instead.
|
||||
if [ "$(date +%u)" = "7" ]; then
|
||||
if ssh -o ConnectTimeout=10 root@10.0.20.1 \
|
||||
"tar czf - --exclude=/dev --exclude=/proc --exclude=/tmp --exclude=/var/run /" \
|
||||
> "${PFSENSE_DEST}/pfsense-full-${DATE}.tar.gz" 2>/dev/null; then
|
||||
log " OK: weekly full tar ($(du -sh "${PFSENSE_DEST}/pfsense-full-${DATE}.tar.gz" | cut -f1))"
|
||||
echo "pfsense/pfsense-full-${DATE}.tar.gz" | manifest_append
|
||||
else
|
||||
warn "Failed to tar pfsense filesystem"
|
||||
STATUS=1
|
||||
PFSENSE_STATUS=1
|
||||
fi
|
||||
else
|
||||
warn "Failed to tar pfsense filesystem"
|
||||
STATUS=1
|
||||
PFSENSE_STATUS=1
|
||||
log " skip weekly full tar (only runs Sundays)"
|
||||
fi
|
||||
|
||||
# Retention: keep 4 weekly copies
|
||||
|
|
@ -344,13 +389,15 @@ fi
|
|||
# ============================================================
|
||||
log "--- Step 4: PVE host config ---"
|
||||
mkdir -p "${BACKUP_ROOT}/pve-config/scripts"
|
||||
timeout 300 rsync -az --delete /etc/pve/ "${BACKUP_ROOT}/pve-config/etc-pve/" 2>&1 || { warn "Failed to sync /etc/pve"; STATUS=1; }
|
||||
timeout 300 rsync -a --delete /etc/pve/ "${BACKUP_ROOT}/pve-config/etc-pve/" 2>&1 || { warn "Failed to sync /etc/pve"; STATUS=1; }
|
||||
for script in /usr/local/bin/lvm-pvc-snapshot /usr/local/bin/daily-backup /usr/local/bin/offsite-sync-backup; do
|
||||
[ -f "${script}" ] && cp "${script}" "${BACKUP_ROOT}/pve-config/scripts/" 2>/dev/null || true
|
||||
done
|
||||
find "${BACKUP_ROOT}/pve-config" -type f 2>/dev/null | sed "s|^${BACKUP_ROOT}/||" >> "${MANIFEST}"
|
||||
find "${BACKUP_ROOT}/pve-config" -type f 2>/dev/null | sed "s|^${BACKUP_ROOT}/||" | manifest_append
|
||||
log " OK: PVE config"
|
||||
|
||||
check_manifest_size
|
||||
|
||||
# ============================================================
|
||||
# STEP 5: Prune LVM snapshots older than 7 days
|
||||
# ============================================================
|
||||
|
|
@ -361,6 +408,6 @@ log "--- Step 5: Snapshot pruning (7-day retention) ---"
|
|||
# Done
|
||||
# ============================================================
|
||||
MANIFEST_LINES=$(wc -l < "${MANIFEST}" 2>/dev/null || echo 0)
|
||||
log "=== Weekly backup complete (status=${STATUS}, ${TOTAL_BYTES} bytes, ${MANIFEST_LINES} files in manifest) ==="
|
||||
log "=== daily-backup complete (status=${STATUS}, ${TOTAL_BYTES} bytes, ${MANIFEST_LINES} files in manifest) ==="
|
||||
push_metrics "${STATUS}" "${TOTAL_BYTES}"
|
||||
exit "${STATUS}"
|
||||
|
|
|
|||
|
|
@ -57,6 +57,14 @@ EXCLUDES=(
|
|||
--exclude='/.lv-pvc-mapping.json'
|
||||
--exclude='/.nfs-changes.log'
|
||||
|
||||
# ---- anca-elements: photos are being ingested into Immich (2026-05-24),
|
||||
# so /srv/nfs/immich/library/ becomes the canonical copy and the separate
|
||||
# anca-elements tree is redundant. Excluded from nfs-mirror going forward.
|
||||
# The historical 771G at /mnt/backup/anca-elements/ stays put until manual
|
||||
# cleanup once Immich ingest completes; offsite-sync Step 1 also excludes
|
||||
# it from the Synology pve-backup/ upload so we don't ship the redundant copy.
|
||||
--exclude='/anca-elements/'
|
||||
|
||||
# ---- NFS paths: too big / transient / re-fetchable ----
|
||||
--exclude='/immich/'
|
||||
--exclude='/frigate/'
|
||||
|
|
@ -81,6 +89,17 @@ EXCLUDES=(
|
|||
log() { echo "[$(date -u '+%Y-%m-%dT%H:%M:%SZ')] $*" | tee -a "$LOG"; }
|
||||
warn() { log "WARN: $*"; }
|
||||
|
||||
# Locked manifest append (shared with daily-backup) — see daily-backup.sh
|
||||
# for the rationale. flock prevents interleaved appends when nfs-mirror
|
||||
# (Mon 04:11) overruns into daily-backup (Mon 05:00).
|
||||
MANIFEST_LOCK="${MANIFEST}.lock"
|
||||
manifest_append() {
|
||||
(
|
||||
flock -x 200
|
||||
cat >> "${MANIFEST}"
|
||||
) 200>"${MANIFEST_LOCK}"
|
||||
}
|
||||
|
||||
push_metrics() {
|
||||
local status="${1:-0}" bytes="${2:-0}"
|
||||
cat <<EOF | curl -s --connect-timeout 5 --max-time 10 --data-binary @- "${PUSHGATEWAY}/metrics/job/${PUSHGATEWAY_JOB}" 2>/dev/null || true
|
||||
|
|
@ -132,10 +151,12 @@ if [ "$RSYNC_RC" -eq 0 ]; then
|
|||
# manifest so daily Step 1 incremental picks them up tomorrow morning.
|
||||
NEW_COUNT=$(find /mnt/backup -newer "$STAMP" -type f \
|
||||
! -path '/mnt/backup/.changed-files' \
|
||||
! -path '/mnt/backup/.changed-files.lock' \
|
||||
! -path '/mnt/backup/.lv-pvc-mapping.json' \
|
||||
! -path '/mnt/backup/.nfs-changes.log' \
|
||||
! -path '/mnt/backup/.last-offsite-sync' \
|
||||
-printf '%P\n' 2>/dev/null | tee -a "$MANIFEST" | wc -l)
|
||||
! -path '/mnt/backup/.force-full-sync' \
|
||||
-printf '%P\n' 2>/dev/null | tee >(manifest_append) | wc -l)
|
||||
log "=== mirror complete; ${NEW_COUNT} files added to offsite manifest ==="
|
||||
log "/mnt/backup used: $(df -h --output=used /mnt/backup | tail -1 | tr -d ' ')"
|
||||
push_metrics 0 "$DST_BYTES"
|
||||
|
|
|
|||
|
|
@ -54,18 +54,32 @@ DAY_OF_MONTH=$(date +%d)
|
|||
# ============================================================
|
||||
log "--- Step 1: sda → Synology pve-backup/ ---"
|
||||
|
||||
if [ "${DAY_OF_MONTH}" -le 7 ]; then
|
||||
log "Monthly full sync (1st Sunday)..."
|
||||
rsync -rltz --delete --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r \
|
||||
# Trigger: monthly cleanup window OR daily-backup signalled the manifest grew
|
||||
# past its cap (Synology was unreachable too long for incremental to keep up).
|
||||
FORCE_FULL_FLAG="${BACKUP_ROOT}/.force-full-sync"
|
||||
FORCE_FULL=""
|
||||
[ -f "${FORCE_FULL_FLAG}" ] && FORCE_FULL=1
|
||||
if [ "${DAY_OF_MONTH}" -le 7 ] || [ -n "${FORCE_FULL}" ]; then
|
||||
[ -n "${FORCE_FULL}" ] && log "Forced full sync (manifest size cap tripped)..." || log "Monthly full sync (1st Sunday)..."
|
||||
# No -z on LAN: gigabit hop to 192.168.1.13 doesn't benefit from compression
|
||||
# and burns CPU on the PVE host that's already busy with cluster IO.
|
||||
rsync -rlt --delete --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r \
|
||||
--exclude='.changed-files' \
|
||||
--exclude='.changed-files.lock' \
|
||||
--exclude='.last-offsite-sync' \
|
||||
--exclude='.lv-pvc-mapping.json' \
|
||||
--exclude='.nfs-changes.log' \
|
||||
--exclude='.force-full-sync' \
|
||||
--exclude='/anca-elements/' \
|
||||
"${BACKUP_ROOT}/" "${PVE_BACKUP_DEST}/" 2>&1 || STATUS=1
|
||||
rm -f "${FORCE_FULL_FLAG}"
|
||||
elif [ -s "${MANIFEST}" ]; then
|
||||
MANIFEST_LINES=$(wc -l < "${MANIFEST}")
|
||||
log "Incremental sync (${MANIFEST_LINES} files from manifest)..."
|
||||
rsync -rltz --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r --files-from="${MANIFEST}" \
|
||||
# /anca-elements is being ingested into Immich (Immich becomes canonical) —
|
||||
# skip the redundant copy in /mnt/backup/anca-elements/ until manual cleanup.
|
||||
rsync -rlt --chmod=Du=rwx,Dgo=rx,Fu=rw,Fog=r --files-from="${MANIFEST}" \
|
||||
--exclude='anca-elements/' \
|
||||
"${BACKUP_ROOT}/" "${PVE_BACKUP_DEST}/" 2>&1 || STATUS=1
|
||||
else
|
||||
log "No changed files in manifest, nothing to sync"
|
||||
|
|
@ -110,11 +124,11 @@ NFS_FULL_INCLUDES=(
|
|||
if [ "${DAY_OF_MONTH}" -le 7 ]; then
|
||||
# Monthly: full sync with --delete for cleanup, restricted to bypass-list.
|
||||
log "Monthly full NFS sync (sda-bypass paths only)..."
|
||||
rsync -rltz --delete "${NFS_FULL_INCLUDES[@]}" /srv/nfs/ "${NFS_DEST}/" 2>&1 \
|
||||
rsync -rlt --delete "${NFS_FULL_INCLUDES[@]}" /srv/nfs/ "${NFS_DEST}/" 2>&1 \
|
||||
&& log " OK: nfs/ full sync (bypass-list)" || { warn "nfs/ full sync failed"; STATUS=1; }
|
||||
# nfs-ssd: every dir under it (immich/ollama/llamacpp) is in the bypass list,
|
||||
# so a plain --delete still applies cleanly.
|
||||
rsync -rltz --delete /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \
|
||||
rsync -rlt --delete /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \
|
||||
&& log " OK: nfs-ssd/ full sync" || { warn "nfs-ssd/ full sync failed"; STATUS=1; }
|
||||
> "${NFS_CHANGE_LOG}"
|
||||
elif [ -s "${NFS_CHANGE_LOG}" ]; then
|
||||
|
|
@ -127,7 +141,7 @@ elif [ -s "${NFS_CHANGE_LOG}" ]; then
|
|||
> /tmp/sync-nfs.list 2>/dev/null
|
||||
NFS_COUNT=$(wc -l < /tmp/sync-nfs.list 2>/dev/null || echo 0)
|
||||
if [ "${NFS_COUNT:-0}" -gt 0 ]; then
|
||||
rsync -rltz --files-from=/tmp/sync-nfs.list /srv/nfs/ "${NFS_DEST}/" 2>&1 \
|
||||
rsync -rlt --files-from=/tmp/sync-nfs.list /srv/nfs/ "${NFS_DEST}/" 2>&1 \
|
||||
&& log " OK: nfs/ (${NFS_COUNT} bypass files)" \
|
||||
|| { warn "nfs/ incremental failed"; STATUS=1; }
|
||||
fi
|
||||
|
|
@ -138,7 +152,7 @@ elif [ -s "${NFS_CHANGE_LOG}" ]; then
|
|||
> /tmp/sync-nfs-ssd.list 2>/dev/null || true
|
||||
SSD_COUNT=$(wc -l < /tmp/sync-nfs-ssd.list 2>/dev/null || echo 0)
|
||||
if [ "${SSD_COUNT:-0}" -gt 0 ]; then
|
||||
rsync -rltz --files-from=/tmp/sync-nfs-ssd.list /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \
|
||||
rsync -rlt --files-from=/tmp/sync-nfs-ssd.list /srv/nfs-ssd/ "${NFS_SSD_DEST}/" 2>&1 \
|
||||
&& log " OK: nfs-ssd/ (${SSD_COUNT} files)" \
|
||||
|| { warn "nfs-ssd/ incremental failed"; STATUS=1; }
|
||||
fi
|
||||
|
|
|
|||
|
|
@ -1,13 +1,24 @@
|
|||
"""Aceztrims extractor - scrapes F1 streaming links from Aceztrims pages.
|
||||
"""Aceztrims extractor — scrapes embed URLs from acestrlms.pages.dev/f11/.
|
||||
|
||||
Parses HTML for iframe button onclick handlers and extracts streams from:
|
||||
- /iframe1?s=<m3u8_url> → direct m3u8
|
||||
- https://pooembed.eu/embed/... → embed URL
|
||||
The page (Cloudflare Pages, no anti-bot) hosts an iframe + a strip of
|
||||
onclick channel-switcher buttons. Each button rewrites the iframe via
|
||||
`document.getElementById('iframe').src = '<embed_url>'`. The initial
|
||||
channel is hard-coded as `<iframe id='iframe' src='...'>`.
|
||||
|
||||
We strip HTML comments first because the page keeps ~20 legacy channel
|
||||
buttons inside `<!-- ... -->` blocks for easy re-enablement; the previous
|
||||
loose regex picked them up as false positives.
|
||||
|
||||
All channels are iframe embeds (no direct m3u8) — `stream_type='embed'`.
|
||||
|
||||
Site naming note: the extractor key stays `aceztrims` (the previous
|
||||
domain) so registry/cache identifiers don't churn. The current domain
|
||||
is `acestrlms.pages.dev` and the F1 path is `/f11/` (two ones — `/f1/`
|
||||
is the cross-sport schedule page and has no stream buttons).
|
||||
"""
|
||||
|
||||
import logging
|
||||
import re
|
||||
from urllib.parse import parse_qs, urlparse
|
||||
|
||||
import httpx
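The comment-stripping step described in the module docstring can be tried in isolation. In this sketch the sample HTML, the `example.test` URLs, and the `live_embed_urls` helper are invented for illustration; only the three regular expressions mirror the ones introduced in this diff.

```python
import re

# Patterns as in the diff: the onclick switcher and the initial <iframe>.
ONCLICK_IFRAME_SRC = re.compile(
    r"""document\.getElementById\(['"]iframe['"]\)\.src\s*=\s*['"]([^'"]+)['"]""",
    re.IGNORECASE,
)
DEFAULT_IFRAME = re.compile(
    r"""<iframe[^>]*id\s*=\s*['"]iframe['"][^>]*src\s*=\s*['"]([^'"]+)['"]""",
    re.IGNORECASE,
)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def live_embed_urls(html: str) -> list[str]:
    """Return de-duplicated embed URLs, ignoring commented-out buttons."""
    html = HTML_COMMENT.sub("", html)
    urls: list[str] = []
    for pattern in (DEFAULT_IFRAME, ONCLICK_IFRAME_SRC):
        for match in pattern.finditer(html):
            url = match.group(1).strip()
            if url and url not in urls:
                urls.append(url)
    return urls


# Hypothetical page: one default channel, one live button, one legacy button.
SAMPLE = """
<iframe id='iframe' src='https://example.test/embed/a'></iframe>
<button onclick="document.getElementById('iframe').src = 'https://example.test/embed/b'">B</button>
<!-- <button onclick="document.getElementById('iframe').src = 'https://example.test/embed/old'">Old</button> -->
"""

print(live_embed_urls(SAMPLE))
```

Without the `HTML_COMMENT.sub` call, the commented-out `embed/old` button would match the onclick pattern and show up as a third stream.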
|
||||
|
||||
|
|
@ -17,9 +28,8 @@ from backend.extractors.models import ExtractedStream
|
|||
logger = logging.getLogger(__name__)
|
||||
|
||||
BASE_URL = "https://acestrlms.pages.dev"
|
||||
# Pages to scrape for streams
|
||||
F1_PAGES = [
|
||||
("/f1/", "Formula 1"),
|
||||
("/f11/", "Formula 1"),
|
||||
]
|
||||
|
||||
USER_AGENT = (
|
||||
|
|
@ -28,13 +38,21 @@ USER_AGENT = (
|
|||
"Chrome/120.0.0.0 Safari/537.36"
|
||||
)
|
||||
|
||||
# `document.getElementById('iframe').src = '<URL>'` — current channel-switcher format.
|
||||
_ONCLICK_IFRAME_SRC = re.compile(
|
||||
r"""document\.getElementById\(['"]iframe['"]\)\.src\s*=\s*['"]([^'"]+)['"]""",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
# `<iframe id='iframe' src='<URL>'>` — the default/initial channel.
|
||||
_DEFAULT_IFRAME = re.compile(
|
||||
r"""<iframe[^>]*id\s*=\s*['"]iframe['"][^>]*src\s*=\s*['"]([^'"]+)['"]""",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
_HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
|
||||
|
||||
|
||||
class AceztrimsExtractor(BaseExtractor):
|
||||
"""Extracts streams from Aceztrims pages by parsing HTML for iframe URLs.
|
||||
|
||||
Looks for onclick handlers on buttons/links that open iframes, and
|
||||
extracts the stream URLs from them.
|
||||
"""
|
||||
"""Pulls iframe embed URLs out of the acestrlms.pages.dev F1 page."""
|
||||
|
||||
@property
|
||||
def site_key(self) -> str:
|
||||
|
|
@ -45,7 +63,6 @@ class AceztrimsExtractor(BaseExtractor):
|
|||
return "Aceztrims"
|
||||
|
||||
async def extract(self) -> list[ExtractedStream]:
|
||||
"""Scrape all configured F1 pages for stream URLs."""
|
||||
streams: list[ExtractedStream] = []
|
||||
|
||||
async with httpx.AsyncClient(
|
||||
|
|
@ -55,12 +72,9 @@ class AceztrimsExtractor(BaseExtractor):
|
|||
) as client:
|
||||
for path, category in F1_PAGES:
|
||||
try:
|
||||
page_streams = await self._scrape_page(client, path, category)
|
||||
streams.extend(page_streams)
|
||||
streams.extend(await self._scrape_page(client, path, category))
|
||||
except Exception:
|
||||
logger.exception(
|
||||
"[aceztrims] Failed to scrape page %s", path
|
||||
)
|
||||
logger.exception("[aceztrims] Failed to scrape %s", path)
|
||||
|
||||
logger.info("[aceztrims] Extracted %d stream(s)", len(streams))
|
||||
return streams
|
||||
|
|
@ -68,85 +82,39 @@ class AceztrimsExtractor(BaseExtractor):
|
|||
async def _scrape_page(
|
||||
self, client: httpx.AsyncClient, path: str, category: str
|
||||
) -> list[ExtractedStream]:
|
||||
"""Scrape a single page for stream URLs."""
|
||||
url = f"{BASE_URL}{path}"
|
||||
resp = await client.get(url)
|
||||
if resp.status_code != 200:
|
||||
logger.warning(
|
||||
"[aceztrims] Page %s returned HTTP %d", path, resp.status_code
|
||||
"[aceztrims] %s returned HTTP %d", path, resp.status_code
|
||||
)
|
||||
return []
|
||||
|
||||
html = resp.text
|
||||
# The page keeps a block of legacy channel buttons inside
|
||||
# `<!-- ... -->` for quick re-enablement. Strip comments first so
|
||||
# the regex only sees live buttons.
|
||||
html = _HTML_COMMENT.sub("", resp.text)
|
||||
|
||||
seen: set[str] = set()
|
||||
streams: list[ExtractedStream] = []
|
||||
seen_urls: set[str] = set()
|
||||
|
||||
# Pattern 1: /iframe1?s=<m3u8_url> — direct m3u8
|
||||
iframe1_pattern = re.compile(
|
||||
r"""['"]((?:https?://[^'"]*)?/iframe1\?s=([^'"&]+))['""]""",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
for match in iframe1_pattern.finditer(html):
|
||||
m3u8_url = match.group(2)
|
||||
if m3u8_url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(m3u8_url)
|
||||
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=m3u8_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality="",
|
||||
title=f"{category} Stream",
|
||||
stream_type="m3u8",
|
||||
for pattern in (_DEFAULT_IFRAME, _ONCLICK_IFRAME_SRC):
|
||||
for match in pattern.finditer(html):
|
||||
embed_url = match.group(1).strip()
|
||||
if not embed_url or embed_url in seen:
|
||||
continue
|
||||
seen.add(embed_url)
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=embed_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality="",
|
||||
title=f"{category} Stream",
|
||||
stream_type="embed",
|
||||
embed_url=embed_url,
|
||||
)
|
||||
)
|
||||
)
|
||||
|
||||
# Pattern 2: embed URLs (pooembed.eu or similar)
|
||||
embed_pattern = re.compile(
|
||||
r"""['"]((https?://(?:pooembed\.eu|[^'"]*embed)[^'"]*))['"]""",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
for match in embed_pattern.finditer(html):
|
||||
embed_url = match.group(1)
|
||||
if embed_url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(embed_url)
|
||||
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=embed_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality="",
|
||||
title=f"{category} Stream (Embed)",
|
||||
stream_type="embed",
|
||||
embed_url=embed_url,
|
||||
)
|
||||
)
|
||||
|
||||
# Pattern 3: Generic onclick handlers with URLs
|
||||
onclick_pattern = re.compile(
|
||||
r"""onclick\s*=\s*['"].*?['"]?(https?://[^'")\s]+\.m3u8[^'")\s]*)['"]?""",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
for match in onclick_pattern.finditer(html):
|
||||
m3u8_url = match.group(1)
|
||||
if m3u8_url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(m3u8_url)
|
||||
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=m3u8_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality="",
|
||||
title=f"{category} Stream",
|
||||
stream_type="m3u8",
|
||||
)
|
||||
)
|
||||
|
||||
logger.info(
|
||||
"[aceztrims] Found %d stream(s) on %s", len(streams), path
|
||||
|
|
|
|||
|
|
@ -34,7 +34,7 @@ USER_AGENT = (
|
|||
# to also surface MotoGP and adjacent motorsports — keeps the f1-stream
|
||||
# UI useful between race weekends and during the off-season.
|
||||
MOTORSPORT_CATEGORIES = {
|
||||
"formula 1", "formula 2", "formula 3",
|
||||
"f1", "formula 1", "formula 2", "formula 3",
|
||||
"motogp", "moto gp", "moto2", "moto3", "motoe",
|
||||
"world rally championship", "wrc",
|
||||
"world endurance championship", "wec",
|
||||
|
|
@ -85,27 +85,61 @@ _is_f1_category = _is_motorsport_category
|
|||
_is_f1_event = _is_motorsport_event
|
||||
|
||||
|
||||
def _parse_live_events(html: str) -> list[_PitsportEvent]:
|
||||
"""Parse live events from the main page RSC payload.
|
||||
def _decode_rsc_payload(html: str) -> str:
|
||||
"""Concatenate and unescape all `self.__next_f.push([1, "..."])` chunks.
|
||||
|
||||
The main page contains event cards with props:
|
||||
category, title, time, imageUrl
|
||||
wrapped in <a href="/watch/{UUID}"> links.
|
||||
Next.js RSC ships its tree as escape-encoded strings inside repeated
|
||||
`self.__next_f.push` calls. Regex over the raw HTML misses everything
|
||||
interesting; we have to decode unicode escapes first.
|
||||
"""
|
||||
chunks = re.findall(r'self\.__next_f\.push\(\[1,"(.*?)"\]\)', html, re.DOTALL)
|
||||
if not chunks:
|
||||
return ""
|
||||
payload = ""
|
||||
for chunk in chunks:
|
||||
try:
|
||||
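# latin-1 + backslashreplace keeps non-ASCII text intact; a plain
# .encode() (UTF-8) would be re-read as latin-1 and turn "é" into "Ã©".
payload += chunk.encode("latin-1", "backslashreplace").decode("unicode_escape")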
|
||||
except Exception:
|
||||
payload += chunk
|
||||
return payload
|
||||
|
||||
|
||||
def _parse_live_events(html: str) -> list[_PitsportEvent]:
|
||||
"""Parse live events from the main page (or `/live-now`) RSC payload.
|
||||
|
||||
The pages embed event cards inside the Next.js RSC payload; the raw
|
||||
HTML keeps it escape-encoded so we decode first, then match.
|
||||
Two shapes are common:
|
||||
1) Older card props: "category":"...","title":"..." next to
|
||||
"href":"/watch/UUID".
|
||||
2) Newer `event` prop: an `event` object with `uri:"/watch/UUID"`
|
||||
carrying `category` and `title`.
|
||||
"""
|
||||
payload = _decode_rsc_payload(html) or html
|
||||
|
||||
events: list[_PitsportEvent] = []
|
||||
|
||||
# Match event cards in the RSC payload - they appear as JSON-like structures
|
||||
# Pattern: href="/watch/UUID" ... category":"...", "title":"..."
|
||||
# In the RSC payload, the data is in the format:
|
||||
# ["$","$L2","/watch/UUID",{"href":"/watch/UUID","children":["$","$L10",null,
|
||||
# {"category":"...","title":"...","time":...,"imageUrl":"..."}]}]
|
||||
pattern = re.compile(
|
||||
href_pattern = re.compile(
|
||||
r'"href":"(/watch/([0-9a-f-]{36}))"[^}]*?"category":"([^"]+)","title":"([^"]+)"',
|
||||
)
|
||||
for match in pattern.finditer(html):
|
||||
for match in href_pattern.finditer(payload):
|
||||
_, uuid, category, title = match.groups()
|
||||
events.append(_PitsportEvent(category=category, title=title, watch_uuid=uuid))
|
||||
|
||||
event_pattern = re.compile(
|
||||
r'"event":\{[^{}]*?"title":"([^"]+)"[^{}]*?"uri":"/watch/([0-9a-f-]{36})"[^{}]*?"category":"([^"]+)"',
|
||||
)
|
||||
for match in event_pattern.finditer(payload):
|
||||
title, uuid, category = match.groups()
|
||||
events.append(_PitsportEvent(category=category, title=title, watch_uuid=uuid))
|
||||
|
||||
event_pattern_alt = re.compile(
|
||||
r'"event":\{[^{}]*?"category":"([^"]+)"[^{}]*?"title":"([^"]+)"[^{}]*?"uri":"/watch/([0-9a-f-]{36})"',
|
||||
)
|
||||
for match in event_pattern_alt.finditer(payload):
|
||||
category, title, uuid = match.groups()
|
||||
events.append(_PitsportEvent(category=category, title=title, watch_uuid=uuid))
|
||||
|
||||
return events
|
||||
|
||||
|
||||
|
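Review note on the hunk above: a minimal, self-contained sketch of the decode-then-match idea, assuming a made-up sample page (`SAMPLE_HTML`) and hypothetical helper names. It swaps in `json.loads` for the `unicode_escape` codec as one possible alternative, because `str.encode().decode("unicode_escape")` reads UTF-8 bytes as Latin-1 and so garbles any literal non-ASCII character in a chunk. This is not the code in the commit, only an illustration of the technique.

```python
import json
import re

# Same chunk pattern as in the hunk above.
_PUSH_RE = re.compile(r'self\.__next_f\.push\(\[1,"(.*?)"\]\)', re.DOTALL)
# Same "older card props" pattern as in the hunk above.
_EVENT_RE = re.compile(
    r'"href":"(/watch/([0-9a-f-]{36}))"[^}]*?"category":"([^"]+)","title":"([^"]+)"'
)


def decode_rsc_payload(html: str) -> str:
    """Hypothetical variant: unescape each pushed chunk as a JSON string literal."""
    parts = []
    for chunk in _PUSH_RE.findall(html):
        try:
            parts.append(json.loads(f'"{chunk}"'))
        except ValueError:
            # Not a valid JSON string body; keep the raw chunk instead.
            parts.append(chunk)
    return "".join(parts)


# Made-up sample: the quotes inside the pushed string are backslash-escaped,
# which is why a regex over the raw HTML finds nothing.
SAMPLE_HTML = (
    '<script>self.__next_f.push([1,"{\\"href\\":\\"/watch/'
    '123e4567-e89b-12d3-a456-426614174000\\",'
    '\\"category\\":\\"Formula 1\\",\\"title\\":\\"Caf\\u00e9 Grand Prix\\"}"])</script>'
)

payload = decode_rsc_payload(SAMPLE_HTML)
raw_hits = _EVENT_RE.findall(SAMPLE_HTML)   # pattern applied to raw HTML
decoded_hit = _EVENT_RE.search(payload)     # pattern applied to decoded payload
```

In this sample the raw-HTML search comes back empty while the decoded payload yields the UUID, category and title, which is the behaviour the docstring of `_decode_rsc_payload` describes.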
|
@ -301,13 +335,12 @@ def _is_m3u8_method(method: str) -> bool:
|
|||
|
||||
|
||||
def _extract_m3u8_url(link: str) -> str:
|
||||
"""Convert a serveplay.site player URL to an m3u8 playlist URL.
|
||||
"""Pass through the link from pushembdz's `api/stream/<slug>` response.
|
||||
|
||||
Input: https://dash.serveplay.site/{channel}/index.html
|
||||
Output: https://dash.serveplay.site/{channel}/index.html
|
||||
|
||||
The index.html IS the m3u8 playlist (served with proper content-type
|
||||
when fetched with the correct Referer header).
|
||||
The host has rotated over time (serveplay.site → oe1.ossfeed.store →
|
||||
…); the response is always a master playlist URL we hand to the
|
||||
player as-is. Content-Type may be `text/css` or `application/json` —
|
||||
treat as HLS based on body sniffing (`#EXTM3U`), not MIME.
|
||||
"""
|
||||
return link
|
||||
|
||||
|
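Review note on the hunk above: the docstring says to treat a response as HLS based on the body (`#EXTM3U`) rather than the Content-Type header. A minimal sketch of such a check, assuming a hypothetical helper name `looks_like_hls` that does not appear in the diff:

```python
def looks_like_hls(body: bytes) -> bool:
    """Return True if the body starts like an HLS playlist.

    Leading whitespace and UTF-8 BOM bytes are skipped before checking
    for the `#EXTM3U` tag that opens every HLS playlist.
    """
    trimmed = body.lstrip(b"\xef\xbb\xbf \t\r\n")
    return trimmed.startswith(b"#EXTM3U")
```

A caller would apply this to the fetched response body and ignore a `text/css` or `application/json` Content-Type when it returns True.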
|
@ -388,6 +421,24 @@ class PitsportExtractor(BaseExtractor):
|
|||
except Exception:
|
||||
logger.exception("[pitsport] Failed to fetch main page")
|
||||
|
||||
# Fetch /live-now — canonical "currently live" list, added 2026.
|
||||
try:
|
||||
resp = await client.get(f"{PITSPORT_BASE}/live-now")
|
||||
if resp.status_code == 200:
|
||||
live_now_events = _parse_live_events(resp.text)
|
||||
logger.info(
|
||||
"[pitsport] Live-now page: %d event(s)", len(live_now_events)
|
||||
)
|
||||
for ev in live_now_events:
|
||||
if _is_f1_event(ev.category, ev.title):
|
||||
all_events.append(ev)
|
||||
else:
|
||||
logger.warning(
|
||||
"[pitsport] Live-now page returned HTTP %d", resp.status_code
|
||||
)
|
||||
except Exception:
|
||||
logger.exception("[pitsport] Failed to fetch live-now page")
|
||||
|
||||
# Fetch schedule page for upcoming events
|
||||
try:
|
||||
resp = await client.get(f"{PITSPORT_BASE}/schedule")
|
||||
|
|
|
|||
|
|
@ -153,21 +153,37 @@ class PPVExtractor(BaseExtractor):
|
|||
if viewers and int(viewers) > 0:
|
||||
title += f" ({viewers} viewers)"
|
||||
|
||||
# Check for substreams (multiple quality/language options)
|
||||
# Always emit the parent stream — substreams are
|
||||
# additional language/source variants, not replacements.
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=embed_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality=quality,
|
||||
title=title,
|
||||
stream_type="embed",
|
||||
embed_url=embed_url,
|
||||
)
|
||||
)
|
||||
|
||||
substreams = stream_obj.get("substreams")
|
||||
if isinstance(substreams, list) and substreams:
|
||||
if isinstance(substreams, list):
|
||||
for i, sub in enumerate(substreams):
|
||||
sub_embed = sub.get("iframe", "") or sub.get("embed_url", "")
|
||||
if not sub_embed:
|
||||
# Fall back to the parent embed URL
|
||||
sub_embed = embed_url
|
||||
sub_name = sub.get("name", "") or sub.get("label", "")
|
||||
sub_name = (
|
||||
sub.get("source_tag", "")
|
||||
or sub.get("name", "")
|
||||
or sub.get("label", "")
|
||||
)
|
||||
sub_quality = sub.get("tag", "") or sub.get("quality", "") or quality
|
||||
sub_title = f"{name}"
|
||||
if sub_name:
|
||||
sub_title += f" - {sub_name}"
|
||||
elif i > 0:
|
||||
sub_title += f" #{i + 1}"
|
||||
else:
|
||||
sub_title += f" #{i + 2}"
|
||||
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
|
|
@ -180,19 +196,6 @@ class PPVExtractor(BaseExtractor):
|
|||
embed_url=sub_embed,
|
||||
)
|
||||
)
|
||||
else:
|
||||
# Single stream, no substreams
|
||||
streams.append(
|
||||
ExtractedStream(
|
||||
url=embed_url,
|
||||
site_key=self.site_key,
|
||||
site_name=self.site_name,
|
||||
quality=quality,
|
||||
title=title,
|
||||
stream_type="embed",
|
||||
embed_url=embed_url,
|
||||
)
|
||||
)
|
||||
|
||||
except Exception:
|
||||
logger.exception("[ppv] Failed to extract streams")
|
||||
|
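Review note on the two PPV hunks above: the parent stream is now always emitted, so substreams are numbered from 2 when they carry no label. A minimal sketch of just the title rule, assuming a hypothetical standalone helper `substream_title` (the diff keeps this logic inline):

```python
def substream_title(parent_name: str, sub: dict, index: int) -> str:
    """Build a substream title following the rule shown in the hunk.

    Prefers `source_tag`, then `name`, then `label`; with no label it
    falls back to a counter that starts at #2, since the parent stream
    already occupies the first slot.
    """
    label = sub.get("source_tag", "") or sub.get("name", "") or sub.get("label", "")
    if label:
        return f"{parent_name} - {label}"
    return f"{parent_name} #{index + 2}"
```

For example, the first unlabeled substream of a parent named "Race" would be titled "Race #2".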
|
|
|||
|
|
@ -61,6 +61,12 @@ resource "kubernetes_deployment" "forgejo" {
|
|||
app = "forgejo"
|
||||
tier = local.tiers.edge
|
||||
}
|
||||
annotations = {
|
||||
# Keel disabled here — its `force` policy rewrote the image tag
|
||||
# from 11.0.14 → 1.18 on 2026-05-24 (same bug as memory id=1933).
|
||||
# TF owns the tag now; bump it manually here when upgrading.
|
||||
"keel.sh/policy" = "never"
|
||||
}
|
||||
}
|
||||
spec {
|
||||
replicas = 1
|
||||
|
|
@ -89,7 +95,14 @@ resource "kubernetes_deployment" "forgejo" {
|
|||
}
|
||||
container {
|
||||
name = "forgejo"
|
||||
image = "codeberg.org/forgejo/forgejo:11"
|
||||
# Pinned to 11.0.14 (latest 11.x as of 2026-05-12) — was on
|
||||
# floating `:11`. On 2026-05-24T15:35:37Z Keel force-policy
|
||||
# rewrote the tag from `11.0.14 → 1.18` (Gitea-era Forgejo
|
||||
# v1.18), exact replay of the 2026-05-16 force-policy
|
||||
# tag-rewriting incident (memory id=1933). The pod crashlooped
|
||||
# because the DB had already been migrated to schema 305 by
|
||||
# 11.0.14 and v1.18 only knows up to migration 231.
|
||||
image = "codeberg.org/forgejo/forgejo:11.0.14"
|
||||
env {
|
||||
name = "USER_UID"
|
||||
value = 1000
|
||||
|
|
@ -182,10 +195,16 @@ resource "kubernetes_deployment" "forgejo" {
|
|||
lifecycle {
|
||||
ignore_changes = [
|
||||
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
|
||||
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE — Keel manages tag updates
|
||||
metadata[0].annotations["keel.sh/policy"],
|
||||
# KEEL_IGNORE_IMAGE removed 2026-05-24 — Keel is disabled for this
|
||||
# workload now (keel.sh/policy=never annotation above), so TF owns
|
||||
# the image tag. Restore this ignore_changes line if you flip
|
||||
# keel.sh/policy back to `force` later.
|
||||
metadata[0].annotations["keel.sh/match-tag"],
|
||||
metadata[0].annotations["keel.sh/trigger"],
|
||||
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
|
||||
metadata[0].annotations["kubernetes.io/change-cause"],
|
||||
metadata[0].annotations["deployment.kubernetes.io/revision"],
|
||||
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"],
|
||||
]
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -1562,6 +1562,13 @@ serverFiles:
|
|||
severity: warning
|
||||
annotations:
|
||||
summary: "Offsite backup sync is {{ $value | humanizeDuration }} old (threshold: 9d)"
|
||||
- alert: OffsiteBackupSyncFailing
|
||||
expr: offsite_sync_last_status{job="offsite-backup-sync"} != 0
|
||||
for: 0m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "Offsite backup sync last run reported errors (status={{ $value }})"
|
||||
- alert: NfsMirrorStale
|
||||
expr: (time() - nfs_mirror_last_run_timestamp{job="nfs-mirror"}) > 1382400
|
||||
for: 30m
|
||||
|
|
|
|||