2470 commits

Author SHA1 Message Date
Viktor Barzin
1ef40daeec docs: update for MySQL 3→1, CrowdSec/Technitium PG migration, PG tuning, NFS async, node OS tuning [ci skip] 2026-04-13 23:05:46 +01:00
Viktor Barzin
f6d9959557 state(proxmox-csi): update encrypted state 2026-04-13 23:04:58 +01:00
Viktor Barzin
61988669e4 state(proxmox-csi): update encrypted state 2026-04-13 23:04:58 +01:00
Viktor Barzin
ad53f66bcd state(proxmox-csi): update encrypted state 2026-04-13 23:04:58 +01:00
Viktor Barzin
8b45944b27 state(proxmox-csi): update encrypted state 2026-04-13 23:04:58 +01:00
Viktor Barzin
0eb96e4e22 state(vault): update encrypted state 2026-04-13 23:04:57 +01:00
Viktor Barzin
c5393e6a72 state(wealthfolio): update encrypted state 2026-04-13 23:04:57 +01:00
Viktor Barzin
a5c92c4c78 state(health): update encrypted state 2026-04-13 23:04:57 +01:00
Viktor Barzin
d94b06627d state(affine): update encrypted state 2026-04-13 23:04:56 +01:00
Viktor Barzin
82f674a0b4 rename weekly-backup → daily-backup across scripts, timers, services, and docs [ci skip]
Reflects the schedule change from weekly to daily. All references updated:
- scripts/weekly-backup.{sh,timer,service} → daily-backup.*
- Pushgateway job name: weekly-backup → daily-backup
- Prometheus metric names: weekly_backup_* → daily_backup_*
- All docs, runbooks, AGENTS.md, CLAUDE.md, proxmox-inventory
- offsite-sync dependency: After=daily-backup.service

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:37:04 +00:00
Viktor Barzin
ca5039f8aa switch backup + offsite sync from weekly to daily — RPO 7d → 1d [ci skip]
- weekly-backup.timer: Sun 05:00 → daily 05:00
- offsite-sync-backup.timer: Sun 08:00 → daily 06:00
- Monthly full rsync --delete unchanged (1st-7th of month)
- Total daily I/O cost: ~20GB sdc reads, ~3.5GB sda writes, seconds of network
- Updated script headers and service descriptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:24:38 +00:00
Viktor Barzin
b45cee5c4a docs: update backup architecture for inotify change tracking + consolidated Synology layout [ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:16:36 +00:00
Viktor Barzin
28ad11d12c consolidate offsite backup: inotify change tracking, deduplicate Synology paths [ci skip]
Architecture overhaul:
- Synology truenas/ renamed to nfs/, immich paths flattened to match source
- Created nfs-ssd/ on Synology for SSD data (thumbs, ML cache)
- Deleted pve-backup/nfs-mirror (53GB duplication eliminated)
- New inotifywait daemon (nfs-change-tracker.service) watches /srv/nfs + /srv/nfs-ssd
- offsite-sync Step 2: reads inotify change log, rsync --files-from only changed files
- weekly-backup: removed NFS mirror step entirely (NFS goes direct to Synology)
- Cleaned 9 orphaned LVs (101GB + 38 snapshots reclaimed from thin pool)

Performance: incremental sync completes in seconds (vs 30+ min with full rsync)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:06:20 +00:00
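The incremental step described in the commit above (read the inotify change log, then pass only the changed files to rsync --files-from) might look roughly like the sketch below. The log format (one absolute path per line), the function names, and the rsync flag selection are assumptions for illustration, not taken from offsite-sync-backup.sh.

```python
from pathlib import Path


def changed_files(log_lines, root):
    """Return sorted, de-duplicated paths (relative to root) that still exist as files.

    Assumes one absolute path per log line; the real change-log format is unknown.
    """
    root = Path(root)
    seen = set()
    for line in log_lines:
        text = line.strip()
        if not text:
            continue
        path = Path(text)
        try:
            rel = path.relative_to(root)
        except ValueError:
            continue  # path lies outside the watched tree
        if path.is_file():  # skip files deleted since the event was logged
            seen.add(rel.as_posix())
    return sorted(seen)


def rsync_command(list_file, root, dest):
    """Build an rsync invocation that transfers only the files named in list_file."""
    return ["rsync", "-a", f"--files-from={list_file}", f"{root}/", dest]
```

Because a file that is written many times appears in the log many times, de-duplicating before the transfer keeps the list short, which is consistent with the "seconds vs 30+ min" figure in the commit message.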
Viktor Barzin
aa4c125f9c improve 3-2-1 backup: auto-discover dirs, Immich offsite sync, SQLite backup [ci skip]
- weekly-backup.sh: replace hardcoded BACKUP_DIRS with glob auto-discovery
  (catches nextcloud-backup, council-complaints-backup, future dirs)
- weekly-backup.sh: add auto SQLite backup from PVC snapshots
  (magic number check, ?mode=ro URI, fallback to raw copy)
- offsite-sync-backup.sh: add NFS media direct-to-Synology sync
  (Immich, calibre, audiobookshelf — reuses existing TrueNAS Cloud Sync paths)
- Cleaned up 9 orphaned LVs + 38 snapshots on PVE host (101GB reclaimed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:47:56 +00:00
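The SQLite step in the commit above (magic number check, ?mode=ro URI, fallback to raw copy) can be sketched as follows. This is a minimal illustration in Python, not the logic of weekly-backup.sh itself; the function names and the "skipped"/"online"/"raw" return labels are hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path: Path) -> bool:
    """Return True if the file starts with the SQLite 3 header."""
    with path.open("rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def backup_sqlite(src: Path, dest: Path) -> str:
    """Back up src to dest; returns "skipped", "online", or "raw"."""
    if not is_sqlite_file(src):
        return "skipped"
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        # mode=ro opens the source read-only so the snapshot is never modified.
        source = sqlite3.connect(f"{src.resolve().as_uri()}?mode=ro", uri=True)
        try:
            target = sqlite3.connect(dest)
            try:
                source.backup(target)  # consistent copy via SQLite's backup API
            finally:
                target.close()
        finally:
            source.close()
        return "online"
    except sqlite3.Error:
        # Fall back to a plain file copy if SQLite cannot read the database.
        shutil.copy2(src, dest)
        return "raw"
```

The header check keeps the backup from treating arbitrary files as databases, and the raw copy ensures something is still captured when the read-only open or the backup call fails.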
Viktor Barzin
38d51ab0af deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip]
- Migrate Immich (8 NFS PVs, 1.1TB) from TrueNAS to Proxmox host NFS
- Update config.tfvars nfs_server to 192.168.1.127 (Proxmox)
- Update nfs-csi StorageClass share to /srv/nfs
- Update scripts (weekly-backup, cluster-healthcheck) to Proxmox IP
- Delete obsolete TrueNAS scripts (nfs_exports.sh, truenas-status.sh)
- Rewrite nfs-health.sh for Proxmox NFS monitoring
- Update Freedify nfs_music_server default to Proxmox
- Mark CloudSync monitor CronJob as deprecated
- Update Prometheus alert summaries
- Update all architecture docs, AGENTS.md, and reference docs
- Zero PVs remain on TrueNAS — VM ready for decommission

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:42:07 +00:00
Viktor Barzin
69248eaa7b state(nfs-csi): update encrypted state 2026-04-13 14:41:56 +00:00
Viktor Barzin
04eae139c6 state(immich): update encrypted state 2026-04-13 14:41:52 +00:00
Viktor Barzin
50ab67b5f7 state(immich): update encrypted state 2026-04-13 14:41:52 +00:00
root
b0303ab17d Woodpecker CI deploy commit [CI SKIP] 2026-04-12 21:39:26 +00:00
Viktor Barzin
a2fad3f20e docs(mailserver): remove HTML visual, fix probe frequency in diagram 2026-04-12 22:25:34 +01:00
Viktor Barzin
1c300a14cf mailserver: overhaul inbound delivery, monitoring, CrowdSec, and migrate to Brevo relay
Inbound:
- Direct MX to mail.viktorbarzin.me (ForwardEmail relay attempted and abandoned)
- Dedicated MetalLB IP 10.0.20.202 with ETP: Local for CrowdSec real-IP detection
- Removed Cloudflare Email Routing (can't store-and-forward)
- Fixed dual SPF violation, hardened to -all
- Added MTA-STS, TLSRPT, imported Rspamd DKIM into Terraform
- Removed dead BIND zones from config.tfvars (199 lines)

Outbound:
- Migrated from Mailgun (100/day) to Brevo (300/day free)
- Added Brevo DKIM CNAMEs and verification TXT

Monitoring:
- Probe frequency: 30m → 20m, alert thresholds adjusted to 60m
- Enabled Dovecot exporter scraping (port 9166)
- Added external SMTP monitor on public IP

Documentation:
- New docs/architecture/mailserver.md with full architecture
- New docs/architecture/mailserver-visual.html visualization
- Updated monitoring.md, CLAUDE.md, historical plan docs
2026-04-12 22:24:38 +01:00
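The "dual SPF violation" fixed in the commit above refers to a domain publishing more than one SPF TXT record, which SPF evaluators treat as a permanent error. A rough check for that condition, and for the hard-fail -all policy, could look like the sketch below. The function names and the sample records are hypothetical, and the check assumes the all mechanism is the last term of the record.

```python
def spf_records(txt_records):
    """Return the TXT records that are SPF version 1 policies."""
    found = []
    for record in txt_records:
        value = record.strip()
        lowered = value.lower()
        if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
            found.append(value)
    return found


def spf_problems(txt_records):
    """List policy problems: more than one SPF record, or a record not ending in -all."""
    records = spf_records(txt_records)
    problems = []
    if len(records) > 1:
        problems.append("multiple SPF records")
    for record in records:
        if record.split()[-1].lower() != "-all":
            problems.append(f"not hard-fail: {record}")
    return problems
```

For example, passing two records such as "v=spf1 include:relay-a.example ~all" and "v=spf1 include:relay-b.example -all" reports both the duplicate and the soft-fail policy, while a single record ending in -all reports nothing.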
Viktor Barzin
8bc02d1401 state(rybbit): update encrypted state 2026-04-12 22:17:01 +01:00
Viktor Barzin
4e80ac40c4 state(mailserver): update encrypted state 2026-04-12 22:16:25 +01:00
Viktor Barzin
e71a65acc4 state(mailserver): update encrypted state 2026-04-12 22:15:44 +01:00
Viktor Barzin
887152194c state(mailserver): update encrypted state 2026-04-12 22:12:43 +01:00
Viktor Barzin
333b289545 state(cloudflared): update encrypted state 2026-04-12 22:11:30 +01:00
Viktor Barzin
28934afb9a state(cloudflared): update encrypted state 2026-04-12 22:10:33 +01:00
Viktor Barzin
d227a5c896 state(cloudflared): update encrypted state 2026-04-12 22:10:30 +01:00
Viktor Barzin
2ba456e070 state(mailserver): update encrypted state 2026-04-12 21:46:34 +01:00
Viktor Barzin
92881ee6af state(mailserver): update encrypted state 2026-04-12 20:43:56 +01:00
Viktor Barzin
c740ed1301 docs: update Technitium DNS docs after cache optimization
- Fix Technitium IP typo: 10.0.20.101 → 10.0.20.201 (service-catalog, vpn.md)
- Fix PDB minAvailable: 1 → 2 (networking.md)
- Add emrsn.org stub zone, cache TTL tuning, PG query logging, CronJobs
- Update forwarders: was "Cloudflare + Google", actually Cloudflare DoH only
- Update config storage: was generic PVC, now NFS path
2026-04-12 18:29:25 +01:00
Viktor Barzin
82b0f6c4cb truenas deprecation: migrate all non-immich storage to proxmox NFS
- Migrate 7 backup CronJobs to Proxmox host NFS (192.168.1.127)
  (etcd, mysql, postgresql, nextcloud, redis, vaultwarden, plotting-book)
- Migrate headscale backup, ebook2audiobook, osm_routing to Proxmox NFS
- Migrate servarr (lidarr, readarr, soulseek) NFS refs to Proxmox
- Remove 79 orphaned TrueNAS NFS module declarations from 49 stacks
- Delete stacks/platform/modules/ (27 dead module copies, 65MB)
- Update nfs-truenas StorageClass to point to Proxmox (192.168.1.127)
- Remove iscsi DNS record from config.tfvars
- Fix woodpecker persistence config and alertmanager PV

Only Immich (8 PVCs, ~1.4TB) remains on TrueNAS.
2026-04-12 14:35:39 +01:00
Viktor Barzin
3246c4d112 state: update encrypted terraform state 2026-04-12 14:29:17 +01:00
Viktor Barzin
b7aec4c617 state: update encrypted terraform state 2026-04-12 14:17:12 +01:00
Viktor Barzin
78373dcce4 state(mailserver): update encrypted state 2026-04-12 14:02:49 +01:00
Viktor Barzin
9965e47414 state: update encrypted terraform state 2026-04-12 13:05:50 +01:00
Viktor Barzin
8ac5240ff4 state: update encrypted terraform state 2026-04-12 12:59:58 +01:00
Viktor Barzin
8363efc56b state: update encrypted terraform state 2026-04-12 12:59:01 +01:00
Viktor Barzin
f5e456b58d state(osm_routing): update encrypted state 2026-04-12 12:57:31 +01:00
Viktor Barzin
b236a57ebc state(ebook2audiobook): update encrypted state 2026-04-12 12:57:11 +01:00
Viktor Barzin
a58d908b29 state(plotting-book): update encrypted state 2026-04-12 12:56:20 +01:00
Viktor Barzin
3677a163fe state(plotting-book): update encrypted state 2026-04-12 12:56:06 +01:00
Viktor Barzin
f9c9219c13 state(plotting-book): update encrypted state 2026-04-12 12:56:00 +01:00
Viktor Barzin
d48f118195 state(plotting-book): update encrypted state 2026-04-12 12:55:50 +01:00
Viktor Barzin
55456818b5 state(plotting-book): update encrypted state 2026-04-12 12:55:45 +01:00
Viktor Barzin
fa5be3b2fe state(plotting-book): update encrypted state 2026-04-12 12:55:39 +01:00
Viktor Barzin
562f7b1db1 state(vaultwarden): update encrypted state 2026-04-12 12:54:29 +01:00
Viktor Barzin
a8c6daeaa5 state(infra-maintenance): update encrypted state 2026-04-12 12:51:11 +01:00
Viktor Barzin
5da6d75094 fix(monitoring): PodCrashLooping alert now fires only for active CrashLoopBackOff
Switch from restart-count based detection (increase restarts[1h] > 5) to
waiting-reason based (kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"}).
Alert auto-resolves when pod recovers, making it clear whether the issue is active.
2026-04-12 12:41:07 +01:00
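The difference between the two detection modes in the commit above can be illustrated with a small model. This is only a sketch of the reasoning, not the Prometheus rule itself; PodSample and both function names are hypothetical, and restarts_last_hour stands in for the one-hour restart increase the old rule used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSample:
    restarts_last_hour: int          # stand-in for the 1h restart increase
    waiting_reason: Optional[str]    # e.g. "CrashLoopBackOff", or None when running


def fires_on_restart_count(sample: PodSample, threshold: int = 5) -> bool:
    """Old behaviour: fire whenever the restart count in the window exceeds the threshold."""
    return sample.restarts_last_hour > threshold


def fires_on_waiting_reason(sample: PodSample) -> bool:
    """New behaviour: fire only while the container is waiting in CrashLoopBackOff."""
    return sample.waiting_reason == "CrashLoopBackOff"


# A pod that crashed repeatedly earlier in the hour but is running again now.
recovered = PodSample(restarts_last_hour=8, waiting_reason=None)
# A pod that is still crash-looping.
looping = PodSample(restarts_last_hour=8, waiting_reason="CrashLoopBackOff")
```

With the restart-count rule the recovered pod keeps alerting until its restarts age out of the window, whereas the waiting-reason rule stops as soon as the pod leaves CrashLoopBackOff, which matches the auto-resolve behaviour the commit describes.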
Viktor Barzin
cc670d949c docs: add ha-sofia Version Control add-on to HA skill [ci skip]
HomeAssistantVersionControl v1.2.0 installed on ha-sofia for git-based
config tracking. Auto-commits on file change, pushes hourly to private
GitHub repo ViktorBarzin/ha-sofia-config.
2026-04-12 11:37:02 +01:00