Viktor Barzin
b4b6fd5946
state(nfs-csi): update encrypted state
2026-04-14 09:32:41 +00:00
Viktor Barzin
30e5150ecd
state(status-page): update encrypted state
2026-04-14 09:31:50 +00:00
Viktor Barzin
ac3a6a96dd
state(hermes-agent): update encrypted state
2026-04-14 09:04:35 +00:00
Viktor Barzin
37b3395017
state(hermes-agent): update encrypted state
2026-04-14 09:00:30 +00:00
Viktor Barzin
8b2d3b7e6c
state(hermes-agent): update encrypted state
2026-04-14 08:58:36 +00:00
Viktor Barzin
8787ad9f1d
state(hermes-agent): update encrypted state
2026-04-14 08:39:18 +00:00
Viktor Barzin
71182f2867
state(openclaw): update encrypted state
2026-04-14 08:39:06 +00:00
Viktor Barzin
aa3af753a6
state(openclaw): update encrypted state
2026-04-14 08:38:54 +00:00
Viktor Barzin
4e059b138c
docs: consolidate all post-mortems under docs/post-mortems/
Move HTML post-mortems from repo root post-mortems/ to docs/post-mortems/.
Update index.html with all 3 incidents (newest first).
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 08:24:36 +00:00
Viktor Barzin
bdba15a387
docs: move post-mortems to docs/post-mortems/
Consolidate all outage reports under docs/ for better discoverability.
Moved from .claude/post-mortems/ (agent-internal) to docs/post-mortems/
(repo documentation).
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 08:20:09 +00:00
Viktor Barzin
68c8c5b4a0
fix(technitium): migrate primary to proxmox-lvm-encrypted + post-mortem
SEV1 outage: fsid=0 in PVE /etc/exports broke all NFS subdirectory
mounts from k8s (NFSv4 pseudo-root path resolution). Combined with a
lockd failure, this left both the NFSv4 and NFSv3 mount paths broken.
The outage cascaded into DNS primary, Vault (2/3 pods), Alertmanager,
and 20+ services.
Changes:
- Primary PVC: NFS (nfs-truenas) → proxmox-lvm-encrypted
- Secondary/tertiary PVCs: proxmox-lvm → proxmox-lvm-encrypted
- Removed NFS module dependency from technitium stack
- Added full post-mortem with prevention plan
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 08:18:59 +00:00
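The failure mode named in this commit (an export carrying fsid=0, which marks it as the NFSv4 pseudo-root) could in principle be caught by a small lint over the exports file. The following is a minimal sketch under stated assumptions, not the check from the post-mortem: it assumes the usual exports(5) layout of a path followed by client(options) entries, ignores quoted paths with spaces, and the function name and sample data are hypothetical.

```python
import re

# Matches one "client(options)" entry on an exports(5) line.
CLIENT_RE = re.compile(r"(\S*?)\(([^)]*)\)")

def find_pseudo_root_exports(exports_text):
    """Return (path, client) pairs whose options contain fsid=0 or fsid=root."""
    flagged = []
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding space
        parts = line.split(None, 1)          # path, then the client entries
        if len(parts) < 2:
            continue                         # blank line or path with no clients
        path, rest = parts
        for client, opts in CLIENT_RE.findall(rest):
            options = [o.strip() for o in opts.split(",")]
            if "fsid=0" in options or "fsid=root" in options:
                flagged.append((path, client))
    return flagged

# Hypothetical sample, not the real /etc/exports from the incident.
sample = (
    "/srv/nfs 192.0.2.0/24(rw,fsid=0,no_subtree_check)\n"
    "/srv/nfs-ssd 192.0.2.0/24(rw,async)\n"
)
for path, client in find_pseudo_root_exports(sample):
    print(f"pseudo-root export: {path} for {client}")
```

A check like this only flags the option; whether fsid=0 is a problem still depends on how clients build their mount paths.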
Viktor Barzin
b239af9b6d
state(technitium): update encrypted state
2026-04-14 08:04:07 +00:00
Viktor Barzin
6afba3b338
state(hermes-agent): update encrypted state
2026-04-13 22:33:35 +00:00
Viktor Barzin
e8ef81b276
state(hermes-agent): update encrypted state
2026-04-13 22:31:39 +00:00
Viktor Barzin
ab52b8eec2
state(hermes-agent): update encrypted state
2026-04-13 22:26:19 +00:00
Viktor Barzin
110d5d1f86
state(hermes-agent): update encrypted state
2026-04-13 22:20:23 +00:00
Viktor Barzin
d4c71be41c
state(hermes-agent): update encrypted state
2026-04-13 22:09:19 +00:00
Viktor Barzin
6dcccbd8fc
state(hermes-agent): update encrypted state
2026-04-13 22:07:09 +00:00
Viktor Barzin
04400fb7bd
state(hermes-agent): update encrypted state
2026-04-13 22:07:09 +00:00
Viktor Barzin
e0518802f4
state(hermes-agent): update encrypted state
2026-04-13 22:07:08 +00:00
Viktor Barzin
1ef40daeec
docs: update for MySQL 3→1, CrowdSec/Technitium PG migration, PG tuning, NFS async, node OS tuning [ci skip]
2026-04-13 23:05:46 +01:00
Viktor Barzin
f6d9959557
state(proxmox-csi): update encrypted state
2026-04-13 23:04:58 +01:00
Viktor Barzin
61988669e4
state(proxmox-csi): update encrypted state
2026-04-13 23:04:58 +01:00
Viktor Barzin
ad53f66bcd
state(proxmox-csi): update encrypted state
2026-04-13 23:04:58 +01:00
Viktor Barzin
8b45944b27
state(proxmox-csi): update encrypted state
2026-04-13 23:04:58 +01:00
Viktor Barzin
0eb96e4e22
state(vault): update encrypted state
2026-04-13 23:04:57 +01:00
Viktor Barzin
c5393e6a72
state(wealthfolio): update encrypted state
2026-04-13 23:04:57 +01:00
Viktor Barzin
a5c92c4c78
state(health): update encrypted state
2026-04-13 23:04:57 +01:00
Viktor Barzin
d94b06627d
state(affine): update encrypted state
2026-04-13 23:04:56 +01:00
Viktor Barzin
82f674a0b4
rename weekly-backup → daily-backup across scripts, timers, services, and docs [ci skip]
Reflects the schedule change from weekly to daily. All references updated:
- scripts/weekly-backup.{sh,timer,service} → daily-backup.*
- Pushgateway job name: weekly-backup → daily-backup
- Prometheus metric names: weekly_backup_* → daily_backup_*
- All docs, runbooks, AGENTS.md, CLAUDE.md, proxmox-inventory
- offsite-sync dependency: After=daily-backup.service
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:37:04 +00:00
Viktor Barzin
ca5039f8aa
switch backup + offsite sync from weekly to daily — RPO 7d → 1d [ci skip]
- weekly-backup.timer: Sun 05:00 → daily 05:00
- offsite-sync-backup.timer: Sun 08:00 → daily 06:00
- Monthly full rsync --delete unchanged (1st-7th of month)
- Total daily I/O cost: ~20GB sdc reads, ~3.5GB sda writes, seconds of network
- Updated script headers and service descriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:24:38 +00:00
Viktor Barzin
b45cee5c4a
docs: update backup architecture for inotify change tracking + consolidated Synology layout [ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:16:36 +00:00
Viktor Barzin
28ad11d12c
consolidate offsite backup: inotify change tracking, deduplicate Synology paths [ci skip]
Architecture overhaul:
- Synology truenas/ renamed to nfs/, immich paths flattened to match source
- Created nfs-ssd/ on Synology for SSD data (thumbs, ML cache)
- Deleted pve-backup/nfs-mirror (53GB duplication eliminated)
- New inotifywait daemon (nfs-change-tracker.service) watches /srv/nfs + /srv/nfs-ssd
- offsite-sync Step 2: reads inotify change log, rsync --files-from only changed files
- weekly-backup: removed NFS mirror step entirely (NFS goes direct to Synology)
- Cleaned 9 orphaned LVs (101GB + 38 snapshots reclaimed from thin pool)
Performance: incremental sync completes in seconds (vs 30+ min with full rsync)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:06:20 +00:00
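The "read inotify change log, rsync --files-from" step in this commit can be illustrated roughly as follows. This is a sketch, not the actual offsite-sync script: it assumes the change log holds one absolute path per line (the real log format is not shown in the commit), and the function names and destination are hypothetical.

```python
import os

def changed_files_for_rsync(log_lines, root):
    """Turn logged absolute paths into sorted, de-duplicated paths relative to root."""
    root = os.path.normpath(root)
    relative = set()
    for line in log_lines:
        entry = line.strip()
        if not entry or not os.path.isabs(entry):
            continue                      # skip blanks and malformed entries
        path = os.path.normpath(entry)
        # Keep only paths inside the watched tree (component-wise comparison).
        if os.path.commonpath([root, path]) != root or path == root:
            continue
        relative.add(os.path.relpath(path, root))
    return sorted(relative)

def rsync_command(list_file, root, dest):
    """Build an rsync argv that transfers only the files named in list_file."""
    return ["rsync", "-a", f"--files-from={list_file}", root.rstrip("/") + "/", dest]

log = ["/srv/nfs/immich/a.jpg\n", "/srv/nfs/immich/a.jpg\n", "/srv/nfs/calibre/b.epub\n"]
print(changed_files_for_rsync(log, "/srv/nfs"))
print(rsync_command("/tmp/changed.list", "/srv/nfs", "backup-host:/volume/nfs/"))
```

Because rsync only visits the listed paths, the cost scales with the number of changed files rather than the size of the tree. A real script would likely also drop paths that were deleted after the event was logged.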
Viktor Barzin
aa4c125f9c
improve 3-2-1 backup: auto-discover dirs, Immich offsite sync, SQLite backup [ci skip]
- weekly-backup.sh: replace hardcoded BACKUP_DIRS with glob auto-discovery
(catches nextcloud-backup, council-complaints-backup, future dirs)
- weekly-backup.sh: add auto SQLite backup from PVC snapshots
(magic number check, ?mode=ro URI, fallback to raw copy)
- offsite-sync-backup.sh: add NFS media direct-to-Synology sync
(Immich, calibre, audiobookshelf — reuses existing TrueNAS Cloud Sync paths)
- Cleaned up 9 orphaned LVs + 38 snapshots on PVE host (101GB reclaimed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:47:56 +00:00
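The SQLite handling listed in this commit (magic number check, ?mode=ro URI, fallback to raw copy) might look something like the sketch below. It is an illustration only: weekly-backup.sh is a shell script and its actual commands are not shown, so Python's sqlite3 online backup API is used here as a stand-in, and the function names are hypothetical.

```python
import pathlib
import shutil
import sqlite3

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"

def is_sqlite_file(path):
    """Magic number check: compare the first 16 bytes against the SQLite header."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC

def backup_sqlite(src, dest):
    """Back up src to dest. Returns "online", "raw", or "skipped"."""
    if not is_sqlite_file(src):
        return "skipped"
    # Open the source read-only through a file: URI so the backup cannot modify it.
    uri = pathlib.Path(src).resolve().as_uri() + "?mode=ro"
    try:
        source = sqlite3.connect(uri, uri=True)
        try:
            target = sqlite3.connect(dest)
            try:
                source.backup(target)  # consistent page-by-page copy
            finally:
                target.close()
        finally:
            source.close()
        return "online"
    except sqlite3.Error:
        # Fallback: plain file copy when the database cannot be opened or read.
        shutil.copy2(src, dest)
        return "raw"
```

The return value makes it easy to log or count how often the raw-copy fallback was needed.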
Viktor Barzin
38d51ab0af
deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip]
- Migrate Immich (8 NFS PVs, 1.1TB) from TrueNAS to Proxmox host NFS
- Update config.tfvars nfs_server to 192.168.1.127 (Proxmox)
- Update nfs-csi StorageClass share to /srv/nfs
- Update scripts (weekly-backup, cluster-healthcheck) to Proxmox IP
- Delete obsolete TrueNAS scripts (nfs_exports.sh, truenas-status.sh)
- Rewrite nfs-health.sh for Proxmox NFS monitoring
- Update Freedify nfs_music_server default to Proxmox
- Mark CloudSync monitor CronJob as deprecated
- Update Prometheus alert summaries
- Update all architecture docs, AGENTS.md, and reference docs
- Zero PVs remain on TrueNAS — VM ready for decommission
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:42:07 +00:00
Viktor Barzin
69248eaa7b
state(nfs-csi): update encrypted state
2026-04-13 14:41:56 +00:00
Viktor Barzin
04eae139c6
state(immich): update encrypted state
2026-04-13 14:41:52 +00:00
Viktor Barzin
50ab67b5f7
state(immich): update encrypted state
2026-04-13 14:41:52 +00:00
root
b0303ab17d
Woodpecker CI deploy commit [CI SKIP]
2026-04-12 21:39:26 +00:00
Viktor Barzin
a2fad3f20e
docs(mailserver): remove HTML visual, fix probe frequency in diagram
2026-04-12 22:25:34 +01:00
Viktor Barzin
1c300a14cf
mailserver: overhaul inbound delivery, monitoring, CrowdSec, and migrate to Brevo relay
Inbound:
- Direct MX to mail.viktorbarzin.me (ForwardEmail relay attempted and abandoned)
- Dedicated MetalLB IP 10.0.20.202 with ETP: Local for CrowdSec real-IP detection
- Removed Cloudflare Email Routing (can't store-and-forward)
- Fixed dual SPF violation, hardened to -all
- Added MTA-STS, TLSRPT, imported Rspamd DKIM into Terraform
- Removed dead BIND zones from config.tfvars (199 lines)
Outbound:
- Migrated from Mailgun (100/day) to Brevo (300/day free)
- Added Brevo DKIM CNAMEs and verification TXT
Monitoring:
- Probe frequency: 30m → 20m, alert thresholds adjusted to 60m
- Enabled Dovecot exporter scraping (port 9166)
- Added external SMTP monitor on public IP
Documentation:
- New docs/architecture/mailserver.md with full architecture
- New docs/architecture/mailserver-visual.html visualization
- Updated monitoring.md, CLAUDE.md, historical plan docs
2026-04-12 22:24:38 +01:00
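The "dual SPF violation" fixed in this commit refers to a domain publishing more than one SPF TXT record, which RFC 7208 treats as a permanent error. A minimal sketch of a check for that condition, plus the -all hardening, is shown below. The function name and sample records are hypothetical, and the check works on a list of TXT strings rather than performing a DNS lookup.

```python
def check_spf(txt_records):
    """Return a list of problems found in a domain's TXT records; empty means OK."""
    spf = []
    for record in txt_records:
        text = record.strip()
        lowered = text.lower()
        # An SPF record is a TXT record whose first term is exactly "v=spf1".
        if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
            spf.append(text)

    if not spf:
        return ["no SPF record found"]

    problems = []
    if len(spf) > 1:
        problems.append("multiple SPF records (permerror per RFC 7208)")
    for record in spf:
        # Hardened policy: the final term should be a hard fail.
        if record.split()[-1].lower() != "-all":
            problems.append(f"record does not end in -all: {record}")
    return problems

# Hypothetical records for illustration.
print(check_spf(["v=spf1 mx -all", "v=spf1 include:example.net ~all"]))
```

Records that delegate policy with a redirect= modifier have no trailing all term, so a fuller check would need to treat those separately.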
Viktor Barzin
8bc02d1401
state(rybbit): update encrypted state
2026-04-12 22:17:01 +01:00
Viktor Barzin
4e80ac40c4
state(mailserver): update encrypted state
2026-04-12 22:16:25 +01:00
Viktor Barzin
e71a65acc4
state(mailserver): update encrypted state
2026-04-12 22:15:44 +01:00
Viktor Barzin
887152194c
state(mailserver): update encrypted state
2026-04-12 22:12:43 +01:00
Viktor Barzin
333b289545
state(cloudflared): update encrypted state
2026-04-12 22:11:30 +01:00
Viktor Barzin
28934afb9a
state(cloudflared): update encrypted state
2026-04-12 22:10:33 +01:00
Viktor Barzin
d227a5c896
state(cloudflared): update encrypted state
2026-04-12 22:10:30 +01:00
Viktor Barzin
2ba456e070
state(mailserver): update encrypted state
2026-04-12 21:46:34 +01:00
Viktor Barzin
92881ee6af
state(mailserver): update encrypted state
2026-04-12 20:43:56 +01:00