Viktor Barzin
82f674a0b4
rename weekly-backup → daily-backup across scripts, timers, services, and docs [ci skip]
...
Reflects the schedule change from weekly to daily. All references updated:
- scripts/weekly-backup.{sh,timer,service} → daily-backup.*
- Pushgateway job name: weekly-backup → daily-backup
- Prometheus metric names: weekly_backup_* → daily_backup_*
- All docs, runbooks, AGENTS.md, CLAUDE.md, proxmox-inventory
- offsite-sync dependency: After=daily-backup.service
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:37:04 +00:00
Viktor Barzin
38d51ab0af
deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip]
...
- Migrate Immich (8 NFS PVs, 1.1TB) from TrueNAS to Proxmox host NFS
- Update config.tfvars nfs_server to 192.168.1.127 (Proxmox)
- Update nfs-csi StorageClass share to /srv/nfs
- Update scripts (weekly-backup, cluster-healthcheck) to Proxmox IP
- Delete obsolete TrueNAS scripts (nfs_exports.sh, truenas-status.sh)
- Rewrite nfs-health.sh for Proxmox NFS monitoring
- Update Freedify nfs_music_server default to Proxmox
- Mark CloudSync monitor CronJob as deprecated
- Update Prometheus alert summaries
- Update all architecture docs, AGENTS.md, and reference docs
- Zero PVs remain on TrueNAS — VM ready for decommission
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:42:07 +00:00
Viktor Barzin
1c7998163d
extend backup-verify.sh with full 3-2-1 health inspection
...
18 checks covering all backup layers:
- Layer 1: LVM snapshot freshness/status/count, thin pool free, timer
- Layer 2: Weekly backup freshness/status/timer, sda mount/usage,
PVC data/NFS mirror/pfsense freshness
- Layer 3: Offsite sync freshness/status/timer
- DB CronJobs: age check with per-service thresholds
- CNPG backups: existing check preserved
New --fix flag for conservative auto-remediation:
- Re-enable disabled timers
- Clear stale lockfiles
- Mount /mnt/backup if unmounted
2026-04-09 22:48:49 +01:00
Viktor Barzin
d6afbe84c8
post-mortem v2: pipeline team architecture with 4-stage agents [ci skip]
...
Split monolithic orchestrator into triage (haiku), historian (sonnet),
and report-writer (opus) stages. Each stage gets its own tool budget.
Added sev-context.sh for structured cluster context gathering.
2026-03-16 21:59:34 +00:00
Viktor Barzin
ff83ec3325
add infrastructure agent team: 8 specialized agents + 14 diagnostic scripts
...
Agents: devops-engineer, dba, security-engineer, sre, network-engineer,
platform-engineer, observability-engineer, home-automation-engineer.
Scripts: deploy-status, db-health, backup-verify, tls-check, crowdsec-status,
authentik-audit, oom-investigator, resource-report, dns-check, network-health,
nfs-health, truenas-status, platform-status, monitoring-health.
Also: known-issues.md suppression list, cluster-health-checker port-forward fix.
2026-03-15 02:01:07 +00:00
Viktor Barzin
160fda882f
authentik: cleanup unused resources + add invitation enrollment flow [ci skip]
...
Cleanup:
- Deleted 5 unused flows (enrollment-inviation, headscale-auth/authz, default-enrollment, oauth-enrollment)
- Deleted 8 orphaned stages bound only to deleted flows
- Deleted authentik Read-only group and role (0 users)
- Deleted 2 unbound policies (map github username, Map Google Attributes)
Invitation enrollment:
- Created invitation-enrollment flow with 5 stages (invitation validation,
identification with social login, prompt, user write, auto-login)
- Set all OAuth sources (Google/GitHub/Facebook) enrollment_flow to invitation-enrollment
- New users can only sign up via single-use invitation links
- Added authentik-invite.sh script for invitation management
- Updated reference docs and authentik skill
2026-03-13 22:21:10 +00:00