Phase 1: Add 12 PrometheusRules for backup health alerting - PostgreSQL, MySQL, Vault, Vaultwarden, Redis staleness + never-succeeded alerts - CSIDriverCrashLoop alert for nfs-csi/iscsi-csi namespaces - Generic BackupCronJobFailed alert Phase 2: Fix backup rotation - etcd: timestamped snapshots instead of overwriting single file - Redis: timestamped RDB files with 7-day retention purge - PostgreSQL: retention increased from 7 to 14 days Phase 3: Fix MySQL password exposure - Move root password from command line arg to MYSQL_PWD env var via secretKeyRef Phase 5: Add restore runbooks - PostgreSQL, MySQL, Vault, etcd, Vaultwarden, full cluster rebuild
2.9 KiB
2.9 KiB
Restore Vaultwarden
Prerequisites
kubectlaccess to the cluster- Backup available on NFS at
/mnt/main/vaultwarden-backup/
Backup Location
- NFS:
/mnt/main/vaultwarden-backup/YYYY_MM_DD_HH_MM/(directory per backup) - Each backup contains:
db.sqlite3,rsa_key.pem,rsa_key.pub.pem,attachments/,sends/,config.json - Replicated to Synology NAS (192.168.1.13) via TrueNAS ZFS replication
- Retention: 30 days
- Schedule: Daily at 00:00
Backup Contents
| File | Purpose | Critical? |
|---|---|---|
db.sqlite3 |
All passwords, TOTP seeds, org data | Yes |
rsa_key.pem / rsa_key.pub.pem |
JWT signing keys | Yes — without these, all sessions invalidate |
attachments/ |
File attachments on vault items | Yes |
sends/ |
Bitwarden Send files | No |
config.json |
Server configuration | No — can be recreated |
Restore Procedure
1. Identify the backup to restore
# List available backups (directories sorted by date)
kubectl run vw-ls --rm -it --image=alpine \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"vaultwarden-backup"}}],"containers":[{"name":"vw-ls","image":"alpine","volumeMounts":[{"name":"backup","mountPath":"/backup"}],"command":["ls","-lt","/backup/"]}]}}' \
-n vaultwarden
2. Scale down Vaultwarden
kubectl scale deployment vaultwarden -n vaultwarden --replicas=0
3. Restore the backup
BACKUP_DIR="YYYY_MM_DD_HH_MM" # Set to desired backup
kubectl run vw-restore --rm -it --image=alpine \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"vaultwarden-backup"}},{"name":"data","persistentVolumeClaim":{"claimName":"vaultwarden-data"}}],"containers":[{"name":"vw-restore","image":"alpine","volumeMounts":[{"name":"backup","mountPath":"/backup"},{"name":"data","mountPath":"/data"}],"command":["/bin/sh","-c","cp /backup/'$BACKUP_DIR'/db.sqlite3 /data/db.sqlite3 && cp /backup/'$BACKUP_DIR'/rsa_key.pem /data/ && cp /backup/'$BACKUP_DIR'/rsa_key.pub.pem /data/ && cp -a /backup/'$BACKUP_DIR'/attachments /data/ 2>/dev/null; echo Restore complete"]}]}}' \
-n vaultwarden
4. Scale up Vaultwarden
kubectl scale deployment vaultwarden -n vaultwarden --replicas=1
# Wait for pod to be ready
kubectl wait --for=condition=Ready pod -l app=vaultwarden -n vaultwarden --timeout=120s
5. Verify restoration
# Check pod logs for startup errors
kubectl logs -n vaultwarden -l app=vaultwarden --tail=20
# Test web UI access
curl -s -o /dev/null -w "%{http_code}" https://vaultwarden.viktorbarzin.me/
6. Test login
Log in to the Vaultwarden web UI and verify:
- Can log in with your account
- Vault items are present and readable
- Attachments are accessible
- TOTP codes are generating correctly
Estimated Time
- Restore: ~5 minutes
- Verification: ~5 minutes