add hourly SQLite integrity check for vaultwarden with Prometheus alerting

- New CronJob runs PRAGMA integrity_check every hour
- Pushes vaultwarden_sqlite_integrity_ok metric to Prometheus pushgateway
- VaultwardenSQLiteCorrupt alert fires immediately on corruption (critical)
- VaultwardenIntegrityCheckStale alert if the last check is more than 2h old for 15m (warning)
- Prevents Vaultwarden from running on a corrupted DB unnoticed for days
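
The CronJob script itself is not shown in this part of the diff. As a rough illustration only, the check-and-push steps described above could be sketched in Python as follows. The function names, the `vaultwarden-integrity` job label, and the gateway URL parameter are hypothetical; the two metric names are taken from the alert expressions in this commit, and the push uses the Pushgateway's `PUT /metrics/job/<job>` endpoint.

```python
import sqlite3
import time
import urllib.request


def check_integrity(db_path: str) -> bool:
    """Return True if PRAGMA integrity_check reports a single 'ok' row."""
    # Open read-only so the check cannot create or modify the database file.
    # A missing file raises here; a real job would need to decide how to report that.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # A badly damaged file can fail before the pragma returns any rows.
        return False
    finally:
        conn.close()
    return rows == [("ok",)]


def render_metrics(ok: bool, now: float) -> str:
    """Build a Prometheus text-format payload with the two gauges."""
    return (
        "# TYPE vaultwarden_sqlite_integrity_ok gauge\n"
        f"vaultwarden_sqlite_integrity_ok {1 if ok else 0}\n"
        "# TYPE vaultwarden_sqlite_integrity_check_timestamp gauge\n"
        f"vaultwarden_sqlite_integrity_check_timestamp {int(now)}\n"
    )


def push(payload: str, gateway_url: str, job: str = "vaultwarden-integrity") -> None:
    """PUT the payload to a Pushgateway (gateway_url and job are assumptions)."""
    req = urllib.request.Request(
        f"{gateway_url}/metrics/job/{job}",
        data=payload.encode(),
        method="PUT",
    )
    urllib.request.urlopen(req, timeout=10).close()
```

A run would then look like `push(render_metrics(check_integrity(path), time.time()), gateway_url)`. Pushing the timestamp on every run, whether the check passes or fails, is what lets the staleness alert tell "check failed" apart from "check never ran".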
Viktor Barzin 2026-03-23 00:50:15 +02:00
parent 3b89a7d7e4
commit 311ff5dd9e
3 changed files with 182 additions and 0 deletions

@@ -636,6 +636,20 @@ serverFiles:
    severity: critical
  annotations:
    summary: "Vaultwarden has no available replicas — password manager down"
- alert: VaultwardenSQLiteCorrupt
  expr: vaultwarden_sqlite_integrity_ok == 0
  for: 0m
  labels:
    severity: critical
  annotations:
    summary: "Vaultwarden SQLite database failed integrity check — data corruption detected"
- alert: VaultwardenIntegrityCheckStale
  expr: (time() - vaultwarden_sqlite_integrity_check_timestamp) > 7200
  for: 15m
  labels:
    severity: warning
  annotations:
    summary: "Vaultwarden integrity check hasn't run in {{ $value | humanizeDuration }} (expected hourly)"
- alert: RedisBackupStale
  expr: (time() - kube_cronjob_status_last_successful_time{cronjob="redis-backup", namespace="redis"}) > 691200
  for: 30m
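
To make the timing of the two Vaultwarden rules concrete, here is a small Python model of their conditions. It is a simplification that ignores Prometheus's rule evaluation interval and is not how Prometheus itself evaluates rules; the function and constant names are made up for illustration, while the thresholds are copied from the rules in this hunk.

```python
STALE_THRESHOLD_S = 7200  # from the VaultwardenIntegrityCheckStale expr
STALE_FOR_S = 15 * 60     # from "for: 15m" on the same rule


def corrupt_condition(integrity_ok: float) -> bool:
    """VaultwardenSQLiteCorrupt: true as soon as the gauge reads 0 (for: 0m)."""
    return integrity_ok == 0


def stale_condition(now: float, last_check_ts: float) -> bool:
    """VaultwardenIntegrityCheckStale expr: last check is more than 2h old."""
    return (now - last_check_ts) > STALE_THRESHOLD_S


def earliest_stale_firing(last_check_ts: float) -> float:
    """Earliest time the stale alert can fire if no further check is pushed."""
    return last_check_ts + STALE_THRESHOLD_S + STALE_FOR_S
```

With these values the stale alert fires no earlier than 2h15m after the last pushed timestamp. Assuming the job really runs hourly, a single missed run should not be enough to trigger it, because the next run resets the timestamp before the condition has held for 15 minutes. Both expressions also return nothing while the metrics are absent, so neither alert fires until the job has pushed at least once.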