[alerts] Fix status-page-pusher crash + Prometheus backup push
## status-page-pusher (ExternalAccessDivergence false positive)

The pusher was crashing with `AttributeError: 'list' object has no attribute 'get'` at line 122 because the uptime-kuma-api library changed the heartbeats return format. Fixed by making beat flattening more robust: look up the monitor by integer or string ID, flatten one level of nested lists while keeping plain dict entries, and add an isinstance check before calling `.get()` on the latest beat.

## Prometheus backup (PrometheusBackupNeverRun)

The backup sidecar's Pushgateway push was silently failing because `wget --post-file=-` needs `--header="Content-Type: text/plain"` for Pushgateway to accept the Prometheus exposition format. Added the header. Also manually pushed the metric to clear the `absent()` alert immediately.

Note: ExternalAccessDivergence still fires because 5 services (ollama, pdf, poison, dns, travel) ARE genuinely externally unreachable but internally up. This is a real issue (likely Cloudflare tunnel routing), not a false positive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Commit cdc851fc63 (parent eef4242408)
2 changed files with 11 additions and 6 deletions
@@ -246,7 +246,7 @@ server:
 ls -t /backup/prometheus_*.tar.gz 2>/dev/null | tail -n +3 | xargs rm -f 2>/dev/null

 # Push success metric to Pushgateway for alerting
-echo "prometheus_backup_last_success_timestamp $(date +%s)" | wget -qO- --post-file=- http://prometheus-prometheus-pushgateway.monitoring:9091/metrics/job/prometheus-backup 2>/dev/null
+printf "prometheus_backup_last_success_timestamp %s\n" "$(date +%s)" | wget -qO- --header="Content-Type: text/plain" --post-file=- http://prometheus-prometheus-pushgateway.monitoring:9091/metrics/job/prometheus-backup 2>/dev/null

 echo "$(date) Backup complete. Files in /backup:"
 ls -lh /backup/prometheus_*.tar.gz 2>/dev/null || echo " (none)"
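For readers who want to see the shape of the request, the Pushgateway push in the hunk above can be sketched in Python. This is a minimal sketch, not the sidecar's code: `build_payload`, `build_request`, and `push` are hypothetical helper names, and the URL and job name are copied from the diff. It sets the same `Content-Type: text/plain` header the commit adds.

```python
import time
import urllib.request

# URL and job name copied from the diff; adjust for other deployments.
PUSHGATEWAY = "http://prometheus-prometheus-pushgateway.monitoring:9091"
JOB = "prometheus-backup"


def build_payload(metric: str, value: int) -> bytes:
    """Render one sample line; the trailing newline terminates the sample."""
    return f"{metric} {value}\n".encode("utf-8")


def build_request(base_url: str, job: str, payload: bytes) -> urllib.request.Request:
    """Build a POST to /metrics/job/<job> with an explicit text/plain body."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/metrics/job/{job}",
        data=payload,
        method="POST",
        headers={"Content-Type": "text/plain"},
    )


def push(base_url: str = PUSHGATEWAY, job: str = JOB) -> int:
    """Push the current Unix time as the last-success timestamp.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a
    rejected push is reported instead of being silently dropped.
    """
    payload = build_payload(
        "prometheus_backup_last_success_timestamp", int(time.time())
    )
    with urllib.request.urlopen(build_request(base_url, job, payload), timeout=5) as resp:
        return resp.status
```

Calling `push()` from the backup job would return the HTTP status on success. Unlike the shell one-liner, which runs `wget -q` with stderr sent to `/dev/null`, a rejected request here raises an exception, which makes this class of failure harder to miss.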
@@ -212,13 +212,18 @@ for m in monitors:
     else:
         # Get latest heartbeat for current status
         mid = m["id"]
-        mon_beats = heartbeats.get(mid, [])
+        mon_beats = heartbeats.get(mid, heartbeats.get(str(mid), []))
         if mon_beats:
-            # Flatten if nested lists
-            if mon_beats and isinstance(mon_beats[0], list):
-                mon_beats = [b for sublist in mon_beats for b in sublist]
+            # Flatten nested lists (API format varies by version)
+            flat = []
+            for item in mon_beats:
+                if isinstance(item, list):
+                    flat.extend(item)
+                elif isinstance(item, dict):
+                    flat.append(item)
+            mon_beats = flat if flat else mon_beats
         latest = mon_beats[-1] if mon_beats else None
-        if latest and beat_status_is_up(latest.get("status", 0)):
+        if latest and isinstance(latest, dict) and beat_status_is_up(latest.get("status", 0)):
             status = "up"
         else:
             status = "down"
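The loop in this hunk flattens one level of nesting. If the heartbeat format changes again and nests deeper, a recursive variant might be easier to reason about. The following is a sketch under assumptions, not the code in this commit: `flatten_beats` and `latest_status` are hypothetical names, and the `is_up` predicate stands in for `beat_status_is_up`, whose definition is not shown in the diff.

```python
from typing import Any, Callable


def flatten_beats(raw: Any) -> list:
    """Collect every dict found in arbitrarily nested lists, in order.

    Anything that is neither a dict nor a list/tuple is dropped, so the
    caller never calls .get() on an unexpected type.
    """
    if isinstance(raw, dict):
        return [raw]
    if isinstance(raw, (list, tuple)):
        out = []
        for item in raw:
            out.extend(flatten_beats(item))
        return out
    return []


def latest_status(heartbeats: dict, monitor_id: int,
                  is_up: Callable[[Any], bool]) -> str:
    """Return "up" or "down" from the most recent heartbeat of a monitor."""
    # Try the integer key first, then the string key, as the commit does.
    raw = heartbeats.get(monitor_id, heartbeats.get(str(monitor_id), []))
    beats = flatten_beats(raw)
    latest = beats[-1] if beats else None
    if latest is not None and is_up(latest.get("status", 0)):
        return "up"
    return "down"
```

With a predicate such as `lambda s: s == 1` (assuming status `1` means up), `latest_status({"7": [[{"status": 1}]]}, 7, ...)` resolves the string key, flattens the nested list, and reports the monitor as up. A monitor with no heartbeats falls through to "down", matching the behaviour of the patched loop.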