cluster-health: uptime_kuma check — only count status==0 as down

check_uptime_kuma flagged a monitor as down whenever its last heartbeat status was not 1, and also treated "no beats" as down. But in uptime-kuma, status 2 (PENDING, mid-retry) and status 3 (MAINTENANCE) are not outages, and no beats means no data. A monitor caught in a momentary pending/retry state at check time therefore produced a false "internal/external down(N)" WARN. This was observed twice on 2026-06-04 (Novelapp, then ha-sofia), for monitors that uptime-kuma itself logged zero downs against over 24h (0/2880 and 0/288 beats).

Count a monitor as down only on an explicit DOWN beat (status==0); pending, maintenance, and no-data are treated as not down. Real outages still flag, because uptime-kuma persists status==0 beats for genuine downs.
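
The rule can be sketched in isolation as follows. This is a minimal illustration, not the code in scripts/cluster_healthcheck.sh: the helper names last_status and is_down and the named status constants are hypothetical, and the heartbeat shape (a list of dicts, possibly nested one level inside a list) is assumed from the diff context.

```python
# Hypothetical standalone sketch of the "only status==0 is down" rule.
# Status codes as described in this commit message.
DOWN, UP, PENDING, MAINTENANCE = 0, 1, 2, 3


def last_status(beats):
    """Return the status of the most recent heartbeat, or None if no data."""
    if not beats:
        return None
    last_beat = beats[-1]
    # Assumed shape: the last entry may itself be a list of beats.
    if isinstance(last_beat, list):
        last_beat = last_beat[-1] if last_beat else {}
    if not isinstance(last_beat, dict):
        return None
    status = last_beat.get("status")
    # Unwrap enum-like objects that carry the code in a .value attribute.
    return getattr(status, "value", status)


def is_down(beats):
    """Down only on an explicit DOWN beat; pending, maintenance, no-data are not down."""
    return last_status(beats) == DOWN
```

With this shape, is_down([]) and is_down([{"status": PENDING}]) are both False, while is_down([{"status": DOWN}]) is True, which matches the behaviour the patch introduces via is_up = (status != 0).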
2026-06-04 08:18:54 +00:00 · 2026-06-04 08:18:54 +00:00 · 31b8104b43
commit 31b8104b43
parent bf6ede2b9e
1 changed file with 8 additions and 4 deletions
--- a/scripts/cluster_healthcheck.sh
+++ b/scripts/cluster_healthcheck.sh
@@ -819,16 +819,20 @@ try:
            continue

        beats = heartbeats.get(mid, [])
+        status = None
        if beats:
            last_beat = beats[-1]
            if isinstance(last_beat, list):
                last_beat = last_beat[-1] if last_beat else {}
-            status = last_beat.get("status", 0) if isinstance(last_beat, dict) else 0
+            status = last_beat.get("status") if isinstance(last_beat, dict) else None
            if hasattr(status, "value"):
                status = status.value
-            is_up = (status == 1)
-        else:
-            is_up = False
+        # Only an explicit DOWN (status==0) counts as down. PENDING (2,
+        # mid-retry) and MAINTENANCE (3) are NOT outages, and no beats = no
+        # data (not an outage). Counting pending/no-data as down caused
+        # recurring false WARNs (Novelapp, ha-sofia 2026-06-04) for monitors
+        # uptime-kuma itself logged 0 downs over 24h.
+        is_up = (status != 0)

        if is_external:
            if is_up: