[uptime-kuma] Opt-out external monitoring for every public ingress [ci skip]

## Context
After the previous commit migrated monitor discovery to per-ingress annotations
(opt-in via `uptime.viktorbarzin.me/external-monitor=true`), coverage expanded
from 13 to 26 monitors but still left ~99 public ingresses uncovered — notably
Helm-managed services (authentik, grafana, vault, forgejo, ntfy) that don't
go through `ingress_factory`, plus any ingress with `dns_type = "non-proxied"`
(Immich was a direct victim: `dns_type = "non-proxied"` → no annotation added
→ no monitor → invisible outage).

The user's concern: "I should have known external Immich was down before
users tried to open it."

## This change
Flipped the semantics from opt-in to **opt-out by default**:
- Every ingress whose host ends in `.viktorbarzin.me` gets a `[External] <label>`
  monitor automatically
- Only ingresses with annotation `uptime.viktorbarzin.me/external-monitor=false`
  are skipped
- Host dedup via a `seen` set (one monitor per hostname, regardless of how
  many Ingress resources share it)
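
Condensed into a pure function, the rules above look roughly like this (a sketch
for illustration: the `ANNOTATION_NAME` key is an assumption, since only the
enable annotation is spelled out in this change):

```python
ANNOTATION_ENABLE = "uptime.viktorbarzin.me/external-monitor"
# Assumed key for custom monitor labels; not spelled out in this change.
ANNOTATION_NAME = "uptime.viktorbarzin.me/external-monitor-name"

def select_targets(ingresses):
    """Return [{'name', 'url'}] per public host: opt-out check, suffix filter, dedup."""
    targets, seen = [], set()
    for ing in ingresses:
        anns = (ing.get("metadata") or {}).get("annotations") or {}
        if anns.get(ANNOTATION_ENABLE, "").lower() == "false":
            continue  # explicit opt-out wins
        spec = ing.get("spec") or {}
        tls = spec.get("tls") or []
        host = tls[0]["hosts"][0] if tls and tls[0].get("hosts") else None
        if not host:
            rules = spec.get("rules") or []
            host = rules[0].get("host") if rules else None
        if not host or not host.endswith(".viktorbarzin.me"):
            continue  # internal-only or non-public host
        if host in seen:
            continue  # one monitor per hostname, however many Ingresses share it
        seen.add(host)
        label = anns.get(ANNOTATION_NAME) or host.split(".")[0]
        targets.append({"name": label, "url": f"https://{host}"})
    return targets
```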

## Verification
Triggered a manual CronJob run post-apply:
```
Sync complete: 102 created, 1 deleted, 23 unchanged
```
Coverage jumped from 26 to ~124 external monitors. The five Helm-managed
services plus Immich now have dedicated monitors:
- [External] immich, authentik, forgejo, grafana, ntfy, vault
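
For future CI checks, the summary line can be asserted on mechanically. A
hypothetical helper (the `Sync complete:` format is taken from the run above):

```python
import re

def parse_sync_summary(line):
    """Extract counts from a 'Sync complete: X created, Y deleted, Z unchanged' line."""
    m = re.search(r"Sync complete: (\d+) created, (\d+) deleted, (\d+) unchanged", line)
    if not m:
        raise ValueError(f"unexpected summary line: {line!r}")
    created, deleted, unchanged = map(int, m.groups())
    return {"created": created, "deleted": deleted, "unchanged": unchanged}
```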

## Scope
Only `stacks/uptime-kuma/modules/uptime-kuma/main.tf` (Python script in the
CronJob resource). No RBAC or service account changes — the ones added in the
previous commit still cover this path.

## Test plan

### Automated
```
$ kubectl -n uptime-kuma logs -l job-name=manual-sync-optout-1776422993 --tail=50 | grep -iE 'immich|authentik|grafana|forgejo|vault|ntfy'
Creating monitor: [External] authentik -> https://authentik.viktorbarzin.me
Creating monitor: [External] forgejo   -> https://forgejo.viktorbarzin.me
Creating monitor: [External] immich    -> https://immich.viktorbarzin.me
Creating monitor: [External] grafana   -> https://grafana.viktorbarzin.me
Creating monitor: [External] ntfy      -> https://ntfy.viktorbarzin.me
Creating monitor: [External] vault     -> https://vault.viktorbarzin.me
```

### Manual Verification
1. Open `https://uptime.viktorbarzin.me` → confirm `[External] immich` exists
2. Simulate an Immich outage (briefly scale the deployment to 0) → the external
   monitor should go red within the probe interval (5 min); the internal monitor
   (pod-level, probing from a different angle) stays up → the
   `ExternalAccessDivergence` alert fires after 15 min
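
The divergence condition being exercised can be modeled as a predicate over
paired probe samples (illustrative only; the real alert lives in the alerting
rules, and the one-sample-per-minute spacing here is an assumption):

```python
def divergence_firing(samples, threshold=15):
    """True once the external probe has been down while the internal probe
    stayed up for `threshold` consecutive samples (assumed one per minute)."""
    streak = 0
    for internal_up, external_up in samples:
        # Divergence means: reachable from inside, unreachable from outside.
        streak = streak + 1 if (internal_up and not external_up) else 0
        if streak >= threshold:
            return True
    return False
```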

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-04-17 11:12:00 +00:00
parent 66d2d9916b
commit 366e2ab083


```diff
@@ -345,7 +345,11 @@ API_SERVER = f"https://{os.environ.get('KUBERNETES_SERVICE_HOST', 'kubernetes.de
 def load_from_api():
-    """List ingresses via in-cluster API, filter by annotation, derive targets."""
+    """List ingresses via in-cluster API. Opt-OUT by default:
+    every ingress whose host matches *.viktorbarzin.me gets a monitor,
+    UNLESS its annotation `uptime.viktorbarzin.me/external-monitor` is `"false"`.
+    This covers Helm-managed ingresses (authentik, grafana, vault, forgejo, ntfy)
+    that don't go through ingress_factory."""
     with open(f"{SA_DIR}/token") as f:
         token = f.read().strip()
     ctx = ssl.create_default_context(cafile=f"{SA_DIR}/ca.crt")
@@ -355,10 +359,11 @@ def load_from_api():
     body = json.loads(resp.read())
     targets = []
+    seen = set()
     for ing in body.get("items", []):
         anns = (ing.get("metadata") or {}).get("annotations") or {}
-        if anns.get(ANNOTATION_ENABLE, "").lower() != "true":
-            continue
+        if anns.get(ANNOTATION_ENABLE, "").lower() == "false":
+            continue  # explicit opt-out
         tls = (ing.get("spec") or {}).get("tls") or []
         host = None
         if tls and tls[0].get("hosts"):
@@ -367,11 +372,11 @@ def load_from_api():
         rules = (ing.get("spec") or {}).get("rules") or []
         if rules:
             host = rules[0].get("host")
-        if not host:
-            ns = ing["metadata"]["namespace"]
-            nm = ing["metadata"]["name"]
-            print(f"WARN: ingress {ns}/{nm} annotated but has no host; skipping")
+        if not host or not host.endswith(".viktorbarzin.me"):
+            continue  # skip internal-only or non-public hosts
+        if host in seen:
+            continue
+        seen.add(host)
         label = anns.get(ANNOTATION_NAME) or host.split(".")[0]
         targets.append({"name": label, "url": f"https://{host}"})
     return targets
```