[uptime-kuma] Opt-out external monitoring for every public ingress [ci skip]
## Context

After the previous commit migrated monitor discovery to per-ingress annotation (opt-in via `uptime.viktorbarzin.me/external-monitor=true`), coverage expanded from 13 → 26 monitors but still left ~99 public ingresses uncovered — notably Helm-managed services (authentik, grafana, vault, forgejo, ntfy) that don't go through `ingress_factory`, plus any `dns_type = "non-proxied"` ingress (Immich was a direct victim: `dns_type = "non-proxied"` → no annotation added → no monitor → invisible outage). The user's concern: "I should have known external Immich was down before users tried to open it."

## This change

Flipped the semantic from opt-in to **opt-out by default**:

- Every ingress whose host ends in `.viktorbarzin.me` gets a `[External] <label>` monitor automatically
- Only ingresses with annotation `uptime.viktorbarzin.me/external-monitor=false` are skipped
- Host dedup via a `seen` set (one monitor per hostname, regardless of how many Ingress resources share it)

## Verification

Triggered a manual CronJob run post-apply:

```
Sync complete: 102 created, 1 deleted, 23 unchanged
```

Coverage jumped from 26 → ~124 external monitors. All 6 Helm-managed services now have dedicated monitors:

- [External] immich, authentik, forgejo, grafana, ntfy, vault

## Scope

Only `stacks/uptime-kuma/modules/uptime-kuma/main.tf` (Python script in the CronJob resource). No RBAC or service account changes — the ones added in the previous commit still cover this path.
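The opt-out selection described above can be sketched as a small pure function. This is a minimal sketch, not the script itself: the input is assumed to be the decoded `items` list from the Kubernetes Ingress API, the `ANNOTATION_NAME` key value is hypothetical (the real constant lives elsewhere in the script), and host extraction order (TLS hosts, then rule hosts) is simplified.

```python
# Sketch of the opt-out monitor selection. ANNOTATION_NAME's value is an
# assumption for illustration; the real constant is defined in the script.
ANNOTATION_ENABLE = "uptime.viktorbarzin.me/external-monitor"
ANNOTATION_NAME = "uptime.viktorbarzin.me/external-monitor-name"  # hypothetical

def select_targets(ingresses, suffix=".viktorbarzin.me"):
    """Return one monitor target per public hostname, unless opted out."""
    targets, seen = [], set()
    for ing in ingresses:
        anns = (ing.get("metadata") or {}).get("annotations") or {}
        if anns.get(ANNOTATION_ENABLE, "").lower() == "false":
            continue  # explicit opt-out; everything else is monitored
        # Derive the host: prefer the TLS block, fall back to the first rule.
        host = None
        tls = (ing.get("spec") or {}).get("tls") or []
        if tls and tls[0].get("hosts"):
            host = tls[0]["hosts"][0]
        rules = (ing.get("spec") or {}).get("rules") or []
        if not host and rules:
            host = rules[0].get("host")
        if not host or not host.endswith(suffix):
            continue  # internal-only or non-public host
        if host in seen:
            continue  # one monitor per hostname, however many Ingresses share it
        seen.add(host)
        label = anns.get(ANNOTATION_NAME) or host.split(".")[0]
        targets.append({"name": label, "url": f"https://{host}"})
    return targets
```

With this shape, an ingress is skipped only when it carries the annotation set to `"false"`; a missing annotation now means "monitor me", which is the flip from the previous opt-in behavior.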
## Test plan

### Automated

```
$ kubectl -n uptime-kuma logs -l job-name=manual-sync-optout-1776422993 --tail=50 | grep -iE 'immich|authentik|grafana|forgejo|vault|ntfy'
Creating monitor: [External] authentik -> https://authentik.viktorbarzin.me
Creating monitor: [External] forgejo -> https://forgejo.viktorbarzin.me
Creating monitor: [External] immich -> https://immich.viktorbarzin.me
Creating monitor: [External] grafana -> https://grafana.viktorbarzin.me
Creating monitor: [External] ntfy -> https://ntfy.viktorbarzin.me
Creating monitor: [External] vault -> https://vault.viktorbarzin.me
```

### Manual Verification

1. Open `https://uptime.viktorbarzin.me` → confirm `[External] immich` exists
2. Simulate an Immich outage (scale deploy to 0 briefly) → external monitor should go red within the probe interval (5 min); internal monitor stays up (pod-level, from a different probe angle) → `ExternalAccessDivergence` alert fires after 15 min

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
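The divergence condition exercised in step 2 reduces to a simple predicate: the service is reachable internally while its external monitor has been down past the alert threshold. This is an illustrative sketch only; the actual `ExternalAccessDivergence` rule is defined in the alerting stack, not here, and the function name and parameters are made up for this example.

```python
def external_access_divergence(internal_up, external_down_minutes, threshold_min=15):
    """Illustrative predicate (not the real alert rule): fire only when the
    internal probe is healthy but the external monitor has been down for at
    least the alert threshold (15 min in the test plan above)."""
    return internal_up and external_down_minutes >= threshold_min
```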
parent 66d2d9916b
commit 366e2ab083
1 changed file with 12 additions and 7 deletions
`stacks/uptime-kuma/modules/uptime-kuma/main.tf`

```diff
@@ -345,7 +345,11 @@ API_SERVER = f"https://{os.environ.get('KUBERNETES_SERVICE_HOST', 'kubernetes.de
 def load_from_api():
-    """List ingresses via in-cluster API, filter by annotation, derive targets."""
+    """List ingresses via in-cluster API. Opt-OUT by default:
+    every ingress whose host matches *.viktorbarzin.me gets a monitor,
+    UNLESS its annotation `uptime.viktorbarzin.me/external-monitor` is `"false"`.
+    This covers Helm-managed ingresses (authentik, grafana, vault, forgejo, ntfy)
+    that don't go through ingress_factory."""
     with open(f"{SA_DIR}/token") as f:
         token = f.read().strip()
     ctx = ssl.create_default_context(cafile=f"{SA_DIR}/ca.crt")
@@ -355,10 +359,11 @@ def load_from_api():
     body = json.loads(resp.read())

     targets = []
+    seen = set()
     for ing in body.get("items", []):
         anns = (ing.get("metadata") or {}).get("annotations") or {}
-        if anns.get(ANNOTATION_ENABLE, "").lower() != "true":
-            continue
+        if anns.get(ANNOTATION_ENABLE, "").lower() == "false":
+            continue  # explicit opt-out
         tls = (ing.get("spec") or {}).get("tls") or []
         host = None
         if tls and tls[0].get("hosts"):
@@ -367,11 +372,11 @@ def load_from_api():
         rules = (ing.get("spec") or {}).get("rules") or []
         if rules:
             host = rules[0].get("host")
-        if not host:
-            ns = ing["metadata"]["namespace"]
-            nm = ing["metadata"]["name"]
-            print(f"WARN: ingress {ns}/{nm} annotated but has no host; skipping")
+        if not host or not host.endswith(".viktorbarzin.me"):
+            continue  # skip internal-only or non-public hosts
+        if host in seen:
+            continue
+        seen.add(host)
         label = anns.get(ANNOTATION_NAME) or host.split(".")[0]
         targets.append({"name": label, "url": f"https://{host}"})
     return targets
```
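The sync summary reported in the commit message ("102 created, 1 deleted, 23 unchanged") implies a reconcile between desired targets and existing monitors. A minimal sketch of such a plan step, assuming monitors are keyed by name (the real sync logic in the CronJob script may differ):

```python
def plan_sync(desired, existing):
    """Illustrative reconcile plan, assuming name-keyed monitor configs.

    desired/existing: dict of monitor name -> config.
    Returns (to_create, to_delete, unchanged) as sorted name lists.
    """
    to_create = sorted(n for n in desired if n not in existing)
    to_delete = sorted(n for n in existing if n not in desired)
    unchanged = sorted(n for n in desired
                       if n in existing and desired[n] == existing[n])
    return to_create, to_delete, unchanged
```

Counting the three lists yields exactly the kind of "created / deleted / unchanged" summary line the verification run printed.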