Documents the 2026-06-13 right-sizing review: Kuma is already lean (~1 check/s, 227 monitors mostly at 300s, 77MB on shared MySQL, 30d retention); the 'scraping too much' concern traced to a fixed socket.io login-timeout incident, not load. Records the deliberate decisions (keep per-service [External] monitors over canaries; keep datastore on shared mysql.dbaas) with rejected alternatives + rationale, plus the known internal-sync no-prune gap (stale Goldilocks monitor cleaned up by hand). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Uptime Kuma — Context

Glossary for the uptime-kuma monitoring context. Terms only — no implementation
detail. Decisions live in `docs/adr/`.

## Glossary

**Active check (poll)** — Uptime Kuma actively probes a target on an interval
(HTTP / TCP / ping / DB). This is *polling*, not "scraping." Prometheus *scrapes*
exporters; Kuma *polls* targets. (Note: Prometheus does **not** scrape Kuma; the
two are separate monitoring lanes.)

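As an illustration of the term only, polling can be sketched as below. This is not Uptime Kuma's code: `CheckResult`, `poll_once`, and `poll_n` are hypothetical names, and the probe is an arbitrary callable standing in for an HTTP / TCP / ping / DB check.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one active check (hypothetical shape)."""
    up: bool
    latency_ms: float


def poll_once(probe: Callable[[], bool],
              clock: Callable[[], float] = time.monotonic) -> CheckResult:
    """Actively probe the target once and time the probe."""
    started = clock()
    try:
        up = bool(probe())
    except Exception:
        # Any probe failure (timeout, refused connection, ...) counts as down.
        up = False
    latency_ms = (clock() - started) * 1000.0
    return CheckResult(up=up, latency_ms=latency_ms)


def poll_n(probe: Callable[[], bool],
           interval_s: float,
           rounds: int,
           sleep: Callable[[float], None] = time.sleep) -> List[CheckResult]:
    """Run `rounds` checks, waiting `interval_s` seconds between them."""
    results = []
    for i in range(rounds):
        results.append(poll_once(probe))
        if i < rounds - 1:
            # The checker initiates every probe; the target never pushes data.
            sleep(interval_s)
    return results
```

The point of the sketch is the direction of the call: the checker reaches out to the target on its own schedule, which is what distinguishes a poll from a Prometheus scrape of an exporter endpoint.
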
**Monitor** — one configured target plus its check definition.

**Internal monitor** — probes a service on its in-cluster address
(`*.svc.cluster.local`). Answers "is the service itself healthy?"

**`[External]` monitor** — probes a service via its full public path
(DNS → Cloudflare → cloudflared tunnel → Traefik). Answers "is the service
reachable the way users reach it?" One such monitor is maintained per externally
reachable service, by deliberate choice (see ADR-0001).

**Heartbeat** — one recorded check result (up/down + latency), persisted to the
datastore.

**External-access divergence** — the condition where a service is healthy
*internally* but its `[External]` path is down. In other words, the shared
Cloudflare/tunnel/Traefik path is broken while the service itself is fine.
Surfaced by the `ExternalAccessDivergence` alert.

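The divergence condition can be written as a small predicate. This is a minimal sketch of the logic only: how the real `ExternalAccessDivergence` alert is evaluated is not described here, and `ServiceStatus`, `is_external_access_divergence`, and `divergent_services` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ServiceStatus:
    """Latest state of a service's two monitors (hypothetical shape)."""
    service: str
    internal_up: bool   # result of the internal monitor
    external_up: bool   # result of the [External] monitor


def is_external_access_divergence(status: ServiceStatus) -> bool:
    """True when the service is healthy internally but unreachable externally."""
    return status.internal_up and not status.external_up


def divergent_services(statuses: Iterable[ServiceStatus]) -> List[str]:
    """Names of the services whose internal and external views disagree."""
    return [s.service for s in statuses if is_external_access_divergence(s)]
```

A service that is down on both monitors is deliberately not divergent under this predicate: that case points at the service itself, not at the shared public path.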