---
name: uptime-kuma
description: |
  Manage Uptime Kuma monitoring via the Python API. Use when:
  (1) User asks to add, remove, or list monitors,
  (2) User asks about service uptime or monitoring status,
  (3) User asks to check what's being monitored,
  (4) User deploys a new service and needs monitoring added,
  (5) User mentions "uptime", "monitoring", "health check", or "uptime kuma".
  Uptime Kuma v2 running in Kubernetes, managed via the uptime-kuma-api Python library.
author: Claude Code
version: 1.0.0
date: 2026-02-14
---

# Uptime Kuma Monitoring Management

## Overview

- **URL**: `https://uptime.viktorbarzin.me`
- **Internal**: `uptime-kuma.uptime-kuma.svc.cluster.local:80`
- **Image**: `louislam/uptime-kuma:2`
- **Storage**: NFS at `/mnt/main/uptime-kuma` -> `/app/data`
- **API Library**: `uptime-kuma-api` (pip, available via PYTHONPATH)
- **Credentials**: admin / (from `UPTIME_KUMA_PASSWORD` env var)

## Python API Access

### Connection Pattern

```python
import os
from uptime_kuma_api import UptimeKumaApi, MonitorType

api = UptimeKumaApi('https://uptime.viktorbarzin.me')
api.login('admin', os.environ.get('UPTIME_KUMA_PASSWORD', ''))

# ... operations ...

api.disconnect()
```
### Execution

```bash
python3 -c "
import os
from uptime_kuma_api import UptimeKumaApi, MonitorType

api = UptimeKumaApi('https://uptime.viktorbarzin.me')
api.login('admin', os.environ.get('UPTIME_KUMA_PASSWORD', ''))
# ... your code ...
api.disconnect()
"
```
### Common Operations

#### List All Monitors

```python
monitors = api.get_monitors()
for m in monitors:
    print(f'{m["id"]:3d} | {m["name"]:30s} | {m["type"]:15s} | interval={m["interval"]}s')
```
#### Add HTTP Monitor

```python
api.add_monitor(
    type=MonitorType.HTTP,
    name="Service Name",
    url="http://service.namespace.svc.cluster.local",
    interval=120,
    maxretries=2,
)
```
#### Add PING Monitor

```python
api.add_monitor(
    type=MonitorType.PING,
    name="Host Name",
    hostname="10.0.20.1",
    interval=30,
    maxretries=3,
)
```
#### Add PORT Monitor

```python
api.add_monitor(
    type=MonitorType.PORT,
    name="Service Port",
    hostname="service.namespace.svc.cluster.local",
    port=8080,
    interval=120,
    maxretries=2,
)
```
#### Edit Monitor

```python
api.edit_monitor(monitor_id, interval=120, maxretries=2)
```

#### Delete Monitor

```python
api.delete_monitor(monitor_id)
```

#### Pause/Resume Monitor

```python
api.pause_monitor(monitor_id)
api.resume_monitor(monitor_id)
```
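The edit, delete, and pause operations above all take a monitor id, but callers usually start from a name. A small lookup helper, sketched under the assumptions that monitor names are unique and that `get_monitors()` returns dicts with `id` and `name` keys (as in the listing example above):

```python
def find_monitor_by_name(monitors, name):
    """Return the first monitor dict whose 'name' matches, or None.

    `monitors` is the list returned by api.get_monitors(); names are
    assumed unique (the sync CronJobs rely on the same assumption).
    """
    return next((m for m in monitors if m["name"] == name), None)

# Against a live connection (monitor name hypothetical):
# target = find_monitor_by_name(api.get_monitors(), "MySQL Standalone (dbaas)")
# if target:
#     api.edit_monitor(target["id"], interval=120, maxretries=2)
```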
## Monitor Types

- `MonitorType.HTTP` — HTTP(S) endpoint check
- `MonitorType.PING` — ICMP ping
- `MonitorType.PORT` — TCP port check
- `MonitorType.POSTGRES` — PostgreSQL connection
- `MonitorType.REDIS` — Redis connection
- `MonitorType.DNS` — DNS resolution check
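Database monitor types (POSTGRES, REDIS) take a connection string instead of a URL. A hedged sketch: the `database_connection_string` parameter name is assumed from the field names used by the internal-monitor sync; the monitor name and connection string are hypothetical — check the uptime-kuma-api docs before relying on the exact keyword.

```python
# Desired-state entry for a database monitor (all values hypothetical;
# the database_connection_string key is assumed, not verified).
db_monitor = {
    "name": "PostgreSQL (dbaas)",
    "database_connection_string": "postgres://monitor:secret@postgres.dbaas.svc.cluster.local:5432/postgres",
    "interval": 120,
    "maxretries": 2,
}

# Against a live connection:
# api.add_monitor(type=MonitorType.POSTGRES, **db_monitor)
```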
## Tiered Monitoring System

Monitors use tiered intervals to balance responsiveness with resource usage:

| Tier | Interval | Retries | Use For |
|------|----------|---------|---------|
| **1 - Critical** | 30s | 3 | Core infra (DNS, gateway, ingress, NFS, K8s API, auth, mail) |
| **2 - Important** | 120s | 2 | Actively used services (Nextcloud, Immich, Vaultwarden, etc.) |
| **3 - Standard** | 300s | 1 | Auxiliary/optional services (blog, games, tools) |

### Tier Assignment Guidelines

- **Tier 1**: If it goes down, multiple other services fail or the cluster is unreachable
- **Tier 2**: User-facing services that are actively used daily
- **Tier 3**: Nice-to-have services, tools, dashboards
### When Adding a New Service

Match the tier to the service's DEFCON level from CLAUDE.md:

- DEFCON 1-2 → Tier 1 (30s)
- DEFCON 3-4 → Tier 2 (120s)
- DEFCON 5 → Tier 3 (300s)
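The DEFCON-to-tier mapping above can be captured in a small helper (a sketch; interval and retry values come from the tier table, while the DEFCON levels themselves are defined in CLAUDE.md):

```python
def tier_for_defcon(defcon: int) -> dict:
    """Map a service's DEFCON level (1-5) to monitor parameters,
    following the tier table: 30s/3, 120s/2, 300s/1."""
    if defcon in (1, 2):
        return {"interval": 30, "maxretries": 3}
    if defcon in (3, 4):
        return {"interval": 120, "maxretries": 2}
    if defcon == 5:
        return {"interval": 300, "maxretries": 1}
    raise ValueError(f"unknown DEFCON level: {defcon}")

# Usage when adding a monitor (service name/URL hypothetical):
# api.add_monitor(type=MonitorType.HTTP, name="New Service",
#                 url="http://new-service.ns.svc.cluster.local",
#                 **tier_for_defcon(3))
```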
## Internal Service URL Pattern

Most K8s services follow: `http://<service-name>.<namespace>.svc.cluster.local:<port>`

The common port is 80. Exceptions:

- Homepage: port 3000
- Ollama: port 11434
- Loki: port 3100 (use `/ready` endpoint)
- Traefik dashboard: port 8080 (use `/dashboard/` path)
- K8s API: `https://10.0.20.100:6443`
- Immich: port 2283 (use `/api/server/ping`)
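The pattern and its exceptions can be wrapped in a small URL builder (a sketch; the exception map copies a few entries from the list above, and the service key names are assumed — adjust them to the actual K8s service names):

```python
# Port/path exceptions from the list above; everything else defaults to :80.
# Keys are assumed service names, not verified against the cluster.
EXCEPTIONS = {
    "homepage": (3000, ""),
    "ollama": (11434, ""),
    "loki": (3100, "/ready"),
    "immich": (2283, "/api/server/ping"),
}

def internal_url(service: str, namespace: str) -> str:
    """Build the in-cluster URL for a service, applying known exceptions."""
    port, path = EXCEPTIONS.get(service, (80, ""))
    return f"http://{service}.{namespace}.svc.cluster.local:{port}{path}"
```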
## Notes

1. Uptime Kuma uses Socket.IO (WebSocket) for its API, not REST
2. The `uptime-kuma-api` Python library wraps Socket.IO
3. Add `time.sleep(0.3)` between bulk operations to avoid overloading
4. Homepage dashboard widget slug: `cluster-internal`
5. Cloudflare-proxied at `uptime.viktorbarzin.me`
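Note 3's rate-limit advice can be packaged as a helper for bulk changes (a sketch; `op` is any per-monitor call such as `api.edit_monitor` or `api.pause_monitor`):

```python
import time

def bulk_apply(op, monitor_ids, delay=0.3, **kwargs):
    """Call op(monitor_id, **kwargs) for each id, sleeping between calls
    to avoid overloading the Socket.IO API (see note 3)."""
    results = []
    for mid in monitor_ids:
        results.append(op(mid, **kwargs))
        time.sleep(delay)
    return results

# Against a live connection (ids hypothetical):
# bulk_apply(api.edit_monitor, [12, 13, 14], interval=300, maxretries=1)
```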

## Terraform-Managed Monitors

There is NO `louislam/uptime-kuma` Terraform provider. Two patterns exist for
declarative monitor management in this stack:

- **External HTTPS monitors** — auto-discovered from ingress annotations by the
  `external-monitor-sync` CronJob (`*/10 * * * *`). Opt out via
  `uptime.viktorbarzin.me/external-monitor: "false"` on the ingress.
- **Internal monitors (DBs, non-HTTP)** — declared in the
  `local.internal_monitors` list in `stacks/uptime-kuma/modules/uptime-kuma/main.tf`
  and synced by the `internal-monitor-sync` CronJob. To add one, append to the
  list (provide `name`, `type`, `database_connection_string`,
  `database_password_vault_key`, `interval`, `retry_interval`, `max_retries`)
  and run `scripts/tg apply`. The sync is idempotent — it looks up monitors by
  name, creates them if missing, and patches them if drifted. Existing monitors
  keep their id and history.
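An appended `local.internal_monitors` entry might look like the following. This is a hypothetical sketch: the field names come from the list above, but all values (monitor name, connection string, Vault key, `retry_interval`) are placeholders, and the exact shape must match how the local is actually declared in `main.tf`.

```hcl
locals {
  internal_monitors = [
    # ... existing entries (e.g. the MySQL Standalone monitor) ...
    {
      name                        = "PostgreSQL Standalone (dbaas)" # hypothetical
      type                        = "postgres"
      database_connection_string  = "postgres://monitor@postgres.dbaas.svc.cluster.local:5432/postgres"
      database_password_vault_key = "postgres_db_password" # hypothetical Vault key
      interval                    = 120
      retry_interval              = 60 # assumed value
      max_retries                 = 2
    },
  ]
}
```

After editing, run `scripts/tg apply`; the next `internal-monitor-sync` run picks the entry up.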