infra/stacks/monitoring/modules/monitoring
Viktor Barzin 6504911a77 matrix: open (tokenless) registration + bot mitigations + #security alert
User-chosen fully-open registration on tuwunel (no CAPTCHA support; browser
challenges break native clients). Bot defense is layered instead:
- Traefik rate-limit Middleware on a path-scoped /register ingress carve-out,
  keyed on the request Host (one GLOBAL /register cap) rather than source IP.
  The host is reachable via both Cloudflare-IPv4 (CF-Connecting-IP) and
  IPv6-direct (HE tunnel, no CF header), so a per-source key let IPv6 bots
  bypass the limit. 10/min, burst 20, per replica; CrowdSec is the hard
  backstop on both paths.
- Loki ruler rule MatrixNewUserRegistered -> lane=security -> existing
  #security Slack receiver (matches "registered on this server", never the
  rejection line). tuwunel's admin bot also posts signups to the admin room.
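
For illustration, the rate limit described above might be expressed roughly as
the Traefik Middleware below. This is a sketch, not the applied manifest: the
resource name and namespace are hypothetical, and the apiVersion assumes a
Traefik release that serves the traefik.io CRD group. Only the 10/min average,
the burst of 20, and the Host-based key come from this commit message.

```yaml
# Hypothetical name and namespace; the real resource lives in the matrix stack.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: matrix-register-ratelimit
  namespace: matrix
spec:
  rateLimit:
    average: 10      # 10 requests ...
    period: 1m       # ... per minute
    burst: 20
    sourceCriterion:
      requestHost: true   # key on the Host header, so all clients share one bucket
```

Traefik keeps rate-limit state in memory per instance, which is why the cap
applies per replica.

Similarly, the MatrixNewUserRegistered alert could look something like the
Loki ruler group below. The group name, stream selector, window, and
annotation text are assumptions; the alert name, the lane=security label, and
the matched substring are taken from the description above.

```yaml
groups:
  - name: matrix-security            # assumed group name
    rules:
      - alert: MatrixNewUserRegistered
        # Assumed selector and window; matches only the success line.
        expr: |
          sum(count_over_time({namespace="matrix"} |= "registered on this server" [5m])) > 0
        labels:
          lane: security             # routed to the existing #security receiver
        annotations:
          summary: New Matrix account registered
```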

Dropped the REGISTRATION_TOKEN env (secret/matrix + ESO kept for revert).
Applied via scripts/tg (matrix tier-1 + targeted monitoring configmap), so
[ci skip] to avoid CI full-applying monitoring (unrelated grafana-acl drift).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 13:27:02 +00:00
Name | Last commit | Date
--- | --- | ---
dashboards | monitoring(grafana): add professional "Cluster Logs" dashboard (Logs folder) | 2026-06-05 17:03:45 +00:00
server-power-cycle | Add broker-sync Terraform stack (#7) | 2026-04-17 21:17:45 +01:00
alloy.yaml | apiserver: enable audit logging (low-write Metadata) + ship to Loki | 2026-06-06 16:51:26 +00:00
authentik_walloff_probe.tf | Reapply "tripit: Gmail ingest (12-month) + vbarzin owner + plans@ forward-to-parse" | 2026-06-03 10:24:25 +00:00
Dockerfile | extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] | 2026-03-17 21:34:11 +00:00
goflow2.tf | monitoring: KEEL/tier ignore_changes on 5 exporters [ci skip] | 2026-05-31 15:33:30 +00:00
grafana.tf | monitoring(grafana): add professional "Cluster Logs" dashboard (Logs folder) | 2026-06-05 17:03:45 +00:00
grafana_chart_values.yaml | monitoring: protect grafana ingress with authentik + disable anonymous | 2026-05-10 17:01:50 +00:00
idrac.tf | monitoring: migrate R730 iDRAC scraping to SNMP (fast primary) + thin Redfish remnant | 2026-06-05 16:33:20 +00:00
k8s-monitoring-values.yaml | cleanup: remove calibre and audiobookshelf stacks after ebooks migration [ci skip] | 2026-03-25 23:56:07 +02:00
loki.tf | matrix: open (tokenless) registration + bot mitigations + #security alert | 2026-06-08 13:27:02 +00:00
loki.yaml | monitoring: right-size loki memory request 3Gi->1Gi (quota 89%->79%) | 2026-06-05 09:19:11 +00:00
loki_ingress.tf | monitoring: fix ingress auth-comment guard for loki-write-ingress | 2026-06-05 13:36:43 +00:00
main.tf | cluster-health: emergency-stop Keel + roll back image downgrades + quota raises | 2026-05-26 18:48:50 +00:00
prometheus.tf | monitoring: add local-only prometheus-query.lan ingress for ha-sofia SNMP sensors | 2026-06-05 17:25:06 +00:00
prometheus_chart_values.tpl | monitoring: NodeFilesystemFull 90%->95% + Synology storage runbook | 2026-06-05 18:18:31 +00:00
prometheus_snmp_chart_values.yaml | extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] | 2026-03-17 21:34:11 +00:00
pve_exporter.tf | monitoring: KEEL/tier ignore_changes on 5 exporters [ci skip] | 2026-05-31 15:33:30 +00:00
snmp_exporter.tf | monitoring: KEEL/tier ignore_changes on 5 exporters [ci skip] | 2026-05-31 15:33:30 +00:00
ups_snmp_values.yaml | monitoring: migrate R730 iDRAC scraping to SNMP (fast primary) + thin Redfish remnant | 2026-06-05 16:33:20 +00:00