infra

Viktor Barzin a89d4a7d2a anubis: pull f1 off Anubis (XHR-vs-challenge collision) + add latency alerts	2026-05-10 11:12:40 +00:00

f1.viktorbarzin.me is a SPA whose JS fetches /schedule, /embed, /embed-asset, … on the same path tree. With Anubis fronting `/`, those XHRs land on the challenge HTML even when the cookie should be valid, breaking the page with `Unexpected token '<', "<!doctype " ... is not valid JSON`. Removed Anubis from f1; it would need a path carve-out (the way wrongmove does for /api) to re-enable. Added a top-of-block comment so future me remembers why.

Plus four new Prometheus alerts in `Slow Ingress Latency` group (stacks/monitoring/.../prometheus_chart_values.tpl):
- IngressTTFBHigh (warn, 10m, avg latency >1s)
- IngressTTFBCritical (crit, 5m, avg latency >3s)
- IngressErrorRate5xxHigh (crit, 5m, 5xx >5%)
- AnubisChallengeStoreErrors (crit, 5m, any 5xx on anubis services via Traefik; proxies for the in-pod challenge-store error since Anubis itself only exposes Go-runtime metrics)

Notes from the alert author: avg-not-p95 because the existing Prometheus scrape config drops traefik bucket series; once those are restored, swap to histogram_quantile(0.95).

TraefikDown inhibit rule extended to suppress these four during a Traefik outage.
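
For orientation, the avg-not-p95 approach described in the commit notes could look roughly like the rule fragment below. This is a sketch, not the contents of prometheus_chart_values.tpl. The Traefik metric names are the standard ones Traefik exports. The 5m rate window, the per-service grouping, and the `severity` label values are assumptions. The alert names, `for` durations, and thresholds come from the commit message.

```yaml
# Hypothetical sketch of two of the four alerts; nesting inside the Helm
# values template will differ from this bare rule-group layout.
groups:
  - name: Slow Ingress Latency
    rules:
      - alert: IngressTTFBHigh
        # Average latency = rate of summed durations / rate of request count.
        # Uses only the _sum and _count series, so it still works when the
        # scrape config drops the _bucket series.
        expr: |
          sum by (service) (rate(traefik_service_request_duration_seconds_sum[5m]))
            /
          sum by (service) (rate(traefik_service_request_duration_seconds_count[5m]))
            > 1
        for: 10m
        labels:
          severity: warning   # assumed label value for "warn"
      - alert: IngressErrorRate5xxHigh
        # Share of responses with a 5xx status code, per service.
        expr: |
          sum by (service) (rate(traefik_service_requests_total{code=~"5.."}[5m]))
            /
          sum by (service) (rate(traefik_service_requests_total[5m]))
            > 0.05
        for: 5m
        labels:
          severity: critical  # assumed label value for "crit"
      # Once bucket series are scraped again, the latency expr could become:
      #   histogram_quantile(0.95, sum by (le, service)
      #     (rate(traefik_service_request_duration_seconds_bucket[5m]))) > 1
```

An average hides tail latency, so a handful of very slow requests on a busy service may not trip the 1s threshold. That trade-off is presumably why the author plans to move to a quantile once the bucket series are restored.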
dashboards	monitoring(wealth): monthly contrib-vs-mkt as line chart, not bars	2026-05-07 23:29:35 +00:00
server-power-cycle	Add broker-sync Terraform stack (#7)	2026-04-17 21:17:45 +01:00
alloy.yaml	extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip]	2026-03-17 21:34:11 +00:00
Dockerfile	extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip]	2026-03-17 21:34:11 +00:00
goflow2.tf	[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]	2026-04-18 21:19:48 +00:00
grafana.tf	openclaw: realtime usage dashboard via Prometheus exporter sidecar	2026-05-07 23:29:32 +00:00
grafana_chart_values.yaml	grafana: env-var datasources + reloader so Vault rotations stop breaking dashboards	2026-05-10 11:12:39 +00:00
idrac.tf	[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]	2026-04-18 21:19:48 +00:00
k8s-monitoring-values.yaml	cleanup: remove calibre and audiobookshelf stacks after ebooks migration [ci skip]	2026-03-25 23:56:07 +02:00
loki.tf	[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]	2026-04-18 21:19:48 +00:00
loki.yaml	[infra] TrueNAS decommission — remove active references from Terraform + configs	2026-04-19 16:57:05 +00:00
main.tf	[forgejo] Phases 3+4+5: cutover, decommission, docs sweep	2026-05-07 23:29:34 +00:00
prometheus.tf	[monitoring] Tuya Cloud root-cause alert + cascade suppression	2026-04-23 09:59:48 +00:00
prometheus_chart_values.tpl	anubis: pull f1 off Anubis (XHR-vs-challenge collision) + add latency alerts	2026-05-10 11:12:40 +00:00
prometheus_snmp_chart_values.yaml	extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip]	2026-03-17 21:34:11 +00:00
pve_exporter.tf	[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]	2026-04-18 21:19:48 +00:00
snmp_exporter.tf	[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]	2026-04-18 21:19:48 +00:00
ups_snmp_values.yaml	extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip]	2026-03-17 21:34:11 +00:00