infra/stacks/monitoring/modules/monitoring
Viktor Barzin 2711d4af05 monitoring/loki: bump memory request 2Gi → 3Gi (close gap to 4Gi limit)
krr 2026-05-22 flagged loki as under-requested by 1.9 GiB. Live working
set is sitting at ~3 GiB during normal ingestion; the existing 2 GiB
request meant the scheduler didn't reserve enough room and the pod risked
eviction. Limit stays at 4 GiB (documented ceiling in loki.yaml).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 01:10:55 +00:00
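
For illustration, the change described in the commit above might look roughly like the fragment below. This is a sketch only: it assumes loki.yaml is a Helm values file for the Loki chart running in single-binary mode, and the `singleBinary.resources` key path is an assumption, since the file's contents are not shown in this listing. Only the 3Gi request and 4Gi limit come from the commit message.

```yaml
# Hypothetical excerpt of loki.yaml; the real key path may differ.
singleBinary:
  resources:
    requests:
      memory: 3Gi   # was 2Gi; raised to match the ~3 GiB live working set
    limits:
      memory: 4Gi   # unchanged, the documented ceiling
```

Raising the request closer to actual usage lets the scheduler reserve that memory on the node. When a node runs short of memory, the kubelet evicts pods whose usage exceeds their request first, so a request at or above the working set reduces that risk.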
..
dashboards fire-planner: COL refresh CronJob + Grafana Cost-of-Living dashboard 2026-05-22 14:15:38 +00:00
server-power-cycle Add broker-sync Terraform stack (#7) 2026-04-17 21:17:45 +01:00
alloy.yaml alloy: switch pod log shipping from apiserver to file-tail 2026-05-21 08:27:34 +00:00
Dockerfile extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
goflow2.tf [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
grafana.tf fire-planner: COL refresh CronJob + Grafana Cost-of-Living dashboard 2026-05-22 14:15:38 +00:00
grafana_chart_values.yaml monitoring: protect grafana ingress with authentik + disable anonymous 2026-05-10 17:01:50 +00:00
idrac.tf infra: document auth = "app|none" tier on every legacy ingress 2026-05-11 19:25:48 +00:00
k8s-monitoring-values.yaml cleanup: remove calibre and audiobookshelf stacks after ebooks migration [ci skip] 2026-03-25 23:56:07 +02:00
loki.tf security(wave1): W1.1 audit-log shipping LIVE + W1.5 trusted-registries Enforce LIVE 2026-05-19 06:37:54 +00:00
loki.yaml monitoring/loki: bump memory request 2Gi → 3Gi (close gap to 4Gi limit) 2026-05-24 01:10:55 +00:00
main.tf keel: enroll 15 critical-path namespaces for digest-only auto-update 2026-05-17 12:13:22 +00:00
prometheus.tf fix: HA Sofia REST sensors + PVC drift safety 2026-05-10 21:48:29 +00:00
prometheus_chart_values.tpl monitoring: MetalLBSpeakerDown for: 2m → 10m (was upgrade-chain regression) 2026-05-23 09:32:41 +00:00
prometheus_snmp_chart_values.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
pve_exporter.tf [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
snmp_exporter.tf infra: document auth = "app|none" tier on every legacy ingress 2026-05-11 19:25:48 +00:00
ups_snmp_values.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00