infra/stacks/monitoring
Viktor Barzin dd2a8e640f monitoring: right-size loki memory request 3Gi->1Gi (quota 89%->79%)
monitoring-quota requests.memory sat at 89% (18.2/20Gi), tripping the
ResourceQuota>80% WARN. Root cause was over-provisioned requests, not real
usage: loki requested 3Gi but its VPA upperBound is 364Mi and actual ~315Mi.
prometheus's 4Gi is legitimately required (2Gi tmpfs WAL shares the cgroup;
OOMs at 3Gi during WAL replay) so it stays; grafana's main container is
already 512Mi. Trimmed loki to 1Gi request (~3x its observed ceiling; 4Gi
Burstable limit preserves query-spike headroom) -> quota 78.8%, clears the
WARN. NOTE: alloy DaemonSet (562Mi/node) grows with node count, so revisit
(bump the 20Gi quota) as the cluster expands.
2026-06-05 09:19:11 +00:00
..
modules/monitoring monitoring: right-size loki memory request 3Gi->1Gi (quota 89%->79%) 2026-06-05 09:19:11 +00:00
main.tf [forgejo] Tolerate missing Vault keys during Phase 0 bootstrap 2026-05-07 15:53:08 +00:00
secrets extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
terragrunt.hcl extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00