monitoring: right-size loki memory request 3Gi->1Gi (quota 89%->79%)
monitoring-quota requests.memory sat at 89% (18.2/20Gi), tripping the ResourceQuota>80% WARN. Root cause was over-provisioned requests, not real usage: loki requested 3Gi but its VPA upperBound is 364Mi and actual ~315Mi. prometheus's 4Gi is legitimately required (2Gi tmpfs WAL shares the cgroup; OOMs at 3Gi during WAL replay) so it stays; grafana's main container is already 512Mi. Trimmed loki to 1Gi request (~3x its observed ceiling; 4Gi Burstable limit preserves query-spike headroom) -> quota 78.8%, clears the WARN. NOTE: alloy DaemonSet (562Mi/node) grows with node count, so revisit (bump the 20Gi quota) as the cluster expands.
This commit is contained in:
parent
90d7c11c16
commit
dd2a8e640f
1 changed files with 7 additions and 1 deletions
|
|
@ -70,7 +70,13 @@ singleBinary:
|
||||||
resources:
|
resources:
|
||||||
requests:
|
requests:
|
||||||
cpu: 250m
|
cpu: 250m
|
||||||
memory: 3Gi
|
# Right-sized 2026-06-04 (3Gi->1Gi): VPA upperBound 364Mi, actual ~315Mi.
|
||||||
|
# 1Gi request is ~3x the observed ceiling; the 4Gi limit (Burstable)
|
||||||
|
# keeps headroom for query spikes. Frees 2Gi of monitoring-quota
|
||||||
|
# requests.memory, taking it 89%->~79% (under the >80% WARN). NOTE: the
|
||||||
|
# alloy DaemonSet (562Mi/node) grows with node count, so this can creep
|
||||||
|
# back over 80% as the cluster expands — bump the quota then.
|
||||||
|
memory: 1Gi
|
||||||
limits:
|
limits:
|
||||||
memory: 4Gi
|
memory: 4Gi
|
||||||
|
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue