equalize memory req=lim across 70+ containers using Prometheus 7d max data

After the node2 OOM incident, right-size memory across the cluster by setting
requests=limits based on max_over_time(container_memory_working_set_bytes[7d])
with 1.3x headroom. Eliminates ~37Gi overcommit gap.
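
The sizing rule above can be sketched as follows. This is an illustration only, not the script used for this commit: the function name `right_size_mib`, the 64Mi rounding step, and the sample peak value are assumptions. Only the 1.3x headroom and the 7-day `max_over_time` input come from the message.

```python
import math

MIB = 1024 * 1024

def right_size_mib(max_7d_bytes: float, headroom: float = 1.3, step_mib: int = 64) -> int:
    """Return one memory value in MiB to use for both request and limit.

    max_7d_bytes is the result of
    max_over_time(container_memory_working_set_bytes[7d]) for one container.
    Rounding up to a multiple of step_mib is an assumption, not part of the commit.
    """
    target_mib = max_7d_bytes * headroom / MIB
    # Round up to the next step so the value never falls below the 1.3x target.
    return max(step_mib, math.ceil(target_mib / step_mib) * step_mib)

# Hypothetical sample: a container whose 7-day peak working set was 140 MiB.
size = right_size_mib(140 * MIB)
print(f"requests = limits = {size}Mi")
```

Setting the request equal to the limit means the scheduler reserves exactly what the container is allowed to use, which is what removes the overcommit gap.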

Categories:
- Safe equalization (50 containers): set req=lim where the 7d max is well within the target
- Limit increases (8 containers): raise limits for services spiking above their current limits
- No Prometheus data (12 containers): conservatively set lim=req
- Exception: nextcloud keeps req=256Mi/lim=8Gi due to Apache memory spikes

Also increased dbaas namespace quota from 12Gi to 16Gi to accommodate mysql
4Gi limits across 3 replicas.
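
The quota change follows from simple arithmetic, sketched here. The helper name `quota_headroom_gi` is hypothetical, and the sketch assumes the quota caps the sum of memory limits in the namespace.

```python
def quota_headroom_gi(quota_gi: int, replicas: int, limit_gi: int) -> int:
    """Memory left in the namespace quota after the replicas claim their limits."""
    return quota_gi - replicas * limit_gi

# Three mysql replicas at 4Gi each use 12Gi in total.
print(quota_headroom_gi(12, 3, 4))  # old quota: nothing left for other pods
print(quota_headroom_gi(16, 3, 4))  # new quota: room remains for other workloads
```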
Viktor Barzin 2026-03-14 21:46:49 +00:00
parent eb0301b02b
commit 23019da8e5
39 changed files with 211 additions and 74 deletions

@@ -38,8 +38,8 @@ resource "helm_release" "democratic_csi" {
replicas = 2
driver = {
resources = {
-requests = { cpu = "25m", memory = "64Mi" }
-limits = { memory = "256Mi" }
+requests = { cpu = "25m", memory = "192Mi" }
+limits = { memory = "192Mi" }
}
}
}
@@ -47,8 +47,8 @@ resource "helm_release" "democratic_csi" {
node = {
driver = {
resources = {
-requests = { cpu = "25m", memory = "64Mi" }
-limits = { memory = "256Mi" }
+requests = { cpu = "25m", memory = "192Mi" }
+limits = { memory = "192Mi" }
}
}