infra/stacks/platform/modules
Viktor Barzin 23019da8e5 equalize memory req=lim across 70+ containers using Prometheus 7d max data
After node2 OOM incident, right-size memory across the cluster by setting
requests=limits based on max_over_time(container_memory_working_set_bytes[7d])
with 1.3x headroom. Eliminates ~37Gi overcommit gap.

Categories:
- Safe equalization (50 containers): set req=lim where max7d well within target
- Limit increases (8 containers): raise limits for services spiking above current
- No Prometheus data (12 containers): conservatively set lim=req
- Exception: nextcloud keeps req=256Mi/lim=8Gi due to Apache memory spikes

Also increased dbaas namespace quota from 12Gi to 16Gi to accommodate mysql
4Gi limits across 3 replicas.
2026-03-14 21:46:49 +00:00
..
authentik equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
cloudflared equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
cnpg equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
crowdsec equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
dbaas equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
headscale equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
infra-maintenance [ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup 2026-03-06 19:54:21 +00:00
iscsi-csi equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
k8s-portal equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
kyverno equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
mailserver equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
metallb [ci skip] Move Terraform modules into stack directories 2026-02-22 14:38:14 +00:00
metrics-server equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
monitoring equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
nfs-csi equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
nvidia equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
rbac Remove all CPU limits cluster-wide to eliminate CFS throttling 2026-03-14 08:51:45 +00:00
redis equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
reverse_proxy Remove all CPU limits cluster-wide to eliminate CFS throttling 2026-03-14 08:51:45 +00:00
sealed-secrets equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
technitium equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
traefik equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
uptime-kuma Remove all CPU limits cluster-wide to eliminate CFS throttling 2026-03-14 08:51:45 +00:00
vaultwarden equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
vpa equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
wireguard equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
xray equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00