right-size memory: set requests=limits based on actual usage
- Set memory requests = limits across 56 stacks to prevent overcommit
- Right-sized limits based on actual pod usage (2x actual, rounded up)
- Scaled down trading-bot (replicas=0) to free memory
- Fixed OOMKilled services: forgejo, dawarich, health, meshcentral, paperless-ngx, vault auto-unseal, rybbit, whisper, openclaw, clickhouse
- Added startup+liveness probes to calibre-web
- Bumped inotify limits on nodes 2,3 (max_user_instances 128->8192)

Post node2 OOM incident (2026-03-14). Previous kubelet config had no kubeReserved/systemReserved set, allowing pods to starve the kernel.
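The sizing rule in the message (2x actual usage, rounded up, with requests equal to limits) can be sketched as a small helper. This is only an illustration under assumptions: the commit does not say what granularity "rounded up" uses, so the 64Mi step here is hypothetical, as are the function names `right_size_mib` and `resources_block`.

```python
import math


def right_size_mib(actual_mib: float, headroom: float = 2.0, step_mib: int = 64) -> int:
    """Return headroom x observed usage, rounded up to a multiple of step_mib.

    step_mib=64 is an assumed granularity; the commit only says "rounded up".
    """
    if actual_mib < 0:
        raise ValueError("actual_mib must be non-negative")
    target = actual_mib * headroom
    # Always allocate at least one step so the result is never 0Mi.
    steps = max(1, math.ceil(target / step_mib))
    return steps * step_mib


def resources_block(actual_mib: float, cpu_request: str) -> dict:
    """Build a resources mapping where the memory request equals the memory limit."""
    memory = f"{right_size_mib(actual_mib)}Mi"
    return {
        "requests": {"cpu": cpu_request, "memory": memory},
        # No CPU limit is set, matching the hunks in this diff.
        "limits": {"memory": memory},
    }


if __name__ == "__main__":
    # Hypothetical observed usage of 100 MiB for a container requesting 25m CPU.
    print(resources_block(100, "25m"))
```

Setting the memory request equal to the limit means the scheduler reserves on the node exactly what the container is allowed to use, which is how the change prevents memory overcommit.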
parent 41131e58ba
commit 60173ac35c
58 changed files with 123 additions and 121 deletions

@@ -603,7 +603,7 @@ resource "kubernetes_deployment" "openclaw" {
       }
       requests = {
         cpu = "100m"
-        memory = "512Mi"
+        memory = "1536Mi"
       }
     }
   }

@@ -640,11 +640,11 @@ resource "kubernetes_deployment" "openclaw" {
     }
     resources {
       limits = {
-        memory = "512Mi"
+        memory = "256Mi"
       }
       requests = {
         cpu = "25m"
-        memory = "64Mi"
+        memory = "256Mi"
       }
     }
   }

@@ -879,7 +879,7 @@ resource "kubernetes_deployment" "task_webhook" {
   resources {
     requests = {
       cpu = "5m"
-      memory = "32Mi"
+      memory = "64Mi"
     }
     limits = {
       memory = "64Mi"

@@ -1023,7 +1023,7 @@ resource "kubernetes_cron_job_v1" "cluster_healthcheck" {
         memory = "64Mi"
       }
       limits = {
-        memory = "128Mi"
+        memory = "64Mi"
       }
     }
   }

@@ -1104,7 +1104,7 @@ resource "kubernetes_cron_job_v1" "task_processor" {
         memory = "64Mi"
       }
       limits = {
-        memory = "128Mi"
+        memory = "64Mi"
       }
     }
   }