infra/stacks/platform
Viktor Barzin 9490f1afad fix: eliminate memory overcommit to prevent node OOM crashes
Set requests = limits (Guaranteed QoS) across LimitRange defaults and
explicit pod resources. Node2 crashed 2026-03-14 from 250% memory
overcommit (61GB limits on 24GB node).

Changes:
- LimitRange: default = defaultRequest for all 6 tiers
- Grafana: 3 → 2 replicas
- Grampsweb: document why replicas=0
- Prometheus: 1Gi/4Gi → 3Gi/3Gi
- OpenClaw: 512Mi/2Gi → 768Mi/768Mi
- Immich server: 256Mi/2Gi → 512Mi/512Mi
- Immich postgresql: 256Mi/1Gi → 512Mi/512Mi
- Calibre: 256Mi/1536Mi → 256Mi/256Mi
- Linkwarden: 256Mi/1536Mi → 768Mi/768Mi
- N8N: 256Mi/1Gi → 512Mi/512Mi
- MySQL cluster: 1Gi/3-4Gi → 2Gi/2Gi
- pg-cluster (CNPG): 512Mi/4Gi → 512Mi/512Mi
- DBaaS ResourceQuota limits.memory: 64Gi → 12Gi

[ci skip]
2026-03-18 08:03:59 +00:00
..
modules fix: eliminate memory overcommit to prevent node OOM crashes 2026-03-18 08:03:59 +00:00
.gitkeep [ci skip] Add Terragrunt directory skeleton and root config 2026-02-22 13:01:37 +00:00
.terraform.lock.hcl Woodpecker CI deploy commit [CI SKIP] 2026-03-07 20:47:22 +00:00
backend.tf Woodpecker CI deploy commit [CI SKIP] 2026-03-07 20:47:22 +00:00
main.tf Remove all CPU limits cluster-wide to eliminate CFS throttling 2026-03-18 08:03:58 +00:00
providers.tf [ci skip] fix false-positive sensitive=true on kube_config_path 2026-03-07 15:48:19 +00:00
redis-25.3.2.tgz [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache 2026-03-06 23:55:57 +00:00
secrets [ci skip] Migrate 22 platform service states to stacks/platform 2026-02-22 13:35:10 +00:00
terragrunt.hcl [ci skip] Add platform stack (core services) for Terragrunt migration 2026-02-22 13:21:09 +00:00
tiers.tf [ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk 2026-02-28 19:08:06 +00:00