infra/stacks/frigate
Viktor Barzin 194281e527 right-size cluster memory: reduce overprovisioned, fix under-provisioned services
Phase 1 - Quick wins (~4.5Gi saved):
- democratic-csi: add explicit sidecar resources (64-80Mi vs 256Mi LimitRange default)
- caretta: 768Mi → 600Mi (VPA upper 485Mi)
- immich-ml: 4Gi → 3584Mi (VPA upper 2.95Gi, GPU margin)
- onlyoffice: 3Gi → 2304Mi (VPA upper 1.82Gi)

Phase 2 - Safety fixes (prevent OOMKills):
- frigate: 2Gi/8Gi → 5Gi/10Gi (VPA upper 7.7Gi; the old 8Gi limit left only ~4% headroom)
- openclaw: 1280Mi req → 2Gi req=limit (documented 2Gi requirement)
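
The frigate headroom figure can be reproduced with a little arithmetic. This is a minimal sketch, assuming headroom means (limit - VPA upper) / VPA upper; the commit does not state the formula, and `headroom_pct` is a hypothetical helper, not something in this repository.

```python
def headroom_pct(limit_gi: float, vpa_upper_gi: float) -> float:
    """Percent by which the memory limit exceeds the VPA upper-bound recommendation."""
    return (limit_gi - vpa_upper_gi) / vpa_upper_gi * 100


# Values from the commit message: VPA upper 7.7Gi, limit raised from 8Gi to 10Gi.
old = headroom_pct(8, 7.7)
new = headroom_pct(10, 7.7)
print(f"old limit headroom: {old:.1f}%")
print(f"new limit headroom: {new:.1f}%")
```

Under that definition the old limit sits roughly 4% above the VPA upper bound, and the new limit leaves close to 30%.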

Phase 3 - Additional right-sizing:
- authentik workers: 1Gi → 896Mi x3 (VPA upper 722Mi)
- shlink: 512Mi/768Mi → 960Mi req=limit (VPA upper 780Mi, safety increase)

Phase 4 - Burstable QoS for lower tiers:
- tier-3-edge: 128Mi/128Mi → 96Mi req / 192Mi limit
- tier-4-aux: 128Mi/128Mi → 64Mi req / 256Mi limit
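
The effect of the tier changes can be summarized per container. This is a rough sketch using only the numbers above; the `OLD`/`NEW` tables and `summarize` are hypothetical names for illustration, and how many containers actually pick up these defaults is not stated in the commit.

```python
# Tier memory defaults from the commit message, in Mi: (request, limit).
OLD = {"tier-3-edge": (128, 128), "tier-4-aux": (128, 128)}
NEW = {"tier-3-edge": (96, 192), "tier-4-aux": (64, 256)}


def summarize(tier: str) -> dict:
    """Return the per-container request saving and the limit/request burst ratio."""
    old_req, _old_lim = OLD[tier]
    new_req, new_lim = NEW[tier]
    return {
        "request_saved_mi": old_req - new_req,
        "burst_ratio": new_lim / new_req,
    }


for tier in NEW:
    print(tier, summarize(tier))
```

Because the new requests are lower than the limits, these containers can no longer be Guaranteed QoS: the scheduler reserves only the request, while the container may burst up to the limit if the node has memory free.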

Phase 5 - Monitoring:
- Add ClusterMemoryRequestsHigh alert (>85% allocatable, 15m)
- Add ContainerNearOOM alert (>85% limit, 30m)
- Add PodUnschedulable alert (5m, critical)
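
Each alert pairs a threshold with a hold duration. The sketch below is a simplified model of that "above threshold for N minutes" logic, not the actual alert rules, which are not shown here. `sustained_above` is a hypothetical helper, and the sample ratios (assumed to be memory usage divided by limit) are made up for illustration.

```python
def sustained_above(samples, threshold, hold_minutes):
    """Return True if the ratio stayed above threshold for one unbroken run
    lasting at least hold_minutes.

    samples: list of (minute, ratio) tuples sorted by time.
    """
    run_start = None
    for minute, ratio in samples:
        if ratio > threshold:
            if run_start is None:
                run_start = minute  # a new run above the threshold begins
            if minute - run_start >= hold_minutes:
                return True
        else:
            run_start = None  # any dip resets the run
    return False


# Illustrative data: one sample every 5 minutes from minute 0 to 30.
steady = [(m, 0.90) for m in range(0, 35, 5)]
dipped = [(m, 0.80 if m == 15 else 0.90) for m in range(0, 35, 5)]

# ContainerNearOOM-style check: >85% of limit for 30 minutes.
print(sustained_above(steady, 0.85, 30))
print(sustained_above(dipped, 0.85, 30))
```

A single dip below the threshold restarts the clock, which is why a long hold such as 30m filters out short spikes.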

Cluster memory requests: 92.7% → 90.8%. Stirling-pdf is now schedulable.
2026-03-15 15:30:18 +00:00
Name | Last commit | Date
.terraform.lock.hcl | add vaultwarden daily backup CronJob to NFS | 2026-03-15 00:03:59 +00:00
backend.tf | [ci skip] Move Terraform modules into stack directories | 2026-02-22 14:38:14 +00:00
main.tf | right-size cluster memory: reduce overprovisioned, fix under-provisioned services | 2026-03-15 15:30:18 +00:00
providers.tf | add vaultwarden daily backup CronJob to NFS | 2026-03-15 00:03:59 +00:00
secrets | [ci skip] Move Terraform modules into stack directories | 2026-02-22 14:38:14 +00:00
terragrunt.hcl | [ci skip] Phase 3: Create 66 service stacks and migrate state | 2026-02-22 13:56:34 +00:00
tiers.tf | [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache | 2026-03-06 23:55:57 +00:00