infra/stacks/immich
Viktor Barzin 43b49f7f6c cluster recovery: fix resource limits and node1 memory
- nvidia quota: requests.memory 8Gi → 12Gi (unblock cuda-validator)
- calibre: startup probe initial_delay 60→120s, timeout 1→5s,
  wait_for_rollout=false (DOCKER_MODS install takes 10+ min)
- immich ML: memory 2Gi → 4Gi (OOMKilled loading CLIP models)

Also done outside TF (not in this commit):
- node1 VM: 16 GiB → 24 GiB RAM (Proxmox)
- tigera-operator: kubectl patch 128→256Mi
- nvidia-driver-daemonset: kubectl patch 1→4Gi memory
- kyverno reports-controller: kubectl patch 128→256Mi
- CNPG operator: kubectl rollout restart
2026-03-15 01:44:28 +00:00
..
.terraform.lock.hcl add vaultwarden daily backup CronJob to NFS 2026-03-15 00:03:59 +00:00
backend.tf [ci skip] Move Terraform modules into stack directories 2026-02-22 14:38:14 +00:00
chart_values.tpl [ci skip] Flatten module wrappers into stack roots 2026-02-22 15:13:55 +00:00
frame.tf migrate all secrets from SOPS to Vault KV 2026-03-14 17:15:48 +00:00
main.tf cluster recovery: fix resource limits and node1 memory 2026-03-15 01:44:28 +00:00
providers.tf add vaultwarden daily backup CronJob to NFS 2026-03-15 00:03:59 +00:00
secrets [ci skip] Move Terraform modules into stack directories 2026-02-22 14:38:14 +00:00
terragrunt.hcl migrate all secrets from SOPS to Vault KV 2026-03-14 17:15:48 +00:00
tiers.tf [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache 2026-03-06 23:55:57 +00:00