fix cluster health: resolve 21/23 failures from healthcheck

- nvidia: change GPU taint NoSchedule -> PreferNoSchedule to allow overflow scheduling on k8s-node1 (frees ~7Gi capacity) - kyverno: increase reports-controller memory 256Mi -> 512Mi (OOMKilled) - speedtest: add missing DB_PORT=3306 env var (nc: service "" unknown) - realestate-crawler: increase API memory 64Mi -> 256Mi (OOMKilled) - calibre: increase liveness probe timeout 1s -> 5s (false restarts)
2026-03-15 02:33:46 +00:00 · 2026-03-15 02:33:46 +00:00 · 52bce849d7
commit 52bce849d7
parent 5f71a53b08
5 changed files with 10 additions and 5 deletions
--- a/stacks/calibre/main.tf
+++ b/stacks/calibre/main.tf
@ -215,6 +215,7 @@ resource "kubernetes_deployment" "calibre-web-automated" {
              path = "/"
              port = 8083
            }
+            timeout_seconds   = 5
            period_seconds    = 30
            failure_threshold = 3
          }