Cluster health remediation: cleanup CronJob, disable Collabora, fix GPU probe, add NFS exports [ci skip]
- Add daily CronJob to auto-clean Failed/Evicted pods cluster-wide (infra-maintenance) - Disable Collabora in Nextcloud (broken HPA caused scaling storm; using OnlyOffice instead) - Increase gpu-pod-exporter liveness probe timeout from 1s to 5s - Add osm-routing NFS exports (osrm-data, otp-data)
This commit is contained in:
parent
3da35166ab
commit
a73f3fcb6b
4 changed files with 71 additions and 5 deletions
|
|
@ -605,6 +605,7 @@ resource "kubernetes_daemonset" "gpu_pod_exporter" {
|
|||
}
|
||||
initial_delay_seconds = 30
|
||||
period_seconds = 30
|
||||
timeout_seconds = 5
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue