Add 3 new alert groups and 1 rule to existing group:

- Storage: NodeFilesystemFull (<10% free), PVFillingUp (>85% used)
- K8s Health: PodCrashLooping, ContainerOOMKilled, NodeNotReady, NodeConditionBad, JobFailed
- Infrastructure Health: CoreDNSErrors, ScrapeTargetDown, PrometheusStorageFull, PrometheusNotificationsFailing
- R730 Host: FanFailure (iDRAC Redfish fan health)
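
As a rough illustration of the Storage group described in the commit message, the two rules might look something like the sketch below. This is not the commit's actual rule file: the group name, the `for` durations, the `severity` labels, the `fstype` filter, and the annotation text are all assumptions. The metric names are the standard node_exporter and kubelet volume metrics, which the real rules may or may not use.

```yaml
# Hypothetical sketch of the Storage alert group.
# Thresholds (<10% free, >85% used) come from the commit message;
# everything else (group name, durations, labels, filters) is assumed.
groups:
  - name: storage
    rules:
      - alert: NodeFilesystemFull
        # Ratio of available to total bytes per filesystem, from node_exporter.
        expr: |
          node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} < 0.10
        for: 10m            # assumed hold time before firing
        labels:
          severity: warning # assumed severity
        annotations:
          summary: "Filesystem {{ $labels.mountpoint }} on {{ $labels.instance }} has less than 10% free"

      - alert: PVFillingUp
        # Ratio of used to total capacity per PersistentVolumeClaim, from the kubelet.
        expr: |
          kubelet_volume_stats_used_bytes
            / kubelet_volume_stats_capacity_bytes > 0.85
        for: 10m            # assumed hold time before firing
        labels:
          severity: warning # assumed severity
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} is more than 85% full"
```

A file in this shape can be checked with `promtool check rules <file>` before it is loaded into Prometheus.
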
| | | |
|---|---|---|
| create-template-vm | ||
| create-vm | ||
| docker-registry | ||
| kubernetes | ||