The existing probe only checked nvidia-smi + API availability, which passes even when the detector falls back to CPU. Now also checks /api/stats and restarts the pod if inference speed exceeds 100ms (normal GPU: ~20ms, CPU fallback: 200ms+). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| .terraform.lock.hcl | ||
| backend.tf | ||
| main.tf | ||
| providers.tf | ||
| secrets | ||
| terragrunt.hcl | ||
| tiers.tf | ||