The GPU operator needs ~19 CPU limits across 16 pods (NFD, device plugin, driver, validators, exporters). The Kyverno auto-generated quota of 16 CPU was insufficient, blocking NFD worker and GC pods from scheduling.

- Add custom-quota label to nvidia namespace to exempt from Kyverno generation
- Add explicit ResourceQuota with limits.cpu=32, limits.memory=48Gi
- Fix: nvidia namespace tier label was missing after CI re-apply, causing Kyverno to use fallback LimitRange instead of tier-2-gpu specific one

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

||
|---|---|---|
| modules | ||
| .gitkeep | ||
| .terraform.lock.hcl | ||
| backend.tf | ||
| main.tf | ||
| providers.tf | ||
| redis-25.3.2.tgz | ||
| secrets | ||
| terragrunt.hcl | ||
| tiers.tf | ||
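
The quota change described in the commit message above could look roughly like the following Terraform fragment. This is only a sketch under assumptions: it assumes the namespace and quota are managed through the `hashicorp/kubernetes` provider, and the resource names, the quota object name, and the value of the `custom-quota` label are hypothetical. Only the namespace name and the `limits.cpu` / `limits.memory` values come from the commit message. The tier label fix is left out because the message does not give its key.

```hcl
# Sketch only: resource names, the quota name, and the label value are
# assumptions, not taken from the repository.
resource "kubernetes_namespace" "nvidia" {
  metadata {
    name = "nvidia"

    labels = {
      # Hypothetical value; the commit only says a "custom-quota" label
      # exempts the namespace from Kyverno quota generation.
      "custom-quota" = "true"
    }
  }
}

resource "kubernetes_resource_quota" "nvidia" {
  metadata {
    name      = "nvidia-quota" # hypothetical name
    namespace = kubernetes_namespace.nvidia.metadata[0].name
  }

  spec {
    # Values from the commit message: headroom above the ~19 CPU of limits
    # the GPU operator pods need.
    hard = {
      "limits.cpu"    = "32"
      "limits.memory" = "48Gi"
    }
  }
}
```

Referencing the namespace through `kubernetes_namespace.nvidia.metadata[0].name` rather than a literal string lets Terraform create the labeled namespace before the quota, which matters if the Kyverno policy reacts to namespace creation.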