[ci skip] remove low-traffic pull-through caches (registry.k8s.io, quay.io, reg.kyverno.io)
Pull-through cache at 10.0.20.10 was serving corrupted/truncated images for low-traffic registries, breaking VPA certgen (ImagePullBackOff) and previously causing Kyverno image pull failures.

Kept: docker.io (port 5000) and ghcr.io (port 5010). Both are high traffic, and Docker Hub rate limits make caching essential.

Removed from cloud-init template and all 5 live nodes:
- registry.k8s.io (port 5030): 14 system images, very low churn
- quay.io (port 5020): 11 images
- reg.kyverno.io (port 5040): 5 images

The registry containers on the 10.0.20.10 VM still run, but nodes no longer route to them. They can be stopped/removed from the VM later.
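For context on the failure mode: registry blobs are addressed by the SHA-256 digest of their bytes, so a truncated or corrupted layer cannot match the digest it was requested under. The sketch below illustrates that check in isolation. It is an assumption-laden example, not something taken from this repo: the function name `blob_matches_digest` and the sample data are hypothetical, and nothing in this commit says how the corruption was actually detected.

```python
import hashlib


def blob_matches_digest(blob: bytes, expected_digest: str) -> bool:
    """Return True if blob hashes to expected_digest ("sha256:<hex>").

    Hypothetical helper for illustration; not part of this repository.
    """
    algorithm, _, expected_hex = expected_digest.partition(":")
    if algorithm != "sha256" or not expected_hex:
        raise ValueError(f"unsupported digest format: {expected_digest!r}")
    actual_hex = hashlib.sha256(blob).hexdigest()
    return actual_hex == expected_hex.lower()


# Illustrative data only: a stand-in "layer" and a truncated copy of it.
layer = b"example layer contents" * 1000
digest = "sha256:" + hashlib.sha256(layer).hexdigest()
truncated = layer[: len(layer) // 2]

print(blob_matches_digest(layer, digest))      # True
print(blob_matches_digest(truncated, digest))  # False
```

A check like this only needs the bytes and the digest, so it could in principle be run against whatever a cache returns before trusting it.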
parent 53be356f41
commit de598996f1
2 changed files with 11 additions and 18 deletions
@@ -122,7 +122,7 @@ terraform fmt -recursive # Format all
 ## Infrastructure
 - Proxmox hypervisor (192.168.1.127) — see `.claude/reference/proxmox-inventory.md` for full VM table
 - Kubernetes cluster: 5 nodes (k8s-master + k8s-node1-4, v1.34.2), GPU on node1 (Tesla T4)
-- Docker registry pull-through cache at `10.0.20.10` (ports 5000/5010/5020/5030/5040)
+- Docker registry pull-through cache at `10.0.20.10` — only docker.io (port 5000) and ghcr.io (port 5010) are active. quay.io/registry.k8s.io/reg.kyverno.io caches disabled (caused corrupted images).
 - GPU workloads need: `node_selector = { "gpu": "true" }` + `toleration { key = "nvidia.com/gpu", value = "true", effect = "NoSchedule" }`
 
 ### Node Rebuild Procedure