forgejo pulls: pin registry name to internal Traefik in node /etc/hosts [ci skip]

tuya-bridge was down 7.5h (ImagePullBackOff on k8s-node3): fresh kubelet
pulls of forgejo.viktorbarzin.me images depended on the intermittently
broken public-IP hairpin. The containerd hosts.toml mirror cannot keep
pulls internal on its own — Traefik 404s its bare-IP requests (no
Host/SNI match) and the registry Bearer realm is an absolute public URL
fetched outside the mirror. Third incident of this class (buildkit
06-04, tripit/devvm 06-09).

Fix: /etc/hosts pin 10.0.20.203 forgejo.viktorbarzin.me on every node —
covers resolve + token + blob legs with correct SNI and valid cert.
Applied live to all 7 nodes; persisted in the cloud-init bootstrap and
the existing-node rollout script. Docs updated (registry bullet, dns.md
hairpin scope + stale .200 literals, runbook) + post-mortem.
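
For illustration only, a minimal sketch of how such a pin can be kept
idempotent. This is not the rollout script from this commit. The IP,
hostname and the forgejo-internal-pin marker come from the text and
smoke test here; the function name, the trailing-comment line format
and the optional hosts-file argument are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: idempotently pin the registry hostname in a hosts file.
# PIN_IP and PIN_HOST are taken from the commit message; PIN_MARKER is the
# string the README smoke test greps for. Everything else is assumed.
PIN_IP="10.0.20.203"
PIN_HOST="forgejo.viktorbarzin.me"
PIN_MARKER="forgejo-internal-pin"

pin_registry_host() {
    # Optional first argument overrides the target file (useful for dry runs).
    hosts_file="${1:-/etc/hosts}"
    tmp="$(mktemp)" || return 1
    # Drop any earlier pin (matched by marker) so reruns never duplicate it.
    # grep exits non-zero when nothing is left to print; that is not an error.
    grep -v "$PIN_MARKER" "$hosts_file" > "$tmp" || true
    # Append the pin, tagged with the marker as a trailing comment.
    printf '%s %s # %s\n' "$PIN_IP" "$PIN_HOST" "$PIN_MARKER" >> "$tmp"
    # Overwrite in place so the file keeps its original owner and mode.
    cat "$tmp" > "$hosts_file"
    rm -f "$tmp"
}
```

Called with no argument as root, the function targets /etc/hosts;
getent hosts forgejo.viktorbarzin.me should then return the pinned
address. Passing a scratch file instead allows a dry run.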

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Viktor Barzin 2026-06-10 07:15:24 +00:00
parent eb8695743b
commit b6976ce014
6 changed files with 150 additions and 11 deletions

@@ -119,8 +119,11 @@ cd infra/stacks/kyverno && scripts/tg apply
cd infra/stacks/monitoring && scripts/tg apply
cd infra/stacks/forgejo && scripts/tg apply
-# Containerd hosts.toml on each existing k8s node — VM cloud-init
-# only fires on first boot.
+# Containerd hosts.toml + /etc/hosts pin on each existing k8s node — VM
+# cloud-init only fires on first boot. The /etc/hosts pin
+# (10.0.20.203 forgejo.viktorbarzin.me) is what makes pulls hairpin-proof:
+# the hosts.toml mirror alone falls back to public DNS (Traefik 404s its
+# bare-IP requests, and the registry auth realm is an absolute public URL).
infra/scripts/setup-forgejo-containerd-mirror.sh
```
@@ -135,7 +138,9 @@ docker pull alpine:3.20
docker tag alpine:3.20 forgejo.viktorbarzin.me/viktor/smoketest:1
docker push forgejo.viktorbarzin.me/viktor/smoketest:1
-# Pull from a k8s node.
+# Per-node pull path: pin present + name resolves internally + pull works.
+ssh wizard@<node> 'grep forgejo-internal-pin /etc/hosts && getent hosts forgejo.viktorbarzin.me'
+# Expect: 10.0.20.203 forgejo.viktorbarzin.me
ssh wizard@<node> sudo crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1
# Confirm the cluster-wide Secret was synced into a fresh namespace.