forgejo pulls: route *.viktorbarzin.me to Technitium, drop /etc/hosts pins [ci skip]

Supersedes this morning's per-node /etc/hosts pin: per Viktor, nodes
carry no hardcoded service IPs. Technitium's split-horizon zone already
resolves forgejo.viktorbarzin.me -> CNAME apex -> live Traefik LB IP
(ingress-dns-sync auto-CNAMEs every ingress host; the apex drift probe
alerts on drift) -- the nodes just never queried it. Rolled the devvm's
systemd-resolved routing-domain pattern (~viktorbarzin.me ->
10.0.20.201) out to all 7 nodes, removed the pins, and verified getent +
crictl pull via pure DNS.
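
The per-node verification could look roughly like the sketch below. It is
not the procedure that was actually run: the function names are
hypothetical, and the image reference is left to the caller because the
commit does not name one.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of the commit.

# Succeeds when the given hosts file carries no legacy pin line.
no_legacy_pin() {
  local hosts_file="${1:-/etc/hosts}"
  ! grep -q 'forgejo-internal-pin' "$hosts_file"
}

# Checks one node: pin gone, name resolves, and (optionally) an image pulls.
verify_pull_path() {
  local host="${1:-forgejo.viktorbarzin.me}"
  local image="${2:-}"   # caller-supplied image reference; none is assumed
  no_legacy_pin /etc/hosts || { echo "pin still present" >&2; return 1; }
  getent hosts "$host" || { echo "$host did not resolve" >&2; return 1; }
  if [ -n "$image" ]; then
    crictl pull "$image"
  fi
}
```

Run on a node as `verify_pull_path forgejo.viktorbarzin.me`, passing an
image reference as the second argument to exercise the pull as well.
Checking for the pin first matters: with a pin still in /etc/hosts, getent
would succeed without ever touching DNS.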

Also demoted node5/6's cloud-init global-dns.conf (DNS=8.8.8.8 1.1.1.1)
to FallbackDNS-only: public servers in the global set race the routing
domain. Its justification ("Technitium NXDOMAINs forgejo") was obsolete
-- exactly the stale comment that pointed new nodes at the hairpin.

hosts.toml mirror kept but documented as vestigial (Traefik 404s
bare-IP requests; registry auth realm is an absolute URL).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Viktor Barzin 2026-06-10 07:56:31 +00:00
parent b6976ce014
commit 1ee1bf0817
7 changed files with 135 additions and 66 deletions


@@ -1,19 +1,24 @@
 #!/usr/bin/env bash
-# One-shot deployment of the forgejo.viktorbarzin.me containerd hosts.toml
-# entry + /etc/hosts pin across every k8s node. Cloud-init only fires on VM
+# One-shot deployment of the forgejo pull path across every k8s node:
+# systemd-resolved routing domain ~viktorbarzin.me -> Technitium, plus the
+# (vestigial) containerd hosts.toml entry. Cloud-init only fires on VM
 # provision, so existing nodes need this manual rollout.
 #
-# The /etc/hosts pin (forgejo.viktorbarzin.me -> Traefik LB) is what actually
-# makes pulls hairpin-proof: Traefik 404s the mirror's bare-IP requests (no
-# Host/SNI match) and the registry's Bearer auth realm is the absolute public
-# URL, so the hosts.toml mirror alone always degrades to the flaky public-IP
-# hairpin (2026-06-10 tuya-bridge outage; see
+# The routing domain is what actually makes pulls hairpin-proof: Technitium's
+# split-horizon zone resolves forgejo.viktorbarzin.me (CNAME, auto-synced from
+# ingresses) to the zone apex whose A record tracks the live Traefik LB IP —
+# no hardcoded service IPs on nodes. The hosts.toml mirror alone CANNOT do
+# this: Traefik 404s its bare-IP requests (no Host/SNI match) and the registry
+# Bearer auth realm is the absolute public URL fetched outside the mirror
+# (2026-06-10 tuya-bridge outage; see
 # docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md).
 #
 # What it does, per node:
 # 1. drain (ignore-daemonsets, delete-emptydir-data)
-# 2. ssh in: mkdir + write /etc/containerd/certs.d/forgejo.viktorbarzin.me/hosts.toml
-# + append the forgejo /etc/hosts pin
+# 2. ssh in: write /etc/systemd/resolved.conf.d/viktorbarzin.conf (routing
+# domain), neuter any public global-dns.conf to FallbackDNS-only, drop
+# legacy forgejo-internal-pin /etc/hosts lines, restart systemd-resolved,
+# write /etc/containerd/certs.d/forgejo.viktorbarzin.me/hosts.toml
 # 3. systemctl restart containerd
 # 4. uncordon
 #
@@ -46,12 +51,31 @@ for n in $NODES; do
 ssh -o StrictHostKeyChecking=accept-new "wizard@$n" sudo bash <<EOF
 set -euo pipefail
+mkdir -p /etc/systemd/resolved.conf.d
+cat > /etc/systemd/resolved.conf.d/viktorbarzin.conf <<'CONF'
+# Route *.viktorbarzin.me to Technitium (split-horizon zone -> live Traefik LB),
+# so kubelet image pulls of forgejo.viktorbarzin.me never traverse the public
+# NAT-hairpin. Everything else uses the link DNS.
+# Managed: setup-forgejo-containerd-mirror.sh / cloud_init.yaml
+[Resolve]
+DNS=10.0.20.201
+Domains=~viktorbarzin.me
+CONF
+# Public servers in the global DNS= set would race the routing domain —
+# demote any legacy global-dns.conf to emergency fallback only.
+if [ -f /etc/systemd/resolved.conf.d/global-dns.conf ]; then
+cat > /etc/systemd/resolved.conf.d/global-dns.conf <<'CONF'
+# Emergency fallback only (used when no link DNS is configured at all).
+[Resolve]
+FallbackDNS=8.8.8.8 1.1.1.1
+CONF
+fi
+sed -i '/forgejo-internal-pin/d' /etc/hosts
+systemctl restart systemd-resolved
 mkdir -p "$CERTS_DIR"
 cat > "$CERTS_DIR/hosts.toml" <<'TOML'
 $HOSTS_TOML
 TOML
-grep -q forgejo-internal-pin /etc/hosts || \
-echo '10.0.20.203 forgejo.viktorbarzin.me # forgejo-internal-pin (managed: setup-forgejo-containerd-mirror.sh)' >> /etc/hosts
 systemctl restart containerd
 EOF
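
The hunk above relies on nested here-documents. The outer `<<EOF` is
unquoted, so `$CERTS_DIR` and `$HOSTS_TOML` are expanded by the local shell
before the script is sent over ssh, even though they sit inside the quoted
inner `<<'TOML'` block. The sketch below shows that behaviour in isolation,
with a local `bash` standing in for the ssh session. `render_demo`, the
`CERTS_DIR` value and `remote_only` are illustrative names, not taken from
the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Illustrative value only; the real script defines its own CERTS_DIR.
CERTS_DIR="/tmp/demo-certs.d"

# The outer delimiter is unquoted, so the local shell expands $CERTS_DIR
# before the child bash (standing in for the remote side) sees the script.
# \$remote_only is escaped, so it reaches the child and is expanded there.
render_demo() {
  bash <<EOF
remote_only="set-on-remote"
cat <<'CONF'
dir=$CERTS_DIR
CONF
echo "escaped=\$remote_only"
EOF
}
```

Calling `render_demo` prints `dir=/tmp/demo-certs.d` followed by
`escaped=set-on-remote`. The practical consequence for a script shaped like
the one in this diff is that any `$` meant for the remote side needs a
backslash, while the quoted inner delimiters do not protect text from local
expansion.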