infra/scripts/setup-forgejo-containerd-mirror.sh

#!/usr/bin/env bash
# One-shot deployment of the forgejo pull path across every k8s node:
# systemd-resolved routing domain ~viktorbarzin.me -> Technitium, plus the
# (vestigial) containerd hosts.toml entry. Cloud-init only fires on VM
# provision, so existing nodes need this manual rollout.
#
# The routing domain is what actually makes pulls hairpin-proof: Technitium's
# split-horizon zone resolves forgejo.viktorbarzin.me (CNAME, auto-synced from
# ingresses) to the zone apex whose A record tracks the live Traefik LB IP —
# no hardcoded service IPs on nodes. The hosts.toml mirror alone CANNOT do
# this: Traefik 404s its bare-IP requests (no Host/SNI match) and the registry
# Bearer auth realm is the absolute public URL fetched outside the mirror
# (2026-06-10 tuya-bridge outage; see
# docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md).
#
# What it does, per node:
#   1. drain (ignore-daemonsets, delete-emptydir-data, force, 60s grace period)
#   2. ssh in: write /etc/systemd/resolved.conf.d/viktorbarzin.conf (routing
#      domain), neuter any public global-dns.conf to FallbackDNS-only, drop
#      legacy forgejo-internal-pin /etc/hosts lines, restart systemd-resolved,
#      write /etc/containerd/certs.d/forgejo.viktorbarzin.me/hosts.toml
#   3. systemctl restart containerd
#   4. uncordon
#
# hosts.toml is documented as hot-reloaded but the post-2026-04-19
# containerd corruption playbook calls for an explicit restart so the
# config is unambiguously in effect. Running drain/uncordon around it
# avoids pulling against an in-flight containerd restart.
#
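# Re-run is safe: writes are idempotent. If the drain or ssh step fails, the
# script aborts (set -e) and leaves that node cordoned; fix it and re-run, or
# uncordon by hand.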

set -euo pipefail

CERTS_DIR=/etc/containerd/certs.d/forgejo.viktorbarzin.me
HOSTS_TOML='server = "https://forgejo.viktorbarzin.me"

[host."https://10.0.20.203"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
'

NODES=$(kubectl get nodes -o name | sed 's|^node/||')
if [[ -z "$NODES" ]]; then
  echo "ERROR: no nodes returned from kubectl get nodes" >&2
  exit 1
fi

for n in $NODES; do
  echo "=== $n ==="
  kubectl drain "$n" --ignore-daemonsets --delete-emptydir-data --force --grace-period=60

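  # The outer heredoc is deliberately unquoted so $CERTS_DIR and $HOSTS_TOML
  # expand locally before the script is sent to the node; the inner heredocs
  # are plain text at that point, so their quoting only takes effect remotely.
  ssh -o StrictHostKeyChecking=accept-new "wizard@$n" sudo bash <<EOF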
set -euo pipefail
mkdir -p /etc/systemd/resolved.conf.d
cat > /etc/systemd/resolved.conf.d/viktorbarzin.conf <<'CONF'
# Route *.viktorbarzin.me to Technitium (split-horizon zone -> live Traefik LB),
# so kubelet image pulls of forgejo.viktorbarzin.me never traverse the public
# NAT-hairpin. Everything else uses the link DNS.
# Managed: setup-forgejo-containerd-mirror.sh / cloud_init.yaml
[Resolve]
DNS=10.0.20.201
Domains=~viktorbarzin.me
CONF
# Public servers in the global DNS= set would race the routing domain —
# demote any legacy global-dns.conf to emergency fallback only.
if [ -f /etc/systemd/resolved.conf.d/global-dns.conf ]; then
  cat > /etc/systemd/resolved.conf.d/global-dns.conf <<'CONF'
# Emergency fallback only (used when no link DNS is configured at all).
[Resolve]
FallbackDNS=8.8.8.8 1.1.1.1
CONF
fi
sed -i '/forgejo-internal-pin/d' /etc/hosts
systemctl restart systemd-resolved
mkdir -p "$CERTS_DIR"
cat > "$CERTS_DIR/hosts.toml" <<'TOML'
$HOSTS_TOML
TOML
systemctl restart containerd
EOF

  kubectl uncordon "$n"

  # Wait up to 60s for the node to report Ready before moving to the next one.
  # Abort if it never does, rather than draining further nodes.
  if ! kubectl wait --for=condition=Ready "node/$n" --timeout=60s; then
    echo "ERROR: $n did not report Ready within 60s" >&2
    exit 1
  fi
  echo "    node Ready"
done

echo "All nodes updated."
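
# Optional post-rollout check: a sketch, not part of the original rollout.
# It resolves forgejo.viktorbarzin.me on each node and confirms the answer is
# an internal address, i.e. that the routing domain is actually in effect.
# Assumptions (all hypothetical): the VERIFY=1 opt-in switch, the helper
# names, and treating any address in 10.0.20.0/24 (the subnet of the
# Technitium and Traefik addresses used above) as "internal".

# Succeeds when the given IPv4 address lies in 10.0.20.0/24.
is_internal_ip() {
  [[ "$1" =~ ^10\.0\.20\.([0-9]{1,3})$ ]] && (( 10#${BASH_REMATCH[1]} <= 255 ))
}

# Resolves the registry name on one node and reports where it points.
verify_node() {
  local node=$1 ip
  # "|| true" keeps set -e / pipefail from aborting before the report below.
  ip=$(ssh -o StrictHostKeyChecking=accept-new "wizard@$node" \
        getent ahostsv4 forgejo.viktorbarzin.me </dev/null \
        | awk 'NR==1 {print $1}') || true
  if is_internal_ip "$ip"; then
    echo "    $node: forgejo.viktorbarzin.me -> $ip (internal)"
  else
    echo "    $node: forgejo.viktorbarzin.me -> ${ip:-<no answer>} (NOT internal)" >&2
    return 1
  fi
}

# Runs only when explicitly requested, e.g. VERIFY=1 ./setup-forgejo-containerd-mirror.sh
if [[ "${VERIFY:-0}" == 1 ]]; then
  for n in ${NODES:-}; do
    verify_node "$n"
  done
fi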

2026-06-09 08:45:33 +00:00
fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the commit drop every file except two. This restores 05b50d2b's full tree and correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the live infra was never applied from the broken commit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-10 07:15:24 +00:00
forgejo pulls: pin registry name to internal Traefik in node /etc/hosts [ci skip]
tuya-bridge was down 7.5h (ImagePullBackOff on k8s-node3): fresh kubelet pulls of forgejo.viktorbarzin.me images depended on the intermittently broken public-IP hairpin. The containerd hosts.toml mirror cannot keep pulls internal on its own: Traefik 404s its bare-IP requests (no Host/SNI match) and the registry Bearer realm is an absolute public URL fetched outside the mirror. Third incident of this class (buildkit 06-04, tripit/devvm 06-09). Fix: /etc/hosts pin 10.0.20.203 forgejo.viktorbarzin.me on every node, which covers resolve + token + blob legs with correct SNI and valid cert. Applied live to all 7 nodes; persisted in the cloud-init bootstrap and the existing-node rollout script. Docs updated (registry bullet, dns.md hairpin scope + stale .200 literals, runbook) + post-mortem.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

2026-06-10 07:56:31 +00:00
forgejo pulls: route *.viktorbarzin.me to Technitium, drop /etc/hosts pins [ci skip]
Supersedes this morning's per-node /etc/hosts pin (no hardcoded service IPs on nodes, per Viktor). Technitium's split-horizon zone already resolves forgejo.viktorbarzin.me -> CNAME apex -> live Traefik LB IP (ingress-dns-sync auto-CNAMEs every ingress host; apex drift probe alerts) -- the nodes just never queried it. Rolled the devvm's systemd-resolved routing-domain pattern (~viktorbarzin.me -> 10.0.20.201) to all 7 nodes, removed the pins, verified getent + crictl pull via pure DNS. Also demoted node5/6's cloud-init global-dns.conf (DNS=8.8.8.8 1.1.1.1) to FallbackDNS-only: public servers in the global set race the routing domain. Its justification ("Technitium NXDOMAINs forgejo") was obsolete -- exactly the stale comment that pointed new nodes at the hairpin. hosts.toml mirror kept but documented as vestigial (Traefik 404s bare-IP requests; registry auth realm is an absolute URL).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>