forgejo pulls: route *.viktorbarzin.me to Technitium, drop /etc/hosts pins [ci skip]
Supersedes this morning's per-node /etc/hosts pin (no hardcoded service
IPs on nodes, per Viktor). Technitium's split-horizon zone already
resolves forgejo.viktorbarzin.me -> CNAME apex -> live Traefik LB IP
(ingress-dns-sync auto-CNAMEs every ingress host; apex drift probe
alerts) -- the nodes just never queried it. Rolled the devvm's
systemd-resolved routing-domain pattern (~viktorbarzin.me ->
10.0.20.201) to all 7 nodes, removed the pins, verified getent +
crictl pull via pure DNS.
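
A minimal sketch of the per-node step described above, assuming a POSIX shell. The drop-in contents are the ones this commit documents; the helper name `write_routing_dropin` and its directory argument are hypothetical and are not taken from `setup-forgejo-containerd-mirror.sh`:

```shell
#!/bin/sh
# Hypothetical helper: write the routing-domain drop-in into a target
# directory (defaults to the systemd-resolved drop-in directory).
write_routing_dropin() {
    dir="${1:-/etc/systemd/resolved.conf.d}"
    mkdir -p "$dir"
    # Route only *.viktorbarzin.me lookups to Technitium (10.0.20.201);
    # all other names keep using the node's existing resolvers.
    cat > "$dir/viktorbarzin.conf" <<'EOF'
[Resolve]
DNS=10.0.20.201
Domains=~viktorbarzin.me
EOF
}
```

On a real node this would presumably run as root, followed by `systemctl restart systemd-resolved` and a `getent hosts forgejo.viktorbarzin.me` check.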
Also demoted node5/6's cloud-init global-dns.conf (DNS=8.8.8.8 1.1.1.1)
to FallbackDNS-only: public servers in the global set race the routing
domain. Its justification ("Technitium NXDOMAINs forgejo") was obsolete
-- exactly the stale comment that pointed new nodes at the hairpin.
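
The demotion can be pictured as a one-line rewrite of the drop-in. This is only a sketch: the function name `demote_global_dns` is hypothetical, the file location is an assumption, and the commit may have changed the cloud-init template by hand instead:

```shell
#!/bin/sh
# Hypothetical helper: turn every global DNS= line in a resolved drop-in
# into FallbackDNS=, so the public servers are only used when no other
# DNS server is configured and no longer race the routing domain.
demote_global_dns() {
    conf="$1"
    tmp="$conf.tmp"
    sed 's/^DNS=/FallbackDNS=/' "$conf" > "$tmp" && mv "$tmp" "$conf"
}
```

Example (path assumed): `demote_global_dns /etc/systemd/resolved.conf.d/global-dns.conf` turns `DNS=8.8.8.8 1.1.1.1` into `FallbackDNS=8.8.8.8 1.1.1.1`.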
hosts.toml mirror kept but documented as vestigial (Traefik 404s
bare-IP requests; registry auth realm is an absolute URL).
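
To illustrate why the mirror alone cannot keep pulls internal: the registry's token challenge names an absolute URL, so the client resolves that hostname through DNS regardless of any mirror endpoint. A small sketch that extracts the host from such a challenge; the helper name `realm_host` and the sample header value are illustrative assumptions, not captured output:

```shell
#!/bin/sh
# Hypothetical helper: read a WWW-Authenticate header value on stdin and
# print the hostname inside its realm="..." URL.
realm_host() {
    sed -n 's/.*realm="\([^"]*\)".*/\1/p' |
        sed -e 's#^[A-Za-z][A-Za-z0-9+.-]*://##' -e 's#[/:?].*$##'
}

# Example with an assumed header value:
#   printf '%s\n' 'Bearer realm="https://forgejo.viktorbarzin.me/v2/token",service="container_registry"' | realm_host
#   prints: forgejo.viktorbarzin.me
```

Whatever host this prints is the name the node must resolve to the internal Traefik IP, which is what the routing domain provides.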
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
parent b6976ce014
commit 1ee1bf0817
7 changed files with 135 additions and 66 deletions
@@ -269,10 +269,11 @@ Technitium's **Split Horizon AddressTranslation** app post-processes DNS respons
 
 - **Affected**: Non-proxied domains (ha-sofia, immich, headscale, calibre, vaultwarden, etc.) for 192.168.1.x clients
 - **Not affected**: Cloudflare-proxied domains (resolve to Cloudflare edge IPs, no translation needed)
-- **Not affected**: 10.0.x.x and K8s clients — these resolve non-proxied domains to the public IP and rely on pfSense NAT reflection, which is **intermittently broken** (observed i/o timeouts to `176.12.22.76:443` from k8s nodes and the devvm, 2026-06-04 → 2026-06-10). Hairpin-sensitive paths on this network get explicit per-leg fixes instead:
-- **kubelet image pulls of `forgejo.viktorbarzin.me`**: `/etc/hosts` pin `10.0.20.203 forgejo.viktorbarzin.me` on every k8s node (marker `forgejo-internal-pin`; deployed via `modules/create-template-vm/k8s-node-containerd-setup.sh` for new nodes, `scripts/setup-forgejo-containerd-mirror.sh` rollout for existing ones). The containerd hosts.toml mirror alone is insufficient — Traefik 404s its bare-IP requests (no Host/SNI match) and the registry's Bearer auth realm is an absolute public URL fetched outside the mirror. Root cause of the 2026-06-10 tuya-bridge outage (`docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md`).
-- **in-cluster pods → forgejo**: CoreDNS `rewrite name exact forgejo.viktorbarzin.me traefik.traefik.svc.cluster.local` (2026-06-04, beads code-yh33).
-- **devvm git → forgejo**: still exposed to the hairpin (manual `/etc/hosts` pin workaround when it flares).
+- **Not affected**: 10.0.x.x and K8s clients — these resolve non-proxied domains to the public IP and rely on pfSense NAT reflection, which is **intermittently broken** (observed i/o timeouts to `176.12.22.76:443` from k8s nodes and the devvm, 2026-06-04 → 2026-06-10). Hairpin-sensitive paths on this network route `*.viktorbarzin.me` to Technitium instead, via a systemd-resolved **routing domain** (`/etc/systemd/resolved.conf.d/viktorbarzin.conf`: `DNS=10.0.20.201`, `Domains=~viktorbarzin.me`). Technitium's split-horizon zone answers with the zone apex A record, which auto-tracks the live Traefik LB IP (`technitium-ingress-dns-sync` CNAMEs every ingress host hourly; `viktorbarzin-apex-probe` is the drift canary) — no hardcoded service IPs on clients:
+- **k8s nodes (kubelet image pulls of `forgejo.viktorbarzin.me`)**: routing-domain drop-in on all 7 nodes (2026-06-10, replacing a same-day `/etc/hosts` pin; deployed via `modules/create-template-vm/cloud_init.yaml` for new nodes, `scripts/setup-forgejo-containerd-mirror.sh` rollout for existing ones). The containerd hosts.toml mirror alone is insufficient — Traefik 404s its bare-IP requests (no Host/SNI match) and the registry's Bearer auth realm is an absolute public URL fetched outside the mirror. Caution: public servers must NOT sit in the nodes' global resolved `DNS=` set — they merge with and race the routing domain (the old node5/6 `global-dns.conf` did exactly this; now `FallbackDNS=` only). Root cause analysis: `docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md`.
+- **devvm**: same `viktorbarzin.conf` drop-in (predates the node rollout; provisioned by `setup-devvm.sh`).
+- **in-cluster pods → forgejo**: CoreDNS `rewrite name exact forgejo.viktorbarzin.me traefik.traefik.svc.cluster.local` (2026-06-04, beads code-yh33) — pods bypass node resolved entirely.
+- **Trade-off**: `*.viktorbarzin.me` resolution from nodes/devvm now depends on in-cluster Technitium (3 replicas). During a full cluster outage these names SERVFAIL — acceptable, the services behind them are down anyway; bootstrap images pull via the IP-addressed `10.0.20.10` mirrors, so cold-start self-unwinds.
 
 Config is synced to all 3 Technitium instances by CronJob `technitium-split-horizon-sync` (every 6h).
 

@@ -59,28 +59,55 @@ CoreDNS forgejo rewrite (2026-06-04) covers pods only, not kubelet.
 
 ## Fix
 
-`/etc/hosts` pin on every k8s node (hot, no drain, no containerd restart):
+**Initial mitigation (same morning):** `/etc/hosts` pin
+`10.0.20.203 forgejo.viktorbarzin.me` on every node — restored service
+immediately (resolve + token + blob legs all internal with correct SNI).
+
+**Superseded same day (Viktor: "no hardcoded IPs in nodes") by a DNS-based
+fix.** Discovery: Technitium's split-horizon zone *already* resolves
+`forgejo.viktorbarzin.me → CNAME viktorbarzin.me → A <live Traefik IP>` —
+the `technitium-ingress-dns-sync` CronJob auto-CNAMEs every ingress host
+hourly, the apex A record tracks the live Traefik LB IP, and the
+`viktorbarzin-apex-probe` canary alerts on drift. The nodes simply never
+queried Technitium (resolv chain: pfSense + public AdGuard fallback). The
+devvm already solved this with a systemd-resolved **routing domain**
+drop-in; the same was rolled to all 7 nodes:
 
 ```
-10.0.20.203 forgejo.viktorbarzin.me # forgejo-internal-pin (managed: setup-forgejo-containerd-mirror.sh)
+# /etc/systemd/resolved.conf.d/viktorbarzin.conf
+[Resolve]
+DNS=10.0.20.201
+Domains=~viktorbarzin.me
 ```
 
-Go's resolver (containerd) consults `/etc/hosts` first, so resolve + token
-+ blob legs all go to internal Traefik with correct SNI and a valid
-wildcard cert (no `skip_verify` needed on this path). Applied live to all
-7 nodes; persisted in `modules/create-template-vm/k8s-node-containerd-setup.sh`
-(new nodes) and `scripts/setup-forgejo-containerd-mirror.sh` (existing-node
-rollout). hosts.toml mirror left in place (harmless, uniform config).
+The `/etc/hosts` pins were then removed (verified `getent` still returns
+the Traefik IP via DNS, and `crictl pull` succeeds). On node5/6 the
+cloud-init `global-dns.conf` (`DNS=8.8.8.8 1.1.1.1`) was demoted to
+`FallbackDNS=` only — public servers in the global set merge with and
+race the routing domain. That file's original justification ("Technitium
+NXDOMAINs forgejo.viktorbarzin.me") was obsolete: the ingress-dns-sync
+has since added forgejo to the zone — a stale comment that actively
+pointed new nodes at the hairpin.
 
-**Renumber hazard:** the pin hardcodes Traefik's LB IP, same as the
-hosts.toml mirror and the 5 literals broken by the 2026-05-30 `.200→.203`
-move. Any future Traefik LB renumber must update both (grep nodes for
-`forgejo-internal-pin`).
+Persisted in `modules/create-template-vm/cloud_init.yaml` (new nodes; DNS
+drop-ins) and `scripts/setup-forgejo-containerd-mirror.sh` (existing-node
+rollout). hosts.toml mirror left in place but documented as vestigial.
+
+**Renumber hazard: resolved.** A future Traefik LB renumber propagates
+via the apex A record automatically (drift probe alerts if it doesn't);
+only the vestigial hosts.toml literal goes stale. **New trade-off:**
+`*.viktorbarzin.me` resolution from nodes now depends on in-cluster
+Technitium (3 replicas); in a full cluster outage these names SERVFAIL —
+acceptable, the services are down anyway, and bootstrap images pull via
+the IP-addressed `10.0.20.10` mirrors.
 
 ## Verification
 
-- `getent hosts forgejo.viktorbarzin.me` → `10.0.20.203` on all 7 nodes;
-`curl https://forgejo.viktorbarzin.me/v2/` → 401 (internal route, valid TLS).
+- `getent hosts forgejo.viktorbarzin.me` → `10.0.20.203` on all 7 nodes
+**with no `/etc/hosts` entry** (pure DNS via the routing domain);
+`resolvectl status` shows `~viktorbarzin.me` routed to `10.0.20.201`;
+general resolution (`getent hosts google.com`) intact on every node;
+`crictl pull` of the tuya_bridge image succeeds via the DNS path.
 - tuya-bridge pod Running; `/health` `ok=true`; 27/27 devices
 `success=true`; 7/7 `*_tuya_cloud_up` gauges = 1; no tuya-related alerts.
 
@@ -90,10 +117,16 @@ move. Any future Traefik LB renumber must update both (grep nodes for
 latency bomb with the blast delayed until the cache misses.
 - Registry token realms are absolute URLs: any "redirect the registry"
 scheme must also redirect the *name*, not just the endpoint.
-- The remaining hairpin-exposed leg is **devvm git** (manual `/etc/hosts`
-workaround documented in memory); a durable LAN-wide fix would need
-pfSense Unbound host overrides (live network device — deliberate,
-separate change).
+- Before inventing a redirect mechanism, check what the DNS authority
+already serves: the Technitium split-horizon zone had the correct,
+auto-maintained answer all along — the clients just weren't asking it.
+- Stale config comments are load-bearing: the obsolete "Technitium
+NXDOMAINs forgejo" comment in cloud-init steered new nodes onto public
+DNS, recreating the hairpin exposure on every node added after it.
+- All `10.0.x` legs are now DNS-routed (nodes + devvm via routing domain,
+pods via CoreDNS rewrite). pfSense Unbound host overrides remain an
+option for other LAN segments if a non-Technitium client ever needs
+internal answers (live network device — deliberate, separate change).
 
 ## Related
 

@@ -119,9 +119,9 @@ cd infra/stacks/kyverno && scripts/tg apply
 cd infra/stacks/monitoring && scripts/tg apply
 cd infra/stacks/forgejo && scripts/tg apply
 
-# Containerd hosts.toml + /etc/hosts pin on each existing k8s node — VM
-# cloud-init only fires on first boot. The /etc/hosts pin
-# (10.0.20.203 forgejo.viktorbarzin.me) is what makes pulls hairpin-proof:
+# Resolved routing domain (+ vestigial containerd hosts.toml) on each
+# existing k8s node — VM cloud-init only fires on first boot. The routing
+# domain (~viktorbarzin.me -> Technitium) is what makes pulls hairpin-proof:
 # the hosts.toml mirror alone falls back to public DNS (Traefik 404s its
 # bare-IP requests, and the registry auth realm is an absolute public URL).
 infra/scripts/setup-forgejo-containerd-mirror.sh
@@ -138,9 +138,11 @@ docker pull alpine:3.20
 docker tag alpine:3.20 forgejo.viktorbarzin.me/viktor/smoketest:1
 docker push forgejo.viktorbarzin.me/viktor/smoketest:1
 
-# Per-node pull path: pin present + name resolves internally + pull works.
-ssh wizard@<node> 'grep forgejo-internal-pin /etc/hosts && getent hosts forgejo.viktorbarzin.me'
-# Expect: 10.0.20.203 forgejo.viktorbarzin.me
+# Per-node pull path: routing domain active + name resolves to the live
+# Traefik LB (via Technitium split-horizon zone) + pull works.
+ssh wizard@<node> 'resolvectl status | grep -A2 "~viktorbarzin.me"; getent hosts forgejo.viktorbarzin.me'
+# Expect: DNS Domain ~viktorbarzin.me on server 10.0.20.201, and
+# getent -> the current Traefik LB IP (10.0.20.203 today)
 ssh wizard@<node> sudo crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1
 
 # Confirm the cluster-wide Secret was synced into a fresh namespace.