coredns: pods get internal split-horizon answers for viktorbarzin.me [ci skip]

Forward the viktorbarzin.me:53 pod block to the Technitium ClusterIP
(10.96.0.53, same as the .lan block) instead of 8.8.8.8/1.1.1.1. Pods
become ordinary internal clients (CNAME -> apex -> live Traefik LB;
mail -> 10.0.20.1). This fixes the 27 non-proxied [External]
uptime-kuma monitors that rode the TP-Link NAT loopback, hard-down
since 2026-06-09: the loopback refuses flows whose source equals the
reflection target, which is the case for all pfSense-SNAT'd cluster
traffic.

Enabled by re-testing a stale premise: on k8s 1.34 pods DO reach the
ETP=Local Traefik LB IP (kube-proxy short-circuits in-cluster traffic
to LB IPs; verified from pods on three non-Traefik nodes). Re-verify
after major k8s upgrades; the canary is the [External] fleet going red.

The NAT-layer alternatives (pfSense rdr, SNAT-drop) were rejected:
both fight return-path asymmetry and deepen the TP-Link dependency.

Verified in-pod: immich -> .203 + HTTPS 200, mail -> 10.0.20.1,
forgejo -> Traefik ClusterIP (pin kept for Technitium-outage
resilience).

Proxied [External] monitors now test the internal path; true edge
fidelity moves to the external vantage (ha-london, next fix).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
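
For orientation, the forwarding change described in this commit message might look roughly like the Corefile server block below. This is a minimal sketch, not the actual file from the commit: only the zone, the port, the old upstreams (8.8.8.8/1.1.1.1) and the new upstream (10.96.0.53) come from the message. The `errors` and `cache` lines are assumed for illustration, and the forgejo pin is left out because its mechanism is not shown here.

```
# Sketch only: plugins other than `forward` are assumptions.
viktorbarzin.me:53 {
    errors
    cache 30

    # Before: pods received public answers.
    # forward . 8.8.8.8 1.1.1.1

    # After: pods resolve via Technitium, like any other internal client.
    forward . 10.96.0.53
}
```

With a block of this shape, a pod lookup for a viktorbarzin.me name is answered by Technitium's internal view, so the connection goes straight to the Traefik LB IP and never touches the TP-Link NAT loopback.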
2026-06-10 16:21:34 +00:00
commit 59a531b8e0
parent 35c89fa90c
5 changed files with 36 additions and 15 deletions
--- a/docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md
+++ b/docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md
@@ -108,6 +108,19 @@ replaced them:
   `.me` names to `8.8.8.8/1.1.1.1`, preserving pods' pre-existing
   public-IP behavior. Replaces the old forgejo rewrite in `.:53`.

+   **Addendum (same day, evening):** the "pods cannot reach the
+   ETP=Local LB IP" premise was re-tested and is FALSE on k8s 1.34
+   (kube-proxy short-circuits in-cluster traffic to LB IPs via the
+   cluster path; verified from pods on three non-Traefik nodes). The
+   public-answer carve-out had meanwhile left pods as the only client
+   class still riding the TP-Link NAT loopback, which hard-died
+   2026-06-09 — 27 non-proxied `[External]` uptime-kuma monitors dark.
+   Fix: the block now forwards to the Technitium ClusterIP
+   (`10.96.0.53`) — pods are ordinary internal clients; forgejo pin
+   kept for Technitium-outage resilience. In-cluster `[External]`
+   monitors now test the internal path for all names; genuine
+   edge-path fidelity belongs to a true external vantage (ha-london).
+
 node5/6 were also re-pointed from link-DNS=Technitium to
 `10.0.20.1 94.140.14.14` (netplan + `qm set --nameserver` on PVE VMs
 205/206) for fleet parity, and their `global-dns.conf` was deleted.