pfsense: SNI-routed internal 443 — mail.viktorbarzin.me serves webmail everywhere
Completes the internal port table of the mail front door (10.0.20.1):
443 was squatted by the pfSense webGUI (self-signed cert expired 2022),
so internal webmail and the kuma [External] mail probe hit the firewall
login instead of Roundcube — the last leg of the mail split-brain name.
Design (Viktor): route by what the client asked for. New HAProxy
frontend internal_https_443 (binds 10.0.20.1+10.0.10.1 :443, mode tcp):
SNI present -> Traefik .203 with send-proxy-v2 (trusted, IPv6-bridge
pattern, no health check per the PROXY-probe gotcha); SNI of
pfsense.viktorbarzin.{lan,me} or NO SNI (bare-IP admin access) -> webGUI,
which moved to :8443 (invisible to habits — https://10.0.20.1 still
lands on the login page; :8443 doubles as direct fallback). The
reverse-proxy pfsense ingress now targets :8443 directly.
Declared idempotently in pfsense-haproxy-bootstrap.php; config.xml
backed up on-box (config.xml.bak-2026-06-10-pre-sni443). Verified:
bare IP -> GUI login; pfsense.viktorbarzin.lan -> GUI;
pfsense.viktorbarzin.me -> 302 via ingress; mail.viktorbarzin.me ->
Roundcube with STRICT cert validation; :993 IMAPS untouched.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
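
The checks listed under "Verified" in the commit message could be approximated with a small TLS probe. This is a minimal sketch, not the procedure the author ran: `make_context` and `probe` are hypothetical helper names, and it assumes the probing host trusts the CA behind the certificate Traefik serves. Passing `sni=None` makes Python's `ssl` module omit the SNI extension, which exercises the no-SNI (bare-IP) route.

```python
import hashlib
import socket
import ssl
from typing import Optional


def make_context(strict: bool) -> ssl.SSLContext:
    """Build a client TLS context; strict=True enforces chain and hostname checks."""
    ctx = ssl.create_default_context()
    if not strict:
        # The webGUI keeps a self-signed certificate, so validation is
        # switched off for that leg only (hostname check must go first).
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(ip: str, sni: Optional[str], strict: bool,
          port: int = 443, timeout: float = 5.0) -> str:
    """Connect to ip:port, send `sni` (no SNI if None), return the cert's SHA-256."""
    ctx = make_context(strict)
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            der = tls.getpeercert(binary_form=True)
            return hashlib.sha256(der).hexdigest()
```

For example, `probe("10.0.20.1", "mail.viktorbarzin.me", strict=True)` should only succeed if the hostname route reaches a server with a valid certificate for that name, while `probe("10.0.20.1", None, strict=False)` should return the fingerprint of whatever answers the no-SNI route. Two different fingerprints suggest the routes terminate on different servers.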
parent 176a65d3d2
commit eae35c511a
4 changed files with 113 additions and 4 deletions
@ -269,7 +269,7 @@ Technitium's **Split Horizon AddressTranslation** app post-processes DNS respons
- **Affected**: Non-proxied domains (ha-sofia, immich, headscale, calibre, vaultwarden, etc.) for 192.168.1.x clients
- **Not affected**: Cloudflare-proxied domains (resolve to Cloudflare edge IPs, no translation needed)
-- **10.0.x.x clients (k8s nodes, devvm, other VMs)** — handled at the resolver since 2026-06-10: **pfSense Unbound carries a domain override forwarding the whole `viktorbarzin.me` zone to Technitium** (`10.0.20.201`). Technitium's split-horizon zone answers with the zone apex A record, which auto-tracks the live Traefik LB IP (`technitium-ingress-dns-sync` CNAMEs every ingress host hourly; `viktorbarzin-apex-probe` is the drift canary). Every client of pfSense Unbound — all VLANs, k8s nodes included — therefore gets internal answers with **zero per-host configuration** (no `/etc/hosts` pins, no resolved drop-ins; both earlier same-day approaches were removed, nodes are stock). Names not behind Traefik keep distinct records in the zone (e.g. `mail.viktorbarzin.me → 10.0.20.1`, verified working on :993/:25). See `docs/runbooks/pfsense-unbound.md` for the override config + rollback, and `docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md` for the incident that motivated this (kubelet forgejo pulls riding the broken hairpin; the containerd hosts.toml mirror cannot fix it — Traefik 404s bare-IP requests and the registry auth realm is an absolute public URL).
+- **10.0.x.x clients (k8s nodes, devvm, other VMs)** — handled at the resolver since 2026-06-10: **pfSense Unbound carries a domain override forwarding the whole `viktorbarzin.me` zone to Technitium** (`10.0.20.201`). Technitium's split-horizon zone answers with the zone apex A record, which auto-tracks the live Traefik LB IP (`technitium-ingress-dns-sync` CNAMEs every ingress host hourly; `viktorbarzin-apex-probe` is the drift canary). Every client of pfSense Unbound — all VLANs, k8s nodes included — therefore gets internal answers with **zero per-host configuration** (no `/etc/hosts` pins, no resolved drop-ins; both earlier same-day approaches were removed, nodes are stock). Names not behind Traefik keep distinct records in the zone (e.g. `mail.viktorbarzin.me → 10.0.20.1`, verified working on :993/:25; since 2026-06-10 its :443 also works internally — pfSense carries an SNI-routed HAProxy frontend on 443 that sends hostname traffic to Traefik and bare-IP/no-SNI traffic to the webGUI, which moved to :8443; see `docs/runbooks/mailserver-pfsense-haproxy.md`). See `docs/runbooks/pfsense-unbound.md` for the override config + rollback, and `docs/post-mortems/2026-06-10-tuya-bridge-forgejo-pull-hairpin.md` for the incident that motivated this (kubelet forgejo pulls riding the broken hairpin; the containerd hosts.toml mirror cannot fix it — Traefik 404s bare-IP requests and the registry auth realm is an absolute public URL).
- **devvm**: also covered by a `~viktorbarzin.me → 10.0.20.201` resolved routing domain (predates the pfSense override, provisioned by `setup-devvm.sh`) — redundant-but-harmless belt-and-suspenders.
- **in-cluster PODS are ordinary internal clients too** (since 2026-06-10 evening): CoreDNS's dedicated `viktorbarzin.me:53` block (in `stacks/technitium`, TF-managed) forwards to the Technitium ClusterIP (`10.96.0.53`, same as the `.lan` block), so pods get the same split-horizon answers as everyone else. This works because on k8s 1.34 **pods CAN reach the ETP=Local Traefik LB IP** — kube-proxy short-circuits in-cluster traffic to LB IPs via the cluster path (verified from pods on three non-Traefik nodes; re-verify after major k8s upgrades — the canary is the uptime-kuma `[External]` fleet going red). forgejo stays pinned to Traefik's **ClusterIP** in the same block so CI pushes survive a Technitium outage. History: the block briefly forwarded to `8.8.8.8/1.1.1.1` (morning of 2026-06-10), which kept pods on public IPs and the broken TP-Link NAT loopback — 27 non-proxied `[External]` uptime-kuma monitors dark (beads code-yh33). Note: in-cluster `[External]` monitors now test DNS+Traefik+service via the internal path for ALL names, including Cloudflare-proxied ones — genuine edge-path fidelity is the job of a true external vantage (ha-london), not in-cluster probes.
- **Trade-off**: `viktorbarzin.me` resolution via pfSense now depends on in-cluster Technitium (3 replicas). During a full cluster outage the zone SERVFAILs LAN-wide — acceptable, the services behind it are down anyway; node bootstrap images pull via the IP-addressed `10.0.20.10` mirrors, so cold-start self-unwinds.
@ -55,7 +55,7 @@ External mail (WAN) path — PROXY v2
 │ pfSense WAN:{25,465,587,993} │
 │ │ NAT rdr → 10.0.20.1:{same} │
 │ ▼ │
-│ pfSense HAProxy (mode tcp, 4 frontends, 4 backend pools) │
+│ pfSense HAProxy (mode tcp, 5 frontends, 6 backend pools) │
 │ │ data: send-proxy-v2 → :{30125..30128} (PROXY-aware pod) │
 │ │ health: TCP-check → :{30145..30147} (no-PROXY pod) │
 │ │ inter 5000 │
@ -113,6 +113,28 @@ kubectl logs -c docker-mailserver deployment/mailserver -n mailserver \
 # Expect external source IPs (e.g., Brevo 77.32.148.x), NOT 10.0.20.x
 ```
 
+## SNI-routed internal :443 frontend (2026-06-10)
+
+`internal_https_443` binds `10.0.20.1:443` + `10.0.10.1:443` and completes
+the internal port table of the mail front door so `mail.viktorbarzin.me`
+(internal A record → 10.0.20.1) serves webmail too. Routing (Viktor's
+design: route by what the client asked for):
+
+| Client connects with | Routed to |
+|---|---|
+| SNI = `pfsense.viktorbarzin.{lan,me}` | webgui backend `127.0.0.1:8443` |
+| any other SNI (hostnames, e.g. `mail.…`) | Traefik `10.0.20.203:443`, send-proxy-v2 |
+| no SNI (bare IP, e.g. `https://10.0.20.1`) | webgui backend `127.0.0.1:8443` |
+
+The **pfSense webGUI was moved to `:8443`** (config.xml
+`system.webgui.port`, 2026-06-10) to free the 443 socket; admin access by
+IP keeps working through the no-SNI route, and `:8443` remains a direct
+fallback if HAProxy is down. The `pfsense.viktorbarzin.me` Traefik ingress
+(stacks/reverse-proxy) targets `:8443` directly. The Traefik leg mirrors the
+IPv6 bridge: send-proxy-v2 (Traefik trusts 10.0.20.1), **no health check**
+(PROXY-expecting receivers reject bare probes; see the gotcha above). All of
+this is declared in `pfsense-haproxy-bootstrap.php`; re-run it to reset.
+
 ## Bootstrap / restore from scratch
 
pfSense HAProxy config lives in `/cf/conf/config.xml` under
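
The routing table added in the hunk above can be read as a three-way decision evaluated in order. The following is a minimal sketch of that decision as a pure function, useful for reasoning about edge cases. It is not how HAProxy is configured; `select_pool` is a hypothetical name, and the pool names and SNI values are taken from the bootstrap script in this commit.

```python
from typing import Optional

# Backend pool names as declared in the bootstrap script.
WEBGUI_POOL = "pfsense_webgui_8443"
TRAEFIK_POOL = "webgui_traefik_443"

# SNI values excepted back to the webGUI (ACL `sni_pfsense`, case-insensitive).
PFSENSE_SNI = {"pfsense.viktorbarzin.lan", "pfsense.viktorbarzin.me"}


def select_pool(sni: Optional[str]) -> str:
    """Return the pool a connection would go to, given its SNI (None = no SNI)."""
    if sni and sni.lower() in PFSENSE_SNI:
        return WEBGUI_POOL   # first use_backend rule: sni_pfsense
    if sni:
        return TRAEFIK_POOL  # second rule: sni_any (any SNI present)
    return WEBGUI_POOL       # no rule matched: the frontend's default pool
```

Rule order matters here: the `pfsense.*` exception has to be checked before the catch-all "any SNI" rule, otherwise those names would be sent to Traefik as well.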
@ -45,6 +45,8 @@ $h['maxconn'] = '1000';
 // Our declared object names (anything starting with mailserver_ is ours)
 $POOL_NAMES = [
+    'webgui_traefik_443',    // SNI-routed 443: hostname traffic -> Traefik
+    'pfsense_webgui_8443',   // SNI-routed 443: no-SNI / pfsense.* -> webgui
     'mailserver_nodes',      // legacy (Phase 2/3 test)
     'mailserver_nodes_smtp',
     'mailserver_nodes_smtps',
@ -52,6 +54,7 @@ $POOL_NAMES = [
     'mailserver_nodes_imaps',
 ];
 $FRONTEND_NAMES = [
+    'internal_https_443',    // SNI-routed internal 443 (2026-06-10)
     'mailserver_proxy_test', // legacy (Phase 2/3 test, :2525)
     'mailserver_proxy_25',
     'mailserver_proxy_465',
@ -185,6 +188,58 @@ $h['ha_pools']['item'][] = build_pool('mailserver_nodes_smtps', '30126', $NODES,
 $h['ha_pools']['item'][] = build_pool('mailserver_nodes_sub', '30127', $NODES, 'TCP', '30147');
 $h['ha_pools']['item'][] = build_pool('mailserver_nodes_imaps', '30128', $NODES);
 
+// ── SNI-routed internal :443 pools (2026-06-10) ─────────────────────────
+// Completes the internal port table of 10.0.20.1 so mail.viktorbarzin.me
+// (internal A record -> 10.0.20.1) serves webmail too. Routing rule
+// (Viktor's design): TLS with a hostname (SNI present) -> Traefik; bare-IP
+// /no-SNI (admin hitting https://10.0.20.1) -> pfSense webgui, which moved
+// to :8443 to free the socket. pfsense.viktorbarzin.{lan,me} SNI is
+// excepted back to the webgui. Traefik leg mirrors the IPv6 bridge:
+// send-proxy-v2 (Traefik trusts 10.0.20.1), NO health check (PROXY-
+// expecting receivers reject bare probes; see runbook gotcha).
+$h['ha_pools']['item'][] = [
+    'name' => 'webgui_traefik_443',
+    'balance' => '',
+    'check_type' => 'none',
+    'monitor_domain' => '',
+    'checkinter' => '',
+    'retries' => '',
+    'ha_servers' => ['item' => [[
+        'name' => 'traefik',
+        'address' => '10.0.20.203',
+        'port' => '443',
+        'weight' => '10',
+        'ssl' => '',
+        'advanced' => 'send-proxy-v2',
+        'status' => 'active',
+    ]]],
+    'advanced_bind' => '',
+    'persist_cookie_enabled' => '',
+    'transparent_clientip' => '',
+    'advanced' => '',
+];
+$h['ha_pools']['item'][] = [
+    'name' => 'pfsense_webgui_8443',
+    'balance' => '',
+    'check_type' => 'none',
+    'monitor_domain' => '',
+    'checkinter' => '',
+    'retries' => '',
+    'ha_servers' => ['item' => [[
+        'name' => 'webgui',
+        'address' => '127.0.0.1',
+        'port' => '8443',
+        'weight' => '10',
+        'ssl' => '',
+        'advanced' => '',
+        'status' => 'active',
+    ]]],
+    'advanced_bind' => '',
+    'persist_cookie_enabled' => '',
+    'transparent_clientip' => '',
+    'advanced' => '',
+];
+
 // ── Frontends ───────────────────────────────────────────────────────────
 if (!is_array($h['ha_backends'])) $h['ha_backends'] = ['item' => []];
 if (!is_array($h['ha_backends']['item'])) $h['ha_backends']['item'] = [];
@ -228,7 +283,36 @@ $h['ha_backends']['item'][] = build_frontend(
     'mailserver_nodes_imaps'
 );
 
-write_config('code-yiu: mailserver HAProxy — 4 production frontends + legacy :2525 test');
+// ── SNI-routed internal :443 frontend (2026-06-10) ──────────────────────
+// Binds both internal interface IPs so IP-based GUI access works from
+// either VLAN. mode tcp + SNI inspection; TLS passthrough on both legs
+// (Traefik serves the real certs; the webgui keeps its self-signed one).
+$h['ha_backends']['item'][] = [
+    'name' => 'internal_https_443',
+    'descr' => 'SNI-routed internal 443: hostname->Traefik (proxy-v2), no-SNI/pfsense.*->webgui:8443',
+    'status' => 'active',
+    'secondary' => '',
+    'type' => 'tcp',
+    'a_extaddr' => ['item' => [
+        ['extaddr' => 'custom', 'extaddr_custom' => '10.0.20.1', 'extaddr_port' => '443', 'extaddr_ssl' => '', 'extaddr_advanced' => ''],
+        ['extaddr' => 'custom', 'extaddr_custom' => '10.0.10.1', 'extaddr_port' => '443', 'extaddr_ssl' => '', 'extaddr_advanced' => ''],
+    ]],
+    'backend_serverpool' => 'pfsense_webgui_8443',
+    'ha_acls' => ['item' => [
+        ['name' => 'sni_pfsense', 'expression' => 'custom', 'value' => 'req.ssl_sni -i -m str pfsense.viktorbarzin.lan pfsense.viktorbarzin.me', 'casesensitive' => '', 'not' => ''],
+        ['name' => 'sni_any', 'expression' => 'custom', 'value' => 'req.ssl_sni -m found', 'casesensitive' => '', 'not' => ''],
+    ]],
+    'a_actionitems' => ['item' => [
+        ['action' => 'use_backend', 'use_backendbackend' => 'pfsense_webgui_8443', 'acl' => 'sni_pfsense'],
+        ['action' => 'use_backend', 'use_backendbackend' => 'webgui_traefik_443', 'acl' => 'sni_any'],
+    ]],
+    'dontlognull' => '',
+    'httpclose' => '',
+    'forwardfor' => '',
+    'advanced' => base64_encode("tcp-request inspect-delay 5s\n\ttcp-request content accept if { req.ssl_hello_type 1 } || !{ req.ssl_hello_type 1 }"),
+];
+
+write_config('mailserver HAProxy + SNI-routed internal 443 (hostname->Traefik, no-SNI->webgui:8443)');
 
$messages = '';
$rc = haproxy_check_and_run($messages, true);
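
The commit message describes the bootstrap script as idempotent, and `$POOL_NAMES` / `$FRONTEND_NAMES` list the objects it owns. The removal step itself is not visible in this diff, so the following is only a sketch of one common way such a script stays idempotent: drop every item whose name is owned, then append the fresh definitions. `redeclare` and its parameters are hypothetical names, not functions from the script.

```python
from typing import Dict, Iterable, List

Item = Dict[str, object]


def redeclare(existing: List[Item], owned_names: Iterable[str],
              fresh: List[Item]) -> List[Item]:
    """Replace all owned items with fresh definitions, leaving foreign items alone."""
    owned = set(owned_names)
    # Keep only items this script does not manage (e.g. hand-made pools).
    kept = [item for item in existing if item.get("name") not in owned]
    # Append the current declarations; a second run produces the same list.
    return kept + list(fresh)
```

Under this pattern, forgetting to add a new name to the owned list would make each run append another copy of that object, which would explain why this commit extends both name lists alongside the new definitions.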
@ -36,7 +36,10 @@ module "pfsense" {
   name             = "pfsense"
   external_name    = "pfsense.viktorbarzin.lan"
   tls_secret_name  = var.tls_secret_name
-  port             = 443
+  # webGUI moved to :8443 on 2026-06-10; :443 on pfSense is now the
+  # SNI-routed HAProxy frontend (hostname->Traefik, no-SNI->GUI). Direct
+  # backend port avoids a Traefik->HAProxy->GUI double hop.
+  port             = 8443
   backend_protocol = "HTTPS"
 
   extra_annotations = {