traefik+pfsense: real IPv6 client IPs via HAProxy PROXY-v2 bridge

Replace the pfSense socat IPv6 forwarder (which masked every IPv6 client
as 10.0.20.1) with a standalone HAProxy bridge using send-proxy-v2, so
real IPv6 client IPs reach Traefik/CrowdSec. Traefik now trusts PROXY-v2
only from 10.0.20.1 on the web/websecure entrypoints; real IPv4 clients
(ETP=Local, own source IP) are unaffected. Mail-over-IPv6 routed through
the mail NodePorts (send-proxy-v2) too. Bridge is TCP/h2 only (no QUIC
over IPv6). Persistence on pfSense: rc.d/ipv6proxy + ipv6_proxy.sh
(config.xml shellcmd), keeping the nginx-off-[::] patch.

Also fixes stale networking.md: Traefik was still documented on the
shared .200; it moved to dedicated .203/ETP=Local on 2026-05-30.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-05-30 09:51:23 +00:00
parent 16c9aafafa
commit e9046e5a26
4 changed files with 67 additions and 3 deletions

View file

@ -126,6 +126,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
- **Retry middleware**: 2 attempts, 100ms — in default ingress chain.
- **HTTP/3 (QUIC)**: Enabled on Traefik. Works for **direct (non-proxied) apps** via the dedicated LB IP below (ETP=Local). Proxied apps get QUIC at the Cloudflare edge.
- **Traefik LB IP = `10.0.20.203`, `externalTrafficPolicy: Local`** (dedicated, NOT the shared `.200`). Moved off the shared `.200` on 2026-05-30 so direct/non-proxied apps preserve the **real client IP for CrowdSec** (ETP=Cluster SNAT'd them to the node IP) and so QUIC works. **The shared `10.0.20.200` keeps the other 10 LB services** (PG state-backend `postgresql-lb`, headscale, wireguard, coturn, xray, etc. — all ETP=Cluster; MetalLB forbids mixed ETP on a shared IP, hence Traefik's own IP). **cloudflared targets the in-cluster Traefik Service** (`https://traefik.traefik.svc.cluster.local:443`, remote/dashboard tunnel config — edit via CF Global API Key in `secret/platform`), so proxied apps are decoupled from the LB IP. pfSense WAN 443 (tcp+udp) NAT → alias `traefik_lb` (`.203`). Internal split-horizon apex `viktorbarzin.me A``.203`. Full runbook + post-mortem: `docs/plans/2026-05-30-traefik-dedicated-ip-etp-local-*`.
- **IPv6 ingress** = HE 6in4 tunnel (`2001:470:6e:43d::2`) → **standalone HAProxy on pfSense** (`/usr/local/etc/ipv6-haproxy.cfg`, NOT the HAProxy package) using `send-proxy-v2` → Traefik `.203` (web 443/80) + mail NodePorts `30125-30128` (25/465/587/993) — so **real IPv6 client IPs reach CrowdSec**. Traefik trusts PROXY-v2 **only from `10.0.20.1`** (`entryPoints.web/websecure.proxyProtocol.trustedIPs`); real IPv4 clients (own source IP) unaffected. **No QUIC over IPv6** (bridge is TCP/h2). Replaced socat 2026-05-30 (socat masked every v6 client as `10.0.20.1`). Boot/persistence: config.xml `<shellcmd>``ipv6_proxy.sh` (patches nginx off `[::]:443/:80` to free the tunnel IPv6, then `service ipv6proxy onestart`); `rc.d/ipv6proxy` manages HAProxy. Backends use **no health `check`** (a plain TCP check false-DOWNs the PROXY-expecting listeners). As-built: `docs/architecture/networking.md` → "IPv6 Ingress".
- **IPAM & DNS auto-registration**: pfSense Kea DHCP serves all 3 subnets (VLAN 10, VLAN 20, 192.168.1.x). Kea DDNS auto-registers every DHCP client in Technitium (RFC 2136, A+PTR). CronJob `phpipam-pfsense-import` (hourly) pulls Kea leases + ARP into phpIPAM via SSH (passive, no scanning). CronJob `phpipam-dns-sync` (15min) bidirectional sync phpIPAM ↔ Technitium. 42 MAC reservations for 192.168.1.x.
## Service-Specific Notes

View file

@ -254,11 +254,11 @@ Additional middleware:
### MetalLB & Load Balancing
MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer 2 mode**. Most LoadBalancer services share **10.0.20.200** using the `metallb.io/allow-shared-ip: shared` annotation. Technitium DNS has a **dedicated IP (10.0.20.201)** with `externalTrafficPolicy: Local` to preserve client source IPs for query logging.
MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer 2 mode**. Most LoadBalancer services share **10.0.20.200** using the `metallb.io/allow-shared-ip: shared` annotation. Two services have **dedicated IPs** with `externalTrafficPolicy: Local` to preserve real client source IPs: **Traefik (10.0.20.203)** — so CrowdSec sees real public IPs on the direct-ingress path and QUIC/HTTP3 works (a shared IP forbids the mixed ETP that QUIC's UDP listener needs) — and **Technitium DNS (10.0.20.201)** for query logging.
| Service | Namespace | IP | Ports |
|---------|-----------|-----|-------|
| traefik | traefik | 10.0.20.200 (shared) | 80, 443, 443/UDP (HTTP/3), 10200, 10300, 11434/TCP |
| traefik | traefik | **10.0.20.203 (dedicated, ETP=Local)** | 80, 443, 443/UDP (HTTP/3), 10200, 10300, 11434/TCP |
| coturn | coturn | 10.0.20.200 (shared) | 3478/UDP (STUN/TURN), 49152-49252/UDP (relay) |
| headscale | headscale | 10.0.20.200 (shared) | 41641/UDP, 3479/UDP |
| windows-kms¹ | kms | 10.0.20.200 (shared) | 1688/TCP |
@ -270,7 +270,7 @@ MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer
| xray-reality | xray | 10.0.20.200 (shared) | 7443/TCP |
| **technitium-dns** | **technitium** | **10.0.20.201 (dedicated)** | **53/UDP+TCP** |
pfSense aliases reference these IPs: `k8s_shared_lb` (10.0.20.200), `technitium_dns` (10.0.20.201). NAT rules use aliases for maintainability.
pfSense aliases reference these IPs: `k8s_shared_lb` (10.0.20.200), `traefik_lb` (10.0.20.203), `technitium_dns` (10.0.20.201). NAT rules use aliases for maintainability — the WAN 443 (TCP+UDP) forward targets `traefik_lb`.
¹ **windows-kms is publicly WAN-exposed.** pfSense forwards WAN TCP/1688 → `k8s_shared_lb:1688` so any internet host can activate. The matching filter rule applies a per-source rate limit (`max-src-conn 50`, `max-src-conn-rate 10/60`) with `overload <virusprot>` flush — offenders are auto-added to pfSense's stock `virusprot` pf table for follow-on blocks. Operations (rate-limit tuning, log locations, revocation) are documented in `docs/runbooks/kms-public-exposure.md`.
@ -283,6 +283,28 @@ Critical services are scaled to **3 replicas**:
PodDisruptionBudgets ensure at least 2 replicas remain during node maintenance or disruptions.
### IPv6 Ingress (HE Tunnel + HAProxy Bridge)
Public IPv6 reaches the cluster over a **Hurricane Electric 6in4 tunnel** terminated on pfSense (`gif0`; tunnel endpoint `2001:470:6e:43d::2`, LAN prefix `2001:470:6f:43d::/64`). The apex `viktorbarzin.me AAAA``2001:470:6e:43d::2`.
pfSense cannot NAT IPv6→IPv4, so ingress is bridged by a **standalone HAProxy** on pfSense (a separate config/service — *not* the pfSense HAProxy package) that listens on the tunnel IPv6 and forwards to the IPv4 cluster LBs with **PROXY protocol v2 (`send-proxy-v2`)**, so real client IPv6 addresses propagate to CrowdSec instead of being masked as `10.0.20.1`:
| Listen `[2001:470:6e:43d::2]:` | → Backend (`send-proxy-v2`) | Purpose |
|---|---|---|
| 443, 80 | Traefik `10.0.20.203:443` / `:80` | Web apps |
| 25, 465, 587, 993 | mail NodePorts `30125` / `30126` / `30127` / `30128` on .101-103 | SMTP / SMTPS / Submission / IMAPS |
The web path works because Traefik trusts PROXY-v2 **only from `10.0.20.1`** (`entryPoints.web/websecure.proxyProtocol.trustedIPs` in `stacks/traefik/.../main.tf`) — real IPv4 clients arrive via ETP=Local with their own source IP (never `10.0.20.1`), so they are unaffected. Mail backends hit the mailserver's PROXY-aware alt-listeners (same pattern as the IPv4 mail HAProxy — see `mailserver.md`).
**No QUIC over IPv6** — the bridge is TCP/h2 only; IPv4 carries QUIC/HTTP3.
pfSense files (out-of-band, **not Terraform**):
- `/usr/local/etc/ipv6-haproxy.cfg` — the 6-frontend bridge config above.
- `/usr/local/etc/rc.d/ipv6proxy` — service wrapper (`service ipv6proxy {start,stop,status}`); `start` does a graceful `-sf` reload.
- `/usr/local/etc/ipv6_proxy.sh` — boot entrypoint (config.xml `<shellcmd>`): patches pfSense nginx off `[::]:443/:80` (rebinds to LAN IPv6) to free the tunnel IPv6, then `service ipv6proxy onestart`.
**Gotcha:** the backends use **no health `check`** — a plain TCP check hits the PROXY-expecting listeners without a PROXY header and would false-mark them DOWN. This path previously used `socat` (functional, but masked every IPv6 client as `10.0.20.1`); replaced by HAProxy on 2026-05-30 for real client IPs.
### Container Registry Pull-Through Cache
**Location**: Registry VM at 10.0.20.10

View file

@ -197,3 +197,34 @@ node `10.0.20.103`); PG state DB OK; TF state reconciled (`tg apply` exit 0).
(`2001:470:6e:43d::2`, separate HE-tunnel path, unchanged) and fails before
reaching Traefik. Confirm QUIC from a real device (Chrome → Protocol `h3`).
- pfSense `nginx` alias (=`.200`) is now unused; `traefik_lb` (=`.203`) is live.
## IPv6 follow-up — 2026-05-30 — DONE (HAProxy bridge, real client IPs)
The ETP=Local cutover fixed real client IPs + QUIC on the **IPv4** direct path
only. The **IPv6** path (HE 6in4 tunnel `2001:470:6e:43d::2` → pfSense) still ran
`socat`, which (a) masked every IPv6 client as `10.0.20.1`, and (b) broke
outright once Traefik's `proxyProtocol.trustedIPs` started requiring PROXY-v2
from `10.0.20.1`. Replaced socat with a **standalone HAProxy bridge** on pfSense
using `send-proxy-v2` so real IPv6 client IPs reach Traefik/CrowdSec.
Executed:
1. **Traefik** (TF, `stacks/traefik/.../main.tf`): added
`proxyProtocol = { trustedIPs = ["10.0.20.1"] }` to the `web` + `websecure`
entrypoints. Bounded risk — only connections *from* `10.0.20.1` (the bridge)
are PROXY-parsed; real IPv4 clients (ETP=Local, own source IP) are untouched.
Applied; IPv4 + proxied verified 200 immediately after.
2. **pfSense HAProxy** (`/usr/local/etc/ipv6-haproxy.cfg`): 6 frontends on
`[::2]:{443,80,25,465,587,993}` → Traefik `.203:{443,80}` and mail NodePorts
`{30125,30126,30127,30128}` (.101-103), all `send-proxy-v2`, **no `check`**
(a plain check would false-DOWN the PROXY-expecting listeners).
3. **Persistence**: rewrote `rc.d/ipv6proxy` → manages HAProxy
(`service ipv6proxy {start,stop,status}`, graceful `-sf`); rewrote
`ipv6_proxy.sh` (config.xml `<shellcmd>` boot entrypoint) to keep the
nginx-off-`[::]` patch then `service ipv6proxy onestart`. socat backups kept
as `*.socat-bak-*`.
**Verified:** web over `::2` = 200; Traefik logs show real public IPv6 clients
(e.g. `2620:10d:c092:500::6:1eda`), **zero** `10.0.20.1` artifacts; mail-over-IPv6
`220` banners on `::2:25/587` + IMAPS connect on `::2:993`; IPv4 direct/proxied
+ QUIC (`alt-svc: h3=":443"`) unaffected. **No QUIC over IPv6** (bridge is TCP/h2).
Authoritative as-built: `docs/architecture/networking.md` → "IPv6 Ingress".

View file

@ -130,6 +130,9 @@ resource "helm_release" "traefik" {
}
}
}
proxyProtocol = {
trustedIPs = ["10.0.20.1"]
}
}
websecure = {
port = 8443
@ -147,6 +150,13 @@ resource "helm_release" "traefik" {
enabled = true
advertisedPort = 443
}
# Accept PROXY-v2 ONLY from the pfSense HAProxy IPv6 bridge (10.0.20.1)
# so IPv6 clients (forwarded [2001:470:6e:43d::2] -> here) get their real
# IP for CrowdSec. Real IPv4 clients arrive with their own source IP
# (ETP=Local, not 10.0.20.1) and are unaffected.
proxyProtocol = {
trustedIPs = ["10.0.20.1"]
}
}
whisper-tcp = {
port = 10300