From 43fe11fffca7e1afacceb7fb8aca769cd480159b Mon Sep 17 00:00:00 2001 From: Viktor Barzin Date: Sun, 19 Apr 2026 12:36:11 +0000 Subject: [PATCH] =?UTF-8?q?[mailserver]=20Phase=206=20=E2=80=94=20decommis?= =?UTF-8?q?sion=20MetalLB=20LB=20path=20[ci=20skip]?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Context (bd code-yiu) With Phase 4+5 proven (external mail flows through pfSense HAProxy + PROXY v2 to the alt PROXY-speaking container listeners), the MetalLB LoadBalancer Service + `10.0.20.202` external IP + ETP:Local policy are obsolete. Phase 6 decommissions them and documents the steady-state architecture. ## This change ### Terraform (stacks/mailserver/modules/mailserver/main.tf) - `kubernetes_service.mailserver` downgraded: `LoadBalancer` → `ClusterIP`. - Removed `metallb.io/loadBalancerIPs = "10.0.20.202"` annotation. - Removed `external_traffic_policy = "Local"` (irrelevant for ClusterIP). - Port set unchanged — the Service still exposes 25/465/587/993 for intra-cluster clients (Roundcube pod, `email-roundtrip-monitor` CronJob) that hit the stock PROXY-free container listeners. - Inline comment documents the downgrade rationale + companion `mailserver-proxy` NodePort Service that now carries external traffic. ### pfSense (ops, not in git) - `mailserver` host alias (pointing at `10.0.20.202`) deleted. No NAT rule references it post-Phase-4; keeping it would be misleading dead metadata. Reversible via WebUI + `php /tmp/delete-mailserver-alias.php` companion script (ad-hoc, not checked in — alias is just a Firewall → Aliases → Hosts entry). ### Uptime Kuma (ops) - Monitors `282` and `283` (PORT checks) retargeted from `10.0.20.202` → `10.0.20.1`. Renamed to `Mailserver HAProxy SMTP (pfSense :25)` / `... IMAPS (pfSense :993)` to reflect their new purpose (HAProxy layer liveness). History retained (edit, not delete-recreate). 
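For a quick manual check of the retargeted monitors, a plain TCP connect is roughly what a PORT-type check amounts to. The sketch below is illustrative only: `port_is_open`, `report`, and `TARGETS` are hypothetical names, not part of the repo, and the exact probe logic Uptime Kuma uses internally is assumed rather than confirmed here.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(targets, timeout: float = 3.0) -> dict:
    """Probe each (host, port) pair and return {"host:port": bool}."""
    return {f"{host}:{port}": port_is_open(host, port, timeout) for host, port in targets}


# Assumed targets: the pfSense HAProxy frontends that monitors 282/283 now point at.
TARGETS = [("10.0.20.1", 25), ("10.0.20.1", 993)]
```

Calling `report(TARGETS)` from a host on the k8s VLAN gives one boolean per frontend. A successful connect only shows that the HAProxy frontend accepts connections; it says nothing about backend health, which matches the monitors' stated purpose (HAProxy layer liveness).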
### Docs - `docs/runbooks/mailserver-pfsense-haproxy.md` — fully rewritten "Current state" section; now reflects steady-state architecture with two-path diagram (external via HAProxy / intra-cluster via ClusterIP). Phase history table marks Phase 6 ✅. Rollback section updated (no one-liner post-Phase-6; need Service-type re-upgrade + alias re-add). - `docs/architecture/mailserver.md` — Overview, Mermaid diagram, Inbound flow, CrowdSec section, Uptime Kuma monitors list, Decisions section (dedicated MetalLB IP → "Client-IP Preservation via HAProxy + PROXY v2"), Troubleshooting all updated. - `.claude/CLAUDE.md` — mailserver monitoring + architecture paragraph updated with new external path description; references the new runbook. ## What is NOT in this change - Removal of `10.0.20.202` from `cloudflare_proxied_names` or any reserved-IP tracking — it was never listed there. The `metallb-system default` IPAddressPool (10.0.20.200-220) reports 2 addresses assigned and 19 available after this change, confirming `.202` went back to the pool. - Phase 4 NAT-flip rollback scripts — kept on-disk, still valid if someone re-introduces the MetalLB LB (see runbook "Rollback"). ## Test Plan ### Automated (verified pre-commit 2026-04-19) ``` # Service is ClusterIP with no EXTERNAL-IP $ kubectl get svc -n mailserver mailserver mailserver ClusterIP 10.103.108.217 25/TCP,465/TCP,587/TCP,993/TCP # 10.0.20.202 no longer answers ARP (ping from pfSense) $ ssh admin@10.0.20.1 'ping -c 2 -t 2 10.0.20.202' 2 packets transmitted, 0 packets received, 100.0% packet loss # MetalLB pool released the IP $ kubectl get ipaddresspool default -n metallb-system \ -o jsonpath='{.status.assignedIPv4} of {.status.availableIPv4}' 2 of 19 available # E2E probe — external Brevo → WAN:25 → pfSense HAProxy → pod — STILL SUCCEEDS $ kubectl create job --from=cronjob/email-roundtrip-monitor probe-phase6 -n mailserver ... Round-trip SUCCESS in 20.3s ... 
$ kubectl delete job probe-phase6 -n mailserver # pfSense mailserver alias removed $ ssh admin@10.0.20.1 'php -r "..." | grep mailserver' (no output) ``` ### Manual Verification 1. Visit `https://uptime.viktorbarzin.me` — monitors 282/283 green on new hostname `10.0.20.1`. 2. Roundcube login works (`https://mail.viktorbarzin.me/`). 3. Send test email to `smoke-test@viktorbarzin.me` from Gmail — observe `postfix/smtpd-proxy25/postscreen: CONNECT from []` in mailserver logs within ~10s. 4. CrowdSec should still see real client IPs in postfix/dovecot parsers (verify with `cscli alerts list` on next auth-fail event). ## Phase history (bd code-yiu) | Phase | Status | Description | |---|---|---| | 1a | ✅ `ef75c02f` | k8s alt :2525 listener + NodePort Service | | 2 | ✅ 2026-04-19 | pfSense HAProxy pkg installed | | 3 | ✅ `ba697b02` | HAProxy config persisted in pfSense XML | | 4+5 | ✅ `9806d515` | 4-port alt listeners + HAProxy frontends + NAT flip | | 6 | ✅ **this commit** | MetalLB LB retired; 10.0.20.202 released; docs updated | Closes: code-yiu --- .claude/CLAUDE.md | 2 +- docs/architecture/mailserver.md | 43 ++-- docs/runbooks/mailserver-pfsense-haproxy.md | 199 ++++++++++++------- stacks/mailserver/modules/mailserver/main.tf | 14 +- 4 files changed, 160 insertions(+), 98 deletions(-) diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 88e4f11c..d2975bc0 100755 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -137,7 +137,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle - Every new service gets Prometheus scrape config + Uptime Kuma monitor. External monitors auto-created for Cloudflare-proxied services by `external-monitor-sync` CronJob (10min, uptime-kuma ns). Mechanism: `ingress_factory` auto-adds `uptime.viktorbarzin.me/external-monitor=true` whenever `dns_type != "none"` (see `modules/kubernetes/ingress_factory/main.tf`) — no manual action needed on new services. 
The `cloudflare_proxied_names` list in `config.tfvars` is a legacy fallback for the 17 hostnames not yet migrated to `ingress_factory` `dns_type`; don't check that list when debugging "is this monitored?" questions. - **External monitoring**: `[External] ` monitors in Uptime Kuma test full external path (DNS → Cloudflare → Tunnel → Traefik). Divergence metric `external_internal_divergence_count` → alert `ExternalAccessDivergence` (15min). Config: `stacks/uptime-kuma/`, targets from `cloudflare_proxied_names` in `config.tfvars` (17 remaining centrally-managed hostnames; most DNS records now auto-created by `ingress_factory` `dns_type` param). - Key alerts: OOMKill, pod replica mismatch, 4xx/5xx error rates, UPS battery, CPU temp, SSD writes, NFS responsiveness, ClusterMemoryRequestsHigh (>85%), ContainerNearOOM (>85% limit), PodUnschedulable, ExternalAccessDivergence. -- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Mailgun API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Mailserver on dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` for CrowdSec real-IP detection. Vault: `mailgun_api_key` in `secret/viktor` (probe), `brevo_api_key` in `secret/viktor` (relay). +- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Brevo HTTP API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). 
Inbound external traffic enters via pfSense HAProxy on `10.0.20.1:{25,465,587,993}`, which forwards to k8s `mailserver-proxy` NodePort (30125-30128) with `send-proxy-v2`. Mailserver pod runs alt PROXY-speaking listeners (2525/4465/5587/10993) alongside stock PROXY-free ones (25/465/587/993) for intra-cluster clients. Real client IPs recovered from PROXY v2 header despite kube-proxy SNAT (replaces pre-2026-04-19 MetalLB `10.0.20.202` ETP:Local scheme; see bd code-yiu + `docs/runbooks/mailserver-pfsense-haproxy.md`). Vault: `brevo_api_key` in `secret/viktor` (probe + relay). ## Storage & Backup Architecture diff --git a/docs/architecture/mailserver.md b/docs/architecture/mailserver.md index 43c65bfd..a291c3a3 100644 --- a/docs/architecture/mailserver.md +++ b/docs/architecture/mailserver.md @@ -1,10 +1,10 @@ # Mail Server Architecture -Last updated: 2026-04-18 (SPF switched to Brevo; DMARC reporting address normalized) +Last updated: 2026-04-19 (code-yiu Phase 6: MetalLB LB retired; traffic now enters via pfSense HAProxy with PROXY v2) ## Overview -Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubernetes. Inbound mail arrives directly via MX record to the home IP on port 25. Outbound mail relays through Brevo EU (`smtp-relay.brevo.com:587` — migrated from Mailgun on 2026-04-12; SPF record cut over on 2026-04-18). Roundcubemail provides webmail access. CrowdSec protects SMTP/IMAP from brute-force attacks using real client IPs via `externalTrafficPolicy: Local` on a dedicated MetalLB IP. +Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubernetes. Inbound mail arrives directly via MX record to the home IP on port 25. Outbound mail relays through Brevo EU (`smtp-relay.brevo.com:587` — migrated from Mailgun on 2026-04-12; SPF record cut over on 2026-04-18). Roundcubemail provides webmail access. 
CrowdSec protects SMTP/IMAP from brute-force attacks using real client IPs: pfSense HAProxy injects the PROXY v2 header on each backend connection so the mailserver pod sees the true source IP despite kube-proxy SNAT. See [`runbooks/mailserver-pfsense-haproxy.md`](../runbooks/mailserver-pfsense-haproxy.md) for ops details. ## Architecture Diagram @@ -12,9 +12,10 @@ Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubern graph TB subgraph "Inbound Mail" SENDER[Sending MTA] -->|MX lookup| MX[mail.viktorbarzin.me:25] - MX -->|176.12.22.76:25| PF[pfSense NAT] - PF -->|10.0.20.202:25| MLB[MetalLB<br/>ETP: Local] - MLB --> POSTFIX[Postfix MTA] + MX -->|176.12.22.76:25| PF[pfSense NAT rdr] + PF -->|10.0.20.1:25| HAP[pfSense HAProxy<br/>send-proxy-v2] + HAP -->|k8s-node:30125| KP[kube-proxy<br/>ETP: Cluster] + KP -->|pod:2525 PROXY v2| POSTFIX[Postfix MTA<br/>postscreen] end subgraph "Mail Processing" @@ -36,7 +37,7 @@ graph TB end subgraph "Security" - MLB -->|Real client IPs| CS_AGENT[CrowdSec Agent<br/>postfix + dovecot parsers] + POSTFIX -->|Real client IPs<br/>from PROXY v2 header| CS_AGENT[CrowdSec Agent<br/>postfix + dovecot parsers] CS_AGENT --> CS_LAPI[CrowdSec LAPI] end @@ -64,9 +65,12 @@ graph TB ``` Internet → MX: mail.viktorbarzin.me (priority 1) → A record: 176.12.22.76 (non-proxied Cloudflare DNS-only) - → pfSense NAT: port 25 → 10.0.20.202:25 - → MetalLB (dedicated IP, ETP: Local — preserves real client IPs) - → Postfix → Rspamd (spam + DKIM + DMARC check) → Dovecot → mailbox + → pfSense NAT rdr: WAN:{25,465,587,993} → 10.0.20.1:{same} + → pfSense HAProxy (TCP mode, send-proxy-v2 on backend) + → k8s-node:{30125..30128} NodePort (mailserver-proxy, ETP: Cluster) + → kube-proxy → pod alt listener (2525/4465/5587/10993) + → Postfix postscreen / smtpd / Dovecot parses PROXY v2 header + → Rspamd (spam + DKIM + DMARC) → Dovecot → mailbox ``` No backup MX. If the server is down, sender MTAs queue and retry for 4-5 days per SMTP standards (RFC 5321). @@ -114,7 +118,7 @@ Reverse DNS for `176.12.22.76` returns `176-12-22-76.pon.spectrumnet.bg.` (ISP-a ### CrowdSec Integration - **Collections**: `crowdsecurity/postfix` + `crowdsecurity/dovecot` (installed) - **Log acquisition**: CrowdSec agents parse mailserver pod logs for brute-force patterns -- **Real client IPs**: `externalTrafficPolicy: Local` on dedicated MetalLB IP `10.0.20.202` preserves original client IPs (not SNATed to node IPs) +- **Real client IPs**: pfSense HAProxy injects PROXY v2 header on each backend connection; Postfix (`postscreen_upstream_proxy_protocol=haproxy` / `smtpd_upstream_proxy_protocol=haproxy` on alt ports) + Dovecot (`haproxy = yes` on alt IMAPS listener) parse it to recover the true source IP despite kube-proxy SNAT. 
Replaces the pre-2026-04-19 MetalLB `10.0.20.202` ETP:Local scheme (see code-yiu) - **Decisions**: CrowdSec bans/challenges attackers via firewall bouncer rules ### Fail2ban Disabled (CrowdSec is the Policy) @@ -158,9 +162,10 @@ CronJob `email-roundtrip-monitor` (every 10 min): | EmailRoundtripNeverRun | Metric absent for 40m | warning | ### Uptime Kuma Monitors -- TCP SMTP on `176.12.22.76:25` (external, 60s interval) -- TCP IMAP on `10.0.20.202:993` (internal) -- E2E Push monitor (receives push from roundtrip probe) +- TCP SMTP on `176.12.22.76:25` — full external path (DNS → WAN → pfSense HAProxy → mailserver) +- TCP `mailserver.svc:{587,993}` — intra-cluster ClusterIP path +- TCP `10.0.20.1:{25,993}` — pfSense HAProxy health (post code-yiu Phase 6) +- E2E Push monitor (receives push from `email-roundtrip-monitor` probe) ### Dovecot Exporter - Sidecar container in mailserver pod, port 9166 @@ -210,17 +215,19 @@ CronJob `email-roundtrip-monitor` (every 10 min): - **Decision**: Rspamd replaces both SpamAssassin and OpenDKIM in a single component - **Tradeoff**: Higher memory usage (~150-200MB) but simpler stack -### Dedicated MetalLB IP for CrowdSec -- **Decision**: Mailserver gets `10.0.20.202` (separate from shared `10.0.20.200`) with `externalTrafficPolicy: Local` -- **Why**: Shared IP with ETP: Cluster SNATs away real client IPs, making CrowdSec detections and Postfix rate limiting useless -- **Tradeoff**: Uses one extra IP from the MetalLB pool. Requires separate pfSense NAT rule. +### Client-IP Preservation (pfSense HAProxy + PROXY v2) +- **Current (2026-04-19, bd code-yiu)**: pfSense HAProxy listens on `10.0.20.1:{25,465,587,993}`, forwards to k8s NodePort 30125-30128 with `send-proxy-v2` on each backend connection. The mailserver pod exposes parallel listeners (2525/4465/5587/10993) that REQUIRE the PROXY v2 header, while the stock ports 25/465/587/993 stay PROXY-free for intra-cluster traffic (Roundcube, probe). 
The mailserver Service is ClusterIP-only; ETP is no longer a concern for external traffic. +- **Historical (2026-04-12 → 2026-04-19)**: Dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` — required pod/speaker colocation; kube-proxy preserved client IP only when pod was on the same node as the advertising speaker. +- **Why switched**: ETP:Local made the mailserver's single replica drop inbound mail silently during pod reschedule (30-60s GARP flip). HAProxy with `send-proxy-v2` lets the pod reschedule to any node and recover IP-preservation through the header. +- **Tradeoff**: pfSense now runs HAProxy (one more service in the firewall's responsibility); alt container ports + extra Service are ~80 lines of Terraform. The win is HA without IP-preservation compromise. +- **Runbook**: [`runbooks/mailserver-pfsense-haproxy.md`](../runbooks/mailserver-pfsense-haproxy.md). ## Troubleshooting ### Inbound mail not arriving 1. Check MX: `dig MX viktorbarzin.me +short` → should show `mail.viktorbarzin.me` 2. Check port 25: `nc -zw5 mail.viktorbarzin.me 25` -3. Check pfSense NAT rule: port 25 → `10.0.20.202:25` +3. Check pfSense NAT rule: port 25 → `10.0.20.1:25` (pfSense HAProxy VIP, post code-yiu Phase 4) 4. Check Postfix logs: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep -E 'from=|reject'` 5. Check if CrowdSec is blocking the sender: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list` diff --git a/docs/runbooks/mailserver-pfsense-haproxy.md b/docs/runbooks/mailserver-pfsense-haproxy.md index 3a57d645..564554eb 100644 --- a/docs/runbooks/mailserver-pfsense-haproxy.md +++ b/docs/runbooks/mailserver-pfsense-haproxy.md @@ -1,6 +1,6 @@ # pfSense HAProxy for Mailserver — Runbook -Last updated: 2026-04-19 +Last updated: 2026-04-19 (Phase 6 complete) ## What & why @@ -9,92 +9,126 @@ CrowdSec + Postfix rate-limiting. 
MetalLB cannot inject PROXY-protocol headers (see [`mailserver-proxy-protocol.md`](./mailserver-proxy-protocol.md)), so pfSense runs a small HAProxy that: -1. Listens on the pfSense VIP, -2. Forwards each connection to a k8s node's NodePort, -3. Injects PROXY-v2 framing so Postfix/Dovecot see the original client IP, -4. TCP health-checks every worker — any node can serve. +1. Listens on the pfSense VLAN20 IP (`10.0.20.1`) on all 4 mail ports, +2. Forwards each connection to a k8s node's NodePort with `send-proxy-v2`, +3. Injects PROXY v2 framing so Postfix/Dovecot see the original client IP, +4. TCP health-checks every k8s worker — any node can serve (ETP:Cluster). -Corresponding k8s-side setup lives in `stacks/mailserver/modules/mailserver/`: -- ConfigMap `mailserver-user-patches` → `user-patches.sh` appends alt - `master.cf` service on port 2525 with - `postscreen_upstream_proxy_protocol=haproxy`. -- Service `mailserver-proxy` → NodePort 30125 → targetPort 2525 → - `externalTrafficPolicy: Cluster`. +Corresponding k8s-side setup (`stacks/mailserver/modules/mailserver/`): + +- ConfigMap `mailserver-user-patches` → `user-patches.sh` appends 3 alt + `master.cf` services to Postfix: + - `:2525` postscreen (alt :25) with `postscreen_upstream_proxy_protocol=haproxy` + - `:4465` smtpd (alt :465 SMTPS) with `smtpd_upstream_proxy_protocol=haproxy` + - `:5587` smtpd (alt :587 submission) with `smtpd_upstream_proxy_protocol=haproxy` +- ConfigMap `mailserver.config` adds Dovecot `inet_listener imaps_proxy` on + port 10993 with `haproxy = yes` and `haproxy_trusted_networks = 10.0.20.0/24`. 
+- Service `mailserver-proxy` (NodePort, ETP:Cluster) with 4 NodePorts: + - `port 25 → targetPort 2525 → nodePort 30125` + - `port 465 → targetPort 4465 → nodePort 30126` + - `port 587 → targetPort 5587 → nodePort 30127` + - `port 993 → targetPort 10993 → nodePort 30128` +- Service `mailserver` (ClusterIP) — unchanged stock ports 25/465/587/993 + for intra-cluster clients (Roundcube pod, `email-roundtrip-monitor` + CronJob). These listeners are PROXY-free. bd: `code-yiu`. -## Current state (Phase 3 — TEST PATH) +## Steady-state architecture ``` - INTERNET - │ - (unchanged — still MetalLB path) - ↓ - WAN:25/465/587/993 ─┐ - │ TEST PATH ↓ - │ (pfSense HAProxy, port 2525 only) - pfSense NAT rdr │ - ↓ ↓ - alias (= 10.0.20.202) HAProxy on pfSense (10.0.20.1:2525) - ↓ ↓ - MetalLB VIP 10.0.20.202 k8s-node:30125 (NodePort, ETP: Cluster) - (ETP:Local, kube-proxy DNAT) ↓ - ↓ kube-proxy (SNAT — IP lost here, recovered by PROXY-v2) - mailserver pod ↓ - stock :25/:465/:587/:993 mailserver pod :2525 postscreen (PROXY-v2) +External mail (WAN) path — PROXY v2 +┌─────────────────────────────────────────────────────────────────────┐ +│ Client (real IP) │ +│ │ SMTP/SMTPS/Sub/IMAPS │ +│ ▼ │ +│ pfSense WAN:{25,465,587,993} │ +│ │ NAT rdr → 10.0.20.1:{same} │ +│ ▼ │ +│ pfSense HAProxy (mode tcp, 4 frontends, 4 backend pools) │ +│ │ send-proxy-v2 + tcp-check inter 120000 │ +│ ▼ │ +│ k8s-node<1-4>:{30125..30128} ← any node (ETP:Cluster) │ +│ │ kube-proxy SNAT (source IP lost on the wire) │ +│ ▼ │ +│ mailserver pod :{2525,4465,5587,10993} │ +│ │ postscreen / smtpd / Dovecot parse PROXY v2 header │ +│ │ → real client IP recovered despite kube-proxy SNAT │ +│ ▼ │ +│ CrowdSec + Postfix / Dovecot see the true source IP ✓ │ +└─────────────────────────────────────────────────────────────────────┘ + +Intra-cluster path — no PROXY +┌─────────────────────────────────────────────────────────────────────┐ +│ Roundcube pod / email-roundtrip-monitor CronJob │ +│ │ SMTP/IMAP │ +│ ▼ │ +│ 
mailserver.mailserver.svc.cluster.local:{25,465,587,993} │ +│ │ ClusterIP — bypasses LoadBalancer/NodePort layer entirely │ +│ ▼ │ +│ mailserver pod stock :{25,465,587,993} (PROXY-free) │ +└─────────────────────────────────────────────────────────────────────┘ ``` -Nothing production flips to HAProxy yet; all real traffic still uses the -MetalLB LB IP path. To validate the HAProxy path: +## Validation ```sh -# From any k8s VLAN host: -python3 -c " -import socket; s=socket.socket(); s.connect(('10.0.20.1', 2525)) -print(s.recv(200).decode()) -s.send(b'EHLO testclient\r\n') -print(s.recv(500).decode()) -s.send(b'QUIT\r\n'); s.close()" +# All HAProxy frontends listening +ssh admin@10.0.20.1 'sockstat -l | grep haproxy' +# Expect: *:25, *:465, *:587, *:993, *:2525 (test port) -# Then check mailserver logs for CONNECT from [YOUR-IP]: -kubectl logs -c docker-mailserver deployment/mailserver -n mailserver --tail=20 | grep smtpd-proxy +# All backend pools healthy +ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio" \ + | awk 'NR>1 {print $3, $4, $6}' +# srv_op_state 2 = UP, 0 = DOWN + +# Container listens on all 8 ports +kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \ + ss -ltn | grep -E ':(25|2525|465|4465|587|5587|993|10993)\b' + +# pf rdr points at pfSense (10.0.20.1), not alias +ssh admin@10.0.20.1 'pfctl -sn' | grep -E 'port = (25|submission|imaps|smtps)' + +# E2E probe — Brevo → external MX :25 → IMAP fetch +kubectl create job --from=cronjob/email-roundtrip-monitor probe-test -n mailserver +kubectl wait --for=condition=complete --timeout=90s job/probe-test -n mailserver +kubectl logs job/probe-test -n mailserver | grep SUCCESS +kubectl delete job probe-test -n mailserver + +# Real client IP in maillog post-delivery +kubectl logs -c docker-mailserver deployment/mailserver -n mailserver \ + | grep 'smtpd-proxy25.*CONNECT from' | tail -5 +# Expect external source IPs (e.g., Brevo 77.32.148.x), NOT 10.0.20.x ``` ## 
Bootstrap / restore from scratch -Config lives in pfSense `/cf/conf/config.xml` under -``. Backed up nightly to +pfSense HAProxy config lives in `/cf/conf/config.xml` under +``. That file is scp'd nightly to `/mnt/backup/pfsense/config-YYYYMMDD.xml` by `scripts/daily-backup.sh`, then -Synology. To rebuild from source of truth (git): +synced to Synology. To rebuild from source of truth (git): ```sh scp infra/scripts/pfsense-haproxy-bootstrap.php admin@10.0.20.1:/tmp/ ssh admin@10.0.20.1 'php /tmp/pfsense-haproxy-bootstrap.php' ``` -The script is idempotent — re-runs reset the mailserver frontend + backend to -the declared state. +The script is idempotent — re-runs reset the mailserver frontends + backends +to the declared state. Expected output: ``` haproxy_check_and_run rc=OK -messages: ... -``` - -Verify: -```sh -ssh admin@10.0.20.1 "pgrep -lf haproxy; sockstat -l | grep ':2525'" -# 64009 /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg ... -# www haproxy 64009 5 tcp4 *:2525 *:* ``` ## Operations -### Change backend k8s node IPs +### Change backend k8s node IPs / NodePorts -Edit `infra/scripts/pfsense-haproxy-bootstrap.php` → `foreach` array of -`[name, address]`, re-run via the bootstrap command above. Don't hand-edit -`/var/etc/haproxy/haproxy.cfg` — it is regenerated from XML on every apply. +Edit `infra/scripts/pfsense-haproxy-bootstrap.php` — `$NODES` array + the +`build_pool()` port arguments. Re-run the bootstrap command above. Don't +hand-edit `/var/etc/haproxy/haproxy.cfg` — it is regenerated from XML on +every apply. ### Check health of backends @@ -105,7 +139,7 @@ ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio ### View live HAProxy stats (WebUI) -`https://pfsense.viktorbarzin.me` → Services → HAProxy → Stats +`https://pfsense.viktorbarzin.me` → Services → HAProxy → Stats. 
### Reload after config.xml edit @@ -113,6 +147,22 @@ ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio ssh admin@10.0.20.1 'pfSsh.php playback svc restart haproxy' ``` +### Rollback (flip NAT back to MetalLB, post-Phase-6 only partial) + +There is no Phase-6 rollback one-liner. Phase 6 removed the MetalLB +LoadBalancer 10.0.20.202 entirely, so un-flipping NAT now would send +traffic to a dead alias. To regress: + +1. Re-add `metallb.io/loadBalancerIPs = "10.0.20.202"` + `type = "LoadBalancer"` + + `external_traffic_policy = "Local"` to `kubernetes_service.mailserver`, + apply. +2. Re-add the `mailserver` host alias in pfSense pointing at 10.0.20.202 + (Firewall → Aliases → Hosts). +3. Run `infra/scripts/pfsense-nat-mailserver-haproxy-unflip.php` on pfSense. + +For rollback of just the NAT (Phase 4) without touching the Service, only +the third step is needed — but only meaningful BEFORE Phase 6. + ### Restore from backup pfSense config backup is a plain XML file: @@ -124,27 +174,30 @@ pfSense config backup is a plain XML file: Full restore: pfSense WebUI → Diagnostics → Backup & Restore → Upload that `config.xml`. The `` section is included. -## Phase roadmap (bd code-yiu) +## Phase history (bd code-yiu) | Phase | Status | Description | |---|---|---| -| 1a | ✅ done (commit `ef75c02f`) | k8s alt listener `:2525` + `mailserver-proxy` NodePort | -| 2 | ✅ done (2026-04-19) | pfSense HAProxy installed + test config on `:2525` | -| 3 | ✅ done (2026-04-19) | HAProxy config persisted to pfSense `config.xml` (this runbook + `pfsense-haproxy-bootstrap.php`) | -| 4 | not yet | Flip pfSense NAT rdr for `:25` from `` alias → HAProxy VIP. Requires atomic cutover. | -| 5 | not yet | Extend to ports 465/587/993: add alt container listeners (4465/5587/10993), add Dovecot `haproxy = yes` on extra inet_listener, expand HAProxy frontends, flip NAT. 
| -| 6 | not yet | Observe 48h, decommission MetalLB LB path (downgrade mailserver Service from LoadBalancer to ClusterIP, free `10.0.20.202`). | +| 1a | ✅ commit `ef75c02f` | k8s alt :2525 listener + NodePort Service | +| 2 | ✅ 2026-04-19 | pfSense HAProxy pkg installed (`pfSense-pkg-haproxy-devel-0.63_2`, HAProxy 2.9-dev6) | +| 3 | ✅ commit `ba697b02` | HAProxy config persisted in pfSense XML (bootstrap script + this runbook) | +| 4+5| ✅ commit `9806d515` | 4-port alt listeners + HAProxy frontends for 25/465/587/993 + NAT flip | +| 6 | ✅ this commit | Mailserver Service downgraded LoadBalancer → ClusterIP; `10.0.20.202` released back to MetalLB pool; orphan `mailserver` pfSense alias removed; monitors retargeted | ## Known warts -- HAProxy TCP health-check with `send-proxy-v2` + short `inter` floods - postscreen with `getpeername: Transport endpoint not connected` warnings - every check cycle. Mitigated with `inter 120000` (2 min). To reduce - further, switch to `option smtpchk` — but that requires a separate - non-PROXY health-check port on the pod (not done yet). -- Frontend binds on all pfSense interfaces (`bind :2525`) rather than just - `10.0.20.1:2525`. `` is set in XML but pfSense templates it as - port-only. Low concern while port 2525 is a test port; tighten once - promoted to real ports (25/465/587/993). +- HAProxy TCP health-check with `send-proxy-v2` generates `getpeername: + Transport endpoint not connected` warnings on postscreen every check cycle. + Mitigated with `inter 120000` (2 min). To reduce further, switch to + `option smtpchk` — but that requires a separate non-PROXY health-check + port on the pod (not done yet). +- Frontend binds on all pfSense interfaces (`bind :25` instead of + `10.0.20.1:25`). `` is set in XML but pfSense templates it + port-only. Low concern in practice because WAN firewall rules plus the + NAT rdr gate external access; internal VLAN clients SHOULD be able to + reach HAProxy on any pfSense-local IP. 
- k8s-node5 doesn't exist — cluster has master + 4 workers. Backend pool capped at 4 servers. +- Postscreen still logs `improper command pipelining` for legitimate + clients that send `EHLO\r\nQUIT\r\n` as a single TCP write. This is + unchanged pre/post-migration — postscreen's anti-bot heuristic. diff --git a/stacks/mailserver/modules/mailserver/main.tf b/stacks/mailserver/modules/mailserver/main.tf index 7063fedf..43a113e3 100644 --- a/stacks/mailserver/modules/mailserver/main.tf +++ b/stacks/mailserver/modules/mailserver/main.tf @@ -624,6 +624,13 @@ resource "kubernetes_deployment" "mailserver" { } resource "kubernetes_service" "mailserver" { + # code-yiu Phase 6: downgraded from LoadBalancer (MetalLB 10.0.20.202, + # ETP: Local) to ClusterIP on 2026-04-19. External mail now enters via + # pfSense HAProxy → kubernetes_service.mailserver_proxy NodePort → alt + # PROXY-speaking listeners. This Service exists only for intra-cluster + # clients (Roundcube pod, email-roundtrip-monitor CronJob) that talk to + # `mailserver.mailserver.svc.cluster.local:{25,465,587,993}` on the + # stock (PROXY-free) container listeners. metadata { name = "mailserver" namespace = kubernetes_namespace.mailserver.metadata[0].name @@ -631,15 +638,10 @@ resource "kubernetes_service" "mailserver" { labels = { app = "mailserver" } - - annotations = { - "metallb.io/loadBalancerIPs" = "10.0.20.202" - } } spec { - type = "LoadBalancer" - external_traffic_policy = "Local" + type = "ClusterIP" selector = { app = "mailserver" }