[mailserver] Phase 6 — decommission MetalLB LB path [ci skip]

## Context (bd code-yiu)

With Phase 4+5 proven (external mail flows through pfSense HAProxy +
PROXY v2 to the alt PROXY-speaking container listeners), the MetalLB
LoadBalancer Service + `10.0.20.202` external IP + ETP:Local policy are
obsolete. Phase 6 decommissions them and documents the steady-state
architecture.

## This change

### Terraform (stacks/mailserver/modules/mailserver/main.tf)
- `kubernetes_service.mailserver` downgraded: `LoadBalancer` → `ClusterIP`.
- Removed `metallb.io/loadBalancerIPs = "10.0.20.202"` annotation.
- Removed `external_traffic_policy = "Local"` (irrelevant for ClusterIP).
- Port set unchanged — the Service still exposes 25/465/587/993 for
  intra-cluster clients (Roundcube pod, `email-roundtrip-monitor`
  CronJob) that hit the stock PROXY-free container listeners.
- Inline comment documents the downgrade rationale + companion
  `mailserver-proxy` NodePort Service that now carries external traffic.

### pfSense (ops, not in git)
- `mailserver` host alias (pointing at `10.0.20.202`) deleted. No NAT
  rule references it post-Phase-4; keeping it would be misleading dead
  metadata. Deleted via an ad-hoc `php /tmp/delete-mailserver-alias.php`
  companion script (not checked in); trivially reversible through the
  WebUI, since the alias is just a Firewall → Aliases → Hosts entry.

### Uptime Kuma (ops)
- Monitors `282` and `283` (PORT checks) retargeted from `10.0.20.202`
  → `10.0.20.1`. Renamed to `Mailserver HAProxy SMTP (pfSense :25)` /
  `... IMAPS (pfSense :993)` to reflect their new purpose (HAProxy
  layer liveness). History retained (edit, not delete-recreate).

### Docs
- `docs/runbooks/mailserver-pfsense-haproxy.md` — fully rewritten
  "Current state" section; now reflects steady-state architecture with
  two-path diagram (external via HAProxy / intra-cluster via ClusterIP).
  Phase history table marks Phase 6 complete. Rollback section updated (no
  one-liner post-Phase-6; need Service-type re-upgrade + alias re-add).
- `docs/architecture/mailserver.md` — Overview, Mermaid diagram, Inbound
  flow, CrowdSec section, Uptime Kuma monitors list, Decisions section
  (dedicated MetalLB IP → "Client-IP Preservation via HAProxy + PROXY
  v2"), Troubleshooting all updated.
- `.claude/CLAUDE.md` — mailserver monitoring + architecture paragraph
  updated with new external path description; references the new runbook.

## What is NOT in this change

- Removal of `10.0.20.202` from `cloudflare_proxied_names` or any
  reserved-IP tracking — wasn't there to begin with. The
  `metallb-system default` IPAddressPool (10.0.20.200-220) shows 2
  assigned / 19 available after this, confirming `.202` went back to the
  pool.
- Phase 4 NAT-flip rollback scripts — kept on-disk, still valid if
  someone re-introduces the MetalLB LB (see runbook "Rollback").
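The pool accounting above is easy to sanity-check: 10.0.20.200-220 spans 21 addresses, so 2 assigned plus 19 available must cover the whole pool. A throwaway shell check (values copied from the IPAddressPool status above):

```shell
first=200; last=220          # metallb-system "default" IPAddressPool bounds
total=$((last - first + 1))  # 21 addresses in 10.0.20.200-220
assigned=2; available=19     # post-Phase-6 values from the pool status
echo "$total == $((assigned + available))"
```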

## Test Plan

### Automated (verified pre-commit 2026-04-19)
```
# Service is ClusterIP with no EXTERNAL-IP
$ kubectl get svc -n mailserver mailserver
mailserver   ClusterIP   10.103.108.217   <none>   25/TCP,465/TCP,587/TCP,993/TCP

# 10.0.20.202 no longer answers ARP (ping from pfSense)
$ ssh admin@10.0.20.1 'ping -c 2 -t 2 10.0.20.202'
2 packets transmitted, 0 packets received, 100.0% packet loss

# MetalLB pool released the IP
$ kubectl get ipaddresspool default -n metallb-system \
    -o jsonpath='{.status.assignedIPv4} of {.status.availableIPv4}'
2 of 19 available

# E2E probe — external Brevo → WAN:25 → pfSense HAProxy → pod — STILL SUCCEEDS
$ kubectl create job --from=cronjob/email-roundtrip-monitor probe-phase6 -n mailserver
... Round-trip SUCCESS in 20.3s ...
$ kubectl delete job probe-phase6 -n mailserver

# pfSense mailserver alias removed
$ ssh admin@10.0.20.1 'php -r "..." | grep mailserver'
(no output)
```

### Manual Verification
1. Visit `https://uptime.viktorbarzin.me` — monitors 282/283 green on new
   hostname `10.0.20.1`.
2. Roundcube login works (`https://mail.viktorbarzin.me/`).
3. Send test email to `smoke-test@viktorbarzin.me` from Gmail — observe
   `postfix/smtpd-proxy25/postscreen: CONNECT from [<Gmail-IP>]` in
   mailserver logs within ~10s.
4. CrowdSec should still see real client IPs in postfix/dovecot parsers
   (verify with `cscli alerts list` on next auth-fail event).

## Phase history (bd code-yiu)

| Phase | Status | Description |
|---|---|---|
| 1a  | ✅ `ef75c02f` | k8s alt :2525 listener + NodePort Service |
| 2   | ✅ 2026-04-19 | pfSense HAProxy pkg installed |
| 3   | ✅ `ba697b02` | HAProxy config persisted in pfSense XML |
| 4+5 | ✅ `9806d515` | 4-port alt listeners + HAProxy frontends + NAT flip |
| 6   | ✅ **this commit** | MetalLB LB retired; 10.0.20.202 released; docs updated |

Closes: code-yiu
Viktor Barzin 2026-04-19 12:36:11 +00:00
parent 9806d515dd
commit 43fe11fffc
4 changed files with 160 additions and 98 deletions


@ -137,7 +137,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
- Every new service gets Prometheus scrape config + Uptime Kuma monitor. External monitors auto-created for Cloudflare-proxied services by `external-monitor-sync` CronJob (10min, uptime-kuma ns). Mechanism: `ingress_factory` auto-adds `uptime.viktorbarzin.me/external-monitor=true` whenever `dns_type != "none"` (see `modules/kubernetes/ingress_factory/main.tf`) — no manual action needed on new services. The `cloudflare_proxied_names` list in `config.tfvars` is a legacy fallback for the 17 hostnames not yet migrated to `ingress_factory` `dns_type`; don't check that list when debugging "is this monitored?" questions.
- **External monitoring**: `[External] <service>` monitors in Uptime Kuma test full external path (DNS → Cloudflare → Tunnel → Traefik). Divergence metric `external_internal_divergence_count` → alert `ExternalAccessDivergence` (15min). Config: `stacks/uptime-kuma/`, targets from `cloudflare_proxied_names` in `config.tfvars` (17 remaining centrally-managed hostnames; most DNS records now auto-created by `ingress_factory` `dns_type` param).
- Key alerts: OOMKill, pod replica mismatch, 4xx/5xx error rates, UPS battery, CPU temp, SSD writes, NFS responsiveness, ClusterMemoryRequestsHigh (>85%), ContainerNearOOM (>85% limit), PodUnschedulable, ExternalAccessDivergence.
- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Mailgun API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Mailserver on dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` for CrowdSec real-IP detection. Vault: `mailgun_api_key` in `secret/viktor` (probe), `brevo_api_key` in `secret/viktor` (relay).
- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Brevo HTTP API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Inbound external traffic enters via pfSense HAProxy on `10.0.20.1:{25,465,587,993}`, which forwards to k8s `mailserver-proxy` NodePort (30125-30128) with `send-proxy-v2`. Mailserver pod runs alt PROXY-speaking listeners (2525/4465/5587/10993) alongside stock PROXY-free ones (25/465/587/993) for intra-cluster clients. Real client IPs recovered from PROXY v2 header despite kube-proxy SNAT (replaces pre-2026-04-19 MetalLB `10.0.20.202` ETP:Local scheme; see bd code-yiu + `docs/runbooks/mailserver-pfsense-haproxy.md`). Vault: `brevo_api_key` in `secret/viktor` (probe + relay).
## Storage & Backup Architecture


@ -1,10 +1,10 @@
# Mail Server Architecture
Last updated: 2026-04-18 (SPF switched to Brevo; DMARC reporting address normalized)
Last updated: 2026-04-19 (code-yiu Phase 6: MetalLB LB retired; traffic now enters via pfSense HAProxy with PROXY v2)
## Overview
Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubernetes. Inbound mail arrives directly via MX record to the home IP on port 25. Outbound mail relays through Brevo EU (`smtp-relay.brevo.com:587` — migrated from Mailgun on 2026-04-12; SPF record cut over on 2026-04-18). Roundcubemail provides webmail access. CrowdSec protects SMTP/IMAP from brute-force attacks using real client IPs via `externalTrafficPolicy: Local` on a dedicated MetalLB IP.
Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubernetes. Inbound mail arrives directly via MX record to the home IP on port 25. Outbound mail relays through Brevo EU (`smtp-relay.brevo.com:587` — migrated from Mailgun on 2026-04-12; SPF record cut over on 2026-04-18). Roundcubemail provides webmail access. CrowdSec protects SMTP/IMAP from brute-force attacks using real client IPs: pfSense HAProxy injects the PROXY v2 header on each backend connection so the mailserver pod sees the true source IP despite kube-proxy SNAT. See [`runbooks/mailserver-pfsense-haproxy.md`](../runbooks/mailserver-pfsense-haproxy.md) for ops details.
## Architecture Diagram
@ -12,9 +12,10 @@ Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubern
graph TB
subgraph "Inbound Mail"
SENDER[Sending MTA] -->|MX lookup| MX[mail.viktorbarzin.me:25]
MX -->|176.12.22.76:25| PF[pfSense NAT]
PF -->|10.0.20.202:25| MLB[MetalLB<br/>ETP: Local]
MLB --> POSTFIX[Postfix MTA]
MX -->|176.12.22.76:25| PF[pfSense NAT rdr]
PF -->|10.0.20.1:25| HAP[pfSense HAProxy<br/>send-proxy-v2]
HAP -->|k8s-node:30125| KP[kube-proxy<br/>ETP: Cluster]
KP -->|pod:2525 PROXY v2| POSTFIX[Postfix MTA<br/>postscreen]
end
subgraph "Mail Processing"
@ -36,7 +37,7 @@ graph TB
end
subgraph "Security"
MLB -->|Real client IPs| CS_AGENT[CrowdSec Agent<br/>postfix + dovecot parsers]
POSTFIX -->|Real client IPs<br/>from PROXY v2 header| CS_AGENT[CrowdSec Agent<br/>postfix + dovecot parsers]
CS_AGENT --> CS_LAPI[CrowdSec LAPI]
end
@ -64,9 +65,12 @@ graph TB
```
Internet → MX: mail.viktorbarzin.me (priority 1)
→ A record: 176.12.22.76 (non-proxied Cloudflare DNS-only)
→ pfSense NAT: port 25 → 10.0.20.202:25
→ MetalLB (dedicated IP, ETP: Local — preserves real client IPs)
→ Postfix → Rspamd (spam + DKIM + DMARC check) → Dovecot → mailbox
→ pfSense NAT rdr: WAN:{25,465,587,993} → 10.0.20.1:{same}
→ pfSense HAProxy (TCP mode, send-proxy-v2 on backend)
→ k8s-node:{30125..30128} NodePort (mailserver-proxy, ETP: Cluster)
→ kube-proxy → pod alt listener (2525/4465/5587/10993)
→ Postfix postscreen / smtpd / Dovecot parses PROXY v2 header
→ Rspamd (spam + DKIM + DMARC) → Dovecot → mailbox
```
No backup MX. If the server is down, sender MTAs queue and retry for 4-5 days per SMTP standards (RFC 5321).
@ -114,7 +118,7 @@ Reverse DNS for `176.12.22.76` returns `176-12-22-76.pon.spectrumnet.bg.` (ISP-a
### CrowdSec Integration
- **Collections**: `crowdsecurity/postfix` + `crowdsecurity/dovecot` (installed)
- **Log acquisition**: CrowdSec agents parse mailserver pod logs for brute-force patterns
- **Real client IPs**: `externalTrafficPolicy: Local` on dedicated MetalLB IP `10.0.20.202` preserves original client IPs (not SNATed to node IPs)
- **Real client IPs**: pfSense HAProxy injects PROXY v2 header on each backend connection; Postfix (`postscreen_upstream_proxy_protocol=haproxy` / `smtpd_upstream_proxy_protocol=haproxy` on alt ports) + Dovecot (`haproxy = yes` on alt IMAPS listener) parse it to recover the true source IP despite kube-proxy SNAT. Replaces the pre-2026-04-19 MetalLB `10.0.20.202` ETP:Local scheme (see code-yiu)
- **Decisions**: CrowdSec bans/challenges attackers via firewall bouncer rules
### Fail2ban Disabled (CrowdSec is the Policy)
@ -158,9 +162,10 @@ CronJob `email-roundtrip-monitor` (every 10 min):
| EmailRoundtripNeverRun | Metric absent for 40m | warning |
### Uptime Kuma Monitors
- TCP SMTP on `176.12.22.76:25` (external, 60s interval)
- TCP IMAP on `10.0.20.202:993` (internal)
- E2E Push monitor (receives push from roundtrip probe)
- TCP SMTP on `176.12.22.76:25` — full external path (DNS → WAN → pfSense HAProxy → mailserver)
- TCP `mailserver.svc:{587,993}` — intra-cluster ClusterIP path
- TCP `10.0.20.1:{25,993}` — pfSense HAProxy health (post code-yiu Phase 6)
- E2E Push monitor (receives push from `email-roundtrip-monitor` probe)
### Dovecot Exporter
- Sidecar container in mailserver pod, port 9166
@ -210,17 +215,19 @@ CronJob `email-roundtrip-monitor` (every 10 min):
- **Decision**: Rspamd replaces both SpamAssassin and OpenDKIM in a single component
- **Tradeoff**: Higher memory usage (~150-200MB) but simpler stack
### Dedicated MetalLB IP for CrowdSec
- **Decision**: Mailserver gets `10.0.20.202` (separate from shared `10.0.20.200`) with `externalTrafficPolicy: Local`
- **Why**: Shared IP with ETP: Cluster SNATs away real client IPs, making CrowdSec detections and Postfix rate limiting useless
- **Tradeoff**: Uses one extra IP from the MetalLB pool. Requires separate pfSense NAT rule.
### Client-IP Preservation (pfSense HAProxy + PROXY v2)
- **Current (2026-04-19, bd code-yiu)**: pfSense HAProxy listens on `10.0.20.1:{25,465,587,993}`, forwards to k8s NodePort 30125-30128 with `send-proxy-v2` on each backend connection. The mailserver pod exposes parallel listeners (2525/4465/5587/10993) that REQUIRE the PROXY v2 header, while the stock ports 25/465/587/993 stay PROXY-free for intra-cluster traffic (Roundcube, probe). The mailserver Service is ClusterIP-only; ETP is no longer a concern for external traffic.
- **Historical (2026-04-12 → 2026-04-19)**: Dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` — required pod/speaker colocation; kube-proxy preserved client IP only when pod was on the same node as the advertising speaker.
- **Why switched**: ETP:Local made the mailserver's single replica drop inbound mail silently during pod reschedule (30-60s GARP flip). HAProxy with `send-proxy-v2` lets the pod reschedule to any node and recover IP-preservation through the header.
- **Tradeoff**: pfSense now runs HAProxy (one more service in the firewall's responsibility); alt container ports + extra Service are ~80 lines of Terraform. The win is HA without IP-preservation compromise.
- **Runbook**: [`runbooks/mailserver-pfsense-haproxy.md`](../runbooks/mailserver-pfsense-haproxy.md).
## Troubleshooting
### Inbound mail not arriving
1. Check MX: `dig MX viktorbarzin.me +short` → should show `mail.viktorbarzin.me`
2. Check port 25: `nc -zw5 mail.viktorbarzin.me 25`
3. Check pfSense NAT rule: port 25 → `10.0.20.202:25`
3. Check pfSense NAT rule: port 25 → `10.0.20.1:25` (pfSense HAProxy VIP, post code-yiu Phase 4)
4. Check Postfix logs: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep -E 'from=|reject'`
5. Check if CrowdSec is blocking the sender: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list`


@ -1,6 +1,6 @@
# pfSense HAProxy for Mailserver — Runbook
Last updated: 2026-04-19
Last updated: 2026-04-19 (Phase 6 complete)
## What & why
@ -9,92 +9,126 @@ CrowdSec + Postfix rate-limiting. MetalLB cannot inject PROXY-protocol
headers (see [`mailserver-proxy-protocol.md`](./mailserver-proxy-protocol.md)),
so pfSense runs a small HAProxy that:
1. Listens on the pfSense VIP,
2. Forwards each connection to a k8s node's NodePort,
3. Injects PROXY-v2 framing so Postfix/Dovecot see the original client IP,
4. TCP health-checks every worker — any node can serve.
1. Listens on the pfSense VLAN20 IP (`10.0.20.1`) on all 4 mail ports,
2. Forwards each connection to a k8s node's NodePort with `send-proxy-v2`,
3. Injects PROXY v2 framing so Postfix/Dovecot see the original client IP,
4. TCP health-checks every k8s worker — any node can serve (ETP:Cluster).
Corresponding k8s-side setup lives in `stacks/mailserver/modules/mailserver/`:
- ConfigMap `mailserver-user-patches` → `user-patches.sh` appends alt
`master.cf` service on port 2525 with
`postscreen_upstream_proxy_protocol=haproxy`.
- Service `mailserver-proxy` → NodePort 30125 → targetPort 2525 →
`externalTrafficPolicy: Cluster`.
Corresponding k8s-side setup (`stacks/mailserver/modules/mailserver/`):
- ConfigMap `mailserver-user-patches` → `user-patches.sh` appends 3 alt
`master.cf` services to Postfix:
- `:2525` postscreen (alt :25) with `postscreen_upstream_proxy_protocol=haproxy`
- `:4465` smtpd (alt :465 SMTPS) with `smtpd_upstream_proxy_protocol=haproxy`
- `:5587` smtpd (alt :587 submission) with `smtpd_upstream_proxy_protocol=haproxy`
- ConfigMap `mailserver.config` adds Dovecot `inet_listener imaps_proxy` on
port 10993 with `haproxy = yes` and `haproxy_trusted_networks = 10.0.20.0/24`.
- Service `mailserver-proxy` (NodePort, ETP:Cluster) with 4 NodePorts:
- `port 25 → targetPort 2525 → nodePort 30125`
- `port 465 → targetPort 4465 → nodePort 30126`
- `port 587 → targetPort 5587 → nodePort 30127`
- `port 993 → targetPort 10993 → nodePort 30128`
- Service `mailserver` (ClusterIP) — unchanged stock ports 25/465/587/993
for intra-cluster clients (Roundcube pod, `email-roundtrip-monitor`
CronJob). These listeners are PROXY-free.
bd: `code-yiu`.
## Current state (Phase 3 — TEST PATH)
## Steady-state architecture
```
INTERNET
(unchanged — still MetalLB path)
WAN:25/465/587/993 ─┐
│ TEST PATH ↓
│ (pfSense HAProxy, port 2525 only)
pfSense NAT rdr │
↓ ↓
<mailserver> alias (= 10.0.20.202) HAProxy on pfSense (10.0.20.1:2525)
↓ ↓
MetalLB VIP 10.0.20.202 k8s-node:30125 (NodePort, ETP: Cluster)
(ETP:Local, kube-proxy DNAT) ↓
↓ kube-proxy (SNAT — IP lost here, recovered by PROXY-v2)
mailserver pod ↓
stock :25/:465/:587/:993 mailserver pod :2525 postscreen (PROXY-v2)
External mail (WAN) path — PROXY v2
┌─────────────────────────────────────────────────────────────────────┐
│ Client (real IP) │
│ │ SMTP/SMTPS/Sub/IMAPS │
│ ▼ │
│ pfSense WAN:{25,465,587,993} │
│ │ NAT rdr → 10.0.20.1:{same} │
│ ▼ │
│ pfSense HAProxy (mode tcp, 4 frontends, 4 backend pools) │
│ │ send-proxy-v2 + tcp-check inter 120000 │
│ ▼ │
│ k8s-node<1-4>:{30125..30128} ← any node (ETP:Cluster) │
│ │ kube-proxy SNAT (source IP lost on the wire) │
│ ▼ │
│ mailserver pod :{2525,4465,5587,10993} │
│ │ postscreen / smtpd / Dovecot parse PROXY v2 header │
│ │ → real client IP recovered despite kube-proxy SNAT │
│ ▼ │
│ CrowdSec + Postfix / Dovecot see the true source IP ✓ │
└─────────────────────────────────────────────────────────────────────┘
Intra-cluster path — no PROXY
┌─────────────────────────────────────────────────────────────────────┐
│ Roundcube pod / email-roundtrip-monitor CronJob │
│ │ SMTP/IMAP │
│ ▼ │
│ mailserver.mailserver.svc.cluster.local:{25,465,587,993} │
│ │ ClusterIP — bypasses LoadBalancer/NodePort layer entirely │
│ ▼ │
│ mailserver pod stock :{25,465,587,993} (PROXY-free) │
└─────────────────────────────────────────────────────────────────────┘
```
Nothing production flips to HAProxy yet; all real traffic still uses the
MetalLB LB IP path. To validate the HAProxy path:
## Validation
```sh
# From any k8s VLAN host:
python3 -c "
import socket; s=socket.socket(); s.connect(('10.0.20.1', 2525))
print(s.recv(200).decode())
s.send(b'EHLO testclient\r\n')
print(s.recv(500).decode())
s.send(b'QUIT\r\n'); s.close()"
# All HAProxy frontends listening
ssh admin@10.0.20.1 'sockstat -l | grep haproxy'
# Expect: *:25, *:465, *:587, *:993, *:2525 (test port)
# Then check mailserver logs for CONNECT from [YOUR-IP]:
kubectl logs -c docker-mailserver deployment/mailserver -n mailserver --tail=20 | grep smtpd-proxy
# All backend pools healthy
ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio" \
| awk 'NR>1 {print $3, $4, $6}'
# srv_op_state 2 = UP, 0 = DOWN
# Container listens on all 8 ports
kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
ss -ltn | grep -E ':(25|2525|465|4465|587|5587|993|10993)\b'
# pf rdr points at pfSense (10.0.20.1), not <mailserver> alias
ssh admin@10.0.20.1 'pfctl -sn' | grep -E 'port = (25|submission|imaps|smtps)'
# E2E probe — Brevo → external MX :25 → IMAP fetch
kubectl create job --from=cronjob/email-roundtrip-monitor probe-test -n mailserver
kubectl wait --for=condition=complete --timeout=90s job/probe-test -n mailserver
kubectl logs job/probe-test -n mailserver | grep SUCCESS
kubectl delete job probe-test -n mailserver
# Real client IP in maillog post-delivery
kubectl logs -c docker-mailserver deployment/mailserver -n mailserver \
| grep 'smtpd-proxy25.*CONNECT from' | tail -5
# Expect external source IPs (e.g., Brevo 77.32.148.x), NOT 10.0.20.x
```
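For exercising the alt listeners without pfSense in the loop, the PROXY v2 framing HAProxy injects can also be built by hand. A minimal sketch (no network; the client/server addresses are illustrative) constructing the 28-byte IPv4/TCP header that postscreen and Dovecot expect as the first bytes of a connection:

```shell
# Build a PROXY v2 header for fake client 203.0.113.7:44321 -> 10.0.20.1:25.
# Prepend this to a raw test connection against an alt port (2525/4465/...).
python3 - <<'EOF'
import socket, struct
sig = b'\r\n\r\n\x00\r\nQUIT\n'        # 12-byte PROXY v2 signature
ver_cmd = 0x21                          # version 2, PROXY command
fam_proto = 0x11                        # AF_INET + STREAM (TCP over IPv4)
src = socket.inet_aton('203.0.113.7')   # spoofed "real client" IP
dst = socket.inet_aton('10.0.20.1')
addrs = src + dst + struct.pack('!HH', 44321, 25)
hdr = sig + bytes([ver_cmd, fam_proto]) + struct.pack('!H', len(addrs)) + addrs
print(len(hdr), hdr[:12] == sig)        # 28-byte header for IPv4/TCP
EOF
```

A listener with `postscreen_upstream_proxy_protocol=haproxy` (or Dovecot `haproxy = yes`) will log the 203.0.113.7 address as the client, which is exactly the recovery mechanism the steady-state diagram describes.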
## Bootstrap / restore from scratch
Config lives in pfSense `/cf/conf/config.xml` under
`<installedpackages><haproxy>`. Backed up nightly to
pfSense HAProxy config lives in `/cf/conf/config.xml` under
`<installedpackages><haproxy>`. That file is scp'd nightly to
`/mnt/backup/pfsense/config-YYYYMMDD.xml` by `scripts/daily-backup.sh`, then
Synology. To rebuild from source of truth (git):
synced to Synology. To rebuild from source of truth (git):
```sh
scp infra/scripts/pfsense-haproxy-bootstrap.php admin@10.0.20.1:/tmp/
ssh admin@10.0.20.1 'php /tmp/pfsense-haproxy-bootstrap.php'
```
The script is idempotent — re-runs reset the mailserver frontend + backend to
the declared state.
The script is idempotent — re-runs reset the mailserver frontends + backends
to the declared state.
Expected output:
```
haproxy_check_and_run rc=OK
messages: ...
```
Verify:
```sh
ssh admin@10.0.20.1 "pgrep -lf haproxy; sockstat -l | grep ':2525'"
# 64009 /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg ...
# www haproxy 64009 5 tcp4 *:2525 *:*
```
## Operations
### Change backend k8s node IPs
### Change backend k8s node IPs / NodePorts
Edit `infra/scripts/pfsense-haproxy-bootstrap.php` → `foreach` array of
`[name, address]`, re-run via the bootstrap command above. Don't hand-edit
`/var/etc/haproxy/haproxy.cfg` — it is regenerated from XML on every apply.
Edit `infra/scripts/pfsense-haproxy-bootstrap.php` → `$NODES` array + the
`build_pool()` port arguments. Re-run the bootstrap command above. Don't
hand-edit `/var/etc/haproxy/haproxy.cfg` — it is regenerated from XML on
every apply.
### Check health of backends
@ -105,7 +139,7 @@ ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio
### View live HAProxy stats (WebUI)
`https://pfsense.viktorbarzin.me` → Services → HAProxy → Stats
`https://pfsense.viktorbarzin.me` → Services → HAProxy → Stats.
### Reload after config.xml edit
@ -113,6 +147,22 @@ ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio
ssh admin@10.0.20.1 'pfSsh.php playback svc restart haproxy'
```
### Rollback (flip NAT back to MetalLB, post-Phase-6 only partial)
There is no Phase-6 rollback one-liner. Phase 6 removed the MetalLB
LoadBalancer `10.0.20.202` entirely, so un-flipping NAT alone would send
traffic to an address nothing answers. To regress:
1. Re-add `metallb.io/loadBalancerIPs = "10.0.20.202"` + `type = "LoadBalancer"`
+ `external_traffic_policy = "Local"` to `kubernetes_service.mailserver`,
apply.
2. Re-add the `mailserver` host alias in pfSense pointing at 10.0.20.202
(Firewall → Aliases → Hosts).
3. Run `infra/scripts/pfsense-nat-mailserver-haproxy-unflip.php` on pfSense.
For rollback of just the NAT (Phase 4) without touching the Service, only
the third step is needed — but only meaningful BEFORE Phase 6.
### Restore from backup
pfSense config backup is a plain XML file:
@ -124,27 +174,30 @@ pfSense config backup is a plain XML file:
Full restore: pfSense WebUI → Diagnostics → Backup & Restore → Upload that
`config.xml`. The `<installedpackages><haproxy>` section is included.
## Phase roadmap (bd code-yiu)
## Phase history (bd code-yiu)
| Phase | Status | Description |
|---|---|---|
| 1a | ✅ done (commit `ef75c02f`) | k8s alt listener `:2525` + `mailserver-proxy` NodePort |
| 2 | ✅ done (2026-04-19) | pfSense HAProxy installed + test config on `:2525` |
| 3 | ✅ done (2026-04-19) | HAProxy config persisted to pfSense `config.xml` (this runbook + `pfsense-haproxy-bootstrap.php`) |
| 4 | not yet | Flip pfSense NAT rdr for `:25` from `<mailserver>` alias → HAProxy VIP. Requires atomic cutover. |
| 5 | not yet | Extend to ports 465/587/993: add alt container listeners (4465/5587/10993), add Dovecot `haproxy = yes` on extra inet_listener, expand HAProxy frontends, flip NAT. |
| 6 | not yet | Observe 48h, decommission MetalLB LB path (downgrade mailserver Service from LoadBalancer to ClusterIP, free `10.0.20.202`). |
| 1a | ✅ commit `ef75c02f` | k8s alt :2525 listener + NodePort Service |
| 2 | ✅ 2026-04-19 | pfSense HAProxy pkg installed (`pfSense-pkg-haproxy-devel-0.63_2`, HAProxy 2.9-dev6) |
| 3 | ✅ commit `ba697b02` | HAProxy config persisted in pfSense XML (bootstrap script + this runbook) |
| 4+5| ✅ commit `9806d515` | 4-port alt listeners + HAProxy frontends for 25/465/587/993 + NAT flip |
| 6 | ✅ this commit | Mailserver Service downgraded LoadBalancer → ClusterIP; `10.0.20.202` released back to MetalLB pool; orphan `mailserver` pfSense alias removed; monitors retargeted |
## Known warts
- HAProxy TCP health-check with `send-proxy-v2` + short `inter` floods
postscreen with `getpeername: Transport endpoint not connected` warnings
every check cycle. Mitigated with `inter 120000` (2 min). To reduce
further, switch to `option smtpchk` — but that requires a separate
non-PROXY health-check port on the pod (not done yet).
- Frontend binds on all pfSense interfaces (`bind :2525`) rather than just
`10.0.20.1:2525`. `<extaddr>` is set in XML but pfSense templates it as
port-only. Low concern while port 2525 is a test port; tighten once
promoted to real ports (25/465/587/993).
- HAProxy TCP health-check with `send-proxy-v2` generates `getpeername:
Transport endpoint not connected` warnings on postscreen every check cycle.
Mitigated with `inter 120000` (2 min). To reduce further, switch to
`option smtpchk` — but that requires a separate non-PROXY health-check
port on the pod (not done yet).
- Frontend binds on all pfSense interfaces (`bind :25` instead of
`10.0.20.1:25`). `<extaddr>` is set in XML but pfSense templates it
port-only. Low concern in practice because WAN firewall rules plus the
NAT rdr gate external access; internal VLAN clients SHOULD be able to
reach HAProxy on any pfSense-local IP.
- k8s-node5 doesn't exist — cluster has master + 4 workers. Backend pool
capped at 4 servers.
- Postscreen still logs `improper command pipelining` for legitimate
clients that send `EHLO\r\nQUIT\r\n` as a single TCP write. This is
unchanged pre/post-migration — postscreen's anti-bot heuristic.


@ -624,6 +624,13 @@ resource "kubernetes_deployment" "mailserver" {
}
resource "kubernetes_service" "mailserver" {
# code-yiu Phase 6: downgraded from LoadBalancer (MetalLB 10.0.20.202,
# ETP: Local) to ClusterIP on 2026-04-19. External mail now enters via
# pfSense HAProxy kubernetes_service.mailserver_proxy NodePort alt
# PROXY-speaking listeners. This Service exists only for intra-cluster
# clients (Roundcube pod, email-roundtrip-monitor CronJob) that talk to
# `mailserver.mailserver.svc.cluster.local:{25,465,587,993}` on the
# stock (PROXY-free) container listeners.
metadata {
name = "mailserver"
namespace = kubernetes_namespace.mailserver.metadata[0].name
@ -631,15 +638,10 @@ resource "kubernetes_service" "mailserver" {
labels = {
app = "mailserver"
}
annotations = {
"metallb.io/loadBalancerIPs" = "10.0.20.202"
}
}
spec {
type = "LoadBalancer"
external_traffic_policy = "Local"
type = "ClusterIP"
selector = {
app = "mailserver"
}