From e6e5fc5f1747bff14ecf43c7ae7c80fedcce0e2b Mon Sep 17 00:00:00 2001 From: Viktor Barzin Date: Sun, 19 Apr 2026 12:40:53 +0000 Subject: [PATCH] =?UTF-8?q?[docs]=20Mailserver=20architecture=20=E2=80=94?= =?UTF-8?q?=20richer=20diagrams=20+=20steady-state=20accuracy=20[ci=20skip?= =?UTF-8?q?]?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Context After code-yiu Phases 1a–6 landed, `docs/architecture/mailserver.md` still carried the pre-HAProxy Mermaid diagram, a retired Dovecot-exporter component row, stale PVC names (`-proxmox` suffixes that were renamed `-encrypted` during the LUKS migration), a wrong probe schedule (claimed 10 min, actually 20 min), and a Mailgun-API claim for the probe (it's been on Brevo since code-n5l). The two-path architecture (external-via-HAProxy + intra-cluster-via-ClusterIP) that defines the current design wasn't visualised at all. ## This change Rewrote the Architecture Diagram section to show **both ingress paths in one Mermaid flowchart**, colour-coded: - External (orange): Sender → pfSense NAT → HAProxy → NodePort → **alt PROXY listeners** (2525/4465/5587/10993). - Intra-cluster (blue): Roundcube / probe → ClusterIP Service → **stock listeners** (25/465/587/993), no PROXY. - The pod subgraph shows both listener sets feeding the same Postfix / Rspamd / Dovecot / Maildir pipeline. - Security dotted edges: Postfix log stream → CrowdSec agent → LAPI → pfSense bouncer decisions. - Monitoring dotted edges: probe → Brevo HTTP → MX → pod → IMAP → Pushgateway/Uptime Kuma. Added a **sequenceDiagram** for the external SMTP roundtrip — walks through the wire-level handshake from external MTA → pfSense NAT → HAProxy TCP connect → PROXY v2 header write → kube-proxy SNAT → pod postscreen parse → smtpd banner. Makes the "how does the pod see the real IP despite SNAT?" question self-answering. Added a **Port mapping table** listing all 8 container listeners (4 stock + 4 alt) with their Service, NodePort, PROXY-required flag, and who uses each path. Replaces the ambiguous prose about "alt ports". Fixed stale bits: - Removed Dovecot Exporter row from Components (retired in code-1ik). - Added pfSense HAProxy row. - Probe schedule: every 10 min → **every 20 min** (`*/20 * * * *`). - Probe API: Mailgun → **Brevo HTTP**. - PVC names: `-proxmox` → **`-encrypted`** (all three); storage class `proxmox-lvm` → **`proxmox-lvm-encrypted`**. - Added `mailserver-backup-host` + `roundcube-backup-host` RWX NFS PVCs to the Storage table with backup flow pointer. - Expanded Troubleshooting → Inbound to include HAProxy health check + container-listener verification steps. - Secrets table: `brevo_api_key` now marked as used by both relay + probe; `mailgun_api_key` marked historical. Added a prominent **UPDATE 2026-04-19** header to `docs/runbooks/mailserver-proxy-protocol.md` pointing future readers at the implemented state in `mailserver-pfsense-haproxy.md`. Research doc preserved as a decision record — it's the canonical "why not just pin the pod?" reference. ## What is NOT in this change - No Terraform changes; this is docs-only. - No changes to the runbook (`mailserver-pfsense-haproxy.md`) — it was already rewritten during Phase 6. ## Test Plan ### Automated ``` $ awk '/^```mermaid/ {c++} END{print c}' docs/architecture/mailserver.md 2 $ grep -c '\-encrypted' docs/architecture/mailserver.md 5 # PVC references normalised $ grep -c '\-proxmox' docs/architecture/mailserver.md 0 # no stale names left ``` ### Manual Verification Render `docs/architecture/mailserver.md` on GitHub or any Mermaid- capable viewer: 1. Top Architecture Diagram should show two labelled paths into the pod, colour-coded (orange = external, blue = intra-cluster). 2. Sequence diagram should show 10 numbered steps ending at Rspamd + Dovecot delivery. 3. Port Mapping table should make it obvious that the 4 alt container ports are only reachable via `mailserver-proxy` NodePort and require PROXY v2. --- docs/architecture/mailserver.md | 182 +++++++++++++++------ docs/runbooks/mailserver-proxy-protocol.md | 15 +- 2 files changed, 144 insertions(+), 53 deletions(-) diff --git a/docs/architecture/mailserver.md b/docs/architecture/mailserver.md index a291c3a3..5307eb7c 100644 --- a/docs/architecture/mailserver.md +++ b/docs/architecture/mailserver.md @@ -8,57 +8,128 @@ Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubern ## Architecture Diagram +Two independent paths into the mailserver pod: + +- **External** (MX traffic, webmail clients over WAN): Internet → pfSense → HAProxy → NodePort → **alt container ports** (2525/4465/5587/10993) that **require** PROXY v2 framing. +- **Intra-cluster** (Roundcube, E2E probe): same pod, **stock container ports** (25/465/587/993), **no** PROXY framing. + +One Deployment, one pod, two sets of Postfix `master.cf` services + Dovecot `inet_listener` blocks, two Kubernetes Services (`mailserver` ClusterIP + `mailserver-proxy` NodePort). + ```mermaid -graph TB - subgraph "Inbound Mail" - SENDER[Sending MTA] -->|MX lookup| MX[mail.viktorbarzin.me:25] - MX -->|176.12.22.76:25| PF[pfSense NAT rdr] - PF -->|10.0.20.1:25| HAP[pfSense HAProxy
send-proxy-v2] - HAP -->|k8s-node:30125| KP[kube-proxy
ETP: Cluster] - KP -->|pod:2525 PROXY v2| POSTFIX[Postfix MTA
postscreen] +flowchart TB + %% External ingress path + SENDER[Sending MTA
arbitrary public IP] -->|MX lookup + SMTP
:25| MX[mail.viktorbarzin.me
A 176.12.22.76] + MX --> PF[pfSense WAN
vtnet0 192.168.1.2] + PF -->|NAT rdr
WAN:25/465/587/993
→ 10.0.20.1:same| HAP + HAP[pfSense HAProxy
4 TCP frontends on 10.0.20.1
send-proxy-v2 to backends] + HAP -->|round-robin
tcp-check inter 120s| KN{k8s worker
node1..4} + KN -->|NodePort 30125-30128
ETP: Cluster → kube-proxy SNAT| PODEXT + + %% Internal ingress path + RC[Roundcubemail pod] -->|SMTP :587 + IMAP :993
no PROXY| SVC[Service mailserver
ClusterIP 10.103.108.x
25/465/587/993] + PROBE[email-roundtrip-monitor
CronJob every 20m] -->|IMAP :993
no PROXY| SVC + SVC -->|kube-proxy routes| PODINT + + %% The pod — two listener sets, one process tree + subgraph POD["mailserver pod (docker-mailserver 15.0.0)"] + direction LR + PODEXT[Alt ports
2525 / 4465 / 5587 / 10993
PROXY v2 REQUIRED
smtpd_upstream_proxy_protocol=haproxy
haproxy = yes] + PODINT[Stock ports
25 / 465 / 587 / 993
PROXY-free] + PODEXT --> POSTFIX + PODINT --> POSTFIX + POSTFIX[Postfix
postscreen + smtpd + cleanup + queue] + POSTFIX --> RSPAMD[Rspamd
spam + DKIM + DMARC] + RSPAMD --> DOVECOT[Dovecot IMAP
LMTP deliver] + DOVECOT --> MAILBOX[(Maildir storage
mailserver-data-encrypted PVC
proxmox-lvm-encrypted LUKS2)] end - subgraph "Mail Processing" - POSTFIX --> RSPAMD[Rspamd
Spam/DKIM/DMARC] - RSPAMD --> DOVECOT[Dovecot IMAP] - DOVECOT --> MAILBOX[(Mailboxes
proxmox-lvm PVC)] - end + %% Outbound + POSTFIX -->|queued mail
SASL + TLS| BREVO[Brevo EU Relay
smtp-relay.brevo.com:587
300/day free tier] + BREVO --> RECIPIENT[External Recipient] - subgraph "Outbound Mail" - POSTFIX_OUT[Postfix] -->|SASL + TLS| MAILGUN[Brevo EU Relay
smtp-relay.brevo.com:587] - MAILGUN --> RECIPIENT[Recipient] - end + %% Webmail HTTP path + USER[User browser] -->|HTTPS| CF[Cloudflare proxy
mail.viktorbarzin.me] + CF --> TUNNEL[Cloudflared tunnel
pfSense → Traefik] + TUNNEL --> TRAEFIK[Traefik Ingress
Authentik-protected] + TRAEFIK --> RC - subgraph "Webmail" - USER[User] -->|HTTPS| TRAEFIK[Traefik Ingress] - TRAEFIK --> RC[Roundcubemail] - RC -->|IMAP 993| DOVECOT - RC -->|SMTP 587| POSTFIX_OUT - end + %% Security + POSTFIX -.->|log stream
real client IPs from PROXY v2| CSAGENT[CrowdSec Agent
postfix + dovecot parsers] + CSAGENT -.-> CSLAPI[CrowdSec LAPI] + CSLAPI -.->|bouncer decisions
ban external IPs| PF - subgraph "Security" - POSTFIX -->|Real client IPs
from PROXY v2 header| CS_AGENT[CrowdSec Agent
postfix + dovecot parsers] - CS_AGENT --> CS_LAPI[CrowdSec LAPI] - end + %% Monitoring + PROBE -.->|Brevo HTTP API
triggers external delivery| MX + PROBE -.->|Push on roundtrip success| PUSH[Pushgateway + Uptime Kuma] - subgraph "Monitoring" - PROBE[E2E Roundtrip Probe
CronJob every 20m] -->|Mailgun API| SENDER - PROBE -->|IMAP check| DOVECOT - PROBE --> PUSH[Pushgateway + Uptime Kuma] - DEXP[Dovecot Exporter
:9166] --> PROM[Prometheus] - end + classDef extPath fill:#ffedd5,stroke:#ea580c,stroke-width:2px + classDef intPath fill:#dbeafe,stroke:#2563eb,stroke-width:2px + classDef pod fill:#dcfce7,stroke:#15803d + classDef sec fill:#fee2e2,stroke:#dc2626 + class SENDER,MX,PF,HAP,KN,PODEXT extPath + class RC,PROBE,SVC,PODINT intPath + class POSTFIX,RSPAMD,DOVECOT,MAILBOX pod + class CSAGENT,CSLAPI sec ``` +### PROXY v2 sequence (external SMTP roundtrip) + +Illustrates the wire-level sequence of a Brevo probe email arriving at our MX. Same sequence applies to any external sender. + +```mermaid +sequenceDiagram + autonumber + participant C as External MTA
(e.g. Brevo 77.32.148.26) + participant PF as pfSense WAN
192.168.1.2:25 + participant HAP as pfSense HAProxy
10.0.20.1:25 + participant N as k8s-node:30125
ETP: Cluster + participant P as Postfix postscreen
pod:2525 + + C->>PF: TCP SYN dst=192.168.1.2:25 + PF->>HAP: NAT rdr rewrites dst → 10.0.20.1:25 + HAP->>N: TCP connect (src=10.0.20.1, dst=k8s-node:30125) + Note over HAP,N: HAProxy opens a NEW TCP flow
to the backend k8s node. + HAP->>N: PROXY v2 header
(source=77.32.148.26, dest=10.0.20.1) + N->>P: kube-proxy SNAT src=k8s-node IP
forwards PROXY header + payload to pod + P->>P: Parse PROXY v2 header
smtpd_client_addr := 77.32.148.26
(despite kube-proxy SNAT on the wire) + P-->>C: SMTP banner 220 mail.viktorbarzin.me + C-->>P: EHLO / MAIL FROM / RCPT TO / DATA + Note over P,C: Real client IP logged in maillog,
fed to CrowdSec postfix parser. + P->>P: → smtpd → Rspamd → Dovecot → mailbox +``` + + ## Components | Component | Version | Location | Purpose | |-----------|---------|----------|---------| -| docker-mailserver | 15.0.0 | `mailserver` namespace | Postfix MTA + Dovecot IMAP + Rspamd | +| docker-mailserver | 15.0.0 | `mailserver` namespace | Postfix MTA + Dovecot IMAP + Rspamd (single container) | | Roundcubemail | 1.6.13-apache | `mailserver` namespace | Webmail UI (MySQL-backed) | -| Dovecot Exporter | latest | Sidecar in mailserver pod | Prometheus metrics (port 9166) | | Rspamd | Built into docker-mailserver | — | Spam filtering, DKIM signing, DMARC verification | +| pfSense HAProxy | 2.9-dev6 (`pfSense-pkg-haproxy-devel`) | pfSense VM | TCP reverse proxy injecting PROXY v2 for external mail | | Brevo EU (ex-Sendinblue) | SaaS | — | Outbound SMTP relay (300/day free) | +Dovecot exporter was retired in code-1ik (2026-04-19) — `viktorbarzin/dovecot_exporter` speaks the pre-2.3 `old_stats` FIFO protocol which docker-mailserver 15.0.0's Dovecot 2.3.19 no longer emits. + +## Port mapping + +The mailserver pod exposes **8 TCP listeners**: 4 stock + 4 alt. Two Kubernetes Services front them depending on whether the client can inject PROXY v2. + +| Mail protocol | Service port | K8s Service | Container port | NodePort | PROXY v2? | Who uses this path | +|---|---|---|---|---|---|---| +| SMTP (plain + STARTTLS) | 25 | `mailserver` ClusterIP | 25 | — | ❌ stock | Intra-cluster only (not used — internal clients send via 587) | +| SMTPS (implicit TLS) | 465 | `mailserver` ClusterIP | 465 | — | ❌ stock | Intra-cluster (Roundcube rarely uses this) | +| Submission (STARTTLS) | 587 | `mailserver` ClusterIP | 587 | — | ❌ stock | **Roundcube pod** → mailserver.svc:587 | +| IMAPS | 993 | `mailserver` ClusterIP | 993 | — | ❌ stock | **Roundcube pod** + E2E probe → mailserver.svc:993 | +| SMTP | 25 | `mailserver-proxy` NodePort | 2525 | 30125 | ✅ required | External MX traffic via pfSense HAProxy | +| SMTPS | 465 | `mailserver-proxy` NodePort | 4465 | 30126 | ✅ required | External SMTPS submission | +| Submission | 587 | `mailserver-proxy` NodePort | 5587 | 30127 | ✅ required | External STARTTLS submission (mail clients over WAN) | +| IMAPS | 993 | `mailserver-proxy` NodePort | 10993 | 30128 | ✅ required | External IMAPS (mail clients over WAN) | + +The alt listeners are set up by: +- **Postfix**: `user-patches.sh` (shipped via ConfigMap `mailserver-user-patches`) appends 3 entries to `master.cf` with `-o postscreen_upstream_proxy_protocol=haproxy` (for 2525) or `-o smtpd_upstream_proxy_protocol=haproxy` (for 4465/5587). +- **Dovecot**: `dovecot.cf` ConfigMap adds a second `inet_listener` inside `service imap-login` with `haproxy = yes`, plus `haproxy_trusted_networks = 10.0.20.0/24` to allow PROXY headers from the k8s node subnet (post kube-proxy SNAT the source IP is always a node IP). + ## Mail Flow ### Inbound @@ -147,11 +218,13 @@ anvil_rate_time_unit = 60s ## Monitoring ### E2E Roundtrip Probe -CronJob `email-roundtrip-monitor` (every 10 min): -1. Sends test email via Mailgun HTTP API to `smoke-test@viktorbarzin.me` -2. Email hits MX → Postfix → catch-all delivers to `spam@` mailbox -3. Verifies delivery via IMAP (searches by UUID marker) -4. Deletes test email, pushes metrics to Pushgateway + Uptime Kuma +CronJob `email-roundtrip-monitor` (every 20 min, `*/20 * * * *`): +1. Sends test email via **Brevo HTTP API** to `smoke-test@viktorbarzin.me` (Brevo delivers it to our MX over the public internet, exercising the full external-ingress path). +2. Email hits WAN → pfSense HAProxy → k8s-node:30125 → pod :2525 postscreen (PROXY v2) → Postfix → catch-all delivers to `spam@` mailbox. +3. Verifies delivery via IMAP — connects to `mailserver.mailserver.svc.cluster.local:993` (intra-cluster path, no PROXY), searches by UUID marker. +4. Deletes test email, pushes metrics to Pushgateway + Uptime Kuma. + +Push secrets (`BREVO_API_KEY`, `EMAIL_MONITOR_IMAP_PASSWORD`) come from ExternalSecret `mailserver-probe-secrets` (synced from Vault `secret/viktor` + `secret/platform.mailserver_accounts`) — see code-39v. ### Prometheus Alerts | Alert | Threshold | Severity | @@ -167,9 +240,8 @@ CronJob `email-roundtrip-monitor` (every 10 min): - TCP `10.0.20.1:{25,993}` — pfSense HAProxy health (post code-yiu Phase 6) - E2E Push monitor (receives push from `email-roundtrip-monitor` probe) -### Dovecot Exporter -- Sidecar container in mailserver pod, port 9166 -- Scraped by Prometheus for IMAP connection metrics +### Dovecot exporter — retired +`viktorbarzin/dovecot_exporter` was removed in code-1ik (2026-04-19). It spoke the pre-2.3 `old_stats` FIFO protocol; Dovecot 2.3.19 (docker-mailserver 15.0.0) no longer emits that, so the scrape only ever returned `dovecot_up{scope="user"} 0`. If Dovecot metrics become valuable, reach for a 2.3+ compatible exporter (e.g. `jtackaberry/dovecot_exporter`) and re-add the scrape + alerts. The previously-created `mailserver-metrics` ClusterIP Service was also removed. ## Terraform @@ -187,16 +259,20 @@ CronJob `email-roundtrip-monitor` (every 10 min): | `secret/platform` | `mailserver_aliases` | Postfix virtual aliases | | `secret/platform` | `mailserver_opendkim_key` | DKIM private key | | `secret/platform` | `mailserver_sasl_passwd` | Brevo relay credentials (`[smtp-relay.brevo.com]:587 :`) | -| `secret/viktor` | `mailgun_api_key` | Mailgun API for E2E roundtrip probe (retained for inbound delivery testing only; not used for user mail) | -| `secret/viktor` | `brevo_api_key` | Brevo API key (stored for reference) | +| `secret/viktor` | `brevo_api_key` | Brevo API key — used by BOTH outbound SMTP SASL (postfix) AND the E2E roundtrip probe (sends external test mail via Brevo HTTP) | +| `secret/viktor` | `mailgun_api_key` | Historical; no longer used by the probe post code-n5l/Phase-5 work. Kept for reference. | ## Storage | PVC | Size | Storage Class | Purpose | |-----|------|---------------|---------| -| `mailserver-data-proxmox` | 2Gi (auto-resize 5Gi) | proxmox-lvm | Mail data, state, logs | -| `roundcubemail-html-proxmox` | 1Gi | proxmox-lvm | Roundcube web files | -| `roundcubemail-enigma-proxmox` | 1Gi | proxmox-lvm | Roundcube encryption | +| `mailserver-data-encrypted` | 2Gi (auto-resize 5Gi) | `proxmox-lvm-encrypted` (LUKS2) | Maildir + Postfix queue + state + logs | +| `roundcubemail-html-encrypted` | 1Gi | `proxmox-lvm-encrypted` | Roundcube PHP code + user session data | +| `roundcubemail-enigma-encrypted` | 1Gi | `proxmox-lvm-encrypted` | Roundcube Enigma (PGP) user keys | +| `mailserver-backup-host` (RWX) | 10Gi | `nfs-truenas` | `mailserver-backup` CronJob destination (`/srv/nfs/mailserver-backup//`) | +| `roundcube-backup-host` (RWX) | 10Gi | `nfs-truenas` | `roundcube-backup` CronJob destination | + +**Backup**: daily `mailserver-backup` + `roundcube-backup` CronJobs rsync data PVCs to NFS. NFS directory is picked up by the PVE host's inotify-driven `/usr/local/bin/offsite-sync-backup` which pushes to Synology (weekly). See [Storage & Backup Architecture](storage.md) for the 3-2-1 flow. ## Decisions & Rationale @@ -225,11 +301,13 @@ CronJob `email-roundtrip-monitor` (every 10 min): ## Troubleshooting ### Inbound mail not arriving -1. Check MX: `dig MX viktorbarzin.me +short` → should show `mail.viktorbarzin.me` -2. Check port 25: `nc -zw5 mail.viktorbarzin.me 25` -3. Check pfSense NAT rule: port 25 → `10.0.20.1:25` (pfSense HAProxy VIP, post code-yiu Phase 4) -4. Check Postfix logs: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep -E 'from=|reject'` -5. Check if CrowdSec is blocking the sender: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list` +1. **DNS/MX**: `dig MX viktorbarzin.me +short` → should show `mail.viktorbarzin.me` +2. **WAN reachability**: `nc -zw5 mail.viktorbarzin.me 25` from outside +3. **pfSense NAT**: verify WAN:{25,465,587,993} rdr to `10.0.20.1` (HAProxy VIP). `ssh admin@10.0.20.1 'pfctl -sn' | grep '10.0.20.1'` +4. **HAProxy health**: `ssh admin@10.0.20.1 "echo 'show servers state' | socat /tmp/haproxy.socket stdio"` — at least one backend in `srv_op_state=2` (UP) per pool +5. **Container listener**: `kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- ss -ltn | grep -E ':(25|2525|465|4465|587|5587|993|10993)\b'` — 8 lines expected +6. **Postfix queue + delivery**: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep -E 'from=|reject|smtpd-proxy'` +7. **CrowdSec decisions**: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list` ### Outbound mail failing 1. Check Brevo relay: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep relay` — should show `relay=smtp-relay.brevo.com` diff --git a/docs/runbooks/mailserver-proxy-protocol.md b/docs/runbooks/mailserver-proxy-protocol.md index db9005ec..f00e2000 100644 --- a/docs/runbooks/mailserver-proxy-protocol.md +++ b/docs/runbooks/mailserver-proxy-protocol.md @@ -1,6 +1,19 @@ # Mailserver PROXY protocol — research & decision -Last updated: 2026-04-18 +Last updated: 2026-04-18 (original research). **Outcome implemented 2026-04-19 — see [UPDATE](#update-2026-04-19) below.** + +> ## UPDATE (2026-04-19) +> +> This doc describes the research that led to the Phase-6 rollout. **Option C +> (pfSense HAProxy + PROXY v2)** was chosen and is now live. Operational +> state, cutover history, bootstrap, and rollback procedures live in +> [`mailserver-pfsense-haproxy.md`](mailserver-pfsense-haproxy.md). +> +> This file is retained as a decision record — it explains *why* Option A +> (pod-pinning via nodeSelector) was rejected mid-session in favour of +> Option C, and documents the MetalLB upstream limitation (PROXY injection +> is explicitly won't-implement). Future debates of "why don't we just pin +> the pod?" should land here first. ## TL;DR