infra/docs/runbooks/mailserver-proxy-protocol.md

# Mailserver PROXY protocol — research & decision

Last updated: 2026-04-18 (original research). **Outcome implemented 2026-04-19 — see [UPDATE](#update-2026-04-19) below.**

> ## UPDATE (2026-04-19)
>
> This doc describes the research that led to the Phase-6 rollout. **Option C
> (pfSense HAProxy + PROXY v2)** was chosen and is now live. Operational
> state, cutover history, bootstrap, and rollback procedures live in
> [`mailserver-pfsense-haproxy.md`](mailserver-pfsense-haproxy.md).
>
> This file is retained as a decision record — it explains *why* Option A
> (pod-pinning via nodeSelector) was rejected mid-session in favour of
> Option C, and documents the MetalLB upstream limitation (PROXY injection
> is explicitly won't-implement). Future debates of "why don't we just pin
> the pod?" should land here first.

## TL;DR

**MetalLB does not and will not inject PROXY protocol headers.** The original plan
(`/home/wizard/.claude/plans/let-s-work-on-linking-temporal-valiant.md`, task
`code-rtb`) assumed MetalLB could be configured to emit PROXY v1/v2 on behalf of
the `mailserver` LoadBalancer Service. That assumption is wrong at the product
level. MetalLB is a control-plane-only announcer (ARP/NDP for L2 mode, BGP for
L3 mode); it never touches the L4 payload.

As a result, there is no single Terraform change that can flip
`externalTrafficPolicy: Local` → `Cluster` on the `mailserver` Service while
preserving the real client IP for Postfix/postscreen and Dovecot. Three
alternative paths exist (see below); none is trivial.

## Environment (verified 2026-04-18)

- **MetalLB version**: `quay.io/metallb/controller:v0.15.3` /
  `quay.io/metallb/speaker:v0.15.3` (5 speakers).
- **Advertisement type**: L2Advertisement `default` bound to IPAddressPool
  `default` (10.0.20.200–10.0.20.220). No BGPAdvertisements.
- **Service**: `mailserver/mailserver` — type `LoadBalancer`, `loadBalancerIPs:
  10.0.20.202`, `externalTrafficPolicy: Local`,
  `healthCheckNodePort: 30234`, 5 ports (25, 465, 587, 993, 9166/dovecot-metrics).
- **Pod**: single replica today, RWO PVCs prevent horizontal scale without
  further work (`mailserver-data-encrypted`, `mailserver-letsencrypt-encrypted`).

## Why the original plan fails

### MetalLB never touches packets

> *"MetalLB is controlplane only, making it part of the dataplane means we
> would be responsible for the performance of the system, so more bugs to
> fight, I personally don't see that happening."*
> — MetalLB maintainer `champtar`, 2021-01-06
> (issue [#797 — Feature Request: Supporting Proxy Protocol v2](https://github.com/metallb/metallb/issues/797))

Issue #797 is closed as "won't implement". Repeat asks in 2022–2023 got the
same answer. The v0.15.3 API surface confirms this: no
`proxyProtocol`/`haproxy`/`protocol: proxy` field exists on `IPAddressPool`,
`L2Advertisement`, `BGPAdvertisement`, or as a Service annotation.

Only managed-cloud LBs (AWS NLB, Azure LB, OCI, DO, OVH, Scaleway, etc.) offer
PROXY protocol as a tick-box. MetalLB's equivalents are:

| MetalLB feature | Does it preserve client IP? | Comment |
|---|---|---|
| `externalTrafficPolicy: Local` (current) | Yes, via iptables DNAT on the speaker node | Forces pod↔speaker colocation on L2 mode. This is the pain we wanted to avoid. |
| `externalTrafficPolicy: Cluster` | No — kube-proxy SNATs to the node IP | The problem we would re-introduce if we flipped without PROXY injection. |
| PROXY protocol injection | N/A — not implemented | Dead end. |

### The `Local` trap is real, but narrower than it seems

Today's `Local` policy means the ARP announcer node must also host the mailserver
pod. MetalLB always picks a single speaker to advertise the VIP (leader
election per IP), so in practice exactly one node matters at any moment. A pod
rescheduled to a different node silently drops inbound SMTP/IMAP until a GARP
flip or node cordon.

The only pods on our cluster that see this same class of risk are Traefik
(3 replicas + PDB `minAvailable=2`, so 2 of 3 nodes always have a pod) and
mailserver (1 replica). Traefik survives because the pods outnumber the nodes
that could be the speaker at once; the mailserver cannot.

## Alternative paths (ranked by effort)

### Option A — Pin the mailserver pod to a specific node (SIMPLEST)

Add `nodeSelector` on the mailserver Deployment pointing at a label that's also
stamped on the MetalLB speaker we want to advertise the VIP from, and use
MetalLB's [node selector](https://metallb.io/configuration/_advanced_l2_configuration/#specify-network-interfaces-that-lb-ip-can-be-announced-from)
on `L2Advertisement.spec.nodeSelectors` to pin the announcer to the same node.

Trade-offs:

- Zero changes to Postfix/Dovecot configs.
- Keeps `externalTrafficPolicy: Local` — real client IP keeps arriving.
- Loses HA (the whole point of the MetalLB layer) but reflects reality — one
  replica, one PVC, no HA today anyway.
- Drain of that node requires a planned cutover, but that's no worse than
  today's silent failure mode.

Implementation (~10 lines of Terraform):

```hcl
# In stacks/mailserver/modules/mailserver/main.tf, on the Deployment:
node_selector = { "viktorbarzin.me/mailserver-anchor" = "true" }

# In stacks/platform (or wherever the MetalLB CRs live):
resource "kubernetes_manifest" "mailserver_l2ad" {
  manifest = {
    apiVersion = "metallb.io/v1beta1"
    kind       = "L2Advertisement"
    metadata   = { name = "mailserver", namespace = "metallb-system" }
    spec = {
      ipAddressPools = ["default"]
      nodeSelectors  = [{ matchLabels = { "viktorbarzin.me/mailserver-anchor" = "true" } }]
    }
  }
}
```

Plus a node label via `kubectl label node k8s-node3 viktorbarzin.me/mailserver-anchor=true`.

**Recommendation: this is the shortest path to eliminating the silent-drop
failure mode** without taking on a new proxy tier.

### Option B — Put a HAProxy sidecar in front of Postfix/Dovecot

Stand up an in-cluster HAProxy with PROXY v2 enabled on the frontend and
`send-proxy-v2` on the backend to `mailserver:25/465/587/993`. Expose HAProxy
via a new MetalLB Service with `externalTrafficPolicy: Cluster` + kube-proxy
DSR workaround (still loses client IP at that layer), or run HAProxy on the
host-network of the same node (back to Option A's colocation).

Trade-offs:

- Introduces one more network hop and TLS-termination decision for every
  SMTP connect.
- HAProxy needs its own cert rotation (or `tls-passthrough`) — adds moving
  parts to an already crowded mailserver module.
- Doesn't actually solve the colocation problem on its own — HAProxy itself
  needs to receive the client IP, so we are back to externalTrafficPolicy
  constraints for HAProxy.

**Recommendation: avoid unless we also get HA for mailserver itself, which
needs RWX storage + DB split-brain work — out of scope.**

### Option C — Replace MetalLB with a different LB for this Service

Candidates: [kube-vip](https://kube-vip.io/) (supports eBPF-based DSR but not
PROXY injection either), [Cilium LB](https://docs.cilium.io/en/stable/network/lb-ipam/)
(preserves client IP via DSR in hybrid mode), or a dedicated HAProxy running on
pfSense and NAT-forwarding 25/465/587/993 with PROXY headers to a
ClusterIP-exposed mailserver. Cilium requires a CNI migration (we run Calico
today); pfSense HAProxy is genuinely feasible but belongs in a different bd
task.

**Recommendation: track as P3 follow-up under a new bd task if Option A proves
insufficient.**

## Decision

Do nothing in this session beyond this runbook + the bd note. The `code-rtb`
task as written is not executable — MetalLB cannot inject PROXY headers, and
the Postfix/Dovecot config changes the plan proposed would not receive the
header they expect, they would hang waiting for it and then timeout (5s per
connection).

Follow-up work filed as bd child tasks (if user wants to pursue):

- **Option A — pin mailserver + L2Advertisement nodeSelectors** (new bd task)
- **Option C — HAProxy on pfSense with PROXY v2 to a ClusterIP** (new bd task)

## References

- [MetalLB issue #797 — Feature Request: Supporting Proxy Protocol v2](https://github.com/metallb/metallb/issues/797) (closed, won't implement)
- [MetalLB PR #796 — Source IP Preservation discussion](https://github.com/metallb/metallb/issues/796)
- Postfix [postscreen_upstream_proxy_protocol](https://www.postfix.org/postconf.5.html#postscreen_upstream_proxy_protocol) — expects the PROXY header *on every incoming connection*; if absent, postscreen drops after `postscreen_upstream_proxy_timeout`.
- Dovecot [haproxy_trusted_networks](https://doc.dovecot.org/settings/core/#core_setting-haproxy_trusted_networks) — treats the header as mandatory for listed source networks.
- Cluster state verified against: `kubectl -n metallb-system get pods`,
  `kubectl get ipaddresspools.metallb.io -A`,
  `kubectl get l2advertisements.metallb.io -A`,
  `kubectl get bgpadvertisements.metallb.io -A`,
  `kubectl -n mailserver get svc mailserver -o yaml`.
-												[mailserver] Delete postfix_cf_reference_DO_NOT_USE dead code [ci skip]

## Context

`infra/stacks/mailserver/modules/mailserver/variables.tf` carried a
130-line historical scaffolding variable
`postfix_cf_reference_DO_NOT_USE` containing a reference copy of an
older Postfix `main.cf` layout. The variable name itself signalled
dead-code intent ("DO_NOT_USE"), and a repo-wide
`grep -rn postfix_cf_reference infra/` confirmed zero consumers — no
module, no stack, no script, no doc ever referenced it. Carrying dead
Terraform variables costs nothing at runtime but actively wastes
reviewer attention on every `git blame`, drives up `variables.tf` read
time, and lets drift calcify.

Trade-offs considered:
- Keep it "just in case" → rejected; the file it mirrored
  (`/usr/share/postfix/main.cf.dist`) is already canonical upstream and
  reproducible inside any docker-mailserver container.
- Move it to a comment block → rejected; same noise cost, no value
  over deletion (authoritative source is in the image).

## This change

Drops the entire `variable "postfix_cf_reference_DO_NOT_USE" { ... }`
block (136 lines incl. trailing blank). No other variable touched, no
resource touched, no comment elsewhere touched. `variables.tf` now
contains only the single live variable `postfix_cf` that is actually
consumed by the module.

## What is NOT in this change

- No Terraform state modification — variable was never read, so state
  has no record of it.
- No Postfix runtime behaviour change — `postfix_cf` (the live one) is
  untouched.
- No fix for the pre-existing `kubernetes_deployment.mailserver` /
  `kubernetes_service.mailserver` drift that `terragrunt plan` surfaces
  independently. Those 2 in-place updates are known and tracked
  separately; this commit explicitly avoids conflating cleanup with
  drift resolution.
- No apply needed — pure source hygiene.

## Test Plan

### Automated

Reference check before edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
infra/stacks/mailserver/modules/mailserver/variables.tf:41:variable "postfix_cf_reference_DO_NOT_USE" {
```
(single match — the declaration itself)

Reference check after edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
(no matches)
```

`terragrunt validate` (from `infra/stacks/mailserver/`):
```
Success! The configuration is valid, but there were some
validation warnings as shown above.
```
(warnings are pre-existing `kubernetes_namespace` → `_v1` deprecation
notices, unrelated)

`terragrunt plan` (from `infra/stacks/mailserver/`):
```
  # module.mailserver.kubernetes_deployment.mailserver will be updated in-place
  # module.mailserver.kubernetes_service.mailserver will be updated in-place
Plan: 0 to add, 2 to change, 0 to destroy.
```
Both in-place updates are the known pre-existing drift
(volume_mount ordering + stale `metallb.io/ip-allocated-from-pool`
annotation). No change is attributable to this commit — the dead
variable was never referenced, so removing it leaves state untouched.

### Manual Verification

1. `cd infra/stacks/mailserver/modules/mailserver/`
2. `grep -c postfix_cf_reference variables.tf` → expected `0`
3. `wc -l variables.tf` → expected `39` (was `175`; 136 lines removed
   including the trailing blank after the EOT)
4. Open `variables.tf` → expected: only `variable "postfix_cf"` remains
5. `cd ../..` (stack root) → `terragrunt validate` → expected:
   `Success! The configuration is valid`
6. `terragrunt plan` → expected: `Plan: 0 to add, 2 to change, 0 to
   destroy.` (the 2 are the pre-existing drift, not from this commit).

Closes: code-o3q

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-04-19 00:03:42 +00:00
+								# Mailserver PROXY protocol — research & decision
-												[docs] Mailserver architecture — richer diagrams + steady-state accuracy [ci skip]

## Context

After code-yiu Phases 1a–6 landed, `docs/architecture/mailserver.md` still
carried the pre-HAProxy Mermaid diagram, a retired Dovecot-exporter
component row, stale PVC names (`-proxmox` suffixes that were renamed
`-encrypted` during the LUKS migration), a wrong probe schedule
(claimed 10 min, actually 20 min), and a Mailgun-API claim for the
probe (it's been on Brevo since code-n5l). The two-path architecture
(external-via-HAProxy + intra-cluster-via-ClusterIP) that defines the
current design wasn't visualised at all.

## This change

Rewrote the Architecture Diagram section to show **both ingress paths
in one Mermaid flowchart**, colour-coded:

- External (orange): Sender → pfSense NAT → HAProxy → NodePort →
  **alt PROXY listeners** (2525/4465/5587/10993).
- Intra-cluster (blue): Roundcube / probe → ClusterIP Service →
  **stock listeners** (25/465/587/993), no PROXY.
- The pod subgraph shows both listener sets feeding the same Postfix /
  Rspamd / Dovecot / Maildir pipeline.
- Security dotted edges: Postfix log stream → CrowdSec agent →
  LAPI → pfSense bouncer decisions.
- Monitoring dotted edges: probe → Brevo HTTP → MX → pod → IMAP →
  Pushgateway/Uptime Kuma.

Added a **sequenceDiagram** for the external SMTP roundtrip — walks
through the wire-level handshake from external MTA → pfSense NAT →
HAProxy TCP connect → PROXY v2 header write → kube-proxy SNAT → pod
postscreen parse → smtpd banner. Makes the "how does the pod see the
real IP despite SNAT?" question self-answering.

Added a **Port mapping table** listing all 8 container listeners (4
stock + 4 alt) with their Service, NodePort, PROXY-required flag, and
who uses each path. Replaces the ambiguous prose about "alt ports".

Fixed stale bits:
- Removed Dovecot Exporter row from Components (retired in code-1ik).
- Added pfSense HAProxy row.
- Probe schedule: every 10 min → **every 20 min** (`*/20 * * * *`).
- Probe API: Mailgun → **Brevo HTTP**.
- PVC names: `-proxmox` → **`-encrypted`** (all three); storage class
  `proxmox-lvm` → **`proxmox-lvm-encrypted`**.
- Added `mailserver-backup-host` + `roundcube-backup-host` RWX NFS
  PVCs to the Storage table with backup flow pointer.
- Expanded Troubleshooting → Inbound to include HAProxy health check
  + container-listener verification steps.
- Secrets table: `brevo_api_key` now marked as used by both relay +
  probe; `mailgun_api_key` marked historical.

Added a prominent **UPDATE 2026-04-19** header to
`docs/runbooks/mailserver-proxy-protocol.md` pointing future readers
at the implemented state in `mailserver-pfsense-haproxy.md`. Research
doc preserved as a decision record — it's the canonical "why not just
pin the pod?" reference.

## What is NOT in this change

- No Terraform changes; this is docs-only.
- No changes to the runbook (`mailserver-pfsense-haproxy.md`) — it was
  already rewritten during Phase 6.

## Test Plan

### Automated
```
$ awk '/^```mermaid/ {c++} END{print c}' docs/architecture/mailserver.md
2
$ grep -c '\-encrypted' docs/architecture/mailserver.md
5  # PVC references normalised
$ grep -c '\-proxmox' docs/architecture/mailserver.md
0  # no stale names left
```

### Manual Verification
Render `docs/architecture/mailserver.md` on GitHub or any Mermaid-
capable viewer:
1. Top Architecture Diagram should show two labelled paths into the
   pod, colour-coded (orange = external, blue = intra-cluster).
2. Sequence diagram should show 10 numbered steps ending at Rspamd +
   Dovecot delivery.
3. Port Mapping table should make it obvious that the 4 alt container
   ports are only reachable via `mailserver-proxy` NodePort and require
   PROXY v2.

											
										
										
											2026-04-19 12:40:53 +00:00
+								Last updated: 2026-04-18 (original research). **Outcome implemented 2026-04-19 — see [UPDATE](#update-2026-04-19) below.**
 								> ## UPDATE (2026-04-19)
 								>
 								> This doc describes the research that led to the Phase-6 rollout. **Option C
 								> (pfSense HAProxy + PROXY v2)** was chosen and is now live. Operational
 								> state, cutover history, bootstrap, and rollback procedures live in
 								> [`mailserver-pfsense-haproxy.md`](mailserver-pfsense-haproxy.md).
 								>
 								> This file is retained as a decision record — it explains *why* Option A
 								> (pod-pinning via nodeSelector) was rejected mid-session in favour of
 								> Option C, and documents the MetalLB upstream limitation (PROXY injection
 								> is explicitly won't-implement). Future debates of "why don't we just pin
 								> the pod?" should land here first.
-												[mailserver] Delete postfix_cf_reference_DO_NOT_USE dead code [ci skip]

## Context

`infra/stacks/mailserver/modules/mailserver/variables.tf` carried a
130-line historical scaffolding variable
`postfix_cf_reference_DO_NOT_USE` containing a reference copy of an
older Postfix `main.cf` layout. The variable name itself signalled
dead-code intent ("DO_NOT_USE"), and a repo-wide
`grep -rn postfix_cf_reference infra/` confirmed zero consumers — no
module, no stack, no script, no doc ever referenced it. Carrying dead
Terraform variables costs nothing at runtime but actively wastes
reviewer attention on every `git blame`, drives up `variables.tf` read
time, and lets drift calcify.

Trade-offs considered:
- Keep it "just in case" → rejected; the file it mirrored
  (`/usr/share/postfix/main.cf.dist`) is already canonical upstream and
  reproducible inside any docker-mailserver container.
- Move it to a comment block → rejected; same noise cost, no value
  over deletion (authoritative source is in the image).

## This change

Drops the entire `variable "postfix_cf_reference_DO_NOT_USE" { ... }`
block (136 lines incl. trailing blank). No other variable touched, no
resource touched, no comment elsewhere touched. `variables.tf` now
contains only the single live variable `postfix_cf` that is actually
consumed by the module.

## What is NOT in this change

- No Terraform state modification — variable was never read, so state
  has no record of it.
- No Postfix runtime behaviour change — `postfix_cf` (the live one) is
  untouched.
- No fix for the pre-existing `kubernetes_deployment.mailserver` /
  `kubernetes_service.mailserver` drift that `terragrunt plan` surfaces
  independently. Those 2 in-place updates are known and tracked
  separately; this commit explicitly avoids conflating cleanup with
  drift resolution.
- No apply needed — pure source hygiene.

## Test Plan

### Automated

Reference check before edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
infra/stacks/mailserver/modules/mailserver/variables.tf:41:variable "postfix_cf_reference_DO_NOT_USE" {
```
(single match — the declaration itself)

Reference check after edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
(no matches)
```

`terragrunt validate` (from `infra/stacks/mailserver/`):
```
Success! The configuration is valid, but there were some
validation warnings as shown above.
```
(warnings are pre-existing `kubernetes_namespace` → `_v1` deprecation
notices, unrelated)

`terragrunt plan` (from `infra/stacks/mailserver/`):
```
  # module.mailserver.kubernetes_deployment.mailserver will be updated in-place
  # module.mailserver.kubernetes_service.mailserver will be updated in-place
Plan: 0 to add, 2 to change, 0 to destroy.
```
Both in-place updates are the known pre-existing drift
(volume_mount ordering + stale `metallb.io/ip-allocated-from-pool`
annotation). No change is attributable to this commit — the dead
variable was never referenced, so removing it leaves state untouched.

### Manual Verification

1. `cd infra/stacks/mailserver/modules/mailserver/`
2. `grep -c postfix_cf_reference variables.tf` → expected `0`
3. `wc -l variables.tf` → expected `39` (was `175`; 136 lines removed
   including the trailing blank after the EOT)
4. Open `variables.tf` → expected: only `variable "postfix_cf"` remains
5. `cd ../..` (stack root) → `terragrunt validate` → expected:
   `Success! The configuration is valid`
6. `terragrunt plan` → expected: `Plan: 0 to add, 2 to change, 0 to
   destroy.` (the 2 are the pre-existing drift, not from this commit).

Closes: code-o3q

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

											
										
										
											2026-04-19 00:03:42 +00:00
 								## TL;DR
 								**MetalLB does not and will not inject PROXY protocol headers.** The original plan
 								(`/home/wizard/.claude/plans/let-s-work-on-linking-temporal-valiant.md`, task
 								`code-rtb`) assumed MetalLB could be configured to emit PROXY v1/v2 on behalf of
 								the `mailserver` LoadBalancer Service. That assumption is wrong at the product
 								level. MetalLB is a control-plane-only announcer (ARP/NDP for L2 mode, BGP for
 								L3 mode); it never touches the L4 payload.
 								As a result, there is no single Terraform change that can flip
 								`externalTrafficPolicy: Local` → `Cluster` on the `mailserver` Service while
 								preserving the real client IP for Postfix/postscreen and Dovecot. Three
 								alternative paths exist (see below); none is trivial.
 								## Environment (verified 2026-04-18)
 								- **MetalLB version**: `quay.io/metallb/controller:v0.15.3` /
 								  `quay.io/metallb/speaker:v0.15.3` (5 speakers).
 								- **Advertisement type**: L2Advertisement `default` bound to IPAddressPool
 								  `default` (10.0.20.200–10.0.20.220). No BGPAdvertisements.
 								- **Service**: `mailserver/mailserver` — type `LoadBalancer`, `loadBalancerIPs:
 .0.20.202`, `externalTrafficPolicy: Local`,
 								  `healthCheckNodePort: 30234`, 5 ports (25, 465, 587, 993, 9166/dovecot-metrics).
 								- **Pod**: single replica today, RWO PVCs prevent horizontal scale without
 								  further work (`mailserver-data-encrypted`, `mailserver-letsencrypt-encrypted`).
 								## Why the original plan fails
 								### MetalLB never touches packets
 								> *"MetalLB is controlplane only, making it part of the dataplane means we
 								> would be responsible for the performance of the system, so more bugs to
 								> fight, I personally don't see that happening."*
 								> — MetalLB maintainer `champtar`, 2021-01-06
 								> (issue [#797 — Feature Request: Supporting Proxy Protocol v2](https://github.com/metallb/metallb/issues/797))
 								Issue #797 is closed as "won't implement". Repeat asks in 2022–2023 got the
 								same answer. The v0.15.3 API surface confirms this: no
 								`proxyProtocol`/`haproxy`/`protocol: proxy` field exists on `IPAddressPool`,
 								`L2Advertisement`, `BGPAdvertisement`, or as a Service annotation.
 								Only managed-cloud LBs (AWS NLB, Azure LB, OCI, DO, OVH, Scaleway, etc.) offer
 								PROXY protocol as a tick-box. MetalLB's equivalents are:
 								| MetalLB feature | Does it preserve client IP? | Comment |
 								|---|---|---|
 								| `externalTrafficPolicy: Local` (current) | Yes, via iptables DNAT on the speaker node | Forces pod↔speaker colocation on L2 mode. This is the pain we wanted to avoid. |
 								| `externalTrafficPolicy: Cluster` | No — kube-proxy SNATs to the node IP | The problem we would re-introduce if we flipped without PROXY injection. |
 								| PROXY protocol injection | N/A — not implemented | Dead end. |
 								### The `Local` trap is real, but narrower than it seems
 								Today's `Local` policy means the ARP announcer node must also host the mailserver
 								pod. MetalLB always picks a single speaker to advertise the VIP (leader
 								election per IP), so in practice exactly one node matters at any moment. A pod
 								rescheduled to a different node silently drops inbound SMTP/IMAP until a GARP
 								flip or node cordon.
 								The only pods on our cluster that see this same class of risk are Traefik
 								(3 replicas + PDB `minAvailable=2`, so 2 of 3 nodes always have a pod) and
 								mailserver (1 replica). Traefik survives because the pods outnumber the nodes
 								that could be the speaker at once; the mailserver cannot.
 								## Alternative paths (ranked by effort)
 								### Option A — Pin the mailserver pod to a specific node (SIMPLEST)
 								Add `nodeSelector` on the mailserver Deployment pointing at a label that's also
 								stamped on the MetalLB speaker we want to advertise the VIP from, and use
 								MetalLB's [node selector](https://metallb.io/configuration/_advanced_l2_configuration/#specify-network-interfaces-that-lb-ip-can-be-announced-from)
 								on `L2Advertisement.spec.nodeSelectors` to pin the announcer to the same node.
 								Trade-offs:
 								- Zero changes to Postfix/Dovecot configs.
 								- Keeps `externalTrafficPolicy: Local` — real client IP keeps arriving.
 								- Loses HA (the whole point of the MetalLB layer) but reflects reality — one
 								  replica, one PVC, no HA today anyway.
 								- Drain of that node requires a planned cutover, but that's no worse than
 								  today's silent failure mode.
 								Implementation (~10 lines of Terraform):
 								```hcl
 								# In stacks/mailserver/modules/mailserver/main.tf, on the Deployment:
 								node_selector = { "viktorbarzin.me/mailserver-anchor" = "true" }
 								# In stacks/platform (or wherever the MetalLB CRs live):
 								resource "kubernetes_manifest" "mailserver_l2ad" {
 								  manifest = {
 								    apiVersion = "metallb.io/v1beta1"
 								    kind       = "L2Advertisement"
 								    metadata   = { name = "mailserver", namespace = "metallb-system" }
 								    spec = {
 								      ipAddressPools = ["default"]
 								      nodeSelectors  = [{ matchLabels = { "viktorbarzin.me/mailserver-anchor" = "true" } }]
 								    }
 								  }
 								}
 								```
 								Plus a node label via `kubectl label node k8s-node3 viktorbarzin.me/mailserver-anchor=true`.
 								**Recommendation: this is the shortest path to eliminating the silent-drop
 								failure mode** without taking on a new proxy tier.
 								### Option B — Put a HAProxy sidecar in front of Postfix/Dovecot
 								Stand up an in-cluster HAProxy with PROXY v2 enabled on the frontend and
 								`send-proxy-v2` on the backend to `mailserver:25/465/587/993`. Expose HAProxy
 								via a new MetalLB Service with `externalTrafficPolicy: Cluster` + kube-proxy
 								DSR workaround (still loses client IP at that layer), or run HAProxy on the
 								host-network of the same node (back to Option A's colocation).
 								Trade-offs:
 								- Introduces one more network hop and TLS-termination decision for every
 								  SMTP connect.
 								- HAProxy needs its own cert rotation (or `tls-passthrough`) — adds moving
 								  parts to an already crowded mailserver module.
 								- Doesn't actually solve the colocation problem on its own — HAProxy itself
 								  needs to receive the client IP, so we are back to externalTrafficPolicy
 								  constraints for HAProxy.
 								**Recommendation: avoid unless we also get HA for mailserver itself, which
 								needs RWX storage + DB split-brain work — out of scope.**
 								### Option C — Replace MetalLB with a different LB for this Service
 								Candidates: [kube-vip](https://kube-vip.io/) (supports eBPF-based DSR but not
 								PROXY injection either), [Cilium LB](https://docs.cilium.io/en/stable/network/lb-ipam/)
 								(preserves client IP via DSR in hybrid mode), or a dedicated HAProxy running on
 								pfSense and NAT-forwarding 25/465/587/993 with PROXY headers to a
 								ClusterIP-exposed mailserver. Cilium requires a CNI migration (we run Calico
 								today); pfSense HAProxy is genuinely feasible but belongs in a different bd
 								task.
 								**Recommendation: track as P3 follow-up under a new bd task if Option A proves
 								insufficient.**
 								## Decision
 								Do nothing in this session beyond this runbook + the bd note. The `code-rtb`
 								task as written is not executable — MetalLB cannot inject PROXY headers, and
 								the Postfix/Dovecot config changes the plan proposed would not receive the
 								header they expect, they would hang waiting for it and then timeout (5s per
 								connection).
 								Follow-up work filed as bd child tasks (if user wants to pursue):
 								- **Option A — pin mailserver + L2Advertisement nodeSelectors** (new bd task)
 								- **Option C — HAProxy on pfSense with PROXY v2 to a ClusterIP** (new bd task)
 								## References
 								- [MetalLB issue #797 — Feature Request: Supporting Proxy Protocol v2](https://github.com/metallb/metallb/issues/797) (closed, won't implement)
 								- [MetalLB PR #796 — Source IP Preservation discussion](https://github.com/metallb/metallb/issues/796)
 								- Postfix [postscreen_upstream_proxy_protocol](https://www.postfix.org/postconf.5.html#postscreen_upstream_proxy_protocol) — expects the PROXY header *on every incoming connection*; if absent, postscreen drops after `postscreen_upstream_proxy_timeout`.
 								- Dovecot [haproxy_trusted_networks](https://doc.dovecot.org/settings/core/#core_setting-haproxy_trusted_networks) — treats the header as mandatory for listed source networks.
 								- Cluster state verified against: `kubectl -n metallb-system get pods`,
 								  `kubectl get ipaddresspools.metallb.io -A`,
 								  `kubectl get l2advertisements.metallb.io -A`,
 								  `kubectl get bgpadvertisements.metallb.io -A`,
 								  `kubectl -n mailserver get svc mailserver -o yaml`.