Two coordinated fixes for the same root cause: the Postfix smtpd_upstream_proxy_protocol
listener fatals on every HAProxy health probe with `smtpd_peer_hostaddr_to_sockaddr:
... Servname not supported for ai_socktype`. The Postfix master then throttles the
respawning daemon, and real client connections that land mid-respawn time out. From
inside the cluster this showed up as a ~50% timeout rate on public port 587.
Layer 1 (book-search) — stacks/ebooks/main.tf:
SMTP_HOST mail.viktorbarzin.me → mailserver.mailserver.svc.cluster.local
Internal services should use ClusterIP, not hairpin through pfSense+HAProxy.
12/12 OK in <28ms vs ~6/12 timeouts on the public path.
Layer 2 (pfSense HAProxy) — stacks/mailserver + scripts/pfsense-haproxy-bootstrap.php:
Add 3 non-PROXY healthcheck NodePorts to mailserver-proxy svc:
30145 → pod 25 (stock postscreen)
30146 → pod 465 (stock smtps)
30147 → pod 587 (stock submission)
HAProxy uses `port <healthcheck-nodeport>` (per-server in advanced field) to
redirect L4 health probes to those ports while real client traffic keeps
going to 30125-30128 with PROXY v2.
Result: 0 fatals/min (was 96), 30/30 probes OK on 587, e2e roundtrip 20.4s.
Check interval (`inter`) dropped from 120000 ms to 5000 ms now that the log-spam concern is gone.
`option smtpchk EHLO` was tried first but flapped against postscreen (multi-line
greet + DNSBL silence + anti-pre-greet detection trip HAProxy's parser → L7RSP).
Plain TCP accept-on-port check is sufficient for both submission and postscreen.
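For reference, HAProxy's bare `check` here is nothing more than a TCP connect with no
payload; a minimal Python sketch of the same L4 probe (node IP is illustrative, the
ports are the healthcheck NodePorts added above):
```
import socket

def l4_check(host: str, port: int, timeout: float = 3.0) -> bool:
    # A bare TCP connect: no banner read, no EHLO, no PROXY header.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True   # connection accepted -> backend counts as UP
    except OSError:
        return False      # refused / timed out -> backend counts as DOWN

for port in (30145, 30146, 30147):
    print(port, "UP" if l4_check("10.0.20.101", port) else "DOWN")
```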
Updated docs/runbooks/mailserver-pfsense-haproxy.md to reflect the new healthcheck
path and mark the "Known warts" entry as resolved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The 2026-05-02 change that added the Brevo defensive-unblock step
to the email-roundtrip-monitor cron contained an apostrophe in a
Python comment ("wasn't"). The whole script is wrapped in shell
single quotes (python3 -c '...'), so the apostrophe terminated
the shell string. Python only parsed up to the apostrophe and
raised IndentationError on the now-bodyless try: block; everything
after was handed to /bin/sh which complained about "try::" and
unmatched parens. Result: every probe run since 2026-05-02 00:41 UTC
crashed before it could push, and the "Email Roundtrip E2E" Uptime
Kuma push monitor went DOWN with "No heartbeat in the time window".
Fix: rewrite the comment without an apostrophe and add a banner
warning so the next person editing this heredoc does not regress.
Validated: shell parses (bash -n), Python compiles (py_compile)
with the wrapping single quotes intact.
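A minimal reproduction of the quoting hazard, using hypothetical source strings rather
than the real probe heredoc:
```
import subprocess

# The cron wraps the probe as:  python3 -c '<source>'
# Any apostrophe inside <source> terminates the shell single-quoted string.
broken = "try:\n    # this wasn't expected\n    pass\nexcept Exception:\n    pass"
fixed = "try:\n    # this was not expected\n    pass\nexcept Exception:\n    pass"

for label, src in (("broken", broken), ("fixed", fixed)):
    cmd = "python3 -c '" + src + "'"   # naive embedding, as the heredoc does
    rc = subprocess.run(["bash", "-c", cmd]).returncode
    print(label, "-> exit", rc)        # broken -> non-zero, fixed -> 0

# A cheap pre-flight guard the next editor could run before committing:
assert "'" not in fixed
```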
## Context (bd code-yiu)
With Phase 4+5 proven (external mail flows through pfSense HAProxy +
PROXY v2 to the alt PROXY-speaking container listeners), the MetalLB
LoadBalancer Service + `10.0.20.202` external IP + ETP:Local policy are
obsolete. Phase 6 decommissions them and documents the steady-state
architecture.
## This change
### Terraform (stacks/mailserver/modules/mailserver/main.tf)
- `kubernetes_service.mailserver` downgraded: `LoadBalancer` → `ClusterIP`.
- Removed `metallb.io/loadBalancerIPs = "10.0.20.202"` annotation.
- Removed `external_traffic_policy = "Local"` (irrelevant for ClusterIP).
- Port set unchanged — the Service still exposes 25/465/587/993 for
intra-cluster clients (Roundcube pod, `email-roundtrip-monitor`
CronJob) that hit the stock PROXY-free container listeners.
- Inline comment documents the downgrade rationale + companion
`mailserver-proxy` NodePort Service that now carries external traffic.
### pfSense (ops, not in git)
- `mailserver` host alias (pointing at `10.0.20.202`) deleted. No NAT
rule references it post-Phase-4; keeping it would be misleading dead
metadata. Reversible via WebUI + `php /tmp/delete-mailserver-alias.php`
companion script (ad-hoc, not checked in — alias is just a
Firewall → Aliases → Hosts entry).
### Uptime Kuma (ops)
- Monitors `282` and `283` (PORT checks) retargeted from `10.0.20.202`
→ `10.0.20.1`. Renamed to `Mailserver HAProxy SMTP (pfSense :25)` /
`... IMAPS (pfSense :993)` to reflect their new purpose (HAProxy
layer liveness). History retained (edit, not delete-recreate).
### Docs
- `docs/runbooks/mailserver-pfsense-haproxy.md` — fully rewritten
"Current state" section; now reflects steady-state architecture with
two-path diagram (external via HAProxy / intra-cluster via ClusterIP).
Phase history table marks Phase 6 ✅. Rollback section updated (no
one-liner post-Phase-6; need Service-type re-upgrade + alias re-add).
- `docs/architecture/mailserver.md` — Overview, Mermaid diagram, Inbound
flow, CrowdSec section, Uptime Kuma monitors list, Decisions section
(dedicated MetalLB IP → "Client-IP Preservation via HAProxy + PROXY
v2"), Troubleshooting all updated.
- `.claude/CLAUDE.md` — mailserver monitoring + architecture paragraph
updated with new external path description; references the new runbook.
## What is NOT in this change
- Removal of `10.0.20.202` from `cloudflare_proxied_names` or any
reserved-IP tracking — wasn't there to begin with. The
  `metallb-system default` IPAddressPool (10.0.20.200-220) now shows 2 assigned
  and 19 available, confirming `.202` went back to the pool.
- Phase 4 NAT-flip rollback scripts — kept on-disk, still valid if
someone re-introduces the MetalLB LB (see runbook "Rollback").
## Test Plan
### Automated (verified pre-commit 2026-04-19)
```
# Service is ClusterIP with no EXTERNAL-IP
$ kubectl get svc -n mailserver mailserver
mailserver ClusterIP 10.103.108.217 <none> 25/TCP,465/TCP,587/TCP,993/TCP
# 10.0.20.202 no longer answers ARP (ping from pfSense)
$ ssh admin@10.0.20.1 'ping -c 2 -t 2 10.0.20.202'
2 packets transmitted, 0 packets received, 100.0% packet loss
# MetalLB pool released the IP
$ kubectl get ipaddresspool default -n metallb-system \
-o jsonpath='{.status.assignedIPv4} of {.status.availableIPv4}'
2 of 19 available
# E2E probe — external Brevo → WAN:25 → pfSense HAProxy → pod — STILL SUCCEEDS
$ kubectl create job --from=cronjob/email-roundtrip-monitor probe-phase6 -n mailserver
... Round-trip SUCCESS in 20.3s ...
$ kubectl delete job probe-phase6 -n mailserver
# pfSense mailserver alias removed
$ ssh admin@10.0.20.1 'php -r "..." | grep mailserver'
(no output)
```
### Manual Verification
1. Visit `https://uptime.viktorbarzin.me` — monitors 282/283 green on new
hostname `10.0.20.1`.
2. Roundcube login works (`https://mail.viktorbarzin.me/`).
3. Send test email to `smoke-test@viktorbarzin.me` from Gmail — observe
`postfix/smtpd-proxy25/postscreen: CONNECT from [<Gmail-IP>]` in
mailserver logs within ~10s.
4. CrowdSec should still see real client IPs in postfix/dovecot parsers
(verify with `cscli alerts list` on next auth-fail event).
## Phase history (bd code-yiu)
| Phase | Status | Description |
|---|---|---|
| 1a | ✅ `ef75c02f` | k8s alt :2525 listener + NodePort Service |
| 2 | ✅ 2026-04-19 | pfSense HAProxy pkg installed |
| 3 | ✅ `ba697b02` | HAProxy config persisted in pfSense XML |
| 4+5 | ✅ `9806d515` | 4-port alt listeners + HAProxy frontends + NAT flip |
| 6 | ✅ **this commit** | MetalLB LB retired; 10.0.20.202 released; docs updated |
Closes: code-yiu
## Context (bd code-yiu)
Toward replacing MetalLB ETP:Local + pod-speaker colocation with pfSense
HAProxy injecting PROXY v2 → mailserver. This commit lays the k8s-side
groundwork for port 25 only. External SMTP flow post-cutover:
Client → pfSense WAN:25 → pfSense HAProxy (injects PROXY v2) → k8s-node:30125
(NodePort for mailserver-proxy Service, ETP:Cluster) → kube-proxy → pod :2525
(postscreen with postscreen_upstream_proxy_protocol=haproxy) → real client IP
recovered from PROXY header despite kube-proxy SNAT.
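A speculative hand-rolled smoke test for that path, sending a PROXY v1 header from
Python (assumes postscreen's `haproxy` support accepts the text v1 form as well as v2;
node IP and spoofed client address are illustrative):
```
import socket

node, nodeport = "10.0.20.101", 30125   # any k8s node + the mailserver-proxy NodePort
with socket.create_connection((node, nodeport), timeout=10) as s:
    # Spoofed client 203.0.113.7; postscreen should log CONNECT from that address,
    # not the node IP, because this listener trusts the PROXY header.
    s.sendall(b"PROXY TCP4 203.0.113.7 10.0.20.101 40000 25\r\n")
    print(s.recv(1024).decode(errors="replace"))   # expect postscreen's 220 banner
```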
Internal clients (Roundcube, email-roundtrip-monitor) keep using the stock
:25 on mailserver.svc ClusterIP — no PROXY required, zero regression.
## This change
- New `kubernetes_config_map.mailserver_user_patches` with a
`user-patches.sh` script. docker-mailserver runs
`/tmp/docker-mailserver/user-patches.sh` on startup; our script appends a
`2525 postscreen` entry to `master.cf` with
`-o postscreen_upstream_proxy_protocol=haproxy` and a 5s PROXY timeout.
Sentinel-guarded for idempotency on in-place restart.
- New volume + volume_mount (`mode = 0755` via defaultMode) wires the
ConfigMap into the mailserver container.
- New container port spec for 2525 (informational; kube-proxy resolves
targetPort by number anyway).
- New Service `mailserver-proxy` — NodePort type, ETP:Cluster, selector
`app=mailserver`, port 25 → targetPort 2525 → fixed nodePort 30125.
pfSense HAProxy's backend pool will be `<all k8s node IPs>:30125 check
send-proxy-v2`.
The existing `mailserver` LoadBalancer Service (ETP:Local, 10.0.20.202,
ports 25/465/587/993) is untouched. Traffic still flows through it via the
pfSense NAT `<mailserver>` alias; this commit does not change routing.
## What is NOT in this change
- pfSense HAProxy install/config (Phase 2 — out-of-Terraform, runbook-managed)
- pfSense NAT rdr flip from `<mailserver>` → HAProxy VIP (Phase 4)
- 465/587/993 — scoped to port 25 first for proof of concept. Other ports
get the same treatment (alt listeners 4465/5587/10993 + Service ports)
once 25 is proven.
- Dovecot per-listener `haproxy = yes` — irrelevant until IMAP is migrated.
## Test Plan
### Automated (verified pre-commit)
```
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
postconf -M | grep '^2525'
2525 inet n - y - 1 postscreen \
-o syslog_name=postfix/smtpd-proxy \
-o postscreen_upstream_proxy_protocol=haproxy \
-o postscreen_upstream_proxy_timeout=5s
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
ss -ltn | grep -E ':25\b|:2525'
LISTEN 0 100 0.0.0.0:2525 0.0.0.0:*
LISTEN 0 100 0.0.0.0:25 0.0.0.0:*
$ kubectl get svc -n mailserver mailserver-proxy
NAME TYPE CLUSTER-IP PORT(S) AGE
mailserver-proxy NodePort 10.98.213.164 25:30125/TCP 93s
# Expected-to-fail probe (no PROXY header) → postscreen rejects
$ timeout 8 nc -v 10.0.20.101 30125 </dev/null
Connection to 10.0.20.101 30125 port [tcp/*] succeeded!
421 4.3.2 No system resources
```
### Manual Verification (after Phase 2 — pfSense HAProxy)
Once HAProxy on pfSense is configured to listen on alt port :2525 (not the
real :25 yet) and targets `k8s-nodes:30125` with `send-proxy-v2`:
1. From an external host: `swaks --to smoke-test@viktorbarzin.me
--server <pfsense-ip>:2525 --body "phase 1 test"`
2. In mailserver logs: `kubectl logs -c docker-mailserver deployment/mailserver
| grep postfix/smtpd-proxy` — "connect from [<external-ip>]" with the real
public IP, NOT the k8s node IP.
3. E2E probe CronJob keeps green (uses ClusterIP path, unaffected).
## Reproduce locally
1. `kubectl get svc mailserver-proxy -n mailserver` → NodePort 30125 exists
2. `kubectl get cm mailserver-user-patches -n mailserver` → exists
3. `timeout 8 nc -v <k8s-node> 30125 </dev/null` → "421 4.3.2 No system resources"
(postscreen rejecting malformed PROXY)
## Context
code-vnc confirmed `viktorbarzin/dovecot_exporter` cannot produce real
metrics against docker-mailserver 15.0.0's Dovecot 2.3.19 — the
exporter speaks the pre-2.3 `old_stats` FIFO protocol, which Dovecot
2.3 deprecated in favour of `service stats` + `doveadm-server` with
a different wire format. The scrape only ever returned
`dovecot_up{scope="user"} 0`.
code-1ik listed two paths: (a) switch to a Dovecot 2.3+ exporter, or
(b) retire the exporter + scrape + alerts. Picking (b) — carrying a
no-op exporter + scrape + alert group taxes cluster resources,
clutters Prometheus /targets, and tees up an alert that can never
fire correctly. If a future session needs real Dovecot stats, reach
for a known-good exporter (e.g., jtackaberry/dovecot_exporter) and
rebuild this scaffolding.
## This change
### mailserver stack
- Removes the `dovecot-exporter` container from
`kubernetes_deployment.mailserver` (was ~28 lines). Pod now
runs a single `docker-mailserver` container.
- Removes `kubernetes_service.mailserver_metrics` (ClusterIP Service
added in code-izl). The `mailserver` LoadBalancer (ports 25, 465,
587, 993) is unaffected.
- Drops the dovecot.cf comment documenting the failed code-vnc
attempt — the documentation survives here + in bd code-vnc /
code-1ik.
### monitoring stack
- Removes `job_name: 'mailserver-dovecot'` from `extraScrapeConfigs`.
- Removes the `Mailserver Dovecot` PrometheusRule group
(`DovecotConnectionsNearLimit`, `DovecotExporterDown`).
- Inline comments in both files point future work at code-1ik's
decision record.
Prometheus configmap-reload picked up the change; scrape target set
now has zero entries for `mailserver-dovecot`. Pod rolled cleanly to
1/1 Running.
## What is NOT in this change
- No replacement exporter — deliberate. The alert that was removed
was a false-signal alert; its removal returns cluster alerting to
a correct, lower-noise state.
- mailserver MetalLB Service + SMTP/IMAP ports — unchanged.
- `auth_failure_delay`, `mail_max_userip_connections` — stay; those
are unrelated to stats export.
## Test Plan
### Automated
```
$ kubectl get pod -n mailserver -l app=mailserver
NAME READY STATUS RESTARTS AGE
mailserver-78589bfd95-swz6h 1/1 Running 0 49s
$ kubectl get svc -n mailserver
NAME TYPE PORT(S)
mailserver LoadBalancer 25/TCP,465/TCP,587/TCP,993/TCP
roundcubemail ClusterIP 80/TCP
# mailserver-metrics gone
$ kubectl exec -n monitoring <prom-pod> -c prometheus-server -- \
wget -qO- 'http://localhost:9090/api/v1/targets?scrapePool=mailserver-dovecot'
{"status":"success","data":{"activeTargets":[]}}
```
### Manual Verification
1. E2E probe `email-roundtrip-monitor` keeps succeeding (20-min cadence)
2. `EmailRoundtripFailing` stays green — proves IMAP is healthy even
without the exporter signal
3. Prometheus `/alerts` page no longer shows DovecotConnectionsNearLimit
or DovecotExporterDown
Closes: code-1ik
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
bd code-vnc investigated why `viktorbarzin/dovecot_exporter` only
exposed `dovecot_up{scope="user"} 0`. Root cause: the exporter speaks
the legacy pre-2.3 `old_stats` FIFO wire protocol. docker-mailserver
15.0.0 ships Dovecot 2.3.19, which moved to `service stats` with a
different architecture — `doveadm stats dump` on the old-stats
unix_listener returns "Failed to read VERSION line" and the exporter
loops on "Input does not provide any columns".
Attempted fix: enabled `old_stats` plugin via `mail_plugins` +
declared `service old-stats { unix_listener stats-reader }`. Socket
was created but protocol incompatibility made it useless. Reverted.
## This change
- Reverts the attempted dovecot.cf additions
- Adds a comment in the dovecot.cf heredoc explaining why we
deliberately do NOT enable old_stats here
- `auth_failure_delay = 5s` (code-9mi) and
`mail_max_userip_connections = 50` stay — they're unrelated to
stats
## What is NOT in this change
- A replacement exporter — filed as follow-up bd code-1ik with
two paths: switch to jtackaberry/dovecot_exporter, or retire the
exporter+scrape+alert entirely
- The `mailserver-metrics` ClusterIP Service (from code-izl) —
kept; it will be useful for whichever path code-1ik chooses
## Test Plan
### Automated
```
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
supervisorctl status dovecot postfix
dovecot RUNNING pid 1022, uptime 0:00:27
postfix RUNNING pid 1063, uptime 0:00:26
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
```
### Manual Verification
Dovecot config returns to baseline + auth_failure_delay. Mail continues
to flow (E2E probe continues to succeed via `email-roundtrip-monitor`).
Closes: code-vnc
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The mailserver container had `capabilities.add = ["NET_ADMIN"]`. Upstream
docker-mailserver docs say the capability is only needed by Fail2ban to
run iptables ban actions. Fail2ban is DISABLED in this stack
(`ENABLE_FAIL2BAN=0`, see line ~68) — CrowdSec owns the brute-force
policy at the LB layer. The capability was therefore unused ballast and
a minor attack-surface reduction opportunity. Addresses code-4mu.
## This change
Replaces the explicit `capabilities { add = ["NET_ADMIN"] }` block with
an empty `security_context {}`. Post-rollout verification
(`supervisorctl status`) confirms every service we actually run is
healthy — dovecot, postfix, rspamd, rsyslog, postsrsd, changedetector,
cron, mailserver. Every STOPPED entry was already disabled.
The inline comment documents the revert trigger: check
`kubectl logs -c docker-mailserver` for permission-denied patterns and
restore the capability if observed.
## Test Plan
### Automated
```
$ kubectl get pod -n mailserver -l app=mailserver -o jsonpath='{.items[0].spec.containers[?(@.name=="docker-mailserver")].securityContext}'
{"allowPrivilegeEscalation":true,"privileged":false,"readOnlyRootFilesystem":false,"runAsNonRoot":false}
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
supervisorctl status | grep RUNNING
changedetector RUNNING ...
cron RUNNING ...
dovecot RUNNING ...
mailserver RUNNING ...
postfix RUNNING ...
postsrsd RUNNING ...
rspamd RUNNING ...
rsyslog RUNNING ...
```
### Observation window
EmailRoundtripFailing + EmailRoundtripStale alerts continue to run
every 20 min. If no alert fires in the 24h post-rollout window
(through ~2026-04-20 10:40 UTC), the change is considered safe and
this commit stands. Otherwise revert this commit.
## What is NOT in this change
- readOnlyRootFilesystem (separate hardening, out of scope)
- runAsNonRoot (docker-mailserver needs root for Postfix)
- Removing privilege-escalation defaults (container needs those for
chowning mail spool at startup)
Closes: code-4mu
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Port 9166 (`dovecot-metrics`) was exposed on the public MetalLB
LoadBalancer 10.0.20.202 alongside SMTP/IMAP. While only LAN-routable,
shipping an internal metric on the same listening IP as external mail
conflated two concerns and over-exposed the port. Prometheus was
scraping via the same LB Service. Addresses code-izl (follow-up to
code-61v which added the scrape job).
## This change
### mailserver stack
- Drops `dovecot-metrics` port from `kubernetes_service.mailserver`
(LoadBalancer stays: 25, 465, 587, 993).
- Adds new `kubernetes_service.mailserver_metrics` — ClusterIP-only,
selecting the same `app=mailserver` pod, exposing 9166.
### monitoring stack
- Updates `extraScrapeConfigs` in the Prometheus chart values to
target the new `mailserver-metrics.mailserver.svc.cluster.local:9166`
instead of `mailserver.mailserver.svc.cluster.local:9166`.
- helm_release.prometheus updated in-place; configmap-reload sidecar
picked up the new target within 10s.
```
mailserver LB mailserver-metrics ClusterIP
┌──────────────────┐ ┌──────────────────┐
│ 25 smtp │ │ 9166 dovecot- │
│ 465 smtp-secure │ │ metrics │ ← Prometheus only
│ 587 smtp-auth │ └──────────────────┘
│ 993 imap-secure │
└──────────────────┘
↑ 10.0.20.202
```
## What is NOT in this change
- Per-Service RBAC/NetworkPolicy tightening (separate task)
- Moving the metrics port to a dedicated sidecar-only Service Monitor
(ServiceMonitor CRDs not installed; extraScrapeConfigs is correct
for the prometheus-community chart in use)
## Test Plan
### Automated
```
$ kubectl get svc -n mailserver
mailserver LoadBalancer 10.0.20.202 25/TCP,465/TCP,587/TCP,993/TCP
mailserver-metrics ClusterIP 10.100.102.174 9166/TCP
$ kubectl get endpoints -n mailserver mailserver-metrics
mailserver-metrics 10.10.169.163:9166
$ # Prometheus target (after 10s configmap-reload)
$ kubectl exec -n monitoring <prom-pod> -c prometheus-server -- \
wget -qO- 'http://localhost:9090/api/v1/targets?scrapePool=mailserver-dovecot'
scrapeUrl: http://mailserver-metrics.mailserver.svc.cluster.local:9166/metrics
health: up
```
### Manual Verification
1. From a host outside the cluster: `nc -vz 10.0.20.202 9166` → connection refused
2. Prometheus UI `/targets` → `mailserver-dovecot` UP, labels show new DNS name
3. PromQL: `up{job="mailserver-dovecot"}` returns `1`
Closes: code-izl
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The `postfix-accounts.cf` ConfigMap renders `bcrypt(pass, 6)` for each
user in `var.mailserver_accounts`. bcrypt generates a fresh salt on
every evaluation → the ConfigMap `data` hash line differs every plan
run. `ignore_changes = [data["postfix-accounts.cf"]]` was the pragmatic
workaround, but the side-effect wasn't documented: a Vault rotation of
a mailserver password would be MASKED by ignore_changes — TF would
never push the new hash and the pod would keep accepting the old
password until manual taint/replace.
Addresses bd code-7ns.
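The nondeterminism is easy to demonstrate outside Terraform; a small sketch using the
Python `bcrypt` package (made-up password, same cost factor 6):
```
import bcrypt

pw = b"example-password"
h1 = bcrypt.hashpw(pw, bcrypt.gensalt(rounds=6))   # same cost factor as bcrypt(pass, 6)
h2 = bcrypt.hashpw(pw, bcrypt.gensalt(rounds=6))
print(h1 != h2)                                    # True: fresh salt, new hash every plan
print(bcrypt.checkpw(pw, h1), bcrypt.checkpw(pw, h2))  # True True: both hashes still verify
```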
## This change
Inline comment on the lifecycle block spelling out:
- Why ignore_changes exists (non-deterministic bcrypt)
- What the invariant costs (masks automatic rotation)
- Why it's acceptable TODAY (no automatic rotation for
mailserver_accounts — verified in Vault; manual password change is a
manual TF run anyway)
- Two concrete alternatives if rotation is ever added:
(a) deterministic bcrypt with stable per-user salt
(b) render from an ESO-synced K8s Secret
No code change, no apply needed — this is a comment-only commit. The
decision (live-with + document) is one of the three options in the plan.
## What is NOT in this change
- Deterministic hashing (not needed until automatic rotation exists)
- ESO-driven Secret (same reason)
- Removal of ignore_changes (would cause the original drift flap)
## Test Plan
### Automated
```
$ cd stacks/mailserver && /home/wizard/code/infra/scripts/tg plan
# no diff expected on this comment-only change; other drift remains
# but is pre-existing and out of scope.
```
### Manual Verification
Read the new comment block at `stacks/mailserver/modules/mailserver/
main.tf` around the postfix-accounts-cf lifecycle — comprehensible
without session context.
Closes: code-7ns
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Dovecot's `dovecot.cf` block previously set only
`mail_max_userip_connections = 50`. No equivalent of the SMTP rate
limit existed for IMAP auth — brute-force against IMAP/POP auth was
throttled only by CrowdSec at the LB level. Adding an in-process
auth delay is cheap defense in depth. Addresses code-9mi.
## This change
Adds `auth_failure_delay = 5s` to the dovecot.cf ConfigMap key.
Each failed auth attempt now pauses 5s before responding; a sequential
1000-entry dictionary attack stretches from under a second to roughly
83 minutes (1000 × 5s), giving CrowdSec's ban window ample time to kick in first.
## What is NOT in this change
- `login_processes_count` tuning (workload doesn't warrant it yet)
- Equivalent SMTP AUTH delay (CrowdSec already covers, and SMTP AUTH
is rate-limited via `smtpd_client_connection_rate_limit`)
## Test Plan
### Automated
```
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
doveconf -n | grep -E 'auth_failure|mail_max_userip'
auth_failure_delay = 5 secs
mail_max_userip_connections = 50
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
```
### Manual Verification
1. `openssl s_client -connect mail.viktorbarzin.me:993`
2. `a1 LOGIN bogus@viktorbarzin.me wrongpass` — expect ~5s delay before `NO [AUTHENTICATIONFAILED]`
3. Fire 5 failed attempts rapidly: total ≥25s
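The same check scripted with stdlib `imaplib` (bogus credentials as in the steps above):
```
import imaplib
import time

start = time.monotonic()
conn = imaplib.IMAP4_SSL("mail.viktorbarzin.me", 993)
try:
    conn.login("bogus@viktorbarzin.me", "wrongpass")
except imaplib.IMAP4.error:
    pass                                   # rejection is expected; we only time it
finally:
    conn.shutdown()
print(f"failed login took {time.monotonic() - start:.1f}s")  # expect >= ~5s
```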
## Reproduce locally
1. `kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- doveconf -n | grep auth_failure`
2. Expected: `auth_failure_delay = 5 secs`
Closes: code-9mi
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
docker-mailserver 15.0.0's default Postfix config does NOT set
`smtpd_tls_auth_only = yes`. Clients that skip STARTTLS on port 587
(or 25 with AUTH) can send PLAIN/LOGIN creds in cleartext. CrowdSec
and rate limiting don't catch this; it's an auth-path leak, not a
brute-force. Addresses bd code-vnw.
## This change
Adds `smtpd_tls_auth_only = yes` to `postfix_cf` (applied via the
`postfix-main.cf` ConfigMap key consumed by docker-mailserver).
Rolled the pod to pick up the new ConfigMap.
### Deviation from task spec
code-vnw's fix field cited `smtpd_sasl_auth_only = yes`. That is NOT
a real Postfix parameter — attempting it gets
`postconf: warning: smtpd_sasl_auth_only: unknown parameter`. The
acceptance test (reject PLAIN auth before STARTTLS) is satisfied by
`smtpd_tls_auth_only`, which is the correct knob. Added an inline
comment noting the common confusion.
## What is NOT in this change
- Per-service override in master.cf (smtpd_tls_auth_only applied
globally, which is safe because port 25 doesn't accept AUTH here)
- Other Postfix hardening (sender_restrictions, etc.)
## Test Plan
### Automated
```
$ kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- \
postconf smtpd_tls_auth_only
smtpd_tls_auth_only = yes
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
```
### Manual Verification
1. Open a plaintext session (no TLS yet): `nc mail.viktorbarzin.me 587`, then `EHLO test`
2. Send `AUTH PLAIN <base64>` BEFORE issuing `STARTTLS`
3. Expected: Postfix rejects with `503 5.5.1 Error: authentication not enabled`
4. Follow-up: reconnect with `openssl s_client -connect mail.viktorbarzin.me:587 -starttls smtp`, then `AUTH PLAIN <base64>` succeeds for valid creds
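The same two checks scripted with stdlib `smtplib` (throwaway credentials; expected
responses per the steps above):
```
import base64
import smtplib

creds = base64.b64encode(b"\0bogus@viktorbarzin.me\0wrongpass").decode()

s = smtplib.SMTP("mail.viktorbarzin.me", 587, timeout=15)
s.ehlo()
print("AUTH offered pre-TLS:", s.has_extn("auth"))   # expect False
print(s.docmd("AUTH", "PLAIN " + creds))             # expect 503 authentication not enabled
s.starttls()
s.ehlo()
print("AUTH offered post-TLS:", s.has_extn("auth"))  # expect True
s.quit()
```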
## Reproduce locally
1. From a shell with `kubectl` access to the cluster:
2. `kubectl exec -n mailserver -c docker-mailserver deployment/mailserver -- postconf smtpd_tls_auth_only`
3. Expected: `smtpd_tls_auth_only = yes`
Closes: code-vnw
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
`viktorbarzin/dovecot_exporter:latest` was consumed with `IfNotPresent`
pull, which means whichever node landed the pod kept whatever digest
was cached from an earlier pull. A SHA-level pin is the reproducibility
baseline this repo uses for every other home-built image
(`headscale`, `excalidraw`, `linkwarden`).
## This change
- Pins `dovecot-exporter` container image to
`viktorbarzin/dovecot_exporter@sha256:1114224c...` — the digest the
pod is actually running today (captured from live `imageID`).
- Enables Diun tag watching on the mailserver Deployment
(`diun.enable=true`, `diun.include_tags=^latest$`) so new `:latest`
digests trigger a notification rather than silently landing on the
next `IfNotPresent` miss.
Deviation from task spec (code-cno): the task asked for an 8-char SHA
*tag*, but Docker Hub only publishes `:latest` for this image — a SHA
tag doesn't exist. Used the digest-pin pattern already established at
`stacks/headscale/modules/headscale/main.tf:204` instead; Diun watches
the `:latest` tag for drift, which is the equivalent notification.
## What is NOT in this change
- Volume-mount ordering drift on `kubernetes_deployment.mailserver`
(pre-existing; tolerated by Waves 1+2).
- Splitting the metrics port into its own Service (code-izl).
## Test Plan
### Automated
```
$ kubectl get pod -n mailserver -l app=mailserver \
-o jsonpath='{.items[0].spec.containers[*].image}'
docker.io/mailserver/docker-mailserver:15.0.0 \
viktorbarzin/dovecot_exporter@sha256:1114224c9bf0261ca8e9949a6b42d3c5a2c923d34ca4593f6b62f034daf14fc5
$ kubectl get deployment -n mailserver mailserver \
-o jsonpath='{.spec.template.metadata.annotations}'
{"diun.enable":"true","diun.include_tags":"^latest$"}
$ kubectl rollout status deployment/mailserver -n mailserver
deployment "mailserver" successfully rolled out
```
### Manual Verification
1. Push a new `:latest` digest to the exporter image (or wait for one).
2. Check Diun notifier output: a tag event for `^latest$` should fire.
3. `kubectl describe deployment/mailserver -n mailserver` shows the
digest pin unchanged until someone rebumps it.
## Reproduce locally
1. `kubectl -n mailserver get pod -l app=mailserver -o yaml | \
grep -A1 dovecot_exporter`
2. Expected: `image: viktorbarzin/dovecot_exporter@sha256:1114224c...`.
Closes: code-cno
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The @viktorbarzin.me catch-all routes to spam@viktorbarzin.me. The
mailbox had no retention policy. On 2026-04-18 it held 519 messages
consuming 43 MiB. Without a policy, the only brake on growth was
manual deletion, which has not been happening - hence the bd task.
Viktor's explicit constraint when filing code-oy4: DO NOT blind
age-expunge. We need targeted retention that keeps genuine forwarded
human mail for a long time while shedding the recurring-newsletter
cruft that dominates the byte count.
## Profile findings (2026-04-18, verified on the live pod)
Total: 519 messages, 43 MiB, 0 in new/, 0 in tmp/.
Top senders by volume:
138 dan@tldrnewsletter.com
51 hi@ratepunk.com
40 uber@uber.com
35 truenas@viktorbarzin.me
19 ubereats@uber.com
15 hello@travel.jacksflightclub.com
12 chris@chriswillx.com
10 me@viktorbarzin.me
Top senders by storage bytes:
8,176,481 dan@tldrnewsletter.com (19 % of 43 MiB alone)
2,866,104 uber@uber.com
2,207,458 noreply@mail.selfh.st
2,066,094 hi@ratepunk.com
1,675,435 ubereats@uber.com
Age distribution:
97 % older than 14 days (502 / 519)
23 % older than 90 days (121 / 519)
Automated-sender markers:
66 % carry List-Unsubscribe: (342 / 519)
4 % carry Precedence: bulk|list|junk ( 21 / 519)
34 % carry neither marker (= human-ish tail) (177 / 519)
Combined "automated AND >14d": 328 messages -> target of rule 1.
## Retention strategy
Signed off by Viktor 2026-04-18. Two rules, both delete-leaf:
1. Older than 14 days AND header matches one of:
- `^List-Unsubscribe:`
- `^Precedence:\s*(bulk|list|junk)`
- `^Auto-Submitted:\s*auto-`
-> DELETE.
Rationale: these markers are the RFC-agreed indicators of bulk /
robotic senders. A 14-day window still lets genuine subscription
alerts (delivery, flight, calendar invite) come to attention.
2. Older than 90 days AND no automated marker at all
-> DELETE.
Rationale: these are long-tail forwards from real people to the
catch-all. 90 days is deliberately generous - I would rather
leak bytes than lose Viktor's personal correspondence.
3. Everything else -> KEEP (recent traffic, or aged human tail
younger than 90d).
## Implementation
A `kubernetes_cron_job_v1.spam_retention` running every 4h (at :17
past) that `kubectl exec`s a Python retention script into the
mailserver pod.
Why kubectl exec and not a sibling CronJob with the Maildir mounted:
mailserver-data-encrypted is a RWO volume held by the mailserver
pod. A sibling would fail to attach. The nextcloud-watchdog pattern
in stacks/nextcloud/main.tf already solves this for a similar
"interact with the live pod on a schedule" shape. Mirrored here with
its own SA + Role + RoleBinding scoped to list/get pods and create
pods/exec in the mailserver namespace only.
Why Python and not pure shell: POSIX `find + stat + awk` struggles
with the header-scan-up-to-blank-line rule, and `stat -c` is Linux-
GNU-specific anyway. The script reads each message's first 64 KiB,
stops at the first blank line, scans headers only, then checks mtime.
The CronJob streams the Python source via `kubectl exec -i ... --
python3 - <<PYEOF`. After the retention pass, `doveadm force-resync
-u spam@viktorbarzin.me INBOX/spam` refreshes Dovecot's cached index
so the deletions appear in IMAP immediately instead of after the next
pod restart.
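The decision core of the retention pass, sketched in Python under the rules above (not
the literal heredoc source; the metric counters and the doveadm resync are omitted):
```
import os
import re
import time

MAILDIR = "/var/mail/viktorbarzin.me/spam/cur"
AUTOMATED = re.compile(
    rb"^(List-Unsubscribe:|Precedence:\s*(bulk|list|junk)|Auto-Submitted:\s*auto-)",
    re.I | re.M,
)
now = time.time()

def headers_of(path):
    # Read at most 64 KiB and stop at the first blank line (end of headers).
    with open(path, "rb") as f:
        head = f.read(65536)
    return head.split(b"\r\n\r\n", 1)[0].split(b"\n\n", 1)[0]

for name in os.listdir(MAILDIR):
    path = os.path.join(MAILDIR, name)
    age_days = (now - os.stat(path).st_mtime) / 86400
    automated = bool(AUTOMATED.search(headers_of(path)))
    if automated and age_days > 14:         # rule 1: bulk/robotic senders
        os.unlink(path)
    elif not automated and age_days > 90:   # rule 2: long-tail human forwards
        os.unlink(path)
    # rule 3: everything else is kept
```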
Includes the standard KYVERNO_LIFECYCLE_V1 marker on the CronJob so
Kyverno ndots mutation does not cause perpetual drift.
## What is NOT in this change
- Dovecot sieve rules (no sieve infrastructure exists in the module;
the plan file's fallback option was precisely this CronJob path).
- Push of retention metrics to Pushgateway - the script prints them
to the job log for now; plumbing Pushgateway is a follow-up if
Viktor wants alerts.
- Any touch of other mailboxes - only `/var/mail/viktorbarzin.me/spam/cur`
is walked.
- Any mailserver pod restart or config reload.
## Test plan
### Automated
`terraform fmt` + `terragrunt hclfmt` pass. `scripts/tg plan` on the
mailserver stack shows:
Plan: 7 to add, 3 to change, 0 to destroy.
Of the 7 adds, 4 are mine (SA + Role + RoleBinding + CronJob). The
other 3 adds belong to the concurrent roundcube-backup CronJob +
nfs_roundcube_backup_host PV + PVC already on master in parallel.
The 3 in-place updates are pre-existing drift on the mailserver
Deployment, Service and email_roundtrip_monitor CronJob, not
introduced by this change.
### Manual Verification
After `scripts/tg apply` lands the CronJob:
1. Trigger an immediate run:
`kubectl -n mailserver create job --from=cronjob/spam-retention manual-1`
2. Wait for completion, read the log:
`kubectl -n mailserver logs job/manual-1`
-> expected tail:
spam_retention_scanned_total <N>
spam_retention_auto_deleted_total <M>
spam_retention_human_deleted_total <H>
spam_retention_kept_total <K>
spam_retention_errors_total 0
Retention pass complete
3. Confirm mailbox shrunk:
`kubectl -n mailserver exec deploy/mailserver -c docker-mailserver \
-- du -sh /var/mail/viktorbarzin.me/spam/`
-> expected: well below 43 MiB within one run (bulk rule alone
purges ~328 messages per the profile numbers above).
4. Confirm IMAP reflects the deletions:
`kubectl -n mailserver exec deploy/mailserver -c docker-mailserver \
-- doveadm mailbox status -u spam@viktorbarzin.me messages INBOX/spam`
-> expected: message count dropped accordingly.
5. 4 hours later, confirm the next scheduled run logs a much
smaller scan count and 0 deletions (nothing new crossed the
threshold).
Closes: code-oy4
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Roundcube webmail runs with two encrypted RWO PVCs (see roundcubemail.tf:
`roundcubemail-html-encrypted`, `roundcubemail-enigma-encrypted`). These
carry user-visible state that is NOT regenerable without user action:
- `html` PVC → Apache docroot, plugin installs, skin overrides, session
artefacts (two_factor_webauthn keys, persistent_login tokens, rcguard
throttle state)
- `enigma` PVC → user-uploaded PGP private keyrings
Per the subdir CLAUDE.md "Storage & Backup Architecture" rule every
proxmox-lvm* PVC MUST have a backup CronJob writing to NFS
`/mnt/main/<app>-backup/`. Mailserver already complies via code-z26's
`mailserver-backup` CronJob; Roundcube does not. Losing either Roundcube
PVC means users must re-add 2FA devices, re-install plugins, and
re-import PGP keys — none of it recoverable from a database dump.
Target task: `code-1f6`.
## This change
- Adds `module.nfs_roundcube_backup_host` sourcing
`modules/kubernetes/nfs_volume` pointed at
`/srv/nfs/roundcube-backup` on the Proxmox host (NFSv4, inotify
change-tracker picks it up for Synology offsite).
- Adds `kubernetes_cron_job_v1.roundcube-backup`:
- Schedule `10 3 * * *` — 10 minutes after `mailserver-backup`
(`0 3 * * *`) to avoid NFS write-window contention. Roundcube PVCs
are tiny (<200 MiB combined on current cluster) so the window is
well under 10 min.
- `pod_affinity` on `app=roundcubemail` (Roundcube runs 1 replica with
`Recreate` strategy on a fresh node per pod; the backup pod must
co-locate because both PVCs are RWO).
- `rsync -aH --delete --link-dest=/backup/<prev-week>` into
`/backup/<YYYY-WW>/{html,enigma}/` — hardlinks unchanged files vs
the previous weekly snapshot, keeping storage cost ~= delta only.
- Weekly rotation retains 8 snapshots (~2 months), matching
`mailserver-backup`.
- Pushgateway metrics under `job=roundcube-backup` so existing
`BackupDurationHigh` / `BackupStale` alert patterns detect
regressions without extra wiring.
- `KYVERNO_LIFECYCLE_V1` `ignore_changes` for mutated `dns_config`.
## Layout
```
NFS server 192.168.1.127:/srv/nfs/
├── mailserver-backup/ (0 3 * * * — code-z26)
│ └── <YYYY-WW>/{data,state,log}/
└── roundcube-backup/ (10 3 * * * — this change)
└── <YYYY-WW>/{html,enigma}/
```
## What is NOT in this change
- Changing the mailserver-backup CronJob to also cover Roundcube. Two
separate CronJobs keep the concerns (and pod anti-affinity/affinity)
clean; the 10-min stagger eliminates the contention justification for
merging them.
- Retention alerting tuning — existing Pushgateway/Prometheus rule
ecosystem suffices for now.
- Restore tooling — follows the standard pattern in
`docs/runbooks/` (rsync back, fix perms).
## Reproduce locally
1. Plan: `cd stacks/mailserver && scripts/tg plan -lock=false` →
2 new resources (nfs_volume module + CronJob).
2. Apply, then trigger a one-shot run:
`kubectl -n mailserver create job --from=cronjob/roundcube-backup roundcube-backup-manual-1`
3. Expected on success:
- `kubectl -n mailserver logs job/roundcube-backup-manual-1` → "=== Backup IO Stats ===".
- On Proxmox host:
`ls /srv/nfs/roundcube-backup/$(date +%Y-%W)/` → `html`, `enigma`.
- `/mnt/backup/.nfs-changes.log` (Proxmox) lists fresh paths under
`roundcube-backup/` within ~1s of the rsync finishing.
- Pushgateway: `curl -s prometheus-prometheus-pushgateway.monitoring:9091/metrics | grep roundcube`
shows `backup_duration_seconds`, `backup_last_success_timestamp`.
## Automated
- `terraform fmt -check -recursive stacks/mailserver/modules/mailserver/` → clean.
- `scripts/tg plan -lock=false` in stacks/mailserver expected to show
`+ module.nfs_roundcube_backup_host.*`, `+ kubernetes_cron_job_v1.roundcube-backup`.
Closes: code-1f6
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The e2e email-roundtrip probe (CronJob `email-roundtrip-monitor`) currently
wraps `requests.put(PUSHGATEWAY, ...)` and `requests.get(UPTIME_KUMA, ...)`
in bare `try/except` that only prints "Failed to push ..." on error. If
Pushgateway is transiently unreachable (e.g., during a Prometheus Helm
upgrade / HPA scale-down / brief network blip) metrics silently drop and
downstream detection relies entirely on `EmailRoundtripStale` firing after
60 min of staleness. Single transient failures masquerade as data-plane
breakage for up to an hour.
Target task: `code-n5l` — Add retry to probe Pushgateway + Uptime Kuma pushes.
## This change
- Extracts a `push_with_retry(label, func, url)` helper that performs 3
attempts with exponential backoff (1s, 2s, 4s). Treats HTTP 2xx as
success, everything else as failure. On final failure, logs an explicit
`ERROR:` line to stderr with the URL and either the last HTTP status or
the exception repr — matches the existing `print(...)` logging style
  used throughout the heredoc (no stdlib `logging` dependency added); see the sketch after this list.
- Replaces the two inline `try/requests.put/except print` blocks with
calls to the helper. Pushgateway runs unconditionally; Uptime Kuma
still only runs on round-trip success (same as before).
- Makes exit code responsive to push outcome: probe exits non-zero when
the round-trip itself failed (unchanged), OR when BOTH pushes failed
all retries on the success path. Single-endpoint push failure with the
other succeeding keeps exit 0 — partial observability is preferred
over noisy pod restarts from Kubernetes' Job controller.
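Shape of the helper (a sketch under the description above, not the verbatim heredoc):
```
import sys
import time

import requests

def push_with_retry(label, func, url, backoff=(1, 2, 4)):
    # Treat any 2xx as success; anything else (or an exception) is a failed attempt.
    last = None
    for attempt, pause in enumerate(backoff, start=1):
        try:
            resp = func(url)
            if 200 <= resp.status_code < 300:
                return True
            last = f"HTTP {resp.status_code}"
        except Exception as exc:
            last = repr(exc)
        if attempt < len(backoff):
            time.sleep(pause)
    print(f"ERROR: failed to push to {label} after {len(backoff)} attempts: "
          f"url={url} last={last}", file=sys.stderr)
    return False

# Success-path usage (PUSHGATEWAY / UPTIME_KUMA_URL / metrics_payload are placeholders):
# pushgw_ok = push_with_retry("Pushgateway", lambda u: requests.put(u, data=metrics_payload), PUSHGATEWAY)
# kuma_ok = push_with_retry("Uptime Kuma", requests.get, UPTIME_KUMA_URL)
# sys.exit(0 if (pushgw_ok or kuma_ok) else 1)
```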
## Behavior matrix
```
roundtrip | pushgw | kuma | exit | rationale
----------+--------+------+------+-------------------------------
success | ok | ok | 0 | happy path (unchanged)
success | fail | ok | 0 | one endpoint still has telemetry
success | ok | fail | 0 | one endpoint still has telemetry
success | fail | fail | 1 | NEW — total observability loss
fail | ok | - | 1 | roundtrip failed (unchanged, Kuma skipped)
fail | fail | - | 1 | roundtrip failed (unchanged, Kuma skipped)
```
## What is NOT in this change
- Alert thresholds (`EmailRoundtripStale` still 60m) — explicitly out of
scope per the task description.
- `logging` stdlib adoption — rest of heredoc uses `print`, staying
consistent.
- Moving the heredoc out of `main.tf` into a sidecar Python file —
separate refactor.
## Reproduce locally
1. Point PUSHGATEWAY at a black hole:
`kubectl -n mailserver set env cronjob/email-roundtrip-monitor \`
`PUSHGATEWAY=http://nope.invalid:9091/metrics/job/test`
2. Trigger a one-shot job:
`kubectl -n mailserver create job --from=cronjob/email-roundtrip-monitor probe-test`
3. Expected in logs:
- 3 attempts, each ~1s/2s/4s apart
- `ERROR: Failed to push to Pushgateway after 3 attempts: url=... exception=...`
- Uptime Kuma push still succeeds (round-trip ok) → exit 0
4. Flip UPTIME_KUMA_URL to also fail (edit heredoc or DNS-poison): expect
exit 1 + two ERROR lines.
## Automated
- `python3 -c "import ast; ast.parse(open('/tmp/probe.py').read())"` → OK
(heredoc extracts cleanly).
- `terraform fmt -check -recursive modules/mailserver/` → no diff.
Closes: code-n5l
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
`infra/stacks/mailserver/modules/mailserver/variables.tf` carried a
130-line historical scaffolding variable
`postfix_cf_reference_DO_NOT_USE` containing a reference copy of an
older Postfix main.cf layout. The variable name itself signalled
dead-code intent ("DO_NOT_USE"), and a repo-wide
`grep -rn postfix_cf_reference infra/` confirmed zero consumers — no
module, no stack, no script, no doc ever referenced it. Carrying dead
Terraform variables costs nothing at runtime but wastes reviewer
attention on every `git blame` and drives up `variables.tf` read time.
Note on history: the prior commit 09c11056 landed with an identical
title ("Delete postfix_cf_reference_DO_NOT_USE dead code") but
actually committed `docs/runbooks/mailserver-proxy-protocol.md` —
fallout from a race between two concurrent mailserver sessions that
staged files in parallel. That commit accidentally closed this beads
task via the `Closes:` trailer without performing the deletion. This
commit does the actual deletion that was originally intended for
code-o3q. The runbook from 09c11056 is legitimate work for code-rtb
and is left in place.
## This change
Drops the entire `variable "postfix_cf_reference_DO_NOT_USE" { ... }`
block (136 lines incl. trailing blank). No other variable touched, no
resource touched, no comment elsewhere touched. `variables.tf` now
contains only the live `postfix_cf` variable that is actually consumed
by the module.
## What is NOT in this change
- No Terraform state modification — variable was never read, so state
has no record of it.
- No Postfix runtime behaviour change — `postfix_cf` (the live one) is
untouched.
- No fix for the pre-existing `kubernetes_deployment.mailserver` /
`kubernetes_service.mailserver` drift that `terragrunt plan` surfaces
independently. Those 2 in-place updates are known and tracked
separately.
- No apply needed — pure source hygiene.
## Test Plan
### Automated
Reference check before edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
infra/stacks/mailserver/modules/mailserver/variables.tf:41:variable "postfix_cf_reference_DO_NOT_USE" {
```
(single match — the declaration itself)
Reference check after edit:
```
$ grep -rn postfix_cf_reference /home/wizard/code/infra/
(no matches)
```
`terragrunt validate` (from `infra/stacks/mailserver/`):
```
Success! The configuration is valid, but there were some
validation warnings as shown above.
```
(warnings are pre-existing `kubernetes_namespace` -> `_v1` deprecation
notices, unrelated)
`terragrunt plan` (from `infra/stacks/mailserver/`):
```
# module.mailserver.kubernetes_deployment.mailserver will be updated in-place
# module.mailserver.kubernetes_service.mailserver will be updated in-place
Plan: 0 to add, 2 to change, 0 to destroy.
```
Both in-place updates are the known pre-existing drift. No change is
attributable to this commit — the dead variable was never referenced.
### Manual Verification
1. `cd infra/stacks/mailserver/modules/mailserver/`
2. `grep -c postfix_cf_reference variables.tf` -> expected `0`
3. `wc -l variables.tf` -> expected `39` (was `175`; 136 lines removed)
4. `cd ../..` -> `terragrunt validate` -> expected `Success!`
5. `terragrunt plan` -> expected `Plan: 0 to add, 2 to change, 0 to
destroy.` (pre-existing drift only).
Closes: code-o3q
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
mail.viktorbarzin.me exposed the Roundcube login page directly: requests
hit Traefik → CrowdSec + anti-AI middleware → Roundcube. The `ingress_factory`
call in `roundcubemail.tf` omitted `protected = true`, so the Authentik
ForwardAuth middleware was never wired up. Project rule
(`infra/.claude/CLAUDE.md`): ingresses should be `protected = true` unless
there is a specific reason to leave them open. Credentialed surfaces (login
pages) have no reason to skip the OIDC gate — CrowdSec alone is a behavioural
signal, not an identity gate.
Trade-off accepted by Viktor on 2026-04-18: webmail now requires two logins
(Authentik SSO, then Roundcube IMAP auth against dovecot). This is tolerable
for a low-volume personal webmail; mail clients (Thunderbird, phone Mail)
bypass the webmail entirely and speak IMAPS/SMTP directly against
`mail.viktorbarzin.me` on the MetalLB service IP (10.0.20.202), which is a
separate path and MUST stay open.
## This change
Single-line flip: `protected = true` added to the `ingress_factory` call in
`stacks/mailserver/modules/mailserver/roundcubemail.tf`.
The factory (`modules/kubernetes/ingress_factory/main.tf`) responds to the
flag by:
1. Appending `traefik-authentik-forward-auth@kubernetescrd` to the ingress
`router.middlewares` annotation — Traefik then hands each request to
the Authentik outpost before forwarding to Roundcube.
2. Flipping `effective_anti_ai` from true → false (logic:
`anti_ai_scraping != null ? … : !var.protected`), which removes the two
anti-AI middlewares. Rationale in the factory: a login-gated resource
is already invisible to unauthenticated scrapers, so the robots/noai
middleware chain is redundant.
Request path before vs after:
Before: Client → Traefik → [retry, error-pages, rate-limit, csp,
crowdsec, ai-bot-block, anti-ai-headers]
→ Roundcube (200 on /)
After: Client → Traefik → [retry, error-pages, rate-limit, csp,
crowdsec, authentik-forward-auth]
→ if unauth: 302 to authentik.viktorbarzin.me
→ if auth: Roundcube (login form)
## What is NOT in this change
- The `mailserver` Service (MetalLB IP 10.0.20.202) is untouched. IMAPS
(993), SMTPS (465), SMTP-Submission (587) continue to bypass Traefik
entirely and speak directly to dovecot/postfix. Mail clients are
unaffected.
- Pre-existing drift on `kubernetes_deployment.mailserver` (volume_mount
ordering) and `kubernetes_service.mailserver` (stale metallb annotation)
is left alone — out of scope per bd-bmh. Apply was scoped with
`-target=` to the ingress resource only.
- No Authentik app/provider Terraform was touched — the `mail.*` ingress
is already covered by the existing wildcard Authentik proxy outpost on
`*.viktorbarzin.me` (standard pattern).
## Test Plan
### Automated
Baseline (before apply):
```
$ curl -sI https://mail.viktorbarzin.me/ | head -2
HTTP/2 200
alt-svc: h3=":443"; ma=2592000
$ openssl s_client -connect mail.viktorbarzin.me:993 < /dev/null 2>&1 \
  | grep -E 'CONNECTED|subject='
CONNECTED(00000003)
subject=CN = viktorbarzin.me
```
After apply:
```
$ curl -sI https://mail.viktorbarzin.me/ | head -3
HTTP/2 302
alt-svc: h3=":443"; ma=2592000
location: https://authentik.viktorbarzin.me/application/o/authorize/?client_id=…
$ openssl s_client -connect mail.viktorbarzin.me:993 < /dev/null 2>&1 \
  | grep -E 'CONNECTED|subject='
CONNECTED(00000003)
subject=CN = viktorbarzin.me
```
Middleware annotation on the ingress:
```
$ kubectl get ingress -n mailserver mail \
  -o jsonpath='{.metadata.annotations.traefik\.ingress\.kubernetes\.io/router\.middlewares}'
traefik-retry@kubernetescrd,traefik-error-pages@kubernetescrd,
traefik-rate-limit@kubernetescrd,traefik-csp-headers@kubernetescrd,
traefik-crowdsec@kubernetescrd,traefik-authentik-forward-auth@kubernetescrd
```
Terraform apply (targeted):
```
$ scripts/tg apply --non-interactive \
  -target=module.mailserver.module.ingress.kubernetes_ingress_v1.proxied-ingress
…
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
```
### Manual Verification
1. In a private browser window, navigate to https://mail.viktorbarzin.me/
2. Expected: redirected to Authentik SSO login (not Roundcube)
3. Authenticate with Authentik credentials
4. Expected: redirected back and shown the Roundcube IMAP login form
5. Enter IMAP credentials (same as before the change)
6. Expected: Roundcube inbox loads normally
7. Separately, verify a mail client (Thunderbird, phone Mail) still
connects to IMAPS on mail.viktorbarzin.me:993 and SMTP on :587 without
any Authentik prompt — that path hits MetalLB 10.0.20.202 directly.
## Reproduce locally
1. cd infra/stacks/mailserver
2. vault login -method=oidc
3. scripts/tg plan
Expected: 0 to add, 3 to change, 0 to destroy. Relevant change is the
`router.middlewares` annotation on
`module.ingress.kubernetes_ingress_v1.proxied-ingress` swapping the
two anti-AI middlewares for `traefik-authentik-forward-auth`. The
other 2 changes are pre-existing drift (volume_mounts, metallb
annotation) and are out of scope.
4. scripts/tg apply --non-interactive \
-target=module.mailserver.module.ingress.kubernetes_ingress_v1.proxied-ingress
5. curl -sI https://mail.viktorbarzin.me/ — expect HTTP/2 302 to
authentik.viktorbarzin.me
Closes: code-bmh
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The mailserver container (Postfix + Dovecot in one pod) had no liveness, readiness, or startup probes declared. If either daemon deadlocked or hung on a socket, Kubernetes had no way to detect the failure and restart the container. The only external canary was the email-roundtrip-monitor CronJob, which runs on a 20-minute interval, giving a detection lag of 20-60 minutes: long enough for real delivery failures to accumulate before an alert fires.
Tracked as bd code-ekf out of the mailserver probe audit. Both port 25 (SMTP) and port 993 (IMAPS) are cheap, reliable up-signals — the existing e2e probe already hits IMAPS, so TCP probes on those ports are a close proxy for user-visible service health without the cost of full SMTP/IMAP handshakes every 10s.
## This change
Adds a readiness_probe (TCP :25, initial_delay=30s, period=10s) and a liveness_probe (TCP :993, initial_delay=60s, period=60s, timeout=15s) to the mailserver deployment's primary container.
Design choices:
- **TCP over exec/HTTP**: the daemons do not expose HTTP health; exec probes would require shelling into the container with auth for SMTP/IMAP banner checks, which is both costly and flaky. TCP accept is sufficient — if postfix cannot accept a TCP connection on :25 it is unambiguously broken.
- **Split ports per probe**: readiness on :25 (the public SMTP surface — if this is down, external delivery is broken) and liveness on :993 (IMAPS, the other critical daemon — catches Dovecot deadlocks independently of Postfix).
- **30s readiness delay**: Postfix needs ~20-30s to warm up including chroot setup and DKIM key loading; probing earlier would cause bogus NotReady cycles on deploy.
- **60s liveness delay + 60s period + 15s timeout**: generous so transient blips (brief CPU spike, RBL timeout, slow NFS unmount during rotation) do not trigger a restart loop. With failure_threshold=3 (default), a real deadlock is detected in ~3 minutes; false positives on transient load are suppressed.
- **No startup_probe**: the 60s liveness initial_delay is enough cover for the warmup window; adding a startup probe would be redundant machinery.
## What is NOT in this change
- No startup_probe (liveness initial_delay_seconds=60 handles warmup)
- No exec-based probes (banner-check probes are out of scope and not needed)
- No changes to the opendkim or other sidecars
- Pre-existing drift in other stacks (dawarich namespace label, owntracks dawarich-hook wiring) is deliberately left out — those are separate workstreams
## Test Plan
### Automated
Applied via `tg apply -target=kubernetes_deployment.mailserver` before this commit. Current pod state:
```
$ kubectl get pod -n mailserver -l app=mailserver
NAME READY STATUS RESTARTS AGE
mailserver-6c6bf77ffb-w7nl5 2/2 Running 0 2m26s
$ kubectl describe pod -n mailserver -l app=mailserver | grep -E "(Liveness|Readiness|Restart Count|Status:|Ready:)"
Status: Running
Ready: True
Restart Count: 0
Ready: True
Restart Count: 0
Liveness: tcp-socket :993 delay=60s timeout=15s period=60s #success=1 #failure=3
Readiness: tcp-socket :25 delay=30s timeout=1s period=10s #success=1 #failure=3
```
Pod has run >120s (two full liveness cycles) with RESTARTS=0 and Ready=True.
### Manual Verification
1. Confirm probes are declared on the live pod:
```
kubectl describe pod -n mailserver -l app=mailserver | grep -E "(Liveness|Readiness)"
```
Expected: `Liveness: tcp-socket :993 ...` and `Readiness: tcp-socket :25 ...`
2. Confirm pod stays Ready under normal load for 5+ minutes:
```
kubectl get pod -n mailserver -l app=mailserver -w
```
Expected: RESTARTS stays at 0, READY stays at 2/2.
3. (Optional) Failure-simulate by dropping :993 inside the pod and observing liveness failure + restart within ~3 minutes (3 × period_seconds).
## Reproduce locally
1. `cd infra/stacks/mailserver`
2. `tg plan -target=kubernetes_deployment.mailserver`
3. Expected: no drift (or only the probe additions if rolling forward a stale state)
4. `kubectl get pod -n mailserver -l app=mailserver` — pod Ready, RESTARTS=0
5. `kubectl describe pod -n mailserver -l app=mailserver | grep -E "(Liveness|Readiness)"` — both probes present
Closes: code-ekf
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The email-roundtrip-monitor CronJob injected `BREVO_API_KEY` and
`EMAIL_MONITOR_IMAP_PASSWORD` as inline `env { value = var.xxx }` —
Terraform read them from Vault at plan time and embedded them in the
generated CronJob spec. Anyone with `kubectl describe cronjob` (or
pod-event read) in the `mailserver` namespace could read both secrets
verbatim.
The two upstream Vault entries are not flat strings:
- `secret/viktor` → `brevo_api_key` = base64(JSON({"api_key": "..."}))
- `secret/platform` → `mailserver_accounts` = JSON({"spam@viktorbarzin.me": "<pw>", ...})
A plain ESO `remoteRef.property` can traverse one level of JSON but
cannot base64-decode the wrapper or index a map key that contains `@`.
So the ExternalSecret pulls the raw Vault values and the rendered K8s
Secret is produced via ESO's `target.template` (engineVersion v2, sprig
pipeline `b64dec | fromJson | dig`). `mergePolicy` defaults to Replace,
so only the transformed `BREVO_API_KEY` / `EMAIL_MONITOR_IMAP_PASSWORD`
keys land in the K8s Secret — the raw wrapped inputs never reach it.
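What the template pipeline produces, expressed as equivalent Python for clarity (values
are fabricated; only the shapes match the Vault entries above):
```
import base64
import json

# secret/viktor -> brevo_api_key = base64(JSON({"api_key": "..."}))
brevo_wrapped = base64.b64encode(json.dumps({"api_key": "xkeysib-example"}).encode()).decode()
# secret/platform -> mailserver_accounts = JSON({"<account>@viktorbarzin.me": "<pw>", ...})
accounts_json = json.dumps({"spam@viktorbarzin.me": "example-password"})

rendered = {  # equivalent of the sprig  b64dec | fromJson | dig  pipeline
    "BREVO_API_KEY": json.loads(base64.b64decode(brevo_wrapped))["api_key"],
    "EMAIL_MONITOR_IMAP_PASSWORD": json.loads(accounts_json)["spam@viktorbarzin.me"],
}
print(rendered)  # only the two transformed keys reach the K8s Secret
```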
## This change
1. New `kubernetes_manifest.email_roundtrip_monitor_secrets` rendering
an `external-secrets.io/v1beta1` ExternalSecret into a K8s Secret
named `mailserver-probe-secrets` via the `vault-kv` ClusterSecretStore.
2. CronJob's two `env { name=... value=var.xxx }` blocks replaced with
a single `env_from { secret_ref { name = "mailserver-probe-secrets" } }`.
3. Unused `brevo_api_key` / `email_monitor_imap_password` module
variables + their wiring in `stacks/mailserver/main.tf` removed.
`data "vault_kv_secret_v2" "viktor"` dropped (last consumer gone).
```
Before: After:
┌────────────┐ ┌────────────┐
│ Vault KV │ │ Vault KV │
└────┬───────┘ └────┬───────┘
│ (plan-time read) │ (runtime pull)
▼ ▼
┌────────────┐ ┌────────────┐
│ Terraform │ │ ESO ctrl │
│ state │ │ +template │
└────┬───────┘ └────┬───────┘
│ inline value= │ sprig b64dec | fromJson
▼ ▼
┌────────────┐ ┌────────────┐
│ CronJob │ <-- kubectl describe leaks! │ K8s Secret │
│ env[].value│ │ probe-sec │
└────────────┘ └────┬───────┘
│ env_from.secret_ref
▼
┌────────────┐
│ CronJob │
│ (no values │
│ in spec) │
└────────────┘
```
## Test Plan
### Automated
`terragrunt plan -target=...ExternalSecret -target=...CronJob`:
```
Plan: 1 to add, 1 to change, 0 to destroy.
+ kubernetes_manifest.email_roundtrip_monitor_secrets (ExternalSecret)
~ kubernetes_cron_job_v1.email_roundtrip_monitor
- env { name = "BREVO_API_KEY" ... }
- env { name = "EMAIL_MONITOR_IMAP_PASSWORD" ... }
+ env_from { secret_ref { name = "mailserver-probe-secrets" } }
```
`terragrunt apply --non-interactive` same targets:
```
Apply complete! Resources: 1 added, 1 changed, 0 destroyed.
```
`kubectl get externalsecret -n mailserver mailserver-probe-secrets`:
```
NAME STORE REFRESH INTERVAL STATUS READY
mailserver-probe-secrets vault-kv 15m SecretSynced True
```
`kubectl get secret -n mailserver mailserver-probe-secrets -o yaml`
exposes exactly two data keys (`BREVO_API_KEY`, `EMAIL_MONITOR_IMAP_PASSWORD`) —
both populated, 120 / 32 base64 chars, no raw `brevo_api_key_wrapped` /
`mailserver_accounts` keys.
`kubectl describe cronjob -n mailserver email-roundtrip-monitor`:
```
Environment Variables from:
mailserver-probe-secrets Secret Optional: false
Environment: <none>
```
(Previously the `Environment:` block listed both secrets with their raw
values.)
### Manual Verification
1. `kubectl create job --from=cronjob/email-roundtrip-monitor \
probe-test-$RANDOM -n mailserver`
2. `kubectl logs -n mailserver -l job-name=probe-test-... --tail=30`
expected:
```
Sent test email via Brevo: 201 marker=e2e-probe-...
Found test email after 1 attempts
Deleted 1 e2e probe email(s)
Round-trip SUCCESS in 20.3s
Pushed metrics to Pushgateway
Pushed to Uptime Kuma
```
3. `kubectl exec -n monitoring deploy/prometheus-prometheus-pushgateway \
-- wget -q -O- http://localhost:9091/metrics | grep email_roundtrip`
shows `email_roundtrip_success=1`, fresh timestamp, duration in range.
4. `kubectl delete job -n mailserver probe-test-...` to clean up.
Closes: code-39v
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
The mailserver stack holds everything valuable and hard to recreate:
243M of maildirs, dovecot/rspamd state, and the DKIM private key that
signs outbound mail. Today the only defense is the LVM thin-pool
snapshots on the PVE host (7-day retention, storage-class scope only)
— there is no app-level backup. Infra/.claude/CLAUDE.md mandates that
every proxmox-lvm(-encrypted) app ship a NFS-backed backup CronJob,
and the mailserver stack was the only one still out of compliance.
Loss of mailserver-data-encrypted without backups = total loss of all
stored mail plus a DKIM key rotation (which requires a DNS update and
breaks signature verification on every message in transit for the TTL
window). Unacceptable for a service people actually use.
Trade-offs considered:
- mysqldump-style single-file dump vs rsync snapshot — maildirs are
millions of small files, not a DB export. rsync --link-dest gives
incremental weekly snapshots for ~10% of the cost of a full copy.
- RWO PVC read-only mount — the underlying PVC is ReadWriteOnce, so
the backup Job has to co-locate with the mailserver pod. vaultwarden
solves this with pod_affinity; mirrored here.
- Image choice — alpine + apk add rsync matches vaultwarden's pattern
and keeps the container image small.
## This change
Adds `kubernetes_cron_job_v1.mailserver-backup` + NFS PV/PVC to the
mailserver module. Runs daily at 03:00 (avoids the 00:30 mysql-backup
and 00:45 per-db windows, and the */20 email-roundtrip cadence). The
job rsyncs /var/mail, /var/mail-state, /var/log/mail into
/srv/nfs/mailserver-backup/<YYYY-WW>/ with --link-dest against the
previous week for space-efficient incrementals. 8-week retention.
Data layout (flowed through from the deployment's subPath mounts so
the rsync tree matches the mailserver's own on-disk layout):
PVC mailserver-data-encrypted (RWO, 2Gi)
├─ data/ (subPath) → pod's /var/mail → backup/<week>/data/
├─ state/ (subPath) → pod's /var/mail-state → backup/<week>/state/
└─ log/ (subPath) → pod's /var/log/mail → backup/<week>/log/
Safety:
- PVC mounted read-only (volume.persistent_volume_claim.read_only
AND all three volume_mounts set read_only=true) so a backup-script
bug cannot corrupt maildirs.
- pod_affinity on app=mailserver + topology_key=hostname forces the
Job pod onto the same node holding the RWO PVC attachment.
- set -euxo pipefail + per-directory existence guard so a missing
subPath short-circuits cleanly instead of silently no-op'ing.
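A trimmed sketch of how those pieces hang together in the CronJob resource —
affinity, read-only mounts, and the `--link-dest` rsync loop. The image tag,
mount paths, and the retention/metrics plumbing are simplified assumptions,
not the module's exact code:
```hcl
# Illustrative sketch; the real resource also pushes Pushgateway metrics via wget
# and mirrors the deployment's per-subPath mounts exactly.
resource "kubernetes_cron_job_v1" "mailserver_backup" {
  metadata {
    name      = "mailserver-backup"
    namespace = "mailserver"
  }
  spec {
    schedule = "0 3 * * *"
    job_template {
      metadata {}
      spec {
        template {
          metadata {}
          spec {
            restart_policy = "Never"
            # Co-locate with the mailserver pod so the RWO PVC can be mounted here too.
            affinity {
              pod_affinity {
                required_during_scheduling_ignored_during_execution {
                  topology_key = "kubernetes.io/hostname"
                  label_selector {
                    match_labels = { app = "mailserver" }
                  }
                }
              }
            }
            container {
              name    = "backup"
              image   = "alpine:3.20"
              command = ["/bin/sh", "-c", <<-EOT
                set -euxo pipefail
                apk add --no-cache rsync
                WEEK=$(date +%G-%V)                 # e.g. 2026-15
                PREV=$(ls -1 /backup | tail -n 1)   # newest existing snapshot, if any
                for d in data state log; do
                  [ -d "/src/$d" ] || { echo "missing /src/$d"; exit 1; }  # existence guard
                  mkdir -p "/backup/$WEEK/$d"
                  # hardlink files unchanged since the previous snapshot instead of copying them
                  rsync -a --delete --link-dest="/backup/$PREV/$d" "/src/$d/" "/backup/$WEEK/$d/"
                done
                # 8-week retention pruning and Pushgateway metric push elided
              EOT
              ]
              volume_mount {
                name       = "mail-data"
                mount_path = "/src"
                read_only  = true
              }
              volume_mount {
                name       = "backup-nfs"
                mount_path = "/backup"
              }
            }
            volume {
              name = "mail-data"
              persistent_volume_claim {
                claim_name = "mailserver-data-encrypted"
                read_only  = true
              }
            }
            volume {
              name = "backup-nfs"
              persistent_volume_claim {
                claim_name = "mailserver-backup-nfs"
              }
            }
          }
        }
      }
    }
  }
}
```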
Metrics pushed to Pushgateway match the mysql-backup/vaultwarden-backup
convention (job="mailserver-backup"):
backup_duration_seconds, backup_read_bytes, backup_written_bytes,
backup_output_bytes, backup_last_success_timestamp.
Alert rules added in monitoring stack, mirroring Mysql/Vaultwarden:
- MailserverBackupStale — fires when the last success is older than 36h, severity critical, `for: 30m`
- MailserverBackupNeverSucceeded — severity critical, `for: 1h`
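The monitoring stack's exact encoding of these rules isn't reproduced here; as a
rough sketch, rendered as a PrometheusRule manifest they would look roughly like
this (only the expressions, severities, and `for:` windows are taken from this
change — the wrapping is an assumption):
```hcl
# Sketch only — resource naming and the PrometheusRule wrapping are assumptions.
resource "kubernetes_manifest" "mailserver_backup_alerts" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "PrometheusRule"
    metadata   = { name = "mailserver-backup", namespace = "monitoring" }
    spec = {
      groups = [{
        name = "mailserver-backup"
        rules = [
          {
            alert  = "MailserverBackupStale"
            # 129600s = 36h since the last successful run
            expr   = "(time() - kube_cronjob_status_last_successful_time{cronjob=\"mailserver-backup\",namespace=\"mailserver\"}) > 129600"
            "for"  = "30m"
            labels = { severity = "critical" }
          },
          {
            alert  = "MailserverBackupNeverSucceeded"
            expr   = "kube_cronjob_status_last_successful_time{cronjob=\"mailserver-backup\",namespace=\"mailserver\"} == 0"
            "for"  = "1h"
            labels = { severity = "critical" }
          },
        ]
      }]
    }
  }
}
```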
## Reproduce locally
1. cd infra/stacks/mailserver && ../../scripts/tg plan
Expected: 3 to add (cronjob + NFS PV + PVC), unrelated drift on
deployment/service is pre-existing.
2. ../../scripts/tg apply --non-interactive \
-target=module.mailserver.module.nfs_mailserver_backup_host \
-target=module.mailserver.kubernetes_cron_job_v1.mailserver-backup
3. cd ../monitoring && ../../scripts/tg apply --non-interactive
4. kubectl create job --from=cronjob/mailserver-backup \
mailserver-backup-test -n mailserver
5. kubectl wait --for=condition=complete --timeout=300s \
job/mailserver-backup-test -n mailserver
6. Expected: test pod co-locates with mailserver on same node
(k8s-node2 today), rsync writes ~950M to
/srv/nfs/mailserver-backup/<YYYY-WW>/, Pushgateway exposes
backup_output_bytes{job="mailserver-backup"}.
## Test Plan
### Automated
```
$ kubectl get cronjob -n mailserver mailserver-backup
NAME SCHEDULE TIMEZONE SUSPEND ACTIVE LAST SCHEDULE AGE
mailserver-backup 0 3 * * * <none> False 0 <none> 3s
$ kubectl create job --from=cronjob/mailserver-backup \
mailserver-backup-test -n mailserver
job.batch/mailserver-backup-test created
$ kubectl wait --for=condition=complete --timeout=300s \
job/mailserver-backup-test -n mailserver
job.batch/mailserver-backup-test condition met
$ kubectl logs -n mailserver job/mailserver-backup-test | tail -5
=== Backup IO Stats ===
duration: 80s
read: 1120 MiB
written: 1186 MiB
output: 947.0M
$ kubectl run nfs-verify --rm --image=alpine --restart=Never \
--overrides='{...nfs mount /srv/nfs...}' \
-n mailserver --attach -- du -sh /nfs/mailserver-backup/*
947.0M /nfs/mailserver-backup/2026-15
$ curl http://prometheus-prometheus-pushgateway.monitoring:9091/metrics \
| grep mailserver-backup
backup_duration_seconds{instance="",job="mailserver-backup"} 80
backup_last_success_timestamp{instance="",job="mailserver-backup"} 1.776554641e+09
backup_output_bytes{instance="",job="mailserver-backup"} 9.92315701e+08
backup_read_bytes{instance="",job="mailserver-backup"} 1.175027712e+09
backup_written_bytes{instance="",job="mailserver-backup"} 1.244254208e+09
$ curl -s http://prometheus-server/api/v1/rules \
| jq '.data.groups[].rules[] | select(.name | test("Mailserver"))'
MailserverBackupStale: (time() - kube_cronjob_status_last_successful_time{cronjob="mailserver-backup",namespace="mailserver"}) > 129600
MailserverBackupNeverSucceeded: kube_cronjob_status_last_successful_time{cronjob="mailserver-backup",namespace="mailserver"} == 0
```
### Manual Verification
1. Wait for the scheduled 03:00 run tonight; verify
`kubectl get job -n mailserver` shows a new completed job.
2. Check that `backup_last_success_timestamp` advances past today.
3. Confirm `MailserverBackupNeverSucceeded` did not fire.
4. Next week (week 16), confirm `--link-dest` builds hardlinks vs
2026-15 (size delta should drop from ~950M to ~the actual churn).
## Deviations from mysql-backup pattern
- Image: alpine + rsync (mirrors vaultwarden — mysql's `mysql:8.0`
base is not applicable for a filesystem rsync).
- pod_affinity: required for RWO PVC co-location (mysql uses its own
MySQL service for network access; mailserver must mount the PVC).
- Metric push via wget (mirrors vaultwarden; alpine has wget, not curl).
- Week-folder layout with --link-dest rotation: rsync pattern, closer
to the PVE daily-backup script than mysql's single-file gzip dumps.
[ci skip]
Closes: code-z26
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
After fixing the two mail-server-side root causes of probe false-failures
(Dovecot userdb duplicates, postscreen btree lock contention), the probe
is expected to succeed well under 120s. This commit is defence in depth
against residual SMTP relay variance and against a future scenario where
Dovecot is transiently unresponsive during IMAP login.
The probe currently polls IMAP with `range(9) × 20s = 180s`. Brevo's
queueing, DNS variance, and general SMTP retry backoff can easily
exceed that on a bad day. Widening to 5 minutes gives plenty of headroom
while still remaining well within the CronJob's 20-minute schedule
interval.
Additionally, `imaplib.IMAP4_SSL(...)` previously had no timeout. If
Dovecot is unresponsive (e.g., mid-rollout, transient TLS handshake
hang), the connect call can block indefinitely and the probe hangs
without ever looping to the next attempt. Adding `timeout=10` caps each
connect at 10s so the retry loop keeps making forward progress.
## This change
Two edits to the embedded probe script inside the cronjob resource:
```
- # Step 2: Wait for delivery, retry IMAP up to 3 min
+ # Step 2: Wait for delivery, retry IMAP up to 5 min (15 x 20s)
...
- for attempt in range(9):
+ for attempt in range(15):
...
- imap = imaplib.IMAP4_SSL(IMAP_HOST, 993, ssl_context=ctx)
+ imap = imaplib.IMAP4_SSL(IMAP_HOST, 993, ssl_context=ctx, timeout=10)
```
Flow (before):
```
send via Brevo ─► for 9 loops: sleep 20s, IMAP connect (blocks on hang) ─► 180s total
```
Flow (after):
```
send via Brevo ─► for 15 loops: sleep 20s, IMAP connect (≤10s) ─► 300s total
│
└─ timeout ─► log, continue to next loop
```
## What is NOT in this change
- Probe frequency stays at `*/20 * * * *`.
- The `EmailRoundtripStale` alert thresholds are intentionally left at
3600s + for: 10m. Those fire only on sustained multi-hour issues and
should not be loosened — they would mask future regressions. Probe
success rate is now expected to recover to ≥95% from the two upstream
fixes; if it doesn't, alert tuning gets revisited separately.
- No change to the Brevo send step, the success-metrics push, or the
cleanup of stale e2e-probe-* messages.
## Test Plan
### Automated
`scripts/tg plan -target=module.mailserver.kubernetes_cron_job_v1.email_roundtrip_monitor`:
```
# module.mailserver.kubernetes_cron_job_v1.email_roundtrip_monitor will be updated in-place
- for attempt in range(9):
+ for attempt in range(15):
- imap = imaplib.IMAP4_SSL(IMAP_HOST, 993, ssl_context=ctx)
+ imap = imaplib.IMAP4_SSL(IMAP_HOST, 993, ssl_context=ctx, timeout=10)
Plan: 0 to add, 1 to change, 0 to destroy.
```
`scripts/tg apply`:
```
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
```
### Manual Verification
1. Trigger the probe manually:
`kubectl -n mailserver create job --from=cronjob/email-roundtrip-monitor probe-verify-$(date +%s)`
2. Tail its logs:
`kubectl -n mailserver logs job/probe-verify-<ts> -f`
3. Expect: `Round-trip SUCCESS` within the 5-min window. Typical
successful run should still complete in < 60s now that postscreen
is no longer stalling.
4. Watch the 48-hour window on the `email_roundtrip_success` gauge in
Prometheus — expect ≥95% (was ~65% before all three fixes).
## Reproduce locally
1. `kubectl -n mailserver get cronjob email-roundtrip-monitor -o yaml | grep -E "range\(|timeout"`
2. Expect: `range(15)` and `timeout=10`
3. `kubectl -n mailserver create job --from=cronjob/email-roundtrip-monitor probe-verify-$(date +%s)`
4. `kubectl -n mailserver logs -f job/probe-verify-<ts>`
5. Expect: eventual `Round-trip SUCCESS in <N>s` message and exit 0.
Closes: code-18e
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Postfix inside docker-mailserver was spamming fatal errors — 5,464 of them in a
24h window, several per minute — all of the same shape:
```
postfix/postscreen[NNN]: fatal: btree:/var/lib/postfix/postscreen_cache:
unable to get exclusive lock: Resource temporarily unavailable
```
Every time one of these fires, the postscreen process dies mid-connection
and the inbound SMTP session is dropped. Legitimate mail (including Brevo
deliveries for our e2e email-roundtrip probe) gets re-queued by the sender
and arrives late — frequently past the probe's 180s IMAP polling window,
producing a 35%/7d probe success rate and the EmailRoundtripStale alert
noise that was originally flagged as "probably nothing."
## Root cause
`master.cf` declares postscreen with `maxproc=1`, but postscreen still
re-spawns per incoming connection (or for short-lived reopens), and each
instance opens the shared btree cache with an exclusive file lock. Under
any concurrency (two TCP SYNs arriving close together, or a retry during
teardown), the second process hits EWOULDBLOCK on fcntl and Postfix
treats that as fatal.
Four options were considered:
| Option | Verdict |
|--------|---------|
| (a) Disable cache (postscreen_cache_map = ) | ✓ chosen |
| (b) Switch btree → lmdb | ✗ lmdb not compiled into docker-mailserver 15.0.0's postfix (`postconf -m` has no lmdb) |
| (c) proxy:btree via proxymap | ✗ unsafe — Postfix docs: "postscreen does its own locking, not safe via proxymap" |
| (d) Memcached sidecar | ✗ new moving part; deferred |
Option (a) is a small trade-off: legitimate clients re-run the
greet-action / bare-newline-action checks on every fresh TCP session
instead of hitting the 7-day whitelist cache. At our volume (~100
deliveries/day, ~72 of which are the probe itself) that's negligible CPU.
Dropping the cache also means DNSBL verdicts are no longer remembered across
sessions, but this mailserver already has `postscreen_dnsbl_action = ignore`,
so the cache's DNSBL role was doing nothing anyway.
## This change
Appends a stanza to the user-merged postfix main.cf stored in
`variable.postfix_cf` that sets `postscreen_cache_map =` (empty value).
Postfix treats an empty cache_map as "no persistent cache" — per-session
decisions are still enforced, they just aren't cached across sessions.
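In Terraform terms the append is tiny; a hedged sketch (the module's real
variable/local plumbing may differ — only the `postscreen_cache_map` line is
the actual change):
```hcl
# Sketch: how the stanza could be appended to the user-merged main.cf content.
variable "postfix_cf" {
  type = string # user-provided main.cf overrides, merged upstream of this stanza
}

locals {
  postfix_cf = <<-EOT
    ${var.postfix_cf}
    # Disable the shared postscreen cache: concurrent postscreen instances fight
    # over the btree file's exclusive lock and die with a fatal error.
    postscreen_cache_map =
  EOT
}
```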
Before:
```
smtpd ──► postscreen (maxproc=1, btree cache with exclusive lock)
├─ concurrent access → fcntl EWOULDBLOCK → fatal
└─ connection dropped, sender retries, mail arrives late
```
After:
```
smtpd ──► postscreen (no cache, per-session checks only)
└─ no shared file, no lock → no fatal, no dropped session
```
No change to master.cf (postscreen still the front-end), no change to
DNSBL / greet / bare-newline policy.
## What is NOT in this change
- Dovecot userdb dedup (shipped in the previous commit).
- Email-roundtrip probe widening (next commit).
- Rebuilding docker-mailserver image with lmdb support (deferred —
disabling the cache is simpler and sufficient at our volume).
## Test Plan
### Automated
`postconf -m` in the running container to confirm lmdb is genuinely absent
(ruling out option (b) before we commit to (a)):
```
btree cidr environ fail hash inline internal ldap memcache
nis pcre pipemap proxy randmap regexp socketmap static tcp
texthash unionmap unix
```
No lmdb entry — confirmed.
`scripts/tg plan -target=module.mailserver.kubernetes_config_map.mailserver_config`:
```
~ "postfix-main.cf" = <<-EOT
+ postscreen_cache_map =
```
`scripts/tg apply`:
```
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
```
Reloader triggers pod rollout — baseline error count before apply was 34
`unable to get exclusive lock` lines per `--tail=500` log window.
### Manual Verification
Post-rollout, when the new pod is Ready:
1. `kubectl -n mailserver exec <pod> -c docker-mailserver -- postconf postscreen_cache_map`
Expect: empty (no value)
2. Watch for 15 min: `kubectl -n mailserver logs -l app=mailserver -c docker-mailserver --tail=1000 | grep -c "unable to get exclusive lock"`
Expect: 0 new occurrences (any hits are from before the rollout).
3. Trigger a probe run manually:
`kubectl -n mailserver create job --from=cronjob/email-roundtrip-monitor probe-verify-$(date +%s)`
then `kubectl -n mailserver logs job/probe-verify-...`
Expect: `Round-trip SUCCESS` with duration < 120s.
## Reproduce locally
1. `kubectl -n mailserver exec <pod> -c docker-mailserver -- postconf postscreen_cache_map`
2. Expect: `postscreen_cache_map =` (empty value)
3. `kubectl -n mailserver logs -l app=mailserver -c docker-mailserver --since=15m | grep -c "unable to get exclusive lock"`
4. Expect: 0
Closes: code-1dc
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Dovecot auth logs have been steadily spamming
`passwd-file /etc/dovecot/userdb: User r730-idrac@viktorbarzin.me exists more
than once` (and the same for vaultwarden@) at ~31 occurrences per 500 log
lines. Under load this flakes IMAP auth for the e2e email-roundtrip probe
(spam@viktorbarzin.me uses the catch-all), which was masquerading as "Brevo
or probe timing" noise.
## Root cause
docker-mailserver builds Dovecot's `/etc/dovecot/userdb` from two sources:
real accounts (`postfix-accounts.cf`) AND virtual-alias entries whose
*target* resolves to a local mailbox (`postfix-virtual.cf`). When the same
address appears as BOTH a real mailbox AND an alias whose target is another
local mailbox, the generated userdb has two lines for that username pointing
to different home directories — e.g.:
r730-idrac@viktorbarzin.me:...:/var/mail/.../r730-idrac/home
r730-idrac@viktorbarzin.me:...:/var/mail/.../spam/home ← from alias
Dovecot's passwd-file driver rejects the duplicate, and every subsequent
auth lookup logs the error.
This affected exactly two addresses:
- r730-idrac@viktorbarzin.me (real account + alias → spam@)
- vaultwarden@viktorbarzin.me (real account + alias → me@)
Other aliases are fine: they either forward to external addresses (gmail
etc.) — no local userdb entry generated — or map an address to itself
(me@ → me@) which docker-mailserver dedups internally.
Note: removing the real accounts is not an option because Vaultwarden uses
`vaultwarden@viktorbarzin.me` as its live SMTP_USERNAME
(stacks/vaultwarden/modules/vaultwarden/main.tf:121).
## This change
Introduces a `local.postfix_virtual` that concatenates the Vault-sourced
aliases with `extra/aliases.txt`, then filters out any line matching the
exact "LHS RHS" shape where both sides are in `var.mailserver_accounts` and
LHS != RHS. That is, only the pure local→local redundant entries are
dropped; all forwarding aliases and the catch-all are preserved.
The filter is self-healing: if a future alias ever collides with a real
account, it gets silently suppressed instead of breaking Dovecot auth.
```
Vault mailserver_aliases ─┐
                          ├─ concat ─ split \n ─ filter ─ join \n ─► postfix-virtual.cf
extra/aliases.txt ────────┘                        │
                                                   └── drop if LHS+RHS both in
                                                       mailserver_accounts and
                                                       LHS != RHS
```
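Expressed in Terraform, the filter is roughly the following (variable names and
the exact expression shape are assumptions; the drop condition is the one
described above):
```hcl
# Sketch of local.postfix_virtual; the module's real expression may differ in shape.
variable "mailserver_aliases" { type = string }        # Vault-sourced "LHS RHS" lines
variable "mailserver_accounts" { type = map(string) }  # real account -> password map

locals {
  virtual_lines = concat(
    split("\n", var.mailserver_aliases),
    split("\n", file("${path.module}/extra/aliases.txt"))
  )

  # Split each "LHS RHS" line into fields once; blank/odd lines keep an empty RHS.
  parsed = [
    for line in local.virtual_lines : {
      line = line
      lhs  = try(split(" ", trimspace(line))[0], "")
      rhs  = try(split(" ", trimspace(line))[1], "")
    }
  ]

  # Drop only the redundant local->local entries: both sides are real accounts and
  # LHS != RHS; forwarding aliases and the catch-all pass through untouched.
  postfix_virtual = join("\n", [
    for p in local.parsed : p.line
    if !(
      contains(keys(var.mailserver_accounts), p.lhs) &&
      contains(keys(var.mailserver_accounts), p.rhs) &&
      p.lhs != p.rhs
    )
  ])
}
```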
Filtered entries (confirmed via locally-simulated filter on live data):
- r730-idrac@viktorbarzin.me spam@viktorbarzin.me
- vaultwarden@viktorbarzin.me me@viktorbarzin.me
Preserved (sample): postmaster→me, contact→me, alarm-valchedrym→self+3 ext,
lubohristov→gmail, yoana→gmail, @viktorbarzin.me→spam (catch-all), all four
disposable `*-generated@` aliases.
## What is NOT in this change
- Real accounts in Vault (`secret/platform.mailserver_accounts`) are
untouched — vaultwarden SMTP auth keeps working.
- Postfix postscreen btree lock contention (separate commit).
- Email-roundtrip probe IMAP window (separate commit).
## Test Plan
### Automated
`terraform validate` — passes (docker-mailserver module):
```
Success! The configuration is valid, but there were some validation warnings as shown above.
```
`scripts/tg plan -target=module.mailserver.kubernetes_config_map.mailserver_config`:
```
# module.mailserver.kubernetes_config_map.mailserver_config will be updated in-place
~ resource "kubernetes_config_map" "mailserver_config" {
~ data = {
~ "postfix-virtual.cf" = (sensitive value)
# (9 unchanged elements hidden)
}
id = "mailserver/mailserver.config"
}
Plan: 0 to add, 1 to change, 0 to destroy.
```
`scripts/tg apply` — applied:
```
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
```
### Manual Verification
Post-apply configmap content (the two lines are gone):
```
$ kubectl -n mailserver get cm mailserver.config -o jsonpath='{.data.postfix-virtual\.cf}'
postmaster@viktorbarzin.me me@viktorbarzin.me
contact@viktorbarzin.me me@viktorbarzin.me
me@viktorbarzin.me me@viktorbarzin.me
lubohristov@viktorbarzin.me lyubomir.hristov3@gmail.com
alarm-valchedrym@viktorbarzin.me alarm-valchedrym@...,vbarzin@...,emil.barzin@...,me@...
yoana@viktorbarzin.me divcheva.yoana@gmail.com
@viktorbarzin.me spam@viktorbarzin.me
firmly-gerardo-generated@viktorbarzin.me me@viktorbarzin.me
closely-keith-generated@viktorbarzin.me vbarzin@gmail.com
literally-paolo-generated@viktorbarzin.me viktorbarzin@fb.com
hastily-stefanie-generated@viktorbarzin.me elliestamenova@gmail.com
```
Reloader triggers a pod rollout; once new pod is Ready:
- `kubectl -n mailserver exec <pod> -c docker-mailserver -- cut -d: -f1 /etc/dovecot/userdb | sort | uniq -d`
expected output: empty (no duplicate usernames)
- `kubectl -n mailserver logs <pod> -c docker-mailserver --tail=500 | grep -c "exists more than once"`
expected output: 0 (baseline was 31/500 lines)
## Reproduce locally
1. `kubectl -n mailserver get cm mailserver.config -o jsonpath='{.data.postfix-virtual\.cf}'`
2. Expect: no `r730-idrac@viktorbarzin.me spam@viktorbarzin.me` line and no
`vaultwarden@viktorbarzin.me me@viktorbarzin.me` line.
3. After pod restart: `kubectl -n mailserver logs -l app=mailserver -c docker-mailserver --tail=500 | grep -c "exists more than once"` → 0.
Closes: code-27l
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Wave 3A (commit c9d221d5) added the `# KYVERNO_LIFECYCLE_V1` marker to the
27 pre-existing `ignore_changes = [...dns_config]` sites so they could be
grepped and audited. It did NOT address pod-owning resources that were
simply missing the suppression entirely. Post-Wave-3A sampling (2026-04-18)
found that navidrome, f1-stream, frigate, servarr, monitoring, crowdsec,
and many other stacks showed perpetual `dns_config` drift every plan
because their `kubernetes_deployment` / `kubernetes_stateful_set` /
`kubernetes_cron_job_v1` resources had no `lifecycle {}` block at all.
Root cause (same as Wave 3A): Kyverno's admission webhook stamps
`dns_config { option { name = "ndots"; value = "2" } }` on every pod's
`spec.template.spec.dns_config` to prevent NxDomain search-domain flooding
(see `k8s-ndots-search-domain-nxdomain-flood` skill). Without `ignore_changes`
on every Terraform-managed pod-owner, Terraform repeatedly tries to strip
the injected field.
## This change
Extends the Wave 3A convention by sweeping EVERY `kubernetes_deployment`,
`kubernetes_stateful_set`, `kubernetes_daemon_set` (plus their `_v1` variants),
`kubernetes_cron_job_v1`, and `kubernetes_job_v1` in the repo and ensuring each
carries the right `ignore_changes` path:
- **kubernetes_deployment / stateful_set / daemon_set / job_v1**:
`spec[0].template[0].spec[0].dns_config`
- **kubernetes_cron_job_v1**:
`spec[0].job_template[0].spec[0].template[0].spec[0].dns_config`
(extra `job_template[0]` nesting — the CronJob's PodTemplateSpec is
one level deeper)
Each injection / extension is tagged `# KYVERNO_LIFECYCLE_V1: Kyverno
admission webhook mutates dns_config with ndots=2` inline so the
suppression is discoverable via `rg 'KYVERNO_LIFECYCLE_V1' stacks/`.
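Concretely, the two injected shapes (same convention as the Wave 3A sites):
```hcl
# kubernetes_deployment / kubernetes_stateful_set / kubernetes_daemon_set / kubernetes_job_v1
lifecycle {
  # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
  ignore_changes = [spec[0].template[0].spec[0].dns_config]
}

# kubernetes_cron_job_v1 — the PodTemplateSpec sits one level deeper, under job_template
lifecycle {
  # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
  ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
}
```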
Two insertion paths are handled by a Python pass (`/tmp/add_dns_config_ignore.py`):
1. **No existing `lifecycle {}`**: inject a brand-new block just before the
resource's closing `}`. 108 new blocks on 93 files.
2. **Existing `lifecycle {}` (usually for `DRIFT_WORKAROUND: CI owns image tag`
from Wave 4, commit a62b43d1)**: extend its `ignore_changes` list with the
dns_config path. Handles both inline (`= [x]`) and multiline
(`= [\n x,\n]`) forms; ensures the last pre-existing list item carries
a trailing comma so the extended list is valid HCL. 34 extensions.
The script skips anything already mentioning `dns_config` inside an
`ignore_changes`, so re-running is a no-op.
## Scale
- 142 total lifecycle injections/extensions
- 93 `.tf` files touched
- 108 brand-new `lifecycle {}` blocks + 34 extensions of existing ones
- Every Tier 0 and Tier 1 stack with a pod-owning resource is covered
- Together with Wave 3A's 27 pre-existing markers → **169 greppable
`KYVERNO_LIFECYCLE_V1` dns_config sites across the repo**
## What is NOT in this change
- `stacks/trading-bot/main.tf` — entirely commented-out block (`/* … */`).
Python script touched the file, reverted manually.
- `_template/main.tf.example` skeleton — kept minimal on purpose; any
future stack created from it should either inherit the Wave 3A one-line
form or add its own on first `kubernetes_deployment`.
- `terraform fmt` fixes to pre-existing alignment issues in meshcentral,
nvidia/modules/nvidia, vault — unrelated to this commit. Left for a
separate fmt-only pass.
- Non-pod resources (`kubernetes_service`, `kubernetes_secret`,
`kubernetes_manifest`, etc.) — they don't own pods so they don't get
Kyverno dns_config mutation.
## Verification
Random sample post-commit:
```
$ cd stacks/navidrome && ../../scripts/tg plan → No changes.
$ cd stacks/f1-stream && ../../scripts/tg plan → No changes.
$ cd stacks/frigate && ../../scripts/tg plan → No changes.
$ rg -c 'KYVERNO_LIFECYCLE_V1' stacks/ -g '*.tf' -g '*.tf.example' \
| awk -F: '{s+=$2} END {print s}'
169
```
## Reproduce locally
1. `git pull`
2. `rg 'KYVERNO_LIFECYCLE_V1' stacks/ | wc -l` → 169+
3. `cd stacks/navidrome && ../../scripts/tg plan` → expect 0 drift on
the deployment's dns_config field.
Refs: code-seq (Wave 3B dns_config class closed; kubernetes_manifest
annotation class handled separately in 8d94688d for tls_secret)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Context
Wave 3B-continued: the Goldilocks VPA dashboard (stacks/vpa) runs a Kyverno
ClusterPolicy `goldilocks-vpa-auto-mode` that mutates every namespace with
`metadata.labels["goldilocks.fairwinds.com/vpa-update-mode"] = "off"`. This
is intentional — Terraform owns container resource limits, and Goldilocks
should only provide recommendations, never auto-update. The label is how
Goldilocks decides per-namespace whether to run its VPA in `off` mode.
Effect on Terraform: every `kubernetes_namespace` resource shows the label
as pending-removal (`-> null`) on every `scripts/tg plan`. Dawarich survey
2026-04-18 confirmed the drift. Cluster-side count: 88 namespaces carry the
label (`kubectl get ns -o json | jq ... | wc -l`). Every TF-managed namespace
is affected.
This commit brings the intentional admission drift under the same
`# KYVERNO_LIFECYCLE_V1` discoverability marker introduced in c9d221d5 for
the ndots dns_config pattern. The marker now stands generically for any
Kyverno admission-webhook drift suppression; the inline comment records
which specific policy stamps which specific field so future grep audits
show why each suppression exists.
## This change
107 `.tf` files touched — every stack's `kubernetes_namespace` resource gets:
```hcl
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
```
Injection was done with a brace-depth-tracking Python pass (`/tmp/add_goldilocks_ignore.py`):
match `^resource "kubernetes_namespace" ` → track `{` / `}` until the
outermost closing brace → insert the lifecycle block before the closing
brace. The script is idempotent (skips any file that already mentions
`goldilocks.fairwinds.com/vpa-update-mode`) so re-running is safe.
Vault stack picked up 2 namespaces in the same file (k8s-users produces
one, plus a second explicit ns) — confirmed via file diff (+8 lines).
## What is NOT in this change
- `stacks/trading-bot/main.tf` — entire file is `/* … */` commented out
(paused 2026-04-06 per user decision). Reverted after the script ran.
- `stacks/_template/main.tf.example` — per-stack skeleton, intentionally
minimal. User keeps it that way. Not touched by the script (file
has no real `resource "kubernetes_namespace"` — only a placeholder
comment).
- `.terraform/` copies (e.g. `stacks/metallb/.terraform/modules/...`) —
gitignored, won't commit; the live path was edited.
- `terraform fmt` cleanup of adjacent pre-existing alignment issues in
authentik, freedify, hermes-agent, nvidia, vault, meshcentral. Reverted
to keep the commit scoped to the Goldilocks sweep. Those files will
need a separate fmt-only commit or will be cleaned up on next real
apply to that stack.
## Verification
Dawarich (one of the hundred-plus touched stacks) showed the pattern
before and after:
```
$ cd stacks/dawarich && ../../scripts/tg plan
Before:
Plan: 0 to add, 2 to change, 0 to destroy.
# kubernetes_namespace.dawarich will be updated in-place
(goldilocks.fairwinds.com/vpa-update-mode -> null)
# module.tls_secret.kubernetes_secret.tls_secret will be updated in-place
(Kyverno generate.* labels — fixed in 8d94688d)
After:
No changes. Your infrastructure matches the configuration.
```
Injection count check:
```
$ rg -c 'KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode' stacks/ | awk -F: '{s+=$2} END {print s}'
108
```
## Reproduce locally
1. `git pull`
2. Pick any stack: `cd stacks/<name> && ../../scripts/tg plan`
3. Expect: no drift on the namespace's goldilocks.fairwinds.com/vpa-update-mode label.
Closes: code-dwx
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rewrite-body Traefik plugin (both packruler/rewrite-body v1.2.0 and
the-ccsn/traefik-plugin-rewritebody v0.1.3) silently fails on Traefik
v3.6.12 due to Yaegi interpreter issues with ResponseWriter wrapping.
Both plugins load without errors but never inject content.
Removed:
- rewrite-body plugin download (init container) and registration
- strip-accept-encoding middleware (only existed for rewrite-body bug)
- anti-ai-trap-links middleware (used rewrite-body for injection)
- rybbit_site_id variable from ingress_factory and reverse_proxy factory
- rybbit_site_id from 25 service stacks (39 instances)
- Per-service rybbit-analytics middleware CRD resources
Kept:
- compress middleware (entrypoint-level, working correctly)
- ai-bot-block middleware (ForwardAuth to bot-block-proxy)
- anti-ai-headers middleware (X-Robots-Tag: noai, noimageai)
- All CrowdSec, Authentik, rate-limit middleware unchanged
Next: Cloudflare Workers with HTMLRewriter for edge-side injection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two-tier state architecture:
- Tier 0 (infra, platform, cnpg, vault, dbaas, external-secrets): local
state with SOPS encryption in git — unchanged, required for bootstrap.
- Tier 1 (105 app stacks): PostgreSQL backend on CNPG cluster at
10.0.20.200:5432/terraform_state with native pg_advisory_lock.
Motivation: multi-operator friction (every workstation needed SOPS + age +
git-crypt), bootstrap complexity for new operators, and headless agents/CI
needing the full encryption toolchain just to read state.
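A minimal sketch of how the conditional backend can be expressed in the root
terragrunt.hcl — the tier-0 list and the pg endpoint come from this change, but
the exact expression (and how scripts/tg injects credentials) is simplified:
```hcl
# Sketch only; credential handling and Tier 1 locking behaviour live in scripts/tg.
locals {
  tier0 = ["infra", "platform", "cnpg", "vault", "dbaas", "external-secrets"]
  stack = basename(get_terragrunt_dir())
  is_t0 = contains(local.tier0, local.stack)
}

remote_state {
  backend = local.is_t0 ? "local" : "pg"
  # tomap() keeps both branches the same type so the conditional is valid
  config = local.is_t0 ? tomap({
    path = "${get_terragrunt_dir()}/terraform.tfstate" # SOPS-encrypted copy stays in git
  }) : tomap({
    conn_str = "postgres://terraform@10.0.20.200:5432/terraform_state" # creds pulled from Vault
  })
}
```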
Changes:
- terragrunt.hcl: conditional backend (local vs pg) based on tier0 list
- scripts/tg: tier detection, auto-fetch PG creds from Vault for Tier 1,
skip SOPS and Vault KV locking for Tier 1 stacks
- scripts/state-sync: tier-aware encrypt/decrypt (skips Tier 1)
- scripts/migrate-state-to-pg: one-shot migration script (idempotent)
- stacks/vault/main.tf: pg-terraform-state static role + K8s auth role
for claude-agent namespace
- stacks/dbaas: terraform_state DB creation + MetalLB LoadBalancer
service on shared IP 10.0.20.200
- Deleted 107 .tfstate.enc files for migrated Tier 1 stacks
- Cleaned up per-stack tiers.tf (now generated by root terragrunt.hcl)
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Context
Deploying new services required manually adding hostnames to
cloudflare_proxied_names/cloudflare_non_proxied_names in config.tfvars —
a separate file from the service stack. This was frequently forgotten,
leaving services unreachable externally.
## This change:
- Add `dns_type` parameter to `ingress_factory` and `reverse_proxy/factory`
modules. Setting `dns_type = "proxied"` or `"non-proxied"` auto-creates
the Cloudflare DNS record (CNAME to tunnel or A/AAAA to public IP).
- Simplify cloudflared tunnel from 100 per-hostname rules to wildcard
`*.viktorbarzin.me → Traefik`. Traefik still handles host-based routing.
- Add global Cloudflare provider via terragrunt.hcl (separate
cloudflare_provider.tf with Vault-sourced API key).
- Migrate 118 hostnames from centralized config.tfvars to per-service
dns_type. 17 hostnames remain centrally managed (Helm ingresses,
special cases).
- Update docs, AGENTS.md, CLAUDE.md, dns.md runbook.
```
BEFORE
  config.tfvars (manual list)
        |
        v
  stacks/cloudflared/
    for_each = list
    cloudflare_record
    tunnel per-hostname

AFTER
  stacks/<svc>/main.tf
    module "ingress" {
      dns_type = "proxied"
    }
        |
        v
  auto-creates cloudflare_record + annotation
```
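A usage sketch of the per-service parameter (module source and the other
arguments are illustrative, not the factory's exact interface):
```hcl
# Hypothetical service-stack usage — only dns_type is the new knob described above.
module "ingress" {
  source = "../../modules/ingress_factory"
  name   = "myservice"

  # "proxied"     -> CNAME to the Cloudflare tunnel (orange-cloud)
  # "non-proxied" -> A/AAAA record pointing at the public IP
  dns_type = "proxied"
}
```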
## What is NOT in this change:
- Uptime Kuma monitor migration (still reads from config.tfvars)
- 17 remaining centrally-managed hostnames (Helm, special cases)
- Removal of allow_overwrite (keep until migration confirmed stable)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Inbound:
- Direct MX to mail.viktorbarzin.me (ForwardEmail relay attempted and abandoned)
- Dedicated MetalLB IP 10.0.20.202 with ETP: Local for CrowdSec real-IP detection
- Removed Cloudflare Email Routing (can't store-and-forward)
- Fixed dual SPF violation, hardened to -all
- Added MTA-STS, TLSRPT, imported Rspamd DKIM into Terraform
- Removed dead BIND zones from config.tfvars (199 lines)
Outbound:
- Migrated from Mailgun (100/day) to Brevo (300/day free)
- Added Brevo DKIM CNAMEs and verification TXT
Monitoring:
- Probe frequency: 30m → 20m, alert thresholds adjusted to 60m
- Enabled Dovecot exporter scraping (port 9166)
- Added external SMTP monitor on public IP
Documentation:
- New docs/architecture/mailserver.md with full architecture
- New docs/architecture/mailserver-visual.html visualization
- Updated monitoring.md, CLAUDE.md, historical plan docs
Previously only searched for the current run's specific marker subject.
If IMAP deletion failed, old emails accumulated. Now searches for all
emails with "e2e-probe" in subject and deletes them, cleaning up any
leftovers from prior failed runs.
ENABLE_RSPAMD_REDIS=0 prevents the docker-mailserver from attempting to start
an embedded Redis server. The rspamd-redis subprocess was failing repeatedly
due to a corrupted/empty RDB file after the recent NFS-to-proxmox-lvm storage
migration. Since the DKIM signing config uses use_redis=false, Redis is not
needed.
Also correct the PVC storage request to match the actual provisioned size (2Gi).
The mismatch was causing unnecessary PVC replacement during terraform apply.
- Add public_ipv6 variable and AAAA records for all 34 non-proxied services
- Fix stale DNS records (85.130.108.6 → 176.12.22.76, old IPv6 → HE tunnel)
- Update SPF record with current IPv4/IPv6 addresses
- Add AAAA update support to Technitium DNS updater CLI
- Pin mailserver MetalLB IP to 10.0.20.201 for stable pfSense NAT
- pfSense: HE_IPv6 interface, strict firewall (80,443,25,465,587,993 + ICMPv6),
socat IPv6→IPv4 proxy, removed dangerous "Allow all DEBUG" rules
Phase 2 of platform stack split. 5 more modules extracted into
independent stacks. All applied successfully with zero destroys.
Cloudflared now reads k8s_users from Vault directly to compute
user_domains. Woodpecker pipeline runs all 8 extracted stacks
in parallel. Memory bumped to 6Gi for 9 concurrent TF processes.
Platform reduced from 27 to 19 modules.