infra/stacks
Viktor Barzin 4c8d12229f mailserver: split healthcheck path off PROXY-aware listeners + book-search uses ClusterIP
Two coordinated fixes for the same root cause: Postfix's smtpd_upstream_proxy_protocol
listener fatals on every HAProxy health probe with `smtpd_peer_hostaddr_to_sockaddr:
... Servname not supported for ai_socktype` — the daemon respawns get throttled by
postfix master, and real client connections that land mid-respawn time out. We saw
this as ~50% timeout rate on public 587 from inside the cluster.

Layer 1 (book-search) — stacks/ebooks/main.tf:
  SMTP_HOST mail.viktorbarzin.me → mailserver.mailserver.svc.cluster.local
  Internal services should use ClusterIP, not hairpin through pfSense+HAProxy.
  12/12 OK in <28ms vs ~6/12 timeouts on the public path.

Layer 2 (pfSense HAProxy) — stacks/mailserver + scripts/pfsense-haproxy-bootstrap.php:
  Add 3 non-PROXY healthcheck NodePorts to mailserver-proxy svc:
    30145 → pod 25  (stock postscreen)
    30146 → pod 465 (stock smtps)
    30147 → pod 587 (stock submission)
  HAProxy uses `port <healthcheck-nodeport>` (per-server in advanced field) to
  redirect L4 health probes to those ports while real client traffic keeps
  going to 30125-30128 with PROXY v2.
  Result: 0 fatals/min (was 96), 30/30 probes OK on 587, e2e roundtrip 20.4s.
  Inter dropped 120000 → 5000 since log-spam concern is gone.

`option smtpchk EHLO` was tried first but flapped against postscreen (multi-line
greet + DNSBL silence + anti-pre-greet detection trip HAProxy's parser → L7RSP).
Plain TCP accept-on-port check is sufficient for both submission and postscreen.

Updated docs/runbooks/mailserver-pfsense-haproxy.md to reflect the new healthcheck
path and mark the "Known warts" entry as resolved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 19:45:33 +00:00
..
_template [infra] Establish KYVERNO_LIFECYCLE_V1 drift-suppression convention [ci skip] 2026-04-18 14:15:51 +00:00
actualbudget [infra] TrueNAS decommission — remove active references from Terraform + configs 2026-04-19 16:57:05 +00:00
affine [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
authentik priority-pass: backend c2b4ac50 — crop to card before transforming 2026-05-01 19:06:02 +00:00
beads-server [registry] bulk-clean 34 orphan manifests + beads-server image bump 2026-04-19 23:16:34 +00:00
blog [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
broker-sync [broker-sync] unsuspend IMAP + Panel 15 RSU vest reconciliation (Phase D) 2026-04-19 18:29:01 +00:00
calico [infra] Partial Calico adoption: namespaces only (Wave 5b) 2026-04-18 22:52:56 +00:00
changedetection [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
city-guesser [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
claude-agent-service [claude-agent-service] Add WOODPECKER_API_TOKEN + SLACK_WEBHOOK_URL env vars 2026-04-19 13:23:12 +00:00
claude-memory [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
cloudflared [mailserver] Route DMARC rua/ruf to dmarc@viktorbarzin.me [ci skip] 2026-04-18 23:49:14 +00:00
cnpg [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
coturn [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
crowdsec crowdsec/traefik: stop captchaing legit Immich mobile bursts 2026-04-26 09:27:16 +00:00
cyberchef [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
dashy [infra] TrueNAS decommission — remove active references from Terraform + configs 2026-04-19 16:57:05 +00:00
dawarich [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
dbaas fire-planner: add stack, Vault DB role, dashboard, DB 2026-04-25 17:27:19 +00:00
descheduler [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
diun [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
ebook2audiobook gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
ebooks mailserver: split healthcheck path off PROXY-aware listeners + book-search uses ClusterIP 2026-05-05 19:45:33 +00:00
echo [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
excalidraw [docs] TrueNAS decommission cleanup — remove references from active docs 2026-04-19 16:55:43 +00:00
external-secrets [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
f1-stream [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
fire-planner fire-planner: add stack, Vault DB role, dashboard, DB 2026-04-25 17:27:19 +00:00
foolery [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
forgejo [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
freedify [infra] TrueNAS decommission — remove active references from Terraform + configs 2026-04-19 16:57:05 +00:00
freshrss [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
frigate gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
grampsweb [monitoring] Opt-out external monitor for family/mladost3/task-webhook/torrserver; drop r730 2026-04-19 15:18:27 +00:00
hackmd [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
headscale [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
health [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
hermes-agent [hermes-agent] disable deployment — PVC permission mismatch 2026-04-22 14:31:50 +00:00
homepage [docs] TrueNAS decommission cleanup — remove references from active docs 2026-04-19 16:55:43 +00:00
immich immich: bump server to 8Gi + override tier-2-gpu quota to 20Gi 2026-04-26 20:02:28 +00:00
infra [infra] Migrate Terraform state from local SOPS to PostgreSQL backend 2026-04-16 19:33:12 +00:00
infra-maintenance [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
insta2spotify [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
isponsorblocktv [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
job-hunter [job-hunter] Bump image to 92afc38d — Frankfurter FX + comp_table COALESCE 2026-04-19 19:09:54 +00:00
jsoncrack [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
k8s-dashboard [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
k8s-portal gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
kms [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
kured [infra] Adopt kured + sentinel-gate into Terraform (Wave 5a) 2026-04-18 22:33:29 +00:00
kyverno [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
linkwarden [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
local-path [infra] Adopt local-path-provisioner into Terraform (Wave 5c) 2026-04-18 22:39:55 +00:00
mailserver mailserver: split healthcheck path off PROXY-aware listeners + book-search uses ClusterIP 2026-05-05 19:45:33 +00:00
matrix [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
meshcentral [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
metallb [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
metrics-server [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
monitoring monitoring(wealth): show pre-2024 historical data on timeseries 2026-05-05 18:43:26 +00:00
n8n [job-hunter] Add infra stack + Grafana dashboard + n8n digest workflow 2026-04-19 17:09:29 +00:00
navidrome [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
netbox [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
networking-toolbox [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
nextcloud nextcloud(backup): pin backup pod to nextcloud's node via podAffinity 2026-04-26 11:03:20 +00:00
nfs-csi [infra] TrueNAS decommission — remove active references from Terraform + configs 2026-04-19 16:57:05 +00:00
nodelocal-dns [dns] NodeLocal DNSCache — deploy DaemonSet to all nodes (WS C) 2026-04-19 15:46:41 +00:00
novelapp [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
ntfy [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
nvidia gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
onlyoffice [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
openclaw [cluster-health] Expand to 42 checks, remove pod CronJob path 2026-04-19 15:13:03 +00:00
osm_routing [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
owntracks [owntracks] Strip face avatar from hook payload + drop orphan PVC 2026-04-19 12:05:18 +00:00
paperless-ngx paperless-ngx: migrate to proxmox-lvm-encrypted 2026-04-25 16:48:53 +00:00
payslip-ingest [payslip-ingest] ActualBudget payroll sync CronJob + Panel 14 (Phase C) 2026-04-19 18:21:20 +00:00
phpipam phpipam-pfsense-import: every 5min → hourly 2026-04-26 22:48:43 +00:00
platform [infra] Add Cloudflare provider to all stack lock files and generated providers 2026-04-16 16:31:36 +00:00
plotting-book [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
poison-fountain [poison-fountain] opt ingress out of Uptime Kuma external monitor 2026-04-22 21:24:22 +00:00
priority-pass priority-pass: pin to backend 7c01448d (transplant QR into golden-position container) 2026-05-05 19:14:11 +00:00
privatebin [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
proxmox-csi [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
pvc-autoresizer [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
rbac [infra] Migrate Terraform state from local SOPS to PostgreSQL backend 2026-04-16 19:33:12 +00:00
real-estate-crawler [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
redis [redis] stabilise against node-crash flap cascade — RC1-RC5 fixes 2026-04-22 15:59:00 +00:00
reloader [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
resume [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
reverse-proxy [infra] TrueNAS decommission — remove active references from Terraform + configs 2026-04-19 16:57:05 +00:00
rybbit [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
sealed-secrets [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
send [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
servarr [servarr/mam-farming] Tune grabber for MAM's real catalogue 2026-04-19 15:46:46 +00:00
shadowsocks [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
speedtest [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
status-page [infra] Establish KYVERNO_LIFECYCLE_V1 drift-suppression convention [ci skip] 2026-04-18 14:15:51 +00:00
stirling-pdf [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
tandoor [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
technitium [technitium] zone-sync now reconciles primaryNameServerAddresses 2026-04-22 17:47:18 +00:00
terminal [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
tor-proxy [openclaw,tor-proxy] Opt task-webhook + torrserver out of external monitoring 2026-04-19 13:01:36 +00:00
trading-bot [multi] Sweep Kyverno wait-for redis annotations to redis-master 2026-04-19 12:44:46 +00:00
traefik traefik: raise websecure idleTimeout 180s -> 600s for iOS Immich -1005 2026-04-26 12:32:05 +00:00
travel_blog [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
tuya-bridge tuya-bridge: liveness probe hits /health so k8s restarts silently-hung bridge 2026-04-23 07:47:41 +00:00
uptime-kuma [reverse-proxy] Fix gw.viktorbarzin.me — point at 192.168.1.1 via EndpointSlice 2026-04-19 15:07:24 +00:00
url [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
vault fire-planner: add stack, Vault DB role, dashboard, DB 2026-04-25 17:27:19 +00:00
vaultwarden [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
vpa [infra] Migrate Terraform state from local SOPS to PostgreSQL backend 2026-04-16 19:33:12 +00:00
wealthfolio wealthfolio(daily-sync): API call CronJob, replaces rollout-restart 2026-04-29 21:21:24 +00:00
webhook_handler [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
whisper gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
wireguard [infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip] 2026-04-18 21:15:27 +00:00
woodpecker [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
xray [infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip] 2026-04-18 21:19:48 +00:00
ytdlp gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00