mailserver: overhaul inbound delivery, monitoring, CrowdSec, and migrate to Brevo relay

Inbound:
- Direct MX to mail.viktorbarzin.me (ForwardEmail relay attempted and abandoned)
- Dedicated MetalLB IP 10.0.20.202 with ETP: Local for CrowdSec real-IP detection
- Removed Cloudflare Email Routing (can't store-and-forward)
- Fixed dual SPF violation, hardened to -all
- Added MTA-STS, TLSRPT, imported Rspamd DKIM into Terraform
- Removed dead BIND zones from config.tfvars (199 lines)

Outbound:
- Migrated from Mailgun (100/day) to Brevo (300/day free)
- Added Brevo DKIM CNAMEs and verification TXT

Monitoring:
- Probe frequency: 30m → 20m, alert thresholds adjusted to 60m
- Enabled Dovecot exporter scraping (port 9166)
- Added external SMTP monitor on public IP

Documentation:
- New docs/architecture/mailserver.md with full architecture
- New docs/architecture/mailserver-visual.html visualization
- Updated monitoring.md, CLAUDE.md, historical plan docs
Viktor Barzin 2026-04-12 22:24:38 +01:00
parent 8bc02d1401
commit 1c300a14cf
11 changed files with 993 additions and 53 deletions

@@ -80,7 +80,7 @@ Violations cause state drift, which causes future applies to break or silently r
**Flow**: `git push → GHA build+push DockerHub (8-char SHA) → POST Woodpecker API → kubectl set image`
-**Migrated to GHA** (9): Website, k8s-portal, f1-stream, claude-memory-mcp, apple-health-data, audiblez-web, plotting-book, insta2spotify, audiobook-search
+**Migrated to GHA** (10): Website, k8s-portal, f1-stream, claude-memory-mcp, apple-health-data, audiblez-web, plotting-book, insta2spotify, audiobook-search, council-complaints
**Woodpecker-only**: travel_blog (1.4GB content too large for GHA), infra pipelines (terragrunt apply, certbot, build-cli — need cluster access)
**Per-project files**:
@@ -89,7 +89,7 @@ Violations cause state drift, which causes future applies to break or silently r
- `.woodpecker/build-fallback.yml` — Old full build pipeline preserved (event: `deployment` — never auto-fires)
**Woodpecker API**: Uses **numeric repo IDs** (`/api/repos/2/pipelines`), NOT owner/name paths (those return HTML).
-Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handler=6, audiblez-web=9, f1-stream=10, plotting-book=43, claude-memory-mcp=78, infra-onboarding=79
+Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handler=6, audiblez-web=9, f1-stream=10, plotting-book=43, claude-memory-mcp=78, infra-onboarding=79, council-complaints=TBD
**Woodpecker YAML gotchas**:
- Commands with `${VAR}:${VAR}` must be **quoted** — unquoted `:` triggers YAML map parsing when vars are empty
@@ -131,7 +131,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
- Exclude completed CronJob pods from "pod not ready" alerts.
- Every new service gets Prometheus scrape config + Uptime Kuma monitor.
- Key alerts: OOMKill, pod replica mismatch, 4xx/5xx error rates, UPS battery, CPU temp, SSD writes, NFS responsiveness, ClusterMemoryRequestsHigh (>85%), ContainerNearOOM (>85% limit), PodUnschedulable.
-- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 30 min) sends test email via Mailgun API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (90m), `EmailRoundtripStale` (90m), `EmailRoundtripNeverRun` (2h). Vault: `mailgun_api_key` in `secret/viktor`.
+- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Mailgun API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Mailserver on dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` for CrowdSec real-IP detection. Vault: `mailgun_api_key` in `secret/viktor` (probe), `brevo_api_key` in `secret/viktor` (relay).
## Storage & Backup Architecture

Binary file not shown.

@@ -0,0 +1,665 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<base target="_blank">
<title>Mail Server Architecture — viktorbarzin.me</title>
<link href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&family=IBM+Plex+Sans:wght@400;500;600&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
<style>
:root {
--bg: #f4f7f6;
--bg-hero: linear-gradient(135deg, #e0f2f1 0%, #f0fdf4 40%, #ecfeff 100%);
--surface: #ffffff;
--surface-hover: #f0fdfa;
--border: #d1ddd9;
--border-strong: #99b3ad;
--text-primary: #0f2b26;
--text-secondary: #4a6b63;
--text-muted: #7a998f;
--primary: #0d9488;
--primary-soft: #ccfbf1;
--primary-glow: rgba(13,148,136,0.15);
--accent: #f59e0b;
--accent-soft: #fef3c7;
--success: #059669;
--success-soft: #d1fae5;
--danger: #dc2626;
--danger-soft: #fee2e2;
--warning: #d97706;
--warning-soft: #fef3c7;
--flow-line: #0d9488;
--node-bg: #f0fdfa;
--code-bg: #1e293b;
}
body.dark-mode {
--bg: #0a1612;
--bg-hero: linear-gradient(135deg, #0a1612 0%, #0f1f1a 40%, #0c1a17 100%);
--surface: #132420;
--surface-hover: #1a3330;
--border: #2a4a40;
--border-strong: #3d6b5f;
--text-primary: #e0f2ef;
--text-secondary: #9abfb5;
--text-muted: #6a9a8d;
--primary: #2dd4bf;
--primary-soft: #1a3330;
--primary-glow: rgba(45,212,191,0.12);
--accent: #fbbf24;
--accent-soft: #3d2e0a;
--success: #34d399;
--success-soft: #132e22;
--danger: #f87171;
--danger-soft: #3d1515;
--warning: #fbbf24;
--warning-soft: #3d2e0a;
--flow-line: #2dd4bf;
--node-bg: #1a3330;
--code-bg: #0c1a17;
}
* { margin:0; padding:0; box-sizing:border-box; }
body {
font-family: 'IBM Plex Sans', sans-serif;
background: var(--bg);
color: var(--text-primary);
line-height: 1.6;
}
body::before {
content: '';
position: fixed;
inset: 0;
background: var(--bg-hero);
z-index: -1;
}
.container { max-width: 1200px; margin: 0 auto; padding: 0 24px; }
/* Hero */
.hero {
padding: 80px 0 48px;
text-align: center;
}
.hero-badge {
display: inline-flex;
align-items: center;
gap: 8px;
background: var(--primary-soft);
color: var(--primary);
border: 1px solid var(--primary);
border-radius: 100px;
padding: 6px 16px;
font-size: 0.8rem;
font-weight: 600;
letter-spacing: 0.05em;
text-transform: uppercase;
font-family: 'JetBrains Mono', monospace;
}
.hero-badge .dot { width:8px; height:8px; border-radius:50%; background:var(--success); animation: pulse 2s infinite; }
@keyframes pulse { 0%,100%{opacity:1} 50%{opacity:0.4} }
.hero h1 {
font-family: 'Space Grotesk', sans-serif;
font-size: 2.8rem;
font-weight: 700;
margin: 20px 0 12px;
letter-spacing: -0.03em;
background: linear-gradient(135deg, var(--text-primary) 0%, var(--primary) 100%);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
}
.hero p { color: var(--text-secondary); font-size: 1.1rem; max-width: 600px; margin: 0 auto; }
.hero-meta {
display: flex; gap: 24px; justify-content: center; margin-top: 20px;
font-size: 0.85rem; color: var(--text-muted);
font-family: 'JetBrains Mono', monospace;
}
/* KPI Row */
.kpi-row {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
gap: 16px;
margin: 32px 0;
}
.kpi {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 14px;
padding: 20px;
text-align: center;
transition: transform 0.2s, box-shadow 0.2s;
}
.kpi:hover { transform: translateY(-2px); box-shadow: 0 8px 24px var(--primary-glow); }
.kpi-value {
font-family: 'Space Grotesk', sans-serif;
font-size: 1.8rem;
font-weight: 700;
color: var(--primary);
}
.kpi-value.accent { color: var(--accent); }
.kpi-value.success { color: var(--success); }
.kpi-label { font-size: 0.78rem; color: var(--text-muted); margin-top: 4px; text-transform: uppercase; letter-spacing: 0.06em; font-weight: 600; }
/* Section */
.section {
margin: 40px 0;
}
.section-title {
font-family: 'Space Grotesk', sans-serif;
font-size: 1.4rem;
font-weight: 600;
margin-bottom: 20px;
display: flex;
align-items: center;
gap: 10px;
}
.section-title::before {
content: '';
width: 4px;
height: 24px;
background: var(--primary);
border-radius: 2px;
}
/* Flow Diagram */
.flow-container {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 16px;
padding: 32px;
overflow-x: auto;
}
.flow {
display: flex;
align-items: center;
gap: 0;
min-width: 900px;
justify-content: center;
flex-wrap: nowrap;
}
.flow-node {
background: var(--node-bg);
border: 2px solid var(--border);
border-radius: 12px;
padding: 14px 18px;
text-align: center;
min-width: 120px;
position: relative;
transition: border-color 0.3s, box-shadow 0.3s;
}
.flow-node:hover { border-color: var(--primary); box-shadow: 0 0 20px var(--primary-glow); }
.flow-node .icon { font-size: 1.4rem; margin-bottom: 4px; }
.flow-node .label { font-size: 0.75rem; font-weight: 600; color: var(--text-primary); font-family: 'Space Grotesk', sans-serif; }
.flow-node .sublabel { font-size: 0.65rem; color: var(--text-muted); font-family: 'JetBrains Mono', monospace; margin-top: 2px; }
.flow-node.security { border-color: var(--success); background: var(--success-soft); }
.flow-arrow {
display: flex;
align-items: center;
padding: 0 4px;
color: var(--flow-line);
font-size: 0.7rem;
font-family: 'JetBrains Mono', monospace;
flex-direction: column;
gap: 2px;
}
.flow-arrow .line {
width: 40px;
height: 2px;
background: var(--flow-line);
position: relative;
}
.flow-arrow .line::after {
content: '';
position: absolute;
right: -1px;
top: -4px;
border: 5px solid transparent;
border-left: 6px solid var(--flow-line);
}
.flow-arrow .port { color: var(--text-muted); white-space: nowrap; }
/* DNS Table */
table {
width: 100%;
border-collapse: collapse;
background: var(--surface);
border: 1px solid var(--border);
border-radius: 12px;
overflow: hidden;
}
th {
background: var(--primary-soft);
text-align: left;
padding: 12px 16px;
font-family: 'Space Grotesk', sans-serif;
font-size: 0.78rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.06em;
color: var(--primary);
border-bottom: 2px solid var(--border);
}
td {
padding: 10px 16px;
border-bottom: 1px solid var(--border);
font-size: 0.88rem;
}
tr:last-child td { border-bottom: none; }
tr:hover td { background: var(--surface-hover); }
.mono { font-family: 'JetBrains Mono', monospace; font-size: 0.8rem; }
.tag {
display: inline-block;
padding: 2px 10px;
border-radius: 100px;
font-size: 0.72rem;
font-weight: 600;
letter-spacing: 0.04em;
}
.tag-green { background: var(--success-soft); color: var(--success); }
.tag-amber { background: var(--warning-soft); color: var(--warning); }
.tag-red { background: var(--danger-soft); color: var(--danger); }
.tag-teal { background: var(--primary-soft); color: var(--primary); }
/* Security Grid */
.sec-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
gap: 16px;
}
.sec-card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 14px;
padding: 24px;
border-left: 4px solid var(--success);
}
.sec-card h3 {
font-family: 'Space Grotesk', sans-serif;
font-size: 1rem;
font-weight: 600;
margin-bottom: 8px;
display: flex;
align-items: center;
gap: 8px;
}
.sec-card p, .sec-card ul { font-size: 0.85rem; color: var(--text-secondary); }
.sec-card ul { padding-left: 18px; margin-top: 6px; }
.sec-card li { margin: 4px 0; }
/* Alert Grid */
.alert-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
gap: 12px;
}
.alert-card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 12px;
padding: 16px 20px;
display: flex;
align-items: center;
gap: 14px;
}
.alert-icon {
width: 40px;
height: 40px;
border-radius: 10px;
display: flex;
align-items: center;
justify-content: center;
font-size: 1.1rem;
flex-shrink: 0;
}
.alert-icon.warn { background: var(--warning-soft); }
.alert-name { font-family: 'JetBrains Mono', monospace; font-size: 0.82rem; font-weight: 500; }
.alert-thresh { font-size: 0.75rem; color: var(--text-muted); margin-top: 2px; }
/* Callout */
.callout {
background: var(--warning-soft);
border: 1px solid var(--accent);
border-left: 4px solid var(--accent);
border-radius: 12px;
padding: 16px 20px;
display: flex;
gap: 12px;
align-items: flex-start;
margin: 24px 0;
}
.callout .icon { font-size: 1.2rem; flex-shrink: 0; }
.callout .text { font-size: 0.88rem; color: var(--text-secondary); }
.callout .text strong { color: var(--text-primary); }
/* Footer */
footer {
text-align: center;
padding: 40px 0 32px;
font-size: 0.72rem;
color: var(--text-muted);
border-top: 1px solid var(--border);
margin-top: 48px;
}
/* Entrance animations */
@keyframes fadeInUp {
from { opacity:0; transform:translateY(20px); }
to { opacity:1; transform:translateY(0); }
}
.ani { animation: fadeInUp 0.6s cubic-bezier(0.22,1,0.36,1) both; }
.d1{animation-delay:0.1s} .d2{animation-delay:0.2s} .d3{animation-delay:0.3s}
.d4{animation-delay:0.4s} .d5{animation-delay:0.5s} .d6{animation-delay:0.6s}
</style>
<!-- INFRA-MENU-CSS + INFRA-MENU-HTML -->
<nav class="viz-menu" id="vizMenu" aria-label="Visualization controls">
<button class="viz-menu-toggle" onclick="toggleMenu()" aria-label="Toggle menu" aria-expanded="false">
<svg width="18" height="18" viewBox="0 0 18 18" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round">
<line x1="3" y1="5" x2="15" y2="5"/>
<line x1="3" y1="9" x2="15" y2="9"/>
<line x1="3" y1="13" x2="15" y2="13"/>
</svg>
</button>
<div class="viz-menu-panel" id="vizMenuPanel">
<button onclick="toggleTheme()" class="viz-menu-item">
<span class="viz-menu-icon" id="themeIcon"></span>
<span id="themeLabel">Toggle dark mode</span>
</button>
<button onclick="window.print()" class="viz-menu-item">
<span class="viz-menu-icon"></span>
<span>Print / Save as PDF</span>
</button>
<button onclick="toggleFullscreen()" class="viz-menu-item">
<span class="viz-menu-icon" id="fsIcon"></span>
<span id="fsLabel">Fullscreen</span>
</button>
<hr style="border:none;border-top:1px solid var(--border,#e2e8f0);margin:2px 8px">
<button onclick="saveAsImage()" class="viz-menu-item" id="saveImgBtn">
<span class="viz-menu-icon">📷</span>
<span>Save as image</span>
</button>
</div>
</nav>
<style>
.viz-menu{position:fixed;top:1rem;right:1rem;z-index:10000;font-family:inherit}
.viz-menu-toggle{width:36px;height:36px;border-radius:8px;border:1px solid var(--border,#e2e8f0);background:var(--surface,#fff);color:var(--text-secondary,#64748b);cursor:pointer;display:flex;align-items:center;justify-content:center;transition:background .15s,border-color .15s,transform .15s;box-shadow:0 2px 8px rgba(0,0,0,.1)}
.viz-menu-toggle:hover{background:var(--surface-hover,#f8fafc);border-color:var(--border-strong,#cbd5e1);transform:scale(1.05)}
.viz-menu-panel{position:absolute;top:calc(100% + 6px);right:0;min-width:200px;background:var(--surface,#fff);border:1px solid var(--border,#e2e8f0);border-radius:10px;padding:4px;box-shadow:0 8px 24px rgba(0,0,0,.15);opacity:0;transform:translateY(-4px) scale(.97);pointer-events:none;transition:opacity .15s,transform .15s}
.viz-menu.open .viz-menu-panel{opacity:1;transform:translateY(0) scale(1);pointer-events:auto}
.viz-menu-item{display:flex;align-items:center;gap:10px;width:100%;padding:8px 12px;border:none;border-radius:7px;background:transparent;color:var(--text-primary,#0f172a);font-size:.85rem;cursor:pointer;text-align:left;transition:background .1s;font-family:inherit}
.viz-menu-item:hover{background:var(--surface-hover,#f1f5f9)}
.viz-menu-icon{width:20px;text-align:center;font-size:1rem;flex-shrink:0}
[data-animate]{opacity:0;transition-property:opacity,transform,filter;transition-duration:.6s;transition-timing-function:cubic-bezier(.22,1,.36,1)}
[data-animate="fade-up"]{transform:translateY(20px)}
[data-animate="fade-down"]{transform:translateY(-20px)}
[data-animate="scale-up"]{transform:scale(.95)}
[data-animate].is-visible{opacity:1;transform:none;filter:none}
@media print{.viz-menu{display:none!important}body{background:#fff!important;color:#000!important}.dark-mode{all:unset}.ani,[data-animate]{opacity:1!important;transform:none!important;animation:none!important;transition:none!important}@page{margin:1.5cm}}
@media(prefers-reduced-motion:reduce){*,*::before,*::after{animation-duration:.01ms!important;transition-duration:.01ms!important}[data-animate]{opacity:1!important;transform:none!important}}
</style>
</head>
<body>
<div class="container">
<!-- Hero -->
<header class="hero ani d1">
<div class="hero-badge"><span class="dot"></span> Operational</div>
<h1>Mail Server Architecture</h1>
<p>Self-hosted email infrastructure for viktorbarzin.me on Kubernetes with CrowdSec protection</p>
<div class="hero-meta">
<span>docker-mailserver 15.0.0</span>
<span>|</span>
<span>Updated 2026-04-12</span>
</div>
</header>
<!-- KPI Row -->
<div class="kpi-row ani d2">
<div class="kpi">
<div class="kpi-value">9</div>
<div class="kpi-label">DNS Records</div>
</div>
<div class="kpi">
<div class="kpi-value success">20m</div>
<div class="kpi-label">Probe Interval</div>
</div>
<div class="kpi">
<div class="kpi-value accent">60m</div>
<div class="kpi-label">Alert Threshold</div>
</div>
<div class="kpi">
<div class="kpi-value">5</div>
<div class="kpi-label">Security Layers</div>
</div>
<div class="kpi">
<div class="kpi-value success">Local</div>
<div class="kpi-label">Traffic Policy</div>
</div>
</div>
<!-- Inbound Flow -->
<div class="section ani d3">
<div class="section-title">Inbound Mail Flow</div>
<div class="flow-container">
<div class="flow">
<div class="flow-node">
<div class="icon">📧</div>
<div class="label">Sender MTA</div>
<div class="sublabel">MX lookup</div>
</div>
<div class="flow-arrow"><div class="port">:25</div><div class="line"></div></div>
<div class="flow-node">
<div class="icon">🌐</div>
<div class="label">mail.viktorbarzin.me</div>
<div class="sublabel">176.12.22.76</div>
</div>
<div class="flow-arrow"><div class="port">NAT</div><div class="line"></div></div>
<div class="flow-node security">
<div class="icon">🛡</div>
<div class="label">pfSense</div>
<div class="sublabel">port 25 fwd</div>
</div>
<div class="flow-arrow"><div class="port">10.0.20.202</div><div class="line"></div></div>
<div class="flow-node">
<div class="icon"></div>
<div class="label">MetalLB</div>
<div class="sublabel">ETP: Local</div>
</div>
<div class="flow-arrow"><div class="line"></div></div>
<div class="flow-node security">
<div class="icon">📬</div>
<div class="label">Postfix</div>
<div class="sublabel">+ CrowdSec</div>
</div>
<div class="flow-arrow"><div class="line"></div></div>
<div class="flow-node security">
<div class="icon">🔍</div>
<div class="label">Rspamd</div>
<div class="sublabel">spam/DKIM/DMARC</div>
</div>
<div class="flow-arrow"><div class="line"></div></div>
<div class="flow-node">
<div class="icon">📥</div>
<div class="label">Dovecot</div>
<div class="sublabel">IMAP :993</div>
</div>
</div>
</div>
</div>
<!-- Outbound Flow -->
<div class="section" data-animate="fade-up">
<div class="section-title">Outbound Mail Flow</div>
<div class="flow-container">
<div class="flow" style="min-width:500px">
<div class="flow-node">
<div class="icon">📬</div>
<div class="label">Postfix</div>
<div class="sublabel">relayhost</div>
</div>
<div class="flow-arrow"><div class="port">SASL+TLS :587</div><div class="line"></div></div>
<div class="flow-node" style="border-color:var(--accent)">
<div class="icon">🚀</div>
<div class="label">Brevo EU</div>
<div class="sublabel">smtp-relay.brevo.com</div>
</div>
<div class="flow-arrow"><div class="line"></div></div>
<div class="flow-node">
<div class="icon">📧</div>
<div class="label">Recipient</div>
<div class="sublabel">IP reputation handled</div>
</div>
</div>
</div>
</div>
<!-- DNS Records -->
<div class="section" data-animate="fade-up" data-delay="100">
<div class="section-title">DNS Records</div>
<table>
<thead>
<tr><th>Type</th><th>Name</th><th>Value</th><th>Status</th></tr>
</thead>
<tbody>
<tr><td><span class="tag tag-teal">MX</span></td><td class="mono">viktorbarzin.me</td><td class="mono">mail.viktorbarzin.me (pri 1)</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">A</span></td><td class="mono">mail.viktorbarzin.me</td><td class="mono">176.12.22.76 (DNS-only)</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">AAAA</span></td><td class="mono">mail.viktorbarzin.me</td><td class="mono">2001:470:6e:43d::2</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">SPF</span></td><td class="mono">viktorbarzin.me</td><td class="mono">v=spf1 include:mailgun.org <strong>-all</strong></td><td><span class="tag tag-green">Hard Fail</span></td></tr>
<tr><td><span class="tag tag-teal">DKIM</span></td><td class="mono">s1._domainkey</td><td class="mono">RSA 1024-bit (Mailgun outbound)</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">DKIM</span></td><td class="mono">mail._domainkey</td><td class="mono">RSA 2048-bit (Rspamd signing)</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">DMARC</span></td><td class="mono">_dmarc</td><td class="mono">p=quarantine; pct=100</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">MTA-STS</span></td><td class="mono">_mta-sts</td><td class="mono">v=STSv1; id=20260412</td><td><span class="tag tag-green">OK</span></td></tr>
<tr><td><span class="tag tag-teal">TLSRPT</span></td><td class="mono">_smtp._tls</td><td class="mono">rua=mailto:postmaster@viktorbarzin.me</td><td><span class="tag tag-green">OK</span></td></tr>
</tbody>
</table>
<div class="callout" data-animate="fade-up" data-delay="200">
<div class="icon"></div>
<div class="text"><strong>PTR Mismatch:</strong> Reverse DNS returns <code class="mono">176-12-22-76.pon.spectrumnet.bg</code> (ISP-assigned) instead of <code class="mono">mail.viktorbarzin.me</code>. ISP-controlled, cannot fix. Minimal impact — Gmail/Outlook rely on SPF/DKIM/DMARC.</div>
</div>
</div>
<!-- Security -->
<div class="section" data-animate="fade-up" data-delay="100">
<div class="section-title">Security Layers</div>
<div class="sec-grid">
<div class="sec-card">
<h3>🛡 CrowdSec</h3>
<ul>
<li><code class="mono">crowdsecurity/postfix</code> + <code class="mono">dovecot</code> collections</li>
<li>Real client IPs via ETP: Local on <code class="mono">10.0.20.202</code></li>
<li>Automatic brute-force detection &amp; ban</li>
</ul>
</div>
<div class="sec-card">
<h3>🔍 Rspamd</h3>
<ul>
<li>Spam filtering + phishing detection</li>
<li>DKIM signing (selector: <code class="mono">mail</code>, 2048-bit)</li>
<li>DMARC verification on inbound</li>
<li>Auto-learns from Junk folder</li>
</ul>
</div>
<div class="sec-card">
<h3>🚦 Postfix Rate Limiting</h3>
<ul>
<li>10 connections/min per client</li>
<li>30 messages/min per client</li>
<li>Now effective with real IPs (ETP: Local)</li>
</ul>
</div>
<div class="sec-card">
<h3>🔒 TLS Enforcement</h3>
<ul>
<li>Let's Encrypt wildcard cert</li>
<li>MTA-STS enforces TLS for inbound</li>
<li>TLSRPT for failure reporting</li>
<li>STARTTLS on SMTP, SSL on IMAP</li>
</ul>
</div>
</div>
</div>
<!-- Monitoring -->
<div class="section" data-animate="fade-up" data-delay="100">
<div class="section-title">Monitoring &amp; Alerts</div>
<div class="alert-grid">
<div class="alert-card">
<div class="alert-icon warn">📊</div>
<div>
<div class="alert-name">MailServerDown</div>
<div class="alert-thresh">No replicas for 5m</div>
</div>
</div>
<div class="alert-card">
<div class="alert-icon warn">📧</div>
<div>
<div class="alert-name">EmailRoundtripFailing</div>
<div class="alert-thresh">Probe failing for 60m</div>
</div>
</div>
<div class="alert-card">
<div class="alert-icon warn"></div>
<div>
<div class="alert-name">EmailRoundtripStale</div>
<div class="alert-thresh">No success in &gt;60m</div>
</div>
</div>
<div class="alert-card">
<div class="alert-icon warn"></div>
<div>
<div class="alert-name">EmailRoundtripNeverRun</div>
<div class="alert-thresh">Metric absent for 60m</div>
</div>
</div>
</div>
<div style="margin-top:20px">
<table>
<thead><tr><th>Monitor</th><th>Type</th><th>Target</th><th>Interval</th></tr></thead>
<tbody>
<tr><td>E2E Roundtrip Probe</td><td><span class="tag tag-teal">CronJob</span></td><td class="mono">Mailgun API → MX → IMAP</td><td class="mono">*/20 * * * *</td></tr>
<tr><td>SMTP External</td><td><span class="tag tag-green">Uptime Kuma</span></td><td class="mono">176.12.22.76:25</td><td class="mono">60s</td></tr>
<tr><td>Dovecot Exporter</td><td><span class="tag tag-green">Prometheus</span></td><td class="mono">:9166/metrics</td><td class="mono">scrape</td></tr>
</tbody>
</table>
</div>
</div>
<!-- Terraform -->
<div class="section" data-animate="fade-up" data-delay="100">
<div class="section-title">Terraform Stacks</div>
<table>
<thead><tr><th>Stack</th><th>Path</th><th>Resources</th></tr></thead>
<tbody>
<tr><td>Mailserver</td><td class="mono">stacks/mailserver/</td><td>Namespace, Deployment, Service, CronJob, PVCs</td></tr>
<tr><td>DNS</td><td class="mono">stacks/cloudflared/</td><td>MX, SPF, DKIM x2, DMARC, MTA-STS, TLSRPT</td></tr>
<tr><td>Monitoring</td><td class="mono">stacks/monitoring/</td><td>Prometheus alert rules</td></tr>
<tr><td>CrowdSec</td><td class="mono">stacks/crowdsec/</td><td>postfix + dovecot collections, log acquisition</td></tr>
</tbody>
</table>
</div>
<footer>
Viktor Barzin · 2026-04-12<br>
Generated by <a href="https://fburl.com/claude-templates/k5rd6ab0" style="color:var(--text-muted)">/visualize</a> claude skill
</footer>
</div>
<!-- INFRA-MENU-JS -->
<script>
function toggleMenu(){const m=document.getElementById('vizMenu');const t=m.querySelector('.viz-menu-toggle');const o=m.classList.toggle('open');t.setAttribute('aria-expanded',o)}
document.addEventListener('click',function(e){const m=document.getElementById('vizMenu');if(m&&!m.contains(e.target))m.classList.remove('open')});
document.addEventListener('keydown',function(e){if(e.key==='Escape'){const m=document.getElementById('vizMenu');if(m)m.classList.remove('open')}});
function toggleTheme(){const d=document.body.classList.toggle('dark-mode');localStorage.setItem('visualize-dark-mode',d);updateThemeLabel()}
function updateThemeLabel(){const d=document.body.classList.contains('dark-mode');const i=document.getElementById('themeIcon');const l=document.getElementById('themeLabel');if(i)i.textContent=d?'☀':'◐';if(l)l.textContent=d?'Light mode':'Dark mode'}
if(localStorage.getItem('visualize-dark-mode')==='true'||(!localStorage.getItem('visualize-dark-mode')&&window.matchMedia('(prefers-color-scheme: dark)').matches)){document.body.classList.add('dark-mode')}
updateThemeLabel();
function toggleFullscreen(){if(!document.fullscreenElement){document.documentElement.requestFullscreen().catch(()=>{})}else{document.exitFullscreen()}}
document.addEventListener('fullscreenchange',function(){const i=document.getElementById('fsIcon');const l=document.getElementById('fsLabel');const f=!!document.fullscreenElement;if(i)i.textContent='⛶';if(l)l.textContent=f?'Exit fullscreen':'Fullscreen'});
(function initScrollReveal(){if(window.matchMedia('(prefers-reduced-motion: reduce)').matches){document.querySelectorAll('[data-animate]').forEach(function(el){el.classList.add('is-visible')});return}var o=new IntersectionObserver(function(entries){entries.forEach(function(entry){if(entry.isIntersecting){var el=entry.target;var d=parseInt(el.getAttribute('data-delay')||'0',10);if(d>0){setTimeout(function(){el.classList.add('is-visible')},d)}else{el.classList.add('is-visible')}o.unobserve(el)}})},{threshold:0.15});document.querySelectorAll('[data-animate]').forEach(function(el){o.observe(el)})})();
function loadHtml2Canvas(){if(typeof html2canvas!=='undefined')return Promise.resolve();return new Promise(function(r,j){var s=document.createElement('script');s.src='https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js';s.integrity='sha384-ZZ1pncU3bQe8y31yfZdMFdSpttDoPmOZg2wguVK9almUodir1PghgT0eY7Mrty8H';s.crossOrigin='anonymous';s.onload=r;s.onerror=function(){j(new Error('html2canvas load failed'))};document.head.appendChild(s)})}
function saveAsImage(){var b=document.getElementById('saveImgBtn');var l=b.querySelector('span:last-child');var o=l.textContent;l.textContent='Loading...';b.style.pointerEvents='none';loadHtml2Canvas().then(function(){l.textContent='Capturing...';doCapture(b,l,o)}).catch(function(){l.textContent=o;b.style.pointerEvents='';alert('Could not load screenshot library.')})}
function doCapture(b,l,o){var m=document.getElementById('vizMenu');m.style.display='none';var t=document.querySelector('.container')||document.body;var s=document.createElement('style');s.id='html2canvas-capture-overrides';s.textContent='*,*::before,*::after{animation-play-state:paused!important;animation-delay:0s!important;animation-duration:0s!important;transition-duration:0s!important}.ani,.animate-in,[class*="delay-"],[data-animate]{opacity:1!important;transform:none!important;filter:none!important}';document.head.appendChild(s);var bg=t.style.background;if(t!==document.body){t.style.background=getComputedStyle(document.body).backgroundColor||'#fff'}var sp=window.scrollY;window.scrollTo(0,0);var cap=function(){void t.offsetHeight;html2canvas(t,{scale:2,useCORS:true,allowTaint:true,logging:false}).then(function(c){var a=document.createElement('a');a.download=(document.title||'visualization')+'.png';a.href=c.toDataURL('image/png');a.click()}).catch(function(){alert('Screenshot failed.')}).finally(function(){s.remove();t.style.background=bg;m.style.display='';l.textContent=o;b.style.pointerEvents='';m.classList.remove('open');window.scrollTo(0,sp)})};if(document.fonts&&document.fonts.ready){document.fonts.ready.then(function(){setTimeout(cap,50)})}else{setTimeout(cap,500)}}
</script>
</body>
</html>

@@ -0,0 +1,246 @@
# Mail Server Architecture
Last updated: 2026-04-12 (Brevo relay migration)
## Overview
Self-hosted email for `viktorbarzin.me` using docker-mailserver 15.0.0 on Kubernetes. Inbound mail arrives directly via the MX record at the home IP on port 25. Outbound mail relays through Brevo EU. Roundcubemail provides webmail access. CrowdSec protects SMTP/IMAP from brute-force attacks using real client IPs via `externalTrafficPolicy: Local` on a dedicated MetalLB IP.
## Architecture Diagram
```mermaid
graph TB
subgraph "Inbound Mail"
SENDER[Sending MTA] -->|MX lookup| MX[mail.viktorbarzin.me:25]
MX -->|176.12.22.76:25| PF[pfSense NAT]
PF -->|10.0.20.202:25| MLB[MetalLB<br/>ETP: Local]
MLB --> POSTFIX[Postfix MTA]
end
subgraph "Mail Processing"
POSTFIX --> RSPAMD[Rspamd<br/>Spam/DKIM/DMARC]
RSPAMD --> DOVECOT[Dovecot IMAP]
DOVECOT --> MAILBOX[(Mailboxes<br/>proxmox-lvm PVC)]
end
subgraph "Outbound Mail"
POSTFIX_OUT[Postfix] -->|SASL + TLS| BREVO[Brevo EU Relay<br/>smtp-relay.brevo.com:587]
BREVO --> RECIPIENT[Recipient]
end
subgraph "Webmail"
USER[User] -->|HTTPS| TRAEFIK[Traefik Ingress]
TRAEFIK --> RC[Roundcubemail]
RC -->|IMAP 993| DOVECOT
RC -->|SMTP 587| POSTFIX_OUT
end
subgraph "Security"
MLB -->|Real client IPs| CS_AGENT[CrowdSec Agent<br/>postfix + dovecot parsers]
CS_AGENT --> CS_LAPI[CrowdSec LAPI]
end
subgraph "Monitoring"
PROBE[E2E Roundtrip Probe<br/>CronJob every 20m] -->|Mailgun API| SENDER
PROBE -->|IMAP check| DOVECOT
PROBE --> PUSH[Pushgateway + Uptime Kuma]
DEXP[Dovecot Exporter<br/>:9166] --> PROM[Prometheus]
end
```
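The monitoring path in the diagram (send a test message, look for it over IMAP, clean up, report) can be sketched as a small control loop. This is a minimal illustration, not the actual CronJob script: `run_probe` and its `send`, `found`, and `delete` callables are hypothetical stand-ins for the Mailgun API call and the IMAP search/delete steps, and the timeout and poll values are assumptions.

```python
import time
import uuid
from typing import Callable


def run_probe(
    send: Callable[[str], None],     # hypothetical: submit a test mail carrying the marker
    found: Callable[[str], bool],    # hypothetical: True once the marker is visible via IMAP
    delete: Callable[[str], None],   # hypothetical: remove the test mail again
    timeout_s: float = 120.0,        # assumed delivery deadline
    poll_s: float = 5.0,             # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Run one roundtrip and return values suitable for pushing as metrics."""
    marker = f"roundtrip-{uuid.uuid4().hex}"  # unique per run, so stale mail never matches
    started = clock()
    send(marker)
    deadline = started + timeout_s
    while True:
        if found(marker):
            delete(marker)  # keep the catch-all mailbox clean
            return {"success": 1, "duration_s": clock() - started}
        if clock() >= deadline:
            return {"success": 0, "duration_s": clock() - started}
        sleep(poll_s)
```

Injecting the transport functions keeps the loop testable without network access; a real probe would pass functions that wrap the HTTP and IMAP clients and then push the returned dictionary to Pushgateway.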
## Components
| Component | Version | Location | Purpose |
|-----------|---------|----------|---------|
| docker-mailserver | 15.0.0 | `mailserver` namespace | Postfix MTA + Dovecot IMAP + Rspamd |
| Roundcubemail | 1.6.13-apache | `mailserver` namespace | Webmail UI (MySQL-backed) |
| Dovecot Exporter | latest | Sidecar in mailserver pod | Prometheus metrics (port 9166) |
| Rspamd | Built into docker-mailserver | — | Spam filtering, DKIM signing, DMARC verification |
| Brevo EU (ex-Sendinblue) | SaaS | — | Outbound SMTP relay (300/day free) |
## Mail Flow
### Inbound
```
Internet → MX: mail.viktorbarzin.me (priority 1)
→ A record: 176.12.22.76 (non-proxied Cloudflare DNS-only)
→ pfSense NAT: port 25 → 10.0.20.202:25
→ MetalLB (dedicated IP, ETP: Local — preserves real client IPs)
→ Postfix → Rspamd (spam + DKIM + DMARC check) → Dovecot → mailbox
```
No backup MX. If the server is down, sending MTAs queue the mail and keep retrying; RFC 5321 recommends a give-up time of at least 4-5 days.
### Outbound
```
Postfix → relayhost [smtp-relay.brevo.com]:587 (SASL auth + TLS required)
→ Brevo handles IP reputation, deliverability, bounce processing
→ 300 emails/day free tier (migrated from Mailgun 100/day on 2026-04-12)
```
### Webmail
```
https://mail.viktorbarzin.me → Traefik → Roundcubemail
IMAP: ssl://mailserver:993 (internal K8s service)
SMTP: tls://mailserver:587 (internal K8s service)
DB: MySQL (mysql.dbaas.svc.cluster.local)
```
## DNS Records
All managed in Terraform at `stacks/cloudflared/modules/cloudflared/cloudflare.tf`.
| Type | Name | Value | Purpose |
|------|------|-------|---------|
| MX | `viktorbarzin.me` | `mail.viktorbarzin.me` (pri 1) | Inbound mail routing |
| A | `mail.viktorbarzin.me` | `176.12.22.76` (non-proxied) | Mail server IP |
| AAAA | `mail.viktorbarzin.me` | `2001:470:6e:43d::2` | IPv6 (HE tunnel) |
| TXT (SPF) | `viktorbarzin.me` | `v=spf1 include:mailgun.org -all` | Authorize Mailgun for outbound |
| TXT (DKIM) | `s1._domainkey` | RSA 1024-bit key | Mailgun DKIM (roundtrip probe) |
| TXT (DKIM) | `mail._domainkey` | RSA 2048-bit key | Rspamd self-hosted DKIM signing |
| CNAME (DKIM) | `brevo1._domainkey` | b1.viktorbarzin-me.dkim.brevo.com | Brevo outbound DKIM (delegated) |
| CNAME (DKIM) | `brevo2._domainkey` | b2.viktorbarzin-me.dkim.brevo.com | Brevo outbound DKIM (delegated) |
| TXT | `viktorbarzin.me` | `brevo-code:a6ef1dd9...` | Brevo domain verification |
| TXT (DMARC) | `_dmarc` | `p=quarantine; pct=100` | DMARC enforcement, reports to Mailgun + ondmarc |
| TXT (MTA-STS) | `_mta-sts` | `v=STSv1; id=20260412` | TLS enforcement for inbound |
| TXT (TLSRPT) | `_smtp._tls` | `v=TLSRPTv1; rua=mailto:postmaster@...` | TLS failure reporting |
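
The DMARC, MTA-STS and TLSRPT values above are all semicolon-separated `tag=value` lists. As a rough illustration (a sketch, not code from this repository; the function name `parse_tag_list` is hypothetical), such a record can be split into a dictionary to check which policy is in effect:

```python
def parse_tag_list(record: str) -> dict[str, str]:
    """Split a 'tag=value; tag=value' TXT record into a dict."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only, so values like 'mailto:a=b' survive.
        key, _, value = part.partition("=")
        tags[key.strip()] = value.strip()
    return tags


# Shortened copy of the DMARC record, for illustration only.
dmarc = parse_tag_list("v=DMARC1; p=quarantine; pct=100; fo=1; ri=3600;")
print(dmarc["p"], dmarc["pct"])
```

Feeding it the output of `dig TXT _dmarc.viktorbarzin.me +short` (with the surrounding quotes stripped) would be one way to confirm that the published policy matches what Terraform declares.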
### Known Limitation: PTR Mismatch
Reverse DNS for `176.12.22.76` returns `176-12-22-76.pon.spectrumnet.bg.` (ISP-assigned) instead of `mail.viktorbarzin.me`. This is ISP-controlled and cannot be changed on a residential connection. Most modern providers (Gmail, Outlook) rely on SPF/DKIM/DMARC rather than PTR, so impact is minimal.
## Security
### CrowdSec Integration
- **Collections**: `crowdsecurity/postfix` + `crowdsecurity/dovecot` (installed)
- **Log acquisition**: CrowdSec agents parse mailserver pod logs for brute-force patterns
- **Real client IPs**: `externalTrafficPolicy: Local` on dedicated MetalLB IP `10.0.20.202` preserves original client IPs (not SNATed to node IPs)
- **Decisions**: CrowdSec bans/challenges attackers via firewall bouncer rules
### Rspamd
- Spam filtering with phishing detection and Oletools
- DKIM signing (selector `mail`, 2048-bit RSA)
- DMARC verification on inbound mail
- Auto-learns from Junk folder movements (`RSPAMD_LEARN=1`)
- SRS (Sender Rewriting Scheme) enabled for forwarded mail
### Postfix Rate Limiting
```
# Limits apply per client, per anvil_rate_time_unit (60s here).
# Postfix does not support trailing comments on a parameter line.
smtpd_client_connection_rate_limit = 10
smtpd_client_message_rate_limit = 30
anvil_rate_time_unit = 60s
```
### TLS
- Wildcard Let's Encrypt cert (`*.viktorbarzin.me`) for SMTP STARTTLS and IMAPS
- Renewed via Woodpecker CI cron pipeline (DNS-01 challenge via Cloudflare)
- MTA-STS enforces TLS for inbound delivery
## Monitoring
### E2E Roundtrip Probe
CronJob `email-roundtrip-monitor` (every 20 min):
1. Sends test email via Mailgun HTTP API to `smoke-test@viktorbarzin.me`
2. Email hits MX → Postfix → catch-all delivers to `spam@` mailbox
3. Verifies delivery via IMAP (searches by UUID marker)
4. Deletes test email, pushes metrics to Pushgateway + Uptime Kuma
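
The offline core of these steps can be sketched as follows. This is only an illustration under stated assumptions, not the probe's actual script: the function names and the `e2e-probe-` marker prefix are hypothetical, the Mailgun and IMAP calls are left out, and pushing the last-success timestamp only on success is an assumption. The metric names are the ones the probe reports to Pushgateway.

```python
import uuid


def new_marker() -> str:
    """Unique token placed in the test email's subject (prefix is hypothetical)."""
    return f"e2e-probe-{uuid.uuid4()}"


def delivered(subjects: list[str], marker: str) -> bool:
    """True if any subject fetched over IMAP contains the marker."""
    return any(marker in s for s in subjects)


def pushgateway_body(success: bool, duration_s: float, now: float) -> str:
    """Render the probe metrics in the Prometheus text exposition format."""
    lines = [
        f"email_roundtrip_success {1 if success else 0}",
        f"email_roundtrip_duration_seconds {duration_s:.3f}",
    ]
    if success:
        # Assumption: the timestamp is only refreshed after a successful run.
        lines.append(f"email_roundtrip_last_success_timestamp {now:.0f}")
    return "\n".join(lines) + "\n"


marker = new_marker()
inbox = [f"Smoke test {marker}", "unrelated message"]  # stand-in for IMAP results
print(pushgateway_body(delivered(inbox, marker), 4.2, 1776000000.0), end="")
```

Keeping the marker random per run means a leftover message from an earlier run cannot produce a false success.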
### Prometheus Alerts
| Alert | Threshold | Severity |
|-------|-----------|----------|
| MailServerDown | No replicas for 5m | warning |
| EmailRoundtripFailing | Probe failing for 60m | warning |
| EmailRoundtripStale | No success in >60m | warning |
| EmailRoundtripNeverRun | Metric absent for 60m | warning |
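
The `EmailRoundtripStale` rule shipped with this commit evaluates `(time() - email_roundtrip_last_success_timestamp) > 3600`. The same comparison, written as a small Python function for reasoning about when the alert would start pending (the name `is_stale` is hypothetical, and the rule's additional `for: 10m` hold is not modelled):

```python
def is_stale(now: float, last_success: float, threshold_s: float = 3600) -> bool:
    """Mirror of `(time() - last_success_timestamp) > threshold`."""
    return (now - last_success) > threshold_s


last_ok = 1_776_000_000.0
print(is_stale(last_ok + 1200, last_ok))  # one probe interval later
print(is_stale(last_ok + 3601, last_ok))  # just past the threshold
```

The comparison is strict, so a success exactly 3600 seconds old does not yet count as stale.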
### Uptime Kuma Monitors
- TCP SMTP on `176.12.22.76:25` (external, 60s interval)
- TCP IMAP on `10.0.20.202:993` (internal)
- E2E Push monitor (receives push from roundtrip probe)
### Dovecot Exporter
- Sidecar container in mailserver pod, port 9166
- Scraped by Prometheus for IMAP connection metrics
## Terraform
| Stack | Path | Resources |
|-------|------|-----------|
| Mailserver | `stacks/mailserver/` | Namespace, deployment, service, CronJob, PVCs |
| DNS | `stacks/cloudflared/modules/cloudflared/cloudflare.tf` | MX, SPF, DKIM, DMARC, MTA-STS, TLSRPT records |
| Monitoring | `stacks/monitoring/` | Prometheus alert rules |
| CrowdSec | `stacks/crowdsec/` | Collections, log acquisition (already configured) |
### Secrets (Vault)
| Path | Key | Purpose |
|------|-----|---------|
| `secret/platform` | `mailserver_accounts` | User credentials (JSON) |
| `secret/platform` | `mailserver_aliases` | Postfix virtual aliases |
| `secret/platform` | `mailserver_opendkim_key` | DKIM private key |
| `secret/platform` | `mailserver_sasl_passwd` | Brevo relay credentials |
| `secret/viktor` | `mailgun_api_key` | Mailgun API for E2E probe (inbound testing) |
| `secret/viktor` | `brevo_api_key` | Brevo API key (stored for reference) |
## Storage
| PVC | Size | Storage Class | Purpose |
|-----|------|---------------|---------|
| `mailserver-data-proxmox` | 2Gi (auto-resize 5Gi) | proxmox-lvm | Mail data, state, logs |
| `roundcubemail-html-proxmox` | 1Gi | proxmox-lvm | Roundcube web files |
| `roundcubemail-enigma-proxmox` | 1Gi | proxmox-lvm | Roundcube encryption |
## Decisions & Rationale
### No Backup MX
- **Alternatives considered**: ForwardEmail (free relay), Cloudflare Email Routing, Dynu Store/Forward
- **Decision**: Direct MX only. ForwardEmail relay was evaluated (2026-04-12) and abandoned — its anti-spoofing enforcement rejects legitimate forwarded mail regardless of SPF configuration. Cloudflare Email Routing can't store-and-forward (pass-through proxy only). Dynu ($9.99/yr) is a viable future option.
- **Tradeoff**: If server is down, mail delivery relies on sender MTA retry queues (4-5 days standard). No immediate forwarding to a backup address.
### Brevo for Outbound (migrated from Mailgun 2026-04-12)
- **Decision**: All outbound relays through Brevo EU (ex-Sendinblue). 300 emails/day free tier (3x Mailgun's 100/day).
- **Why migrated**: Mailgun's 100/day limit was too tight — the E2E probe uses ~72/day, leaving only 28 for real mail.
- **DKIM**: Brevo uses delegated DKIM via CNAME (`brevo1._domainkey`, `brevo2._domainkey`). Mailgun's `s1._domainkey` retained for the roundtrip probe (still uses Mailgun API for inbound testing).
- **Tradeoff**: Dependency on Brevo SaaS for outbound.
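
The quota arithmetic behind the migration can be checked with a few lines. This is a sketch that assumes a 20-minute probe interval, which is the interval that yields the ~72 sends/day quoted above; the function name `daily_budget` is hypothetical.

```python
def daily_budget(interval_min: int, daily_limit: int) -> tuple[int, int]:
    """Return (probe sends per day, sends left for real mail)."""
    probes = (24 * 60) // interval_min
    return probes, daily_limit - probes


# Mailgun free tier before the migration: probe and real mail shared 100/day.
print(daily_budget(20, 100))  # -> (72, 28)
```

After the migration the probe still sends through the Mailgun API, so Brevo's 300/day allowance is available to real outbound mail in full.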
### Rspamd over SpamAssassin/OpenDKIM
- **Decision**: Rspamd replaces both SpamAssassin and OpenDKIM in a single component
- **Tradeoff**: Higher memory usage (~150-200MB) but simpler stack
### Dedicated MetalLB IP for CrowdSec
- **Decision**: Mailserver gets `10.0.20.202` (separate from shared `10.0.20.200`) with `externalTrafficPolicy: Local`
- **Why**: Shared IP with ETP: Cluster SNATs away real client IPs, making CrowdSec detections and Postfix rate limiting useless
- **Tradeoff**: Uses one extra IP from the MetalLB pool. Requires separate pfSense NAT rule.
## Troubleshooting
### Inbound mail not arriving
1. Check MX: `dig MX viktorbarzin.me +short` → should show `mail.viktorbarzin.me`
2. Check port 25: `nc -zw5 mail.viktorbarzin.me 25`
3. Check pfSense NAT rule: port 25 → `10.0.20.202:25`
4. Check Postfix logs: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep -E 'from=|reject'`
5. Check if CrowdSec is blocking the sender: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list`
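
Where `nc` is not available, the port check in step 2 can be approximated in Python. This is a minimal sketch (the helper name `port_open` is hypothetical); it only tests that a TCP connection can be established and does not speak SMTP.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Rough equivalent of `nc -zw5 host port`: attempt a TCP connect."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Calling `port_open("mail.viktorbarzin.me", 25)` from a host outside the home network exercises the same path as a sending MTA (DNS, pfSense NAT, MetalLB). Many residential ISPs block outbound port 25, so a `False` from such a connection is not conclusive.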
### Outbound mail failing
1. Check Brevo relay: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep relay` — should show `relay=smtp-relay.brevo.com`
2. Check SASL credentials: `vault kv get -field=mailserver_sasl_passwd secret/platform` — should show `[smtp-relay.brevo.com]:587`
3. Check Brevo dashboard for delivery status
4. SASL auth failure → verify SMTP key (xsmtpsib-...) and login (a7e778001@smtp-brevo.com)
### E2E roundtrip probe failing
1. Check CronJob: `kubectl get cronjob -n mailserver email-roundtrip-monitor`
2. Check job logs: `kubectl logs -n mailserver -l job-name --tail=20`
3. Check Mailgun rate limit (HTTP 429 errors mean too many API calls)
4. Check IMAP login: verify `spam@viktorbarzin.me` password in Vault (`secret/platform`, key `mailserver_accounts`)
### Spam/brute-force attacks
1. Check CrowdSec decisions: `kubectl exec -n crowdsec deploy/crowdsec-lapi -- cscli decisions list`
2. Check Postfix logs for auth failures: `kubectl logs -n mailserver deploy/mailserver -c docker-mailserver | grep 'authentication failed'`
3. Verify real client IPs in logs (not 10.0.20.x node IPs)
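
Step 3 can be automated when scanning many log lines. The sketch below assumes the node network is `10.0.20.0/24`, inferred from the "10.0.20.x" wording above; the helper name `looks_snated` is hypothetical.

```python
import ipaddress

# Assumption: node/LB addresses live in 10.0.20.0/24 (inferred from "10.0.20.x").
NODE_NET = ipaddress.ip_network("10.0.20.0/24")


def looks_snated(ip: str) -> bool:
    """True if a logged client IP is actually a node-network address."""
    return ipaddress.ip_address(ip) in NODE_NET


for ip in ["10.0.20.101", "203.0.113.7"]:
    label = "node IP (SNAT suspected)" if looks_snated(ip) else "real client IP"
    print(ip, label)
```

If external senders start showing up with node-network addresses, the service has probably lost `externalTrafficPolicy: Local`, and CrowdSec decisions would then target cluster nodes rather than attackers.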
## Related
- [Monitoring Architecture](monitoring.md) — alert definitions, Uptime Kuma
- [Networking Architecture](networking.md) — MetalLB, pfSense NAT, Cloudflare DNS
- [Security Architecture](security.md) — CrowdSec deployment
- [Secrets Management](secrets.md) — Vault paths for mail credentials
- [Mailserver Hardening Plan](../plans/2026-02-23-mailserver-hardening-plan.md) — historical

@ -15,7 +15,7 @@ graph TB
GPU[NVIDIA GPU via dcgm-exporter]
UPS[UPS Exporter]
NFS[NFS Exporter]
EMAIL[Email Roundtrip Probe<br/>CronJob every 30m]
EMAIL[Email Roundtrip Probe<br/>CronJob every 20m]
end
subgraph "Monitoring Stack (platform stack)"
@ -148,11 +148,11 @@ spec:
- **4xx/5xx Error Rates**: HTTP error rate threshold exceeded
#### Email Monitoring Alerts
- **EmailRoundtripFailing**: E2E email probe returning failure for >90m
- **EmailRoundtripStale**: No successful email round-trip in >90m
- **EmailRoundtripNeverRun**: Email probe has never reported (CronJob not running)
- **EmailRoundtripFailing**: E2E email probe returning failure for >60m
- **EmailRoundtripStale**: No successful email round-trip in >60m
- **EmailRoundtripNeverRun**: Email probe metric absent for 60m (CronJob not running)
The email monitoring system uses a CronJob (`email-roundtrip-monitor`, every 30 min) in the `mailserver` namespace that:
The email monitoring system uses a CronJob (`email-roundtrip-monitor`, every 20 min) in the `mailserver` namespace that:
1. Sends a test email via Mailgun HTTP API to `smoke-test@viktorbarzin.me`
2. Email lands in the `spam@` catch-all mailbox via MX delivery
3. Verifies delivery via IMAP (searches by UUID marker in subject)
@ -160,7 +160,7 @@ The email monitoring system uses a CronJob (`email-roundtrip-monitor`, every 30
5. Pushes metrics (`email_roundtrip_success`, `email_roundtrip_duration_seconds`, `email_roundtrip_last_success_timestamp`) to Prometheus Pushgateway
6. Pushes status to Uptime Kuma E2E Push monitor
Uptime Kuma also has TCP monitors for SMTP (port 25) and IMAP (port 993) on `10.0.20.200`.
Uptime Kuma monitors: TCP SMTP (port 25) on `176.12.22.76` (external) and TCP IMAP (port 993) on `10.0.20.202`. Prometheus scrapes the Dovecot exporter on port 9166.
#### Backup Alerts
- **PostgreSQLBackupStale**: >36h since last backup

@ -2,6 +2,7 @@
**Date**: 2026-02-23
**Scope**: Security, reliability, and hygiene improvements to the docker-mailserver stack
**Status**: Completed. ForwardEmail relay removed 2026-04-12 — MX now direct to mail.viktorbarzin.me on dedicated MetalLB IP with CrowdSec protection.
## Current State

@ -4,6 +4,8 @@
**Goal:** Harden the mail server with spam filtering (Rspamd), DMARC enforcement, rate limiting, monitoring alerts, and hygiene cleanup.
**Status**: Completed. ForwardEmail references in this plan are historical — relay removed 2026-04-12. MX points directly to mail.viktorbarzin.me.
**Architecture:** All changes are to the existing docker-mailserver 15.0.0 deployment managed by Terraform. Rspamd replaces OpenDKIM for DKIM signing and adds spam filtering. DMARC moves from `none` to `quarantine` in Cloudflare DNS. Postfix gets rate-limiting parameters. Prometheus gets a mailserver-down alert. Roundcubemail debug logging is disabled and image pinned.
**Tech Stack:** Terraform/HCL, docker-mailserver, Rspamd, Cloudflare DNS, Prometheus

@ -113,46 +113,19 @@ resource "cloudflare_record" "non_proxied_dns_record_ipv6" {
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "forwardemail_mx1" {
content = "mx1.forwardemail.net"
resource "cloudflare_record" "mail_mx" {
content = "mail.viktorbarzin.me"
name = "viktorbarzin.me"
proxied = false
ttl = 1
type = "MX"
priority = 10
priority = 1
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "forwardemail_mx2" {
content = "mx2.forwardemail.net"
name = "viktorbarzin.me"
proxied = false
ttl = 1
type = "MX"
priority = 10
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "forwardemail_config" {
content = "\"forward-email=mail.viktorbarzin.me\""
name = "viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "forwardemail_port" {
content = "\"forward-email-port=266\""
name = "viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "mail_domainkey" {
content = "\"k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDIDLB8mhAHNqs1s6GeZMQHOxWweoNKIrqo5tqRM3yFilgfPUX34aTIXNZg9xAmlK+2S/xXO1ymt127ZGMjnoFKOEP8/uZ54iHTCnioHaPZWMfJ7o6TYIXjr+9ShKfoJxZLv7lHJ2wKQK3yOw4lg4cvja5nxQ6fNoGRwo+mQ/mgJQIDAQAB\""
content = "\"v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDIDLB8mhAHNqs1s6GeZMQHOxWweoNKIrqo5tqRM3yFilgfPUX34aTIXNZg9xAmlK+2S/xXO1ymt127ZGMjnoFKOEP8/uZ54iHTCnioHaPZWMfJ7o6TYIXjr+9ShKfoJxZLv7lHJ2wKQK3yOw4lg4cvja5nxQ6fNoGRwo+mQ/mgJQIDAQAB\""
name = "s1._domainkey.viktorbarzin.me"
proxied = false
ttl = 1
@ -162,7 +135,7 @@ resource "cloudflare_record" "mail_domainkey" {
}
resource "cloudflare_record" "mail_spf" {
content = "\"v=spf1 include:mailgun.org ~all\""
content = "\"v=spf1 include:mailgun.org -all\""
name = "viktorbarzin.me"
proxied = false
ttl = 1
@ -171,6 +144,60 @@ resource "cloudflare_record" "mail_spf" {
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "mail_domainkey_rspamd" {
content = "\"v=DKIM1; h=sha256; k=rsa; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAs9XHeFBKhUAEJSikXx+P49Q3nEBbnaSpn6h/9TqIhKaZWSVa2uGUGYQieNdon7DEJZ0VFo0Tvm3/UFsy2qF7ZmF+E/+N8EmkcPrMlxgJT281dpk5DxrZ+kbzw/DosfHH71K6vCLB4rSexzxJHaAx0AUddI3bFUJGjMgCXXCMZF+p8YCx+DDGPIXz2FOTtlJlR7aeZ2xXavwE/lBfI3MLnsq7X+GhPjQEax070nndOdZI0S8HpZkVxdGWl1N2Ec6LukYm2RiUkEMMQHSYX7WF3JBc+CGqUyd706Iy/5oeC3UGwZSM2uLkrp8YBjmw/h1rAeyv/ITt6ZXraP/cIMRiVQIDAQAB\""
name = "mail._domainkey.viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "brevo_domainkey1" {
content = "b1.viktorbarzin-me.dkim.brevo.com."
name = "brevo1._domainkey.viktorbarzin.me"
proxied = false
ttl = 1
type = "CNAME"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "brevo_domainkey2" {
content = "b2.viktorbarzin-me.dkim.brevo.com."
name = "brevo2._domainkey.viktorbarzin.me"
proxied = false
ttl = 1
type = "CNAME"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "brevo_code" {
content = "\"brevo-code:a6ef1dd91b248559900246eb4e7ceebd\""
name = "viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "mail_mta_sts" {
content = "\"v=STSv1; id=20260412\""
name = "_mta-sts.viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "mail_tlsrpt" {
content = "\"v=TLSRPTv1; rua=mailto:postmaster@viktorbarzin.me\""
name = "_smtp._tls.viktorbarzin.me"
proxied = false
ttl = 1
type = "TXT"
zone_id = var.cloudflare_zone_id
}
resource "cloudflare_record" "mail_dmarc" {
content = "\"v=DMARC1; p=quarantine; pct=100; fo=1; ri=3600; sp=quarantine; adkim=r; aspf=r; rua=mailto:e21c0ff8@dmarc.mailgun.org,mailto:adb84997@inbox.ondmarc.com; ruf=mailto:e21c0ff8@dmarc.mailgun.org,mailto:adb84997@inbox.ondmarc.com,mailto:postmaster@viktorbarzin.me;\""
name = "_dmarc.viktorbarzin.me"

@ -68,7 +68,7 @@ resource "kubernetes_config_map" "mailserver_env_config" {
POSTFIX_REJECT_UNKNOWN_CLIENT_HOSTNAME = "1"
# TLS_LEVEL = "intermediate"
# DEFAULT_RELAY_HOST = "[smtp.sendgrid.net]:587"
DEFAULT_RELAY_HOST = "[smtp.eu.mailgun.org]:587"
DEFAULT_RELAY_HOST = "[smtp-relay.brevo.com]:587"
SPOOF_PROTECTION = "1"
SSL_TYPE = "manual"
SSL_CERT_PATH = "/tmp/ssl/tls.crt"
@ -487,14 +487,13 @@ resource "kubernetes_service" "mailserver" {
}
annotations = {
"metallb.io/loadBalancerIPs" = "10.0.20.200"
"metallb.io/allow-shared-ip" = "shared"
"metallb.io/loadBalancerIPs" = "10.0.20.202"
}
}
spec {
type = "LoadBalancer"
external_traffic_policy = "Cluster"
external_traffic_policy = "Local"
selector = {
app = "mailserver"
}
@ -549,7 +548,7 @@ resource "kubernetes_cron_job_v1" "email_roundtrip_monitor" {
concurrency_policy = "Replace"
failed_jobs_history_limit = 3
successful_jobs_history_limit = 3
schedule = "*/10 * * * *"
schedule = "*/20 * * * *"
job_template {
metadata {}
spec {

@ -3,7 +3,7 @@
variable "postfix_cf" {
default = <<EOT
#relayhost = [smtp.sendgrid.net]:587
relayhost = [smtp.eu.mailgun.org]:587
relayhost = [smtp-relay.brevo.com]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl/passwd
smtp_sasl_security_options = noanonymous

@ -995,7 +995,7 @@ serverFiles:
annotations:
summary: "PV {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }}: {{ $value | printf \"%.0f\" }}% used — auto-expansion may have failed"
- alert: PVPredictedFull
expr: predict_linear(kubelet_volume_stats_used_bytes[6h], 3600*24) > kubelet_volume_stats_capacity_bytes
expr: predict_linear(kubelet_volume_stats_used_bytes[6h], 3600*24) > kubelet_volume_stats_capacity_bytes and kubelet_volume_stats_capacity_bytes < 1099511627776
for: 1h
labels:
severity: warning
@ -1725,21 +1725,21 @@ serverFiles:
summary: "Bank sync has not succeeded in more than 48h. Check CronJob and account auth."
- alert: EmailRoundtripFailing
expr: email_roundtrip_success{job="email-roundtrip-monitor"} == 0
for: 30m
for: 60m
labels:
severity: warning
annotations:
summary: "Email round-trip probe failing. Check ForwardEmail relay, DNS, and IMAP."
summary: "Email round-trip probe failing. Check MX DNS, Postfix, Mailgun API, and IMAP."
- alert: EmailRoundtripStale
expr: (time() - email_roundtrip_last_success_timestamp{job="email-roundtrip-monitor"}) > 2400
expr: (time() - email_roundtrip_last_success_timestamp{job="email-roundtrip-monitor"}) > 3600
for: 10m
labels:
severity: warning
annotations:
summary: "Email round-trip probe has not succeeded in >40 min"
summary: "Email round-trip probe has not succeeded in >60 min"
- alert: EmailRoundtripNeverRun
expr: absent(email_roundtrip_success{job="email-roundtrip-monitor"})
for: 40m
for: 60m
labels:
severity: warning
annotations: