From 77fcb08e8e69ccc35cf281e292164d5043f1e1ae Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Fri, 3 Jul 2026 14:06:19 +0000
Subject: [PATCH] mailserver: add docs@ paperless ingest mailbox (sieve sender allowlist)

Viktor asked to forward arbitrary emails with PDF attachments into
paperless-ngx, with the forwarding sender mapping 1:1 to the paperless
account that owns the document. paperless-ngx's built-in IMAP consumer
already does the sender->owner mapping, so the infra half is a dedicated
real mailbox docs@viktorbarzin.me: an explicit self-alias (the @domain
catch-all would otherwise divert it into the TripIt-swept spam@ mailbox,
whose sweeper LLM-parses and auto-replies to mail from linked senders)
plus a per-user Dovecot sieve that discards non-family senders at
delivery (chosen behaviour for unmatched senders: ignore and delete;
also keeps spam out of the guessable address).

The mailbox credential was added to Vault
secret/platform.mailserver_accounts. Paperless-side mail account + 5
per-sender rules are DB state, configured via the API per the new
runbook docs/runbooks/paperless-mail-ingest.md.
Co-Authored-By: Claude Fable 5
---
 .claude/reference/service-catalog.md          |   2 +-
 docs/architecture/mailserver.md               |  11 ++
 docs/runbooks/paperless-mail-ingest.md        | 116 ++++++++++++++++++
 .../modules/mailserver/extra/aliases.txt      |   9 ++
 .../docs-at-viktorbarzin.me.dovecot.sieve     |  17 +++
 stacks/mailserver/modules/mailserver/main.tf  |  12 ++
 6 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 docs/runbooks/paperless-mail-ingest.md
 create mode 100644 stacks/mailserver/modules/mailserver/extra/docs-at-viktorbarzin.me.dovecot.sieve

diff --git a/.claude/reference/service-catalog.md b/.claude/reference/service-catalog.md
index fa9bb908..82e11c61 100644
--- a/.claude/reference/service-catalog.md
+++ b/.claude/reference/service-catalog.md
@@ -81,7 +81,7 @@
 | ytdlp | YouTube downloader | ytdlp |
 | wealthfolio | Finance tracking | wealthfolio |
 | audiobookshelf | Audiobook server (may be merged into ebooks stack) | audiobookshelf |
-| paperless-ngx | Document management | paperless-ngx |
+| paperless-ngx | Document management. Mail ingest: forward document emails to `docs@viktorbarzin.me` — sender maps 1:1 to a paperless account (runbook `paperless-mail-ingest.md`) | paperless-ngx |
 | jsoncrack | JSON visualizer | jsoncrack |
 | servarr | Media automation (Sonarr/Radarr/etc) | servarr |
 | aiostreams | Stremio stream aggregator (Real-Debrid + Torrentio/Comet/StremThru Torz/Knaben; **MediaFusion removed 2026-06-07** — broken upstream `500`). `auth=app` (own UUID+password); stream-probe tests **both series+movie paths** with per-source breakdown (`aiostreams_streams_{comet,torrentio,stremthru_torz,knaben}`) + `aiostreams_error_streams` + `aiostreams_movie_stream_count`, success gated on Comet (workhorse) being alive; weekly NFS config + Stremio-account-collection backups to `/srv/nfs/aiostreams-backup/`. PG-backed user config (Comet timeout bumped 5s→10s 2026-06-07). | servarr/aiostreams |
diff --git a/docs/architecture/mailserver.md b/docs/architecture/mailserver.md
index 0edeffb4..8f8c56ea 100644
--- a/docs/architecture/mailserver.md
+++ b/docs/architecture/mailserver.md
@@ -161,6 +161,17 @@ https://mail.viktorbarzin.me → Traefik → Roundcubemail
 DB: MySQL (mysql.dbaas.svc.cluster.local)
 ```
 
+### Paperless ingest mailbox (docs@)
+
+`docs@viktorbarzin.me` is a dedicated real mailbox (explicit self-alias in
+`extra/aliases.txt` so the `@domain → spam@` catch-all doesn't shadow it) that
+paperless-ngx polls over IMAP; family members forward document emails to it
+and the sender maps 1:1 to a paperless account. A per-user Dovecot sieve
+(`docs-at-viktorbarzin.me.dovecot.sieve` in the `mailserver.config` ConfigMap,
+mounted as `/tmp/docker-mailserver/docs@viktorbarzin.me.dovecot.sieve`)
+discards mail from non-allowlisted senders at delivery. Full flow, sender map,
+and add-a-sender procedure: [`runbooks/paperless-mail-ingest.md`](../runbooks/paperless-mail-ingest.md).
+
 ## DNS Records
 
 All managed in Terraform at `stacks/cloudflared/modules/cloudflared/cloudflare.tf`.
diff --git a/docs/runbooks/paperless-mail-ingest.md b/docs/runbooks/paperless-mail-ingest.md
new file mode 100644
index 00000000..15028571
--- /dev/null
+++ b/docs/runbooks/paperless-mail-ingest.md
@@ -0,0 +1,116 @@
+# Paperless-ngx Mail Ingest (docs@viktorbarzin.me)
+
+Last updated: 2026-07-03 (initial build)
+
+Forward any email with document attachments to **`docs@viktorbarzin.me`** and
+paperless-ngx ingests the attachments, owned by the paperless account mapped
+from the **sender** (From) address. Built entirely from existing parts: a
+docker-mailserver mailbox + Dovecot sieve, and paperless-ngx's native mail
+consumer (the same machinery as the `utility:` rules).
+
+## Flow
+
+```
+family member forwards email ──> MX ──> docker-mailserver
+        │ postfix virtual: docs@ has an explicit self-alias (extra/aliases.txt),
+        │ so the @domain catch-all (→ spam@, swept by TripIt) does NOT apply
+        ▼
+Dovecot LMTP delivery to docs@
+        │ per-user sieve (docs@viktorbarzin.me.dovecot.sieve): sender NOT in
+        │ allowlist → discard (decision 2026-07-03: unmatched = ignore & delete)
+        ▼
+docs@ INBOX ── paperless-ngx mail task (every 10 min, PAPERLESS_EMAIL_TASK_CRON
+        │ default) applies mail rules in order: filter_from =
+        │ → consume attachments (attachments-only: inline images like
+        │ signature logos are skipped), owner = mapped user,
+        │ tag = email-ingest, title = mail subject
+        ▼
+consumed mail is MOVED to the "Processed" IMAP folder (audit trail);
+INBOX stays empty in steady state
+```
+
+## Sender → paperless account map (as built)
+
+| Sender (From)            | Paperless user | Rule            |
+|--------------------------|----------------|-----------------|
+| me@viktorbarzin.me       | root (id 3)    | forward: Viktor (me@) |
+| vbarzin@gmail.com        | root (id 3)    | forward: Viktor (gmail) |
+| viktorbarzin@meta.com    | root (id 3)    | forward: Viktor (meta) |
+| ancaelena98@gmail.com    | anca (id 4)    | forward: Anca   |
+| emil.barzin@gmail.com    | emo (id 7)     | forward: Emo    |
+
+The map lives in **two places by design** — keep them in sync:
+
+1. **Delivery gate (infra, Terraform):**
+   `stacks/mailserver/modules/mailserver/extra/docs-at-viktorbarzin.me.dovecot.sieve`
+   — senders not listed here are discarded at delivery (spam control + the
+   "ignore and delete unmatched" behaviour; paperless cannot express
+   "delete without ingesting", so this must happen before the mailbox).
+2. **Owner map (paperless DB, via API/UI):** one mail rule per sender on the
+   `docs@viktorbarzin.me` mail account. DB-state like workflows — NOT
+   Terraform.
+
+## Add a family member / sender
+
+1. Add the address to the sieve allowlist file above; commit; apply the
+   `mailserver` stack (normal apply is enough — the sieve CM key is not under
+   `ignore_changes`; Reloader restarts the pod).
+2. Clone an existing `forward:` mail rule in the paperless admin UI
+   (Mail → Rules) or via API, changing `filter_from` and the rule **owner**
+   (documents are owned by the rule owner — `assign_owner_from_rule=true`).
+   Keep: action = Move to `Processed`, attachment type = attachments-only,
+   consumption scope = attachments only, tag `email-ingest`, order after the
+   existing rules.
+
+## Operations
+
+- **Trigger a fetch immediately** (instead of waiting ≤10 min):
+  `kubectl -n paperless-ngx exec deploy/paperless-ngx -c paperless-ngx -- python3 manage.py mail_fetcher`
+- **Mailbox credentials:** Vault `secret/platform` → `mailserver_accounts`
+  JSON, key `docs@viktorbarzin.me` (also used by the paperless mail account).
+- **Inspect the mailbox:**
+  `python3 -c` IMAP to `mailserver.mailserver.svc.cluster.local:993` (in-cluster,
+  from a pod) or `mail.viktorbarzin.me:993` (externally / devvm).
+- **Paperless-side logs:** `kubectl -n paperless-ngx logs deploy/paperless-ngx | grep -i mail`
+  (also Loki, ns `paperless-ngx`). Rule/account state: `GET /api/mail_rules/`,
+  `GET /api/mail_accounts/` with the admin token
+  (k8s secret `paperless-ngx-secrets`, field `api_token`).
+- **Account/mailbox provisioning:** adding/rotating anything in
+  `mailserver_accounts` requires the ConfigMap replace workaround —
+  `scripts/tg apply mailserver -- -replace=module.mailserver.kubernetes_config_map.mailserver_config`
+  — because `postfix-accounts.cf` is under `ignore_changes`
+  (non-deterministic bcrypt; see the module comment).
+
+## Design notes / caveats
+
+- **Why not the catch-all?** Mail to unknown `@viktorbarzin.me` addresses
+  lands in `spam@`, which the TripIt `ingest-plans` CronJob sweeps every
+  15 min: it marks everything `\Seen`, LLM-parses mail from linked senders and
+  replies with ack/failure emails. Forwarded bank statements would get
+  "couldn't parse a trip" replies. `docs@` being a real mailbox bypasses that
+  path entirely; TripIt, the `smoke-test@` roundtrip probe, and `dmarc@` are
+  untouched.
+- **Spoofing:** the sender match is on the From header. Rspamd verifies
+  SPF/DKIM/DMARC on inbound mail, but gmail.com publishes `p=none`, so a
+  crafted spoof could ingest documents into a family member's account. Accepted
+  risk (worst case: unwanted documents appear, visible + deletable in
+  paperless).
+- **Not PDF-only:** any real attachment type paperless supports is consumed
+  (PDF, images, Office via the existing tika+gotenberg pipeline). Inline
+  images are excluded by `attachment_type=1`.
+- **No dedicated alerting** (deliberate, 2026-07-03): mail-task errors surface
+  in paperless logs; the mailserver inbound path is covered by
+  `email-roundtrip-monitor`. Revisit if forwards start silently failing.
+- **Workflows:** the global `payslip-webhook` + `claude-mcp-readers
+  auto-permission` workflows fire for mail-ingested docs like any other
+  consumption source (verified pre-build; payslip receiver does its own
+  filtering).
+
+## Rollback
+
+1. Disable/delete the 5 `forward:` mail rules + the `docs@` mail account
+   (paperless admin UI or API).
+2. Revert the infra commit (aliases.txt entry, sieve file, CM key + mount).
+3. Remove `docs@viktorbarzin.me` from Vault `mailserver_accounts`, then apply
+   with the `-replace` workaround above. Mail to docs@ then falls back to the
+   catch-all (spam@) like any unknown address.
diff --git a/stacks/mailserver/modules/mailserver/extra/aliases.txt b/stacks/mailserver/modules/mailserver/extra/aliases.txt
index 284c7061..5d8952af 100644
--- a/stacks/mailserver/modules/mailserver/extra/aliases.txt
+++ b/stacks/mailserver/modules/mailserver/extra/aliases.txt
@@ -19,3 +19,12 @@ plans@viktorbarzin.me spam@viktorbarzin.me
 # to trips@, or every verification/recovery send is rejected (550 sender). Also
 # routes any inbound trips@ to spam@.
 trips@viktorbarzin.me spam@viktorbarzin.me
+
+# docs@ -> docs@: explicit self-alias for the paperless-ngx ingest MAILBOX
+# (a real account in secret/platform.mailserver_accounts). Without this the
+# @domain catch-all above (Vault-side aliases) rewrites docs@ to spam@ and the
+# mail lands in the TripIt-swept catch-all mailbox instead. Same pattern as
+# me@ -> me@. Delivery-time sender allowlist: docs-at-viktorbarzin.me
+# .dovecot.sieve (mounted as docs@viktorbarzin.me.dovecot.sieve).
+# Runbook: docs/runbooks/paperless-mail-ingest.md
+docs@viktorbarzin.me docs@viktorbarzin.me
diff --git a/stacks/mailserver/modules/mailserver/extra/docs-at-viktorbarzin.me.dovecot.sieve b/stacks/mailserver/modules/mailserver/extra/docs-at-viktorbarzin.me.dovecot.sieve
new file mode 100644
index 00000000..1978517a
--- /dev/null
+++ b/stacks/mailserver/modules/mailserver/extra/docs-at-viktorbarzin.me.dovecot.sieve
@@ -0,0 +1,17 @@
+# Sender allowlist for the paperless-ngx ingest mailbox docs@viktorbarzin.me.
+# Family members forward document emails here; paperless-ngx polls the INBOX
+# over IMAP and maps each sender to a paperless account (1 mail rule per
+# sender). Decision (Viktor, 2026-07-03): mail from any OTHER sender is
+# ignored and deleted — discarded here at LMTP delivery, before paperless
+# ever sees it. This also keeps spam to the guessable address out entirely.
+#
+# Keep this list in sync with the paperless mail rules (the sender -> owner
+# map). Add-a-sender procedure: docs/runbooks/paperless-mail-ingest.md
+if not address :is "from" ["me@viktorbarzin.me",
+                           "vbarzin@gmail.com",
+                           "viktorbarzin@meta.com",
+                           "ancaelena98@gmail.com",
+                           "emil.barzin@gmail.com"] {
+  discard;
+  stop;
+}
diff --git a/stacks/mailserver/modules/mailserver/main.tf b/stacks/mailserver/modules/mailserver/main.tf
index 76a7a251..d1589a8d 100644
--- a/stacks/mailserver/modules/mailserver/main.tf
+++ b/stacks/mailserver/modules/mailserver/main.tf
@@ -110,6 +110,12 @@ resource "kubernetes_config_map" "mailserver_config" {
     "postfix-main.cf"    = var.postfix_cf
     "postfix-virtual.cf" = local.postfix_virtual
 
+    # Per-user Dovecot sieve for the paperless-ngx ingest mailbox: DMS installs
+    # any /tmp/docker-mailserver/.dovecot.sieve at startup. ConfigMap
+    # keys can't contain '@', so the key is sanitized ("-at-") and the
+    # volume_mount below restores the real filename.
+    "docs-at-viktorbarzin.me.dovecot.sieve" = file("${path.module}/extra/docs-at-viktorbarzin.me.dovecot.sieve")
+
     KeyTable     = "mail._domainkey.viktorbarzin.me viktorbarzin.me:mail:/etc/opendkim/keys/viktorbarzin.me-mail.key\n"
     SigningTable = "*@viktorbarzin.me mail._domainkey.viktorbarzin.me\n"
     TrustedHosts = "127.0.0.1\nlocalhost\n"
@@ -404,6 +410,12 @@ resource "kubernetes_deployment" "mailserver" {
             sub_path   = "postfix-virtual.cf"
             read_only  = true
           }
+          volume_mount {
+            name       = "config"
+            mount_path = "/tmp/docker-mailserver/docs@viktorbarzin.me.dovecot.sieve"
+            sub_path   = "docs-at-viktorbarzin.me.dovecot.sieve"
+            read_only  = true
+          }
           volume_mount {
             name       = "config"
             mount_path = "/tmp/docker-mailserver/fetchmail.cf"
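
Note (not part of the patch): the delivery gate in the sieve file above can be approximated offline to check whether a given From header would be kept or discarded. The sketch below is a minimal illustration, not the deployed logic: `would_be_delivered` is a hypothetical helper name, the allowlist is copied from the sieve file in this patch, and it assumes Sieve's default case-insensitive comparison for `address :is`. Dovecot's own header parsing may treat malformed headers differently.

```python
from email.utils import getaddresses

# Allowlist copied from docs-at-viktorbarzin.me.dovecot.sieve in this patch.
ALLOWED_SENDERS = frozenset({
    "me@viktorbarzin.me",
    "vbarzin@gmail.com",
    "viktorbarzin@meta.com",
    "ancaelena98@gmail.com",
    "emil.barzin@gmail.com",
})


def would_be_delivered(from_header: str) -> bool:
    """Hypothetical helper: approximate `address :is "from" [...]`.

    Returns True when any address in the From header is on the allowlist
    (mail is kept), False when the sieve would discard it. Matching is
    case-insensitive, on the assumption that the sieve uses its default
    comparator.
    """
    addresses = [addr.lower() for _name, addr in getaddresses([from_header])]
    return any(addr in ALLOWED_SENDERS for addr in addresses)


if __name__ == "__main__":
    for header in ("Anca <AncaElena98@gmail.com>", "Spammer <spammer@example.com>"):
        verdict = "deliver" if would_be_delivered(header) else "discard"
        print(f"{header}: {verdict}")
```

When adding a sender per the runbook, the same set would need the new address, just as the sieve list and the paperless mail rules do.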