infra

Author	SHA1	Message	Date
Viktor Barzin	f8facf44dd	[infra] Fix rewrite-body plugin + cleanup TrueNAS + version bumps ## Context The rewrite-body Traefik plugin (packruler/rewrite-body v1.2.0) silently broke on Traefik v3.6.12 — every service using rybbit analytics or anti-AI injection returned HTTP 200 with "Error 404: Not Found" body. Root cause: middleware specs referenced plugin name `rewrite-body` but Traefik registered it as `traefik-plugin-rewritebody`. Migrated to maintained fork `the-ccsn/traefik-plugin-rewritebody` v0.1.3 which uses the correct plugin name. Also added `lastModified = true` and `methods = ["GET"]` to anti-AI middleware to avoid rewriting non-HTML responses. ## This change - Replace packruler/rewrite-body v1.2.0 with the-ccsn/traefik-plugin-rewritebody v0.1.3 - Fix plugin name in all 3 middleware locations (ingress_factory, reverse-proxy factory, traefik anti-AI) - Remove deprecated TrueNAS cloud sync monitor (VM decommissioned 2026-04-13) - Remove CloudSyncStale/CloudSyncFailing/CloudSyncNeverRun alerts - Fix PrometheusBackupNeverRun alert (for: 48h → 32d to match monthly sidecar schedule) - Bump versions: rybbit v1.0.21→v1.1.0, wealthfolio v1.1.0→v3.2, networking-toolbox 1.1.1→1.6.0, cyberchef v10.24.0→v9.55.0 - MySQL standalone storage_limit 30Gi → 50Gi - beads-server: fix Dolt workbench type casing, remove Authentik on GraphQL endpoint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 05:51:52 +00:00
Viktor Barzin	a237ac97e0	docs(upgrades): add bulk upgrade results from first production run 12 services upgraded in 30 min: audiobookshelf, owntracks, open-webui, immich, coturn, shlink, phpipam, onlyoffice, paperless-ngx, linkwarden, synapse, dawarich. Documents auto-rollback behavior, resource awareness (paperless memory bump), bulk upgrade procedure, and rate limit reset. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 17:34:27 +00:00
Viktor Barzin	b1d152be1f	[infra] Auto-create Cloudflare DNS records from ingress_factory ## Context Deploying new services required manually adding hostnames to cloudflare_proxied_names/cloudflare_non_proxied_names in config.tfvars — a separate file from the service stack. This was frequently forgotten, leaving services unreachable externally. ## This change: - Add `dns_type` parameter to `ingress_factory` and `reverse_proxy/factory` modules. Setting `dns_type = "proxied"` or `"non-proxied"` auto-creates the Cloudflare DNS record (CNAME to tunnel or A/AAAA to public IP). - Simplify cloudflared tunnel from 100 per-hostname rules to wildcard `*.viktorbarzin.me → Traefik`. Traefik still handles host-based routing. - Add global Cloudflare provider via terragrunt.hcl (separate cloudflare_provider.tf with Vault-sourced API key). - Migrate 118 hostnames from centralized config.tfvars to per-service dns_type. 17 hostnames remain centrally managed (Helm ingresses, special cases). - Update docs, AGENTS.md, CLAUDE.md, dns.md runbook. ``` BEFORE AFTER config.tfvars (manual list) stacks/<svc>/main.tf \| module "ingress" { v dns_type = "proxied" stacks/cloudflared/ } for_each = list \| cloudflare_record auto-creates tunnel per-hostname cloudflare_record + annotation ``` ## What is NOT in this change: - Uptime Kuma monitor migration (still reads from config.tfvars) - 17 remaining centrally-managed hostnames (Helm, special cases) - Removal of allow_overwrite (keep until migration confirmed stable) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:45:04 +00:00
Viktor Barzin	c33f597111	feat(upgrade-agent): add automated service upgrade pipeline with n8n + DIUN Pipeline: DIUN detects new image versions every 6h → webhook to n8n → n8n filters (skip databases/custom/infra/:latest) and rate-limits (max 5/6h) → SSH to dev VM → claude -p runs upgrade agent. Agent workflow: resolve GitHub repo → fetch changelogs → classify risk (SAFE/CAUTION) → backup DB if needed → bump version in .tf → commit+push → wait for CI → verify (pod ready + HTTP + Uptime Kuma) → rollback on failure. Changes: - stacks/n8n: add N8N_PORT=5678 to fix K8s env var conflict - stacks/n8n/workflows: version-controlled n8n workflow backup - docs/architecture/automated-upgrades.md: full pipeline documentation - AGENTS.md: add upgrade agent section - service-catalog.md: update DIUN description Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:38:27 +00:00
Viktor Barzin	dcc96f465e	docs(storage): add encrypted LVM documentation Update storage docs to reflect the 2026-04-15 migration of all sensitive services to proxmox-lvm-encrypted. Add encrypted PVC template, LUKS2 flow documentation, updated architecture diagram, and storage class decision rules. Files updated: - .claude/CLAUDE.md: storage decision table, encrypted PVC template - docs/architecture/storage.md: encrypted flow, components, diagram, Vault paths - AGENTS.md: storage section with encrypted SC as default for sensitive data Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:00:37 +00:00
Viktor Barzin	69474fae96	docs: add comprehensive DNS architecture documentation Covers Technitium HA (3-instance AXFR replication), CoreDNS config, Cloudflare external DNS, Split Horizon hairpin NAT fix, DHCP-DNS auto-registration, 6 automation CronJobs, and troubleshooting guides. Also fixes stale NFS reference in networking.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 18:10:27 +00:00
Viktor Barzin	0a448c2bae	docs: rewrite incident-response as user contribution guide Complete rewrite of the user-facing documentation: - How to report outages and request features - Mermaid flow diagrams for both incident and feature request paths - SLA expectations (automated vs human response times) - Self-service checks before reporting - Severity level definitions - Status page explanation - Full technical architecture section with component inventory - Safety guardrails, labels, and commit conventions [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 17:59:09 +00:00
Viktor Barzin	7bb9ec2934	Add agent task tracking documentation Documents the centralized Beads/Dolt task tracking system used by all Claude Code sessions. Covers architecture, session lifecycle, settings hierarchy, known issues, and E2E test verification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 17:11:26 +00:00
Viktor Barzin	bd41bb9230	fix: cluster healthcheck fixes + Authentik upgrade to 2026.2.2 - Authentik: upgrade 2025.10.3 → 2025.12.4 → 2026.2.2 with DB restore and stepped migration. Switch to existingSecret, PgBouncer session mode. - Mailserver: migrate email roundtrip probe from Mailgun to Brevo API - Redis: fix HAProxy tcp-check regex (rstring), faster health intervals - Nextcloud: fix Redis fallback to HAProxy service, update dependency - MeshCentral: fix TLSOffload + certUrl init container for first-run - Monitoring: remove authentik from latency alert exclusion - Diun: simplify to webhook notifier, remove git auto-update [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 06:41:56 +00:00
Viktor Barzin	d31bbc9a18	docs: update monitoring and backup docs for external monitors and per-db backups - CLAUDE.md: document external monitoring (ExternalAccessDivergence alert, external-monitor-sync CronJob) and per-database backup/restore paths - backup-dr.md: add per-db backup CronJobs to inventory table and daily timeline, update restore runbook references - monitoring.md: add External Monitor Sync component and external monitoring architecture section [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 06:37:07 +00:00
Viktor Barzin	0256ccdccc	feat: add per-database backups for PostgreSQL and MySQL Add separate CronJobs that dump each database individually: - postgresql-backup-per-db: pg_dump -Fc per DB (daily 00:15) - mysql-backup-per-db: mysqldump per DB (daily 00:45) Dumps go to /backup/per-db/<dbname>/ on the same NFS PVC. Enables single-database restore without affecting other databases. Also fixed CNPG superuser password sync and added --single-transaction --set-gtid-purged=OFF to MySQL per-db dumps. Updated restore runbooks with per-database restore procedures. [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 22:39:33 +00:00
Viktor Barzin	30c3450c61	docs: add user-facing issue reporting guide Adds "Reporting an Issue" section with: - Where to report (Slack, GitHub, DM) - What to include (examples of good vs bad reports) - What happens after reporting (flow diagram) - Self-service status checks (Uptime Kuma, Grafana, K8s Dashboard) [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:37:37 +00:00
Viktor Barzin	dfad89ef81	fix: remove br tags from mermaid diagram (GitHub compat) [ci skip]	2026-04-14 18:35:13 +00:00
Viktor Barzin	f658b18a50	docs: add incident response & post-mortem pipeline architecture Documents the full E2E automated incident response pipeline: - /post-mortem skill for generating structured post-mortems - Woodpecker pipeline triggered on post-mortem file changes - Claude Code headless agent for implementing safe TODOs - Safety guardrails, commit conventions, secrets, limitations [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:31:29 +00:00
Viktor Barzin	cf87a747d8	docs: update post-mortem follow-up implementation [PM-2026-04-14] [ci skip] Mark all 8 safe TODOs as Done. Add Follow-up Implementation table with commit SHAs. Flag 3 Migration TODOs as needing human review. Co-Authored-By: postmortem-todo-resolver <noreply@anthropic.com>	2026-04-14 18:09:11 +00:00
Viktor Barzin	b1b408ff0e	fix: use full path to claude CLI for non-interactive SSH	2026-04-14 17:44:50 +00:00
Viktor Barzin	7674cf8c5c	docs: final E2E pipeline test	2026-04-14 17:43:38 +00:00
Viktor Barzin	91b97709b7	docs: trigger postmortem pipeline with TODO	2026-04-14 17:27:45 +00:00
Viktor Barzin	f336e5ed53	docs: E2E test postmortem pipeline with deep clone	2026-04-14 17:12:46 +00:00
Viktor Barzin	60c04e51b7		2026-04-14 17:10:45 +00:00
Viktor Barzin	933c562aa9	docs: trigger postmortem pipeline E2E test	2026-04-14 16:49:07 +00:00
Viktor Barzin	df95f52d08	docs: test postmortem with TODO for pipeline E2E	2026-04-14 16:45:44 +00:00
Viktor Barzin	b3cc5fcc32	test: trigger postmortem pipeline webhook	2026-04-14 16:44:11 +00:00
Viktor Barzin	777450cb19	docs: test post-mortem for pipeline E2E validation	2026-04-14 15:55:32 +00:00
Viktor Barzin	a703c6e84f	docs: update post-mortem follow-up implementation [PM-2026-04-14] [ci skip] Added Uptime Kuma TCP monitor for PVE NFS (192.168.1.127:2049), ID 328, Tier 1 (30s/3 retries). Investigation TODO flagged for human review. Co-Authored-By: postmortem-todo-resolver <noreply@anthropic.com>	2026-04-14 15:48:11 +00:00
Viktor Barzin	e832581caf	docs: update Apr 14 post-mortem with Phase 2 findings Key additions: - NFSv3 broke after NFS restart (kernel lockd bug on PVE 6.14) - All 52 PVs migrated to NFSv4, NFSv3 disabled on PVE - DNS zone sync gap: secondary/tertiary had no custom zones - Converted one-time setup Job to recurring zone-sync CronJob - MySQL, Redis, Vault collateral damage and fixes - 3 new lessons learned (zone replication, NFS client state, operator rollout) [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:26:11 +00:00
Viktor Barzin	4e059b138c	docs: consolidate all post-mortems under docs/post-mortems/ Move HTML post-mortems from repo root post-mortems/ to docs/post-mortems/. Update index.html with all 3 incidents (newest first). [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 08:24:36 +00:00
Viktor Barzin	bdba15a387	docs: move post-mortems to docs/post-mortems/ Consolidate all outage reports under docs/ for better discoverability. Moved from .claude/post-mortems/ (agent-internal) to docs/post-mortems/ (repo documentation). [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 08:20:09 +00:00
Viktor Barzin	82f674a0b4	rename weekly-backup → daily-backup across scripts, timers, services, and docs [ci skip] Reflects the schedule change from weekly to daily. All references updated: - scripts/weekly-backup.{sh,timer,service} → daily-backup.* - Pushgateway job name: weekly-backup → daily-backup - Prometheus metric names: weekly_backup_* → daily_backup_* - All docs, runbooks, AGENTS.md, CLAUDE.md, proxmox-inventory - offsite-sync dependency: After=daily-backup.service Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 18:37:04 +00:00
Viktor Barzin	b45cee5c4a	docs: update backup architecture for inotify change tracking + consolidated Synology layout [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 18:16:36 +00:00
Viktor Barzin	38d51ab0af	deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip] - Migrate Immich (8 NFS PVs, 1.1TB) from TrueNAS to Proxmox host NFS - Update config.tfvars nfs_server to 192.168.1.127 (Proxmox) - Update nfs-csi StorageClass share to /srv/nfs - Update scripts (weekly-backup, cluster-healthcheck) to Proxmox IP - Delete obsolete TrueNAS scripts (nfs_exports.sh, truenas-status.sh) - Rewrite nfs-health.sh for Proxmox NFS monitoring - Update Freedify nfs_music_server default to Proxmox - Mark CloudSync monitor CronJob as deprecated - Update Prometheus alert summaries - Update all architecture docs, AGENTS.md, and reference docs - Zero PVs remain on TrueNAS — VM ready for decommission Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 14:42:07 +00:00
Viktor Barzin	a2fad3f20e	docs(mailserver): remove HTML visual, fix probe frequency in diagram	2026-04-12 22:25:34 +01:00
Viktor Barzin	1c300a14cf	mailserver: overhaul inbound delivery, monitoring, CrowdSec, and migrate to Brevo relay Inbound: - Direct MX to mail.viktorbarzin.me (ForwardEmail relay attempted and abandoned) - Dedicated MetalLB IP 10.0.20.202 with ETP: Local for CrowdSec real-IP detection - Removed Cloudflare Email Routing (can't store-and-forward) - Fixed dual SPF violation, hardened to -all - Added MTA-STS, TLSRPT, imported Rspamd DKIM into Terraform - Removed dead BIND zones from config.tfvars (199 lines) Outbound: - Migrated from Mailgun (100/day) to Brevo (300/day free) - Added Brevo DKIM CNAMEs and verification TXT Monitoring: - Probe frequency: 30m → 20m, alert thresholds adjusted to 60m - Enabled Dovecot exporter scraping (port 9166) - Added external SMTP monitor on public IP Documentation: - New docs/architecture/mailserver.md with full architecture - New docs/architecture/mailserver-visual.html visualization - Updated monitoring.md, CLAUDE.md, historical plan docs	2026-04-12 22:24:38 +01:00
Viktor Barzin	c740ed1301	docs: update Technitium DNS docs after cache optimization - Fix Technitium IP typo: 10.0.20.101 → 10.0.20.201 (service-catalog, vpn.md) - Fix PDB minAvailable: 1 → 2 (networking.md) - Add emrsn.org stub zone, cache TTL tuning, PG query logging, CronJobs - Update forwarders: was "Cloudflare + Google", actually Cloudflare DoH only - Update config storage: was generic PVC, now NFS path	2026-04-12 18:29:25 +01:00
Viktor Barzin	73531c12e0	docs(vpn): update with dual-stack WG, GL-iNet AllowedIPs fix, and troubleshooting [ci skip] Document fixes from 2026-04-10 London network debugging session: - pfSense WG now dual-stack (IPv4+IPv6 via HE tunnel gif0 pf rule) - GL-iNet AllowedIPs must be single comma-separated UCI entry (parse bug) - AdGuardHome/carrier-monitor must not use 1.1.1.1 (conntrack + rate limit) - Expanded troubleshooting for site-to-site tunnel disconnects	2026-04-10 22:24:19 +01:00
Viktor Barzin	eec6af6aef	docs: add IPAM/DDNS architecture diagram and update docs - networking.md: Add mermaid diagram showing full device discovery pipeline (Kea DHCP → DDNS → Technitium, pfSense import → phpIPAM → DNS sync) - networking.md: Add data flow table, DHCP coverage table - networking.md: Update pfSense (3 subnets + 42 reservations), phpIPAM (passive import replaces fping), Technitium (192.168.1.2 in ACL) - CLAUDE.md: Update phpIPAM and networking descriptions [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 20:42:10 +00:00
Viktor Barzin	8cd8743140	docs: add phpIPAM, Kea DDNS, and DNS sync documentation - networking.md: Add phpIPAM IPAM section, Kea DDNS config, reverse DNS zones, Technitium dynamic update policy - CLAUDE.md: Add phpipam to DB rotation list, service notes, networking section - service-catalog.md: Add phpipam, mark netbox as disabled/replaced [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 16:01:32 +00:00
Viktor Barzin	98aaba98da	docs: add Split Horizon hairpin NAT fix to networking architecture [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 18:45:53 +00:00
Viktor Barzin	cfa7a50cb5	docs: update networking architecture for DNS consolidation - Technitium DNS now at dedicated MetalLB IP 10.0.20.201 (was shared 10.0.20.200) - Document LAN DNS path: pfSense NAT redirect preserves client IPs for Technitium logging - Document pfSense dnsmasq role (K8s VLAN + localhost only, not WAN) - Document pfSense aliases (technitium_dns, k8s_shared_lb) for NAT rule maintainability - Update MetalLB table with per-service IP assignments - Add ClusterIP (10.96.0.53) for CoreDNS internal forwarding [ci skip] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 17:49:33 +00:00
Viktor Barzin	fea8519f51	update VPN architecture docs and Authentik state reference - vpn.md: Rewrite WireGuard section to match actual config (single tun_wg0 interface, 10.3.2.0/24 subnet, hub-and-spoke topology, correct device names and subnets for London/Valchedrym) - authentik-state.md: Document brute-force-protection policy unbinding fix that was blocking all unauthenticated users from login flows [ci skip]	2026-04-06 16:26:21 +03:00
Viktor Barzin	b345b086ef	update backup/DR docs and runbooks for 3-2-1 architecture - Full rewrite of backup-dr.md: 3-2-1 strategy with sda backup disk, PVC file-level copy from LVM snapshots, pfsense backup, two offsite paths. 4 Mermaid diagrams (data flow, timeline, disk layout, restore tree). - Update storage.md: 65 proxmox-lvm PVCs, sda backup tier - Update restore-full-cluster.md: add Phase 3.5 for PVC restore from sda - Update restore-{mysql,postgresql,vault,vaultwarden}.md: add sda fallback paths - New runbook: restore-pvc-from-backup.md (file-level restore from sda) - Update CLAUDE.md Storage & Backup section for 3-2-1 architecture	2026-04-06 15:06:01 +03:00
Viktor Barzin	fc233bd27f	docs: comprehensive audit and update of all architecture docs and runbooks [ci skip] Audited 14 documentation files against live cluster state and Terraform code. Architecture docs: - databases.md: MySQL 8.4.4, proxmox-lvm storage (not iSCSI), anti-affinity excludes k8s-node1 (GPU), 2Gi/3Gi resources, 7-day rotation (not 24h), CNPG 2 instances, PostGIS 16, postgresql.dbaas has endpoints - overview.md: 1x CPU, ~160GB RAM, all nodes 32GB, proxmox-lvm storage, correct Vault paths (secret/ not kv/) - compute.md: 272GB physical host RAM, ~160GB allocated to VMs - secrets.md: 7-day rotation, 7 MySQL + 5 PG roles, correct ESO config - networking.md: MetalLB pool 10.0.20.200-220 - ci-cd.md: 9 GHA projects, travel_blog 5.7GB Runbooks: - restore-mysql/postgresql: backup files are .sql.gz (not .sql) - restore-vault: weekly backup (not daily), auto-unseal sidecar note - restore-vaultwarden: PVC is proxmox (not iscsi) - restore-full-cluster: updated node roles, removed trading Reference docs: - CLAUDE.md: 7-day rotation, removed trading from PG list - AGENTS.md: 100+ stacks, proxmox-lvm, platform empty shell - service-catalog.md: 6 new stacks, 14 stack column updates	2026-04-06 13:21:05 +03:00
Viktor Barzin	9492874c43	fix: restore technitium MySQL query logging with Vault auto-rotation [ci skip] Query logs stopped syncing on 2026-03-16 due to password mismatch after MySQL cluster rebuild and Technitium app config reset. - Add Vault static role mysql-technitium (7-day rotation) - Add ExternalSecret for technitium-db-creds in technitium namespace - Add password-sync CronJob (6h) to push rotated password to Technitium API - Update Grafana datasource to use ESO-managed password - Remove stale technitium_db_password variable (replaced by ESO) - Update databases.md and restore-mysql.md runbook	2026-04-06 13:00:49 +03:00
Viktor Barzin	72d832fee7	add HA Sofia checks (26-29) to cluster healthcheck and backup-dr docs - Healthcheck: add entity availability, integration health, automation status, and system resources checks for Home Assistant Sofia - Docs: add backup-dr architecture documentation	2026-04-06 11:57:36 +03:00
Viktor Barzin	b2cac8cc97	add proxmox-csi cleanup TODO for post-migration tasks [ci skip]	2026-04-03 20:02:14 +03:00
Viktor Barzin	d49acebd8e	migrate ebooks-calibre to proxmox-lvm, update storage docs [ci skip] - Migrate ebooks-calibre-config-iscsi (2Gi, 2380 files) to proxmox-lvm - Update docs/architecture/storage.md: document Proxmox CSI as primary block storage, mark democratic-csi iSCSI as deprecated - Add full migration plan to docs/plans/	2026-04-03 19:45:34 +03:00
Viktor Barzin	2d8aa5ed89	docs: update hardware inventory for R730 RAM upgrade to 272GB Upgraded from 144GB (4x32G + 2x8G) to 272GB (8x32G + 2x8G) DDR4-2400. Added physical DIMM slot diagram, channel layout, and BIOS speed override notes. Updated compute architecture with correct CPU (single socket), VM memory values, and capacity figures.	2026-04-02 00:48:13 +03:00
Viktor Barzin	10f22350c5	exclude frigate, audiblez, ollama, real-estate-crawler from Synology backup [ci skip] Expanded cloud sync excludes to reduce sync time and Synology disk usage. All excluded data is either regenerable or low-value. TrueNAS Task 1 and incremental script already updated live.	2026-03-29 13:44:32 +03:00
Viktor Barzin	78dec8f0ad	add e2e email roundtrip monitoring CronJob (every 30 min) sends test email via Mailgun API to smoke-test@viktorbarzin.me, verifies IMAP delivery in spam@ catch-all, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Prometheus alerts: EmailRoundtripFailing, EmailRoundtripStale, EmailRoundtripNeverRun. Uptime Kuma: SMTP/IMAP port checks + E2E push.	2026-03-25 22:50:22 +02:00
Viktor Barzin	fe109d9f96	add homepage auto-discovery documentation [ci skip]	2026-03-25 13:06:43 +02:00

85 commits