snmp-exporter-external.viktorbarzin.me exposed UPS metrics to the
public internet with no authentication. Removed the external ingress
and Cloudflare DNS record. ha-sofia now accesses the SNMP exporter
via the existing .lan ingress (allow_local_access_only=true) using
direct IP 10.0.20.200 with Host header.
- Add snmp-exporter-ingress-external module for external HTTPS access to snmp-exporter
- Register snmp-exporter-external.viktorbarzin.me in Cloudflare DNS (proxied via tunnel)
- Update ha-sofia REST integration to use external HTTPS endpoint
- Fix ingress backend service routing to use existing snmp-exporter service
- All UPS sensors on ha-sofia now report values (voltage, battery %, load, etc.)
Verifies snmp-exporter, idrac-redfish-exporter, proxmox-exporter, and
tuya-bridge pods are running, plus checks Prometheus scrape targets
(snmp-idrac, snmp-ups, redfish-idrac, proxmox-host) are UP.
- changedetection: increase memory from 64Mi to 256Mi/512Mi (was OOMKilling),
set replicas back to 1
- flaresolverr: re-enable with replicas=1, increase memory limit to 1Gi
(needed by book-search for Cloudflare bypass)
Previously only searched for the current run's specific marker subject.
If IMAP deletion failed, old emails accumulated. Now searches for all
emails with "e2e-probe" in subject and deletes them, cleaning up any
leftovers from prior failed runs.
Root cause: Traefik v3 auto-detects HTTPS for backend port 443,
ignoring the port name "http" and serversscheme annotations.
MeshCentral serves HTTP on 443 (TLSOffload mode), but Traefik
connected via HTTPS causing TLS handshake failure → 500.
Fix: Change K8s service port from 443 to 80 with target_port 443.
Traefik sees port 80 → uses HTTP → reaches MeshCentral correctly.
Also disables anti-AI scraping (internal tool behind Authentik).
The rewrite-body plugin (anti-AI trap links) was crashing when
processing MeshCentral's HTML responses, returning 500. Disabled
anti_ai_scraping since it's a protected internal tool behind Authentik.
Re-enabled Authentik protection.
The previous init container incorrectly disabled TLSOffload, causing
MeshCentral to serve HTTPS on port 443. Traefik connects via HTTP,
resulting in protocol mismatch and 500 errors. Fix ensures TLSOffload
is always enabled so MeshCentral serves plain HTTP behind Traefik.
MeshCentral was failing to start with "Zipencryptionmodule failed" error
because the service tried to fetch TLS certificates from an HTTPS endpoint
during bootstrap. When using TLSOffload (reverse proxy terminating TLS),
MeshCentral should not attempt to load certificates.
Root cause: The existing config.json had "certUrl" set to HTTPS, causing
MeshCentral to try fetching the certificate during startup. Since the pod
was bootstrapping, this failed and cascaded into the Zipencryptionmodule
failure.
Fix: Add init container that runs before the main container to disable
the certUrl by prefixing it with underscore (MeshCentral's convention for
disabled settings). The sed command ensures the fix applies to both new
and existing config.json files.
This ensures MeshCentral behaves correctly with TLSOffload enabled:
- Runs in plain HTTP mode on port 443
- Traefik/Ingress handles HTTPS termination
- No certificate bootstrap failures
MeshCentral was migrated from NFS to proxmox-lvm storage (Wave 2). The old NFS
modules for data and files are no longer used by the deployment, leaving behind
orphaned PVCs (meshcentral-data, meshcentral-files). The backups volume remains
on NFS per the backup strategy pattern.
Changes:
- Removed module.nfs_data and module.nfs_files from Terraform config
- Active volumes now: meshcentral-data-proxmox, meshcentral-files-proxmox (proxmox-lvm)
- Backups volume: meshcentral-backups (NFS) - unchanged
Pod status: healthy, running on proxmox-lvm volumes.
Query logs stopped syncing on 2026-03-16 due to password mismatch after
MySQL cluster rebuild and Technitium app config reset.
- Add Vault static role mysql-technitium (7-day rotation)
- Add ExternalSecret for technitium-db-creds in technitium namespace
- Add password-sync CronJob (6h) to push rotated password to Technitium API
- Update Grafana datasource to use ESO-managed password
- Remove stale technitium_db_password variable (replaced by ESO)
- Update databases.md and restore-mysql.md runbook
The http-api sidecar was connecting to the public URL
(https://budget-*.viktorbarzin.me) which goes through Traefik/Authentik.
When pods got rescheduled to different nodes, this caused ETIMEDOUT errors.
Changed to internal service URL (http://budget-*.actualbudget.svc.cluster.local)
which is fast and reliable regardless of pod placement.