infra

Author	SHA1	Message	Date
Viktor Barzin	9d4e2fc4a0	kms: dedicate MetalLB IP 10.0.20.202 + filter probe noise Two coupled fixes for the hourly Slack noise + missing client IPs: 1. Move windows-kms off shared 10.0.20.200 to a dedicated MetalLB IP 10.0.20.202 with externalTrafficPolicy=Local, so vlmcsd sees real WAN client IPs (pfSense WAN forwards do DNAT-only; ETP=Local skips kube-proxy SNAT). Same pattern mailserver used pre-2026-04-19. Sharing 10.0.20.200 is blocked because all 10 services there are ETP=Cluster and MetalLB requires consistent ETP per shared IP. 2. Slack notifier now suppresses Slack posts for bare TCP open/close pairs (no Application/Activation block) — these are Uptime Kuma's port monitor and the new kubelet readiness/liveness probes. Probe counts go to a new metric kms_connection_probes_total{source} where source classifies the IP as internal_pod / cluster_node / external. Real activations are unaffected. Pod fluidity: added TCP readiness/liveness probes on 1688 to gate Pod Ready on the listener actually being up — required for ETP=Local so MetalLB only advertises 10.0.20.202 from a node where vlmcsd is serving. pfSense side (applied separately, not codified): - New alias k8s_kms_lb = 10.0.20.202 (KMS-only) - WAN:1688 NAT + filter rule retargeted from k8s_shared_lb to k8s_kms_lb - All other forwards on k8s_shared_lb (WireGuard, HTTPS, shadowsocks, smtps, etc.) untouched Runbook updated. Tests added for classify_source / is_probe / process_line.	2026-05-10 13:03:19 +00:00
Viktor Barzin	295cfd776c	fire-planner: expose actualbudget creds via ExternalSecret Adds 3 new keys (ACTUALBUDGET_API_URL/KEY/SYNC_ID) sourced from Vault secret/fire-planner so the FastAPI backend can read viktor's spending from the in-cluster actualbudget HTTP API and prefill the Annual spending field on the WhatIf form. Vault keys seeded manually ahead of this commit; ESO has already synced the K8s Secret. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 13:03:18 +00:00
Viktor Barzin	203a71768d	x402: consolidate to a single shared forwardAuth gateway The per-site `x402_instance` module created one Deployment + Service + PDB per protected host (9 in total, 9×64Mi). Every pod was running the exact same logic with the same config — the only thing that varied was the upstream URL, which we don't even need since the gateway can return 200 to "allow" and Traefik handles the upstream itself. Refactor to the same pattern as `ai-bot-block`: * single deployment + service in `traefik` namespace, 2 replicas, HA * Traefik `Middleware` CRD `x402` (forwardAuth → x402-gateway:8080/auth) * each consumer ingress just appends `traefik-x402@kubernetescrd` to its middleware chain via `extra_middlewares` x402-gateway gains a `MODE=forwardauth` env var that returns 200 (allow) or 402 (with x402 PaymentRequiredResponse body) instead of reverse-proxying. Image: ghcr ... f4804d62. Pod count: 9 → 2 (78% memory saved). All 9 sites verified still serving the Anubis challenge to plain curl with identical TTFB. DRY_RUN until `var.x402_wallet_address` is set on the traefik stack. Removes `modules/kubernetes/x402_instance/` (dead code now).	2026-05-10 10:54:38 +00:00
Viktor Barzin	786f0434cb	x402: deploy payment gateway in front of Anubis on all 9 public sites Adds modules/kubernetes/x402_instance/ — a small Go reverse proxy (forgejo.viktorbarzin.me/viktor/x402-gateway:ce333419) that selectively issues HTTP 402 Payment Required to declared AI-bot User-Agents and validates X-PAYMENT headers against a Coinbase x402 facilitator. Browsers are forwarded transparently to Anubis (which then handles the JS PoW gate as before). Wired into all nine Anubis-fronted sites: ingress -> x402-X -> anubis-X -> backend While `wallet_address` is empty the gateway runs in DRY_RUN — every request is transparent-proxied, no 402s issued. This lets the pod sit in the request path with zero behavioural impact today; flipping the wallet variable in the per-stack module call activates payment-required mode for AI-bot UAs. Default config: Base mainnet USDC, $0.01/req, x402.org/facilitator, catch-all UA list (ClaudeBot|GPTBot|Bytespider|meta-externalagent|PerplexityBot|GoogleOther|cohere-ai|Diffbot|Amazonbot|Applebot-Extended|FacebookBot|ImagesiftBot|YouBot|anthropic-ai|Claude-Web|petalbot|spawning-ai|scrapy|python-requests). Verified post-apply: 9/9 pods Running, all 9 sites still serve the Anubis challenge to plain curl with identical TTFB, x402 logs confirm "dry_run":true on every instance.	2026-05-10 02:26:10 +00:00
root	c7b996a558	Woodpecker CI deploy [CI SKIP]	2026-05-10 01:25:35 +00:00
Viktor Barzin	3c340c1796	anubis: re-protect f1 with a per-host policy that allows JSON routes Earlier f1 revert left the host fully unprotected (no Anubis, exclude_crowdsec=true on the ingress already). Re-add Anubis with a custom policy_yaml that: - ALLOWs /_app/* (SvelteKit immutable JS/CSS chunks loaded before any cookie exists), /openapi.json, /docs, /api/* (FastAPI meta). - ALLOWs the 9 known JSON/proxy routes (schedule, streams, embed, embed-asset, extract, extractors, health, proxy, relay) so the SvelteKit SPA's XHRs return JSON instead of the challenge HTML. - Catch-all CHALLENGE for everything else — the SPA HTML pages (which fall through to FastAPI's `/{path}` catch-all) get the PoW gate. The ALLOWed JSON routes are technically scrapeable by a determined bot, but the user's stated goal is "avoid accidental scrapes" — the HTML/SPA is the AI-training target, and that stays gated. Verified: / → Anubis challenge HTML; /schedule, /streams → JSON; /_app/.../app.js → text/javascript; ClaudeBot UA → Anubis deny page.	2026-05-10 01:24:50 +00:00
Viktor Barzin	b5f48e7b99	anubis: pull f1 off Anubis (XHR-vs-challenge collision) + add latency alerts f1.viktorbarzin.me is a SPA whose JS fetches /schedule, /embed, /embed-asset, … on the same path tree. With Anubis fronting `/`, those XHRs land on the challenge HTML even when the cookie should be valid, breaking the page with `Unexpected token '<', "<!doctype " ... is not valid JSON`. Removed Anubis from f1 — would need a path carve-out (the way wrongmove does for /api) to re-enable. Added a top-of-block comment so future me remembers why. Plus four new Prometheus alerts in `Slow Ingress Latency` group (stacks/monitoring/.../prometheus_chart_values.tpl): - IngressTTFBHigh (warn, 10m, avg latency >1s) - IngressTTFBCritical (crit, 5m, avg latency >3s) - IngressErrorRate5xxHigh (crit, 5m, 5xx >5%) - AnubisChallengeStoreErrors (crit, 5m, any 5xx on anubis services via Traefik — proxies for the in-pod challenge-store error since Anubis itself only exposes Go-runtime metrics) Notes from the alert author: avg-not-p95 because the existing Prometheus scrape config drops traefik bucket series; once those are restored, swap to histogram_quantile(0.95). TraefikDown inhibit rule extended to suppress these four during a Traefik outage.	2026-05-10 01:01:52 +00:00
Viktor Barzin	efd28ccce5	anubis: fix 500 on multi-replica + roll out to 6 more public sites Browser visits to viktorbarzin.me started returning HTTP 500 with `store: key not found: "challenge:..."` in pod logs. Root cause: each Anubis pod stores in-flight challenges in process memory; with 2 replicas behind a ClusterIP, the PoW-solved request can be routed to a different pod than the one that issued the challenge. Anubis upstream documents the same caveat ("when running multiple instances on the same base domain, the key must be the same across all instances" — true for the ed25519 signing key, but the challenge store is still pod-local without a shared backend). Drop module default replicas: 2 → 1. Worst-case: ~1s cold-start on pod restart. Real fix (Redis-backed challenge store) noted as a follow-up in CLAUDE.md. Roll Anubis out to: f1-stream, cyberchef (cc), jsoncrack (json), privatebin (pb), homepage (home), real-estate-crawler (wrongmove UI only — `/api` ingress stays direct via path-based ingress carve-out so XHRs from the SPA bypass the challenge). End-state: 9 public hosts now Anubis-fronted (blog, www, kms, travel, f1, cc, json, pb, home, wrongmove). All return the challenge HTML to bare curl/browser; verified-IP search engines and /robots.txt + /.well-known still skip via the strict-policy allowlist.	2026-05-10 00:50:30 +00:00
Viktor Barzin	c73cd26a73	fire-planner: dual ingress — /api/* unprotected, / behind Authentik The SPA can't carry an Authentik session on its own fetch() XHRs in all cases (cross-origin redirect to authentik.viktorbarzin.me on a stale cookie returns HTML, fetch().json() parse fails). Splitting the ingress so /api/ paths skip forward-auth lets the React app talk to its API end-to-end. The browser still has to log in via Authentik to load the SPA at /. Verified end-to-end via chrome-service Playwright: dashboard load, scenario list, what-if run with real Monte Carlo, save-as-scenario round-trip, run-now on detail, delete — all pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 00:06:40 +00:00
Viktor Barzin	f48da84770	anubis: per-site PoW reverse proxy on blog + kms + travel-blog Adds modules/kubernetes/anubis_instance/ — a per-site reverse proxy instance pinned to ghcr.io/techarohq/anubis:v1.25.0. Each instance issues a 30-day JWT cookie scoped to viktorbarzin.me after a tiny proof-of-work (difficulty 2 ≈ 250 ms desktop / 700 ms mobile). The shared ed25519 signing key (Vault: secret/viktor → anubis_ed25519_key) makes a single solve good across every Anubis-fronted subdomain. Wired into blog (viktorbarzin.me + www), kms.viktorbarzin.me, and travel.viktorbarzin.me — each with anti_ai_scraping=false on the ingress so the redundant ai-bot-block forwardAuth is dropped from the chain. Skipped forgejo (Git/API clients can't solve PoW) and resume (replicas=0). Also tightens bot-block-proxy nginx timeouts (3s/5s → 100ms/200ms) so any ingress still using the ai-bot-block forwardAuth pays at most ~150 ms when poison-fountain is scaled down, instead of 3 s. End-to-end TTFB on viktorbarzin.me dropped from ~3.2 s to ~150-200 ms. Docs: .claude/reference/patterns.md "Anti-AI Scraping" updated to 4 layers; .claude/CLAUDE.md adds the Anubis usage paragraph and Forgejo/API caveat.	2026-05-10 00:06:21 +00:00
Viktor Barzin	5ea89ebcd7	postiz: disable signups (DISABLE_REGISTRATION=true) Admin account already exists; we don't want random users registering on the public-facing instance. Sign-in only from now on.	2026-05-10 00:02:27 +00:00
Viktor Barzin	ee8dd2f36c	fire-planner: ingress port 8080 (was defaulting to 80) ingress_factory's port var defaults to 80, but fire-planner publishes on 8080. Traefik logged 'Cannot create service error="service port not found"' and 404'd every request. Cloudflare's standard origin-error decoy page (with the noindex meta + cdn-cgi/content honeypot link) made it look like a bot-block, but it was just the upstream coming back 404. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 23:33:11 +00:00
root	bb54b176ac	Woodpecker CI deploy [CI SKIP]	2026-05-09 22:14:19 +00:00
Viktor Barzin	572d6cd8e0	kms: deploy slack-notifier sidecar with Prometheus metrics + document public exposure Slack notifier now also exposes /metrics on :9101 with stdlib HTTP — counts activations and dedup-skips by product, gauges last-activation timestamp. Pod template gets the standard prometheus.io/scrape annotations so the cluster-wide kubernetes-pods job picks it up via pod IP. Memory request bumped to 48Mi to cover counter dicts + HTTPServer. Plus docs: networking.md footnotes the windows-kms row noting public WAN exposure with the rate-limited (max-src-conn 50, max-src-conn-rate 10/60, overload <virusprot> flush) pfSense filter rule, and a new runbook covers log locations, rate-limit tuning, and how to revoke the WAN forward. The matching pfSense rule was tightened in place (TCP-only + rate limits) via SSH; pfSense isn't Terraform-managed.	2026-05-09 22:12:46 +00:00
Viktor Barzin	cfe969fe43	backup: fix daily-backup silent failures, postiz pg_dump CronJob, doc reconcile daily-backup ran out of its 1h budget and SIGTERMed for 10 days straight (Apr 30 → May 9). Each failed run left its snapshot mount stacked on /tmp/pvc-mount, which blocked the next run from completing — root cause of the WeeklyBackupStale alert going silent (the metric never reached its end-of-script push). Fixes: - TimeoutStartSec 1h → 4h (current workload of 118 PVCs needs ~1.5h, was hitting the wall during week 18 runs) - Recursive umount + LUKS cleanup on EXIT trap, plus the same at script start as belt-and-braces for any inherited stuck state from a prior crashed run - TERM/INT trap pushes status=2 metric so WeeklyBackupFailing fires instead of the alert going blind on systemd kills - pfsense metric pushed in BOTH success and failure paths (was only on success; any ssh-to-pfsense outage made PfsenseBackupStale silent until the alert threshold expired) Postiz backup CronJob: bundled bitnami PG/Redis live on local-path (K8s node OS disk) — outside Layer 1+2 of the 3-2-1 pipeline. Added postiz-postgres-backup that pg_dumps postiz + temporal + temporal_visibility daily 03:00 to /srv/nfs/postiz-backup, getting Layer 3 offsite coverage. Verified end-to-end: 3 dumps written, Pushgateway metric received. Note: bitnamilegacy/postgresql image is stripped (no curl/wget/python) — switched to docker.io/library/postgres matching the dbaas/postgresql-backup pattern with apt-installed curl. Doc reconcile (backup-dr.md): metric names had drifted (e.g. the docs claimed backup_weekly_last_success_timestamp but the script pushes daily_backup_last_run_timestamp). Updated to match what's actually emitted, and added a "default-covered" footnote to the Service Protection Matrix so the ~40 services with PVCs not enumerated in the table are no longer ambiguous. Manual PVE-host actions (out-of-band, not in TF): - unmounted 6 stacked snapshots from /tmp/pvc-mount - pruned 5 stale snapshots on vm-9999-pvc-67c90b6b... (origin LV that the loop got SIGTERMed against repeatedly, so prune kept failing) - created /srv/nfs/postiz-backup directory - triggered a one-shot daily-backup run with the new TimeoutStartSec to validate the fix end-to-end Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 17:41:04 +00:00
Viktor Barzin	8f0502230b	grafana: env-var datasources + reloader so Vault rotations stop breaking dashboards Wealth, Payslips, and Job-Hunter Grafana datasources all baked the rotating PG password into their ConfigMap at TF-apply time, so every 7-day Vault static-role rotation silently broke the panels until a manual `terragrunt apply`. Same family as the recurring grafana-mysql backend bug — Grafana caches creds at startup and never picks up the new ESO-synced password without a restart. Fix: - Each source stack now creates an ExternalSecret in `monitoring` exposing the rotating password as `<NAME>_PG_PASSWORD` env-var. - Grafana mounts those via `envFromSecrets` (optional=true so a missing source stack doesn't block boot) and the datasource ConfigMaps reference `$__env{<NAME>_PG_PASSWORD}` instead of a literal password. - `reloader.stakater.com/auto: "true"` on the Grafana pod restarts it whenever any of the four DB-cred Secrets is updated. Tested end-to-end: forced `vault write -force database/rotate-role/pg-wealthfolio-sync` → ESO synced (~30s) → reloader fired → Grafana booted with new env in ~50s total → all three /api/datasources/uid/*/health endpoints return "Database Connection OK". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 17:38:38 +00:00
Viktor Barzin	f9f19e4c54	mysql: bump to 4Gi limit / 3Gi request; grow /srv/nfs LV to 3 TiB mysql-standalone OOMKilled May 8 18:05 (anon-rss 2 GB at the 2 Gi limit). innodb_buffer_pool_size=1Gi plus connection buffers and InnoDB internals don't fit in 2 Gi. Bumping limit to 4 Gi (request 3 Gi) leaves headroom without changing the buffer pool config. /srv/nfs was at 90% (1.7T / 2T); grew the underlying pve/nfs-data LV 1 TiB online and ran resize2fs (now 60% used). Triggered by surfacing during the 2026-05-09 IO-pressure post-mortem; thinpool had ~4.6 TiB free. The post-mortem also covers the stale-NFS-client trigger (legacy /usr/local/bin/weekly-backup pointing at the decommissioned TrueNAS IP) and the resulting wedged kthread on the PVE host. Script removed and node_exporter restarted out-of-band; kthread will clear at next PVE reboot. See docs/post-mortems/2026-05-09-io-pressure-stale-nfs.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 17:01:57 +00:00
Viktor Barzin	dd69dff3a9	ig-poster: bump to da5b4191 (auto-curate from recent favorites)	2026-05-09 13:57:08 +00:00
root	f89205a979	Woodpecker CI deploy [CI SKIP]	2026-05-09 13:39:14 +00:00
Viktor Barzin	47bb175a4f	n8n: real-time training loop + decoupled posting instagram-approval: after every tap, immediately fetch /candidates?limit=1 and send the next photo as a fresh inline-keyboard message — the user's tap chains back into this same workflow, so the loop is user-paced. When the pool is exhausted, send an 'all caught up' summary with the backlog count + cumulative training stats. instagram-discover: cron throttled from every-30-min to daily 09:00. The chain handles ongoing training; the daily run only kickstarts a session if the user hasn't been tapping. Limit reduced from 3 → 1 so each kickstart sends a single photo (chain takes over).	2026-05-09 13:37:53 +00:00
Viktor Barzin	77a84ae5e0	ig-poster b17a9737 + n8n discover rewritten to use /candidates with CLIP scoring	2026-05-09 13:31:28 +00:00
Viktor Barzin	b57b29f9f6	ig-poster: bump to 3b862fe4 (EXIF orientation + auto-pending /candidates)	2026-05-09 13:23:24 +00:00
Viktor Barzin	a9ee9cee60	ig-poster: 69e395f2 + sync IMMICH_PG_* via ESO for CLIP scoring; postiz publish-notify n8n workflow	2026-05-09 13:16:24 +00:00
Viktor Barzin	f61e7c9bfc	ig-poster: bump to cac6fa97 + sync POSTIZ_INTEGRATION_ID via ESO	2026-05-09 12:40:35 +00:00
Viktor Barzin	36d5cebb5c	postiz: expose /uploads publicly so Meta IG fetcher can pull JPEGs Stories+feed posts via Postiz failed with state=ERROR and Postiz mistranslated the cause as 'Invalid Instagram image resolution max: 1920x1080px'. Real cause: Postiz hands Meta an upload URL under https://postiz.viktorbarzin.me/uploads/... and Meta gets a 302 to the Authentik login page instead of bytes. Meta returns error 36001 (image not fetchable) which Postiz maps to that misleading resolution string. Split the ingress: /uploads/* on a public ingress (matches the instagram-poster /image+/original pattern), everything else remains behind Authentik forward-auth. /uploads contents are random UUIDs, low blast radius if scraped.	2026-05-09 12:29:39 +00:00
Viktor Barzin	e22c023c8c	postiz: wire INSTAGRAM_APP_ID/SECRET via ESO for IG-standalone provider Standalone provider (instagram-standalone OAuth flow) is what the user is trying after the FB-Login path was blocked by their Business Account ad-policy flag. Uses modern scope names (instagram_business_*), so no JS patch needed unlike the FB-Login provider.	2026-05-09 11:43:01 +00:00
Viktor Barzin	ff4cca73b9	chore: remove decommissioned registry.viktorbarzin.me ingress The old port-5050 R/W private registry was decommissioned 2026-05-07 (forgejo-registry-consolidation Phase 4). The reverse-proxy ingress + ExternalName service + Cloudflare DNS record kept pointing at the dead backend, returning 502 to anyone hitting registry.viktorbarzin.me. This was driving 3 monitoring artifacts that auto-cleared on cleanup: - Uptime Kuma external monitor #586 (deleted) - Pushgateway stale registry-integrity-probe metrics (deleted) - ExternalAccessDivergence + RegistryIntegrityProbeStale alerts	2026-05-09 11:03:51 +00:00
Viktor Barzin	9d5da4d8e0	fix: restore pvc-autoresizer by allow-listing kubelet_volume_stats_available_bytes The Prometheus scrape config for the kubernetes-nodes job kept capacity_bytes + used_bytes but dropped available_bytes. pvc-autoresizer computes utilization from available/capacity, so without that metric it was silent for every PVC in the cluster — including mailserver, which filled to 89% (1.7G/2.0G) and started rejecting all inbound mail with '452 4.3.1 Insufficient system storage' (15+ hours, all real senders: Brevo, Gmail, Facebook). Also bumps the floors of mailserver (2Gi -> 5Gi, limit 10Gi) and forgejo (15Gi -> 30Gi) PVCs to recover from the immediate outage, and adds ignore_changes on requests.storage so future autoresizer expansions don't cause TF drift.	2026-05-09 10:49:17 +00:00
Viktor Barzin	352586f711	ig-poster: pivot to Telegram-only delivery (manual IG upload) User dropped Postiz/Instagram OAuth (Meta Business Account flagged + Postiz scope drift). New pipeline ends at Telegram — full-quality JPEG delivered to the bot chat, manually uploaded to IG by the user. - Image bumped to 25e46efd: adds /deliver/{asset_id} endpoint that multipart-uploads to Telegram (URL-fetch fails through Cloudflare for >5MB), then tags 'posted' in Immich. - ESO now syncs telegram_bot_token + telegram_chat_id from Vault. - Public ingress paths grow to ['/image', '/original'] (Authentik bypass on /original is harmless — files are user-tagged, low blast radius — and useful for ad-hoc browser downloads). - Memory limit 512Mi -> 1500Mi: full-resolution Pillow HEIC decode was OOMing on 12MP+ phone photos. - discover.json simplified to scan -> deliver per item; approval and post workflows already deactivated. Telegram bot webhook removed.	2026-05-09 10:45:02 +00:00
Viktor Barzin	c2e61cdf31	postiz: wire FACEBOOK_APP_ID/SECRET via ESO for IG-Business integration	2026-05-09 09:19:43 +00:00
Viktor Barzin	60dd6c61b5	postiz: idempotent Job to drop default Text search attributes (Temporal SQL visibility caps at 3 Text attrs; auto-setup ships with 2, Postiz adds 2 more — gitroomhq/postiz-app#1504)	2026-05-09 09:16:07 +00:00
Viktor Barzin	d3db617b1b	postiz: bump memory limit to 4Gi (was OOMing during NestJS startup)	2026-05-09 09:12:39 +00:00
Viktor Barzin	8c4a370a34	postiz: add Temporal sidecar; lock both stacks behind Authentik Postiz backend was crashlooping on connect ECONNREFUSED ::1:7233 — Postiz needs Temporal for cron/scheduled posts and the Helm chart doesn't bundle it. Added a single-replica temporalio/auto-setup:1.28.1 Deployment in the postiz namespace, backed by the bundled postiz-postgresql (separate `temporal` + `temporal_visibility` databases pre-created via init container), ENABLE_ES=false (Postiz only uses the workflow engine, not visibility search). Skips DYNAMIC_CONFIG_FILE_PATH because that file isn't bundled in auto-setup. Auth audit: - postiz: ingress now `protected = true` (Authentik forward-auth). Postiz also has its own login on top, but registration is no longer exposed to the open internet. - instagram-poster: split into two ingresses on the same host. `/image/*` stays public (Meta + Telegram fetch the 9:16 derivatives). Everything else (/healthz, /queue, /scan, /enqueue, /reject, /post-next) sits behind Authentik. The protected ingress sets dns_type=none — the public one already created the CF DNS record.	2026-05-09 09:08:21 +00:00
Viktor Barzin	7f7698991e	postiz + n8n: real DB URL + webhook-trigger approval - postiz: set DATABASE_URL/REDIS_URL pointing at the bundled subcharts; the chart does NOT auto-wire even when postgresql.enabled=true, so the prisma db:push was failing with empty DATABASE_URL. - n8n approval workflow: swap telegramTrigger -> webhook node so it works without an n8n-stored Telegram credential. Telegram bot's webhook is set via setWebhook to https://n8n.viktorbarzin.me/webhook/instagram-approval. Parse-callback Code node tolerates both shapes ({body:{callback_query:...}} vs {callback_query:...}) so a future move back to telegramTrigger doesn't break.	2026-05-09 00:55:19 +00:00
Viktor Barzin	71e3439650	postiz + instagram-poster: deploy fixes after first apply - postiz: pin chart name to 'postiz-app' (was 'postiz', wrong path) and override bundled bitnami subchart images to bitnamilegacy/* — Bitnami removed bitnami/postgresql + bitnami/redis from DockerHub in Aug 2025 (Broadcom acquisition). - postiz: enable initial registration (DISABLE_REGISTRATION=false) so first admin user can be created in UI; tighten after. - instagram-poster: add securityContext (fsGroup/runAsUser=10001) so kubelet chowns the PVC mount for the non-root 'poster' user; was crashing on alembic with 'unable to open database file'. - instagram-poster: bump image_tag to 24935ab4 (uvicorn now binds to port 8000 to match Service contract; was 8080 -> probe 404).	2026-05-09 00:47:14 +00:00
Viktor Barzin	d5a01b6ad2	instagram-poster: pin image tag to 23f8b4ed (initial push)	2026-05-09 00:09:58 +00:00
Viktor Barzin	7a93e09d2f	add postiz + instagram-poster stacks for IG Stories pipeline New stacks: - stacks/postiz/ — Postiz scheduler (Helm chart v1.0.5, image v2.21.7) with bundled PG/Redis, /uploads PVC on proxmox-lvm, JWT_SECRET via ESO from secret/instagram-poster. - stacks/instagram-poster/ — custom Python service that polls Immich for the 'instagram' tag, reformats photos to 9:16 with blurred-bg letterbox, exposes /image/<asset_id> publicly so Postiz can fetch. Image: forgejo.viktorbarzin.me/viktor/instagram-poster. n8n: 3 new workflows (discover, approval, post) for the Telegram inline-button approval UX. Adds ExternalSecret + env vars for TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID, IMMICH_API_KEY, plus static URLs for the new service. Vault: seed secret/instagram-poster with telegram_bot_token, telegram_chat_id, immich_api_key, postiz_api_token, postiz_jwt_secret before applying.	2026-05-09 00:07:44 +00:00
Viktor Barzin	c97f0497dd	openclaw: regenerate kubeconfig at pod start using projected SA tokenFile The previously-baked kubeconfig at /home/node/.openclaw/kubeconfig retained a service-account token bound to the original (long-dead) pod, so kubectl calls from inside the openclaw container failed with "the server has asked for the client to provide credentials" even though the openclaw SA has cluster-admin and kubelet projects a fresh token at /var/run/secrets/kubernetes.io/serviceaccount/token. Add init-container "setup-kubeconfig" that writes a kubeconfig with tokenFile + certificate-authority paths pointing at the projected SA volume — kubelet auto-rotates the token, kubectl always reads fresh creds, no Vault K8s-creds-engine refresh needed. Verified end-to-end: agent ran `kubectl get nodes -o wide` inside the pod and delivered a correct one-line summary to Telegram via openai-codex/gpt-5.4-mini.	2026-05-08 08:07:38 +00:00
Viktor Barzin	de96c54456	[woodpecker] Re-fix null_resource trigger after lint reverted it The helm provider in this Terraform version doesn't support list-index access on helm_release.metadata[0]. Switch the woodpecker_server_host_alias trigger to {helm_version, sha256(values)} which works regardless of provider quirks. (Original fix landed 2026-05-07; got reverted by a linter pass.)	2026-05-08 07:42:56 +00:00
Viktor Barzin	b91ae22edb	f1-stream: register HmembedsExtractor in registry Companion commit to `92474254` — the new extractor wasn't being registered, only the file was added. Add the import + register call in create_registry(). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:47:50 +00:00
Viktor Barzin	92474254e6	f1-stream: hmembeds offline decoder — reverse-engineered the JW Player trap Four-agent parallel investigation finally pinned down what's happening with the hmembeds.one streams. The TL;DR is unexpected: there is no fingerprint check, no decoder failure, no broken JS — the obfuscated decoder is trivial to reproduce, but the upstream origin is dead. Findings (saved at /tmp/jwre/{findings.md, blob-analysis.md, fingerprint-gap.md, trace-summary.md}): 1. The "ZpQw9XkLmN8c3vR3" blob is decoy. It's an Adcash adblock-bypass config — not the stream URL. The actual stream URL is in a different inline `<script>` block of the embed HTML. 2. The real decoder is base64 + XOR with a hardcoded key, the key appears literally in the HTML (e.g. `var k="bux7ver6mow4trh1"`). No browser-derived inputs. We can run it in Python in 50µs. 3. The decoded URL is JWT-bound to /24 of the requestor's IP. JWT payload: `{stream, ip:"176.12.22.0/24", session_id, exp}`. From our cluster (egress 176.12.22.76) the JWT IP-binding is satisfied. 4. The origin still returns 404 (GET) / 403 (HEAD). Tested both curated embeds (Sky F1 888520f3..., DAZN F1 fc3a5463...) — same 404. Origin landing page (`/`) returns 200, so the host is up; the `/sec/<JWT>/<embed_id>.m3u8` endpoint specifically refuses. 5. No fingerprint surface trips this. Runtime trace via chrome-service hooks confirmed: decoder reads navigator.userAgent (heavy), screen dimensions, and a single WebGL getParameter call. No canvas, audio, fonts, fetch-to-fingerprint-API. JW Player setup is given a valid file URL — the playlist stays empty because JW can't fetch the manifest from the (dead) origin. Verdict: the legacy curated hmembeds embeds (`888520f3...` Sky F1, `fc3a5463...` DAZN F1) are upstream-dead. No browser-side fix is possible. The community uses these IDs as "24/7 channels" but they're in a perpetually-offline state right now. This commit ships the offline decoder anyway, registered as a new extractor. Two reasons: - If those origins come back online, no code change needed. - Future curated hmembeds IDs (added by hand or discovered via subreddit posts) will resolve through the same path. Files added: `extractors/hmembeds.py` (~120 lines incl. the decoder and a `decode_embed(html) -> str | None` helper that's reusable). Registered in `__init__.py`. The existing CuratedExtractor stays disabled; this replaces its mechanism with one that can absorb new embed IDs without code changes. Bonus from the agent work: - Confirmed our stealth.js is sufficient — the runtime trace showed the decoder reads only the surfaces we already cover. - Identified ~10 fingerprint surfaces we don't spoof (platform, userAgentData, hardwareConcurrency, deviceMemory, timezone, AudioContext, ICE candidates) but proved they're not what's blocking us, so no change needed for now. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:47:25 +00:00
Viktor Barzin	bab760b8fb	kms: replace inline ConfigMap nginx with custom Hugo image The kms-web-page deployment now pulls forgejo.viktorbarzin.me/viktor/kms-website:${var.image_tag} (source in the new Forgejo repo viktor/kms-website). The ConfigMap-mounted index.html is gone — the new site is a Hugo build with full GVLK catalog for every Microsoft KMS-eligible Windows + Office edition, copy-to-clipboard, dark/light themes. The container image tag is managed by CI (kubectl set image), so add lifecycle ignore_changes on container[0].image alongside the existing dns_config (Kyverno) ignore. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:33:26 +00:00
Viktor Barzin	60aec74acd	f1-stream: Stremio addon extractor — TvVoo + StremVerse Sky F1 / DAZN F1 5 parallel research agents surveyed Stremio addons, F1 TV / Sky / DAZN official APIs, IPTV M3U lists, and free-to-air broadcasters. The clean finding: two community Stremio addons already index Sky Sports F1 + DAZN F1 via their public HTTP APIs — no Stremio client required, just GET /stream/<type>/<id>.json on the addon's hosted instance. New `stremio.py` extractor pulls from: - TvVoo (`https://tvvoo.hayd.uk/manifest.json`) — wraps Vavoo IPTV. Lists Sky Sports F1 UK + Sky Sports F1 HD + Sky Sport F1 IT + Sky Sport F1 HD DE + DAZN F1 ES. Returns 2 IP-bound m3u8 URLs per channel. Source: github.com/qwertyuiop8899/tvvoo. Vavoo's CDN SSL certs are currently expired so most clients fail verification today — addon framework is right but delivery is degraded. - StremVerse (`https://stremverse.onrender.com/manifest.json`) — Returns 11+ streams per id (`stremevent_591` = F1, `stremevent_866` = MotoGP). Mix of DRM-walled DASH, JW-broken-chain JWT URLs, and HuggingFace-Space proxies that 404 without a per-instance api_password. The extractor surfaces 15 candidate URLs per run; verifier filters to the playable subset. Today that subset is 0 (Vavoo cert expiry + JW chain + proxy auth), but the wiring is correct: as the addons fix delivery or rotate to fresh URLs, candidates will start passing. Other agent findings worth noting (not coded but documented): - F1 TV Pro live = Widevine DASH; impossible without a CDM. VOD is clean HLS but only post-session. - Sky Go / DAZN / Viaplay / Canal+ = all Widevine + geo-fenced + active DMCA enforcement. Pursuing not feasible. - ServusTV AT (free F1 race weekends) = clean public HLS at rbmn-live.akamaized.net/hls/live/2002825/geoSTVATweb/master.m3u8 but geo-fenced; needs an Austrian-IP egress proxy/VPN. - iptv-org/iptv has an F1 Channel (Pluto TV IE) at jmp2.uk/plu-6661739641af6400080cd8f1.m3u8 — 24/7 free, BG works, but only historic races + shoulder programming. Worth adding as a curated entry later. - boxboxbox.* (community-favourite F1 race-weekend domain) is dead across all known TLDs as of today. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:16:39 +00:00
Viktor Barzin	d3d86ce433	[woodpecker] Bump WOODPECKER_FORGE_TIMEOUT 3s → 30s The default forge-API timeout is 3 seconds. The config-loader makes 4-6 sequential calls per pipeline trigger (probing for .woodpecker dir then each .woodpecker.{yaml,yml} variant), and Forgejo responses on this cluster spike to 1-2s under load — easy to trip the cumulative 3s deadline. Result: 'could not load config from forge: context deadline exceeded' on virtually every pipeline trigger. This was the actual root cause of the 'Woodpecker forge-API bug' that v3.13 → v3.14 was supposed to fix — turns out v3.14 didn't change the timeout default, and the v3.13 successes I saw earlier were warm-cache flukes.	2026-05-07 23:10:48 +00:00
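The arithmetic behind the timeout bump above can be checked with a short sketch. It assumes, as the commit message describes, that one deadline covers all sequential forge calls made for a single pipeline trigger. The function name `exceeds_deadline` is hypothetical and is not part of Woodpecker.

```python
def exceeds_deadline(calls: int, per_call_s: float, deadline_s: float) -> bool:
    """Return True if `calls` sequential requests, each taking `per_call_s`
    seconds, take longer in total than the shared deadline."""
    return calls * per_call_s > deadline_s


# Figures taken from the commit message: 4-6 sequential calls,
# Forgejo responses spiking to 1-2 s under load.
OLD_TIMEOUT_S = 3.0
NEW_TIMEOUT_S = 30.0

for calls, latency in [(4, 1.0), (6, 2.0)]:
    total = calls * latency
    print(
        f"{calls} calls x {latency:.0f}s = {total:.0f}s | "
        f"trips 3s: {exceeds_deadline(calls, latency, OLD_TIMEOUT_S)} | "
        f"trips 30s: {exceeds_deadline(calls, latency, NEW_TIMEOUT_S)}"
    )
```

Even the mildest loaded case (4 calls at 1 s) overruns the 3 s default, while the heaviest case quoted (6 calls at 2 s) stays well inside 30 s.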
Viktor Barzin	c5a0424571	f1-stream: subreddit extractor scans r/motorsportsstreams2 (active sub) User asked specifically for r/motorsportstreams. Reddit banned that sub years ago; the active 12.5k-subscriber successor is r/motorsportsstreams2. Added it to SUBREDDITS plus r/f1streams (709 subs, public). Also extended: - SEARCH_QUERIES with three Sky Sports F1 / live-stream phrases that catch the `[F1 STREAM]` post pattern the community uses on race weekends (titles like "[F1 STREAM] Bahrain GP - Live Race \| No Buffer \| Mobile Friendly" linking to boxboxbox.pro/stream-1). - _INTERESTING_HOSTS allowlist with boxboxbox.{pro,live,lol}, pitsport.live, ppv.to, streamed.pk, acestrlms/aceztrims, and the Super Formula direct CDNs (racelive.jp, cdn.sfgo.jp) — all observed in last-50-posts on r/motorsportsstreams2. Where this leaves us, honestly: - The r/motorsportsstreams2 megathread "Where to watch every F1 race" recommends EXACTLY the four sites we already pull from: pitsport.xyz, streamed.pk, ppv.to, acestrlms. The community has the same broken JW Player chain we have for Sky Sports F1 24/7 streams. There is no free-and-working alternative they know about. - boxboxbox.pro (the most-promoted F1 stream domain in race-weekend posts) is currently NXDOMAIN; .live is parked, .lol unreachable. The domain rotates after takedowns; Reddit posts will surface fresh ones when posters share them. - For F1 specifically: extractor surfaces 2 motomundo.net candidates (MotoGP wrappers) and lights up to ~6+ during F1 race weekends as posters share fresh boxboxbox/equivalent URLs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 22:42:51 +00:00
Viktor Barzin	4326f09baa	monitoring(wealth): monthly contrib-vs-mkt as line chart, not bars User asked for two lines instead of side-by-side bars at monthly granularity. Converts panel 25 from barchart to timeseries: * type: barchart -> timeseries * format: table -> time_series, SELECT month::timestamp AS time * drawStyle line, lineWidth 2, fillOpacity 0, showPoints auto * Same blue (contributions) / green (market gain) colour overrides Where the green line rises above the blue line is the visual cue that the market out-earned new contributions for that month -- the trend the user wants to track. Diff is small (15 ins / 28 del) because the bar-chart-only fields (barRadius, barWidth, groupWidth, stacking, xField, xTickLabelRotation) are dropped.	2026-05-07 22:41:46 +00:00
Viktor Barzin	a2e9a0fecf	monitoring(wealth): monthly contributions vs market gain bar chart Goal stated by user: see when monthly market gain starts to exceed monthly contributions, i.e. the inflection point where the market is out-earning savings rather than the other way around. New panel id=25 between the annual decomposition (13) and per-account ROI (14): bar chart with two side-by-side bars per month -- contributions (blue) and market gain (green). Same calculation as panel 13 but month-grain instead of year-grain. Months where the green bar dwarfs the blue one are visible at a glance. SQL: same endpoints CTE pattern as panel 13, with date_trunc('month', valuation_date) as the grouping key. Uses max_complete cutoff so partial-today doesn't skew the latest month. Layout: panels at y >= 75 shifted down by 11 (chart height). New chart at y=75; panel 14 (per-account ROI) -> y=86; panel 10 (activity log) -> y=96. Spot check (recent months from PG): 2025-07: contrib +£5,601 market +£42,295 <- big market month 2025-09: contrib +£1,501 market +£24,206 2026-02: contrib +£35,501 market +£41,382 2026-03: contrib +£5,501 market -£38,483 <- correction 2026-04: contrib +£73,267 market +£21,448	2026-05-07 22:38:31 +00:00
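The comparison this panel is meant to make visible can be sketched against the spot-check figures quoted in the commit message above. This is an illustration only: the real panel computes the values in SQL, and the names `SPOT_CHECK` and `market_led_months` are hypothetical.

```python
# (month, contributions in GBP, market gain in GBP), copied from the
# spot check in the commit message.
SPOT_CHECK = [
    ("2025-07", 5_601, 42_295),
    ("2025-09", 1_501, 24_206),
    ("2026-02", 35_501, 41_382),
    ("2026-03", 5_501, -38_483),
    ("2026-04", 73_267, 21_448),
]


def market_led_months(rows):
    """Return the months in which market gain exceeded contributions."""
    return [month for month, contrib, market in rows if market > contrib]


print(market_led_months(SPOT_CHECK))
```

On these five months the market out-earned contributions in 2025-07, 2025-09 and 2026-02, and fell behind in 2026-03 (a negative market month) and 2026-04 (a large contribution).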
Viktor Barzin	f6e764a9b0	[wealthfolio] Flip wealthfolio-sync CronJob image to Forgejo The CronJob has been broken since registry-private lost the wealthfolio-sync image (last successful run 36+ days ago). The image is built from /home/wizard/code/broker-sync (the brokerage data sync — Trading 212, Schwab, Fidelity, IMAP-CSV → wealthfolio). Set up: viktor/broker-sync repo on Forgejo with .woodpecker/build.yml that pushes to forgejo.viktorbarzin.me/viktor/wealthfolio-sync. Until Woodpecker recognises the new repo's webhook, the image was bootstrapped via 'docker pull viktorbarzin/broker-sync:latest && docker tag … && docker push forgejo.viktorbarzin.me/viktor/wealthfolio-sync:latest' so the CronJob unblocks immediately.	2026-05-07 22:38:14 +00:00
Viktor Barzin	8f1ee9a55f	[woodpecker] Bump server + agent v3.13.0 → v3.14.0 Fixes the 'could not load config from forge: context deadline exceeded' issue that blocked every Forgejo-triggered pipeline during the forgejo-registry-consolidation cutover. Helm chart 3.5.1 stays (no 3.6 yet); only the image tag overrides change.	2026-05-07 22:28:56 +00:00
Viktor Barzin	48e504bbb0	[claude-memory] Restore truncated main.tf — apply Phase 3 image flip on full file The Phase 3 commit `3148d15d` ran into a disk-full ENOSPC during edit of stacks/claude-memory/main.tf, and the file was committed truncated at line 286 mid-string ('Cor instead of 'Core Platform' / closing braces). terraform validate failed with 'Unterminated template string'. Restoring the trailing 2 lines + re-applying the viktorbarzin/claude-memory-mcp:17 → forgejo.viktorbarzin.me/viktor/ claude-memory-mcp:17 cutover that Phase 3 was meant to do.	2026-05-07 19:05:34 +00:00

831 commits