infra

Author	SHA1	Message	Date
Viktor Barzin	d76b5dbc4b	priority-pass: backend c2b4ac50 — crop to card before transforming Three fixes for boarding passes uploaded as iPhone screenshots (input includes phone status bar, partial Tesco card below, etc.): 1. Detect the card region first and crop to it. All proportional coordinates (Step 8 text replacement, Step 9 logo removal) are now card-relative instead of full-image-relative — they were landing in the wrong region on tall screenshots, putting "Priority" text inside the QR area and leaving a yellow icon box at the bottom. 2. Step 8 now picks the LONGEST contiguous dark-row run inside a wider y-band, instead of using the dark-row [first, last] span. This distinguishes the QUEUE value text from the QUEUE label above it (both are dark blue in the original) so the erase rectangle no longer eats into the labels. 3. QR container padding bumped 8% → 12% so QR/container ratio matches the ~74-80% golden look. Verified end-to-end against three real samples saved by the previous build's training-data feature, plus the original non-priority.jpeg fixture: outputs now match priority.jpeg layout. [ci skip]	2026-05-01 19:06:02 +00:00
Viktor Barzin	40a6cd067b	authentik: long-lived authenticated sessions, short-lived anonymous ones - Adopt UserLoginStage (default-authentication-login) into Terraform and pin session_duration=weeks=4 so users stay logged in across browser restarts. There is no Brand.session_duration in 2026.2.x; UserLoginStage is the only correct lever. - Cap anonymous Django sessions at 2h via AUTHENTIK_SESSIONS__UNAUTHENTICATED_AGE on server + worker pods (default is days=1). Bots, healthcheckers, and partial flows now get reaped within 2h instead of accumulating for a day. Implementation note: the env var is injected via server.env / worker.env rather than authentik.sessions.unauthenticated_age, because authentik.existingSecret.secretName is set, which makes the chart skip rendering its own AUTHENTIK_* Secret. authentik.* values are therefore inert in this stack -- this is documented in .claude/reference/authentik-state.md so future edits use the right surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 19:03:50 +00:00
Viktor Barzin	dfbf6faf3d	priority-pass: backend f4246691 (QR fit fix + persist uploads), add encrypted PVC Backend changes: - transformers.py: QR container now sized to actual qr_bbox + 8% padding (was fixed at 45% of card width). When QR was wider than 45% of card, the leftover-pixel branch color-remapped QR pixels outside the container, breaking the scan. New container always encloses qr_mask. - main.py: persist input + output + json metadata under $UPLOAD_DIR/<airline>/<ts-uuid>-{input.<ext>,output.png,*.json} for future training. Failure to save is logged, never breaks the API. Infra: - New PVC priority-pass-uploads (1Gi proxmox-lvm-encrypted, 10Gi autoresize cap) — encrypted because boarding passes contain PII. - Deployment strategy → Recreate (RWO requirement). - Volume + volumeMount + UPLOAD_DIR env on backend container. Applied via kubectl (TF state for this stack is empty — see prior commit). New pod priority-pass-77956b64fb rolled out, PVC bound, test transform succeeded, sample written to /data/uploads/ryanair/. [ci skip]	2026-05-01 18:50:51 +00:00
Viktor Barzin	ce7a584801	priority-pass: frontend ea9176f8 (gallery upload), sync backend pin to live Frontend bug fix: photo input forced camera on mobile via capture="environment". Added separate "Choose from Gallery" / "Take Photo" buttons so users can pick from their photo library. Backend image unchanged; pin synced from stale v8 to live SHA ae1420a0 (the v8 tag was never pushed to the registry). Image built locally and pushed to registry.viktorbarzin.me. The priority-pass project directory isn't under git, so deployment was applied via kubectl set image (matches existing pattern — TF state for this stack is empty). [ci skip]	2026-05-01 18:38:30 +00:00
Viktor Barzin	664a85ef1e	Revert "monitoring(wealth): show daily points + lighter fill on timeseries" This reverts commit `5472720c75`.	2026-05-01 16:24:18 +00:00
Viktor Barzin	5472720c75	monitoring(wealth): show daily points + lighter fill on timeseries Make daily movements visible on the line charts. The y-axis still spans ~£700k–£1M so an £8k daily move is ~1% of vertical range and easy to miss when only the line is drawn. Changes per panel: * 5 (Net worth): showPoints never→always, pointSize 4→5, fillOpacity 20→10 * 6 (Net contrib vs market): showPoints never→always, pointSize 4→5 * 7 (Growth over time): showPoints never→always, pointSize 4→5, fillOpacity 50→25 * 8 (Per-account stacked): showPoints never→always (kept stacking fill at 70) * 9 (Cash vs invested stacked): showPoints never→always (kept stacking fill at 70) Each daily value now renders as a visible dot, so even if the line appears flat at this scale, the per-day points trace the wiggle. Lighter fill on the unstacked panels lets the line + points dominate visually. Caveat: the fundamental "£8k on a £1M base" visibility issue is best solved with a dedicated "Daily change" delta panel — happy to add one on next pass if this isn't enough.	2026-05-01 16:23:25 +00:00
Viktor Barzin	2722260ce9	monitoring(wealth): unbreak timeseries SQL — over-escaped time alias Fix: panels 5–9 had `AS \"time\"` (literal backslash-quote sequence embedded in the SQL string). PostgreSQL parsed that as a syntax error at the leading backslash: ERROR: syntax error at or near "\" LINE 1: ...complete_dates)) SELECT valuation_date::timestamp AS \"time\" Root cause: the patch script for the skew-resilient queries (commit `628f5a0d`) used a Python f-string with `\\\"time\\\"`, which produces a literal backslash-quote in the Python string. When that string was JSON-encoded the backslash was preserved verbatim instead of collapsed to plain `"time"`. Replaces all five occurrences with the correct `AS "time"` form. Verified the corrected query against PG returns 7 daily net-worth rows for 04-25..05-01 as expected.	2026-05-01 16:19:07 +00:00
Viktor Barzin	d67416d4ca	monitoring(wealth): tighten default time range, bump decimals for granularity Two adjustments to make daily movements visible: 1. Default time range: now-5y → now-180d. The timeseries charts (Net worth, Net contribution vs market value, Growth, Per-account stacked, Cash vs invested) auto-fit their y-axis to the data range in view. Over 5 years, daily £1k–£10k moves are ~1% of axis range and visually invisible against the cumulative trend. Over 6 months, the same daily moves dominate. Yearly bar charts (12, 13) are unaffected — they aggregate by calendar year and don't filter on $__timeFilter. 2. Decimals → 2 on every currency panel (1, 2, 3, 5–9, 13, 15, 16) and every percent panel (4, 14). Stat panels now show pennies on currency and 0.01% on rates; chart y-axis ticks are likewise more precise. Honest caveat: pennies on a £1M number don't make the absolute readout easier — to see "today changed by £8,358" cleanly we'd want a dedicated delta panel; pending user direction. Widen the time picker manually to recover the 5-year view; default just zooms into the last 6 months.	2026-05-01 16:15:39 +00:00
Viktor Barzin	628f5a0d26	monitoring(wealth): skew-resilient queries, no more partial-day dips Bug witnessed 2026-05-01: dashboard "Net worth (current)" showed £88k instead of £1.03M because at 02:00 UTC an external trigger refreshed ONE account (Trading212 ISA), creating its 05-01 daily_account_valuation row. The 5 other accounts still had their last row at 04-30. The panel SQL `WHERE valuation_date = (SELECT MAX(valuation_date))` then summed only the single account that had a 05-01 row. Two new SQL patterns adopted across all 15 affected panels: 1. Stat / barchart "current snapshot" panels (1, 2, 3, 4, 11, 14, 15, 16): latest-per-account stitching: WITH latest AS (SELECT DISTINCT ON (d.account_id) ... FROM daily_account_valuation d JOIN accounts a ON a.id = d.account_id ORDER BY d.account_id, d.valuation_date DESC) gives a coherent "now" snapshot regardless of refresh skew, and the inner join filters out orphan/deleted accounts (one such was adding a stale £33k from 04-17). 12-month panels add a parallel `ago` CTE picking each account's row closest to (d_now - 12mo). 2. Time-series / yearly panels (5, 6, 7, 8, 9, 12, 13): complete-days-only filter: WITH active_accounts AS (SELECT COUNT(*) FROM accounts), complete_dates AS (SELECT valuation_date FROM daily_account_valuation d JOIN accounts a ON a.id=d.account_id GROUP BY valuation_date HAVING COUNT(*) >= active.n) so a partial today never renders as a chart dip. The day rejoins the chart automatically once the daily 16:00 UTC sync writes rows for every account. Verified end-to-end against live PG: new queries produce £1,033,734 (matches the 6 active accounts' true latest sum) where the old query gave £88k.	2026-05-01 16:08:18 +00:00
Viktor Barzin	1d3ae01aac	wealthfolio(daily-sync): API call CronJob, replaces rollout-restart Restart-only didn't refresh the wealth Grafana dashboard (verified empirically): a fresh `daily_account_valuation` row only lands when a PortfolioJob runs with ValuationRecalcMode != None, and Wealthfolio's internal schedulers don't trigger that path: - 6h quotes scheduler refreshes the `quotes` table only. - 4h broker scheduler short-circuits on missing `sync_refresh_token`. The right knob is `POST /api/v1/market-data/sync`. Replaced the rollout-restart CronJob (+ its SA/Role/RoleBinding) with a curl-based CronJob that logs in (`POST /api/v1/auth/login`) then POSTs to `/api/v1/market-data/sync` with the session cookie. Backfills missing days via IncrementalFromLast in one call. Schedule 16:00 UTC (= 17:00 BST): * After UK market close (15:30 UTC = 16:30 BST), EOD UK prices settled. * US market open ~2.5h, intra-day US quotes fresh. * pg-sync next :07 tick mirrors → Grafana refresh ≤5m → fresh data by ~17:12 BST, comfortably before the 18:00 BST target. Plaintext password lives in Vault `secret/wealthfolio.web_password`, flows via the existing `dataFrom.extract` ExternalSecret, so no extra ESO wiring is needed. Verified end-to-end: API call backfilled 04-26 through 04-29, pg-sync mirrored, PG now shows rows up to today.	2026-04-29 21:21:24 +00:00
Viktor Barzin	31b9e5d4a9	monitoring(wealth): add 12mo contrib + 12mo gain to top row Top row goes from 5 → 7 stat panels (widths 4+4+4+3+3+3+3=24): - Net worth, Net contribution, Growth shrink from w=5 to w=4. - ROI % shrinks from w=5 to w=3 (now sits at x=12). - 12mo return slides from x=20/w=4 to x=15/w=3. - New: 12mo contrib (id=15, currency, blue) at x=18 — net contributions added in the trailing 12 months. - New: 12mo gain (id=16, currency, red/green) at x=21 — pure market gain in £ over the trailing 12 months (12mo Δnet-worth − 12mo contribs). Live values verified against PG: contrib_12mo=£245k, gain_12mo=£172k, sum = £417k = nw_now − nw_ago, return = 23.51%.	2026-04-27 06:32:53 +00:00
Viktor Barzin	cd96fb64a8	phpipam-pfsense-import: every 5min → hourly Reduces 5-min disk-write spikes on PVE sdc. The cronjob was the heaviest single contributor in our hourly fan-out investigation (11.2 MB/s burst when it fired). Kea DDNS still handles real-time DNS auto-registration; phpIPAM inventory just lags by up to 1h, which we don't need fresher. Docs (dns.md, networking.md, .claude/CLAUDE.md) updated to match.	2026-04-26 22:48:43 +00:00
Viktor Barzin	6ad5292128	immich: bump server to 8Gi + override tier-2-gpu quota to 20Gi Eliminates the OOM-on-face-detection-burst class of incidents (2026-04-26). VPA upper for immich-server is 2.98Gi steady-state; the prior 4Gi limit was 1.34x upper and still got SIGKILL'd when face-detection bursts pushed transient RSS past 4Gi. 8Gi gives 2.7x VPA upper headroom. The kyverno tier-2-gpu default quota is 12Gi requests.memory which can't fit 8Gi (server) + 3.5Gi (ML) + 3Gi (PG) + backup CronJobs simultaneously. Opts the namespace into the kyverno custom-quota exclude rule and overrides with 20Gi (~4.5Gi headroom) — same pattern as woodpecker/nvidia.	2026-04-26 20:02:28 +00:00
Viktor Barzin	d093aed7f6	immich(server,ml): bump server to 4Gi + Recreate strategy on tight quota Root cause of 502/503/decode errors clustered at 19:20 BST 2026-04-26: immich-server hit its 3500Mi memory limit during a face-detection burst and was OOMKilled (Exit Code 137). VPA upperBound is 3050Mi but real-world bursts crossed it; with the single pod running both API and microservices workers, the OOM took the API down for ~30s of restart, surfacing as PlatformException image decode + 502 on uploads + 503 on ActivityService to the iOS app. Bump immich-server requests=limits to 4096Mi (per CLAUDE.md "upperBound x 1.3 for volatile workloads" rule, with headroom over the OOM mark). Quota math: 9680Mi used - 2000Mi old req + 4096Mi new req = 11776Mi, fits the tier-2-gpu 12Gi cap. Switch both immich-server and immich-machine-learning to Recreate strategy: the namespace tier-2-gpu quota is too tight for RollingUpdate to keep an old + new pod up during apply (transient 13776Mi > 12Gi cap, see "ResourceQuota blocks rolling updates" in CLAUDE.md). With single replicas and Recreate, future memory tweaks no longer require a manual scale-to-0 dance. Verified: new pod has limits.memory=4Gi, quota usage stable at 11776Mi/12Gi, immich API serving normally. Note: a pending node_selector drift on immich-machine-learning (gpu=true -> nvidia.com/gpu.present=true) was also reconciled in this apply; the canonical NVIDIA operator label is already on the GPU node, so there is no scheduling impact.	2026-04-26 19:11:50 +00:00
Viktor Barzin	07bc0098e3	ci(woodpecker): show full terraform error on stack apply failure The default workflow truncated the failed-stack output at `tail -5`, which only captured the trailing source-line indicator (`│ 45: resource …`) and dropped the actual `Error: …` line above it. Bump to `tail -50` so the real error is visible without re-running locally to reproduce. Also fix the pre-warm step's FIRST_STACK detection: `head -1 file1 file2 | head -1` returns the file header (`==> .platform_apply <==`), not the first stack name, so the cd then fails with "no such file or directory". Use `cat | head -1` instead. Pure logging-and-pre-warm change; no stacks touched, so this commit is a no-op for the apply step.	2026-04-26 18:39:46 +00:00
Viktor Barzin	215717c90f	monitoring(dashboards): tables at the bottom convention wealth: move Activity log table from y=45 to y=77; the three barcharts (Yearly return, Annual change, Per-account ROI) shift up by 14 to fill the gap. uk-payslip: move Sankey "where the money went" from y=80 to y=48 (right above the table block); the three tables (Data integrity, All payslips, YTD reconciliation) shift down by 14 so all four tables (4, 5, 6, 9) sit contiguously at the bottom. fire-planner and job-hunter still have intentional side-by-side table/chart pairings; left untouched pending user direction on whether to break them.	2026-04-26 18:30:52 +00:00
Viktor Barzin	bb28485ce0	monitoring(wealth): move 12mo return to top bar, shrink to w=4 Trailing 12-month investment return % was a full-width stat at y=59. Now sits inline with Net worth / Contribution / Growth / ROI as the fifth headline number — top-row stats reflowed from w=6 (×4) to w=5 (×4) + w=4 (×1). Title shortened to "12mo return" so it fits. Panels below the old row shifted up by 4 rows to close the gap.	2026-04-26 18:19:24 +00:00
Viktor Barzin	532285e48c	traefik: raise websecure idleTimeout 180s -> 600s for iOS Immich -1005 iOS NSURLSession held a dead TCP/TLS socket past Traefik's 180s idle close, then errored with NSURLErrorDomain -1005 on the next thumbnail. Bumping the timeout to 600s pushes the bug to "app idle for >10 min" -- much rarer in normal use. Verified with /home/wizard/.claude/immich-scroll-sim.py keepalive probe: 200s idle, mean reuse latency +1.8ms over warmup (was ~50ms TLS handshake penalty before). Synthesis: ~/.claude/immich-debug/synthesis.md.	2026-04-26 12:32:05 +00:00
Viktor Barzin	3489621a45	nextcloud(backup): pin backup pod to nextcloud's node via podAffinity The weekly backup mounts the same RWO PVC (proxmox-lvm-encrypted) as the main nextcloud deployment. Single-node attach — the backup pod can never mount the volume if it lands on a different node, and was stuck in ContainerCreating for 6+ hours when cron fired today. Add pod_affinity (required, hostname topology) so the backup co-locates with the nextcloud app pod. Discovered via cluster-health probe; manual verify run scheduled on k8s-node3 next to nextcloud's pod and completed the rsync in seconds.	2026-04-26 11:03:20 +00:00
Viktor Barzin	a24cd7ceb7	monitoring(uk-payslip): yearly receipt aligns with P60 (RSU gross) Switch the RSU stack from "after band-aware tax" to gross. Receipt total is now pre-sacrifice gross compensation; bar − pension stack ≈ ytd_gross reported on the final March payslip / P60. Verified alignment for 2025/26: bar−pension = £266,752 vs P60 ytd_gross = £268,127 — gap of £1,375 ≈ "other taxable" (benefits, overtime). Remaining year-level gaps are upstream parser/ingest issues, not dashboard logic: - 2024/25 +£27k: March 2025 payslip parsed bonus=£26,969 but never propagated it into gross_pay/income_tax. Receipt is more accurate than ytd_gross here. - 2023/24 −£36k: Feb 2024 payslip row appears to be missing from the table; ytd_gross has it, sum(gross_pay) doesn't. - 2022/23 −£10k: variant A→B transition residual. SQL simplified — band-aware CTE chain dropped (no longer needed for this panel since RSU is shown gross).	2026-04-26 10:24:06 +00:00
Viktor Barzin	d0152e1f38	crowdsec/traefik: stop captchaing legit Immich mobile bursts Mobile timeline scrubs prefetch ~100 thumbs in <1s, which exhausted the immich-rate-limit (avg=500, burst=5000) and produced a cascade of HTTP 429s. CrowdSec's local http-429-abuse scenario then fired captcha:1 on the source IP (alert #291, 2026-04-25 — owner's Hyperoptic IPv6). Two changes: - crowdsec: add a second whitelist doc (viktor/immich-asset-paths-whitelist) filtering events by Immich asset paths so they never feed leaky buckets. Auth endpoints intentionally excluded — brute-force protection unchanged. - traefik: raise immich-rate-limit avg=500->1000, burst=5000->20000 so legitimate mobile scrubs don't produce 429s in the first place.	2026-04-26 09:27:16 +00:00
Viktor Barzin	222013806d	monitoring(uk-payslip): split salary into cash + pension on yearly receipt The salary field on the payslip is pre-pension-sacrifice, so the "Salary (gross)" stack already silently included the salary-sacrifice pension contribution. Split it out so pension is explicitly visible: - Salary (cash, post-sacrifice) = salary - pension_sacrifice - Pension (salary sacrifice, untaxed) = pension_sacrifice - Bonus - RSU vest (after band-aware tax) Bar total unchanged (just relabels what was already there). Pension is now visibly counted as income — consistent with "untaxed but real" framing. Caveat documented in panel description: receipt total ≠ P60 gross because P60 reports pre-RSU-tax gross. Receipt shows RSU net of tax per earlier intent. To exactly match P60, swap rsu_after_tax → rsu_vest gross.	2026-04-26 09:18:32 +00:00
root	423aac0908	Woodpecker CI Update TLS Certificates Commit	2026-04-26 00:03:26 +00:00
Viktor Barzin	21ac619fac	monitoring(uk-payslip): promote yearly receipt + YTD gross YoY to row 4 Move both barchart/timeseries panels into row 4 (y=29, side-by-side w=12 each, h=10) so the per-tax-year overviews appear right after the income-tax-and-pension YTD row. Shift panels 13, 4, 5, 6, 8, 9 down by 10 to accommodate. Final ordering: rows 1–3 = monthly + YTD timeseries (panels 1/7/2/3/11/12), row 4 = yearly receipt + YTD gross YoY (16/17), then the wider deduction/integrity/table panels below.	2026-04-25 23:58:15 +00:00
Viktor Barzin	53f555dc61	monitoring(uk-payslip): drop 3 panels referencing undeployed data Removed: - Panel 10 "HMRC Tax Year Reconciliation — Individual Tax API" → references hmrc_sync.tax_year_snapshot schema. The hmrc-sync service / DB has not been deployed, so the panel always errored with "relation does not exist". - Panel 14 "Meta payroll: bank deposit vs payslip net pay" → references payslip_ingest.external_meta_deposits, which is created by alembic migration 0007. The deployed payslip-ingest image is at 0005, so the table doesn't exist. - Panel 15 "RSU vest reconciliation — payslip vs Schwab" → references payslip_ingest.rsu_vest_events, created by migration 0008. Same image-staleness story. Verified all 14 remaining panels return without error via Grafana /api/ds/query. SQL for the removed panels is preserved in git history; re-add when the data sources are actually deployed.	2026-04-25 23:56:03 +00:00
Viktor Barzin	b2a25775aa	monitoring(uk-payslip): simplify yearly receipt to earned-and-kept view Replace the 7-stack "where total comp went" decomposition with a 3-stack "what I actually earned" view: salary (gross), bonus (gross), and RSU vest after band-aware tax (PAYE+NI withheld via sell-to-cover). Skips income tax / NI / student loan / pension / RSU offset. Bar height = real income kept across all components. RSU is net of tax because it's withheld at source and never hits the bank account; salary and bonus are gross because they're paid in full and taxes are deducted elsewhere. This is the income-side view where tax is implicit, not the deduction waterfall. Per-year RSU after tax: 2020/21 £18k · 2021/22 £39k · 2022/23 £50k · 2023/24 £26k · 2024/25 £71k · 2025/26 £73k.	2026-04-25 23:42:20 +00:00
Viktor Barzin	a17304f735	monitoring(uk-payslip): fix empty YTD gross YoY chart Two bugs: 1. Synthetic dates projected onto 1970/71 fell outside the dashboard's default time range (now-10y → now), so Grafana filtered out every point. Switched to a sliding 12-month window (CURRENT_DATE - INTERVAL '12 months') as the projection base, plus a per-panel timeFrom: "13M" override so the panel always shows the last 13 months regardless of the dashboard's time picker. 2. ORDER BY tax_year, pay_date violated Grafana's long→wide conversion requirement (data must be ascending by time). Wrapped in a CTE and re-ordered by the synthetic time column. Pivoted result is now a single wide frame with 7 series (2019/20…2025/26).	2026-04-25 23:36:16 +00:00
Viktor Barzin	ac18c49a7b	monitoring(wealth): fix x-axis label formatting on yearly bars The default fieldConfig unit (percent on Yearly investment return %, currencyGBP on Annual change decomposition) was being applied to the "year" string column too — so x-axis labels rendered as "2024%" and "£2,024" respectively. Add field overrides on the "year" column to force unit=string. The earlier "tax_year" panels weren't affected because "2024/25" doesn't parse as a number; "2024" did.	2026-04-25 23:31:03 +00:00
Viktor Barzin	77bed10a51	monitoring: investment-only returns + YoY YTD gross line chart Wealth dashboard: - "Yearly growth %" → "Yearly investment return %": switched to modified-Dietz formula `market_gain / (nw_start + 0.5 × contributions)` so contributions don't inflate the return. New money in is excluded — this is portfolio performance, not net-worth change. - "Trailing 12-month growth %" → "Trailing 12-month investment return %": same formula, applied to the trailing 12mo window. Pre-fix vs post-fix: 2020: 155.0% → 5.12% (large contributions on small base) 2021: 344.7% → 26.45% 2022: 26.9% → -25.65% (the actual 2022 bear market) 2023: 123.2% → 41.60% 2024: 87.4% → 25.70% 2025: 46.8% → 8.43% 2026: 16.7% → 3.28% (YTD) UK Payslip dashboard: - Replaced the per-tax-year stacked bar with a year-over-year line chart: one line per tax year, X = month-of-tax-year (April→March, projected onto a 1970/71 fiscal calendar so years overlay), Y = cumulative YTD gross. Five+ lines visible at a glance for trend comparison.	2026-04-25 23:25:42 +00:00
Viktor Barzin	55d1da41f6	monitoring: more growth detail in Wealth + gross composition in UK Payslip Wealth (4 new panels at the bottom): - Trailing 12-month growth % (stat) — % change in net worth over last 12mo. - Yearly growth % (bar per calendar year) — first→last valuation each year. - Annual change decomposition (stacked bar) — splits each year's NW change into "net contributions" (new money in) and "market gain" (everything else: appreciation, dividends, FX). Answers "did I grow because I saved or because the market did the work?". - Per-account ROI % (horizontal bar) — (value − contribution) / contribution × 100, latest snapshot. Excludes accounts with zero/negative net contribution (Schwab — distorts ratio after RSU sells). UK Payslip (1 new panel below the yearly receipt): - Gross composition by tax year (stacked bar) — salary / bonus / RSU vest / other components per tax year. Bar height = gross pay. Trends in salary growth, bonus levels, and RSU vest sizing at a glance. All queries spot-checked via Grafana /api/ds/query.	2026-04-25 23:21:42 +00:00
Viktor Barzin	d48e222054	monitoring: lock Finance (Personal) folder to admin + fix cash classification Folder ACL: - Move uk-payslip + wealth dashboards to a new "Finance (Personal)" folder; job-hunter + fire-planner stay in "Finance" (open). - New null_resource calls Grafana's folder permissions API after the dashboard sidecar materialises the folder, setting an admin-only ACL ({Admin: 4}). Default Viewer/Editor inheritance is overridden, so anonymous-Viewer (auth.anonymous=true) is denied. Server-admin always retains access. - Verified: anonymous → 403 on uk-payslip + wealth, 200 on control dashboards (node-exporter); admin → 200 on all. Wealth cash fix: - Wealthfolio dumps WORKPLACE_PENSION wrappers entirely into cash_balance because it doesn't track underlying fund holdings. Reclassify pension cash as invested in the "Cash vs invested" panel so the cash series reflects actual uninvested broker cash (~£16k T212 ISA + Schwab) instead of phantom £154k. Pre-fix: cash=£153,789 / invested=£870,282 / total=£1,024,071 Post-fix: cash=£16,064 / invested=£1,008,008 / total=£1,024,071	2026-04-25 23:11:26 +00:00
Viktor Barzin	51bf38815c	vault: record Phase 3 vault Released-PV cleanup Deleted the 6 NFS PVs orphaned by the Phase 2 rolling and removed their /srv/nfs/<dir> subtrees on the PVE host (~1.5 GB; vault-2 audit log was 1.4 GB on its own). Cluster-wide Released-PV sweep on the proxmox-lvm/encrypted side stays out of scope.	2026-04-25 23:08:45 +00:00
Viktor Barzin	498400173c	wealthfolio-sync: skip the synthetic TOTAL row in ETL Wealthfolio's daily_account_valuation includes a row with account_id='TOTAL' that pre-aggregates the per-account values for that day. Mirroring it into PG verbatim caused every SUM(total_value) in the Wealth dashboard to double-count (showing ~£2M against actual ~£1M). Drop the synthetic row at the dump step so the PG mirror only holds real-account rows. Initial sync after fix: 8,649 DAV rows (was 10,798), net worth resolves to £1,024,071 — matches the per-account latest snapshot.	2026-04-25 22:59:24 +00:00
Viktor Barzin	f0ce7b0363	fire-planner: add stack, Vault DB role, dashboard, DB New stacks/fire-planner/ mirrors payslip-ingest layout: - ExternalSecret pulling RECOMPUTE_BEARER_TOKEN from Vault secret/fire-planner - DB ExternalSecret templating DB_CONNECTION_STRING via static role pg-fire-planner - FastAPI Deployment (serve), CronJob (recompute-all monthly on 2nd at 09:00 UTC, scheduled after wealthfolio-sync's 1st at 08:00), ClusterIP Service - Grafana datasource ConfigMap "FirePlanner" — `database` inside jsonData (`cc56ba29` fix; otherwise Grafana 11.2+ hits "you do not have default database") Plus: - vault/main.tf: pg-fire-planner static role (7d rotation), allowed_roles - dbaas/modules/dbaas/main.tf: null_resource creates fire_planner DB+role - monitoring/dashboards/fire-planner.json: 9-panel Finance-folder dashboard (NW timeseries, MC fan chart, success heatmap, lifetime tax bars, years-to-ruin table, optimal leave-UK stat, ending wealth stat, UK success-by-strategy bars, sequence-risk correlation table) - monitoring/modules/monitoring/grafana.tf: register "fire-planner.json" in Finance folder Apply order: 1. vault stack — creates the static role 2. dbaas stack — creates the database & role 3. external-secrets stack picks up vault-database refs (no change needed) 4. fire-planner stack — first apply with -target=kubernetes_manifest.db_external_secret before full apply, per the plan-time-data-source pattern 5. monitoring stack — picks up the new dashboard ConfigMap [ci skip] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 17:27:19 +00:00
Viktor Barzin	484b4c7190	vault: complete Phase 2 NFS-hostile migration; remove nfs-proxmox SC All 3 vault voters now on proxmox-lvm-encrypted (vault-0 16:18, vault-1 + vault-2 today). The NFS fsync incompatibility identified in the 2026-04-22 raft-leader-deadlock post-mortem is no longer reachable — raft consensus log + audit log live on LUKS2 block storage with real fsync semantics. Cluster-wide consumers of the inline kubernetes_storage_class.nfs_proxmox dropped to zero after the rolling, so the resource is removed from infra/stacks/vault/main.tf. Released NFS PVs (6) remain in the cluster and will be reclaimed in Phase 3 cleanup. Lesson learned (recorded in plan): pvc-protection finalizer races the StatefulSet controller — pod recreates on the OLD PVCs unless the finalizer is patched out before pod delete. Force-finalize technique applied to vault-1 + vault-2 successfully. Closes: code-gy7h	2026-04-25 17:10:00 +00:00
Viktor Barzin	df2fa0a31d	state(vault): update encrypted state	2026-04-25 17:09:35 +00:00
Viktor Barzin	bf4c7618d8	wealth: SQLite→PG ETL sidecar + new Grafana dashboard Mirrors Wealthfolio's daily_account_valuation / accounts / activities from SQLite into a new PG database (wealthfolio_sync) every hour, so Grafana can chart net worth, contributions, and growth over time. Components: - dbaas: null_resource creates wealthfolio_sync DB + role on the CNPG cluster (dynamic primary lookup so it survives failover). - vault: pg-wealthfolio-sync static role rotates the password every 7d. - wealthfolio: ExternalSecret pulls the rotated password into the WF namespace; new pg-sync sidecar (alpine + sqlite + postgresql-client + busybox crond) does sqlite3 .backup → TSV dump → truncate-and-reload psql, hourly at :07. Plus a grafana-wealth-datasource ConfigMap in the monitoring namespace (uid: wealth-pg). - monitoring: new Wealth dashboard (wealth.json, 10 panels) — current net worth / contribution / growth / ROI% stats, then time-series for net worth, contribution-vs-market, growth area, per-account stacked area, cash-vs-invested, and a 100-row activity log. Initial sync: 6 accounts, 10,798 daily valuations, 518 activities. Verified PG totals match SQLite latest snapshot exactly.	2026-04-25 17:07:33 +00:00
Viktor Barzin	7dd580972a	state(vault): update encrypted state	2026-04-25 16:57:42 +00:00
Viktor Barzin	ac8d2f548b	paperless-ngx: migrate to proxmox-lvm-encrypted Document scans (receipts, contracts, IDs) are unambiguously sensitive PII. Storage decision rule defaults sensitive data to `proxmox-lvm-encrypted`, but paperless-ngx had been left on plain `proxmox-lvm` by an abandoned migration attempt that left a dormant, non-Terraform-managed encrypted PVC sitting unbound for 11 days. Cleaned up the orphan, added the encrypted PVC properly via Terraform, rsynced data with deployment scaled to 0, swapped claim_name. Plain `proxmox-lvm` PVC retained for a 7-day soak before removal. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 16:48:53 +00:00
Viktor Barzin	4f5f1ff8c2	monitoring(uk-payslip): add yearly receipt stacked barchart panel New panel 16 (barchart, h=11, y=179): one stacked bar per tax year showing total comp split into net pay (bank deposit), cash income tax, RSU tax (band-aware marginal: PAYE+NI), cash NI, student loan, pension salary- sacrifice, and RSU offset (Variant A only). X-axis = tax_year (categorical), y-axis = currencyGBP. Bar height ≈ gross_pay + pension_sacrifice (small over-attribution in Variant A years where the band-aware model exceeds recorded payslip PAYE).	2026-04-25 16:26:57 +00:00
Viktor Barzin	288efa89b3	vault: migrate vault-0 storage to proxmox-lvm-encrypted Phase 2 of the NFS-hostile migration: data + audit storageClass on the vault helm release switches from nfs-proxmox to proxmox-lvm-encrypted, then per-pod rolling swap (24h soak between). vault-0 swap done. vault-1 + vault-2 still on NFS — the rolling part is what makes this safe (raft quorum maintained by 2 healthy pods while one is replaced). Also restores chart-default pod securityContext fields. The previous `statefulSet.securityContext.pod = {fsGroupChangePolicy = "..."}` block REPLACED (not merged) the chart's defaults — fsGroup, runAsGroup, runAsUser, runAsNonRoot were all silently dropped. NFS exports were permissive enough to mask the missing fsGroup; ext4 LV volume root is root:root and the vault user (UID 100) couldn't open vault.db, CrashLoopBackOff. Fix: provide all five fields explicitly, survives future chart bumps. vault-1 and vault-2 retained their correct securityContext from when their pod specs were written to etcd, before the partial customization landed — the bug only surfaces when a pod is recreated. Pre-flight raft snapshot saved at /tmp/vault-pre-migration-*.snap (recovery anchor). Refs: code-gy7h Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 16:19:49 +00:00
Viktor Barzin	08b13858dd	state(vault): update encrypted state	2026-04-25 16:16:35 +00:00
Viktor Barzin	b3c29eda12	monitoring(uk-payslip): model UK income-tax bands + PA-taper for RSU marginal Replaces the flat 47% (45 PAYE + 2 NI) RSU marginal across panels 3, 7, 8, 11, and 12 with an exact piecewise band-aware computation. Each row computes ani_prior/ani_pre/ani_post over the tax-year YTD (chronological model — the RSU is taxed at the band its YTD ANI position occupies at the vest date, mirroring PAYE withholding behaviour). Bands (2024/25+, applied to all years): IT: 0% / 20% / 40% / 60% (PA-taper) / 45% at 12,570 / 50,270 / 100k / 125,140 NI: 0% / 8% / 2% at 12,570 / 50,270 PA-taper modelled as 60% effective IT marginal in £100k–£125,140 (40% on the £1 + 40% on the £0.50 of lost PA = 60%). Spot-checked per tax-year totals via psql; numbers diverge from the flat 47% baseline most for years where vests cross PA-taper or basic-rate bands (2020/21 ~35%, 2024/25 ~41%, 2025/26 ~43%).	2026-04-25 16:14:49 +00:00
Viktor Barzin	3f85cee1ef	state(vault): update encrypted state	2026-04-25 16:08:38 +00:00
Viktor Barzin	43e4f3f68e	immich: migrate PostgreSQL off NFS to proxmox-lvm-encrypted Live PG data moves to a 10Gi LUKS-encrypted RWO PVC. WAL fsync per commit on NFS contributed to the 2026-04-22 NFS writeback storm (2h43m recovery, 3 of 4 nodes hard-reset). Backups remain on NFS (append-only, NFS-tolerant). The init container that writes postgresql.override.conf is now gated on PG_VERSION presence: on a fresh PVC the file would otherwise make initdb refuse the non-empty PGDATA. First boot skips the override and runs initdb cleanly; second boot (after a forced restart) writes the override so vchord/vectors/pg_prewarm load before the dump restore. Idempotent on initialised PVCs. Migration executed: pg_dumpall (1.9GB) → restore on encrypted PVC → REINDEX clip_index/face_index → 111,843 assets verified, external HTTP 200, all 10 extensions present (vector minor 0.8.0→0.8.1 only). LV created on PVE host, picked up by lvm-pvc-snapshot. See docs/plans/2026-04-25-nfs-hostile-migration-{design,plan}.md. Phase 2 (Vault Raft) follows under code-gy7h. Closes: code-ahr7 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:47:30 +00:00
Viktor Barzin	0d5f53f337	monitoring(uk-payslip): replace misleading take-home rates in Panel 3 Drop the two misleading series in "Effective rate & take-home % (YTD cumulative)" — both used SUM(gross_pay) as denominator while only counting cash deductions/net in the numerator, which understated take-home by 25-30 pp because RSU shares are absent from the cash deposit but present in gross. Replaced with three semantically clean angles: - ytd_paye_rate_pct: SUM(income_tax) / SUM(taxable_pay) — HMRC audit rate (~41-42% in additional-rate band), kept as before. - ytd_cash_take_home_pct: SUM(net_pay) / SUM(gross_pay - rsu_vest) — what fraction of cash earnings hits the bank (~62-65%). - ytd_total_keep_pct: (SUM(net_pay) + 0.53 × SUM(rsu_vest)) / SUM(gross_pay) — true "what I actually keep" including post-tax RSU shares (47% marginal applied to vest value), ~55-60%. Added field overrides for clear color-coding (red/green/blue). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:45:47 +00:00
Viktor Barzin	8f0d13282c	monitoring(uk-payslip): drop cash PAYE/NI from "Tax & pension — monthly" Same reasoning as panel 2: cash-side income_tax and NI are inherently bumpy in vest months due to UK cumulative PAYE catching up on YTD, and the flat-47% strip can't fix it. Panel now shows only the explicit RSU vest tax (orange, 47% × rsu_vest), student loan, and pensions. The smooth view of total cash deductions stays available on panel 12 (YTD cumulative). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:43:32 +00:00
Viktor Barzin	2230cb6cf4	monitoring(uk-payslip): drop tax/NI from "Monthly cash flow (RSU stripped)" panel Vest months still bumped 4-5x in this panel after the flat-47% strip because UK cumulative PAYE genuinely catches up YTD tax in vest months, on top of the marginal RSU portion — no arithmetic split can make that line flat without distorting the data. The cash-flow question this panel answers (what hits the bank, RSU aside) is already covered cleanly by cash_gross + net_pay; the tax detail lives on Panel 11 where the RSU split is now linear. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:30:46 +00:00
Viktor Barzin	cb3ffa6d8d	monitoring(uk-payslip): smooth quarterly RSU tax bumps via flat 47% marginal Replace the implicit pro-rata RSU/cash split with an explicit flat 47% marginal (45% PAYE + 2% NI) for the RSU vest tax stack. The orange slice now scales linearly with rsu_vest instead of wobbling around the month's effective PAYE rate; cash PAYE/NI slices have those amounts subtracted out so the stack still totals to actual deductions. Affects panel 7 (monthly), panel 12 (YTD cumulative), panel 7 (YTD uses), and the Sankey panel. Verified on 35 months of live data: sum invariant holds exactly (cash + rsu_marginal + cash_ni == income_tax + national_insurance), no negatives in cash slices. Out of scope (left raw): effective-rate %, data-integrity, payslip table, P60/HMRC reconciliation — those are audit views that use unmodified income_tax / cash_income_tax columns. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:13:29 +00:00
Viktor Barzin	4315ed5c2a	[backup] Fix lvm-pvc-snapshot Pushgateway push (stdout pollution in cmd_prune_count) cmd_prune_count's `log " Pruned: ..."` wrote to stdout, which the caller captures via `pruned=$(cmd_prune_count)`. From 2026-04-16 onward (7d retention kicked in), pruned snapshots polluted the captured value with multi-line log text, breaking the Prometheus exposition format on the metric push (`lvm_snapshot_pruned_total ${pruned}` → 400 from Pushgateway). Snapshots themselves were always fine; only the metric push silently failed for ~9 nights, eventually triggering LVMSnapshotNeverRun (alert has 48h `for:`). Fix: redirect the inner log call to stderr so cmd_prune_count's stdout contains only the count. Also adopts `infra/scripts/lvm-pvc-snapshot.sh` as the source-of-truth (was edited only on the PVE host) and updates backup-dr.md to point at the .sh and document the scp deploy. Deploy: scp infra/scripts/lvm-pvc-snapshot.sh root@192.168.1.127:/usr/local/bin/lvm-pvc-snapshot Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 14:30:58 +00:00
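
The stdout-pollution bug described in 4315ed5c2a can be reproduced in a few lines. This is a minimal sketch, not the actual lvm-pvc-snapshot script: the `_buggy` / `_fixed` function names, the `log` body, and the snapshot names are hypothetical; only `cmd_prune_count`, the `pruned=$(...)` capture pattern, and the `lvm_snapshot_pruned_total` metric name come from the commit message.

```shell
#!/bin/sh
# Hypothetical logger: writes wherever the caller's stdout points.
log() {
    printf '%s\n' "$*"
}

# Buggy shape: log lines share stdout with the return value,
# so command substitution captures them together with the count.
cmd_prune_count_buggy() {
    log "  Pruned: snap-a"   # hypothetical snapshot names
    log "  Pruned: snap-b"
    echo 2
}

# Fixed shape: log lines are redirected to stderr,
# leaving stdout with the count only.
cmd_prune_count_fixed() {
    log "  Pruned: snap-a" >&2
    log "  Pruned: snap-b" >&2
    echo 2
}

buggy=$(cmd_prune_count_buggy)   # three lines: two log lines plus "2"
fixed=$(cmd_prune_count_fixed)   # exactly "2"

# Only the fixed value yields a single well-formed metric line.
printf 'lvm_snapshot_pruned_total %s\n' "$fixed"
```

With the buggy variant the captured value spans several lines, so interpolating it into the metric line produces text that is no longer valid Prometheus exposition format, which is consistent with the 400 response from Pushgateway reported in the commit.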

3051 commits