Compare commits

...

66 commits

Author SHA1 Message Date
Viktor Barzin
d093aed7f6 immich(server,ml): bump server to 4Gi + Recreate strategy on tight quota
Root cause of 502/503/decode errors clustered at 19:20 BST 2026-04-26: immich-server
hit its 3500Mi memory limit during a face-detection burst and was OOMKilled (Exit Code
137). VPA upperBound is 3050Mi but real-world bursts crossed it; with the single pod
running both API and microservices workers, the OOM took the API down for ~30s of
restart, surfacing as PlatformException image decode + 502 on uploads + 503 on
ActivityService to the iOS app.

Bump immich-server requests=limits to 4096Mi (per CLAUDE.md "upperBound x 1.3 for
volatile workloads" rule, with headroom over the OOM mark). Quota math: 9680Mi used -
2000Mi old req + 4096Mi new req = 11776Mi, fits the tier-2-gpu 12Gi cap.

Switch both immich-server and immich-machine-learning to Recreate strategy: the
namespace tier-2-gpu quota is too tight for RollingUpdate to keep an old + new pod up
during apply (transient 13776Mi > 12Gi cap, see "ResourceQuota blocks rolling updates"
in CLAUDE.md). With single replicas and Recreate, future memory tweaks no longer
require manual scale-to-0 dance.
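
For reference, the imperative equivalent of the Terraform change (deployment,
container and namespace names are assumed here, not copied from the stack):

  kubectl -n immich patch deployment immich-server -p '{
    "spec": {
      "strategy": {"type": "Recreate", "rollingUpdate": null},
      "template": {"spec": {"containers": [{
        "name": "immich-server",
        "resources": {"requests": {"memory": "4096Mi"},
                      "limits":   {"memory": "4096Mi"}}}]}}}}'

The "rollingUpdate": null is what makes the strategy flip valid; leftover
rollingUpdate fields are rejected once the strategy type is Recreate.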

Verified: new pod has limits.memory=4Gi, quota usage stable at 11776Mi/12Gi, immich
API serving normally.

Note: a pending node_selector drift on immich-machine-learning (gpu=true ->
nvidia.com/gpu.present=true) also reconciled in this apply; the canonical NVIDIA
operator label is already on the GPU node, so there is no scheduling impact.
2026-04-26 19:11:50 +00:00
Viktor Barzin
07bc0098e3 ci(woodpecker): show full terraform error on stack apply failure
The default workflow truncated the failed-stack output at `tail -5`,
which only captured the trailing source-line indicator (`│  45: resource
…`) and dropped the actual `Error: …` line above it. Bump to `tail -50`
so the real error is visible without re-running locally to reproduce.

Also fix the pre-warm step's FIRST_STACK detection — `head -1 file1
file2 | head -1` returns the file header (`==> .platform_apply <==`),
not the first stack name, so the cd then fails with "no such file or
directory". Use `cat | head -1` instead.

Pure logging-and-pre-warm change; no stacks touched, so this commit is
a no-op for the apply step.
2026-04-26 18:39:46 +00:00
Viktor Barzin
215717c90f monitoring(dashboards): tables at the bottom convention
wealth: move Activity log table from y=45 to y=77; the three barcharts
(Yearly return, Annual change, Per-account ROI) shift up by 14 to fill
the gap.

uk-payslip: move Sankey "where the money went" from y=80 to y=48 (right
above the table block); the three tables (Data integrity, All payslips,
YTD reconciliation) shift down by 14 so all four tables (4, 5, 6, 9) sit
contiguously at the bottom.

fire-planner and job-hunter still have intentional side-by-side
table/chart pairings; left untouched pending user direction on whether
to break them.
2026-04-26 18:30:52 +00:00
Viktor Barzin
bb28485ce0 monitoring(wealth): move 12mo return to top bar, shrink to w=4
Trailing 12-month investment return % was a full-width stat at y=59.
Now sits inline with Net worth / Contribution / Growth / ROI as the
fifth headline number — top-row stats reflowed from w=6 (×4) to w=5
(×4) + w=4 (×1). Title shortened to "12mo return" so it fits.
Panels below the old row shifted up by 4 rows to close the gap.
2026-04-26 18:19:24 +00:00
Viktor Barzin
532285e48c traefik: raise websecure idleTimeout 180s -> 600s for iOS Immich -1005
iOS NSURLSession held a dead TCP/TLS socket past Traefik's 180s idle close,
then errored with NSURLErrorDomain -1005 on the next thumbnail. Bumping the
timeout to 600s pushes the bug to "app idle for >10 min" -- much rarer in
normal use. Verified with /home/wizard/.claude/immich-scroll-sim.py
keepalive probe: 200s idle, mean reuse latency +1.8ms over warmup (was ~50ms
TLS handshake penalty before). Synthesis: ~/.claude/immich-debug/synthesis.md.
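
For the record, the option that moved is the websecure entryPoint's responding
timeout; if Traefik is run off the upstream chart this is the flag (the actual
change here lives in the Terraform/helm values, so treat this as a sketch):

  additionalArguments:
    - "--entryPoints.websecure.transport.respondingTimeouts.idleTimeout=600s"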
2026-04-26 12:32:05 +00:00
Viktor Barzin
3489621a45 nextcloud(backup): pin backup pod to nextcloud's node via podAffinity
The weekly backup mounts the same RWO PVC (proxmox-lvm-encrypted) as the
main nextcloud deployment. Single-node attach — the backup pod can never
mount the volume if it lands on a different node, and was stuck in
ContainerCreating for 6+ hours when cron fired today.

Add pod_affinity (required, hostname topology) so the backup co-locates
with the nextcloud app pod. Discovered via cluster-health probe; a manual
verify run was scheduled on k8s-node3 next to nextcloud's pod and completed
the rsync in seconds.
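
The affinity added is the standard required-podAffinity-on-hostname pattern;
roughly (the label selector is illustrative; the real one matches whatever
labels the nextcloud app pod carries):

  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: nextcloud
          topologyKey: kubernetes.io/hostname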
2026-04-26 11:03:20 +00:00
Viktor Barzin
a24cd7ceb7 monitoring(uk-payslip): yearly receipt aligns with P60 (RSU gross)
Switch the RSU stack from "after band-aware tax" to gross. Receipt
total is now pre-sacrifice gross compensation; bar − pension stack
≈ ytd_gross reported on the final March payslip / P60.

Verified alignment for 2025/26: bar−pension = £266,752 vs P60
ytd_gross = £268,127 — gap of £1,375 ≈ "other taxable" (benefits,
overtime). Remaining year-level gaps are upstream parser/ingest
issues, not dashboard logic:
  - 2024/25 +£27k: March 2025 payslip parsed bonus=£26,969 but never
    propagated it into gross_pay/income_tax. Receipt is more
    accurate than ytd_gross here.
  - 2023/24 −£36k: Feb 2024 payslip row appears to be missing from
    the table; ytd_gross has it, sum(gross_pay) doesn't.
  - 2022/23 −£10k: variant A→B transition residual.

SQL simplified — band-aware CTE chain dropped (no longer needed for
this panel since RSU is shown gross).
2026-04-26 10:24:06 +00:00
Viktor Barzin
d0152e1f38 crowdsec/traefik: stop captchaing legit Immich mobile bursts
Mobile timeline scrubs prefetch ~100 thumbs in <1s, which exhausted the
immich-rate-limit (avg=500, burst=5000) and produced a cascade of HTTP
429s. CrowdSec's local http-429-abuse scenario then fired captcha:1 on
the source IP (alert #291, 2026-04-25 — owner's Hyperoptic IPv6).

Two changes:
- crowdsec: add a second whitelist doc (viktor/immich-asset-paths-whitelist)
  filtering events by Immich asset paths so they never feed leaky buckets.
  Auth endpoints intentionally excluded — brute-force protection unchanged.
- traefik: raise immich-rate-limit avg=500->1000, burst=5000->20000 so
  legitimate mobile scrubs don't produce 429s in the first place.
2026-04-26 09:27:16 +00:00
Viktor Barzin
222013806d monitoring(uk-payslip): split salary into cash + pension on yearly receipt
The salary field on the payslip is pre-pension-sacrifice, so the
"Salary (gross)" stack already silently included the salary-sacrifice
pension contribution. Split it out so pension is explicitly visible:
- Salary (cash, post-sacrifice) = salary - pension_sacrifice
- Pension (salary sacrifice, untaxed) = pension_sacrifice
- Bonus
- RSU vest (after band-aware tax)

Bar total unchanged (just relabels what was already there). Pension
is now visibly counted as income — consistent with "untaxed but real"
framing.

Caveat documented in panel description: receipt total ≠ P60 gross
because P60 reports pre-RSU-tax gross. Receipt shows RSU net of tax
per earlier intent. To exactly match P60, swap rsu_after_tax →
rsu_vest gross.
2026-04-26 09:18:32 +00:00
root
423aac0908 Woodpecker CI Update TLS Certificates Commit 2026-04-26 00:03:26 +00:00
Viktor Barzin
21ac619fac monitoring(uk-payslip): promote yearly receipt + YTD gross YoY to row 4
Move both barchart/timeseries panels into row 4 (y=29, side-by-side
w=12 each, h=10) so the per-tax-year overviews appear right after
the income-tax-and-pension YTD row. Shift panels 13, 4, 5, 6, 8, 9
down by 10 to accommodate.

Final ordering: rows 1–3 = monthly + YTD timeseries (panels 1/7/2/3/11/12),
row 4 = yearly receipt + YTD gross YoY (16/17), then the wider
deduction/integrity/table panels below.
2026-04-25 23:58:15 +00:00
Viktor Barzin
53f555dc61 monitoring(uk-payslip): drop 3 panels referencing undeployed data
Removed:
- Panel 10 "HMRC Tax Year Reconciliation — Individual Tax API"
  → references hmrc_sync.tax_year_snapshot schema. The hmrc-sync
    service / DB has not been deployed, so the panel always errored
    with "relation does not exist".
- Panel 14 "Meta payroll: bank deposit vs payslip net pay"
  → references payslip_ingest.external_meta_deposits, which is
    created by alembic migration 0007. The deployed payslip-ingest
    image is at 0005, so the table doesn't exist.
- Panel 15 "RSU vest reconciliation — payslip vs Schwab"
  → references payslip_ingest.rsu_vest_events, created by migration
    0008. Same image-staleness story.

Verified all 14 remaining panels return without error via Grafana
/api/ds/query. SQL for the removed panels is preserved in git history;
re-add when the data sources are actually deployed.
2026-04-25 23:56:03 +00:00
Viktor Barzin
b2a25775aa monitoring(uk-payslip): simplify yearly receipt to earned-and-kept view
Replace the 7-stack "where total comp went" decomposition with a 3-stack
"what I actually earned" view: salary (gross), bonus (gross), and RSU
vest after band-aware tax (PAYE+NI withheld via sell-to-cover). Skips
income tax / NI / student loan / pension / RSU offset.

Bar height = real income kept across all components. RSU is net of tax
because it's withheld at source and never hits the bank account; salary
and bonus are gross because they're paid in full and taxes are deducted
elsewhere. This is the income-side view where tax is implicit, not the
deduction waterfall.

Per-year RSU after tax: 2020/21 £18k · 2021/22 £39k · 2022/23 £50k ·
2023/24 £26k · 2024/25 £71k · 2025/26 £73k.
2026-04-25 23:42:20 +00:00
Viktor Barzin
a17304f735 monitoring(uk-payslip): fix empty YTD gross YoY chart
Two bugs:
1. Synthetic dates projected onto 1970/71 fell outside the dashboard's
   default time range (now-10y → now), so Grafana filtered out every
   point. Switched to a sliding 12-month window
   (CURRENT_DATE - INTERVAL '12 months') as the projection base, plus
   a per-panel timeFrom: "13M" override so the panel always shows the
   last 13 months regardless of the dashboard's time picker.
2. ORDER BY tax_year, pay_date violated Grafana's long→wide conversion
   requirement (data must be ascending by time). Wrapped in a CTE and
   re-ordered by the synthetic time column. Pivoted result is now a
   single wide frame with 7 series (2019/20…2025/26).
2026-04-25 23:36:16 +00:00
Viktor Barzin
ac18c49a7b monitoring(wealth): fix x-axis label formatting on yearly bars
The default fieldConfig unit (percent on Yearly investment return %,
currencyGBP on Annual change decomposition) was being applied to the
"year" string column too — so x-axis labels rendered as "2024%" and
"£2,024" respectively. Add field overrides on the "year" column to
force unit=string. The earlier "tax_year" panels weren't affected
because "2024/25" doesn't parse as a number; "2024" did.
2026-04-25 23:31:03 +00:00
Viktor Barzin
77bed10a51 monitoring: investment-only returns + YoY YTD gross line chart
Wealth dashboard:
- "Yearly growth %" → "Yearly investment return %": switched to
  modified-Dietz formula `market_gain / (nw_start + 0.5 × contributions)`
  so contributions don't inflate the return. New money in is excluded —
  this is portfolio performance, not net-worth change.
- "Trailing 12-month growth %" → "Trailing 12-month investment return %":
  same formula, applied to the trailing 12mo window.

Pre-fix vs post-fix:
  2020: 155.0% → 5.12%   (large contributions on small base)
  2021: 344.7% → 26.45%
  2022: 26.9%  → -25.65% (the actual 2022 bear market)
  2023: 123.2% → 41.60%
  2024: 87.4%  → 25.70%
  2025: 46.8%  → 8.43%
  2026: 16.7%  → 3.28%   (YTD)

UK Payslip dashboard:
- Replaced the per-tax-year stacked bar with a year-over-year line chart:
  one line per tax year, X = month-of-tax-year (April→March, projected
  onto a 1970/71 fiscal calendar so years overlay), Y = cumulative YTD
  gross. Five+ lines visible at a glance for trend comparison.
2026-04-25 23:25:42 +00:00
Viktor Barzin
55d1da41f6 monitoring: more growth detail in Wealth + gross composition in UK Payslip
Wealth (4 new panels at the bottom):
- Trailing 12-month growth % (stat) — % change in net worth over last 12mo.
- Yearly growth % (bar per calendar year) — first→last valuation each year.
- Annual change decomposition (stacked bar) — splits each year's NW change
  into "net contributions" (new money in) and "market gain" (everything
  else: appreciation, dividends, FX). Answers "did I grow because I saved
  or because the market did the work?".
- Per-account ROI % (horizontal bar) — (value − contribution) / contribution
  × 100, latest snapshot. Excludes accounts with zero/negative net
  contribution (Schwab — distorts ratio after RSU sells).

UK Payslip (1 new panel below the yearly receipt):
- Gross composition by tax year (stacked bar) — salary / bonus / RSU vest /
  other components per tax year. Bar height = gross pay. Trends in salary
  growth, bonus levels, and RSU vest sizing at a glance.

All queries spot-checked via Grafana /api/ds/query.
2026-04-25 23:21:42 +00:00
Viktor Barzin
d48e222054 monitoring: lock Finance (Personal) folder to admin + fix cash classification
Folder ACL:
- Move uk-payslip + wealth dashboards to a new "Finance (Personal)"
  folder; job-hunter + fire-planner stay in "Finance" (open).
- New null_resource calls Grafana's folder permissions API after the
  dashboard sidecar materialises the folder, setting an admin-only
  ACL ({Admin: 4}). Default Viewer/Editor inheritance is overridden,
  so anonymous-Viewer (auth.anonymous=true) is denied. Server-admin
  always retains access.
- Verified: anonymous → 403 on uk-payslip + wealth, 200 on
  control dashboards (node-exporter); admin → 200 on all.

Wealth cash fix:
- Wealthfolio dumps WORKPLACE_PENSION wrappers entirely into
  cash_balance because it doesn't track underlying fund holdings.
  Reclassify pension cash as invested in the "Cash vs invested"
  panel so the cash series reflects actual uninvested broker cash
  (~£16k T212 ISA + Schwab) instead of phantom £154k.

  Pre-fix:  cash=£153,789 / invested=£870,282 / total=£1,024,071
  Post-fix: cash=£16,064  / invested=£1,008,008 / total=£1,024,071
2026-04-25 23:11:26 +00:00
Viktor Barzin
51bf38815c vault: record Phase 3 vault Released-PV cleanup
Deleted the 6 NFS PVs orphaned by the Phase 2 rolling and removed
their /srv/nfs/<dir> subtrees on the PVE host (~1.5 GB; vault-2 audit
log was 1.4 GB on its own). Cluster-wide Released-PV sweep on the
proxmox-lvm/encrypted side stays out of scope.
2026-04-25 23:08:45 +00:00
Viktor Barzin
498400173c wealthfolio-sync: skip the synthetic TOTAL row in ETL
Wealthfolio's daily_account_valuation includes a row with
account_id='TOTAL' that pre-aggregates the per-account values for that
day. Mirroring it into PG verbatim caused every SUM(total_value) in
the Wealth dashboard to double-count (showing ~£2M against actual
~£1M). Drop the synthetic row at the dump step so the PG mirror only
holds real-account rows.
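
Conceptually the guard is a single filter at the dump step (paths illustrative):

  sqlite3 /data/wealthfolio.db \
    "SELECT * FROM daily_account_valuation WHERE account_id <> 'TOTAL';"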

Initial sync after fix: 8,649 DAV rows (was 10,798), net worth resolves
to £1,024,071 — matches the per-account latest snapshot.
2026-04-25 22:59:24 +00:00
Viktor Barzin
f0ce7b0363 fire-planner: add stack, Vault DB role, dashboard, DB
New stacks/fire-planner/ mirrors payslip-ingest layout:
- ExternalSecret pulling RECOMPUTE_BEARER_TOKEN from Vault secret/fire-planner
- DB ExternalSecret templating DB_CONNECTION_STRING via static role pg-fire-planner
- FastAPI Deployment (serve), CronJob (recompute-all monthly on 2nd at 09:00 UTC,
  scheduled after wealthfolio-sync's 1st at 08:00), ClusterIP Service
- Grafana datasource ConfigMap "FirePlanner" — `database` inside jsonData
  (cc56ba29 fix; otherwise Grafana 11.2+ hits "you do not have default database")

Plus:
- vault/main.tf: pg-fire-planner static role (7d rotation), allowed_roles
- dbaas/modules/dbaas/main.tf: null_resource creates fire_planner DB+role
- monitoring/dashboards/fire-planner.json: 9-panel Finance-folder dashboard
  (NW timeseries, MC fan chart, success heatmap, lifetime tax bars,
  years-to-ruin table, optimal leave-UK stat, ending wealth stat,
  UK success-by-strategy bars, sequence-risk correlation table)
- monitoring/modules/monitoring/grafana.tf: register "fire-planner.json" in Finance folder

Apply order:
  1. vault stack — creates the static role
  2. dbaas stack — creates the database & role
  3. external-secrets stack picks up vault-database refs (no change needed)
  4. fire-planner stack — first apply with -target=kubernetes_manifest.db_external_secret
     before full apply, per the plan-time-data-source pattern
  5. monitoring stack — picks up the new dashboard ConfigMap

[ci skip]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 17:27:19 +00:00
Viktor Barzin
484b4c7190 vault: complete Phase 2 NFS-hostile migration; remove nfs-proxmox SC
All 3 vault voters now on proxmox-lvm-encrypted (vault-0 16:18, vault-1
+ vault-2 today). The NFS fsync incompatibility identified in the
2026-04-22 raft-leader-deadlock post-mortem is no longer reachable —
raft consensus log + audit log live on LUKS2 block storage with real
fsync semantics.

Cluster-wide consumers of the inline kubernetes_storage_class.nfs_proxmox
dropped to zero after the rolling, so the resource is removed from
infra/stacks/vault/main.tf. Released NFS PVs (6) remain in the cluster
and will be reclaimed in Phase 3 cleanup.

Lesson learned (recorded in plan): pvc-protection finalizer races the
StatefulSet controller — pod recreates on the OLD PVCs unless the
finalizer is patched out before pod delete. Force-finalize technique
applied to vault-1 + vault-2 successfully.
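
The force-finalize step is the usual finalizer patch (PVC name illustrative):

  # clear the pvc-protection finalizer so the Terminating PVC is actually
  # removed before the pod is deleted and recreated against the new PVC
  kubectl -n vault patch pvc data-vault-1 --type merge \
    -p '{"metadata":{"finalizers":null}}'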

Closes: code-gy7h
2026-04-25 17:10:00 +00:00
Viktor Barzin
df2fa0a31d state(vault): update encrypted state 2026-04-25 17:09:35 +00:00
Viktor Barzin
bf4c7618d8 wealth: SQLite→PG ETL sidecar + new Grafana dashboard
Mirrors Wealthfolio's daily_account_valuation / accounts / activities
from SQLite into a new PG database (wealthfolio_sync) every hour, so
Grafana can chart net worth, contributions, and growth over time.

Components:
- dbaas: null_resource creates wealthfolio_sync DB + role on the CNPG
  cluster (dynamic primary lookup so it survives failover).
- vault: pg-wealthfolio-sync static role rotates the password every 7d.
- wealthfolio: ExternalSecret pulls the rotated password into the WF
  namespace; new pg-sync sidecar (alpine + sqlite + postgresql-client +
  busybox crond) does sqlite3 .backup → TSV dump → truncate-and-reload
  psql, hourly at :07. Plus a grafana-wealth-datasource ConfigMap in
  the monitoring namespace (uid: wealth-pg).
- monitoring: new Wealth dashboard (wealth.json, 10 panels) — current
  net worth / contribution / growth / ROI% stats, then time-series
  for net worth, contribution-vs-market, growth area, per-account
  stacked area, cash-vs-invested, and a 100-row activity log.

Initial sync: 6 accounts, 10,798 daily valuations, 518 activities.
Verified PG totals match SQLite latest snapshot exactly.
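
The sidecar's hourly loop is, in essence (paths and the connection-string env
var are illustrative, not the deployed names):

  sqlite3 /wealthfolio/app.db ".backup /tmp/wf.db"   # consistent point-in-time copy
  sqlite3 -noheader -separator "$(printf '\t')" /tmp/wf.db \
    "SELECT * FROM daily_account_valuation;" > /tmp/dav.tsv
  psql "$DB_CONNECTION_STRING" -c "TRUNCATE daily_account_valuation;"
  psql "$DB_CONNECTION_STRING" -c "\copy daily_account_valuation FROM '/tmp/dav.tsv'"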
2026-04-25 17:07:33 +00:00
Viktor Barzin
7dd580972a state(vault): update encrypted state 2026-04-25 16:57:42 +00:00
Viktor Barzin
ac8d2f548b paperless-ngx: migrate to proxmox-lvm-encrypted
Document scans (receipts, contracts, IDs) are unambiguously sensitive
PII. Storage decision rule defaults sensitive data to
`proxmox-lvm-encrypted`, but paperless-ngx had been left on plain
`proxmox-lvm` by an abandoned migration attempt that left a dormant,
non-Terraform-managed encrypted PVC sitting unbound for 11 days.

Cleaned up the orphan, added the encrypted PVC properly via Terraform,
rsynced data with deployment scaled to 0, swapped claim_name. Plain
`proxmox-lvm` PVC retained for a 7-day soak before removal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 16:48:53 +00:00
Viktor Barzin
4f5f1ff8c2 monitoring(uk-payslip): add yearly receipt stacked barchart panel
New panel 16 (barchart, h=11, y=179): one stacked bar per tax year showing
total comp split into net pay (bank deposit), cash income tax, RSU tax
(band-aware marginal: PAYE+NI), cash NI, student loan, pension salary-
sacrifice, and RSU offset (Variant A only).

X-axis = tax_year (categorical), y-axis = currencyGBP. Bar height ≈
gross_pay + pension_sacrifice (small over-attribution in Variant A years
where the band-aware model exceeds recorded payslip PAYE).
2026-04-25 16:26:57 +00:00
Viktor Barzin
288efa89b3 vault: migrate vault-0 storage to proxmox-lvm-encrypted
Phase 2 of the NFS-hostile migration: data + audit storageClass on
the vault helm release switches from nfs-proxmox to
proxmox-lvm-encrypted, then per-pod rolling swap (24h soak between).

vault-0 swap done. vault-1 + vault-2 still on NFS — the rolling part
is what makes this safe (raft quorum maintained by 2 healthy pods
while one is replaced).

Also restores chart-default pod securityContext fields. The previous
`statefulSet.securityContext.pod = {fsGroupChangePolicy = "..."}`
block REPLACED (not merged) the chart's defaults — fsGroup,
runAsGroup, runAsUser, runAsNonRoot were all silently dropped. NFS
exports were permissive enough to mask the missing fsGroup; ext4 LV
volume root is root:root and the vault user (UID 100) couldn't open
vault.db, CrashLoopBackOff. Fix: provide all five fields explicitly,
survives future chart bumps. vault-1 and vault-2 retained their
correct securityContext from when their pod specs were written to
etcd, before the partial customization landed — the bug only surfaces
when a pod is recreated.
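
For reference, the block now set in full (paths per the upstream vault chart,
values mirroring its defaults at the time of writing, so treat this as a sketch):

  server:
    statefulSet:
      securityContext:
        pod:
          runAsNonRoot: true
          runAsGroup: 1000
          runAsUser: 100
          fsGroup: 1000
          fsGroupChangePolicy: OnRootMismatch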

Pre-flight raft snapshot saved at /tmp/vault-pre-migration-*.snap
(recovery anchor).

Refs: code-gy7h

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 16:19:49 +00:00
Viktor Barzin
08b13858dd state(vault): update encrypted state 2026-04-25 16:16:35 +00:00
Viktor Barzin
b3c29eda12 monitoring(uk-payslip): model UK income-tax bands + PA-taper for RSU marginal
Replaces the flat 47% (45 PAYE + 2 NI) RSU marginal across panels 3, 7, 8, 11,
and 12 with an exact piecewise band-aware computation. Each row computes
ani_prior/ani_pre/ani_post over the tax-year YTD (chronological model — the
RSU is taxed at the band its YTD ANI position occupies at the vest date,
mirroring PAYE withholding behaviour).

Bands (2024/25+, applied to all years):
  IT:  0% / 20% / 40% / 60% (PA-taper) / 45%   at 12,570 / 50,270 / 100k / 125,140
  NI:  0% / 8% / 2%                            at 12,570 / 50,270

PA-taper modelled as 60% effective IT marginal in £100k–£125,140
(40% on the £1 + 40% on the £0.50 of lost PA = 60%).
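
A point-marginal simplification of the same rate table (the deployed SQL walks
YTD ANI positions instead of a single point, but the bands are identical):

  -- combined IT + employee NI marginal at a given adjusted net income (GBP)
  WITH x(ani) AS (VALUES (130000::numeric))
  SELECT CASE
    WHEN ani <  12570 THEN 0.00
    WHEN ani <  50270 THEN 0.20 + 0.08
    WHEN ani < 100000 THEN 0.40 + 0.02
    WHEN ani < 125140 THEN 0.60 + 0.02  -- PA-taper band
    ELSE                   0.45 + 0.02
  END AS marginal_rate
  FROM x;                               -- 0.47 for ani = 130,000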

Spot-checked per tax-year totals via psql; numbers diverge from the flat
47% baseline most for years where vests cross PA-taper or basic-rate bands
(2020/21 ~35%, 2024/25 ~41%, 2025/26 ~43%).
2026-04-25 16:14:49 +00:00
Viktor Barzin
3f85cee1ef state(vault): update encrypted state 2026-04-25 16:08:38 +00:00
Viktor Barzin
43e4f3f68e immich: migrate PostgreSQL off NFS to proxmox-lvm-encrypted
Live PG data moves to a 10Gi LUKS-encrypted RWO PVC. WAL fsync per
commit on NFS contributed to the 2026-04-22 NFS writeback storm
(2h43m recovery, 3 of 4 nodes hard-reset). Backups remain on NFS
(append-only, NFS-tolerant).

The init container that writes postgresql.override.conf is now gated
on PG_VERSION presence — on a fresh PVC the file would otherwise make
initdb refuse the non-empty PGDATA. First boot skips the override and
initdb runs cleanly; second boot (after a forced restart) writes the
override so vchord/vectors/pg_prewarm load before the dump restore.
Idempotent on initialised PVCs.
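
A sketch of the gate (paths illustrative):

  # init container: only lay down the override once initdb has populated PGDATA
  if [ -f "$PGDATA/PG_VERSION" ]; then
    cp /config/postgresql.override.conf "$PGDATA/"
  else
    echo "fresh PVC: skipping override so initdb sees an empty PGDATA"
  fi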

Migration executed: pg_dumpall (1.9GB) → restore on encrypted PVC →
REINDEX clip_index/face_index → 111,843 assets verified, external
HTTP 200, all 10 extensions present (vector minor 0.8.0→0.8.1 only).
LV created on PVE host, picked up by lvm-pvc-snapshot.

See docs/plans/2026-04-25-nfs-hostile-migration-{design,plan}.md.
Phase 2 (Vault Raft) follows under code-gy7h.

Closes: code-ahr7

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:47:30 +00:00
Viktor Barzin
0d5f53f337 monitoring(uk-payslip): replace misleading take-home rates in Panel 3
Drop the two misleading series in "Effective rate & take-home % (YTD
cumulative)" — both used SUM(gross_pay) as denominator while only
counting cash deductions/net in the numerator, which understated
take-home by 25-30 pp because RSU shares are absent from the cash
deposit but present in gross. Replaced with three semantically clean
angles:

- ytd_paye_rate_pct: SUM(income_tax) / SUM(taxable_pay) — HMRC audit
  rate (~41-42% in additional-rate band), kept as before.
- ytd_cash_take_home_pct: SUM(net_pay) / SUM(gross_pay - rsu_vest) —
  what fraction of cash earnings hits the bank (~62-65%).
- ytd_total_keep_pct: (SUM(net_pay) + 0.53 × SUM(rsu_vest)) /
  SUM(gross_pay) — true "what I actually keep" including post-tax RSU
  shares (47% marginal applied to vest value), ~55-60%.

Added field overrides for clear color-coding (red/green/blue).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:45:47 +00:00
Viktor Barzin
8f0d13282c monitoring(uk-payslip): drop cash PAYE/NI from "Tax & pension — monthly"
Same reasoning as panel 2: cash-side income_tax and NI are inherently
bumpy in vest months due to UK cumulative PAYE catching up on YTD,
and the flat-47% strip can't fix it. Panel now shows only the
explicit RSU vest tax (orange, 47% × rsu_vest), student loan, and
pensions. The smooth view of total cash deductions stays available on
panel 12 (YTD cumulative).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:43:32 +00:00
Viktor Barzin
2230cb6cf4 monitoring(uk-payslip): drop tax/NI from "Monthly cash flow (RSU stripped)" panel
Vest months still bumped 4-5x in this panel after the flat-47% strip
because UK cumulative PAYE genuinely catches up YTD tax in vest
months, on top of the marginal RSU portion — no arithmetic split can
make that line flat without distorting the data. The cash-flow
question this panel answers (what hits the bank, RSU aside) is
already covered cleanly by cash_gross + net_pay; the tax detail lives
on Panel 11 where the RSU split is now linear.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:30:46 +00:00
Viktor Barzin
cb3ffa6d8d monitoring(uk-payslip): smooth quarterly RSU tax bumps via flat 47% marginal
Replace the implicit pro-rata RSU/cash split with an explicit flat
47% marginal (45% PAYE + 2% NI) for the RSU vest tax stack. The orange
slice now scales linearly with rsu_vest instead of wobbling around the
month's effective PAYE rate; cash PAYE/NI slices have those amounts
subtracted out so the stack still totals to actual deductions.

Affects panel 7 (monthly), panel 12 (YTD cumulative), panel 7
(YTD uses), and the Sankey panel. Verified on 35 months of live data:
sum invariant holds exactly (cash + rsu_marginal + cash_ni ==
income_tax + national_insurance), no negatives in cash slices.

Out of scope (left raw): effective-rate %, data-integrity, payslip
table, P60/HMRC reconciliation — those are audit views that use
unmodified income_tax / cash_income_tax columns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:13:29 +00:00
Viktor Barzin
4315ed5c2a [backup] Fix lvm-pvc-snapshot Pushgateway push (stdout pollution in cmd_prune_count)
cmd_prune_count's `log "  Pruned: ..."` wrote to stdout, which the
caller captures via `pruned=$(cmd_prune_count)`. From 2026-04-16 onward
(7d retention kicked in), pruned snapshots polluted the captured value
with multi-line log text, breaking the Prometheus exposition format
on the metric push (`lvm_snapshot_pruned_total ${pruned}` → 400 from
Pushgateway). Snapshots themselves were always fine; only the metric
push silently failed for ~9 nights, eventually triggering
LVMSnapshotNeverRun (alert has 48h `for:`).

Fix: redirect the inner log call to stderr so cmd_prune_count's stdout
contains only the count. Also adopts `infra/scripts/lvm-pvc-snapshot.sh`
as the source-of-truth (was edited only on the PVE host) and updates
backup-dr.md to point at the .sh and document the scp deploy.
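
In sketch form (log is the script's own helper; variable names illustrative):

  cmd_prune_count() {
    local pruned=0
    # ... prune loop elided ...
    log "  Pruned: ${pruned} snapshot(s)" >&2  # diagnostics to stderr
    echo "$pruned"                             # only stdout line; caller does pruned=$(cmd_prune_count)
  }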

Deploy: scp infra/scripts/lvm-pvc-snapshot.sh root@192.168.1.127:/usr/local/bin/lvm-pvc-snapshot

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 14:30:58 +00:00
Viktor Barzin
d231615ebb [monitoring] Fix fuse voltage alerts — divide raw deciVolt reading by 10
The tuya-bridge exporter reports `fuse_main_voltage` and
`fuse_garage_voltage` as raw uint16 from the Tuya protocol, which
encodes voltage in deciVolts (e.g. 2352 = 235.2V). The 200/260V
thresholds were comparing against the raw integer, so both
FuseMainVoltageAbnormal and FuseGarageVoltageAbnormal fired
continuously during normal mains conditions.

Dividing in the expression also makes `{{ $value }}V` render the
correct human-readable value in the alert summary.
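
Shape of the corrected expression (alert name from above; labels/for elided):

  - alert: FuseMainVoltageAbnormal
    expr: fuse_main_voltage / 10 < 200 or fuse_main_voltage / 10 > 260
    annotations:
      summary: "Mains voltage {{ $value }}V outside the 200-260V window"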

Root fix would be in tuya-bridge `_decode_value()` where
`name.startswith("voltage")` returns `int.from_bytes(...)` without the
/10 scaling that `decode_voltage_threshold` applies. Leaving that
alone to avoid breaking the automatic_transfer_switch scrape which
uses a different code path (`parse_voltage_string`).
2026-04-24 11:12:56 +00:00
Viktor Barzin
a5e4db9af8 [monitoring] Tuya Cloud root-cause alert + cascade suppression
New alert TuyaCloudDown fires when any *_tuya_cloud_up gauge == 0
(i.e., the Tuya Cloud API rejects scrape calls — the symptom during
last night's iot.tuya.com trial expiry, code=28841002). 5m for-duration
beats the 15m window of the seven downstream *MetricsMissing alerts, so
the new Alertmanager inhibit rule suppresses the per-device noise and
only TuyaCloudDown pages.

Also flips helm_release.prometheus.force_update from true to false:
force_update was tripping on the pushgateway PVC added in rev 188
(commit e51c104) — Helm's --force path tried to reset spec.volumeName
on a bound PVC. Disabled here; re-enable temporarily when a
StatefulSet volumeClaimTemplate change actually needs --force.

Bundled with pre-existing working-tree additions for Fuse/Thermostat
threshold alerts and expanded PowerOutage inhibit regex (landed in the
same Helm revision 190).

Verified: rule loaded, value=7 (all 7 tuya-bridge devices report
cloud_up=0 right now), TuyaCloudDown moved pending→firing after 5m,
3 *MetricsMissing alerts currently suppressed in Alertmanager with
inhibitedBy=1 (thermostat alerts still pending their 15m window, will
be suppressed on transition).
2026-04-23 09:59:48 +00:00
Viktor Barzin
5ebd3a81c3 tuya-bridge: liveness probe hits /health so k8s restarts silently-hung bridge
The bridge was down 10h 40m on 2026-04-22 without being restarted —
the liveness probe hit `/` (trivial Flask handler) which passed while
the actual Tuya-cloud call path was stuck. /health now reports Tuya
cloud reachability via a background probe in the app; point both
probes at it. Liveness: 60s grace + 6x30s = 3min of 503s before
restart; readiness: 2x15s = 30s before removal from service.
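
The probe numbers in manifest form (container port is an assumption; the rest
mirrors the description above):

  livenessProbe:
    httpGet: { path: /health, port: 5000 }
    initialDelaySeconds: 60   # 60s grace
    periodSeconds: 30
    failureThreshold: 6       # 6 x 30s = 3 min of 503s before restart
  readinessProbe:
    httpGet: { path: /health, port: 5000 }
    periodSeconds: 15
    failureThreshold: 2       # 30s before removal from the Service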

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 07:47:41 +00:00
Viktor Barzin
8e55c4357a [poison-fountain] opt ingress out of Uptime Kuma external monitor
Deployment is scaled to replicas=0 to silence ExternalAccessDivergence,
but the ingress at poison.viktorbarzin.me was still auto-annotated
`external-monitor=true` by ingress_factory (dns_type=non-proxied path),
so external-monitor-sync kept creating `[External] poison` which
probed a backend with no endpoints and flagged DOWN.

Setting `external_monitor = false` emits the explicit opt-out
annotation; next sync run deleted the orphaned monitor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 21:24:22 +00:00
Viktor Barzin
344fce3692 [monitoring][poison-fountain] pushgateway persistence + cronjob uid-0
Two independent root-cause fixes surfaced by the 2026-04-22 cluster
health check:

1. Pushgateway lost all in-memory metrics when node3 kubelet hiccuped
   at 11:42 UTC, hiding backup_last_success_timestamp{job="offsite-
   backup-sync"} until the next 06:01 UTC push — a ~18h false-negative
   window. Enable persistence on a 2Gi proxmox-lvm-encrypted PVC with
   --persistence.interval=1m. Chart note: values key is
   `prometheus-pushgateway:` (subchart alias), not `pushgateway:` (values sketch below).

2. poison-fountain-fetcher CronJob runs curlimages/curl as UID 100
   but the NFS mount /srv/nfs/poison-fountain is root:root 755 and
   the main Deployment runs as root, so mkdir /data/cache fails
   every 6h. Set run_as_user=0 on the CronJob container (no_root_squash
   is set on the export).

Closes the backup_offsite_sync FAIL on the next 06:01 UTC offsite
sync; closes the recurring poison-fountain evicted-pod noise on the
next 00:00 UTC cron tick.
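
Values sketch for the Pushgateway persistence (key names per the upstream
prometheus-pushgateway subchart; the stack's exact layout may differ):

  prometheus-pushgateway:
    persistentVolume:
      enabled: true
      size: 2Gi
      storageClass: proxmox-lvm-encrypted
    extraArgs:
      - --persistence.file=/data/pushgateway.bin
      - --persistence.interval=1m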

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:32:29 +00:00
Viktor Barzin
f1f723be83 [technitium] zone-sync now reconciles primaryNameServerAddresses
When a zone is created against a stale primary IP (e.g. the old primary
pod IP 10.10.36.189 before the technitium-primary ClusterIP service
existed), AXFR refresh keeps failing forever while every other zone on
the same replica refreshes fine from 10.110.37.186. The resync-only
branch didn't touch zone options, so the bad IP was pinned indefinitely.

This surfaced as rpi-sofia.viktorbarzin.lan returning 192.168.1.16
(pre-move) on secondaries while primary had the correct .10 from
2026-04-22 morning — Uptime Kuma Sofia RPI monitor DOWN, cluster
cluster_healthcheck FAIL.

The sync loop now re-applies primaryNameServerAddresses on every run
for existing zones. Idempotent — Technitium accepts identical values
— and self-heals any drift within 30 min. Env renamed PRIMARY_IP →
PRIMARY_HOST for consistency with the reconcile semantics.

Hostname form (technitium-primary.technitium.svc.cluster.local) was
tried but Technitium's own resolver doesn't forward svc.cluster.local,
so the field must stay a literal IP. Terraform tracks the ClusterIP on
every apply and the reconcile loop propagates it to replicas.
2026-04-22 17:47:18 +00:00
Viktor Barzin
7dfe89a6e0 [redis] stabilise against node-crash flap cascade — RC1-RC5 fixes
Five compounding factors produced the 2026-04-22 flap cascade: soft
anti-affinity let 2/3 pods co-locate on k8s-node3 (which bounced
NotReady→Ready at 11:42Z and took quorum), aggressive sentinel/probe
timing amplified LUKS-encrypted LVM I/O stalls into spurious
+switch-master loops, HAProxy's 1s polling raced sentinel failovers
and routed writes to demoted masters, publish_not_ready_addresses=true
fed not-yet-ready pods into HAProxy DNS, and realestate-crawler-celery
CrashLoopBackOff closed the feedback loop.

Changes:
- Anti-affinity: preferred → required (one redis pod per node, hard)
- Sentinel down-after-ms 5000→15000, failover-timeout 30000→60000
- Redis + sentinel liveness: timeout 3→10, failure_threshold 3→5
- HAProxy: check inter 1s→2s / fall 2→3, timeout check 3s→5s
- Headless svc: publish_not_ready_addresses true→false

Post-rollout verification clean: 0 flaps, 0 +switch-master events,
0 celery ReadOnlyError in the 60s window after settle. Docs updated.
2026-04-22 15:59:00 +00:00
Viktor Barzin
fdced7577b [monitoring] HomeAssistantCriticalSensorUnavailable alert 2026-04-22 14:52:23 +00:00
Viktor Barzin
dc05c440bc [hermes-agent] disable deployment — PVC permission mismatch
Main container crashes with "mkdir: cannot create directory '/opt/data':
Permission denied". Init container writes fine but main container runs
with different fsGroup/runAsUser. Scaling to 0 until the PVC permission
model is reworked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 14:31:50 +00:00
Viktor Barzin
a4eafafe49 [monitoring] Add GPUNodeUnschedulable alert — fires when GPU node is cordoned
After k8s-node1 was silently cordoned and broke Frigate camera streams,
existing alerts (NvidiaExporterDown, PodUnschedulable) didn't catch the
root cause proactively. This alert fires within 5m of the GPU node being
cordoned, before any pod restart attempts to schedule and fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 14:05:12 +00:00
Viktor Barzin
e2146e6916 gpu: schedule off NFD label, not k8s-node1 hostname
Remove every hardcoded reference to k8s-node1 that pinned GPU
scheduling to a specific host:

- GPU workload nodeSelectors: gpu=true -> nvidia.com/gpu.present=true
  (frigate, immich, whisper, piper, ytdlp, ebook2audiobook, audiblez,
  audiblez-web, nvidia-exporter, gpu-pod-exporter). The NFD label is
  auto-applied by gpu-feature-discovery on any node carrying an
  NVIDIA PCI device, so the selector follows the card.

- null_resource.gpu_node_config: rewrite to enumerate NFD-labeled
  nodes (feature.node.kubernetes.io/pci-10de.present=true) and taint
  each with nvidia.com/gpu=true:PreferNoSchedule. Drop the manual
  'kubectl label gpu=true' since NFD handles labeling.

- MySQL anti-affinity: kubernetes.io/hostname NotIn [k8s-node1] ->
  nvidia.com/gpu.present NotIn [true]. Same intent (keep MySQL off
  the GPU node) but portable when the card relocates.

Net effect: moving the GPU card between nodes no longer requires any
Terraform edit. Verified no-op for current scheduling — both old and
new labels resolve to node1 today.

Docs updated to match: AGENTS.md, compute.md, overview.md,
proxmox-inventory.md, k8s-portal agent-guidance string.
2026-04-22 13:43:07 +00:00
Viktor Barzin
134d6b9a82 vault runbook + raft/HA stuck-leader alerts
Post-2026-04-22 Step 5 deliverables:
- docs/runbooks/vault-raft-leader-deadlock.md — safe pod-restart
  sequence that avoids zombie containerd-shim + kernel NFS
  corruption, qm reset no-op gotcha, boot-order gotcha.
- prometheus_chart_values.tpl — VaultRaftLeaderStuck +
  VaultHAStatusUnavailable. Silent until vault telemetry
  scraping lands (tracked as beads code-vkpn).

Epic for moving vault off NFS tracked as beads code-gy7h.
2026-04-22 12:44:46 +00:00
Viktor Barzin
4cb2c157da post-mortem 2026-04-22: full timeline — second regression + node4 reboot
The initial recovery at 11:03 was premature; vault-1's audit writes over
NFS started hanging ~15 min later and the cluster regressed to 503.
Full recovery required rebooting node4 (to free vault-0's stuck NFS
mount and shed PVE NFS thread contention) and a second reboot of node3
(to clear another round of kernel NFS client degradation). Final
recovery at 11:43:28 UTC with vault-2 as active leader on the quorum
vault-0 + vault-2.

vault-1 remains stuck in ContainerCreating on node2 — a third node2
reboot is required for full 3/3 quorum, but 2/3 is operationally
sufficient, so that's deferred.
2026-04-22 11:44:56 +00:00
Viktor Barzin
2f1f9107f8 vault: add fsGroupChangePolicy=OnRootMismatch + 2026-04-22 post-mortem
The 2026-04-22 Vault outage caught kubelet in a 2-minute chown loop that
never exited because the default fsGroupChangePolicy (Always) walks every
file on the NFS-backed data PVC. With retrans=3,timeo=30 NFS options and
a 1GB audit log, the recursive chown outlasted the deadline and restarted
forever — blocking raft quorum recovery. OnRootMismatch makes chown a
no-op when the volume root is already correct, which it always is after
initial setup.

The breakglass fix was applied live via kubectl patch at 10:54 UTC; this
commit persists it in Terraform so the next apply doesn't revert.

The post-mortem also documents the upstream raft stuck-leader pattern,
NFS kernel client corruption after force-kill, and the path to migrate
Vault off NFS to proxmox-lvm-encrypted.
2026-04-22 11:12:19 +00:00
Viktor Barzin
6a4a477336 [infra] Update RPi Sofia DNS: 192.168.1.16 → 192.168.1.10
RPi now uses USB Ethernet (eth1) as primary uplink at .10 instead of
the old wlan1 address at .16. Camera namespace and DNAT updated to
use eth1 with systemd persistence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 10:55:34 +00:00
Viktor Barzin
d39770b30d monitoring: tighten LVMSnapshotStale to 30h for daily-cadence detection
Threshold was 48h + 30m for: a job that runs daily. We don't need
to wait 2.5 days to detect a broken timer — bring it down to 30h
+ 30m (just over a day of cadence + minor drift/retry grace). Also
add a description pointing to the restore runbook so the alert
text surfaces the fix path directly.

Threshold change: 172800s → 108000s. Docs in backup-dr.md synced.

Re-triggers default.yml apply now that ci/Dockerfile is rebuilt
with vault CLI — this is the first commit touching a stack that
will actually succeed since the e80b2f02 regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 08:54:37 +00:00
Viktor Barzin
3eb8b9a4ea ci: add vault CLI to infra-ci image + surface real errors in scripts/tg
The Woodpecker CI pipeline has been silently failing to apply Tier 1
stacks since the state-migration commit e80b2f02 because the Alpine
CI image never had the vault CLI. `scripts/tg` swallowed stderr with
`2>/dev/null` and surfaced a misleading "Cannot read PG credentials
from Vault" message — the real error was `sh: vault: not found`.

Verified with an in-cluster probe: woodpecker/default SA + role=ci
already gets the terraform-state policy and has read capability on
database/static-creds/pg-terraform-state. Auth was never the problem;
the vault binary just wasn't there.

- ci/Dockerfile: pin vault v1.18.1 (matches server) and install
- scripts/tg: pre-flight check + surface real vault output on failure
- Next build-ci-image.yml run rebuilds :latest with vault included;
  subsequent default.yml runs unblock monitoring apply (code-aoxk)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 08:46:50 +00:00
Viktor Barzin
4a343c33f0 monitoring: bring EmailRoundtripStale threshold docs in sync with for:20m
Doc claimed >40m; actual fire time is 80m (60m last-success threshold +
20m 'for'). Stale since pre-existing config; now re-stale after raising
'for' from 10m to 20m in 9b4970da. Files out of sync only on this one
alert row.
2026-04-21 22:39:46 +00:00
Viktor Barzin
9b4970da61 monitoring: alert hygiene — disambiguate, rename, tune, fix inhibits
- HighPowerUsage: add subsystem:gpu (line 724) + subsystem:r730 (line 775)
  labels so the two same-named alerts are distinguishable in routing.
- HeadscaleDown (deployment-replicas flavor, line 1414) → rename to
  HeadscaleReplicasMismatch. Line 2039 keeps HeadscaleDown as the real
  up-metric critical check. NodeDown inhibit rule updated to suppress
  the renamed alert too.
- EmailRoundtripStale (line 1816): for 10m → 20m. Survives one missed
  20-min probe cycle before firing, cuts flapping (12 short-burst fires
  over last 24h).

ATSOverload tuning skipped: 24h fire-count is 0, it's continuously
firing not flapping — already-known sustained 83% ATS load, tuning
would not change behavior.

8 backup *NeverSucceeded rules audited: all 7 using
kube_cronjob_status_last_successful_time target real K8s CronJobs with
active metrics (not Pushgateway-sourced). PrometheusBackupNeverRun
already uses absent() correctly. No fixes needed.
2026-04-21 22:29:15 +00:00
Viktor Barzin
ac695dea38 [registry] bulk-clean 34 orphan manifests + beads-server image bump
Registry integrity probe surfaced 38 broken manifest references
(34 unique repo:tag pairs, same OCI-index orphan pattern as the 04-19
infra-ci incident). Deleted all via registry HTTP API + ran GC;
reclaimed ~3GB blob storage.
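
For anyone repeating the cleanup, the per-manifest sequence is roughly
(registry URL is a placeholder; freedify is one of the affected repos):

  REG=https://registry.example.com
  # resolve the digest behind the broken tag (OCI index media type)
  curl -sI -H 'Accept: application/vnd.oci.image.index.v1+json' \
       "$REG/v2/freedify/manifests/latest" | grep -i docker-content-digest
  # delete by digest, then garbage-collect on the registry host
  curl -X DELETE "$REG/v2/freedify/manifests/sha256:<digest-from-above>"
  registry garbage-collect /etc/docker/registry/config.yml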

beads-server CronJobs were stuck ImagePullBackOff on
claude-agent-service:0c24c9b6 for >6h — bumped variable default to
2fd7670d (canonical tag in claude-agent-service stack, already healthy
in registry) so new ticks can fire.

Rebuilt in-use broken tags: freedify:{latest,c803de02} and
beadboard:{17a38e43,latest} on registry VM; priority-pass via
Woodpecker pipeline #8. wealthfolio-sync:latest deferred (monthly
CronJob, next run 2026-05-01).

Probe now reports 0/39 failures. RegistryManifestIntegrityFailure
alert cleared.

Closes: code-8hk
Closes: code-jh3c

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:16:34 +00:00
Viktor Barzin
9041f52b05 monitoring: TechnitiumZoneCountMismatch — compare replicas only, exclude primary
Primary has only the Primary-type zones it owns (10). Replicas have those
+ built-in zones (localhost, in-addr.arpa reverse, etc.), so their count
(14) can never match primary. Alert expr compared max-min across all
instances, making it chronically firing.

Fix: instance!="primary" filter. The real signal this alert wants is
"did one replica drift from the others" — replica-to-replica comparison
captures that; primary was never comparable.
2026-04-19 22:15:55 +00:00
Viktor Barzin
4bedabb9e8 healthcheck: fix three false-positive WARNs (HA token, cert-manager, LVM snap grep)
- HA Sofia token: auto-bootstrap from Vault secret/viktor/haos_api_token when
  HOME_ASSISTANT_SOFIA_{URL,TOKEN} env vars are unset. Default URL =
  https://ha-sofia.viktorbarzin.me.
- cert-manager: add cert_manager_installed() probe (kubectl get crd
  certificates.cert-manager.io). When not installed — which is our current
  state — report PASS "N/A" instead of noisy WARN "CRDs unavailable".
- LVM snapshot freshness: grep pattern was `-- -snap` but actual LV names use
  underscore (`foo_snap_YYYY...`), so the grep matched nothing and the check
  always WARN'd. Fixed to `grep _snap`.

After fix: PASS 36→40, WARN 9→6, FAIL 1→1 (new ha_entities FAIL is a real
HA issue, not a script bug — 400/1401 sensors stale on ha-sofia).
2026-04-19 22:13:32 +00:00
Viktor Barzin
e092f159b3 monitoring: drop MAM Mouse-class + qBittorrent-unsatisfied alerts
Both alerts fired as expected noise while the MAM account is in new-member
Mouse class — tracker refuses announces and the 72h seed-gate can't be met
until ratio recovers. Keeping the rest of the MAM rules (cookie expiry,
ratio, farming/janitor stalls, qbt disconnect) which still signal real
pipeline failures.

Firing count drops from 7 → 3 in healthcheck.
2026-04-19 21:24:46 +00:00
Viktor Barzin
68a10905e0 [monitoring] uk-payslip Panel 13: stacked bars + sum-in-legend
"Monthly cash flow — tax impact (RSU excluded)" was already stacking
group A in normal mode but rendered as 70%-opacity filled lines — the
overlap made the total-per-month figure visually inaccessible.

Switch drawStyle to bars (100% fill, 0-width lineWidth, no per-point
markers) so each month reads as a single stacked bar whose top edge is
the total cash-side deduction. Add "sum" to legend.calcs so the
tax-year totals per series show in the legend table alongside last and
max.

Panel 11 (Tax & pension — monthly, RSU-inclusive) retains the line/
area style so the two panels remain visually distinct.
2026-04-19 20:31:53 +00:00
Viktor Barzin
2224a6b2cc [job-hunter] Bump image to 92afc38d — Frankfurter FX + comp_table COALESCE 2026-04-19 19:09:54 +00:00
Viktor Barzin
e813170960 [job-hunter] Bump image to 99ab188f — levels.fyi per-level + comp_points
99ab188f adds the structured-comp pipeline: levels.fyi __NEXT_DATA__
scraper, Robert Walters + Hays PDF parser, comp_points/levels tables
(alembic 0003), CLI comp/comp-table/comp-band/backfill-levels, and
Grafana panels 6-9. Alembic 0003 runs via the existing init container.

After apply, exec:
  kubectl -n job-hunter exec deploy/job-hunter -c job-hunter -- \
    python -m job_hunter backfill-levels
  kubectl -n job-hunter exec deploy/job-hunter -c job-hunter -- \
    python -m job_hunter refresh --source levels_fyi
  kubectl -n job-hunter exec deploy/job-hunter -c job-hunter -- \
    python -m job_hunter refresh --source uk_surveys
2026-04-19 18:56:20 +00:00
Viktor Barzin
3f6dfb10aa [monitoring] job-hunter: panels 6-9 for comp_points tables + trends
Append the structured-comp dashboard surface to the job-hunter
dashboard:

Panel 6 — Per-company salary by level (p50 base, GBP table).
Panel 7 — Total-comp heatmap per (company, level), p50 GBP.
Panel 8 — Comp-point volume by source (daily time-series).
Panel 9 — Base-salary trend (p50) over time for the top 5 companies.

Adds templating: $location (multi, default london), $level (single,
default senior), $company (multi, default all) — populated from
comp_points + levels metadata so the selection reflects what was
actually ingested.

Closes: code-5ph
2026-04-19 18:50:48 +00:00
Viktor Barzin
a8280e77b6 [broker-sync] unsuspend IMAP + Panel 15 RSU vest reconciliation (Phase D)
Activates the Schwab/InvestEngine IMAP ingest CronJob that's been
scaffolded-but-suspended since Phase 2 of broker-sync, now that the
Schwab parser can detect vest-confirmation emails. Runs nightly 02:30 UK.

Current behaviour once deployed:
  - Trade confirmations (Schwab sell-to-cover, InvestEngine orders) →
    Activity rows posted to Wealthfolio. Unchanged.
  - Release Confirmations (Schwab RSU vests) → parser returns gross-vest
    BUY + sell-to-cover SELL Activities (to Wealthfolio) and a VestEvent
    object (NOT YET persisted — Postgres sink + DB grant pending; see
    follow-up under code-860). Vest detection uses a subject/body
    heuristic that will need tightening against a real email fixture.

Panel 15 of the UK payslip dashboard added: per-vest-month join of
payslip.rsu_vest vs rsu_vest_events (gross_value_gbp, tax_withheld_gbp)
with delta columns. Tax-delta-percent coloured green/orange/red at
0/2%/5% thresholds. Table is empty until broker-sync starts persisting
VestEvents — harmless until then.

Before applying:
  - Verify IMAP creds in Vault (secret/broker-sync: imap_host,
    imap_user, imap_password, imap_directory) are still valid.
  - Empty vest-event table is expected; delta columns show NULL until
    the postgres sink lands.

Part of: code-860
2026-04-19 18:29:01 +00:00
Viktor Barzin
1c0e1bcdde [payslip-ingest] ActualBudget payroll sync CronJob + Panel 14 (Phase C)
Wires the daily ActualBudget deposit sync from the payslip-ingest app into
K8s as a CronJob, and adds dashboard Panel 14 to overlay bank deposits
against payslip net_pay.

CronJob: actualbudget-payroll-sync in payslip-ingest namespace, runs
02:00 UTC. Calls `python -m payslip_ingest sync-meta-deposits`, which
hits budget-http-api-viktor in the actualbudget namespace and upserts
matching Meta payroll deposits into payslip_ingest.external_meta_deposits.

ExternalSecret extended with three new Vault keys:
  - ACTUALBUDGET_API_KEY (same as actualbudget-http-api-viktor's env API_KEY)
  - ACTUALBUDGET_ENCRYPTION_PASSWORD (Viktor's budget password)
  - ACTUALBUDGET_BUDGET_SYNC_ID (Viktor's sync_id)

These must be seeded at secret/payslip-ingest in Vault before the
CronJob will run — it'll CrashLoop on missing env vars otherwise. First
run can be triggered on demand via `kubectl -n payslip-ingest create
job --from=cronjob/actualbudget-payroll-sync initial-sync`.

Panel 14 plots monthly SUM(external_meta_deposits.amount) vs
SUM(payslip.net_pay), plus a delta bar series — |delta| > £50 flags
likely parser drift on net_pay.

Part of: code-860
2026-04-19 18:21:20 +00:00
60 changed files with 5997 additions and 2175 deletions

View file

@ -122,8 +122,9 @@ Channel 3: A4 [32G] ──── A8 [32G] ──── A12[ 8G ] = 72 GB
| `offsite-sync-backup.timer` | Timer | Daily 06:00 | Two-step rsync to Synology (sda + NFS via inotify) |
| `nfs-change-tracker.service` | Service | Continuous | inotifywait on `/srv/nfs` + `/srv/nfs-ssd`, logs to `/mnt/backup/.nfs-changes.log` |
## GPU Node (k8s-node1)
- **VMID**: 201, **PCIe**: `0000:06:00.0` (NVIDIA Tesla T4)
- **Taint**: `nvidia.com/gpu=true:NoSchedule`, **Label**: `gpu=true`
- GPU workloads need: `node_selector = { "gpu": "true" }` + nvidia toleration
- Taint applied via `null_resource.gpu_node_taint` in `modules/kubernetes/nvidia/main.tf`
## GPU Node (currently k8s-node1)
- **VMID**: 201, **PCIe**: `0000:06:00.0` (NVIDIA Tesla T4) — physical passthrough, no Terraform pin
- **Taint**: `nvidia.com/gpu=true:PreferNoSchedule` (applied dynamically to every NFD-discovered GPU node)
- **Label**: `nvidia.com/gpu.present=true` (auto-applied by gpu-feature-discovery; also `feature.node.kubernetes.io/pci-10de.present=true` from NFD)
- GPU workloads need: `node_selector = { "nvidia.com/gpu.present" : "true" }` + nvidia toleration
- Taint applied via `null_resource.gpu_node_config` in `stacks/nvidia/modules/nvidia/main.tf`; node discovery keyed on the NFD `pci-10de.present` label so the taint follows the card to whichever host is carrying it

View file

@ -128,7 +128,7 @@ steps:
# ── Pre-warm provider cache ──
- |
if [ -s .platform_apply ] || [ -s .app_apply ]; then
FIRST_STACK=$(head -1 .platform_apply .app_apply 2>/dev/null | head -1)
FIRST_STACK=$(cat .platform_apply .app_apply 2>/dev/null | head -1)
if [ -n "$FIRST_STACK" ]; then
echo "Pre-warming provider cache from stacks/$FIRST_STACK..."
cd "stacks/$FIRST_STACK" && terragrunt init --terragrunt-non-interactive -input=false 2>&1 | tail -3 && cd ../..
@ -150,7 +150,7 @@ steps:
if echo "$OUTPUT" | grep -q "is locked by"; then
echo "[$stack] SKIPPED (locked by another session)"
else
echo "$OUTPUT" | tail -5
echo "$OUTPUT" | tail -50
echo "[$stack] FAILED (exit $EXIT)"
FAILED_PLATFORM_STACKS="$FAILED_PLATFORM_STACKS $stack"
fi
@ -178,7 +178,7 @@ steps:
if echo "$OUTPUT" | grep -q "is locked by"; then
echo "[$stack] SKIPPED (locked by another session)"
else
echo "$OUTPUT" | tail -5
echo "$OUTPUT" | tail -50
echo "[$stack] FAILED (exit $EXIT)"
FAILED_APP_STACKS="$FAILED_APP_STACKS $stack"
fi

View file

@ -163,10 +163,10 @@ lifecycle {
## Infrastructure
- **Proxmox**: 192.168.1.127 (Dell R730, 22c/44t, 142GB RAM)
- **Nodes**: k8s-master (10.0.20.100), node1 (GPU, Tesla T4), node2-4
- **GPU**: `node_selector = { "gpu": "true" }` + toleration `nvidia.com/gpu`
- **GPU**: `node_selector = { "nvidia.com/gpu.present" : "true" }` + toleration `nvidia.com/gpu`. The label is auto-applied by NFD/gpu-feature-discovery on any node with an NVIDIA PCI device — nothing is hostname-pinned, so the GPU card can move between nodes without Terraform edits.
- **Pull-through cache**: 10.0.20.10 — docker.io (:5000), ghcr.io (:5010) only. Caches stale manifests for :latest tags — use versioned tags or pre-pull with `ctr --hosts-dir ''` to bypass.
- **pfSense**: 10.0.20.1 (gateway, firewall, DNS forwarding)
- **MySQL InnoDB Cluster**: 1 instance on proxmox-lvm (scaled from 3 — only Uptime Kuma + phpIPAM remain), PriorityClass `mysql-critical` + PDB, anti-affinity excludes k8s-node1 (GPU node)
- **MySQL InnoDB Cluster**: 1 instance on proxmox-lvm (scaled from 3 — only Uptime Kuma + phpIPAM remain), PriorityClass `mysql-critical` + PDB, anti-affinity excludes any GPU node (`nvidia.com/gpu.present=true`) so MySQL moves off the GPU host automatically if the card is relocated
- **SMTP**: `var.mail_host` port 587 STARTTLS (not internal svc address — cert mismatch)
## Contributor Onboarding

View file

@ -1,12 +1,11 @@
FROM alpine:3.20
# Rebuild 2026-04-19 — previous :latest index referenced missing blobs (404 on 98f718c8 / 27d5ab83)
# Pin versions to match CI requirements
ARG TERRAFORM_VERSION=1.5.7
ARG TERRAGRUNT_VERSION=0.99.4
ARG SOPS_VERSION=3.9.4
ARG KUBECTL_VERSION=1.34.0
ARG VAULT_VERSION=1.18.1
# Install system packages (single layer)
RUN apk add --no-cache \
@ -36,6 +35,16 @@ RUN curl -fsSL "https://dl.k8s.io/release/v${KUBECTL_VERSION}/bin/linux/amd64/ku
-o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl
# Vault CLI — required by scripts/tg for Tier 1 stack PG credential reads
# and Tier 0 advisory locks. Pinned to server version (1.18.1). Without this
# the CI pipeline surfaces the misleading "Cannot read PG credentials" error
# because scripts/tg swallows stderr ("vault: not found").
RUN curl -fsSL "https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_linux_amd64.zip" \
-o /tmp/vault.zip \
&& unzip /tmp/vault.zip -d /usr/local/bin/ \
&& rm /tmp/vault.zip \
&& vault version
# Provider cache directory (shared across stacks)
ENV TF_PLUGIN_CACHE_DIR=/tmp/terraform-plugin-cache
ENV TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE=1

Binary file not shown.

View file

@ -217,7 +217,7 @@ graph LR
Native LVM thin snapshots provide crash-consistent point-in-time recovery for 62 Proxmox CSI PVCs. These are CoW snapshots — instant creation, minimal overhead, sharing the thin pool's free space.
**Script**: `/usr/local/bin/lvm-pvc-snapshot` on PVE host (source: `infra/scripts/lvm-pvc-snapshot`)
**Script**: `/usr/local/bin/lvm-pvc-snapshot` on PVE host (source: `infra/scripts/lvm-pvc-snapshot.sh`). Deploy: `scp infra/scripts/lvm-pvc-snapshot.sh root@192.168.1.127:/usr/local/bin/lvm-pvc-snapshot`
**Schedule**: Daily 03:00 via systemd timer, 7-day retention
**Discovery**: Auto-discovers PVC LVs matching `vm-*-pvc-*` pattern in VG `pve` thin pool `data`
@ -226,7 +226,7 @@ Native LVM thin snapshots provide crash-consistent point-in-time recovery for 62
- They already have app-level dumps (Layer 2)
- Including them causes ~36% write amplification; excluding them reduces overhead to ~0%
**Monitoring**: Pushes metrics to Pushgateway via NodePort (30091). Alerts: `LVMSnapshotStale` (>24h), `LVMSnapshotFailing`, `LVMThinPoolLow` (<15% free).
**Monitoring**: Pushes metrics to Pushgateway via NodePort (30091). Alerts: `LVMSnapshotStale` (>30h since last run + 30m `for:`), `LVMSnapshotFailing`, `LVMThinPoolLow` (<15% free).
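The push itself is plain Pushgateway text exposition over HTTP; a hedged example with illustrative metric names (the job label matches the script's `PUSHGATEWAY_JOB`):
```sh
# Push a last-success timestamp and a failure flag for this job/instance group.
cat <<EOF | curl -sf --data-binary @- \
  http://10.0.20.100:30091/metrics/job/lvm-pvc-snapshot/instance/pve
# TYPE lvm_snapshot_last_success_timestamp gauge
lvm_snapshot_last_success_timestamp $(date +%s)
# TYPE lvm_snapshot_failed gauge
lvm_snapshot_failed 0
EOF
```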
**Restore**: `lvm-pvc-snapshot restore <pvc-lv> <snapshot-lv>` — auto-discovers K8s workload, scales down, swaps LVs, scales back up. See `docs/runbooks/restore-lvm-snapshot.md`.
@ -234,7 +234,7 @@ Native LVM thin snapshots provide crash-consistent point-in-time recovery for 62
**Backup disk**: sda (1.1TB RAID1 SAS) → VG `backup` → LV `data` → ext4 → mounted at `/mnt/backup` on PVE host. Dedicated backup disk, independent of live storage.
**Script**: `/usr/local/bin/daily-backup` on PVE host (source: `infra/scripts/daily-backup`)
**Script**: `/usr/local/bin/daily-backup` on PVE host (source: `infra/scripts/daily-backup.sh`)
**Schedule**: Daily 05:00 via systemd timer
**Retention**: 4 weekly versions (weeks 0-3 via `--link-dest` hardlink dedup)
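The weekly rotation with `--link-dest` amounts to roughly this sketch (source path is illustrative; unchanged files are hardlinked against the previous week, so each extra version only costs the changed blocks):
```sh
SRC=/srv/nfs/                                  # hypothetical source tree
DST=/mnt/backup/weekly
rm -rf "$DST/week-3"                           # drop the oldest of the 4 versions
for i in 3 2 1; do                             # week-2 -> week-3, week-1 -> week-2, week-0 -> week-1
  [ -d "$DST/week-$((i-1))" ] && mv "$DST/week-$((i-1))" "$DST/week-$i"
done
rsync -a --delete --link-dest="$DST/week-1" "$SRC" "$DST/week-0/"
```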
@ -673,7 +673,7 @@ module "nfs_backup" {
│ ~~CloudSyncNeverRun~~ REMOVED (TrueNAS decommissioned) │
│ ~~CloudSyncFailing~~ REMOVED (TrueNAS decommissioned) │
│ VaultwardenIntegrityFail integrity_ok == 0 │
│ LVMSnapshotStale > 24h since last snapshot │
│ LVMSnapshotStale > 30h since last snapshot │
│ LVMSnapshotFailing snapshot creation failed │
│ LVMThinPoolLow < 15% free space in thin pool │
│ WeeklyBackupStale > 8d since last success │
@ -692,6 +692,16 @@ module "nfs_backup" {
- ~~CloudSync monitor~~: Removed (TrueNAS decommissioned)
- Vaultwarden integrity: Pushes `vaultwarden_sqlite_integrity_ok` hourly
**Pushgateway persistence**: The Pushgateway is configured with
`--persistence.file=/data/pushgateway.bin --persistence.interval=1m`
on a 2Gi `proxmox-lvm-encrypted` PVC (helm values:
`prometheus-pushgateway.persistentVolume`). Without this, every pod
restart drops in-memory metrics. Once-per-day pushers (offsite-sync,
weekly backup) are otherwise invisible for up to 24h if the
Pushgateway restarts between pushes — which is exactly what triggered
the 2026-04-22 backup_offsite_sync FAIL (node3 kubelet hiccup at
11:42 UTC terminated the Pushgateway 8h after the 03:12 UTC push).
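To confirm a once-a-day push survived a Pushgateway restart, the built-in per-group `push_time_seconds` metric is enough; a quick check (the monitoring namespace and pod label are assumptions):
```sh
# A push_time_seconds value older than the pod's startTime proves the
# persistence file was reloaded rather than the metric being re-pushed.
curl -s http://10.0.20.100:30091/metrics | grep '^push_time_seconds'
kubectl -n monitoring get pod -l app.kubernetes.io/name=prometheus-pushgateway \
  -o jsonpath='{.items[0].status.startTime}{"\n"}'
```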
**Alert routing**:
- All backup alerts → Slack `#infra-alerts`
- Vaultwarden integrity fail → Slack `#infra-critical` (immediate action required)

View file

@ -18,7 +18,7 @@ graph TB
subgraph Proxmox["Proxmox VE"]
direction TB
MASTER["VM 200: k8s-master<br/>8c / 32GB<br/>10.0.20.100"]
NODE1["VM 201: k8s-node1<br/>16c / 32GB<br/>GPU Passthrough<br/>nvidia.com/gpu=true:NoSchedule"]
NODE1["VM 201: k8s-node1<br/>16c / 32GB<br/>GPU Passthrough<br/>nvidia.com/gpu=true:PreferNoSchedule"]
NODE2["VM 202: k8s-node2<br/>8c / 32GB"]
NODE3["VM 203: k8s-node3<br/>8c / 32GB"]
NODE4["VM 204: k8s-node4<br/>8c / 32GB"]
@ -72,7 +72,7 @@ graph TB
| VM | VMID | vCPUs | RAM | Network | Role | Taints |
|----|------|-------|-----|---------|------|--------|
| k8s-master | 200 | 8 | 32GB | vmbr1:vlan20 (10.0.20.100) | Control Plane | `node-role.kubernetes.io/control-plane:NoSchedule` |
| k8s-node1 | 201 | 16 | 32GB | vmbr1:vlan20 | GPU Worker | `nvidia.com/gpu=true:NoSchedule` |
| k8s-node1 | 201 | 16 | 32GB | vmbr1:vlan20 | GPU Worker | `nvidia.com/gpu=true:PreferNoSchedule` (applied dynamically to whichever node carries the GPU) |
| k8s-node2 | 202 | 8 | 32GB | vmbr1:vlan20 | Worker | None |
| k8s-node3 | 203 | 8 | 32GB | vmbr1:vlan20 | Worker | None |
| k8s-node4 | 204 | 8 | 32GB | vmbr1:vlan20 | Worker | None |
@ -85,9 +85,9 @@ graph TB
|-----------|-------|
| Device | NVIDIA Tesla T4 (16GB GDDR6) |
| PCIe Address | 0000:06:00.0 |
| Assigned VM | VMID 201 (k8s-node1) |
| Node Label | `gpu=true` |
| Node Taint | `nvidia.com/gpu=true:NoSchedule` |
| Assigned VM | VMID 201 (k8s-node1) — physical location only, no Terraform pin |
| Node Label | `nvidia.com/gpu.present=true` (auto-applied by gpu-feature-discovery; also `feature.node.kubernetes.io/pci-10de.present=true` from NFD) |
| Node Taint | `nvidia.com/gpu=true:PreferNoSchedule` (applied by `null_resource.gpu_node_config` to every NFD-tagged GPU node) |
| Driver | NVIDIA GPU Operator |
| Resource Name | `nvidia.com/gpu` |
@ -273,8 +273,8 @@ resources {
### GPU Resource Management
**Node Selection**: GPU pods must:
1. Tolerate `nvidia.com/gpu=true:NoSchedule` taint
2. Select `gpu=true` label
1. Tolerate `nvidia.com/gpu=true:PreferNoSchedule` taint
2. Select `nvidia.com/gpu.present=true` label (auto-applied by gpu-feature-discovery wherever the card is)
3. Request `nvidia.com/gpu: 1` resource
**Example**:
@ -286,7 +286,7 @@ spec:
value: "true"
effect: NoSchedule
nodeSelector:
gpu: "true"
nvidia.com/gpu.present: "true"
containers:
- name: app
resources:
@ -294,6 +294,14 @@ spec:
nvidia.com/gpu: 1
```
**Portability**: No Terraform code references a specific hostname for
GPU scheduling. If the GPU card is physically moved to a different
node, gpu-feature-discovery moves the `nvidia.com/gpu.present=true`
label with it, and `null_resource.gpu_node_config` re-applies the
`nvidia.com/gpu=true:PreferNoSchedule` taint to the new host on the
next apply (discovery keyed on
`feature.node.kubernetes.io/pci-10de.present=true`).
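A quick check of which node the scheduler currently treats as the GPU node:
```sh
# The node carrying the gpu-feature-discovery label, with its taints:
kubectl get nodes -l nvidia.com/gpu.present=true \
  -o custom-columns=NODE:.metadata.name,TAINTS:.spec.taints
# NFD's raw PCI-vendor label (10de = NVIDIA) that discovery is keyed on:
kubectl get nodes -l feature.node.kubernetes.io/pci-10de.present=true -o name
```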
**GPU Workloads**:
- Ollama (LLM inference)
- ComfyUI (Stable Diffusion workflows)
@ -529,7 +537,7 @@ kubectl describe pod <pod-name> -n <namespace>
```
0/5 nodes are available: 5 Insufficient nvidia.com/gpu.
```
**Fix**: Verify GPU node (201) is Ready and labeled `gpu=true`.
**Fix**: Verify the GPU-carrying node is Ready and has the `nvidia.com/gpu.present=true` label. Check `kubectl get nodes -l nvidia.com/gpu.present=true` — if empty, gpu-feature-discovery hasn't labeled any node (operator not running, driver not loaded, or PCI passthrough broken).
### Pods OOMKilled repeatedly
@ -614,7 +622,7 @@ spec:
value: "true"
effect: NoSchedule
nodeSelector:
gpu: "true"
nvidia.com/gpu.present: "true"
containers:
- name: app
resources:

View file

@ -127,9 +127,13 @@ Single shared cluster for all 17 consumers (Immich, Authentik, Nextcloud, Paperl
3 pods in StatefulSet `redis-v2`, each co-locating redis + sentinel + redis_exporter, using `docker.io/library/redis:8-alpine` (8.6.2). HAProxy (3 replicas, PDB minAvailable=2) routes clients to the current master via 1s `INFO replication` tcp-checks. Full context behind the April 2026 rework in beads `code-v2b`.
- 3 redis pods + 3 co-located sentinels (quorum=2). Odd sentinel count eliminates split-brain.
- **Pod anti-affinity is `required` (hard)** — each redis pod must land on a distinct node. Soft anti-affinity previously let the scheduler co-locate 2/3 pods on the same node; when that node (`k8s-node3`) went `NotReady→Ready` at 11:42 UTC on 2026-04-22 it took 2 redis pods with it and the cluster lost quorum. Cluster-wide PV `nodeAffinity` matches one zone (`topology.kubernetes.io/region=pve, zone=pve`), so PVCs rebind freely on reschedule.
- `podManagementPolicy=Parallel` + init container that regenerates `sentinel.conf` on every boot by probing peer sentinels for consensus master (priority: sentinel vote → peer role:master with slaves → deterministic pod-0 fallback). No persistent sentinel runtime state — can't drift out of sync with reality (root cause of 2026-04-19 PM incident). The probe order is sketched below, after this list.
- redis.conf has `include /shared/replica.conf`; the init container writes either an empty file (master) or `replicaof <master> 6379` (replicas), so pods come up already in the right role — no bootstrap race.
- **Sentinel hostname persistence**: `sentinel resolve-hostnames yes` + `sentinel announce-hostnames yes` in the init-generated sentinel.conf are mandatory — without them, sentinel stores resolved IPs in its rewritten config, and pod-IP churn on restart breaks failover. The MONITOR command itself must be issued with a hostname and the flags must be active before MONITOR, otherwise sentinel stores an IP that goes stale the next time the pod is deleted.
- **Failover timing (tuned 2026-04-22)**: `sentinel down-after-milliseconds=15000` + `sentinel failover-timeout=60000`. Redis liveness probe `timeout_seconds=10, failure_threshold=5`; sentinel liveness probe same. LUKS-encrypted LVM + BGSAVE fork can briefly stall master I/O >5s, which under the old 5s/30s sentinel timings + 3s/3 probes induced spurious `+sdown` → `+odown` → `+switch-master` cycles every 1-2 minutes. The new values absorb normal BGSAVE pauses without triggering failover.
- **HAProxy check smoothing (tuned 2026-04-22)**: `check inter 2s fall 3 rise 2` (was `1s / 2 / 2`) + `timeout check 5s` (was `3s`). The aggressive 1s polling used to race sentinel failovers — during a legitimate promote, HAProxy could catch the old master serving `role:slave` in the 1-3s window before re-probing the new master, leaving the backend empty and clients receiving `ReadOnlyError`.
- **Headless service `publish_not_ready_addresses=false`** (flipped 2026-04-22). Previously `true` meant HAProxy's DNS resolver saw not-yet-ready pods during rollouts, compounding the check-race above. Sentinel peer discovery is unaffected because sentinels announce to each other explicitly via `sentinel announce-hostnames yes`.
- Memory: master + replicas `requests=limits=768Mi`. Concurrent BGSAVE + AOF-rewrite fork can double RSS via COW, so headroom must cover it. `auto-aof-rewrite-percentage=200` + `auto-aof-rewrite-min-size=128mb` tune down rewrite frequency.
- Persistence: RDB (`save 900 1 / 300 100 / 60 10000`) + AOF `appendfsync=everysec`. Disk-wear analysis on 2026-04-19 (sdb Samsung 850 EVO 1TB, 150 TBW): Redis contributes <1 GB/day cluster-wide, giving a 40+ year runway at the 20% TBW budget.
- `maxmemory=640mb` (83% of 768Mi limit), `maxmemory-policy=allkeys-lru`.
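A stripped-down sketch of that probe order, assuming pod names `redis-v2-{0,1,2}`, a headless service `redis-v2` and master name `mymaster` (the real init script does more validation):
```sh
# Priority: sentinel vote -> peer reporting role:master -> deterministic pod-0 fallback.
MASTER=""
for peer in redis-v2-0 redis-v2-1 redis-v2-2; do
  host="$peer.redis-v2"
  # 1) ask a reachable sentinel who it thinks the master is (first output line = host)
  vote=$(redis-cli -h "$host" -p 26379 SENTINEL get-master-addr-by-name mymaster 2>/dev/null | head -1)
  if [ -n "$vote" ]; then MASTER="$vote"; break; fi
done
if [ -z "$MASTER" ]; then
  for peer in redis-v2-0 redis-v2-1 redis-v2-2; do
    host="$peer.redis-v2"
    # 2) fall back to a redis peer reporting role:master
    #    (the real script additionally requires connected slaves)
    if redis-cli -h "$host" INFO replication 2>/dev/null | grep -q '^role:master'; then
      MASTER="$host"; break
    fi
  done
fi
MASTER="${MASTER:-redis-v2-0.redis-v2}"          # 3) cold-start fallback: pod-0
echo "replicaof target: $MASTER"
```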
@ -138,7 +142,7 @@ Single shared cluster for all 17 consumers (Immich, Authentik, Nextcloud, Paperl
**Observability** (redis-v2 only): `oliver006/redis_exporter:v1.62.0` sidecar per pod on port 9121, auto-scraped via Prometheus pod annotation. Alerts: `RedisDown`, `RedisMemoryPressure`, `RedisEvictions`, `RedisReplicationLagHigh`, `RedisForkLatencyHigh`, `RedisAOFRewriteLong`, `RedisReplicasMissing`, `RedisBackupStale`, `RedisBackupNeverSucceeded`.
**Why this design** — three incidents in April 2026 drove the rework: (a) 2026-04-04 service selector routed reads+writes to master+replica causing `READONLY` errors; (b) 2026-04-19 AM master OOMKilled during BGSAVE+PSYNC with the 256Mi limit too tight for a 204 MB working set under COW amplification; (c) 2026-04-19 PM sentinel runtime state drifted (only 2 sentinels, no majority) and routed writes to a slave. See beads epic `code-v2b` for the full plan and linked challenger analyses.
**Why this design** — four incidents in April 2026 drove the rework: (a) 2026-04-04 service selector routed reads+writes to master+replica causing `READONLY` errors; (b) 2026-04-19 AM master OOMKilled during BGSAVE+PSYNC with the 256Mi limit too tight for a 204 MB working set under COW amplification; (c) 2026-04-19 PM sentinel runtime state drifted (only 2 sentinels, no majority) and routed writes to a slave; (d) 2026-04-22 five-factor flap cascade — soft anti-affinity let 2/3 pods co-locate on `k8s-node3`, node bounced NotReady→Ready and took quorum with it; aggressive sentinel/probe timing (5s/30s + 3s/3) amplified disk-I/O stalls under LUKS-encrypted LVM into spurious `+switch-master` loops; HAProxy's 1s polling raced sentinel failovers and routed writes to demoted masters; `publish_not_ready_addresses=true` fed not-yet-ready pods into HAProxy DNS; downstream `realestate-crawler-celery` CrashLoopBackOff closed the feedback loop. See beads epic `code-v2b` for the full plan and linked challenger analyses.
### SQLite (Per-App)

View file

@ -231,7 +231,7 @@ Push secrets (`BREVO_API_KEY`, `EMAIL_MONITOR_IMAP_PASSWORD`) come from External
|-------|-----------|----------|
| MailServerDown | No replicas for 5m | warning |
| EmailRoundtripFailing | Probe failing for 30m | warning |
| EmailRoundtripStale | No success in >40m | warning |
| EmailRoundtripStale | No success in >80m (60m threshold + for:20m) | warning |
| EmailRoundtripNeverRun | Metric absent for 40m | warning |
### Uptime Kuma Monitors

View file

@ -158,7 +158,7 @@ spec:
#### Email Monitoring Alerts
- **EmailRoundtripFailing**: E2E email probe returning failure for >30m
- **EmailRoundtripStale**: No successful email round-trip in >40m
- **EmailRoundtripStale**: No successful email round-trip in >80m (60m threshold + for:20m)
- **EmailRoundtripNeverRun**: Email probe has never reported (40m)
#### Registry Integrity Alerts

View file

@ -139,7 +139,7 @@ The Kubernetes cluster consists of 5 nodes:
- **k8s-node1 (201)**: 16c/32GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
- **k8s-node2-4 (202-204)**: 8c/32GB workers running general-purpose workloads
GPU passthrough on node1 uses PCIe device 0000:06:00.0, with Kubernetes taint `nvidia.com/gpu=true:NoSchedule` and label `gpu=true` to ensure only GPU-requesting pods schedule there.
GPU passthrough on node1 uses PCIe device 0000:06:00.0. The NVIDIA GPU Operator's gpu-feature-discovery auto-labels whichever node carries the card with `nvidia.com/gpu.present=true`; `null_resource.gpu_node_config` taints the same set of nodes with `nvidia.com/gpu=true:PreferNoSchedule`. No hostname is hardcoded — moving the card to a different node requires no Terraform edits.
### Service Organization

View file

@ -129,7 +129,9 @@ graph TB
5. **Passphrase management**: ExternalSecret syncs passphrase from Vault KV (`secret/viktor/proxmox_csi_encryption_passphrase`) → K8s Secret. Backup key at `/root/.luks-backup-key` on PVE host.
**Services on encrypted storage (2026-04-15 migration):**
vaultwarden, dbaas (mysql+pg+pgadmin), mailserver, nextcloud, forgejo, matrix, n8n, affine, health, hackmd, redis, headscale, frigate, meshcentral, technitium, actualbudget, grampsweb, owntracks, paperless-ngx, wealthfolio, monitoring (alertmanager)
vaultwarden, dbaas (mysql+pg+pgadmin), mailserver, nextcloud, forgejo, matrix, n8n, affine, health, hackmd, redis, headscale, frigate, meshcentral, technitium, actualbudget, grampsweb, owntracks, wealthfolio, monitoring (alertmanager)
**Services migrated later** (post-audit catch-up): paperless-ngx (2026-04-25 — sensitive document scans had been left on plain `proxmox-lvm` by an abandoned earlier attempt; an rsync swap cleaned up the orphan and the move was redone via Terraform). Vault raft cluster (2026-04-25 — all 3 voters migrated from `nfs-proxmox` to `proxmox-lvm-encrypted` after the 2026-04-22 raft-leader-deadlock post-mortem found NFS fsync semantics incompatible with the raft consensus log; rolled non-leader-first with force-finalize on the pvc-protection finalizer to avoid pods being recreated on the old PVCs).
**CSI node plugin memory**: Requires 1280Mi limit for LUKS2 Argon2id key derivation (~1GiB). Set via `node.plugin.resources` in Helm values (not `node.resources`).

View file

@ -0,0 +1,142 @@
# NFS-Hostile Workload Migration — Design
**Date**: 2026-04-25
**Author**: Viktor (with Claude)
**Status**: Phase 1 done, Phase 2 in progress
**Beads**: code-gy7h (Vault), code-ahr7 (Immich PG)
## Problem
The 2026-04-22 Vault Raft leader deadlock (post-mortem
`2026-04-22-vault-raft-leader-deadlock.md`) traced to NFS client
writeback stalls poisoning kernel state. Recovery took 2h43m and
required hard-resetting 3 of 4 cluster VMs. Two workload classes on
NFS are NFS-hostile per the criteria in
`infra/.claude/CLAUDE.md` ("Critical services MUST NOT use NFS"):
1. **Postgres with WAL fsync per commit** — Immich primary
2. **Vault Raft consensus log** — fsync per append-entry, 3 replicas
Everything else on NFS (47 PVCs, ~455 GiB) is correctly placed:
RWX media libraries, append-only backups, ML caches.
## Decision
Migrate exactly those two workload classes to
`proxmox-lvm-encrypted` (LUKS2 LVM-thin via Proxmox CSI). No iSCSI,
no RWX media migration, no backup-target migration.
## Rationale
- Block storage decouples PG / Raft fsync from NFS client kernel
state. Failure mode that triggered the post-mortem cannot recur for
these workloads.
- `proxmox-lvm-encrypted` is the documented default for sensitive data
(`infra/.claude/CLAUDE.md` storage decision rule). It already backs
~28 PVCs across the cluster — pattern is proven.
- Existing nightly `lvm-pvc-snapshot` PVE host script (03:00, 7-day
retention) auto-picks-up new PVCs via thin snapshots — no extra
backup wiring needed for the live data side.
- LUKS2 satisfies "encrypted at rest for sensitive data" requirement.
## Out of scope
- iSCSI evaluation (already retired 2026-04-13).
- RWX media (Immich library, music, ebooks) — correct placement.
- Backup target PVCs (`*-backup` on NFS) — append-only, NFS-tolerant.
- Prometheus 200 GiB — already on `proxmox-lvm`.
## Pattern per workload
### Immich PG (single replica, Deployment, Recreate strategy)
- Add new RWO PVC on `proxmox-lvm-encrypted`.
- Quiesce app pods (server + ML + frame).
- `pg_dumpall` from running NFS pod → local file.
- Swap deployment `claim_name` → encrypted PVC.
- PG bootstraps fresh on empty PVC; restore dump.
- REINDEX vector indexes (`clip_index`, `face_index`).
- Backup CronJob keeps writing to NFS module (correct: append-only). The whole sequence is sketched at command level below.
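A hedged, command-level sketch of this sequence, assuming namespace `immich`, a PG deployment named `immich-postgresql` and database `immich` (the claim_name swap itself is a Terraform change between the dump and the restore):
```sh
# 1. Quiesce the apps so nothing writes to PG during the dump.
kubectl -n immich scale deploy immich-server immich-machine-learning --replicas=0
# 2. Dump from the still-running PG pod on the NFS PVC.
kubectl -n immich exec deploy/immich-postgresql -- \
  pg_dumpall -U postgres > /tmp/immich-pre-migration.sql
# 3. Apply the Terraform change that points claim_name at the encrypted PVC.
# 4. Restore into the freshly initdb'd PG on the new PVC, then reindex.
kubectl -n immich exec -i deploy/immich-postgresql -- \
  psql -U postgres -f - < /tmp/immich-pre-migration.sql
kubectl -n immich exec deploy/immich-postgresql -- \
  psql -U postgres -d immich -c 'REINDEX INDEX clip_index; REINDEX INDEX face_index;'
# 5. Scale the apps back up.
kubectl -n immich scale deploy immich-server immich-machine-learning --replicas=1
```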
### Vault Raft (3 replicas, StatefulSet, helm-managed)
- Change `dataStorage.storageClass` and `auditStorage.storageClass`
from `nfs-proxmox` → `proxmox-lvm-encrypted`.
- StatefulSet `volumeClaimTemplates` is immutable → use
`kubectl delete sts vault --cascade=orphan` then re-apply (memory
pattern for VCT swaps).
- Per-pod rolling: delete pod + PVCs, controller recreates with new
template. Auto-unseal sidecar handles unseal; raft `retry_join`
rejoins cluster.
- 24h validation window between pods. Migrate non-leader pods first;
step-down current leader before migrating it last.
- Backup target (`vault-backup-host` on NFS) stays on NFS.
## Risks and rollbacks
### Immich PG
- pg_dumpall captures schema + data, not file-level state. Vector
index versions matter (vchord 0.3.0 unchanged; vector 0.8.0 →
0.8.1 is a minor automatic bump on `CREATE EXTENSION` — confirmed
benign). Rollback: revert `claim_name`, scale apps; old NFS PVC
retained for 7 days post-migration.
### Vault Raft
- Cluster keeps quorum from 2 standby replicas while one pod is
swapped. Migrating the leader last avoids quorum churn.
- Recovery anchor: pre-migration `vault operator raft snapshot save`
+ nightly `vault-raft-backup` CronJob. RTO < 1h via snapshot
restore.
## Helm `securityContext.pod` replace-not-merge (Vault, discovered during execution)
The Vault helm chart sets pod-level securityContext defaults
(`fsGroup=1000, runAsGroup=1000, runAsUser=100, runAsNonRoot=true`)
from chart templates, not from values.yaml. When `main.tf` provided
its own `server.statefulSet.securityContext.pod = {fsGroupChangePolicy
= "OnRootMismatch"}` the helm rendering REPLACED the chart defaults
rather than merging into them. On NFS this was harmless (`async,
insecure` exports made the volume world-writable enough for any UID),
but on a fresh ext4 LV via Proxmox CSI the volume root is `root:root`
and vault user (UID 100) cannot open `/vault/data/vault.db`.
vault-1 and vault-2 happened to be Running with the correct
securityContext because their pod specs were written into etcd
**before** the customization landed; helm chart upgrades don't
restart pods, so the broken values lay dormant until vault-0 was
recreated by the orphan-deleted STS during this migration.
Resolution: provide all five fields (`fsGroup`, `fsGroupChangePolicy`,
`runAsGroup`, `runAsUser`, `runAsNonRoot`) explicitly in main.tf so
`runAsGroup=1000` etc. survive future chart bumps. Idempotent on
both fresh PVCs and existing pods.
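A quick check that a recreated pod actually carries the merged values:
```sh
# All five fields should appear on every vault pod after the fix.
kubectl -n vault get pod vault-0 -o jsonpath='{.spec.securityContext}{"\n"}'
# expected (order may differ):
# {"fsGroup":1000,"fsGroupChangePolicy":"OnRootMismatch","runAsGroup":1000,"runAsNonRoot":true,"runAsUser":100}
```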
## Init container chicken-and-egg (Immich PG, discovered during execution)
The pre-existing `write-pg-override-conf` init container on the
Immich PG deployment writes `postgresql.override.conf` directly to
`PGDATA`. On a populated NFS PVC this was a no-op (init was already
run). On the fresh encrypted PVC, the file made `initdb` refuse the
non-empty directory and the pod CrashLoopBackOff'd.
Resolution: gate the init container on `PG_VERSION` presence — first
boot skips the override write, PG `initdb`s cleanly; force a pod
restart and the second boot writes the override and PG loads
`vchord` / `vectors` / `pg_prewarm` before the dump restore. Change
is permanent and idempotent (correct on both fresh and initialised
PVCs). One restart pre-migration only.
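A minimal sketch of the gate, with the PGDATA path and the preload-library spellings as assumptions (the real init container also templates the rest of the override):
```sh
# write-pg-override-conf init container, gated on an initialised PGDATA.
PGDATA="${PGDATA:-/var/lib/postgresql/data}"
if [ ! -f "$PGDATA/PG_VERSION" ]; then
  # First boot on a fresh PVC: leave PGDATA untouched so initdb runs cleanly.
  echo "fresh PVC, skipping override write"
  exit 0
fi
# Later boots: PG is initialised, safe to drop the override into PGDATA.
cat > "$PGDATA/postgresql.override.conf" <<'EOF'
shared_preload_libraries = 'vchord, vectors, pg_prewarm'
EOF
```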
## Verification
End-to-end DONE when:
- `kubectl get pvc -A | grep nfs-proxmox` returns only the
`vault-backup-host` PVC (or zero, if backup PVC moves elsewhere).
- `vault operator raft list-peers` shows 3 voters on
`proxmox-lvm-encrypted`, leader elected.
- Immich PG `\dx` matches pre-migration extensions (vector minor
drift OK).
- `lvm-pvc-snapshot` captures new LVs in next 03:00 run.
- 7 consecutive days of clean backup CronJob runs and no new alerts.

View file

@ -0,0 +1,169 @@
# NFS-Hostile Workload Migration — Plan
**Date**: 2026-04-25
**Design**: `2026-04-25-nfs-hostile-migration-design.md`
**Beads**: code-gy7h (Vault, epic), code-ahr7 (Immich PG)
## Phase 1 — Immich PG (DONE 2026-04-25)
| Step | Done |
|---|---|
| Snapshot extensions + row counts to `/tmp/immich-pre-migration-*` | ✓ |
| Quiesce `immich-server` + `immich-machine-learning` + `immich-frame` | ✓ |
| `pg_dumpall` → `/tmp/immich-pre-migration-<ts>.sql` (1.9 GB) | ✓ |
| Add `kubernetes_persistent_volume_claim.immich_postgresql_encrypted` (10Gi, autoresize 20Gi cap) | ✓ |
| Swap `claim_name` at `infra/stacks/immich/main.tf` deployment | ✓ |
| Patch init container to gate on `PG_VERSION` (chicken-and-egg fix) | ✓ |
| Force pod restart so override.conf gets written | ✓ |
| Restore dump | ✓ |
| `REINDEX clip_index`, `REINDEX face_index` | ✓ |
| Scale apps back up | ✓ |
| Verify: `\dx`, row counts (~111k assets), HTTP 200 internal/external | ✓ |
| LV present on PVE host (`vm-9999-pvc-...`) | ✓ |
### Phase 1 follow-ups (not blocking)
- Old NFS PVC `immich-postgresql-data-host` retained 7 days for
rollback. After 2026-05-02: remove `module.nfs_postgresql_host`
from `infra/stacks/immich/main.tf` and the CronJob's reference.
- Backup CronJob (`postgresql-backup`) still writes to the NFS
module. After cleanup, point it at a dedicated backup PVC or to
the existing `immich-backups` NFS share.
## Phase 2 — Vault Raft (DONE 2026-04-25)
**Phase 2 complete 2026-04-25; all 3 voters on `proxmox-lvm-encrypted`.**
### Pre-flight (T-0) — DONE 2026-04-25 15:50 UTC
- [x] Verify all 3 vault pods sealed=false, raft healthy.
- [x] Take fresh `vault operator raft snapshot save` (anchor saved at
`/tmp/vault-pre-migration-20260425-155029.snap`, 1.5 MB).
- [ ] Optional: scale ESO to 0 — skipped (auto-unseal sidecar is
independent; ESO refresh churn is non-disruptive for one swap).
- [x] Confirmed leader is **vault-2** → migrate vault-0 first
(non-leader), vault-1 next, vault-2 last (with step-down).
Plan originally assumed vault-0 was leader; same intent
(non-leader first).
- [x] Thin pool headroom: 54.63% used, plenty for 6 × 2 GiB LVs.
### Step 0 — Helm values + StatefulSet swap — DONE 2026-04-25 16:08 UTC
- [x] Edit `infra/stacks/vault/main.tf`: change
`dataStorage.storageClass` and `auditStorage.storageClass`
from `nfs-proxmox` → `proxmox-lvm-encrypted`.
- [x] `kubectl -n vault delete sts vault --cascade=orphan` (StatefulSet
`volumeClaimTemplates` is immutable; orphan keeps pods+PVCs
alive while we recreate the controller with the new template).
- [x] `tg apply -target=helm_release.vault` → recreates STS with new
VCT (full-stack `tg plan` blocks on unrelated for_each-with-
apply-time-keys errors at lines 848/865/909/917; targeted
apply on the helm release alone is the right scope here).
Existing pods still on old NFS PVCs.
### Step 1 — Roll vault-0 first (non-leader) — DONE 2026-04-25 16:18 UTC
- [x] `kubectl -n vault delete pod vault-0 --grace-period=30`
- [x] `kubectl -n vault delete pvc data-vault-0 audit-vault-0`
- [x] STS controller recreated pod; new PVCs auto-provisioned on
`proxmox-lvm-encrypted` (LVs `vm-9999-pvc-fb732fd7-...` data
4.12%, `vm-9999-pvc-36451f42-...` audit 3.99%).
- [x] **Hit and fixed**: vault-0 CrashLoopBackOff'd with
`permission denied` on `/vault/data/vault.db`. The helm chart's
`statefulSet.securityContext.pod` block in main.tf only set
`fsGroupChangePolicy`, replacing (not merging) the chart's
defaults `fsGroup=1000, runAsGroup=1000, runAsUser=100,
runAsNonRoot=true`. NFS exports made the missing fsGroup a
no-op; ext4 LV needs it to chown the volume root for the
vault user. Old vault-1/vault-2 pods were created before that
block was added so they still had the chart-default
securityContext from their original spec. Fix: provide all
five fields explicitly in main.tf and re-apply. Same root
cause will affect vault-1 and vault-2 swaps unless this stays
in place.
- [x] Wait Ready; auto-unseal sidecar unsealed; `retry_join` rejoined
raft cluster.
- [x] Verify: `vault operator raft list-peers` shows 3 voters,
vault-0 follower, leader=vault-2. External HTTPS 200.
### Step 2 — 24h soak (SKIPPED per user direction 2026-04-25)
User instructed "continue with all the remaining actions" — soak
gates compressed to per-pod settle windows + raft-state verification
between rollings. No Raft alarms, no Vault errors observed at each
verification gate.
### Step 3 — Roll vault-1 — DONE 2026-04-25
- [x] Force-finalize PVCs to break re-mount race:
`kubectl -n vault patch pvc data-vault-1 audit-vault-1 -p '{"metadata":{"finalizers":null}}' --type=merge`.
(Initial pod-then-PVC delete recreated pod on the OLD NFS PVCs
because pvc-protection finalizer hadn't cleared. Lesson learned
and applied to vault-2 below.)
- [x] Pod recreated on encrypted PVCs; auto-unsealed; rejoined raft.
### Step 4 — Settle window — DONE 2026-04-25
3-check verification over 90s; raft index advancing (2730010→2730012),
all 3 voters healthy.
### Step 5 — Roll vault-2 (leader) — DONE 2026-04-25
- [x] `vault operator step-down` on vault-2; vault-0 took leadership.
Confirmed vault-0 active, vault-1+vault-2 standby before delete.
- [x] Snapshot anchor at `/tmp/vault-pre-vault2.snap` (1.5 MB) from new
leader vault-0.
- [x] Force-finalize + delete PVCs + delete pod (lesson from vault-1).
- [x] Pod recreated on encrypted PVCs; auto-unsealed; rejoined raft.
- [x] `vault operator raft list-peers` shows 3 voters all healthy on
encrypted storage; leader vault-0.
### Step 6 — Cleanup — DONE 2026-04-25
- [x] `kubectl get pvc -A` cross-cluster shows zero PVCs on
`nfs-proxmox` SC (only Released PVs remain → Phase 3).
- [x] Removed inline `kubernetes_storage_class.nfs_proxmox` from
`infra/stacks/vault/main.tf` (was lines 2942).
- [x] All 3 PVC pairs on `proxmox-lvm-encrypted`.
- [x] `vault operator raft autopilot state` healthy=true.
- [x] External `https://vault.viktorbarzin.me/v1/sys/health` = 200.
## Phase 3 — Released-PV cleanup (FOLLOW-UP)
### Step 3.1 — vault Released PVs — DONE 2026-04-25
6 vault NFS PVs (Released, `nfs-proxmox` SC, Retain policy) deleted
along with their NFS subdirectories on PVE host (~1.5 GB reclaimed):
| PV | Claim | Size on disk |
|---|---|---|
| pvc-004a5d3b-… | data-vault-2 | 45M |
| pvc-808a78ec-… | audit-vault-1 | 1.4M |
| pvc-918ee7c1-… | audit-vault-0 | 3.2M |
| pvc-9d2ddcb4-… | data-vault-0 | 46M |
| pvc-a659711d-… | data-vault-1 | 46M |
| pvc-d2e65109-… | audit-vault-2 | 1.4G |
Procedure: `kubectl delete pv <name>` (cluster object only — Retain
policy means CSI never touches NFS) then `rm -rf /srv/nfs/<dir>` on
192.168.1.127.
### Step 3.2 — Cluster-wide Released PV sweep (DEFERRED)
~50 other Released PVs persist across the cluster (~200 GiB on
`proxmox-lvm` and `proxmox-lvm-encrypted`). Out of scope for the
2026-04-25 NFS-hostile session per user direction. To reclaim:
1. List Released PVs, confirm LV exists on PVE (query sketch below this list).
2. `kubectl delete pv <name>` (with the `Retain` reclaim policy this removes
only the cluster object; the CSI driver does not delete the underlying LV).
3. If LV survives: manual `lvremove pve/vm-9999-pvc-<uuid>`.
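A hedged sketch of step 1 (standard PV fields via jq; the LV cross-check uses the naming pattern above):
```sh
# Released PVs with their StorageClass and CSI volume handle.
kubectl get pv -o json | jq -r '
  .items[]
  | select(.status.phase == "Released")
  | [.metadata.name, .spec.storageClassName, (.spec.csi.volumeHandle // "-")]
  | @tsv'
# Cross-check one LV on the PVE host before deleting the PV object:
ssh root@192.168.1.127 "lvs pve -o lv_name | grep vm-9999-pvc-<uuid>"
```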
## Rollback
| Phase | Trigger | Action |
|---|---|---|
| 1 | Immich UI broken / data loss | Revert `claim_name`; restore from `/tmp/immich-pre-migration-*.sql` to old NFS PVC |
| 2 (mid-rolling) | Single pod broken | Delete the encrypted PVC; recreate with NFS SC explicitly; cluster keeps quorum from 2 healthy pods |
| 2 (post-rolling, raft corrupt) | Cluster-wide failure | `vault operator raft snapshot restore <pre-migration.snap>` |
| Catastrophic | All Vault data lost | Restore from latest `/srv/nfs/vault-backup/` snapshot via CronJob output |

View file

@ -190,3 +190,57 @@ unaddressed.
- **Runbook**: `docs/runbooks/registry-rebuild-image.md` (new).
- **Hot-fix commits**: `a05d63ee`, `6371e75e`, `c113be4d`.
- **Upstream bug class**: `distribution/distribution#3324`.
## 2026-04-19 — Bulk cleanup sweep (beads code-8hk + code-jh3c)
Same failure class, broader scope. The `registry-integrity-probe`
surfaced 38 broken manifest references persisting after the 04-19
infra-ci fix. `beads-dispatcher` + `beads-reaper` CronJobs were stuck
`ImagePullBackOff` on `claude-agent-service:0c24c9b6` for >6h. All 34
affected `repo:tag` pairs were OCI indexes whose `linux/amd64` child
manifests were absent from blob storage (same orphan pattern).
**Action taken**:
1. Bumped `beads-server/main.tf` var default `claude_agent_service_image_tag`
from `0c24c9b6` → `2fd7670d` (the canonical tag in
`claude-agent-service/main.tf`), reused — same image already healthy
on the registry. `scripts/tg apply` on `beads-server`. Deleted the
stuck Jobs so new CronJob ticks could fire.
2. Enumerated 34 broken `(repo, tag, parent_digest)` triples via HTTP
probe using `registry-probe-credentials` K8s Secret. Deleted each
via `DELETE /v2/<repo>/manifests/<digest>` (33× 202, 1× 404 —
claude-agent-service:latest pointed at an already-deleted digest). A curl-level sketch of the resolve-and-delete step follows the list.
3. Ran `docker exec registry-private /bin/registry garbage-collect
/etc/docker/registry/config.yml` — reclaimed ~3GB of orphan blob
storage.
4. Rebuilt the 3 in-use broken tags (all 3 OCI-index parents pointed
at missing children, so no cached copies would survive pod
reschedule):
- `freedify:latest` / `freedify:c803de02` — built on registry VM
directly (no CI pipeline exists for this image; python FastAPI).
- `beadboard:17a38e43` / `beadboard:latest` — GHA
`workflow_dispatch` failed at registry login (missing
`REGISTRY_USERNAME`/`REGISTRY_PASSWORD` GH secrets). Built on
registry VM directly as the fallback. GitHub secret gap is a
follow-up — beads `code-8hk` notes it.
- `priority-pass-backend:ae1420a0` / `priority-pass-frontend:ae1420a0`
— Woodpecker pipeline #8 on repo 81. Pipeline `kubectl set image`'d
the Deployment to `ae1420a0` (drift vs TF `v5`/`v8` defaults, but
that drift is pre-existing, not introduced by this cleanup).
- `wealthfolio-sync:latest` — **not rebuilt**. Monthly CronJob (next
run 2026-05-01), no source tree or CI pipeline available in the
monorepo; deferred for separate follow-up.
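A hedged sketch of that resolve-and-delete step, with the registry endpoint, credentials and repo/tag as placeholders:
```sh
REG=https://registry.example.local            # hypothetical registry endpoint
AUTH="probe:changeme"                         # values from the registry-probe-credentials Secret
REPO=claude-agent-service
TAG=0c24c9b6
# Resolve the tag to its OCI-index digest via the Docker-Content-Digest header.
DIGEST=$(curl -sI -u "$AUTH" \
  -H 'Accept: application/vnd.oci.image.index.v1+json' \
  -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
  "$REG/v2/$REPO/manifests/$TAG" \
  | awk 'tolower($1)=="docker-content-digest:"{print $2}' | tr -d '\r')
# Deleting by digest untags every tag pointing at it; garbage-collect then reclaims blobs.
curl -s -o /dev/null -w '%{http_code}\n' -u "$AUTH" \
  -X DELETE "$REG/v2/$REPO/manifests/$DIGEST"
# expect 202 (or 404 if the digest is already gone)
```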
**Post-cleanup state**:
- Probe: 39 tags, 0 failures. `registry_manifest_integrity_failures{} = 0`.
- Alert `RegistryManifestIntegrityFailure` cleared (was firing for
5h 32m).
- No `ImagePullBackOff` pods anywhere in the cluster.
- 28 of 34 deleted manifests were **dangling tags not referenced by any
workload** — old `382d6b1*`, `v2`-`v7`, `yt-fallback`, etc. Safe
deletes, no rebuilds needed.
**Permanent fix still in flight**: Phase 2/3 of this post-mortem
(post-push verification in CI, atomic `cleanup-tags.sh`) — not
addressed by this cleanup. The probe continues to be the
authoritative detector.

View file

@ -0,0 +1,155 @@
# Post-Mortem: Vault Raft Leader Deadlock + NFS Kernel Client Corruption Cascade
> **Resolution status (2026-04-25):** Resolved structurally by code-gy7h
> migration. All 3 vault voters now on `proxmox-lvm-encrypted` block
> storage; the NFS fsync incompatibility that triggered the original
> raft hang is no longer reachable. See
> `docs/plans/2026-04-25-nfs-hostile-migration-plan.md` Phase 2.
| Field | Value |
|-------|-------|
| **Date** | 2026-04-22 |
| **Duration** | External endpoint 503 from ~09:00 UTC to ~11:43 UTC (~2h 43m). vault-2 became active leader 11:43:28 UTC. |
| **Severity** | SEV1 (Vault — single source of secrets for 40+ services) |
| **Affected Services** | All ESO-backed services (password rotation paused). CronJobs that read plan-time secrets (14 stacks). Woodpecker CI (blocked pipeline `d39770b3`). Everything with `ExternalSecret` refresh interval ≤ 2h. |
| **Status** | Vault HA operational with vault-0 + vault-2 quorum. vault-1 still stuck ContainerCreating on node2 (third node2 reboot pending; workload can accept 2/3 quorum). Terraform fix committed as `2f1f9107`; apply pending. |
## Summary
A Vault raft leader (`vault-2`) entered a stuck goroutine state where its cluster port (8201) accepted TCP but never completed msgpack RPC. Standbys could not detect leader death because the TCP layer looked healthy, so no re-election fired. The only recovery was to kill the leader. During recovery, abrupt `kubectl delete --force` of the stuck Vault pods left kernel-side NFS client state on k8s-node1/node3/node4 in a corrupted state — **all new NFS mounts from those nodes timed out at 110s**, while existing mounts kept working. This created a cascade: the stuck leader blocked quorum, killing the leader broke NFS on the destination node for the recreated pod, force-killing the stuck pods left zombie `containerd-shim` processes kubelet couldn't clean up, and the resulting volume-manager loops pegged kubelet into 2-minute timeouts. Recovery required a VM hard-reset for node2 and node3 (kubelet was zombie on both). vault-0 remains down pending node4 reboot.
## Impact
- **User-facing**: `vault.viktorbarzin.me` returned HTTP 503 for ~2h. Any service that needed a Vault token during that window was degraded; Woodpecker CI pipeline blocked.
- **Blast radius**: 3/3 Vault pods affected (raft deadlock blocked re-election even with standbys up). Three k8s nodes degraded simultaneously with kernel NFS client stuck state (node1, node3, node4). Two nodes required VM hard-reset to recover kubelet (node2, node3).
- **Duration**: Degraded ~2h; resolution required sequential hard reboots.
- **Data loss**: None. Raft data integrity preserved on NFS. vault-1 came up with index 2475732, caught up to 2476009+ once leader was elected.
- **Observability gap**: No alert fired for the stuck raft leader. Standbys report `HA Mode: standby, Active Node Address: <leader IP>` as if healthy even when leader is hung.
## Timeline (UTC)
| Time | Event |
|------|-------|
| **~09:00** | `vault-2` (original raft leader) enters hung state — port 8201 open but msgpack RPCs hang. Its own logs go silent. Standbys continue heartbeat/appendEntries with `msgpack decode error [pos 0]: i/o timeout`. Neither standby triggers re-election because raft transport does not distinguish "TCP open + silent" from "TCP open + healthy". |
| **~09:15** | External endpoint starts serving 503. Woodpecker CI pipeline `d39770b3` blocks waiting for Vault. |
| **09:59** | Operator force-deletes `vault-2` pod — replacement comes up on node3 and enters candidate loop (term=32), cannot get quorum because DNS for `vault-0` is NXDOMAIN (ContainerCreating) and vault-1 does not respond (its raft goroutine also hung). |
| **10:07** | Operator force-deletes `vault-1` — new `vault-1` gets scheduled to node2. Its raft would be fine, but kubelet on node2 hangs in the pod cleanup path for the old pod's NFS mount. Concurrently, a new `vault-0` pod is attempted on node4, but **NFS mount from node4 times out at 110s** — the host kernel NFS client is in a degraded state that blocks all new mounts (including to completely different NFS paths like `/srv/nfs/ytdlp`). |
| **10:09** | Diagnostic test: from node1 and node4 CSI pods, `mount -t nfs -o nfsvers=4 192.168.1.127:/srv/nfs/ytdlp /tmp/test` times out. From node2 and node3 the same mount succeeds. NFS server is healthy (`showmount -e` works; `rpcinfo` shows all programs registered). The common factor on the broken nodes: they had a force-terminated Vault pod earlier in the session, leaving stuck `mount.nfs` processes in D-state. |
| **10:18** | Manual unmount of stale NFS mount from the force-deleted old vault-0 pod on node4. New mount attempts from CSI still time out — clearing the old mount did not recover kernel NFS client state. |
| **10:22** | Workaround discovered: mounting with `nfsvers=4.0` or `nfsvers=4.1` (instead of default `nfsvers=4` which negotiates to 4.2) succeeds on broken nodes. Confirms the stuck state is version-specific (NFSv4.2 session state), not a general NFS issue. Decision: rather than change CSI mount options cluster-wide (risk of remounting existing 48+ PVs), fix the nodes directly. |
| **10:31** | Investigated node2 kubelet state: old `vault-1` container shows `vault` process in **Z (zombie)** state with its `sh` wrapper stuck in `do_wait` in kernel (`zap_pid_ns_processes`). Containerd-shim PID killed manually — `sh` and zombie reparented to init but remained stuck (uninterruptible kernel wait tied to NFS). |
| **10:34** | Attempted `systemctl restart kubelet` on node2 — kubelet itself went into Z (zombie) with 2 tasks still attached. Classic NFS-related kernel deadlock. |
| **10:42** | **Decision: hard-reset node2 VM** (`qm reset 202`). Disruption: 22 pods evicted. |
| **10:43** | node2 back up (Ready). CSI registered. New `vault-1` scheduled to node2. NFS mount succeeded (fresh kernel state). Kubelet began chowning volume — **extremely slow, ~3 files per minute over NFS**. |
| **10:48** | `vault-1` (2/2 Running) unsealed. **Raft leader elected: `vault-2` wins term 32, election tally=2** (vault-1 voted yes once it came up, vault-0 unreachable). However vault-2's vault-layer (HA active/standby) never transitioned to active — raft leader with `active_time: 0001-01-01T00:00:00Z` and `/sys/ha-status` returning 500. |
| **10:50** | Restarted `vault-2` pod to force clean leader transition. New `vault-2` stuck in chown loop on node3 (same pattern as node2 earlier). |
| **10:54** | Patched the Vault `StatefulSet` with `fsGroupChangePolicy: OnRootMismatch` so subsequent recreations skip the recursive chown. |
| **10:57** | Force-deleted `vault-2` and `06fa940b` pod directory on node3. New pod spawned but kubelet again stuck on phantom state from the old pod. |
| **11:01** | **Hard-reset node3 VM** (`qm reset 203`). |
| **11:03** | First 200 response: vault-1 elected leader, vault-2 standby. Premature celebration — vault-1's audit log on node2 NFS starts timing out; `/sys/ha-status` returns 500 even though raft thinks vault-1 is active. |
| **~11:18** | Service regresses. `vault-1` audit writes hanging (`event not processed by enough 'sink' nodes, context deadline exceeded`). Readiness probe fails; pod goes 1/2; `vault-active` endpoint stays pointed at vault-1's IP but backend unresponsive → 503. |
| **11:22** | Force-restart `vault-1` to trigger re-election with new pod. Delete + containerd-shim cleanup leaves yet another zombie on node2. Same pattern: force-delete → zombie. |
| **11:29** | **Hard-reset node4 VM** (`qm reset 204`). Rationale: vault-0 was still blocked there; 74 pods on node4 contribute to NFS server load (load avg 16 on PVE). After reboot, vault-0 mounts its PVCs on fresh kernel state and comes up 2/2 Running 11:31. |
| **11:31** | Increased PVE NFS threads from 16 to 64 (`echo 64 > /proc/fs/nfsd/threads`). Did not help immediate mount failures — the stuck state is per-client kernel, not server capacity. |
| **11:38** | Discover DNS resolution issue: vault-2's Go resolver returns NXDOMAIN for short names `vault-0.vault-internal` even though glibc resolver works. CoreDNS restart issued earlier didn't fix. Restart vault-2 pod to force fresh resolver state. |
| **11:42** | **Second hard-reset of node3 VM** (`qm reset 203`). Kubelet+CSI re-register; vault-2 scheduled, NFS mounts finally succeed on fresh kernel state. |
| **11:43:28** | **vault-2 becomes active leader.** External endpoint returns 200 and stays there. vault-0 follower, catches up to index 2477632+. vault-1 still stuck on node2; left for later recovery. |
## Root Cause Chain
```
[1] Vault-2 raft goroutine hang (root cause — upstream Vault bug or infra-induced)
└─> Cluster port 8201 accepts TCP but never responds to msgpack RPCs
└─> Standbys' appendEntries calls return `msgpack decode error [pos 0]: i/o timeout`
└─> Raft protocol: no re-election because leader is heartbeating at the TCP level
└─> External endpoint returns 503 because HA layer has no active leader
[2] Recovery complication — abrupt pod termination
└─> `kubectl delete --force --grace-period=0` on vault-0/1/2
└─> containerd-shim fails to kill container cleanly (NFS I/O in D-state)
└─> vault process ends as zombie; sh wrapper stuck in do_wait
└─> Kubelet retries forever, cannot tear down old pod volumes
└─> NFS-CSI unmount requests succeed at the NFS layer but kubelet's
volume state-machine never marks the volume as unmounted
(stale 0000-mode mount directory blocks teardown completion)
[3] Kernel NFS client corruption on node1/node4
└─> Force-terminated Vault pod left stuck `mount.nfs` processes in D-state
└─> Kernel NFS4.2 client session state corrupted (held open mount slot)
└─> All subsequent mount syscalls for nfsvers=4 block 110s+ waiting for
session slot that will never be freed
└─> Manual workaround: nfsvers=4.1 bypasses the corrupted session state
[4] Kubelet starvation
└─> Combination of (2) and (3) means kubelet is stuck in a 2-minute volume-setup
context deadline loop — each iteration times out, new iteration restarts,
infinite loop
└─> Hard VM reset is the only exit
└─> After reset, kubelet starts clean, CSI re-registers, mounts succeed
[5] Slow recursive chown amplifies impact
└─> Default fsGroupChangePolicy: Always (Vault Helm chart 0.29.1 default)
└─> Kubelet walks every file on NFS setting gid=1000
└─> Over a 1GB audit log and a 47MB raft.db on NFS with timeo=30,retrans=3,
each chown syscall takes seconds; kubelet 2-minute deadline runs out
before the walk finishes
└─> Loop never exits even when ownership is already correct
```
## Why This Failed
1. **Raft transport does not detect stuck leaders.** If TCP is open and the process is alive enough to hold the port, standbys assume the leader is healthy. A stuck goroutine that never responds to RPCs appears to raft as "leader with high RTT" and does not trigger re-election. This is an upstream Vault bug (or at least a missing liveness check).
2. **Abrupt pod termination + NFS = kernel-level zombie.** When a Vault pod holding an NFS mount is force-killed before it cleanly closes file handles, the kernel's NFS4.2 client session state enters a corrupted state. This blocks all new mounts from that node — not just to the same NFS path, but to ANY NFS path on the same server. The fix is a kernel reboot; there is no userspace recovery.
3. **Vault data on NFS violates the documented rule.** `infra/.claude/CLAUDE.md` explicitly states: *"Critical services MUST NOT use NFS storage — circular dependency risk."* Vault currently uses `nfs-proxmox` for both `dataStorage` and `auditStorage`. If Vault had been on `proxmox-lvm-encrypted`, none of the NFS corruption cascade would have happened.
4. **fsGroupChangePolicy: Always is the Helm default.** Every pod restart walks every file over NFS. On a 1GB audit log with degraded NFS RTT, this takes longer than kubelet's internal 2-minute deadline, causing infinite restart loops. `OnRootMismatch` makes chown a no-op when the root is already correct (which it always is after first setup).
5. **No alert for this failure mode.** Prometheus alerts exist for `VaultSealed`, `VaultDown` (`up` metric), and backup staleness, but none for "raft leader has been running without advancing commit index" or "standby reports leader but leader's `/sys/ha-status` returns 500".
## Remediation (Applied)
- [x] Hard-reset node2 and node3 VMs to clear kernel NFS state and kubelet zombies.
- [x] Manually patched live `StatefulSet vault/vault` with `fsGroupChangePolicy: OnRootMismatch` to stop the chown loop.
- [x] Lazy-unmounted stale NFS mounts from force-deleted pod directories on node2 and node3.
- [x] Removed stale kubelet pod directories (`/var/lib/kubelet/pods/<UID>`) that had 0000-mode mount subdirectories blocking teardown.
- [x] Updated `stacks/vault/main.tf` with the `fsGroupChangePolicy` setting so the next `scripts/tg apply vault` makes it durable.
## Remediation (Pending)
- [ ] **Hard-reset node4** to recover vault-0 (same NFS kernel corruption pattern).
- [ ] **Run `scripts/tg apply` on the vault stack** to persist the fsGroupChangePolicy change.
- [ ] **Add Prometheus alert `VaultRaftLeaderStuck`** — fire when `vault_raft_last_index_gauge` (or derivation from `vault_runtime_total_gc_runs`) stops advancing for >2 minutes while `vault_core_active` is 1.
- [ ] **Add Prometheus alert `VaultHAStatusUnavailable`** — fire when `vault_core_active{}` reports 0 across all pods but `up{job="vault"}` reports 1 (HA layer broken but pods alive).
- [ ] **Migrate Vault to `proxmox-lvm-encrypted` block storage** — eliminates the entire NFS failure class. This follows the rule already documented in `infra/.claude/CLAUDE.md`. Tracked as beads task (open after Dolt is back up; currently down on node4).
- [ ] **Consider raising kubelet volume-manager deadline** for large-volume chown scenarios, or document the `fsGroupChangePolicy: OnRootMismatch` requirement for all NFS-backed StatefulSets.
- [ ] **Runbook**: `docs/runbooks/vault-raft-leader-deadlock.md` — how to detect stuck leader, safe force-restart procedure that avoids zombie pods, NFS kernel state recovery.
## Contributing Factors
1. **NFS mount options use bare `nfsvers=4`**. This negotiates to the highest version the server supports (NFSv4.2). When 4.2 session state corrupts, mounts fail; 4.1 works. Pinning to `nfsvers=4.1` in the `nfs-proxmox` StorageClass would make the failure mode recoverable without node reboot, but would also require recreating 48+ existing PVs (volumeAttributes are immutable). Deferred.
2. **`kubectl delete --force` is the default for stuck pods**. Operators reach for force-delete when a pod won't terminate, but this leaves containerd in an inconsistent state when the underlying storage is hung. Better approach: identify the stuck process (typically `mount.nfs` or a kernel NFS callback) and fix the root cause before force-deleting.
3. **Beads / Dolt server was on node4**, so beads task tracking went offline during this incident and couldn't be used to log progress cross-session.
4. **node1 was cordoned mid-incident** to prevent rescheduling to a node with confirmed NFS issues, but this reduced the scheduling surface for anti-affinity-sensitive StatefulSets.
## Learnings
1. **NFS for stateful critical services is structurally unsafe.** When NFS breaks, the recovery involves killing pods → which can break NFS further → until a reboot. The rule exists for a reason; Vault should never have been on NFS.
2. **Raft liveness needs application-layer probing, not TCP.** Every time we've seen a "stuck leader" issue in the homelab, TCP was fine and the app was unresponsive. A lightweight RPC probe with a short timeout and Prometheus alert would catch this in minutes instead of hours.
3. **kubelet volume-manager is fragile against stuck NFS.** Once kubelet enters a chown loop with a context deadline shorter than the chown duration, it cannot make progress — even when the filesystem is otherwise healthy. `OnRootMismatch` is effectively mandatory for any pod with `fsGroup` and a volume >100MB.
4. **VM hard-reset is cheap but disruptive.** The two reboots took ~60 seconds each but evicted 22+44 = 66 pods. Doing this twice in one session is a lot of churn. A post-mortem-driven improvement: pre-prepare "hot-standby" capacity so we can cordon+drain instead of hard-reset when kubelet zombies appear.
5. **Documentation of this rule is worth more than the rule itself.** The CLAUDE.md already says "critical services must not use NFS". The vault stack violates it. The rule without enforcement (validation, linting, CI) is ignored during the rush to ship.
## References
- Related: `docs/post-mortems/2026-04-14-nfs-fsid0-dns-vault-outage.md` — previous Vault+NFS incident (different root cause, similar blast pattern).
- Vault helm chart 0.29.1 default `fsGroupChangePolicy` is unset (behaves as `Always`).
- Upstream Vault HA layer: raft leader → vault-active transition is in `vault/external_tests/raft`. Stuck goroutine pattern not documented as a known issue.

View file

@ -0,0 +1,217 @@
# Runbook: Vault Raft Leader Deadlock + Safe Pod Restart
Captures the 2026-04-22 incident pattern. When a Vault raft leader enters a
stuck goroutine state (port 8201 accepts TCP but RPCs never return), the
recovery is *not* `kubectl delete --force`. Force-deleting a Vault pod that
holds a stuck NFS mount leaves kernel NFS client state corrupted, which
blocks all subsequent NFS mounts from the node and usually requires a VM
hard-reset to clear.
**Related**: [post-mortems/2026-04-22-vault-raft-leader-deadlock.md](../post-mortems/2026-04-22-vault-raft-leader-deadlock.md).
## Symptoms
- `https://vault.viktorbarzin.me/v1/sys/health` returns HTTP 503.
- Standbys log `msgpack decode error [pos 0]: i/o timeout` every 2s.
- `kubectl exec` into a standby shows raft thinks the leader is alive
(peers list all `Voter`, leader address populated) but `vault operator
raft autopilot state` stalls or errors.
- The "leader" pod's logs go silent — no heartbeats, no audit writes,
nothing. TCP on 8201 still accepts connections.
- ESO-backed secrets stop refreshing (ExternalSecret `SecretSyncedError`).
- Woodpecker CI pipelines that read from Vault at plan time hang.
## 0. Confirm the diagnosis (before touching anything)
Don't jump to force-delete. Verify the leader is actually stuck, not just
slow:
```sh
# 1. Who does raft think the leader is?
kubectl exec -n vault vault-0 -c vault -- vault status 2>&1 | \
grep -E 'HA Mode|Active Node|Leader|Raft'
# 2. Is the leader's port open but unresponsive?
LEADER_POD=vault-2 # or whichever vault status reports
kubectl exec -n vault $LEADER_POD -c vault -- sh -c \
'timeout 3 nc -zv 127.0.0.1 8200 2>&1; echo; timeout 3 vault status'
# 3. Is the active vault service pointing at a real pod?
kubectl get endpoints -n vault vault-active -o yaml | \
grep -E 'addresses|notReadyAddresses' -A2
# 4. What do standby logs say?
kubectl logs -n vault vault-0 -c vault --tail=40 | grep -iE 'msgpack|decode|rpc'
```
If (2) hangs and (4) shows repeated msgpack errors → stuck leader.
## 1. Identify the stuck pod precisely
```sh
# Find the pod whose vault_core_active would be 1 if it were scraping
# (currently no telemetry — use logs as proxy until telemetry is enabled).
for p in vault-0 vault-1 vault-2; do
  echo "=== $p ==="
  kubectl logs -n vault $p -c vault --tail=5 2>&1 | head -5
done
# Eyeball the output: the stuck leader's tail is stale or empty while the
# other pods keep logging msgpack/rpc errors.
```
The pod whose logs have been silent for minutes while the others are
actively erroring is the stuck leader.
## 2. The safe restart sequence (avoids zombie containers)
**DO NOT** `kubectl delete pod --force --grace-period=0` as the first
step. On NFS-backed Vault that's the exact move that leaves the kernel
NFS client corrupted on the node where the stuck pod ran.
Instead:
### 2a. Graceful delete first (30s grace)
```sh
kubectl delete pod -n vault vault-2
```
Wait 30 seconds. Most of the time the TERM → SIGKILL path works and the
new pod schedules cleanly. The remaining leaders re-elect and the external
endpoint recovers.
### 2b. If the pod is Terminating after 60s, find the stuck process
```sh
NODE=$(kubectl get pod -n vault vault-2-<suffix> -o jsonpath='{.spec.nodeName}')
POD_UID=$(kubectl get pod -n vault vault-2-<suffix> -o jsonpath='{.metadata.uid}')
ssh $NODE "sudo ps auxf | grep -A2 $POD_UID | head -20"
# Look for: mount.nfs (D-state), vault (Z-state), or the sh wrapper in do_wait
```
### 2c. Unmount stale NFS before force-deleting
If the old pod's NFS mount is still present, lazy-unmount it FIRST so
the kernel can release NFS session state cleanly:
```sh
ssh $NODE "sudo mount | grep $POD_UID | awk '{print \$3}' | xargs -I{} sudo umount -l {}"
```
Verify no mount.nfs processes are in D-state on the node:
```sh
ssh $NODE "ps -eo state,pid,comm | grep '^D' | head -5"
```
### 2d. Only NOW force-delete if needed
```sh
kubectl delete pod -n vault vault-2-<suffix> --force --grace-period=0
```
## 3. Recovery when the node is already stuck
If you force-deleted before reading this runbook and NFS is now broken
on the node:
**Diagnostic — confirm NFS client state is corrupted:**
```sh
NODE=k8s-node2 # node where the force-delete happened
ssh $NODE "sudo mkdir -p /tmp/nfstest && sudo timeout 30 \
mount -t nfs 192.168.1.127:/srv/nfs /tmp/nfstest && echo MOUNT_OK"
```
If the mount times out at 30-110s, kernel NFS client state is stuck.
No userspace recovery exists — only a VM reboot clears it.
**Workaround before rebooting**: mounting with `nfsvers=4.1` succeeds
on broken nodes (the corruption is NFSv4.2 session-state specific).
This is useful for diagnostic mounts, but does NOT fix CSI pods —
their mount options come from the `nfs-proxmox` StorageClass and can't
be overridden per-pod.
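A diagnostic-only example mirroring the mount test above, with the protocol version pinned (this does not change what CSI pods use):
```sh
NODE=k8s-node2
ssh $NODE "sudo mkdir -p /tmp/nfstest41 && sudo timeout 30 \
  mount -t nfs -o nfsvers=4.1 192.168.1.127:/srv/nfs /tmp/nfstest41 && \
  echo MOUNT_OK && sudo umount /tmp/nfstest41"
```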
**Reboot the affected node VM:**
```sh
# Find PVE VM ID — nodes numbered 201-204 for k8s-node1..4
ssh root@192.168.1.127 "qm reset 20<N>"
# If qm reset leaves the VM PID unchanged (it didn't actually reboot),
# use qm stop/start:
ssh root@192.168.1.127 "qm stop 20<N> && qm start 20<N>"
```
Wait for the node to become Ready (`kubectl get node k8s-node<N> -w`)
and CSI driver to register (`kubectl get pods -n nfs-csi -o wide`).
**Gotcha — `qm reset` can be a no-op.** On the 2026-04-22 incident,
`qm reset 201` returned exit 0 but did NOT restart the VM (same QEMU PID
before and after). `qm status` reported "running" throughout. Always
verify by checking the QEMU PID or VM uptime post-reset. If uptime is
unchanged, escalate to `qm stop && qm start`.
**Gotcha — check boot order before stop/start.** Long-running VMs
(630+ day uptime) may have stale `bootdisk:` config that's been hidden
by never rebooting. On 2026-04-22, k8s-node1's config had `bootdisk:
scsi0` but the actual OS disk was on `scsi1`, so the first boot after
stop attempted iPXE and failed. Before stopping, verify:
```sh
ssh root@192.168.1.127 "grep -E 'boot|scsi[0-9]+:' /etc/pve/qemu-server/20<N>.conf"
```
If `bootdisk` references a disk ID that doesn't exist, fix it first
with `qm set 20<N> --boot "order=scsi<ID>"` (use the ID of the main
OS disk).
## 4. Prevent re-infection — the chown loop
After the node comes back, the vault pod's PV chown walk can still
peg kubelet. The durable fix is in `stacks/vault/main.tf`:
```hcl
statefulSet = {
  securityContext = {
    pod = {
      fsGroupChangePolicy = "OnRootMismatch"
    }
  }
}
```
This was applied in commit `2f1f9107` (2026-04-22). If you find
yourself editing this in a kubectl patch for live recovery, follow
up with a Terraform apply the same session — leaving the cluster
ahead of Terraform state is technical debt that re-triggers on the
next apply.
## 5. Verify end-to-end
```sh
# External endpoint — the user-facing health check
curl -sk -o /dev/null -w "%{http_code}\n" https://vault.viktorbarzin.me/v1/sys/health
# expect: 200
# Raft peers (needs VAULT_TOKEN with operator capability)
kubectl exec -n vault vault-0 -c vault -- vault operator raft list-peers
# All pods 2/2
kubectl get pods -n vault -l app.kubernetes.io/name=vault -o wide
# No alerts fired (once VaultRaftLeaderStuck + VaultHAStatusUnavailable are live)
curl -s https://alertmanager.viktorbarzin.me/api/v2/alerts | \
jq '.[] | select(.labels.alertname | test("Vault"))'
```
## Known limitations
- **No alert for stuck leaders yet.** `VaultRaftLeaderStuck` and
`VaultHAStatusUnavailable` require Vault telemetry enabled
(`telemetry { unauthenticated_metrics_access = true }`) and a
scrape job. Alerts are defined in `prometheus_chart_values.tpl`
but stay silent until telemetry lands — tracked as a beads task.
- **Vault on NFS violates the documented rule.** `infra/.claude/CLAUDE.md`
says critical services must use `proxmox-lvm-encrypted`. The
`dataStorage`/`auditStorage` still use `nfs-proxmox`. Migration
tracked as an epic-level beads task.

View file

@ -1242,9 +1242,17 @@ check_overcommit() {
HA_CACHE_DIR=""
ha_sofia_available() {
if [[ -z "${HOME_ASSISTANT_SOFIA_URL:-}" ]] || [[ -z "${HOME_ASSISTANT_SOFIA_TOKEN:-}" ]]; then
return 1
if [[ -z "${HOME_ASSISTANT_SOFIA_URL:-}" ]]; then
export HOME_ASSISTANT_SOFIA_URL="https://ha-sofia.viktorbarzin.me"
fi
if [[ -z "${HOME_ASSISTANT_SOFIA_TOKEN:-}" ]]; then
if command -v vault >/dev/null 2>&1 && [[ -n "${VAULT_TOKEN:-}${HOME:-}" ]]; then
local t
t=$(vault kv get -field=haos_api_token secret/viktor 2>/dev/null || true)
[[ -n "$t" ]] && export HOME_ASSISTANT_SOFIA_TOKEN="$t"
fi
fi
[[ -n "${HOME_ASSISTANT_SOFIA_TOKEN:-}" ]] || return 1
return 0
}
@ -1752,14 +1760,25 @@ else:
json_add "hardware_exporters" "$status" "${detail:-All healthy}"
}
# Returns 0 if cert-manager CRDs are installed, 1 otherwise.
cert_manager_installed() {
$KUBECTL get crd certificates.cert-manager.io -o name >/dev/null 2>&1
}
# --- 31. cert-manager: Certificate Readiness ---
check_cert_manager_certificates() {
section 31 "cert-manager — Certificate Readiness"
local certs not_ready detail="" status="PASS"
if ! cert_manager_installed; then
pass "cert-manager not installed — N/A"
json_add "certmanager_certificates" "PASS" "N/A (cert-manager not installed)"
return 0
fi
certs=$($KUBECTL get certificates.cert-manager.io -A -o json 2>/dev/null) || {
warn "cert-manager CRDs not installed or inaccessible"
json_add "certmanager_certificates" "WARN" "CRDs unavailable"
warn "cert-manager CRDs installed but API query failed"
json_add "certmanager_certificates" "WARN" "API query failed"
return 0
}
@ -1797,9 +1816,15 @@ check_cert_manager_expiry() {
section 32 "cert-manager — Certificate Expiry (<14d)"
local certs expiring detail="" status="PASS"
if ! cert_manager_installed; then
pass "cert-manager not installed — N/A"
json_add "certmanager_expiry" "PASS" "N/A (cert-manager not installed)"
return 0
fi
certs=$($KUBECTL get certificates.cert-manager.io -A -o json 2>/dev/null) || {
warn "cert-manager CRDs not installed or inaccessible"
json_add "certmanager_expiry" "WARN" "CRDs unavailable"
warn "cert-manager CRDs installed but API query failed"
json_add "certmanager_expiry" "WARN" "API query failed"
return 0
}
@ -1852,9 +1877,15 @@ check_cert_manager_requests() {
section 33 "cert-manager — Failed CertificateRequests"
local requests failed detail="" status="PASS"
if ! cert_manager_installed; then
pass "cert-manager not installed — N/A"
json_add "certmanager_requests" "PASS" "N/A (cert-manager not installed)"
return 0
fi
requests=$($KUBECTL get certificaterequests.cert-manager.io -A -o json 2>/dev/null) || {
warn "cert-manager CRDs not installed or inaccessible"
json_add "certmanager_requests" "WARN" "CRDs unavailable"
warn "cert-manager CRDs installed but API query failed"
json_add "certmanager_requests" "WARN" "API query failed"
return 0
}
@ -1998,7 +2029,7 @@ check_backup_lvm_snapshots() {
local snap_output detail="" status="PASS"
snap_output=$(ssh -o BatchMode=yes -o ConnectTimeout=5 -o StrictHostKeyChecking=no \
root@192.168.1.127 "lvs -o lv_name,lv_time --noheadings 2>/dev/null | grep -- -snap" 2>/dev/null || true)
root@192.168.1.127 "lvs -o lv_name,lv_time --noheadings 2>/dev/null | grep _snap" 2>/dev/null || true)
if [[ -z "$snap_output" ]]; then
[[ "$QUIET" == true ]] && section_always 36 "Backup Freshness — LVM PVC Snapshots"

469
scripts/lvm-pvc-snapshot.sh Executable file
View file

@ -0,0 +1,469 @@
#!/usr/bin/env bash
# lvm-pvc-snapshot — LVM thin snapshot management for Proxmox CSI PVCs
# Deploy to PVE host at /usr/local/bin/lvm-pvc-snapshot
set -euo pipefail
# --- Configuration ---
VG="pve"
THINPOOL="data"
SNAP_SUFFIX_FORMAT="%Y%m%d_%H%M"
RETENTION_DAYS=7
MIN_FREE_PCT=10
PUSHGATEWAY="${LVM_SNAP_PUSHGATEWAY:-http://10.0.20.100:30091}"
PUSHGATEWAY_JOB="lvm-pvc-snapshot"
LOCKFILE="/run/lvm-pvc-snapshot.lock"
KUBECONFIG="${KUBECONFIG:-/root/.kube/config}"
export KUBECONFIG
# Namespaces to exclude from snapshots (high-churn, have app-level dumps)
# These PVCs cause significant CoW write amplification (~36% overhead)
EXCLUDE_NAMESPACES="${LVM_SNAP_EXCLUDE_NS:-dbaas,monitoring}"
# --- Logging ---
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; }
warn() { log "WARN: $*" >&2; }
die() { log "FATAL: $*" >&2; exit 1; }
# --- Helpers ---
get_thinpool_free_pct() {
local data_pct
data_pct=$(lvs --noheadings --nosuffix -o data_percent "${VG}/${THINPOOL}" 2>/dev/null | tr -d ' ')
echo "scale=2; 100 - ${data_pct}" | bc
}
build_exclude_lv_list() {
# Query K8s for PVs in excluded namespaces, extract their LV names
if [[ -z "${EXCLUDE_NAMESPACES}" ]] || ! command -v kubectl &>/dev/null; then
return
fi
kubectl get pv -o json 2>/dev/null | jq -r --arg ns "${EXCLUDE_NAMESPACES}" '
($ns | split(",")) as $excl |
.items[] |
select(.spec.csi.driver == "csi.proxmox.sinextra.dev") |
select(.spec.claimRef.namespace as $n | $excl | index($n)) |
.spec.csi.volumeHandle | split("/") | last
' 2>/dev/null || true
}
discover_pvc_lvs() {
# List thin LVs matching PVC pattern, excluding snapshots, pre-restore backups,
# and LVs belonging to excluded namespaces (high-churn databases/metrics)
local all_lvs exclude_lvs
all_lvs=$(lvs --noheadings -o lv_name,pool_lv "${VG}" 2>/dev/null \
| awk -v pool="${THINPOOL}" '$2 == pool { print $1 }' \
| grep -E '^vm-[0-9]+-pvc-' \
| grep -v '_snap_' \
| grep -v '_pre_restore_')
exclude_lvs=$(build_exclude_lv_list)
if [[ -n "${exclude_lvs}" ]]; then
# Filter out excluded LVs
local exclude_pattern
exclude_pattern=$(echo "${exclude_lvs}" | paste -sd'|' -)
echo "${all_lvs}" | grep -vE "(${exclude_pattern})" || true
else
echo "${all_lvs}"
fi
}
list_snapshots() {
lvs --noheadings -o lv_name,pool_lv "${VG}" 2>/dev/null \
| awk -v pool="${THINPOOL}" '$2 == pool { print $1 }' \
| grep '_snap_' || true
}
parse_snap_timestamp() {
# Extract YYYYMMDD_HHMM from snapshot name, convert to epoch
local snap_name="$1"
local ts_str
ts_str=$(echo "${snap_name}" | grep -oE '[0-9]{8}_[0-9]{4}$')
if [[ -z "${ts_str}" ]]; then
echo "0"
return
fi
local ymd="${ts_str:0:8}"
local hm="${ts_str:9:4}"
date -d "${ymd:0:4}-${ymd:4:2}-${ymd:6:2} ${hm:0:2}:${hm:2:2}" +%s 2>/dev/null || echo "0"
}
get_original_lv_from_snap() {
# vm-200-pvc-abc_snap_20260403_1200 -> vm-200-pvc-abc
echo "$1" | sed 's/_snap_[0-9]\{8\}_[0-9]\{4\}$//'
}
push_metrics() {
local status="$1" created="$2" failed="$3" pruned="$4"
local free_pct
free_pct=$(get_thinpool_free_pct)
cat <<METRICS | curl -sf --connect-timeout 5 --max-time 10 --data-binary @- \
"${PUSHGATEWAY}/metrics/job/${PUSHGATEWAY_JOB}" 2>/dev/null || warn "Failed to push metrics to Pushgateway"
# HELP lvm_snapshot_last_run_timestamp Unix timestamp of last snapshot run
# TYPE lvm_snapshot_last_run_timestamp gauge
lvm_snapshot_last_run_timestamp $(date +%s)
# HELP lvm_snapshot_last_status Exit status (0=success, 1=partial failure, 2=aborted)
# TYPE lvm_snapshot_last_status gauge
lvm_snapshot_last_status ${status}
# HELP lvm_snapshot_created_total Number of snapshots created in last run
# TYPE lvm_snapshot_created_total gauge
lvm_snapshot_created_total ${created}
# HELP lvm_snapshot_failed_total Number of snapshot failures in last run
# TYPE lvm_snapshot_failed_total gauge
lvm_snapshot_failed_total ${failed}
# HELP lvm_snapshot_pruned_total Number of snapshots pruned in last run
# TYPE lvm_snapshot_pruned_total gauge
lvm_snapshot_pruned_total ${pruned}
# HELP lvm_snapshot_thinpool_free_pct Thin pool free percentage
# TYPE lvm_snapshot_thinpool_free_pct gauge
lvm_snapshot_thinpool_free_pct ${free_pct}
METRICS
}
# --- Subcommands ---
cmd_snapshot() {
log "Starting PVC LVM thin snapshot run"
# Check thin pool free space
local free_pct
free_pct=$(get_thinpool_free_pct)
log "Thin pool free space: ${free_pct}%"
if (( $(echo "${free_pct} < ${MIN_FREE_PCT}" | bc -l) )); then
warn "Thin pool has only ${free_pct}% free (minimum: ${MIN_FREE_PCT}%). Aborting."
push_metrics 2 0 0 0
exit 1
fi
# Discover PVC LVs
local lvs_list
lvs_list=$(discover_pvc_lvs)
if [[ -z "${lvs_list}" ]]; then
warn "No PVC LVs found matching pattern"
push_metrics 2 0 0 0
exit 1
fi
local count=0 failed=0 total
total=$(echo "${lvs_list}" | wc -l | tr -d ' ')
local snap_ts
snap_ts=$(date +"${SNAP_SUFFIX_FORMAT}")
log "Found ${total} PVC LVs to snapshot"
while IFS= read -r lv; do
local snap_name="${lv}_snap_${snap_ts}"
if lvcreate -s -kn -n "${snap_name}" "${VG}/${lv}" >/dev/null 2>&1; then
log " Created: ${snap_name}"
count=$((count + 1))
else
warn " Failed to create snapshot for ${lv}"
failed=$((failed + 1))
fi
done <<< "${lvs_list}"
log "Snapshot run complete: ${count} created, ${failed} failed out of ${total}"
# Auto-prune
log "Running auto-prune..."
local pruned
pruned=$(cmd_prune_count)
# Determine status
local status=0
if (( failed > 0 && count > 0 )); then
status=1 # partial
elif (( failed > 0 && count == 0 )); then
status=2 # all failed
fi
push_metrics "${status}" "${count}" "${failed}" "${pruned}"
log "Done"
}
cmd_list() {
printf "%-45s %-50s %8s %8s\n" "ORIGINAL LV" "SNAPSHOT" "AGE" "DATA%"
printf "%-45s %-50s %8s %8s\n" "-----------" "--------" "---" "-----"
local now
now=$(date +%s)
local snap_lines
snap_lines=$(lvs --noheadings --nosuffix -o lv_name,lv_size,data_percent "${VG}" 2>/dev/null \
| grep -E '_snap_|_pre_restore_' || true)
if [[ -z "${snap_lines}" ]]; then
echo "(no snapshots found)"
return
fi
echo "${snap_lines}" | while read -r name size data_pct; do
local original age_str ts epoch
if [[ "${name}" == *"_pre_restore_"* ]]; then
original=$(echo "${name}" | sed 's/_pre_restore_[0-9]\{8\}_[0-9]\{4\}$//')
ts=$(echo "${name}" | grep -oE '[0-9]{8}_[0-9]{4}$')
else
original=$(get_original_lv_from_snap "${name}")
ts=$(echo "${name}" | grep -oE '[0-9]{8}_[0-9]{4}$')
fi
epoch=$(parse_snap_timestamp "${name}")
if (( epoch > 0 )); then
local age_s=$(( now - epoch ))
local days=$(( age_s / 86400 ))
local hours=$(( (age_s % 86400) / 3600 ))
age_str="${days}d${hours}h"
else
age_str="unknown"
fi
printf "%-45s %-50s %8s %7s%%\n" "${original}" "${name}" "${age_str}" "${data_pct}"
done
}
cmd_prune() {
local pruned
pruned=$(cmd_prune_count)
log "Pruned ${pruned} expired snapshots"
}
cmd_prune_count() {
# NOTE: stdout of this function is captured by callers (`pruned=$(cmd_prune_count)`),
# so all log/warn output must go to stderr — the only thing on stdout is the count.
local now cutoff pruned=0
now=$(date +%s)
cutoff=$(( now - RETENTION_DAYS * 86400 ))
local snaps
snaps=$(lvs --noheadings -o lv_name,pool_lv "${VG}" 2>/dev/null \
| awk -v pool="${THINPOOL}" '$2 == pool { print $1 }' \
| grep -E '_snap_|_pre_restore_' || true)
if [[ -z "${snaps}" ]]; then
echo "0"
return
fi
while IFS= read -r snap; do
local epoch
epoch=$(parse_snap_timestamp "${snap}")
if (( epoch > 0 && epoch < cutoff )); then
if lvremove -f "${VG}/${snap}" >/dev/null 2>&1; then
log " Pruned: ${snap}" >&2
pruned=$((pruned + 1))
else
warn " Failed to prune: ${snap}"
fi
fi
done <<< "${snaps}"
echo "${pruned}"
}
cmd_restore() {
local pvc_lv="${1:-}" snapshot_lv="${2:-}"
if [[ -z "${pvc_lv}" || -z "${snapshot_lv}" ]]; then
die "Usage: $0 restore <pvc-lv-name> <snapshot-lv-name>"
fi
# Validate LVs exist
if ! lvs "${VG}/${pvc_lv}" >/dev/null 2>&1; then
die "PVC LV '${pvc_lv}' not found in VG '${VG}'"
fi
if ! lvs "${VG}/${snapshot_lv}" >/dev/null 2>&1; then
die "Snapshot LV '${snapshot_lv}' not found in VG '${VG}'"
fi
# Discover K8s context
log "Discovering Kubernetes context for LV '${pvc_lv}'..."
local volume_handle="local-lvm:${pvc_lv}"
local pv_info
pv_info=$(kubectl get pv -o json 2>/dev/null | jq -r \
--arg vh "${volume_handle}" \
'.items[] | select(.spec.csi.volumeHandle == $vh) | "\(.metadata.name) \(.spec.claimRef.namespace) \(.spec.claimRef.name)"' \
) || die "Failed to query PVs (is kubectl configured?)"
if [[ -z "${pv_info}" ]]; then
die "No PV found with volumeHandle '${volume_handle}'"
fi
local pv_name pvc_ns pvc_name
read -r pv_name pvc_ns pvc_name <<< "${pv_info}"
log "Found: PV=${pv_name}, PVC=${pvc_ns}/${pvc_name}"
# Find the workload (Deployment or StatefulSet) that uses this PVC
local workload_type="" workload_name="" original_replicas=""
# Check StatefulSets first (databases use these)
local sts_info
sts_info=$(kubectl get statefulset -n "${pvc_ns}" -o json 2>/dev/null | jq -r \
--arg pvc "${pvc_name}" \
'.items[] | select(
(.spec.template.spec.volumes // [] | .[].persistentVolumeClaim.claimName == $pvc) or
(.spec.volumeClaimTemplates // [] | .[].metadata.name as $vct |
.spec.replicas as $r | range($r) | "\($vct)-\(.metadata.name)-\(.)" ) == $pvc
) | "\(.metadata.name) \(.spec.replicas)"' 2>/dev/null \
) || true
# If not found via simple volume check, try matching VCT naming pattern
if [[ -z "${sts_info}" ]]; then
sts_info=$(kubectl get statefulset -n "${pvc_ns}" -o json 2>/dev/null | jq -r \
--arg pvc "${pvc_name}" \
'.items[] | .metadata.name as $sts | .spec.replicas as $r |
select(.spec.volumeClaimTemplates != null) |
.spec.volumeClaimTemplates[].metadata.name as $vct |
[range($r)] | map("\($vct)-\($sts)-\(.)") |
if any(. == $pvc) then "\($sts) \($r)" else empty end' 2>/dev/null \
) || true
fi
if [[ -n "${sts_info}" ]]; then
read -r workload_name original_replicas <<< "${sts_info}"
workload_type="statefulset"
else
# Check Deployments
local deploy_info
deploy_info=$(kubectl get deployment -n "${pvc_ns}" -o json 2>/dev/null | jq -r \
--arg pvc "${pvc_name}" \
'.items[] | select(
.spec.template.spec.volumes // [] | .[].persistentVolumeClaim.claimName == $pvc
) | "\(.metadata.name) \(.spec.replicas)"' 2>/dev/null \
) || true
if [[ -n "${deploy_info}" ]]; then
read -r workload_name original_replicas <<< "${deploy_info}"
workload_type="deployment"
fi
fi
if [[ -z "${workload_type}" ]]; then
warn "Could not auto-discover workload for PVC '${pvc_name}' in namespace '${pvc_ns}'."
warn "You may need to scale down the pod manually."
echo ""
read -rp "Continue with LV swap anyway? (yes/no): " confirm
[[ "${confirm}" == "yes" ]] || die "Aborted by user"
workload_type="manual"
fi
# Dry-run output
local backup_name="${pvc_lv}_pre_restore_$(date +"${SNAP_SUFFIX_FORMAT}")"
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ RESTORE DRY-RUN ║"
echo "╠══════════════════════════════════════════════════════════════╣"
echo "║ PVC: ${pvc_ns}/${pvc_name}"
echo "║ PV: ${pv_name}"
if [[ "${workload_type}" != "manual" ]]; then
echo "║ Workload: ${workload_type}/${workload_name} (replicas: ${original_replicas}→0→${original_replicas})"
fi
echo "║"
echo "║ Actions:"
if [[ "${workload_type}" != "manual" ]]; then
echo "║ 1. Scale ${workload_type}/${workload_name} to 0 replicas"
echo "║ 2. Wait for pod termination"
fi
echo "║ 3. Rename ${pvc_lv}${backup_name}"
echo "║ 4. Rename ${snapshot_lv}${pvc_lv}"
if [[ "${workload_type}" != "manual" ]]; then
echo "║ 5. Scale ${workload_type}/${workload_name} back to ${original_replicas} replicas"
fi
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
# Interactive confirmation
read -rp "Type 'yes' to proceed with restore: " confirm
if [[ "${confirm}" != "yes" ]]; then
die "Aborted by user"
fi
# Scale down
if [[ "${workload_type}" != "manual" ]]; then
log "Scaling ${workload_type}/${workload_name} to 0 replicas..."
kubectl scale "${workload_type}/${workload_name}" -n "${pvc_ns}" --replicas=0
log "Waiting for pod termination (timeout: 120s)..."
kubectl wait --for=delete pod -l "app.kubernetes.io/name=${workload_name}" -n "${pvc_ns}" --timeout=120s 2>/dev/null || \
kubectl wait --for=delete pod -l "app=${workload_name}" -n "${pvc_ns}" --timeout=120s 2>/dev/null || \
warn "Timeout waiting for pods — continuing anyway (LV may still be in use)"
sleep 5 # extra grace period for device detach
fi
# Verify LV is not active
local lv_active
lv_active=$(lvs --noheadings -o lv_active "${VG}/${pvc_lv}" 2>/dev/null | tr -d ' ')
if [[ "${lv_active}" == "active" ]]; then
warn "LV ${pvc_lv} is still active. Attempting to deactivate..."
# Close any LUKS mapper on the LV before deactivation
if dmsetup ls 2>/dev/null | grep -q "${pvc_lv}"; then
log "Closing LUKS mapper for ${pvc_lv}..."
cryptsetup luksClose "${pvc_lv}" 2>/dev/null || true
fi
lvchange -an "${VG}/${pvc_lv}" 2>/dev/null || warn "Could not deactivate — proceeding with caution"
fi
# LV swap
log "Renaming ${pvc_lv}${backup_name}"
lvrename "${VG}" "${pvc_lv}" "${backup_name}" || die "Failed to rename original LV"
log "Renaming ${snapshot_lv}${pvc_lv}"
lvrename "${VG}" "${snapshot_lv}" "${pvc_lv}" || die "Failed to rename snapshot LV"
# Scale back up
if [[ "${workload_type}" != "manual" ]]; then
log "Scaling ${workload_type}/${workload_name} back to ${original_replicas} replicas..."
kubectl scale "${workload_type}/${workload_name}" -n "${pvc_ns}" --replicas="${original_replicas}"
log "Waiting for pod to become Ready (timeout: 300s)..."
kubectl wait --for=condition=Ready pod -l "app.kubernetes.io/name=${workload_name}" -n "${pvc_ns}" --timeout=300s 2>/dev/null || \
kubectl wait --for=condition=Ready pod -l "app=${workload_name}" -n "${pvc_ns}" --timeout=300s 2>/dev/null || \
warn "Timeout waiting for pod Ready — check manually"
fi
echo ""
log "Restore complete!"
log "Old data preserved as: ${backup_name}"
log "To delete old data after verification: lvremove -f ${VG}/${backup_name}"
}
# --- Main ---
usage() {
cat <<EOF
Usage: $(basename "$0") <command> [args]
Commands:
snapshot Create thin snapshots of all PVC LVs
list List existing snapshots with age and data%
prune Remove snapshots older than ${RETENTION_DAYS} days
restore <lv> <snap> Restore a PVC from a snapshot (interactive)
Environment:
LVM_SNAP_PUSHGATEWAY Pushgateway URL (default: ${PUSHGATEWAY})
KUBECONFIG Kubeconfig path (default: /root/.kube/config)
EOF
}
main() {
local cmd="${1:-}"
shift || true
# Acquire lock (except for list which is read-only)
if [[ "${cmd}" != "list" && "${cmd}" != "" && "${cmd}" != "help" && "${cmd}" != "--help" && "${cmd}" != "-h" ]]; then
exec 200>"${LOCKFILE}"
if ! flock -n 200; then
die "Another instance is already running (lockfile: ${LOCKFILE})"
fi
fi
case "${cmd}" in
snapshot) cmd_snapshot ;;
list) cmd_list ;;
prune) cmd_prune ;;
restore) cmd_restore "$@" ;;
help|--help|-h|"") usage ;;
*) die "Unknown command: ${cmd}. Run '$0 help' for usage." ;;
esac
}
main "$@"

View file

@ -72,12 +72,23 @@ if [ -n "$STACK_NAME" ]; then
else
# Tier 1: PG backend — fetch credentials from Vault
if [ -z "${PG_CONN_STR:-}" ]; then
PG_CREDS=$(vault read -format=json database/static-creds/pg-terraform-state 2>/dev/null) || {
echo "ERROR: Cannot read PG credentials from Vault. Run: vault login -method=oidc" >&2
# Pre-flight: vault CLI must be available. Previously CI failed with a
# misleading "Cannot read PG credentials" message because the Alpine CI
# image lacked the vault binary — the 2>/dev/null below swallowed the
# real "vault: not found" error. Fail fast with a clear message instead.
if ! command -v vault >/dev/null 2>&1; then
echo "ERROR: vault CLI not found on PATH. Install it or use an image that includes it (ci/Dockerfile)." >&2
exit 1
fi
VAULT_OUT=$(vault read -format=json database/static-creds/pg-terraform-state 2>&1) || {
echo "ERROR: Cannot read PG credentials from Vault. Vault output follows:" >&2
echo "$VAULT_OUT" >&2
echo "" >&2
echo "Hint: humans run 'vault login -method=oidc'; CI auths via K8s SA (role=ci)." >&2
exit 1
}
PG_USER=$(echo "$PG_CREDS" | jq -r .data.username)
PG_PASS=$(echo "$PG_CREDS" | jq -r .data.password)
PG_USER=$(echo "$VAULT_OUT" | jq -r .data.username)
PG_PASS=$(echo "$VAULT_OUT" | jq -r .data.password)
export PG_CONN_STR="postgres://${PG_USER}:${PG_PASS}@10.0.20.200:5432/terraform_state?sslmode=disable"
fi
fi

Binary file not shown.

Binary file not shown.

View file

@ -14,7 +14,7 @@ variable "beadboard_image_tag" {
# already ships.
variable "claude_agent_service_image_tag" {
type = string
default = "0c24c9b6"
default = "2fd7670d"
}
# Kill switch for auto-dispatch. When false, both CronJobs are suspended. The

View file

@ -105,7 +105,7 @@ resource "kubernetes_cron_job_v1" "version_probe" {
metadata {}
spec {
backoff_limit = 1
ttl_seconds_after_finished = 300
ttl_seconds_after_finished = 86400
template {
metadata {
labels = { app = "broker-sync", component = "version-probe" }
@ -246,7 +246,12 @@ resource "kubernetes_cron_job_v1" "imap" {
concurrency_policy = "Forbid"
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
suspend = true # enable in Phase 2
# Unsuspended 2026-04-19 for RSU vest ground-truth ingestion: the parser
# now detects Schwab Release Confirmations and scaffolds VestEvents; the
# postgres sink that persists them into payslip_ingest.rsu_vest_events is
# pending a real-email fixture and cross-service DB grant (see
# follow-up beads task filed under the RSU tax spike fix epic).
suspend = false
job_template {
metadata {}
spec {

View file

@ -96,6 +96,21 @@ resource "kubernetes_config_map" "crowdsec_whitelist" {
reason: "Trusted IP - never block"
ip:
- "176.12.22.76"
---
name: viktor/immich-asset-paths-whitelist
description: "Don't penalise legit Immich timeline bursts (mobile scrub, web grid)"
whitelist:
reason: "Immich asset endpoints are auth-gated; mobile scrub legitimately bursts"
expression:
- >
evt.Parsed.target_fqdn == "immich.viktorbarzin.me" &&
(evt.Parsed.request startsWith "/api/assets/" ||
evt.Parsed.request startsWith "/api/timeline/" ||
evt.Parsed.request startsWith "/api/asset/" ||
evt.Parsed.request startsWith "/api/search/" ||
evt.Parsed.request startsWith "/api/memories" ||
evt.Parsed.request startsWith "/api/albums" ||
evt.Parsed.request startsWith "/api/activities")
YAML
}
}

View file

@ -157,9 +157,9 @@ resource "kubernetes_stateful_set_v1" "mysql_standalone" {
required_during_scheduling_ignored_during_execution {
node_selector_term {
match_expressions {
key = "kubernetes.io/hostname"
key = "nvidia.com/gpu.present"
operator = "NotIn"
values = ["k8s-node1"]
values = ["true"]
}
}
}
@ -1209,6 +1209,61 @@ resource "null_resource" "pg_job_hunter_db" {
}
}
# Create wealthfolio_sync database for the SQLitePG ETL sidecar that mirrors
# Wealthfolio's daily_account_valuation/accounts/activities into PG so Grafana
# can chart net worth, contributions, and growth.
# Role password is managed by Vault Database Secrets Engine (static role `pg-wealthfolio-sync`, 7d rotation).
resource "null_resource" "pg_wealthfolio_sync_db" {
depends_on = [null_resource.pg_cluster]
triggers = {
db_name = "wealthfolio_sync"
username = "wealthfolio_sync"
}
provisioner "local-exec" {
command = <<-EOT
PRIMARY=$(kubectl --kubeconfig ${var.kube_config_path} get cluster -n dbaas pg-cluster -o jsonpath='{.status.currentPrimary}')
kubectl --kubeconfig ${var.kube_config_path} exec -n dbaas $PRIMARY -c postgres -- \
bash -c '
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = '"'"'wealthfolio_sync'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE ROLE wealthfolio_sync WITH LOGIN PASSWORD '"'"'changeme-vault-will-rotate'"'"'"
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_database WHERE datname = '"'"'wealthfolio_sync'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE DATABASE wealthfolio_sync OWNER wealthfolio_sync"
psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE wealthfolio_sync TO wealthfolio_sync"
'
EOT
}
}
# Create fire_planner database for the FIRE retirement-planning service.
# Role password is managed by Vault Database Secrets Engine
# (static role `pg-fire-planner`, 7d rotation).
# fire_planner reads from payslip_ingest + wealthfolio_sync (read-only)
# and writes its own MC results into schema fire_planner.
resource "null_resource" "pg_fire_planner_db" {
depends_on = [null_resource.pg_cluster]
triggers = {
db_name = "fire_planner"
username = "fire_planner"
}
provisioner "local-exec" {
command = <<-EOT
PRIMARY=$(kubectl --kubeconfig ${var.kube_config_path} get cluster -n dbaas pg-cluster -o jsonpath='{.status.currentPrimary}')
kubectl --kubeconfig ${var.kube_config_path} exec -n dbaas $PRIMARY -c postgres -- \
bash -c '
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = '"'"'fire_planner'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE ROLE fire_planner WITH LOGIN PASSWORD '"'"'changeme-vault-will-rotate'"'"'"
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_database WHERE datname = '"'"'fire_planner'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE DATABASE fire_planner OWNER fire_planner"
psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE fire_planner TO fire_planner"
'
EOT
}
}
# Old PostgreSQL deployment kept commented for rollback reference
# resource "kubernetes_deployment" "postgres" {
# metadata {

View file

@ -72,7 +72,7 @@ resource "kubernetes_deployment" "ebook2audiobook" {
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"
@ -290,7 +290,7 @@ resource "kubernetes_deployment" "audiblez" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"
@ -356,7 +356,7 @@ resource "kubernetes_deployment" "audiblez-web" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"

383
stacks/fire-planner/main.tf Normal file
View file

@ -0,0 +1,383 @@
variable "image_tag" {
type = string
default = "latest"
description = "fire-planner image tag. Use 8-char git SHA in CI; :latest only for local trials."
}
variable "postgresql_host" { type = string }
locals {
namespace = "fire-planner"
image = "registry.viktorbarzin.me/fire-planner:${var.image_tag}"
labels = {
app = "fire-planner"
}
}
resource "kubernetes_namespace" "fire_planner" {
metadata {
name = local.namespace
labels = {
tier = local.tiers.aux
"istio-injection" = "disabled"
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps
# this label on every namespace.
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
}
# App secrets: the recompute-API bearer token (manual seed in Vault).
# Seed before applying:
# secret/fire-planner -> property `recompute_bearer_token`
resource "kubernetes_manifest" "external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "fire-planner-secrets"
namespace = local.namespace
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-kv"
kind = "ClusterSecretStore"
}
target = {
name = "fire-planner-secrets"
template = {
metadata = {
annotations = {
"reloader.stakater.com/match" = "true"
}
}
}
}
data = [
{
secretKey = "RECOMPUTE_BEARER_TOKEN"
remoteRef = {
key = "fire-planner"
property = "recompute_bearer_token"
}
},
]
}
}
depends_on = [kubernetes_namespace.fire_planner]
}
# DB credentials from Vault database engine (rotated every 7 days).
# Template builds the asyncpg DSN consumed by the FastAPI app + CronJob
# as DB_CONNECTION_STRING.
resource "kubernetes_manifest" "db_external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "fire-planner-db-creds"
namespace = local.namespace
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-database"
kind = "ClusterSecretStore"
}
target = {
name = "fire-planner-db-creds"
template = {
metadata = {
annotations = {
"reloader.stakater.com/match" = "true"
}
}
data = {
DB_CONNECTION_STRING = "postgresql+asyncpg://fire_planner:{{ .password }}@${var.postgresql_host}:5432/fire_planner"
DB_PASSWORD = "{{ .password }}"
}
}
}
data = [{
secretKey = "password"
remoteRef = {
key = "static-creds/pg-fire-planner"
property = "password"
}
}]
}
}
depends_on = [kubernetes_namespace.fire_planner]
}
resource "kubernetes_deployment" "fire_planner" {
metadata {
name = "fire-planner"
namespace = kubernetes_namespace.fire_planner.metadata[0].name
labels = merge(local.labels, {
tier = local.tiers.aux
})
annotations = {
"reloader.stakater.com/search" = "true"
}
}
spec {
replicas = 1
strategy {
type = "Recreate"
}
selector {
match_labels = local.labels
}
template {
metadata {
labels = local.labels
annotations = {
"dependency.kyverno.io/wait-for" = "postgresql.dbaas:5432"
}
}
spec {
image_pull_secrets {
name = "registry-credentials"
}
init_container {
name = "alembic-migrate"
image = local.image
command = ["python", "-m", "fire_planner", "migrate"]
env_from {
secret_ref {
name = "fire-planner-db-creds"
}
}
resources {
requests = {
cpu = "50m"
memory = "256Mi"
}
limits = {
memory = "512Mi"
}
}
}
container {
name = "fire-planner"
image = local.image
command = ["python", "-m", "fire_planner", "serve"]
port {
container_port = 8080
}
env_from {
secret_ref {
name = "fire-planner-secrets"
}
}
env_from {
secret_ref {
name = "fire-planner-db-creds"
}
}
readiness_probe {
http_get {
path = "/healthz"
port = 8080
}
initial_delay_seconds = 5
period_seconds = 10
}
liveness_probe {
http_get {
path = "/healthz"
port = 8080
}
initial_delay_seconds = 5
period_seconds = 10
}
resources {
requests = {
cpu = "100m"
memory = "512Mi"
}
limits = {
memory = "1024Mi"
}
}
}
}
}
}
lifecycle {
ignore_changes = [spec[0].template[0].spec[0].dns_config] # KYVERNO_LIFECYCLE_V1
}
depends_on = [
kubernetes_manifest.external_secret,
kubernetes_manifest.db_external_secret,
]
}
# ClusterIP-only: /recompute is cluster-internal (operator triggers
# via kubectl port-forward or ad-hoc CronJob).
resource "kubernetes_service" "fire_planner" {
metadata {
name = "fire-planner"
namespace = kubernetes_namespace.fire_planner.metadata[0].name
labels = local.labels
}
spec {
type = "ClusterIP"
selector = local.labels
port {
name = "http"
port = 8080
target_port = 8080
}
}
}
# Monthly recompute on the 2nd at 09:00 UTC. Wealthfolio-sync runs on
# the 1st at 08:00, so account_snapshot is fresh by the time the
# planner picks it up.
resource "kubernetes_cron_job_v1" "fire_planner_recompute" {
metadata {
name = "fire-planner-recompute"
namespace = kubernetes_namespace.fire_planner.metadata[0].name
}
spec {
schedule = "0 9 2 * *"
concurrency_policy = "Forbid"
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
starting_deadline_seconds = 600
job_template {
metadata {
labels = local.labels
}
spec {
backoff_limit = 1
ttl_seconds_after_finished = 86400
template {
metadata {
labels = local.labels
}
spec {
restart_policy = "OnFailure"
image_pull_secrets {
name = "registry-credentials"
}
container {
name = "recompute"
image = local.image
command = ["python", "-m", "fire_planner", "recompute-all"]
env_from {
secret_ref {
name = "fire-planner-secrets"
}
}
env_from {
secret_ref {
name = "fire-planner-db-creds"
}
}
resources {
requests = {
cpu = "200m"
memory = "1Gi"
}
limits = {
memory = "2Gi"
}
}
}
}
}
}
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1
ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
}
depends_on = [
kubernetes_manifest.external_secret,
kubernetes_manifest.db_external_secret,
]
}
# Plan-time read of the ESO-created K8s Secret for Grafana datasource
# password. First-apply gotcha: must
# `terragrunt apply -target=kubernetes_manifest.db_external_secret` so
# the Secret exists before this data source plans.
data "kubernetes_secret" "fire_planner_db_creds" {
metadata {
name = "fire-planner-db-creds"
namespace = kubernetes_namespace.fire_planner.metadata[0].name
}
depends_on = [kubernetes_manifest.db_external_secret]
}
# Grafana datasource for fire_planner PostgreSQL DB.
# Lives in the monitoring namespace so the grafana sidecar
# (label grafana_datasource=1) picks it up.
#
# Grafana 11.2+ Postgres plugin reads the DB name from jsonData.database;
# the top-level `database` field is silently ignored by the frontend and
# triggers "you do not have default database" on every panel.
# See github.com/grafana/grafana#112418; same fix as the payslip-ingest
# datasource (commit cc56ba29).
resource "kubernetes_config_map" "grafana_fire_planner_datasource" {
metadata {
name = "grafana-fire-planner-datasource"
namespace = "monitoring"
labels = {
grafana_datasource = "1"
}
}
data = {
"fire-planner-datasource.yaml" = yamlencode({
apiVersion = 1
datasources = [{
name = "FirePlanner"
type = "postgres"
access = "proxy"
url = "${var.postgresql_host}:5432"
user = "fire_planner"
uid = "fire-planner-pg"
jsonData = {
database = "fire_planner"
sslmode = "disable"
postgresVersion = 1600
timescaledb = false
}
secureJsonData = {
password = data.kubernetes_secret.fire_planner_db_creds.data["DB_PASSWORD"]
}
editable = true
}]
})
}
}

View file

@ -0,0 +1,28 @@
include "root" {
path = find_in_parent_folders()
}
dependency "platform" {
config_path = "../platform"
skip_outputs = true
}
dependency "vault" {
config_path = "../vault"
skip_outputs = true
}
dependency "external-secrets" {
config_path = "../external-secrets"
skip_outputs = true
}
dependency "dbaas" {
config_path = "../dbaas"
skip_outputs = true
}
inputs = {
# fire-planner repo HEAD bump on every deploy.
image_tag = "latest"
}

View file

@ -87,7 +87,7 @@ resource "kubernetes_deployment" "frigate" {
}
spec {
node_selector = {
"gpu" : true
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"

View file

@ -220,7 +220,8 @@ resource "kubernetes_deployment" "hermes_agent" {
strategy {
type = "Recreate"
}
replicas = 1
# Disabled 2026-04-22: main container fails with "mkdir: cannot create directory '/opt/data': Permission denied" (fsGroup/runAsUser mismatch vs init container). Re-enable after fixing PVC permissions.
replicas = 0
selector {
match_labels = {
app = "hermes-agent"

View file

@ -85,6 +85,30 @@ module "nfs_postgresql_host" {
nfs_path = "/srv/nfs/immich/postgresql"
}
# Migrated 2026-04-25: PG live data moved off NFS to LUKS-encrypted block.
# WAL fsync per commit on NFS contributed to the 2026-04-22 NFS writeback storm
# (see post-mortems/2026-04-22-vault-raft-leader-deadlock.md).
# Backup CronJob still writes to module.nfs_postgresql_host (NFS append-only).
resource "kubernetes_persistent_volume_claim" "immich_postgresql_encrypted" {
wait_until_bound = false
metadata {
name = "immich-postgresql-data-encrypted"
namespace = kubernetes_namespace.immich.metadata[0].name
annotations = {
"resize.topolvm.io/threshold" = "80%"
"resize.topolvm.io/increase" = "100%"
"resize.topolvm.io/storage_limit" = "20Gi"
}
}
spec {
access_modes = ["ReadWriteOnce"]
storage_class_name = "proxmox-lvm-encrypted"
resources {
requests = { storage = "10Gi" }
}
}
}
module "nfs_ml_cache_host" {
source = "../../modules/kubernetes/nfs_volume"
name = "immich-ml-cache-host"
@ -164,7 +188,7 @@ resource "kubernetes_deployment" "immich_server" {
}
strategy {
type = "RollingUpdate"
type = "Recreate"
}
template {
@ -287,10 +311,10 @@ resource "kubernetes_deployment" "immich_server" {
resources {
requests = {
cpu = "100m"
memory = "2000Mi"
memory = "4096Mi"
}
limits = {
memory = "3500Mi"
memory = "4096Mi"
}
}
}
@ -462,6 +486,13 @@ resource "kubernetes_deployment" "immich-postgres" {
name = "write-pg-override-conf"
image = "busybox:1.36"
command = ["sh", "-c", <<-EOT
# Skip write on uninitialised PGDATA: initdb refuses non-empty dirs.
# On first boot the override is absent; trigger a pod restart after
# initdb completes so the override is applied before extension load.
if [ ! -f /data/PG_VERSION ]; then
echo "PGDATA uninitialised, skipping override conf (will write on next pod start)"
exit 0
fi
cat > /data/postgresql.override.conf <<'PGCONF'
# Immich vector search performance tuning
shared_buffers = 2048MB
@ -481,7 +512,7 @@ resource "kubernetes_deployment" "immich-postgres" {
volume {
name = "postgresql-persistent-storage"
persistent_volume_claim {
claim_name = module.nfs_postgresql_host.claim_name
claim_name = kubernetes_persistent_volume_claim.immich_postgresql_encrypted.metadata[0].name
}
}
}
@ -548,7 +579,7 @@ resource "kubernetes_deployment" "immich-machine-learning" {
}
}
strategy {
type = "RollingUpdate"
type = "Recreate"
}
template {
metadata {
@ -559,7 +590,7 @@ resource "kubernetes_deployment" "immich-machine-learning" {
spec {
priority_class_name = "gpu-workload"
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"

View file

@ -18,8 +18,9 @@ dependency "external-secrets" {
}
inputs = {
# 8-char SHA from the Forgejo commit viktor/job-hunter@9c42eac9
# (first image built locally + pushed 2026-04-19 due to a Woodpecker
# v3.13 Forgejo webhook bug; bump on every deploy once CI recovers).
image_tag = "48f8615d"
# 92afc38d = master HEAD with levels.fyi scraper + comp_table COALESCE
# fix + Frankfurter FX backend (exchangerate.host free tier deprecated
# in 2026). Built + pushed locally 2026-04-19 while the Woodpecker
# Forgejo webhook remains broken.
image_tag = "92afc38d"
}

View file

@ -138,7 +138,7 @@ Kyverno auto-generates LimitRange + ResourceQuota per namespace based on tier la
- **Proxmox**: 192.168.1.127 (Dell R730, 22c/44t, 142GB RAM)
- **Nodes**: k8s-master (10.0.20.100), node1 (GPU, Tesla T4), node2-4
- **GPU workloads**: \`node_selector = { "gpu": "true" }\` + toleration \`nvidia.com/gpu\`
- **GPU workloads**: \`node_selector = { "nvidia.com/gpu.present" : "true" }\` + toleration \`nvidia.com/gpu\` (label auto-applied by gpu-feature-discovery, no hostname pins)
- **Pull-through cache**: 10.0.20.10 use versioned image tags (cache serves stale :latest manifests)
- **MySQL InnoDB Cluster**: 3 instances on iSCSI
- **SMTP**: \`var.mail_host\` port 587 STARTTLS

View file

@ -30,6 +30,7 @@ module "monitoring" {
haos_api_token = data.vault_kv_secret_v2.secrets.data["haos_api_token"]
pve_password = data.vault_kv_secret_v2.secrets.data["pve_password"]
grafana_admin_password = data.vault_kv_secret_v2.secrets.data["grafana_admin_password"]
kube_config_path = var.kube_config_path
registry_user = data.vault_kv_secret_v2.viktor.data["registry_user"]
registry_password = data.vault_kv_secret_v2.viktor.data["registry_password"]
tier = local.tiers.cluster

View file

@ -0,0 +1,226 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {"type": "datasource", "uid": "grafana"},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "FIRE Retirement Planner — risk-adjusted, tax-minimised Monte Carlo over jurisdictions, withdrawal strategies, and UK-departure years. Backed by fire_planner schema on pg-cluster-rw.",
"editable": true,
"fiscalYearStartMonth": 0,
"id": null,
"templating": {
"list": [
{
"name": "scenario",
"type": "query",
"label": "Scenario",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"query": "SELECT external_id FROM fire_planner.scenario ORDER BY external_id",
"refresh": 1,
"includeAll": false,
"multi": false,
"current": {"selected": false, "text": "cyprus-vpw-leave-y3-glide-rising", "value": "cyprus-vpw-leave-y3-glide-rising"}
}
]
},
"links": [],
"panels": [
{
"id": 1,
"title": "Net worth over time (real + nominal)",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 0},
"fieldConfig": {
"defaults": {"unit": "currencyGBP", "decimals": 0},
"overrides": []
},
"options": {"legend": {"displayMode": "table", "showLegend": true}, "tooltip": {"mode": "multi"}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT snapshot_date AS time, account_name AS metric, SUM(market_value_gbp) AS value FROM fire_planner.account_snapshot WHERE snapshot_date >= NOW() - INTERVAL '10 years' GROUP BY snapshot_date, account_name ORDER BY snapshot_date"
}
]
},
{
"id": 2,
"title": "Monte Carlo fan chart — selected scenario",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 8},
"description": "P10/p25/p50/p75/p90 portfolio value across MC paths, for the scenario picked in the selector at the top.",
"fieldConfig": {"defaults": {"unit": "currencyGBP", "decimals": 0}, "overrides": []},
"options": {"legend": {"displayMode": "table", "showLegend": true}, "tooltip": {"mode": "multi"}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT (DATE_TRUNC('year', NOW()) + (year_idx || ' years')::interval) AS time, 'p10' AS metric, p10_portfolio_gbp AS value FROM fire_planner.projection_yearly p JOIN fire_planner.mc_run r ON r.id = p.mc_run_id JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE s.external_id = '$scenario' UNION ALL SELECT (DATE_TRUNC('year', NOW()) + (year_idx || ' years')::interval), 'p25', p25_portfolio_gbp FROM fire_planner.projection_yearly p JOIN fire_planner.mc_run r ON r.id = p.mc_run_id JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE s.external_id = '$scenario' UNION ALL SELECT (DATE_TRUNC('year', NOW()) + (year_idx || ' years')::interval), 'p50', p50_portfolio_gbp FROM fire_planner.projection_yearly p JOIN fire_planner.mc_run r ON r.id = p.mc_run_id JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE s.external_id = '$scenario' UNION ALL SELECT (DATE_TRUNC('year', NOW()) + (year_idx || ' years')::interval), 'p75', p75_portfolio_gbp FROM fire_planner.projection_yearly p JOIN fire_planner.mc_run r ON r.id = p.mc_run_id JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE s.external_id = '$scenario' UNION ALL SELECT (DATE_TRUNC('year', NOW()) + (year_idx || ' years')::interval), 'p90', p90_portfolio_gbp FROM fire_planner.projection_yearly p JOIN fire_planner.mc_run r ON r.id = p.mc_run_id JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE s.external_id = '$scenario' ORDER BY time"
}
]
},
{
"id": 3,
"title": "Confidence heatmap — jurisdiction × strategy",
"type": "table",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 18},
"description": "Median success rate by (jurisdiction, strategy), averaged across leave-UK years and glide paths.",
"fieldConfig": {
"defaults": {"custom": {"align": "left", "displayMode": "auto"}, "unit": "percentunit", "decimals": 2},
"overrides": []
},
"options": {"showHeader": true},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT jurisdiction, strategy, AVG(success_rate) AS avg_success FROM fire_planner.scenario_summary GROUP BY jurisdiction, strategy ORDER BY jurisdiction, strategy"
}
]
},
{
"id": 4,
"title": "Median lifetime tax — by jurisdiction",
"type": "barchart",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 18},
"fieldConfig": {"defaults": {"unit": "currencyGBP", "decimals": 0}, "overrides": []},
"options": {"orientation": "horizontal", "showValue": "auto", "stacking": "none", "legend": {"displayMode": "list"}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT jurisdiction, AVG(median_lifetime_tax_gbp) AS lifetime_tax FROM fire_planner.scenario_summary GROUP BY jurisdiction ORDER BY lifetime_tax DESC"
}
]
},
{
"id": 5,
"title": "Withdrawal runway — years to ruin (failing paths)",
"type": "table",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 26},
"description": "Among scenarios where some MC paths failed, the median year-to-ruin. Empty where every path survives.",
"fieldConfig": {"defaults": {"unit": "y", "decimals": 1}, "overrides": []},
"options": {"showHeader": true},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT jurisdiction, strategy, leave_uk_year, glide_path, median_years_to_ruin FROM fire_planner.scenario_summary WHERE median_years_to_ruin IS NOT NULL ORDER BY median_years_to_ruin ASC LIMIT 20"
}
]
},
{
"id": 6,
"title": "Optimal leave-UK year",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 4, "w": 6, "x": 12, "y": 26},
"description": "leave_uk_year that maximises success_rate lifetime_tax (tax in £M; small weighting).",
"fieldConfig": {"defaults": {"unit": "none"}, "overrides": []},
"options": {"colorMode": "value", "reduceOptions": {"calcs": ["lastNotNull"]}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT leave_uk_year FROM fire_planner.scenario_summary WHERE jurisdiction <> 'uk' ORDER BY (success_rate - median_lifetime_tax_gbp / 1000000.0) DESC LIMIT 1"
}
]
},
{
"id": 7,
"title": "Median ending wealth — selected scenario",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 4, "w": 6, "x": 18, "y": 26},
"fieldConfig": {"defaults": {"unit": "currencyGBP", "decimals": 0}, "overrides": []},
"options": {"colorMode": "value", "reduceOptions": {"calcs": ["lastNotNull"]}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT p50_ending_gbp FROM fire_planner.scenario_summary WHERE scenario_id = (SELECT id FROM fire_planner.scenario WHERE external_id = '$scenario')"
}
]
},
{
"id": 8,
"title": "Success rate vs spend (UK-stay)",
"type": "barchart",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 30},
"description": "Sanity gauge — UK success rate by strategy, helps anchor expectations against published cFIREsim numbers.",
"fieldConfig": {"defaults": {"unit": "percentunit", "decimals": 2}, "overrides": []},
"options": {"orientation": "horizontal", "showValue": "auto", "legend": {"displayMode": "list"}},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT strategy, AVG(success_rate) AS success FROM fire_planner.scenario_summary WHERE jurisdiction = 'uk' GROUP BY strategy ORDER BY success DESC"
}
]
},
{
"id": 9,
"title": "Sequence-of-returns sensitivity (top failing scenarios)",
"type": "table",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 30},
"description": "Pearson correlation between year-1 portfolio drawdown and overall success — strongly negative ⇒ scenario is sequence-of-returns sensitive (case for the rising-equity glide).",
"fieldConfig": {"defaults": {"unit": "none", "decimals": 4}, "overrides": []},
"options": {"showHeader": true},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "fire-planner-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT s.external_id, r.sequence_risk_correlation, r.success_rate FROM fire_planner.mc_run r JOIN fire_planner.scenario s ON s.id = r.scenario_id WHERE r.id IN (SELECT MAX(id) FROM fire_planner.mc_run GROUP BY scenario_id) ORDER BY r.sequence_risk_correlation ASC LIMIT 15"
}
]
}
],
"schemaVersion": 39,
"tags": ["finance", "fire", "retirement", "monte-carlo"],
"title": "FIRE Planner",
"uid": "fire-planner",
"version": 1,
"weekStart": ""
}

View file

@ -197,12 +197,192 @@
],
"title": "Top roles",
"type": "table"
},
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"description": "Per-company median base salary broken out by seniority level (comp_points, GBP).",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {
"align": "auto",
"cellOptions": {"type": "auto"},
"filterable": true,
"inspect": false
},
"mappings": [],
"thresholds": {"mode": "absolute", "steps": []},
"unit": "currencyGBP"
},
"overrides": []
},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 29},
"id": 6,
"options": {
"cellHeight": "sm",
"footer": {"countRows": false, "fields": "", "reducer": ["sum"], "show": false},
"showHeader": true
},
"targets": [
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT c.display_name AS company, l.slug AS level, percentile_cont(0.5) WITHIN GROUP (ORDER BY cp.base_gbp) AS p50_base_gbp, COUNT(*) AS n FROM job_hunter.comp_points cp JOIN job_hunter.companies c ON cp.company_id = c.id LEFT JOIN job_hunter.levels l ON cp.level_id = l.id WHERE cp.base_gbp IS NOT NULL AND cp.location_bucket IN (${location:sqlstring}) AND (c.slug = ANY(string_to_array(${company:sqlstring}, ',')) OR ${company:sqlstring} = 'all') GROUP BY c.display_name, l.slug ORDER BY c.display_name, l.rank NULLS LAST",
"refId": "A"
}
],
"title": "Per-company salary by level (p50 base)",
"type": "table"
},
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"description": "p50 total comp (base + bonus + RSU/year + sign-on/year) per (company, level).",
"fieldConfig": {
"defaults": {
"color": {"mode": "continuous-GrYlRd"},
"custom": {"align": "center", "cellOptions": {"type": "color-background"}},
"unit": "currencyGBP"
},
"overrides": []
},
"gridPos": {"h": 10, "w": 12, "x": 0, "y": 39},
"id": 7,
"options": {
"cellHeight": "sm",
"footer": {"countRows": false, "fields": "", "reducer": ["sum"], "show": false},
"showHeader": true
},
"targets": [
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT c.display_name AS company, l.slug AS level, percentile_cont(0.5) WITHIN GROUP (ORDER BY COALESCE(cp.base_gbp, 0) + COALESCE(cp.bonus_gbp, 0) + COALESCE(cp.rsu_annual_gbp, 0) + COALESCE(cp.signon_gbp, 0)) AS p50_total_gbp FROM job_hunter.comp_points cp JOIN job_hunter.companies c ON cp.company_id = c.id LEFT JOIN job_hunter.levels l ON cp.level_id = l.id WHERE cp.base_gbp IS NOT NULL AND cp.location_bucket IN (${location:sqlstring}) GROUP BY c.display_name, l.slug ORDER BY c.display_name",
"refId": "A"
}
],
"title": "Total comp heatmap (p50, GBP)",
"type": "table"
},
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"description": "Comp-datapoint ingestion volume by source.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {
"drawStyle": "bars",
"fillOpacity": 60,
"lineWidth": 1,
"stacking": {"mode": "normal"}
}
},
"overrides": []
},
"gridPos": {"h": 10, "w": 12, "x": 12, "y": 39},
"id": 8,
"options": {
"legend": {"displayMode": "list", "placement": "bottom", "showLegend": true},
"tooltip": {"mode": "single", "sort": "none"}
},
"targets": [
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"format": "time_series",
"rawQuery": true,
"rawSql": "SELECT date_trunc('day', fetched_at) AT TIME ZONE 'UTC' AS time, source, COUNT(*) AS value FROM job_hunter.comp_points WHERE $__timeFilter(fetched_at) GROUP BY 1, 2 ORDER BY 1",
"refId": "A"
}
],
"title": "Comp-point volume by source",
"type": "timeseries"
},
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"description": "p50 base salary trend by (company, level) for top 5 companies.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {
"drawStyle": "line",
"lineInterpolation": "linear",
"lineWidth": 2,
"pointSize": 6,
"showPoints": "auto"
},
"unit": "currencyGBP"
},
"overrides": []
},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 49},
"id": 9,
"options": {
"legend": {"displayMode": "table", "placement": "right", "showLegend": true},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"format": "time_series",
"rawQuery": true,
"rawSql": "WITH ranked AS (SELECT c.slug AS company_slug, COUNT(*) AS n FROM job_hunter.comp_points cp JOIN job_hunter.companies c ON cp.company_id = c.id WHERE cp.base_gbp IS NOT NULL AND cp.location_bucket IN (${location:sqlstring}) GROUP BY c.slug ORDER BY n DESC LIMIT 5) SELECT date_trunc('month', cp.effective_date)::timestamp AS time, c.display_name || ' / ' || COALESCE(l.slug, 'unknown') AS metric, percentile_cont(0.5) WITHIN GROUP (ORDER BY cp.base_gbp) AS value FROM job_hunter.comp_points cp JOIN job_hunter.companies c ON cp.company_id = c.id LEFT JOIN job_hunter.levels l ON cp.level_id = l.id WHERE cp.base_gbp IS NOT NULL AND cp.effective_date IS NOT NULL AND cp.location_bucket IN (${location:sqlstring}) AND c.slug IN (SELECT company_slug FROM ranked) AND (l.slug = ${level:sqlstring} OR ${level:sqlstring} = 'all') GROUP BY 1, 2 ORDER BY 1",
"refId": "A"
}
],
"title": "Base-salary trend (p50) — top 5 companies",
"type": "timeseries"
}
],
"refresh": "",
"schemaVersion": 39,
"tags": ["job-hunter", "jobs", "careers"],
"templating": {"list": []},
"templating": {"list": [
{
"current": {"selected": true, "text": ["london"], "value": ["london"]},
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"definition": "SELECT DISTINCT location_bucket FROM job_hunter.comp_points ORDER BY 1",
"includeAll": false,
"label": "Location",
"multi": true,
"name": "location",
"options": [],
"query": "SELECT DISTINCT location_bucket FROM job_hunter.comp_points ORDER BY 1",
"refresh": 1,
"regex": "",
"type": "query"
},
{
"current": {"selected": true, "text": "senior", "value": "senior"},
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"definition": "SELECT slug FROM job_hunter.levels WHERE company_id IS NULL ORDER BY rank",
"includeAll": true,
"allValue": "all",
"label": "Level",
"multi": false,
"name": "level",
"options": [],
"query": "SELECT slug FROM job_hunter.levels WHERE company_id IS NULL ORDER BY rank",
"refresh": 1,
"regex": "",
"type": "query"
},
{
"current": {"selected": true, "text": "all", "value": "all"},
"datasource": {"type": "grafana-postgresql-datasource", "uid": "job-hunter-pg"},
"definition": "SELECT slug FROM job_hunter.companies ORDER BY slug",
"includeAll": true,
"allValue": "all",
"label": "Company",
"multi": true,
"name": "company",
"options": [],
"query": "SELECT slug FROM job_hunter.companies ORDER BY slug",
"refresh": 1,
"regex": "",
"type": "query"
}
]},
"time": {"from": "now-30d", "to": "now"},
"timepicker": {},
"timezone": "browser",

View file

@ -179,7 +179,7 @@
{
"id": 7,
"title": "YTD uses \u2014 where gross went",
"description": "Year-to-date cumulative breakdown of where the gross went. Stacked \u2014 top equals gross_pay minus student loan and RSU offset (both small; shown on Panel 8 Sankey). Green = take-home; red = cash income tax; orange = RSU-attributed income tax + NI; purple = pension.",
"description": "Year-to-date cumulative breakdown of where the gross went. Stacked \u2014 top equals gross_pay minus student loan and RSU offset (both small; shown on Panel 8 Sankey). RSU vest tax broken out at the exact band-aware marginal (PA-taper aware: 60% in the \u00a3100k\u2013\u00a3125,140 zone, 47% additional-rate, etc.) \u2014 see SQL for full bands. Green = take-home; red = cash income tax; orange = tax on RSU vest; orange = cash NI; purple = pension.",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -258,14 +258,14 @@
},
{
"id": "displayName",
"value": "Income Tax (cash pay)"
"value": "Income Tax (cash)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_rsu_income_tax"
"options": "ytd_rsu_tax_marginal"
},
"properties": [
{
@ -277,14 +277,14 @@
},
{
"id": "displayName",
"value": "Income Tax (RSU-attributed)"
"value": "Tax on RSU vest (band-aware marginal)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_ni"
"options": "ytd_cash_ni"
},
"properties": [
{
@ -296,7 +296,7 @@
},
{
"id": "displayName",
"value": "National Insurance"
"value": "National Insurance (cash)"
}
]
},
@ -341,7 +341,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", SUM(net_pay) OVER w AS ytd_net, SUM(COALESCE(cash_income_tax, income_tax)) OVER w AS ytd_cash_income_tax, SUM(income_tax - COALESCE(cash_income_tax, income_tax)) OVER w AS ytd_rsu_income_tax, SUM(national_insurance) OVER w AS ytd_ni, SUM(pension_employee) OVER w AS ytd_pension_employee FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"rawSql": "WITH r AS (SELECT * FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)), ani AS (SELECT *, COALESCE(SUM(gross_pay - COALESCE(pension_sacrifice, 0)) OVER (PARTITION BY tax_year ORDER BY pay_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS ani_prior FROM r), slice AS (SELECT *, ani_prior + gross_pay - COALESCE(rsu_vest, 0) - COALESCE(pension_sacrifice, 0) AS ani_pre, ani_prior + gross_pay - COALESCE(pension_sacrifice, 0) AS ani_post FROM ani), m AS (SELECT *, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.20 + GREATEST(0, LEAST(ani_post, 100000) - GREATEST(ani_pre, 50270)) * 0.40 + GREATEST(0, LEAST(ani_post, 125140) - GREATEST(ani_pre, 100000)) * 0.60 + GREATEST(0, ani_post - GREATEST(ani_pre, 125140)) * 0.45 AS rsu_paye_marginal, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.08 + GREATEST(0, ani_post - GREATEST(ani_pre, 50270)) * 0.02 AS rsu_ni_marginal FROM slice) SELECT pay_date AS \"time\", SUM(net_pay) OVER w AS ytd_net, SUM(GREATEST(0, income_tax - rsu_paye_marginal)) OVER w AS ytd_cash_income_tax, SUM(rsu_paye_marginal + rsu_ni_marginal) OVER w AS ytd_rsu_tax_marginal, SUM(GREATEST(0, national_insurance - rsu_ni_marginal)) OVER w AS ytd_cash_ni, SUM(pension_employee) OVER w AS ytd_pension_employee FROM m WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -352,6 +352,7 @@
{
"id": 2,
"title": "Monthly cash flow (RSU stripped)",
"description": "Cash-only view: gross pay minus the RSU vest (cash_gross) and the bank-deposited net_pay. Tax and NI are not shown here because UK cumulative PAYE genuinely takes a YTD true-up chunk in vest months on top of the marginal RSU PAYE \u2014 see Panel 11 for the full tax breakdown with the band-aware RSU split.",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -422,7 +423,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", (gross_pay - rsu_vest) AS cash_gross, net_pay, COALESCE(cash_income_tax, income_tax) AS income_tax, national_insurance FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date",
"rawSql": "SELECT pay_date AS \"time\", (gross_pay - rsu_vest) AS cash_gross, net_pay FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -433,7 +434,7 @@
{
"id": 3,
"title": "Effective rate & take-home % (YTD cumulative)",
"description": "YTD-cumulative rates. PAYE rate uses reported taxable_pay as the base; all-deductions rate uses gross_pay. Computed from cumulative SUM over the tax year, so vest-month RSU tax is blended proportionally with RSU value \u2014 no per-slip attribution hack, no spikes.",
"description": "YTD-cumulative rates \u2014 three angles on take-home. (1) PAYE rate = SUM(income_tax) / SUM(taxable_pay): the audit number HMRC uses, converges to ~marginal in the additional-rate band. (2) Cash take-home % = SUM(net_pay) / SUM(gross_pay - rsu_vest): what fraction of cash earnings becomes a bank deposit; useful for cash-flow planning. (3) Total keep % = (SUM(net_pay) + SUM(rsu_vest - rsu_paye_marginal - rsu_ni_marginal)) / SUM(gross_pay): true 'what I actually keep' including post-tax RSU shares with the exact band-aware marginal (PA-taper aware). Resets on 6-April tax year boundary.",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -484,7 +485,65 @@
}
}
},
"overrides": []
"overrides": [
{
"matcher": {
"id": "byName",
"options": "ytd_paye_rate_pct"
},
"properties": [
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#C4162A"
}
},
{
"id": "displayName",
"value": "PAYE rate (HMRC, on taxable_pay)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_cash_take_home_pct"
},
"properties": [
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "green"
}
},
{
"id": "displayName",
"value": "Cash take-home % (net / cash_gross)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_total_keep_pct"
},
"properties": [
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "blue"
}
},
{
"id": "displayName",
"value": "Total keep % (cash + post-tax shares)"
}
]
}
]
},
"options": {
"legend": {
@ -506,7 +565,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", ROUND(((SUM(income_tax) OVER w)::numeric / NULLIF(SUM(COALESCE(taxable_pay, gross_pay)) OVER w, 0)) * 100, 2) AS \"ytd_paye_rate_pct\", ROUND((((SUM(income_tax) OVER w) + (SUM(national_insurance) OVER w) + (SUM(student_loan) OVER w))::numeric / NULLIF(SUM(gross_pay) OVER w, 0)) * 100, 2) AS \"ytd_all_deductions_pct\", ROUND(((SUM(net_pay) OVER w)::numeric / NULLIF(SUM(gross_pay) OVER w, 0)) * 100, 2) AS \"ytd_take_home_pct\" FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"rawSql": "WITH r AS (SELECT * FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)), ani AS (SELECT *, COALESCE(SUM(gross_pay - COALESCE(pension_sacrifice, 0)) OVER (PARTITION BY tax_year ORDER BY pay_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS ani_prior FROM r), slice AS (SELECT *, ani_prior + gross_pay - COALESCE(rsu_vest, 0) - COALESCE(pension_sacrifice, 0) AS ani_pre, ani_prior + gross_pay - COALESCE(pension_sacrifice, 0) AS ani_post FROM ani), m AS (SELECT *, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.20 + GREATEST(0, LEAST(ani_post, 100000) - GREATEST(ani_pre, 50270)) * 0.40 + GREATEST(0, LEAST(ani_post, 125140) - GREATEST(ani_pre, 100000)) * 0.60 + GREATEST(0, ani_post - GREATEST(ani_pre, 125140)) * 0.45 AS rsu_paye_marginal, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.08 + GREATEST(0, ani_post - GREATEST(ani_pre, 50270)) * 0.02 AS rsu_ni_marginal FROM slice) SELECT pay_date AS \"time\", ROUND(((SUM(income_tax) OVER w)::numeric / NULLIF(SUM(COALESCE(taxable_pay, gross_pay)) OVER w, 0)) * 100, 2) AS \"ytd_paye_rate_pct\", ROUND(((SUM(net_pay) OVER w)::numeric / NULLIF(SUM(gross_pay - rsu_vest) OVER w, 0)) * 100, 2) AS \"ytd_cash_take_home_pct\", ROUND((((SUM(net_pay) OVER w) + (SUM(rsu_vest - rsu_paye_marginal - rsu_ni_marginal) OVER w))::numeric / NULLIF(SUM(gross_pay) OVER w, 0)) * 100, 2) AS \"ytd_total_keep_pct\" FROM m WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -517,7 +576,7 @@
{
"id": 11,
"title": "Tax & pension \u2014 monthly",
"description": "Per-month deductions and pension contributions. Stacked \u2014 top equals total tax + pension (both sides). Red = cash income tax; orange = RSU-attributed income tax; amber = NI; brown = student loan; purple = employee pension; light purple = employer pension (paid on top of salary).",
"description": "Per-month RSU vest tax + recurring deductions. Cash-side PAYE/NI hidden because UK cumulative PAYE makes them inherently bumpy in vest months despite the marginal RSU strip \u2014 see Panel 12 (YTD cumulative) for the smoothed totals or Panel 3 for the effective rate. Orange = tax on RSU vest at the exact band-aware marginal (PA-taper aware: 60% in \u00a3100k\u2013\u00a3125,140 zone); brown = student loan; purple = employee pension; light purple = employer pension (paid on top of salary).",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -565,26 +624,7 @@
{
"matcher": {
"id": "byName",
"options": "cash_income_tax"
},
"properties": [
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#C4162A"
}
},
{
"id": "displayName",
"value": "Income Tax (cash pay)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "rsu_income_tax"
"options": "rsu_tax_marginal"
},
"properties": [
{
@ -596,26 +636,7 @@
},
{
"id": "displayName",
"value": "Income Tax (RSU-attributed)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ni"
},
"properties": [
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "orange"
}
},
{
"id": "displayName",
"value": "National Insurance"
"value": "Tax on RSU vest (band-aware marginal)"
}
]
},
@ -698,7 +719,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", COALESCE(cash_income_tax, income_tax) AS cash_income_tax, income_tax - COALESCE(cash_income_tax, income_tax) AS rsu_income_tax, national_insurance AS ni, student_loan, pension_employee, pension_employer FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date",
"rawSql": "WITH r AS (SELECT * FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)), ani AS (SELECT *, COALESCE(SUM(gross_pay - COALESCE(pension_sacrifice, 0)) OVER (PARTITION BY tax_year ORDER BY pay_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS ani_prior FROM r), slice AS (SELECT *, ani_prior + gross_pay - COALESCE(rsu_vest, 0) - COALESCE(pension_sacrifice, 0) AS ani_pre, ani_prior + gross_pay - COALESCE(pension_sacrifice, 0) AS ani_post FROM ani), m AS (SELECT *, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.20 + GREATEST(0, LEAST(ani_post, 100000) - GREATEST(ani_pre, 50270)) * 0.40 + GREATEST(0, LEAST(ani_post, 125140) - GREATEST(ani_pre, 100000)) * 0.60 + GREATEST(0, ani_post - GREATEST(ani_pre, 125140)) * 0.45 AS rsu_paye_marginal, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.08 + GREATEST(0, ani_post - GREATEST(ani_pre, 50270)) * 0.02 AS rsu_ni_marginal FROM slice) SELECT pay_date AS \"time\", (rsu_paye_marginal + rsu_ni_marginal) AS rsu_tax_marginal, student_loan, pension_employee, pension_employer FROM m ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -709,7 +730,7 @@
{
"id": 12,
"title": "Tax & pension \u2014 YTD cumulative",
"description": "Year-to-date cumulative tax and pension. Same series and colors as the monthly panel; resets on 6-April tax year boundary.",
"description": "Year-to-date cumulative tax and pension. Same series and colors as the monthly panel \u2014 RSU vest tax broken out at the exact band-aware marginal (PA-taper aware: 60% in \u00a3100k\u2013\u00a3125,140 zone, 47% additional-rate, etc.). Resets on 6-April tax year boundary.",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -769,14 +790,14 @@
},
{
"id": "displayName",
"value": "Income Tax (cash pay)"
"value": "Income Tax (cash)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_rsu_income_tax"
"options": "ytd_rsu_tax_marginal"
},
"properties": [
{
@ -788,14 +809,14 @@
},
{
"id": "displayName",
"value": "Income Tax (RSU-attributed)"
"value": "Tax on RSU vest (band-aware marginal)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "ytd_ni"
"options": "ytd_cash_ni"
},
"properties": [
{
@ -807,7 +828,7 @@
},
{
"id": "displayName",
"value": "National Insurance"
"value": "National Insurance (cash)"
}
]
},
@ -890,7 +911,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", SUM(COALESCE(cash_income_tax, income_tax)) OVER w AS ytd_cash_income_tax, SUM(income_tax - COALESCE(cash_income_tax, income_tax)) OVER w AS ytd_rsu_income_tax, SUM(national_insurance) OVER w AS ytd_ni, SUM(student_loan) OVER w AS ytd_student_loan, SUM(pension_employee) OVER w AS ytd_pension_employee, SUM(pension_employer) OVER w AS ytd_pension_employer FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"rawSql": "WITH r AS (SELECT * FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)), ani AS (SELECT *, COALESCE(SUM(gross_pay - COALESCE(pension_sacrifice, 0)) OVER (PARTITION BY tax_year ORDER BY pay_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS ani_prior FROM r), slice AS (SELECT *, ani_prior + gross_pay - COALESCE(rsu_vest, 0) - COALESCE(pension_sacrifice, 0) AS ani_pre, ani_prior + gross_pay - COALESCE(pension_sacrifice, 0) AS ani_post FROM ani), m AS (SELECT *, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.20 + GREATEST(0, LEAST(ani_post, 100000) - GREATEST(ani_pre, 50270)) * 0.40 + GREATEST(0, LEAST(ani_post, 125140) - GREATEST(ani_pre, 100000)) * 0.60 + GREATEST(0, ani_post - GREATEST(ani_pre, 125140)) * 0.45 AS rsu_paye_marginal, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.08 + GREATEST(0, ani_post - GREATEST(ani_pre, 50270)) * 0.02 AS rsu_ni_marginal FROM slice) SELECT pay_date AS \"time\", SUM(GREATEST(0, income_tax - rsu_paye_marginal)) OVER w AS ytd_cash_income_tax, SUM(rsu_paye_marginal + rsu_ni_marginal) OVER w AS ytd_rsu_tax_marginal, SUM(GREATEST(0, national_insurance - rsu_ni_marginal)) OVER w AS ytd_cash_ni, SUM(student_loan) OVER w AS ytd_student_loan, SUM(pension_employee) OVER w AS ytd_pension_employee, SUM(pension_employer) OVER w AS ytd_pension_employer FROM m WINDOW w AS (PARTITION BY tax_year ORDER BY pay_date) ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -911,7 +932,7 @@
"h": 9,
"w": 24,
"x": 0,
"y": 29
"y": 39
},
"fieldConfig": {
"defaults": {
@ -921,20 +942,20 @@
"unit": "currencyGBP",
"custom": {
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 70,
"drawStyle": "bars",
"fillOpacity": 100,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineWidth": 1,
"lineWidth": 0,
"pointSize": 4,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
@ -1028,7 +1049,8 @@
"legend": {
"calcs": [
"last",
"max"
"max",
"sum"
],
"displayMode": "table",
"placement": "bottom"
@ -1064,7 +1086,7 @@
"h": 6,
"w": 24,
"x": 0,
"y": 38
"y": 62
},
"fieldConfig": {
"defaults": {
@ -1238,7 +1260,7 @@
"h": 14,
"w": 24,
"x": 0,
"y": 44
"y": 68
},
"fieldConfig": {
"defaults": {
@ -1414,7 +1436,7 @@
"h": 12,
"w": 24,
"x": 0,
"y": 58
"y": 82
},
"fieldConfig": {
"defaults": {
@ -1724,7 +1746,7 @@
"h": 14,
"w": 24,
"x": 0,
"y": 70
"y": 48
},
"options": {
"monochrome": false,
@ -1745,7 +1767,7 @@
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH agg AS (SELECT COALESCE(SUM(salary), 0) AS salary, COALESCE(SUM(bonus), 0) AS bonus, COALESCE(SUM(rsu_vest), 0) AS rsu_vest, COALESCE(SUM(GREATEST(gross_pay - salary - bonus - rsu_vest, 0)), 0) AS other_income, COALESCE(SUM(net_pay), 0) AS net_pay, COALESCE(SUM(COALESCE(cash_income_tax, income_tax)), 0) AS cash_income_tax, COALESCE(SUM(income_tax - COALESCE(cash_income_tax, income_tax)), 0) AS rsu_income_tax, COALESCE(SUM(national_insurance), 0) AS ni, COALESCE(SUM(pension_employee), 0) AS pension, COALESCE(SUM(student_loan), 0) AS student_loan, COALESCE(SUM(rsu_offset), 0) AS rsu_offset FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)) SELECT 'Salary' AS source, 'Gross' AS target, salary AS value FROM agg WHERE salary > 0 UNION ALL SELECT 'Bonus', 'Gross', bonus FROM agg WHERE bonus > 0 UNION ALL SELECT 'RSU', 'Gross', rsu_vest FROM agg WHERE rsu_vest > 0 UNION ALL SELECT 'Other income', 'Gross', other_income FROM agg WHERE other_income > 0 UNION ALL SELECT 'Gross', 'Net pay', net_pay FROM agg WHERE net_pay > 0 UNION ALL SELECT 'Gross', 'Income Tax (cash)', cash_income_tax FROM agg WHERE cash_income_tax > 0 UNION ALL SELECT 'Gross', 'Income Tax (RSU)', rsu_income_tax FROM agg WHERE rsu_income_tax > 0 UNION ALL SELECT 'Gross', 'National Insurance', ni FROM agg WHERE ni > 0 UNION ALL SELECT 'Gross', 'Pension', pension FROM agg WHERE pension > 0 UNION ALL SELECT 'Gross', 'Student Loan', student_loan FROM agg WHERE student_loan > 0 UNION ALL SELECT 'Gross', 'RSU Offset', rsu_offset FROM agg WHERE rsu_offset > 0"
"rawSql": "WITH r AS (SELECT * FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date)), ani AS (SELECT *, COALESCE(SUM(gross_pay - COALESCE(pension_sacrifice, 0)) OVER (PARTITION BY tax_year ORDER BY pay_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS ani_prior FROM r), slice AS (SELECT *, ani_prior + gross_pay - COALESCE(rsu_vest, 0) - COALESCE(pension_sacrifice, 0) AS ani_pre, ani_prior + gross_pay - COALESCE(pension_sacrifice, 0) AS ani_post FROM ani), m AS (SELECT *, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.20 + GREATEST(0, LEAST(ani_post, 100000) - GREATEST(ani_pre, 50270)) * 0.40 + GREATEST(0, LEAST(ani_post, 125140) - GREATEST(ani_pre, 100000)) * 0.60 + GREATEST(0, ani_post - GREATEST(ani_pre, 125140)) * 0.45 AS rsu_paye_marginal, GREATEST(0, LEAST(ani_post, 12570) - GREATEST(ani_pre, 0)) * 0.00 + GREATEST(0, LEAST(ani_post, 50270) - GREATEST(ani_pre, 12570)) * 0.08 + GREATEST(0, ani_post - GREATEST(ani_pre, 50270)) * 0.02 AS rsu_ni_marginal FROM slice), agg AS (SELECT COALESCE(SUM(salary), 0) AS salary, COALESCE(SUM(bonus), 0) AS bonus, COALESCE(SUM(rsu_vest), 0) AS rsu_vest, COALESCE(SUM(GREATEST(gross_pay - salary - bonus - rsu_vest, 0)), 0) AS other_income, COALESCE(SUM(net_pay), 0) AS net_pay, COALESCE(SUM(GREATEST(0, income_tax - rsu_paye_marginal)), 0) AS cash_income_tax, COALESCE(SUM(rsu_paye_marginal + rsu_ni_marginal), 0) AS rsu_tax_marginal, COALESCE(SUM(GREATEST(0, national_insurance - rsu_ni_marginal)), 0) AS cash_ni, COALESCE(SUM(pension_employee), 0) AS pension, COALESCE(SUM(student_loan), 0) AS student_loan, COALESCE(SUM(rsu_offset), 0) AS rsu_offset FROM m) SELECT 'Salary' AS source, 'Gross' AS target, salary AS value FROM agg WHERE salary > 0 UNION ALL SELECT 'Bonus', 'Gross', bonus FROM agg WHERE bonus > 0 UNION ALL SELECT 'RSU', 'Gross', rsu_vest FROM agg WHERE rsu_vest > 0 UNION ALL SELECT 'Other income', 'Gross', other_income FROM agg WHERE other_income > 0 UNION ALL SELECT 'Gross', 'Net pay', net_pay FROM agg WHERE net_pay > 0 UNION ALL SELECT 'Gross', 'Income Tax (cash)', cash_income_tax FROM agg WHERE cash_income_tax > 0 UNION ALL SELECT 'Gross', 'Tax on RSU vest', rsu_tax_marginal FROM agg WHERE rsu_tax_marginal > 0 UNION ALL SELECT 'Gross', 'National Insurance (cash)', cash_ni FROM agg WHERE cash_ni > 0 UNION ALL SELECT 'Gross', 'Pension', pension FROM agg WHERE pension > 0 UNION ALL SELECT 'Gross', 'Student Loan', student_loan FROM agg WHERE student_loan > 0 UNION ALL SELECT 'Gross', 'RSU Offset', rsu_offset FROM agg WHERE rsu_offset > 0"
}
]
},
@ -1762,7 +1784,7 @@
"h": 10,
"w": 24,
"x": 0,
"y": 84
"y": 94
},
"fieldConfig": {
"defaults": {
@ -1908,116 +1930,142 @@
]
},
{
"id": 10,
"title": "HMRC Tax Year Reconciliation \u2014 Individual Tax API",
"description": "Latest snapshot from HMRC Individual Tax API v1.1 vs SUM(payslip.income_tax) per tax year. Delta > \u00a310 turns red \u2014 that's parser drift vs HMRC's held figures, the authoritative ground truth. Shown only for years where hmrc-sync has pulled a snapshot.",
"type": "table",
"id": 16,
"title": "Yearly receipt \u2014 gross income per tax year",
"description": "One stacked bar per tax year showing all gross income components: salary (cash, post-pension-sacrifice), pension (salary-sacrifice \u2014 untaxed but real income), bonus, and RSU vest gross. Bar total = pre-sacrifice gross compensation. Aligns with P60: bar \u2212 pension_sacrifice \u2248 ytd_gross reported on the final March payslip / P60. Where the parser correctly captured bonus into gross_pay (every year except 2023/24 and 2024/25 \u2014 March payslip parsing bug), the match is exact.",
"type": "barchart",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 10,
"w": 24,
"w": 12,
"x": 0,
"y": 94
"y": 29
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "currencyGBP",
"custom": {
"align": "right",
"displayMode": "auto"
"axisPlacement": "auto",
"axisLabel": "",
"axisCenteredZero": false,
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 1,
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"thresholdsStyle": {
"mode": "off"
}
}
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "^delta_"
"id": "byName",
"options": "salary_cash"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "thresholds",
"id": "color",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "green",
"value": -10
},
{
"color": "red",
"value": 10
},
{
"color": "red",
"value": -10
}
]
"mode": "fixed",
"fixedColor": "green"
}
},
{
"id": "displayName",
"value": "Salary (cash, post-sacrifice)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "tax_year"
"options": "pension_sacrifice"
},
"properties": [
{
"id": "unit",
"value": "string"
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#CE96D8"
}
},
{
"id": "custom.align",
"value": "left"
"id": "displayName",
"value": "Pension (salary sacrifice, untaxed)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "employer_paye_ref"
"options": "bonus"
},
"properties": [
{
"id": "unit",
"value": "string"
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#FADE2A"
}
},
{
"id": "custom.align",
"value": "left"
"id": "displayName",
"value": "Bonus (gross)"
}
]
},
{
"matcher": {
"id": "byName",
"options": "snapshot_date"
"options": "rsu_gross"
},
"properties": [
{
"id": "unit",
"value": "dateTimeAsIso"
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#3274D9"
}
},
{
"id": "displayName",
"value": "RSU vest (gross)"
}
]
}
]
},
"options": {
"showHeader": true,
"cellHeight": "sm",
"footer": {
"show": false
"barRadius": 0,
"barWidth": 0.6,
"groupWidth": 0.7,
"orientation": "auto",
"showValue": "auto",
"stacking": "normal",
"xField": "tax_year",
"xTickLabelRotation": 0,
"xTickLabelSpacing": 0,
"legend": {
"calcs": [
"sum"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@ -2030,7 +2078,74 @@
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH latest AS (SELECT DISTINCT ON (tax_year, employer_paye_ref) tax_year, employer_paye_ref, snapshot_date, gross_pay, income_tax, ni_contributions FROM hmrc_sync.tax_year_snapshot ORDER BY tax_year, employer_paye_ref, snapshot_date DESC), summed AS (SELECT tax_year, COALESCE(SUM(gross_pay), 0) AS sum_gross, COALESCE(SUM(income_tax), 0) AS sum_tax, COALESCE(SUM(national_insurance), 0) AS sum_ni FROM payslip_ingest.payslip GROUP BY tax_year) SELECT l.tax_year, l.employer_paye_ref, l.snapshot_date, l.gross_pay AS hmrc_gross, s.sum_gross AS computed_gross, (l.gross_pay - s.sum_gross) AS delta_gross, l.income_tax AS hmrc_tax, s.sum_tax AS computed_tax, (l.income_tax - s.sum_tax) AS delta_tax, l.ni_contributions AS hmrc_ni, s.sum_ni AS computed_ni, (l.ni_contributions - s.sum_ni) AS delta_ni FROM latest l LEFT JOIN summed s ON s.tax_year = l.tax_year ORDER BY l.tax_year DESC"
"rawSql": "SELECT tax_year, SUM(salary - COALESCE(pension_sacrifice, 0)) AS salary_cash, SUM(COALESCE(pension_sacrifice, 0)) AS pension_sacrifice, SUM(bonus) AS bonus, SUM(rsu_vest) AS rsu_gross FROM payslip_ingest.payslip GROUP BY tax_year ORDER BY tax_year"
}
]
},
{
"id": 17,
"title": "YTD gross salary \u2014 year-over-year comparison",
"description": "Cumulative gross pay built up month by month within each UK tax year (April \u2192 March). One line per tax year. Pay dates are projected onto a sliding 12-month window ending now, so years overlay cleanly without falling outside the dashboard's time range. X-axis shows month-of-tax-year (April first, March last).",
"type": "timeseries",
"timeFrom": "13M",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 29
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "currencyGBP",
"decimals": 0,
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 0,
"pointSize": 5,
"showPoints": "auto",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {
"group": "A",
"mode": "none"
}
}
},
"overrides": []
},
"options": {
"legend": {
"calcs": [
"last",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "WITH projected AS (SELECT ((CURRENT_DATE - INTERVAL '12 months')::date + (pay_date - MAKE_DATE(SUBSTRING(tax_year, 1, 4)::int, 4, 6)))::timestamp AS t, tax_year, SUM(gross_pay) OVER (PARTITION BY tax_year ORDER BY pay_date) AS ytd FROM payslip_ingest.payslip) SELECT t AS \"time\", tax_year AS metric, ytd AS ytd_gross FROM projected ORDER BY t, tax_year"
}
]
}
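The projection in panel 17's rawSql is the only non-obvious part of the query: each pay_date is rebased to days-since-6-April of its own tax year, then added to a common anchor twelve months back from today, so every tax year lands in the same x-range. A minimal sketch with two example pay dates from different years:

-- Both Septembers land 177 days after the window start, i.e. the two years overlay.
SELECT pay_date,
       pay_date - tax_year_start                      AS days_into_tax_year,
       (CURRENT_DATE - INTERVAL '12 months')::date
         + (pay_date - tax_year_start)                AS projected_x
FROM (VALUES
  (DATE '2023-09-30', DATE '2023-04-06'),
  (DATE '2025-09-30', DATE '2025-04-06')
) AS v(pay_date, tax_year_start);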

View file

@ -0,0 +1,671 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {"type": "datasource", "uid": "grafana"},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Wealth — net worth, contributions, and growth over time. Backed by the wealthfolio_sync PG mirror of Wealthfolio's SQLite, refreshed hourly by the pg-sync sidecar.",
"editable": true,
"fiscalYearStartMonth": 0,
"id": null,
"links": [],
"panels": [
{
"id": 1,
"title": "Net worth (current)",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 4, "w": 5, "x": 0, "y": 0},
"fieldConfig": {
"defaults": {
"unit": "currencyGBP",
"color": {"mode": "fixed", "fixedColor": "green"},
"decimals": 0
},
"overrides": []
},
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {"calcs": ["lastNotNull"], "fields": "", "values": false},
"textMode": "auto"
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT SUM(total_value) AS net_worth FROM daily_account_valuation WHERE valuation_date = (SELECT MAX(valuation_date) FROM daily_account_valuation)"
}
]
},
{
"id": 2,
"title": "Net contribution (cumulative)",
"description": "Total deposits minus withdrawals across all accounts.",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 4, "w": 5, "x": 5, "y": 0},
"fieldConfig": {
"defaults": {
"unit": "currencyGBP",
"color": {"mode": "fixed", "fixedColor": "blue"},
"decimals": 0
},
"overrides": []
},
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {"calcs": ["lastNotNull"], "fields": "", "values": false},
"textMode": "auto"
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT SUM(net_contribution) AS contribution FROM daily_account_valuation WHERE valuation_date = (SELECT MAX(valuation_date) FROM daily_account_valuation)"
}
]
},
{
"id": 3,
"title": "Growth (unrealised)",
"description": "Net worth minus net contribution — the gain on everything you've put in.",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 4, "w": 5, "x": 10, "y": 0},
"fieldConfig": {
"defaults": {
"unit": "currencyGBP",
"color": {"mode": "thresholds"},
"decimals": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "green", "value": 0}
]
}
},
"overrides": []
},
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {"calcs": ["lastNotNull"], "fields": "", "values": false},
"textMode": "auto"
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT (SUM(total_value) - SUM(net_contribution)) AS growth FROM daily_account_valuation WHERE valuation_date = (SELECT MAX(valuation_date) FROM daily_account_valuation)"
}
]
},
{
"id": 4,
"title": "ROI %",
"description": "Growth / net contribution × 100. Excludes accounts with zero/negative contribution (Schwab) to avoid distortion.",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 4, "w": 5, "x": 15, "y": 0},
"fieldConfig": {
"defaults": {
"unit": "percent",
"color": {"mode": "thresholds"},
"decimals": 1,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 0},
{"color": "green", "value": 5}
]
}
},
"overrides": []
},
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {"calcs": ["lastNotNull"], "fields": "", "values": false},
"textMode": "auto"
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH latest AS (SELECT * FROM daily_account_valuation WHERE valuation_date = (SELECT MAX(valuation_date) FROM daily_account_valuation) AND net_contribution > 0) SELECT (SUM(total_value - net_contribution) / NULLIF(SUM(net_contribution), 0) * 100) AS roi_pct FROM latest"
}
]
},
{
"id": 5,
"title": "Net worth — total over time",
"description": "Daily total_value summed across all accounts (base GBP).",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 4},
"fieldConfig": {
"defaults": {
"color": {"mode": "fixed", "fixedColor": "green"},
"unit": "currencyGBP",
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 20,
"pointSize": 4,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {"group": "A", "mode": "none"}
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "net_worth"},
"properties": [{"id": "displayName", "value": "Net worth"}]
}
]
},
"options": {
"legend": {"calcs": ["last", "max"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT valuation_date::timestamp AS \"time\", SUM(total_value) AS net_worth FROM daily_account_valuation WHERE $__timeFilter(valuation_date) GROUP BY valuation_date ORDER BY valuation_date"
}
]
},
{
"id": 6,
"title": "Net contribution vs market value",
"description": "Net contribution = cumulative deposits withdrawals. Market value = total_value (cash + investments). Gap between the two = unrealised growth.",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 10, "w": 12, "x": 0, "y": 14},
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"unit": "currencyGBP",
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 0,
"pointSize": 4,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {"group": "A", "mode": "none"}
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "market_value"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "green"}},
{"id": "displayName", "value": "Market value"}
]
},
{
"matcher": {"id": "byName", "options": "net_contribution"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "blue"}},
{"id": "displayName", "value": "Net contribution"}
]
}
]
},
"options": {
"legend": {"calcs": ["last"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT valuation_date::timestamp AS \"time\", SUM(net_contribution) AS net_contribution, SUM(total_value) AS market_value FROM daily_account_valuation WHERE $__timeFilter(valuation_date) GROUP BY valuation_date ORDER BY valuation_date"
}
]
},
{
"id": 7,
"title": "Growth (market value contribution) over time",
"description": "Unrealised gain across all accounts. Filled area to emphasise the wealth created above the contributed capital.",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 10, "w": 12, "x": 12, "y": 14},
"fieldConfig": {
"defaults": {
"color": {"mode": "fixed", "fixedColor": "#56A64B"},
"unit": "currencyGBP",
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 50,
"gradientMode": "opacity",
"pointSize": 4,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {"group": "A", "mode": "none"}
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "growth"},
"properties": [{"id": "displayName", "value": "Growth"}]
}
]
},
"options": {
"legend": {"calcs": ["last", "max"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT valuation_date::timestamp AS \"time\", (SUM(total_value) - SUM(net_contribution)) AS growth FROM daily_account_valuation WHERE $__timeFilter(valuation_date) GROUP BY valuation_date ORDER BY valuation_date"
}
]
},
{
"id": 8,
"title": "Per-account stacked — total value",
"description": "Stacked area showing each account's contribution to total net worth over time. Useful for spotting which account drives the trajectory.",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 11, "w": 24, "x": 0, "y": 24},
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"unit": "currencyGBP",
"custom": {
"drawStyle": "line",
"lineWidth": 1,
"fillOpacity": 70,
"pointSize": 3,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {"group": "A", "mode": "normal"}
}
},
"overrides": []
},
"options": {
"legend": {"calcs": ["last"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT d.valuation_date::timestamp AS \"time\", a.name AS metric, d.total_value AS value FROM daily_account_valuation d JOIN accounts a ON a.id = d.account_id WHERE $__timeFilter(d.valuation_date) ORDER BY d.valuation_date, a.name"
}
]
},
{
"id": 9,
"title": "Cash vs invested (stacked)",
"description": "Daily breakdown of uninvested broker cash vs market value of investments. WORKPLACE_PENSION accounts (Fidelity) are reclassified entirely as invested — Wealthfolio dumps pension wrappers into cash_balance because it doesn't track the underlying fund holdings, but they are not actually cash.",
"type": "timeseries",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 35},
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"unit": "currencyGBP",
"custom": {
"drawStyle": "line",
"lineWidth": 1,
"fillOpacity": 70,
"pointSize": 3,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {"group": "A", "mode": "normal"}
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "cash"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "#FADE2A"}},
{"id": "displayName", "value": "Cash"}
]
},
{
"matcher": {"id": "byName", "options": "invested"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "#56A64B"}},
{"id": "displayName", "value": "Invested"}
]
}
]
},
"options": {
"legend": {"calcs": ["last"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT d.valuation_date::timestamp AS \"time\", SUM(CASE WHEN a.account_type = 'WORKPLACE_PENSION' THEN 0 ELSE d.cash_balance END) AS cash, SUM(CASE WHEN a.account_type = 'WORKPLACE_PENSION' THEN d.cash_balance + d.investment_market_value ELSE d.investment_market_value END) AS invested FROM daily_account_valuation d JOIN accounts a ON a.id = d.account_id WHERE $__timeFilter(d.valuation_date) GROUP BY d.valuation_date ORDER BY d.valuation_date"
}
]
},
{
"id": 10,
"title": "Activity log",
"description": "Recent activities (BUY / SELL / DEPOSIT / WITHDRAWAL / DIVIDEND / etc.) across all accounts. Limited to 100 most recent.",
"type": "table",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 14, "w": 24, "x": 0, "y": 77},
"fieldConfig": {
"defaults": {
"custom": {"align": "auto", "displayMode": "auto"}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "amount"},
"properties": [{"id": "unit", "value": "currencyGBP"}]
}
]
},
"options": {
"cellHeight": "sm",
"footer": {"show": false}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT a.activity_date AS \"date\", acc.name AS \"account\", a.activity_type AS \"type\", a.asset_id AS \"asset\", a.quantity AS \"qty\", a.unit_price AS \"unit_price\", a.amount AS \"amount\", a.currency AS \"ccy\", a.notes AS \"notes\" FROM activities a LEFT JOIN accounts acc ON acc.id = a.account_id WHERE $__timeFilter(a.activity_date) ORDER BY a.activity_date DESC LIMIT 100"
}
]
},
{
"id": 11,
"title": "12mo return",
"description": "Modified-Dietz return over the trailing 12 months: market_gain / (nw_12mo_ago + 0.5 × contributions_12mo). Excludes new money in — answers 'how did my investments perform' rather than 'how much did my net worth change'.",
"type": "stat",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 4, "w": 4, "x": 20, "y": 0},
"fieldConfig": {
"defaults": {
"unit": "percent",
"color": {"mode": "thresholds"},
"decimals": 2,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 0},
{"color": "green", "value": 5}
]
}
},
"overrides": []
},
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {"calcs": ["lastNotNull"], "fields": "", "values": false},
"textMode": "auto"
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH bounds AS (SELECT (SELECT MAX(valuation_date) FROM daily_account_valuation) AS d_now, (SELECT MIN(valuation_date) FROM daily_account_valuation WHERE valuation_date >= (SELECT MAX(valuation_date) - INTERVAL '12 months' FROM daily_account_valuation)) AS d_ago), agg AS (SELECT (SELECT SUM(total_value) FROM daily_account_valuation WHERE valuation_date = b.d_now) AS nw_now, (SELECT SUM(net_contribution) FROM daily_account_valuation WHERE valuation_date = b.d_now) AS contrib_now, (SELECT SUM(total_value) FROM daily_account_valuation WHERE valuation_date = b.d_ago) AS nw_ago, (SELECT SUM(net_contribution) FROM daily_account_valuation WHERE valuation_date = b.d_ago) AS contrib_ago FROM bounds b) SELECT ROUND((((nw_now - nw_ago - (contrib_now - contrib_ago)) / NULLIF(nw_ago + 0.5 * (contrib_now - contrib_ago), 0)) * 100)::numeric, 2) AS pct_12mo FROM agg"
}
]
},
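For reference, the Modified-Dietz arithmetic behind the stat above reduces to one expression once the four snapshot sums are in hand. A minimal sketch on made-up values (nw = summed total_value, contrib = summed net_contribution at the two boundary dates):

WITH snap(nw_ago, nw_now, contrib_ago, contrib_now) AS (
  VALUES (400000.0, 480000.0, 100000.0, 140000.0)
)
SELECT ROUND(
         (nw_now - nw_ago - (contrib_now - contrib_ago))             -- market gain, new money stripped out
         / NULLIF(nw_ago + 0.5 * (contrib_now - contrib_ago), 0)     -- average capital at work over the year
         * 100, 2) AS pct_12mo
FROM snap;
-- returns 9.52: of the 80k net-worth increase, 40k was new money, the rest is return on roughly 420k at work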
{
"id": 12,
"title": "Yearly investment return %",
"description": "Modified-Dietz return per calendar year: market_gain / (nw_start + 0.5 × contributions). Pure investment performance — excludes new contributions, so a £100k vest doesn't show as 100% growth. Negative bars = market losses (e.g., 2022 bear market).",
"type": "barchart",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 11, "w": 24, "x": 0, "y": 45},
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"unit": "percent",
"decimals": 1,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 0},
{"color": "green", "value": 5}
]
},
"custom": {
"axisPlacement": "auto",
"axisLabel": "",
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 1
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "year"},
"properties": [
{"id": "unit", "value": "string"}
]
}
]
},
"options": {
"barRadius": 0,
"barWidth": 0.6,
"groupWidth": 0.7,
"orientation": "auto",
"showValue": "always",
"stacking": "none",
"xField": "year",
"xTickLabelRotation": 0,
"legend": {"displayMode": "list", "placement": "bottom"},
"tooltip": {"mode": "single", "sort": "none"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH yearly AS (SELECT EXTRACT(YEAR FROM valuation_date)::int AS yr, valuation_date, SUM(total_value) AS nw, SUM(net_contribution) AS contrib FROM daily_account_valuation GROUP BY valuation_date), endpoints AS (SELECT yr, (array_agg(nw ORDER BY valuation_date ASC))[1] AS nw_start, (array_agg(nw ORDER BY valuation_date DESC))[1] AS nw_end, (array_agg(contrib ORDER BY valuation_date ASC))[1] AS contrib_start, (array_agg(contrib ORDER BY valuation_date DESC))[1] AS contrib_end FROM yearly GROUP BY yr) SELECT yr::text AS year, ROUND((((nw_end - nw_start - (contrib_end - contrib_start)) / NULLIF(nw_start + 0.5 * (contrib_end - contrib_start), 0)) * 100)::numeric, 2) AS return_pct FROM endpoints WHERE (nw_start + 0.5 * (contrib_end - contrib_start)) > 0 ORDER BY yr"
}
]
},
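The endpoints CTE in the yearly-return query above leans on ordered array_agg rather than window functions to grab the first and last valuation of each year in a single GROUP BY pass. A minimal sketch of that idiom on a made-up three-day series straddling a year boundary:

WITH daily(d, nw) AS (VALUES
  (DATE '2024-12-30', 500000.0),
  (DATE '2024-12-31', 505000.0),
  (DATE '2025-01-02', 498000.0)
)
SELECT EXTRACT(YEAR FROM d)::int          AS yr,
       (array_agg(nw ORDER BY d ASC))[1]  AS nw_start,
       (array_agg(nw ORDER BY d DESC))[1] AS nw_end
FROM daily
GROUP BY yr
ORDER BY yr;
-- 2024 starts at 500000 and ends at 505000; 2025 has a single day, so start = end = 498000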
{
"id": 13,
"title": "Annual change decomposition — contributions vs market gain",
"description": "Each calendar year's net worth change split into 'new money in' (contributions withdrawals) and 'market gain' (everything else: price appreciation, dividends, etc.). Shows whether you grew because you saved or because the market did the work. Negative bars = withdrawals or market losses.",
"type": "barchart",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 11, "w": 24, "x": 0, "y": 56},
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"unit": "currencyGBP",
"decimals": 0,
"custom": {
"axisPlacement": "auto",
"axisLabel": "",
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 1
}
},
"overrides": [
{
"matcher": {"id": "byName", "options": "year"},
"properties": [
{"id": "unit", "value": "string"}
]
},
{
"matcher": {"id": "byName", "options": "contributions"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "blue"}},
{"id": "displayName", "value": "Net contributions"}
]
},
{
"matcher": {"id": "byName", "options": "market_gain"},
"properties": [
{"id": "color", "value": {"mode": "fixed", "fixedColor": "#56A64B"}},
{"id": "displayName", "value": "Market gain"}
]
}
]
},
"options": {
"barRadius": 0,
"barWidth": 0.6,
"groupWidth": 0.7,
"orientation": "auto",
"showValue": "auto",
"stacking": "normal",
"xField": "year",
"xTickLabelRotation": 0,
"legend": {"calcs": ["sum"], "displayMode": "table", "placement": "bottom"},
"tooltip": {"mode": "multi", "sort": "desc"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "WITH yearly AS (SELECT EXTRACT(YEAR FROM valuation_date)::int AS yr, valuation_date, SUM(total_value) AS nw, SUM(net_contribution) AS contrib FROM daily_account_valuation GROUP BY valuation_date), endpoints AS (SELECT yr, (array_agg(nw ORDER BY valuation_date ASC))[1] AS nw_start, (array_agg(nw ORDER BY valuation_date DESC))[1] AS nw_end, (array_agg(contrib ORDER BY valuation_date ASC))[1] AS contrib_start, (array_agg(contrib ORDER BY valuation_date DESC))[1] AS contrib_end FROM yearly GROUP BY yr) SELECT yr::text AS year, ROUND((contrib_end - contrib_start)::numeric, 0) AS contributions, ROUND((nw_end - nw_start - (contrib_end - contrib_start))::numeric, 0) AS market_gain FROM endpoints ORDER BY yr"
}
]
},
{
"id": 14,
"title": "Per-account ROI %",
"description": "(market value net contribution) / net contribution × 100, latest snapshot. Excludes accounts with zero/negative net contribution (Schwab — RSU vests sold = negative contribution distorts the ratio). Pension shows 0% because Wealthfolio doesn't track underlying fund holdings, so cost_basis = 0 and 'growth' is just the cash balance reported.",
"type": "barchart",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"gridPos": {"h": 10, "w": 24, "x": 0, "y": 67},
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"unit": "percent",
"decimals": 1,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 0},
{"color": "green", "value": 10}
]
},
"custom": {
"axisPlacement": "auto",
"axisLabel": "",
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 1
}
},
"overrides": []
},
"options": {
"barRadius": 0,
"barWidth": 0.6,
"groupWidth": 0.7,
"orientation": "horizontal",
"showValue": "always",
"stacking": "none",
"xField": "account",
"legend": {"displayMode": "list", "placement": "bottom"},
"tooltip": {"mode": "single", "sort": "none"}
},
"targets": [
{
"refId": "A",
"datasource": {"type": "grafana-postgresql-datasource", "uid": "wealth-pg"},
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT a.name AS account, ROUND(((d.total_value - d.net_contribution) / NULLIF(d.net_contribution, 0) * 100)::numeric, 2) AS roi_pct FROM daily_account_valuation d JOIN accounts a ON a.id = d.account_id WHERE d.valuation_date = (SELECT MAX(valuation_date) FROM daily_account_valuation) AND d.net_contribution > 0 ORDER BY roi_pct DESC"
}
]
}
],
"refresh": "5m",
"schemaVersion": 39,
"tags": ["finance", "personal", "wealth"],
"templating": {"list": []},
"time": {"from": "now-5y", "to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Wealth",
"uid": "wealth",
"version": 1
}

View file

@ -134,9 +134,19 @@ locals {
# Applications
"qbittorrent.json" = "Applications"
"realestate-crawler.json" = "Applications"
"uk-payslip.json" = "Finance"
"uk-payslip.json" = "Finance (Personal)"
"wealth.json" = "Finance (Personal)"
"job-hunter.json" = "Finance"
"fire-planner.json" = "Finance"
}
# Folders restricted to the Grafana admin user (anonymous Viewer + any future
# non-admin users are denied). Permission set by null_resource below via the
# Grafana folder permissions API after the dashboard sidecar auto-creates the
# folder. Server-admin always retains access regardless of folder ACL.
admin_only_folders = [
"Finance (Personal)",
]
}
resource "kubernetes_config_map" "grafana_dashboards" {
@ -157,6 +167,60 @@ resource "kubernetes_config_map" "grafana_dashboards" {
}
}
# Lock down "admin only" folders via Grafana folder permissions API.
# Default org-role inheritance gives Viewer + Editor read access to every
# folder; explicitly setting the folder ACL to {Admin: 4} overrides that
# inheritance so Viewer/Editor (incl. anonymous-Viewer) get no access.
# The Grafana super-admin (`admin` user) always retains access regardless.
resource "null_resource" "grafana_admin_only_folder_acl" {
for_each = toset(local.admin_only_folders)
# Re-runs on tg apply (cheap, idempotent API call). Catches drift if anyone
# edits permissions via the UI or the folder is rebuilt.
triggers = {
folder = each.value
always = timestamp()
}
provisioner "local-exec" {
interpreter = ["/bin/bash", "-c"]
command = <<-EOT
set -euo pipefail
FOLDER='${each.value}'
KUBECONFIG_FLAG='--kubeconfig ${var.kube_config_path}'
POD=$(kubectl $KUBECONFIG_FLAG get pod -n monitoring -l app.kubernetes.io/name=grafana -o jsonpath='{.items[0].metadata.name}')
ADMIN_PW=$(kubectl $KUBECONFIG_FLAG get secret -n monitoring grafana -o jsonpath='{.data.admin-password}' | base64 -d)
# Wait up to 60s for the dashboard sidecar to materialise the folder.
for i in $(seq 1 12); do
FOLDER_UID=$(kubectl $KUBECONFIG_FLAG exec -n monitoring "$POD" -c grafana -- \
curl -sf -u "admin:$ADMIN_PW" "http://localhost:3000/api/folders" \
| python3 -c "import json,sys; folders=json.load(sys.stdin); print(next((f['uid'] for f in folders if f['title']==sys.argv[1]), ''))" "$FOLDER" || true)
if [ -n "$FOLDER_UID" ]; then break; fi
sleep 5
done
if [ -z "$FOLDER_UID" ]; then
echo "ERROR: folder '$FOLDER' not found in Grafana after 60s"
exit 1
fi
# Admin-only ACL. permission codes: 1=View, 2=Edit, 4=Admin.
kubectl $KUBECONFIG_FLAG exec -n monitoring "$POD" -c grafana -- \
curl -sf -u "admin:$ADMIN_PW" -X POST \
-H "Content-Type: application/json" \
-d '{"items":[{"role":"Admin","permission":4}]}' \
"http://localhost:3000/api/folders/$FOLDER_UID/permissions" >/dev/null
echo "set admin-only ACL on folder '$FOLDER' (uid=$FOLDER_UID)"
EOT
}
depends_on = [
helm_release.grafana,
kubernetes_config_map.grafana_dashboards,
]
}
resource "helm_release" "grafana" {
namespace = kubernetes_namespace.monitoring.metadata[0].name
create_namespace = true

View file

@ -27,6 +27,10 @@ variable "grafana_admin_password" {
type = string
sensitive = true
}
variable "kube_config_path" {
type = string
sensitive = true
}
variable "tier" { type = string }
variable "mysql_host" { type = string }
variable "registry_user" {

View file

@ -41,7 +41,10 @@ resource "helm_release" "prometheus" {
version = "25.8.2"
timeout = 900 # 15 min Recreate strategy + iSCSI reattach is slow
force_update = true # Required for StatefulSet volumeClaimTemplate changes (immutable field)
# force_update disabled 2026-04-23: caused Helm to try replacing the bound
# pushgateway PVC (added in rev 188, see commit e51c104), which is immutable.
# Re-enable temporarily only when a StatefulSet volumeClaimTemplate change needs --force.
force_update = false
values = [templatefile("${path.module}/prometheus_chart_values.tpl", { alertmanager_mail_pass = var.alertmanager_account_password, alertmanager_slack_api_url = var.alertmanager_slack_api_url, tuya_api_key = var.tiny_tuya_service_secret, haos_api_token = var.haos_api_token })]
}

View file

@ -73,7 +73,7 @@ alertmanager:
- source_matchers:
- alertname = NodeDown
target_matchers:
- alertname =~ "NodeNotReady|NodeConditionBad|PodCrashLooping|ContainerOOMKilled|DeploymentReplicasMismatch|StatefulSetReplicasMismatch|DaemonSetMissingPods|ScrapeTargetDown|NodeLowFreeMemory|PostgreSQLDown|RedisDown|HeadscaleDown|AuthentikDown|PoisonFountainDown|HackmdDown|PrivatebinDown|MailServerDown|EmailRoundtripFailing|EmailRoundtripStale|NodeExporterDown|DockerRegistryDown|HomeAssistantDown|CloudflaredDown|TechnitiumDNSDown|iDRACRedfishMetricsMissing|iDRACSNMPMetricsMissing|HomeAssistantMetricsMissing"
- alertname =~ "NodeNotReady|NodeConditionBad|PodCrashLooping|ContainerOOMKilled|DeploymentReplicasMismatch|StatefulSetReplicasMismatch|DaemonSetMissingPods|ScrapeTargetDown|NodeLowFreeMemory|PostgreSQLDown|RedisDown|HeadscaleDown|HeadscaleReplicasMismatch|AuthentikDown|PoisonFountainDown|HackmdDown|PrivatebinDown|MailServerDown|EmailRoundtripFailing|EmailRoundtripStale|NodeExporterDown|DockerRegistryDown|HomeAssistantDown|CloudflaredDown|TechnitiumDNSDown|iDRACRedfishMetricsMissing|iDRACSNMPMetricsMissing|HomeAssistantMetricsMissing"
# NFS down causes mass pod failures and NFS-dependent service outages
- source_matchers:
- alertname = NFSServerUnresponsive
@ -98,7 +98,7 @@ alertmanager:
- source_matchers:
- alertname = PowerOutage
target_matchers:
- alertname =~ "NodeDown|NFSServerUnresponsive|NodeExporterDown|CloudflaredDown|MetalLBSpeakerDown|MetalLBControllerDown|UPSMetricsMissing|iDRACRedfishMetricsMissing|iDRACSNMPMetricsMissing|ATSMetricsMissing|HomeAssistantMetricsMissing|FuseMainMetricsMissing|FuseGarageMetricsMissing|ProxmoxMetricsMissing|iDRACSystemUnhealthy|iDRACServerPoweredOff|ProxmoxExporterDown"
- alertname =~ "NodeDown|NFSServerUnresponsive|NodeExporterDown|CloudflaredDown|MetalLBSpeakerDown|MetalLBControllerDown|UPSMetricsMissing|iDRACRedfishMetricsMissing|iDRACSNMPMetricsMissing|ATSMetricsMissing|HomeAssistantMetricsMissing|FuseMainMetricsMissing|FuseGarageMetricsMissing|ThermostatHolMetricsMissing|ThermostatMasterBedroomMetricsMissing|ThermostatOfficeMetricsMissing|ThermostatKidsRoomMetricsMissing|ProxmoxMetricsMissing|iDRACSystemUnhealthy|iDRACServerPoweredOff|ProxmoxExporterDown"
# iDRAC system-level unhealthy suppresses component-level alerts
- source_matchers:
- alertname = iDRACSystemUnhealthy
@ -113,6 +113,11 @@ alertmanager:
- alertname = FuseGarageFault
target_matchers:
- alertname = FuseGarageMetricsMissing
# Tuya Cloud API down suppresses all per-device metrics-missing alerts
- source_matchers:
- alertname = TuyaCloudDown
target_matchers:
- alertname =~ "ATSMetricsMissing|FuseMainMetricsMissing|FuseGarageMetricsMissing|ThermostatHolMetricsMissing|ThermostatMasterBedroomMetricsMissing|ThermostatOfficeMetricsMissing|ThermostatKidsRoomMetricsMissing"
# Containerd broken suppresses downstream pod alerts
- source_matchers:
- alertname = KubeletImagePullErrors
@ -160,6 +165,27 @@ prometheus-node-exporter:
memory: 100Mi
limits:
memory: 100Mi
# NOTE: The parent chart forwards subchart values under `prometheus-pushgateway:`,
# not `pushgateway:` — using the wrong key silently no-ops.
prometheus-pushgateway:
# Without persistence the pushgateway's in-memory metrics are lost on restart.
# Once-per-day pushers (offsite-backup-sync) stay invisible until their next run,
# which is why backup_last_success_timestamp{job="offsite-backup-sync"} vanished
# after the 2026-04-22 node3 kubelet hiccup.
persistentVolume:
enabled: true
size: 2Gi
storageClass: proxmox-lvm-encrypted
mountPath: /data
extraArgs:
- --persistence.file=/data/pushgateway.bin
- --persistence.interval=1m
resources:
requests:
cpu: 10m
memory: 64Mi
limits:
memory: 256Mi
server:
# Enable me to delete metrics
extraFlags:
@ -726,6 +752,7 @@ serverFiles:
for: 30m
labels:
severity: info
subsystem: gpu
annotations:
summary: "GPU power: {{ $value | printf \"%.0f\" }}W (threshold: 50W)"
- alert: HighUtilization
@ -749,6 +776,14 @@ serverFiles:
severity: critical
annotations:
summary: "NVIDIA GPU exporter is down - no GPU metrics available"
- alert: GPUNodeUnschedulable
expr: kube_node_spec_unschedulable{node="k8s-node1"} == 1
for: 5m
labels:
severity: critical
subsystem: gpu
annotations:
summary: "GPU node {{ $labels.node }} is cordoned — Frigate and GPU workloads cannot schedule"
- name: Power
rules:
- alert: OnBattery
@ -777,6 +812,7 @@ serverFiles:
for: 60m
labels:
severity: info
subsystem: r730
annotations:
summary: "Server power: {{ $value | printf \"%.0f\" }}W (threshold: 300W)"
- alert: UsingInverterEnergyForTooLong
@ -914,6 +950,83 @@ serverFiles:
severity: critical
annotations:
summary: "Garage fuse panel fault detected"
- alert: FuseMainHighLeakage
expr: fuse_main_leakage_current > 30
for: 5m
labels:
severity: critical
annotations:
summary: "Main fuse leakage current: {{ $value }}mA (threshold: 30mA)"
- alert: FuseGarageHighLeakage
expr: fuse_garage_leakage_current > 30
for: 5m
labels:
severity: critical
annotations:
summary: "Garage fuse leakage current: {{ $value }}mA (threshold: 30mA)"
- alert: FuseMainOvertemperature
expr: fuse_main_temperature > 70
for: 5m
labels:
severity: warning
annotations:
summary: "Main fuse temperature: {{ $value }}°C (threshold: 70°C)"
- alert: FuseGarageOvertemperature
expr: fuse_garage_temperature > 70
for: 5m
labels:
severity: warning
annotations:
summary: "Garage fuse temperature: {{ $value }}°C (threshold: 70°C)"
- alert: FuseMainVoltageAbnormal
expr: fuse_main_voltage / 10 < 200 or fuse_main_voltage / 10 > 260
for: 5m
labels:
severity: critical
annotations:
summary: "Main fuse voltage: {{ $value }}V (expected 200-260V)"
- alert: FuseGarageVoltageAbnormal
expr: fuse_garage_voltage / 10 < 200 or fuse_garage_voltage / 10 > 260
for: 5m
labels:
severity: critical
annotations:
summary: "Garage fuse voltage: {{ $value }}V (expected 200-260V)"
- name: Thermostats
rules:
- alert: ThermostatOverheating
expr: >
thermostat_hol_temp_current > 400
or thermostat_master_bedroom_temp_current > 400
or thermostat_office_temp_current > 400
or thermostat_kids_room_temp_current > 400
for: 10m
labels:
severity: warning
annotations:
summary: "Thermostat temperature {{ $value | printf \"%.1f\" }} (x10 °C) exceeds 40°C"
- alert: ThermostatFreezing
expr: >
thermostat_hol_temp_current < 50
or thermostat_master_bedroom_temp_current < 50
or thermostat_office_temp_current < 50
or thermostat_kids_room_temp_current < 50
for: 15m
labels:
severity: critical
annotations:
summary: "Thermostat temperature {{ $value | printf \"%.1f\" }} (x10 °C) below 5°C — risk of freezing"
- alert: ThermostatHumidityHigh
expr: >
thermostat_hol_humidity > 80
or thermostat_master_bedroom_humidity > 80
or thermostat_office_humidity > 80
or thermostat_kids_room_humidity > 80
for: 30m
labels:
severity: warning
annotations:
summary: "Thermostat humidity {{ $value }}% exceeds 80%"
- name: Metric Staleness
rules:
- alert: UPSMetricsMissing
@ -958,6 +1071,41 @@ serverFiles:
severity: warning
annotations:
summary: "Fuse garage panel metrics missing for 15m - check tuya-bridge pod"
- alert: ThermostatHolMetricsMissing
expr: absent(thermostat_hol_temp_current)
for: 15m
labels:
severity: warning
annotations:
summary: "Thermostat hol metrics missing for 15m - check tuya-bridge pod"
- alert: ThermostatMasterBedroomMetricsMissing
expr: absent(thermostat_master_bedroom_temp_current)
for: 15m
labels:
severity: warning
annotations:
summary: "Thermostat master bedroom metrics missing for 15m - check tuya-bridge pod"
- alert: ThermostatOfficeMetricsMissing
expr: absent(thermostat_office_temp_current)
for: 15m
labels:
severity: warning
annotations:
summary: "Thermostat office metrics missing for 15m - check tuya-bridge pod"
- alert: ThermostatKidsRoomMetricsMissing
expr: absent(thermostat_kids_room_temp_current)
for: 15m
labels:
severity: warning
annotations:
summary: "Thermostat kids room metrics missing for 15m - check tuya-bridge pod"
- alert: TuyaCloudDown
expr: count(({__name__=~".*_tuya_cloud_up"}) == 0) > 0
for: 5m
labels:
severity: warning
annotations:
summary: "Tuya Cloud API rejecting calls ({{ $value }} devices affected) — renew subscription at iot.tuya.com (code 28841002 = expired trial) or rotate TINYTUYA_API_KEY"
- alert: ProxmoxMetricsMissing
expr: absent(pve_up)
for: 10m
@ -1071,6 +1219,14 @@ serverFiles:
severity: warning
annotations:
summary: "Home Assistant down: {{ $labels.instance }}"
- alert: HomeAssistantCriticalSensorUnavailable
expr: haos_entity_available{entity=~"sensor\\.(tesla_t4_gpu_(temperature|power_usage|utilization|memory_used)|r730_(cpu_temperature|power_consumption|power_supply_input_voltage_[12]|system_board_(exhaust|inlet)_temperature)|ups_(input_voltage|output_voltage|load|battery_remaining|output_source))"} == 0
for: 15m
labels:
severity: critical
annotations:
summary: "HA sensor unavailable: {{ $labels.friendly_name }} ({{ $labels.entity }})"
description: "{{ $labels.entity }} on {{ $labels.instance }} has been unavailable for 15+ minutes. Common cause: REST sensor needs HA restart (reload_all doesn't rebuild rest: platform). Verify exporter endpoint from HA: `ssh vbarzin@192.168.1.8` → `curl -sk <exporter-url>`. Fix: `curl -X POST -H \"Authorization: Bearer $HOME_ASSISTANT_SOFIA_TOKEN\" $HOME_ASSISTANT_SOFIA_URL/api/services/homeassistant/restart`."
- alert: CoreDNSErrors
expr: rate(coredns_dns_responses_total{rcode="SERVFAIL"}[5m]) > 1 and on() (time() - process_start_time_seconds{job="prometheus"}) > 900
for: 10m
@ -1155,6 +1311,28 @@ serverFiles:
severity: critical
annotations:
summary: "Vault backup CronJob has never completed successfully"
- alert: VaultRaftLeaderStuck
expr: |
(vault_core_active == 1)
and on(instance)
(rate(vault_raft_last_index_gauge[5m]) == 0)
for: 2m
labels:
severity: critical
annotations:
summary: "Vault raft leader {{ $labels.instance }} is active but commit index has not advanced for >2m"
description: "The raft leader is reachable on TCP but its commit index has stalled — likely a stuck goroutine hang (see 2026-04-22 post-mortem). External /v1/sys/health will be 503. Recovery: graceful delete of the stuck pod (see docs/runbooks/vault-raft-leader-deadlock.md). NOTE: silent until vault telemetry + scrape job are enabled."
- alert: VaultHAStatusUnavailable
expr: |
(count(up{job="vault"} == 1) > 0)
and
(count(vault_core_active == 1) == 0)
for: 5m
labels:
severity: critical
annotations:
summary: "Vault pods are Up but no pod reports HA active leader"
description: "At least one Vault pod is scraping healthy, but no pod has vault_core_active=1. HA layer is broken — external endpoint will be 503 even though the pods themselves are alive. See docs/runbooks/vault-raft-leader-deadlock.md. NOTE: silent until vault telemetry + scrape job are enabled."
- alert: VaultwardenBackupStale
expr: (time() - kube_cronjob_status_last_successful_time{cronjob="vaultwarden-backup", namespace="vaultwarden"}) > 86400
for: 30m
@ -1247,12 +1425,13 @@ serverFiles:
annotations:
summary: "Backup job failed: {{ $labels.namespace }}/{{ $labels.job_name }}"
- alert: LVMSnapshotStale
expr: (time() - lvm_snapshot_last_run_timestamp{job="lvm-pvc-snapshot"}) > 172800
expr: (time() - lvm_snapshot_last_run_timestamp{job="lvm-pvc-snapshot"}) > 108000
for: 30m
labels:
severity: critical
annotations:
summary: "LVM PVC snapshots are {{ $value | humanizeDuration }} old (expected daily)"
description: "Timer lvm-pvc-snapshot.timer on 192.168.1.127 hasn't pushed fresh metrics. Runbook: docs/runbooks/restore-lvm-snapshot.md"
- alert: LVMSnapshotNeverRun
expr: absent(lvm_snapshot_last_run_timestamp{job="lvm-pvc-snapshot"})
for: 48h
@ -1411,7 +1590,7 @@ serverFiles:
severity: warning
annotations:
summary: "Redis master {{ $labels.pod }} has only {{ $value }} connected replicas (expected 2)"
- alert: HeadscaleDown
- alert: HeadscaleReplicasMismatch
expr: (kube_deployment_status_replicas_available{namespace="headscale"} or on() vector(0)) < 1
for: 5m
labels:
@ -1815,7 +1994,7 @@ serverFiles:
summary: "Email round-trip probe failing. Check MX DNS, Postfix, Mailgun API, and IMAP."
- alert: EmailRoundtripStale
expr: (time() - email_roundtrip_last_success_timestamp{job="email-roundtrip-monitor"}) > 3600
for: 10m
for: 20m
labels:
severity: warning
annotations:
@ -1980,7 +2159,7 @@ serverFiles:
annotations:
summary: "Technitium zone-sync has not run successfully in >1h (last: {{ $value | humanizeDuration }} ago)"
- alert: TechnitiumZoneCountMismatch
expr: (max(technitium_zone_count) - min(technitium_zone_count)) > 0
expr: (max(technitium_zone_count{instance!="primary"}) - min(technitium_zone_count{instance!="primary"})) > 0
for: 15m
labels:
severity: warning
@ -1995,13 +2174,6 @@ serverFiles:
summary: "CoreDNS forward SERVFAIL/REFUSED rate: {{ $value | printf \"%.2f\" }}/s — upstream DNS (pfSense/public) may be unhealthy"
- name: qbittorrent
rules:
- alert: MAMMouseClass
expr: mam_class_code == 0
for: 1h
labels:
severity: critical
annotations:
summary: "MAM account is in Mouse class — tracker is refusing announces, ratio cannot recover"
- alert: MAMCookieExpired
expr: mam_farming_cookie_expired > 0
for: 0m
@ -2040,13 +2212,6 @@ serverFiles:
severity: critical
annotations:
summary: "qBittorrent is disconnected from the network"
- alert: QBittorrentMAMUnsatisfied
expr: qbt_tracker_unsatisfied{tracker="mam"} > 15
for: 10m
labels:
severity: warning
annotations:
summary: "{{ $value | printf \"%.0f\" }} MAM torrents not yet seeded 72h (limit: 20 for new members)"
- name: Headscale VPN
rules:
@ -2361,6 +2526,58 @@ extraScrapeConfigs: |
action: replace
regex: '(.*)'
replacement: 'fuse_main_$${1}'
- job_name: 'thermostat-hol'
static_configs:
- targets:
- "tuya-bridge.tuya-bridge.svc.cluster.local:80"
metrics_path: '/metrics/bf7efce9519bd508df431s'
params:
api-key: ['${tuya_api_key}']
metric_relabel_configs:
- source_labels: [ __name__ ]
target_label: '__name__'
action: replace
regex: '(.*)'
replacement: 'thermostat_hol_$${1}'
- job_name: 'thermostat-master-bedroom'
static_configs:
- targets:
- "tuya-bridge.tuya-bridge.svc.cluster.local:80"
metrics_path: '/metrics/bf70e80159641f61a5lzho'
params:
api-key: ['${tuya_api_key}']
metric_relabel_configs:
- source_labels: [ __name__ ]
target_label: '__name__'
action: replace
regex: '(.*)'
replacement: 'thermostat_master_bedroom_$${1}'
- job_name: 'thermostat-office'
static_configs:
- targets:
- "tuya-bridge.tuya-bridge.svc.cluster.local:80"
metrics_path: '/metrics/bf9597a0064f0349d4b09x'
params:
api-key: ['${tuya_api_key}']
metric_relabel_configs:
- source_labels: [ __name__ ]
target_label: '__name__'
action: replace
regex: '(.*)'
replacement: 'thermostat_office_$${1}'
- job_name: 'thermostat-kids-room'
static_configs:
- targets:
- "tuya-bridge.tuya-bridge.svc.cluster.local:80"
metrics_path: '/metrics/bfe64da91577117e0annt5'
params:
api-key: ['${tuya_api_key}']
metric_relabel_configs:
- source_labels: [ __name__ ]
target_label: '__name__'
action: replace
regex: '(.*)'
replacement: 'thermostat_kids_room_$${1}'
- job_name: 'haos'
static_configs:
- targets:

View file

@ -493,6 +493,25 @@ resource "kubernetes_cron_job_v1" "nextcloud-backup" {
spec {
restart_policy = "OnFailure"
# Backup mounts the same RWO PVC (proxmox-lvm-encrypted) as the
# main nextcloud pod, so it MUST schedule on the same node; the
# volume cannot attach to two nodes simultaneously. Without this
# the backup pod is stuck in ContainerCreating until cron retries.
affinity {
pod_affinity {
required_during_scheduling_ignored_during_execution {
label_selector {
match_labels = {
"app.kubernetes.io/name" = "nextcloud"
"app.kubernetes.io/instance" = "nextcloud"
}
}
topology_key = "kubernetes.io/hostname"
namespaces = [kubernetes_namespace.nextcloud.metadata[0].name]
}
}
}
container {
name = "backup"
image = "alpine:latest"

View file

@ -63,18 +63,25 @@ resource "kubernetes_resource_quota" "nvidia_quota" {
}
}
# Apply GPU taint and label to ensure only GPU workloads run on GPU node
# Apply GPU taint dynamically based on NFD-discovered GPU nodes. The
# NFD label `feature.node.kubernetes.io/pci-10de.present=true` is
# auto-applied on any node with an NVIDIA PCI device (vendor 0x10de),
# so the taint follows the card if it moves between nodes. Workload
# nodeSelectors key off `nvidia.com/gpu.present=true` (applied by
# gpu-feature-discovery once the operator is up).
resource "null_resource" "gpu_node_config" {
provisioner "local-exec" {
command = <<-EOT
kubectl taint nodes k8s-node1 nvidia.com/gpu=true:PreferNoSchedule --overwrite
kubectl label nodes k8s-node1 gpu=true --overwrite
set -euo pipefail
for node in $(kubectl get nodes -l feature.node.kubernetes.io/pci-10de.present=true -o jsonpath='{.items[*].metadata.name}'); do
kubectl taint nodes "$node" nvidia.com/gpu=true:PreferNoSchedule --overwrite
done
EOT
}
# Re-run if namespace changes (proxy for cluster changes)
triggers = {
namespace = kubernetes_namespace.nvidia.metadata[0].name
command_hash = "dynamic-taint-v1"
}
}
@ -141,7 +148,7 @@ resource "kubernetes_deployment" "nvidia-exporter" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"
@ -604,7 +611,7 @@ resource "kubernetes_daemonset" "gpu_pod_exporter" {
service_account_name = kubernetes_service_account.gpu_pod_exporter.metadata[0].name
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {

View file

@ -86,6 +86,28 @@ resource "kubernetes_persistent_volume_claim" "data_proxmox" {
}
}
resource "kubernetes_persistent_volume_claim" "data_encrypted" {
wait_until_bound = false
metadata {
name = "paperless-ngx-data-encrypted"
namespace = kubernetes_namespace.paperless-ngx.metadata[0].name
annotations = {
"resize.topolvm.io/threshold" = "80%"
"resize.topolvm.io/increase" = "100%"
"resize.topolvm.io/storage_limit" = "5Gi"
}
}
spec {
access_modes = ["ReadWriteOnce"]
storage_class_name = "proxmox-lvm-encrypted"
resources {
requests = {
storage = "1Gi"
}
}
}
}
resource "kubernetes_deployment" "paperless-ngx" {
metadata {
@ -196,7 +218,7 @@ resource "kubernetes_deployment" "paperless-ngx" {
volume {
name = "data"
persistent_volume_claim {
claim_name = kubernetes_persistent_volume_claim.data_proxmox.metadata[0].name
claim_name = kubernetes_persistent_volume_claim.data_encrypted.metadata[0].name
}
}
}

View file

@ -32,7 +32,20 @@ resource "kubernetes_namespace" "payslip_ingest" {
# Seed these manually in Vault before applying:
# secret/paperless-ngx -> property `api_token`
# secret/claude-agent-service -> property `api_bearer_token`
# secret/payslip-ingest -> property `webhook_bearer_token`
# secret/payslip-ingest -> properties:
# - `webhook_bearer_token`
# - `actualbudget_api_key` (same value as the
# actualbudget-http-api-viktor random api-key;
# fetch via `kubectl get pods -n actualbudget
# -l app=actualbudget-http-api-viktor -o
# jsonpath={.items[0].spec.containers[0].env}`
# and grep API_KEY)
# - `actualbudget_encryption_password`
# (same as Viktor's budget password in
# secret/actualbudget/credentials[viktor])
# - `actualbudget_budget_sync_id`
# (same as Viktor's sync_id)
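# A minimal seeding sketch, assuming the KV v2 mount is `secret/`, the CLI
# is already authenticated, and the placeholder values are substituted
# (property names must match the remoteRefs below exactly):
#   vault kv put secret/payslip-ingest \
#     webhook_bearer_token=<token> \
#     actualbudget_api_key=<api-key> \
#     actualbudget_encryption_password=<password> \
#     actualbudget_budget_sync_id=<sync-id>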
resource "kubernetes_manifest" "external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
@ -79,6 +92,27 @@ resource "kubernetes_manifest" "external_secret" {
property = "webhook_bearer_token"
}
},
{
secretKey = "ACTUALBUDGET_API_KEY"
remoteRef = {
key = "payslip-ingest"
property = "actualbudget_api_key"
}
},
{
secretKey = "ACTUALBUDGET_ENCRYPTION_PASSWORD"
remoteRef = {
key = "payslip-ingest"
property = "actualbudget_encryption_password"
}
},
{
secretKey = "ACTUALBUDGET_BUDGET_SYNC_ID"
remoteRef = {
key = "payslip-ingest"
property = "actualbudget_budget_sync_id"
}
},
]
}
}
@ -288,6 +322,85 @@ resource "kubernetes_service" "payslip_ingest" {
}
}
# Daily sync of Meta payroll deposits from ActualBudget's http-api sidecar.
# Populates payslip_ingest.external_meta_deposits so Panel 14 can overlay bank
# deposits against payslip.net_pay; this catches parser drift on net_pay.
resource "kubernetes_cron_job_v1" "actualbudget_payroll_sync" {
metadata {
name = "actualbudget-payroll-sync"
namespace = kubernetes_namespace.payslip_ingest.metadata[0].name
}
spec {
schedule = "0 2 * * *"
concurrency_policy = "Forbid"
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
starting_deadline_seconds = 300
job_template {
metadata {
labels = local.labels
}
spec {
backoff_limit = 1
ttl_seconds_after_finished = 86400
template {
metadata {
labels = local.labels
}
spec {
restart_policy = "OnFailure"
image_pull_secrets {
name = "registry-credentials"
}
container {
name = "sync"
image = local.image
command = ["python", "-m", "payslip_ingest", "sync-meta-deposits"]
env_from {
secret_ref {
name = "payslip-ingest-secrets"
}
}
env_from {
secret_ref {
name = "payslip-ingest-db-creds"
}
}
env {
name = "ACTUALBUDGET_HTTP_API_URL"
value = "http://budget-http-api-viktor.actualbudget.svc.cluster.local"
}
resources {
requests = {
cpu = "50m"
memory = "128Mi"
}
limits = {
memory = "256Mi"
}
}
}
}
}
}
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1
ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
}
depends_on = [
kubernetes_manifest.external_secret,
kubernetes_manifest.db_external_secret,
]
}
# Plan-time read of the ESO-created K8s Secret for Grafana datasource password.
# First apply: -target=kubernetes_manifest.db_external_secret first so the Secret exists.
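# A minimal bootstrap sketch, assuming the plain terraform CLI (substitute
# the terragrunt wrapper if that is how this stack is normally applied):
#   terraform apply -target=kubernetes_manifest.db_external_secret
#   terraform apply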
data "kubernetes_secret" "payslip_ingest_db_creds" {

View file

@ -219,6 +219,10 @@ module "ingress" {
skip_default_rate_limit = true
exclude_crowdsec = true
anti_ai_scraping = false
# Deployment is scaled to 0 (see replicas above). Opt the ingress out of
# Uptime Kuma external monitoring so the sync CronJob deletes the orphaned
# `[External] poison` monitor instead of flapping DOWN.
external_monitor = false
extra_annotations = {
"gethomepage.dev/enabled" = "true"
"gethomepage.dev/name" = "Poison Fountain"
@ -252,6 +256,13 @@ resource "kubernetes_cron_job_v1" "poison_fetcher" {
name = "poison-fountain-fetcher"
}
spec {
security_context {
# curlimages/curl defaults to uid 100, but the NFS mount at /data is
# owned root:root 755 (writes from the main Deployment which runs as
# root). Align the CronJob with the Deployment so mkdir /data/cache
# succeeds. no_root_squash is set on the /srv/nfs export.
run_as_user = 0
}
container {
name = "fetcher"
image = "curlimages/curl:latest"

View file

@ -9,6 +9,10 @@ terraform {
source = "cloudflare/cloudflare"
version = "~> 4"
}
authentik = {
source = "goauthentik/authentik"
version = "~> 2024.10"
}
}
}

View file

@ -43,7 +43,7 @@ resource "kubernetes_config_map" "haproxy" {
timeout connect 5s
timeout client 30s
timeout server 30s
timeout check 3s
timeout check 5s
# Dynamic DNS resolution via cluster CoreDNS. Without this, haproxy
# resolves server hostnames once at startup and caches forever, so
@ -82,9 +82,9 @@ resource "kubernetes_config_map" "haproxy" {
tcp-check expect rstring role:master
tcp-check send "QUIT\r\n"
tcp-check expect string +OK
server redis-v2-0 redis-v2-0.redis-v2-headless.redis.svc.cluster.local:6379 check inter 1s fall 2 rise 2 resolvers kubernetes init-addr last,libc,none
server redis-v2-1 redis-v2-1.redis-v2-headless.redis.svc.cluster.local:6379 check inter 1s fall 2 rise 2 resolvers kubernetes init-addr last,libc,none
server redis-v2-2 redis-v2-2.redis-v2-headless.redis.svc.cluster.local:6379 check inter 1s fall 2 rise 2 resolvers kubernetes init-addr last,libc,none
server redis-v2-0 redis-v2-0.redis-v2-headless.redis.svc.cluster.local:6379 check inter 2s fall 3 rise 2 resolvers kubernetes init-addr last,libc,none
server redis-v2-1 redis-v2-1.redis-v2-headless.redis.svc.cluster.local:6379 check inter 2s fall 3 rise 2 resolvers kubernetes init-addr last,libc,none
server redis-v2-2 redis-v2-2.redis-v2-headless.redis.svc.cluster.local:6379 check inter 2s fall 3 rise 2 resolvers kubernetes init-addr last,libc,none
backend redis_sentinel
balance roundrobin
@ -362,8 +362,8 @@ resource "kubernetes_config_map" "redis_v2_sentinel_bootstrap" {
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel monitor mymaster $MASTER_HOST 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 30000
sentinel down-after-milliseconds mymaster 15000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
EOF
@ -396,7 +396,7 @@ resource "kubernetes_service" "redis_v2_headless" {
}
spec {
cluster_ip = "None"
publish_not_ready_addresses = true
publish_not_ready_addresses = false
selector = {
app = "redis-v2"
}
@ -451,9 +451,7 @@ resource "kubernetes_stateful_set_v1" "redis_v2" {
affinity {
pod_anti_affinity {
preferred_during_scheduling_ignored_during_execution {
weight = 100
pod_affinity_term {
required_during_scheduling_ignored_during_execution {
label_selector {
match_expressions {
key = "app"
@ -465,7 +463,6 @@ resource "kubernetes_stateful_set_v1" "redis_v2" {
}
}
}
}
init_container {
name = "generate-sentinel-conf"
@ -535,8 +532,8 @@ resource "kubernetes_stateful_set_v1" "redis_v2" {
}
initial_delay_seconds = 15
period_seconds = 10
timeout_seconds = 3
failure_threshold = 3
timeout_seconds = 10
failure_threshold = 5
}
readiness_probe {
exec {
@ -580,8 +577,8 @@ resource "kubernetes_stateful_set_v1" "redis_v2" {
}
initial_delay_seconds = 20
period_seconds = 10
timeout_seconds = 3
failure_threshold = 3
timeout_seconds = 10
failure_threshold = 5
}
readiness_probe {
exec {

View file

@ -434,12 +434,17 @@ resource "kubernetes_cron_job_v1" "technitium_zone_sync" {
while read -r zone; do
if grep -qx "$zone" /tmp/replica_zones.txt; then
# Zone exists: just resync
# Zone exists: reconcile primaryNameServerAddresses to the
# stable FQDN before resync. Without this, a zone created
# against an old pod IP (pre-service-ClusterIP era) stays
# pinned to that dead IP forever and zone transfers fail
# silently. Idempotent Technitium accepts identical values.
curl -sf "$REPLICA/api/zones/options/set?token=$R_TOKEN&zone=$zone&primaryNameServerAddresses=$PRIMARY_HOST" > /dev/null || true
curl -sf "$REPLICA/api/zones/resync?token=$R_TOKEN&zone=$zone" > /dev/null || true
else
# New zone: create as Secondary and validate response
echo "NEW: Creating $zone on $REPLICA"
RESP=$(curl -sf "$REPLICA/api/zones/create?token=$R_TOKEN&zone=$zone&type=Secondary&primaryNameServerAddresses=$PRIMARY_IP" || echo '{"status":"error"}')
RESP=$(curl -sf "$REPLICA/api/zones/create?token=$R_TOKEN&zone=$zone&type=Secondary&primaryNameServerAddresses=$PRIMARY_HOST" || echo '{"status":"error"}')
if echo "$RESP" | grep -q '"status":"ok"'; then
SYNCED=$((SYNCED + 1))
else
@ -486,7 +491,14 @@ resource "kubernetes_cron_job_v1" "technitium_zone_sync" {
value = var.technitium_password
}
env {
name = "PRIMARY_IP"
# Service ClusterIP: Terraform tracks it on every apply, and the
# reconcile loop below re-applies it to every existing zone on
# every run (*/30m), so any drift (e.g. a service recreate handing
# out a new ClusterIP, or historical pod-IP values still pinned on replicas)
# self-heals within a sync cycle. Hostname form was tried but
# Technitium's own resolver doesn't forward svc.cluster.local,
# so `primaryNameServerAddresses` must be a literal IP.
name = "PRIMARY_HOST"
value = kubernetes_service.technitium_primary.spec[0].cluster_ip
}
}

View file

@ -200,7 +200,7 @@ resource "helm_release" "traefik" {
# Explicit entrypoint timeouts to bound tail latency from slow clients
"--entryPoints.websecure.transport.respondingTimeouts.readTimeout=60s",
"--entryPoints.websecure.transport.respondingTimeouts.writeTimeout=60s",
"--entryPoints.websecure.transport.respondingTimeouts.idleTimeout=180s",
"--entryPoints.websecure.transport.respondingTimeouts.idleTimeout=600s",
# Use forwarded headers from trusted proxies
"--entryPoints.websecure.forwardedHeaders.insecure=false",
"--entryPoints.web.forwardedHeaders.insecure=false",

View file

@ -244,8 +244,8 @@ resource "kubernetes_manifest" "middleware_immich_rate_limit" {
}
spec = {
rateLimit = {
average = 500
burst = 5000
average = 1000
burst = 20000
}
}
}

View file

@ -118,6 +118,26 @@ resource "kubernetes_deployment" "tuya-bridge" {
}
}
}
liveness_probe {
http_get {
path = "/health"
port = 8080
}
initial_delay_seconds = 60
period_seconds = 30
timeout_seconds = 5
failure_threshold = 6
}
readiness_probe {
http_get {
path = "/health"
port = 8080
}
initial_delay_seconds = 10
period_seconds = 15
timeout_seconds = 5
failure_threshold = 2
}
resources {
requests = {
cpu = "10m"

View file

@ -25,22 +25,6 @@ module "tls_secret" {
tls_secret_name = var.tls_secret_name
}
# NFS StorageClass pointing to Proxmox host (replaces nfs-truenas for vault)
resource "kubernetes_storage_class" "nfs_proxmox" {
metadata {
name = "nfs-proxmox"
}
storage_provisioner = "nfs.csi.k8s.io"
reclaim_policy = "Retain"
volume_binding_mode = "Immediate"
allow_volume_expansion = true
parameters = {
server = "192.168.1.127"
share = "/srv/nfs"
}
mount_options = ["soft", "actimeo=5", "retrans=3", "timeo=30"]
}
resource "helm_release" "vault" {
name = "vault"
namespace = kubernetes_namespace.vault.metadata[0].name
@ -72,13 +56,13 @@ resource "helm_release" "vault" {
dataStorage = {
enabled = true
size = "2Gi"
storageClass = "nfs-proxmox" # Proxmox host NFS (was nfs-truenas)
storageClass = "proxmox-lvm-encrypted" # Migrated 2026-04-25 from nfs-proxmox; raft fsync is NFS-hostile (post-mortems/2026-04-22-vault-raft-leader-deadlock.md)
}
auditStorage = {
enabled = true
size = "2Gi"
storageClass = "nfs-proxmox" # Proxmox host NFS (was nfs-truenas)
storageClass = "proxmox-lvm-encrypted" # Migrated 2026-04-25 from nfs-proxmox
}
standalone = { enabled = false }
@ -117,6 +101,24 @@ resource "helm_release" "vault" {
}
}
# fsGroupChangePolicy=OnRootMismatch skips recursive chown on restart.
# Without this, kubelet walks every file over NFS on each restart; during
# the 2026-04-22 outage this looped for 10m+ and blocked quorum recovery.
# The other four fields restore the chart defaults (providing pod{}
# replaces them wholesale), and a missing fsGroup left vault unable to
# write to the freshly-formatted ext4 PVC during the 2026-04-25 migration.
statefulSet = {
securityContext = {
pod = {
fsGroupChangePolicy = "OnRootMismatch"
fsGroup = 1000
runAsGroup = 1000
runAsUser = 100
runAsNonRoot = true
}
}
}
# Mount unseal key secret
extraVolumes = [{
type = "secret"
@ -536,7 +538,8 @@ resource "vault_database_secret_backend_connection" "postgresql" {
# "pg-trading", # Commented out 2026-04-06 - trading-bot disabled
"pg-health", "pg-linkwarden",
"pg-affine", "pg-woodpecker", "pg-claude-memory",
"pg-terraform-state", "pg-payslip-ingest", "pg-job-hunter"
"pg-terraform-state", "pg-payslip-ingest", "pg-job-hunter",
"pg-wealthfolio-sync", "pg-fire-planner"
]
postgresql {
@ -690,6 +693,22 @@ resource "vault_database_secret_backend_static_role" "pg_job_hunter" {
rotation_period = 604800
}
resource "vault_database_secret_backend_static_role" "pg_wealthfolio_sync" {
backend = vault_mount.database.path
db_name = vault_database_secret_backend_connection.postgresql.name
name = "pg-wealthfolio-sync"
username = "wealthfolio_sync"
rotation_period = 604800
}
resource "vault_database_secret_backend_static_role" "pg_fire_planner" {
backend = vault_mount.database.path
db_name = vault_database_secret_backend_connection.postgresql.name
name = "pg-fire-planner"
username = "fire_planner"
rotation_period = 604800
}
# =============================================================================
# Kubernetes Secrets Engine - Dynamic K8s Credentials
# =============================================================================

View file

@ -3,6 +3,7 @@ variable "tls_secret_name" {
sensitive = true
}
variable "nfs_server" { type = string }
variable "postgresql_host" { type = string }
resource "kubernetes_namespace" "wealthfolio" {
metadata {
@ -45,6 +46,52 @@ resource "kubernetes_manifest" "external_secret" {
depends_on = [kubernetes_namespace.wealthfolio]
}
# DB credentials for the SQLite->PG ETL sidecar. Vault DB engine static role
# `pg-wealthfolio-sync` rotates this every 7 days; ExternalSecret refreshes
# the K8s Secret every 15m so the sidecar always has a valid password.
resource "kubernetes_manifest" "wealthfolio_sync_db_external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "wealthfolio-sync-db-creds"
namespace = "wealthfolio"
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-database"
kind = "ClusterSecretStore"
}
target = {
name = "wealthfolio-sync-db-creds"
template = {
metadata = {
annotations = {
"reloader.stakater.com/match" = "true"
}
}
data = {
PGHOST = var.postgresql_host
PGPORT = "5432"
PGDATABASE = "wealthfolio_sync"
PGUSER = "wealthfolio_sync"
PGPASSWORD = "{{ .password }}"
}
}
}
data = [{
secretKey = "password"
remoteRef = {
key = "static-creds/pg-wealthfolio-sync"
property = "password"
}
}]
}
}
depends_on = [kubernetes_namespace.wealthfolio]
}
module "tls_secret" {
source = "../../modules/kubernetes/setup_tls_secret"
namespace = kubernetes_namespace.wealthfolio.metadata[0].name
@ -214,6 +261,181 @@ resource "kubernetes_deployment" "wealthfolio" {
limits = { memory = "64Mi" }
}
}
# pg-sync sidecar mirrors a small subset of SQLite into PG every hour
# so Grafana can chart net worth / contributions / growth via the
# `wealthfolio_sync` database. Mounts /data RO; writes to a tmp dir
# for the sqlite3 .backup snapshot to avoid blocking writers. Bootstrap
# DDL runs each iteration (CREATE TABLE IF NOT EXISTS idempotent).
# Truncate-and-reload pattern: tables are small (~10k DAV rows, ~500
# activities, 6 accounts), so a full reload each hour is simpler than
# incremental upserts and gives clean cold-start behaviour.
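# For the Grafana side, a hedged sketch of a net-worth-over-time query
# against the mirror (column names match the DDL below; whether the
# fx_rate_to_base multiply is needed depends on how total_value is stored):
#   SELECT valuation_date AS time,
#          SUM(total_value * fx_rate_to_base) AS net_worth
#   FROM daily_account_valuation
#   GROUP BY valuation_date
#   ORDER BY valuation_date;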
container {
name = "pg-sync"
image = "alpine:3.20"
env {
name = "PGHOST"
value_from {
secret_key_ref {
name = "wealthfolio-sync-db-creds"
key = "PGHOST"
}
}
}
env {
name = "PGPORT"
value_from {
secret_key_ref {
name = "wealthfolio-sync-db-creds"
key = "PGPORT"
}
}
}
env {
name = "PGDATABASE"
value_from {
secret_key_ref {
name = "wealthfolio-sync-db-creds"
key = "PGDATABASE"
}
}
}
env {
name = "PGUSER"
value_from {
secret_key_ref {
name = "wealthfolio-sync-db-creds"
key = "PGUSER"
}
}
}
env {
name = "PGPASSWORD"
value_from {
secret_key_ref {
name = "wealthfolio-sync-db-creds"
key = "PGPASSWORD"
}
}
}
command = ["/bin/sh", "-c", <<-EOT
set -eu
apk add --no-cache --quiet sqlite postgresql-client busybox-suid
mkdir -p /etc/crontabs /scripts /tmp/wf-sync
cat >/etc/crontabs/root <<'CRON'
# Hourly: snapshot SQLite, reload PG mirror.
7 * * * * /scripts/sync.sh >>/proc/1/fd/1 2>&1
CRON
cat >/scripts/sync.sh <<'SCRIPT'
#!/bin/sh
set -eu
TS=$(date -u +%Y-%m-%dT%H:%M:%SZ)
echo "[$TS] wealthfolio-pg-sync: starting"
# Bootstrap schema (idempotent).
psql -v ON_ERROR_STOP=1 <<'SQL'
CREATE TABLE IF NOT EXISTS accounts (
id TEXT PRIMARY KEY,
name TEXT,
account_type TEXT,
currency TEXT,
is_active BOOLEAN
);
CREATE TABLE IF NOT EXISTS daily_account_valuation (
id TEXT PRIMARY KEY,
account_id TEXT NOT NULL,
valuation_date DATE NOT NULL,
account_currency TEXT,
base_currency TEXT,
fx_rate_to_base NUMERIC,
cash_balance NUMERIC,
investment_market_value NUMERIC,
total_value NUMERIC,
cost_basis NUMERIC,
net_contribution NUMERIC
);
CREATE INDEX IF NOT EXISTS idx_dav_acct_date ON daily_account_valuation(account_id, valuation_date);
CREATE INDEX IF NOT EXISTS idx_dav_date ON daily_account_valuation(valuation_date);
CREATE TABLE IF NOT EXISTS activities (
id TEXT PRIMARY KEY,
account_id TEXT,
asset_id TEXT,
activity_type TEXT,
activity_date TIMESTAMPTZ,
quantity NUMERIC,
unit_price NUMERIC,
amount NUMERIC,
fee NUMERIC,
currency TEXT,
fx_rate NUMERIC,
notes TEXT
);
CREATE INDEX IF NOT EXISTS idx_act_date ON activities(activity_date);
SQL
# Snapshot SQLite (online backup; non-blocking).
rm -f /tmp/wf-sync/snapshot.db
sqlite3 /data/wealthfolio.db ".backup /tmp/wf-sync/snapshot.db"
# Dump source rows to TSV.
sqlite3 -separator $'\t' /tmp/wf-sync/snapshot.db \
"SELECT id, name, account_type, currency, is_active FROM accounts;" \
> /tmp/wf-sync/accounts.tsv
sqlite3 -separator $'\t' /tmp/wf-sync/snapshot.db <<'SQ' > /tmp/wf-sync/dav.tsv
SELECT id, account_id, valuation_date, account_currency, base_currency,
CAST(fx_rate_to_base AS REAL),
CAST(cash_balance AS REAL),
CAST(investment_market_value AS REAL),
CAST(total_value AS REAL),
CAST(cost_basis AS REAL),
CAST(net_contribution AS REAL)
FROM daily_account_valuation
WHERE account_id != 'TOTAL'; -- synthetic pre-aggregated row; would double-count when summed
SQ
sqlite3 -separator $'\t' /tmp/wf-sync/snapshot.db <<'SQ' > /tmp/wf-sync/activities.tsv
SELECT id, account_id, asset_id, activity_type, activity_date,
CAST(quantity AS REAL),
CAST(unit_price AS REAL),
CAST(amount AS REAL),
CAST(fee AS REAL),
currency,
CAST(fx_rate AS REAL),
notes
FROM activities WHERE status='POSTED';
SQ
# Truncate-and-reload (small tables; simpler than upserts).
psql -v ON_ERROR_STOP=1 <<SQL
BEGIN;
TRUNCATE accounts, daily_account_valuation, activities;
\copy accounts FROM '/tmp/wf-sync/accounts.tsv' WITH (FORMAT csv, DELIMITER E'\t', NULL '');
\copy daily_account_valuation FROM '/tmp/wf-sync/dav.tsv' WITH (FORMAT csv, DELIMITER E'\t', NULL '');
\copy activities FROM '/tmp/wf-sync/activities.tsv' WITH (FORMAT csv, DELIMITER E'\t', NULL '');
COMMIT;
SQL
ROWS=$(psql -tAc "SELECT COUNT(*) FROM daily_account_valuation;")
echo "[$TS] wealthfolio-pg-sync: ok (daily_account_valuation rows=$ROWS)"
rm -f /tmp/wf-sync/*.tsv /tmp/wf-sync/snapshot.db
SCRIPT
chmod +x /scripts/sync.sh
echo "wealthfolio-pg-sync sidecar ready; running initial sync, then hourly at :07"
/scripts/sync.sh || echo "initial sync failed (will retry on next cron tick)"
exec crond -f -l 8
EOT
]
volume_mount {
name = "data"
mount_path = "/data"
read_only = true
}
resources {
requests = { cpu = "10m", memory = "32Mi" }
limits = { memory = "128Mi" }
}
}
volume {
name = "data"
persistent_volume_claim {
@ -374,6 +596,53 @@ resource "kubernetes_cron_job_v1" "wealthfolio_sync" {
}
}
# Plan-time read of the ESO-created K8s Secret for Grafana datasource password.
# First apply: -target=kubernetes_manifest.wealthfolio_sync_db_external_secret first.
data "kubernetes_secret" "wealthfolio_sync_db_creds" {
metadata {
name = "wealthfolio-sync-db-creds"
namespace = kubernetes_namespace.wealthfolio.metadata[0].name
}
depends_on = [kubernetes_manifest.wealthfolio_sync_db_external_secret]
}
# Grafana datasource for wealthfolio_sync PostgreSQL DB.
# Lives in the monitoring namespace so the Grafana sidecar (grafana_datasource=1) picks it up.
resource "kubernetes_config_map" "grafana_wealth_datasource" {
metadata {
name = "grafana-wealth-datasource"
namespace = "monitoring"
labels = {
grafana_datasource = "1"
}
}
data = {
"wealth-datasource.yaml" = yamlencode({
apiVersion = 1
datasources = [{
name = "Wealth"
type = "postgres"
access = "proxy"
url = "${var.postgresql_host}:5432"
user = "wealthfolio_sync"
uid = "wealth-pg"
# Grafana 11.2+ Postgres plugin reads DB name from jsonData.database
# (top-level `database` is silently ignored).
jsonData = {
database = "wealthfolio_sync"
sslmode = "disable"
postgresVersion = 1600
timescaledb = false
}
secureJsonData = {
password = data.kubernetes_secret.wealthfolio_sync_db_creds.data["PGPASSWORD"]
}
editable = true
}]
})
}
}
############################################################################
# Backup sidecar approach
#

View file

@ -11,3 +11,8 @@ dependency "vault" {
config_path = "../vault"
skip_outputs = true
}
dependency "external-secrets" {
config_path = "../external-secrets"
skip_outputs = true
}

View file

@ -73,7 +73,7 @@ resource "kubernetes_deployment" "whisper" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"
@ -195,7 +195,7 @@ resource "kubernetes_deployment" "piper" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"

View file

@ -227,7 +227,7 @@ resource "kubernetes_deployment" "yt_highlights" {
}
spec {
node_selector = {
"gpu" : "true"
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"

File diff suppressed because one or more lines are too long