Compare commits

No commits in common. "2a7124d2664a63e58968498c1be50087593b29f2" and "15c88bc683b6a901a2d141c0771cee955f5ad6ae" have entirely different histories.

19 changed files with 179 additions and 1119 deletions

@@ -29,14 +29,14 @@ Violations cause state drift, which causes future applies to break or silently r
- **New services need CI/CD** and **monitoring** (Prometheus/Uptime Kuma)
- **New service**: Use `setup-project` skill for full workflow
- **Ingress**: `ingress_factory` module. **Auth** (`auth` string enum, default `"required"` — fail-closed). Pick by asking "what gates the app?":
- `auth = "required"` — Authentik forward-auth gates every request. Use when the backend has **no built-in user auth** and Authentik is the only thing standing between strangers and the app (prowlarr, qbittorrent, netbox, phpipam, k8s-dashboard, any admin UI shipped without its own login).
- `auth = "required"` — Authentik forward-auth gates every request. Use when the backend has **no built-in user auth** and Authentik is the only thing standing between strangers and the app (prowlarr, qbittorrent, netbox, phpipam, k8s-dashboard, foolery, any admin UI shipped without its own login).
- `auth = "app"` — the backend handles its own user authentication (NextAuth, Django, OAuth, bearer-token API, etc.); Authentik would only break it. No middleware attached; the app's own login is the gate. Examples: immich, linkwarden, tandoor, freshrss, affine, actualbudget, audiobookshelf, novelapp. **Functionally identical to `"none"`** — the distinct name exists to record intent at the call site.
- `auth = "public"`: Authentik anonymous binding via the dedicated `public` outpost (routes via `traefik-authentik-forward-auth-public` → `ak-outpost-public.authentik.svc:9000`). Strangers are auto-bound to `guest`; logged-in users keep their identity in `X-authentik-username`. **Only works for top-level browser navigation**: CORS preflight rejects XHR/fetch and automation can't replay the cookie dance. Audit trail, not a gate.
- `auth = "none"` — no Authentik, no own-auth claim. Use for Anubis-fronted content (Anubis is the gate), native-client APIs (Git, `/v2/`, WebDAV/CalDAV, CardDAV), webhook receivers, OAuth callbacks, and Authentik outposts themselves.
- **Anti-exposure rule** (the reason `"app"` exists): only pick `"app"` or `"none"` AFTER you've verified the app has its own user auth (`"app"`) OR the endpoint is intentionally public (`"none"`). Default is `"required"` so accidental omission fails closed. **Convention**: when using `"app"` or `"none"`, add a comment line above the `auth = "..."` line stating what gates the app or why it's public. **Enforced by `scripts/tg`**: every `tg plan/apply/destroy/refresh` runs `scripts/check-ingress-auth-comments.py` against the current stack and aborts if any `auth = "app|none"` line lacks the preceding `# auth = "<tier>": ...` comment. Stack-scoped — untouched stacks aren't blocked until they're next edited.
- **Anti-AI**: on by default when `auth = "none"` or `auth = "app"` (no Authentik to discourage bots); redundant on `"required"` and `"public"`.
- **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created — no need to edit `config.tfvars`. Smoke-test target: `echo.viktorbarzin.me` (auth=public, header-reflecting backend).
- **Anubis PoW challenge** (`modules/kubernetes/anubis_instance/`): per-site reverse proxy that issues a 30-day JWT cookie after a tiny PoW solve. Use for **public, content-bearing sites without app-level auth** (blog, docs, wikis, static landing pages). Pattern: declare `module "anubis" { source = "../../modules/kubernetes/anubis_instance"; name = "X"; namespace = ...; target_url = "http://<backend>.<ns>.svc.cluster.local" }`, then in `ingress_factory` set `service_name = module.anubis.service_name`, `port = module.anubis.service_port`, `anti_ai_scraping = false`. Shared ed25519 key in Vault `secret/viktor` -> `anubis_ed25519_key`; cookie scoped to `viktorbarzin.me` so one solve covers all Anubis-fronted subdomains. **DO NOT put Anubis in front of Git/API/WebDAV/CLI endpoints** — clients without JS can't solve PoW. **Replicas default to 1** because Anubis stores in-flight challenges in process memory; a challenge issued by pod A and solved against pod B errors with `store: key not found` (HTTP 500). Bumping replicas requires wiring a shared Redis store (TODO). For path-level carve-outs (e.g. wrongmove has `/` behind Anubis but `/api` direct, blog has `/net-diag.sh` direct), declare a second `ingress_factory` with `ingress_path = ["/<path>"]` pointing at the bare backend service. Active on: blog (except `/net-diag.sh`), www, kms, travel, f1, cc, json, pb (privatebin), home (homepage), wrongmove (UI only). See `.claude/reference/patterns.md` "Anti-AI Scraping" for full layering.
- **Anubis PoW challenge** (`modules/kubernetes/anubis_instance/`): per-site reverse proxy that issues a 30-day JWT cookie after a tiny PoW solve. Use for **public, content-bearing sites without app-level auth** (blog, docs, wikis, static landing pages). Pattern: declare `module "anubis" { source = "../../modules/kubernetes/anubis_instance"; name = "X"; namespace = ...; target_url = "http://<backend>.<ns>.svc.cluster.local" }`, then in `ingress_factory` set `service_name = module.anubis.service_name`, `port = module.anubis.service_port`, `anti_ai_scraping = false`. Shared ed25519 key in Vault `secret/viktor` -> `anubis_ed25519_key`; cookie scoped to `viktorbarzin.me` so one solve covers all Anubis-fronted subdomains. **DO NOT put Anubis in front of Git/API/WebDAV/CLI endpoints** — clients without JS can't solve PoW. **Replicas default to 1** because Anubis stores in-flight challenges in process memory; a challenge issued by pod A and solved against pod B errors with `store: key not found` (HTTP 500). Bumping replicas requires wiring a shared Redis store (TODO). For path-level carve-outs (e.g. wrongmove has `/` behind Anubis but `/api` direct), declare a second `ingress_factory` with `ingress_path = ["/api"]` pointing at the bare backend service. Active on: blog, www, kms, travel, f1, cc, json, pb (privatebin), home (homepage), wrongmove (UI only). See `.claude/reference/patterns.md` "Anti-AI Scraping" for full layering.
- **Docker images**: Always build for `linux/amd64`. SHA-tag rule is being phased out — see `docs/plans/2026-05-16-auto-upgrade-apps-{design,plan}.md`. New model: CI pushes `:latest` (optionally also `:<8-char-sha>` for traceability), Keel polls and triggers rollouts. Cache-staleness concern from the old rule is resolved at the nginx layer (URL-split — manifests pass through, blobs cached). Until Phase 1 of the migration completes (per the plan), follow the SHA-tag rule for new services to match existing pattern.
- **Private registry**: `forgejo.viktorbarzin.me/viktor/<name>` (Forgejo packages, OAuth-style PAT auth). Use `image: forgejo.viktorbarzin.me/viktor/<name>:<tag>` + `imagePullSecrets: [{name: registry-credentials}]`. Kyverno auto-syncs the Secret to all namespaces. Containerd `hosts.toml` on every node redirects to in-cluster Traefik LB `10.0.20.200` to avoid hairpin NAT. Push-side: viktor PAT in Vault `secret/ci/global/forgejo_push_token` (Forgejo container packages are scoped per-user; only the package owner can push, ci-pusher cannot write to viktor/*). Pull-side: cluster-puller PAT in Vault `secret/viktor/forgejo_pull_token`. Retention CronJob (`forgejo-cleanup` in `forgejo` ns, daily 04:00) keeps newest 10 versions + always `:latest`; integrity probed every 15min by `forgejo-integrity-probe` in `monitoring` ns (catalog walk + manifest HEAD on every blob). See `docs/plans/2026-05-07-forgejo-registry-consolidation-{design,plan}.md` for the migration history. Pull-through caches for upstream registries (DockerHub, GHCR, Quay, k8s.gcr, Kyverno) stay on the registry VM at `10.0.20.10` ports 5000/5010/5020/5030/5040 — the old port-5050 R/W private registry was decommissioned 2026-05-07.
- **LinuxServer.io containers**: `DOCKER_MODS` runs apt-get on every start — bake slow mods into a custom image (`RUN /docker-mods || true` then `ENV DOCKER_MODS=`). Set `NO_CHOWN=true` to skip recursive chown that hangs on NFS mounts.

@@ -1,225 +0,0 @@
# Wealth Net-Worth Projections — Design (2026-05-28)
## Goal
Add forward-looking net-worth projections to the existing **`wealth`**
Grafana dashboard. Answer: *"given certain growth rates, where does my
net worth go?"* — with the growth rate sourced either from **fixed
values** (editable) or from my **own historical return** (derived from
the data). Show both pure-compounding and contributing-saver
trajectories.
## Existing state (what we build on)
- **Dashboard**: `wealth.json` (UID `wealth`, 28 panels, Finance
folder), provisioned as a ConfigMap consumed by the Grafana dashboard
sidecar. Datasource: **`wealth-pg`** (Postgres, populated by
`wealthfolio-sync` ETL). Default time range `now-180d/now`. **No
template variables today.**
- **Source view `dav_corrected`** (`infra/stacks/wealthfolio/main.tf`):
wraps `daily_account_valuation`, correcting `net_contribution` by
removing synthetic Fidelity-pension and Schwab-RSU flows so returns
aren't distorted. **All return/contribution panels read this view, and
so must the projection.**
- **Net worth (today)** = `SUM(total_value)` over the *latest-per-account*
rows (`DISTINCT ON (account_id) … ORDER BY valuation_date DESC`). This
is the projection start point `NW₀`.
- **Return methodology already on the dashboard** = **Modified Dietz**:
`(nwₑ - nw₀ - flow) / (nw₀ + 0.5·flow)` where `flow = contribₑ -
contrib₀`. Used by "12mo return" and "Yearly investment return %". The
projection's historical rate reuses this exact formula.
- **Complete-days guard**: panels only trust dates where every active
account reported (`COUNT(*) per date >= (SELECT COUNT(*) FROM
accounts)`), avoiding partial-day skew (witness: memory id=1229, the
£88k-vs-£1.03M bug). The projection reuses this guard.
## Locked decisions
| # | Decision | Choice |
|---|---|---|
| 1 | Compute engine | Pure Postgres SQL on `wealth-pg` (no new service; `fire-planner` Monte Carlo is retirement/withdrawal-oriented and a poor fit for simple growth-rate projection) |
| 2 | Display | Multiple scenario lines |
| 3 | Historical rate basis | All-time annualized Modified Dietz ("all-time CAGR") |
| 4 | Lines | Fixed low/base/high (4/7/10%, editable) **+** a line at the derived historical CAGR |
| 5 | Contributions | Support both; draw both at once at the base rate (with-contrib **and** compounding-only) |
| 6 | Horizon | 30 years (dashboard variable) |
| 7 | Placement | A **collapsed row on the existing `wealth` dashboard** (not a separate dashboard) |
## The projection panel — "Net worth — 30-year projection"
A timeseries panel. Every projected line originates from today's net
worth `NW₀`. Series:
| Series | Rate | Contributions | Line style |
|---|---|---|---|
| Net worth (actual) | — | — | solid (last 3y of real history) |
| Low | `$rate_low` (4%) | with | dashed |
| Base | `$rate_base` (7%) | with | dashed |
| Base — compounding only | `$rate_base` | none | dotted |
| High | `$rate_high` (10%) | with | dashed |
| Historical | `$hist_cagr` (derived) | with | dashed, legend `Historical (X%)` |
The visible gap between **Base** and **Base — compounding only** is the
contribution boost (how much ongoing saving adds over pure market
growth). When `$monthly_contribution = 0` the two lines coincide.
### Projection math
Per future month `n = 0 … horizon_years·12`, with monthly rate
`rm = (1+r)^(1/12) - 1`:
- **Compounding only**: `V(n) = NW₀·(1+rm)ⁿ`
- **With contributions** (ordinary annuity, end-of-period):
`V(n) = NW₀·(1+rm)ⁿ + C·((1+rm)ⁿ - 1)/rm`
(guard: `rm = 0` → `V(n) = NW₀ + C·n`)
`C` = monthly contribution (see `$monthly_contribution` below). Future
timestamps come from `generate_series` against DB `now()`, **not** the
Grafana time picker, so the data always exists; only the axis must be
extended to display it (see Placement).
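The two formulas above can be sanity-checked outside the database with a short Python sketch. This is an illustration only, not the dashboard's SQL: the function names `monthly_rate` and `project` and the sample figures are assumptions made for the example.

```python
def monthly_rate(annual_pct: float) -> float:
    """Convert an annual percentage rate r into the equivalent monthly rate rm."""
    return (1 + annual_pct / 100) ** (1 / 12) - 1


def project(nw0: float, annual_pct: float, months: int, contribution: float = 0.0) -> float:
    """Projected value after `months` months.

    Compounding of the starting value plus an ordinary annuity
    (end-of-period contributions), as in the design's V(n).
    """
    rm = monthly_rate(annual_pct)
    if rm == 0:
        # Guard from the design: at a zero rate the annuity term reduces to C*n.
        return nw0 + contribution * months
    growth = (1 + rm) ** months
    return nw0 * growth + contribution * (growth - 1) / rm


if __name__ == "__main__":
    # Hypothetical inputs: 100,000 start, 7% base rate, 30 years.
    print(round(project(100_000, 7, 360)))          # Base, compounding only
    print(round(project(100_000, 7, 360, 1_000)))   # Base, with 1,000/month
```

At `months = 0` both variants return `nw0`, which matches the requirement that every projected line starts at today's net worth.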
### Derived historical rate (`$hist_cagr`)
Annualized all-time Modified Dietz, computed over the complete-day
window from `dav_corrected`:
```sql
-- d0 = earliest complete day, dn = latest complete day
R_total = (nwₙ - nw₀ - (cₙ - c₀)) / NULLIF(nw₀ + 0.5·(cₙ - c₀), 0)
hist_cagr = (power(1 + R_total, 365.25 / (dn - d0)) - 1) · 100 -- percent
```
This extends the dashboard's existing 12mo/yearly Modified-Dietz formula
to the full history, so the projected "Historical" line is consistent
with the returns already shown. Exposed as a **hidden query variable
`$hist_cagr`** so the projection line *and* its legend label reference
the same computed number.
> Alternative considered: geometric mean of the per-year Modified-Dietz
> returns (more robust to flow timing). Rejected for v1 — annualized
> all-time MD is the faithful reading of "all-time CAGR" and reuses the
> existing formula verbatim. Revisit if the single 0.5 flow-weight
> proves too crude over the multi-year window.
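As a rough cross-check of the annualization step, the same arithmetic can be written in Python. This is a sketch under assumptions: `modified_dietz` and `hist_cagr_pct` are hypothetical helper names, the inputs are made-up numbers, and where the SQL yields NULL via `NULLIF` this version raises an error instead.

```python
from datetime import date


def modified_dietz(nw_start: float, nw_end: float, flow: float) -> float:
    """Period return with the net flow weighted at 0.5 (mid-period assumption)."""
    denominator = nw_start + 0.5 * flow
    if denominator == 0:
        # The SQL returns NULL here; this sketch raises instead.
        raise ValueError("return is undefined: zero denominator")
    return (nw_end - nw_start - flow) / denominator


def hist_cagr_pct(d0: date, dn: date, nw0: float, nwn: float, c0: float, cn: float) -> float:
    """Annualize the all-time Modified Dietz return; result is a percentage."""
    days = (dn - d0).days
    if days <= 0:
        raise ValueError("dn must be later than d0")
    r_total = modified_dietz(nw0, nwn, cn - c0)
    return ((1 + r_total) ** (365.25 / days) - 1) * 100


if __name__ == "__main__":
    # Hypothetical window: one year, 100 -> 120 with 10 of net contributions.
    print(hist_cagr_pct(date(2023, 1, 1), date(2024, 1, 1), 100, 120, 0, 10))
```

With a window much shorter than a year the exponent `365.25 / days` becomes large, which is why the design flags the Historical line as unreliable until roughly a year of data exists.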
## Template variables (new — dashboard has none today)
| Variable | Type | Default | Purpose |
|---|---|---|---|
| `$rate_low` | textbox | `4` | low fixed annual % |
| `$rate_base` | textbox | `7` | base fixed annual % |
| `$rate_high` | textbox | `10` | high fixed annual % |
| `$monthly_contribution` | textbox | `auto` | `auto` → SQL substitutes the trailing-12-complete-month contribution run-rate; or type a number / `0` |
| `$horizon_years` | textbox | `30` | projection length |
| `$hist_cagr` | query (hidden) | computed | derived historical CAGR %, reused by line + label |
`auto` contribution run-rate (trailing 12 complete months):
`(contrib_now - contrib_12mo_ago) / 12`, read from `dav_corrected`
latest-per-account. Note: RSU vests make raw monthly contributions
lumpy; the 12-month run-rate smooths this.
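One way to picture the `auto` substitution is the small Python sketch below. It only illustrates the rule; the real substitution happens in SQL, and the function name `resolve_monthly_contribution` and its arguments are assumptions made for the example.

```python
def resolve_monthly_contribution(setting: str, contrib_now: float, contrib_12mo_ago: float) -> float:
    """Return the monthly contribution C used by the projection.

    "auto" maps to the trailing-12-month run-rate; anything else is
    treated as a user-typed number (including "0").
    """
    if setting.strip().lower() == "auto":
        return (contrib_now - contrib_12mo_ago) / 12
    return float(setting)


if __name__ == "__main__":
    # Hypothetical cumulative contributions: 60,000 now vs 48,000 a year ago.
    print(resolve_monthly_contribution("auto", 60_000, 48_000))
    print(resolve_monthly_contribution("0", 60_000, 48_000))
```

Averaging over twelve months means a single large RSU vest moves the run-rate by one twelfth of its size rather than dominating one month.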
## Supporting panels (same collapsed row)
- **Stat cards**: Net worth today · Historical CAGR (`$hist_cagr`) ·
Recent monthly contribution (the `auto` value) · Projected NW at
horizon @ base · @ historical.
- **Text panel** with one-click time-range links (see Placement).
- *(Optional)* table "Projected net worth by year" — base & historical
columns per year, for exact figures.
## Placement & the Grafana future-axis constraint
Grafana's dashboard time range is **shared by all panels**; per-panel
overrides ("Relative time", "Time shift") only move a window relative to
the picker — neither can set a panel's end to `now+30y` while other
panels stay at `now-180d` (verified against Grafana v11.2 docs;
dashboard `schemaVersion` 39). So a 30-year future axis cannot coexist
on-screen with the 28 history panels without manual time changes.
Resolution (minimizes the clunk, zero edits to existing panels):
1. **Collapsed row** "📈 Projections" at the bottom of the dashboard.
Collapsed by default → the 28 existing panels are untouched and never
show future whitespace.
2. **Text panel with time-range links** inside the row:
- `Show projection range` → `?from=now-3y&to=now%2B30y` (reloads the
dashboard with a future-inclusive axis; projection populates).
- `Reset range` → `?from=now-180d&to=now`.
3. The dashboard **default time stays `now-180d/now`** — unchanged.
4. Projection SQL keys off DB `now()`, independent of the picker, so the
actual-history tail (fixed `>= now()::date - interval '3 years'`)
plus the 30-year projection both render once the range is extended.
This honors "one dashboard, nothing extra to maintain" while making the
future-axis switch a single click.
## Data flow / SQL building blocks
- **Target A (projection, wide format)**: one row per future month;
columns `time, proj_low, proj_base, proj_base_nocontrib, proj_high,
proj_hist`. Grafana renders each numeric column as a series. Row `n=0`
emits `NW₀` for all columns so lines start exactly at today.
- **Target B (actual history)**: `valuation_date, "Net worth (actual)"`
over complete days, last 3 years. Grafana merges A+B on the time
field; the actual series' final point (~today) meets the projections'
`n=0` point.
- Both reuse the `latest-per-account` + `complete-days` CTEs verbatim
from existing panels, against `dav_corrected`.
- Field overrides set line styles (solid/dashed/dotted) and the dynamic
`Historical (${hist_cagr}%)` display name.
## Scope — what does NOT change
- The 28 existing panels, the `wealth-pg` datasource, the `dav_corrected`
view, `wealthfolio-sync`, and the dashboard's default time range.
- No new Kubernetes resources, no new service, no `fire-planner` changes.
- Only additions to `wealth.json`: 1 collapsed row, ~7 panels, ~6
template variables, 2 in-dashboard time-range links.
## Deployment
1. Claim presence: `scripts/presence claim stack:monitoring --purpose
"wealth dashboard projections"`.
2. Edit `infra/stacks/monitoring/modules/monitoring/dashboards/wealth.json`.
3. `scripts/tg apply` the `monitoring` stack → ConfigMap updates → the
Grafana dashboard sidecar reloads `wealth` (no Grafana restart).
4. Verify in Grafana (see below). This is Terraform-managed — no
`kubectl apply`/manual edits (infra Terraform-only rule).
## Verification plan
Dashboards aren't unit-testable, so verification is data + visual:
1. **SQL pre-validation** against live `wealth-pg` (psql): run the
`$hist_cagr` query and the projection query; sanity-check `NW₀` matches
the existing "Net worth (current)" stat, `hist_cagr` is in a plausible
band, and `proj_base` at `n=0` equals `NW₀`, growing monotonically.
2. **JSON validity**: `python -c "import json; json.load(open('wealth.json'))"` and
unique panel `id`s / sane `gridPos`.
3. **Visual** (after apply): expand the Projections row, click `Show
projection range`, confirm 5 projected lines + actual history flow
continuously from today; toggle `$monthly_contribution` between `auto`
and `0` and confirm the Base / Base-compounding-only gap opens/closes;
confirm `Reset range` restores the normal view and the 28 panels are
unaffected.
## Risks / edge cases
- **Rate 0%** → `rm = 0` divide-by-zero, guarded in the annuity term.
- **Negative historical CAGR** (portfolio down all-time) → declining
projection line; still valid.
- **Short history (<1y)** → annualization extrapolates a noisy rate; the
`Historical` line is unreliable until ~1y of data. Acceptable; note in
panel description.
- **Lumpy RSU vests** skew raw monthly contribution → trailing-12-month
run-rate smooths it; the user can override the number anytime.
- **JSON churn**: must keep `wealth.json` valid and panel ids unique;
the row is additive at the end to limit blast radius.
- **Docs**: per execution.md §7, update any affected
`infra/docs/architecture` / service-catalog references for the wealth
dashboard in the same commit (likely none beyond this plan pair).
## Open questions
None — all design decisions resolved with the user (architecture,
display, historical-rate basis, line composition, contribution
rendering, horizon, placement).

@@ -29,15 +29,6 @@ KUBECTL=""
JSON_RESULTS=()
TOTAL_CHECKS=44
# Parallel execution settings. Each check function is self-contained — it
# only reads cluster state and mutates the in-memory counters / JSON_RESULTS
# array. That makes it safe to run them in parallel subshells and stitch the
# results back together in original order. Auto-fix mode forces serial
# execution to keep mutations deterministic.
PARALLEL=true
PARALLEL_JOBS="${HEALTHCHECK_PARALLEL_JOBS:-12}"
TMP_DIR=""
# --- Helpers ---
info() { [[ "$JSON" == true ]] && return 0; echo -e "${BLUE}[INFO]${NC} $*"; }
pass() { PASS_COUNT=$((PASS_COUNT + 1)); [[ "$JSON" == true ]] && return 0; [[ "$QUIET" == true ]] && return 0; echo -e " ${GREEN}[PASS]${NC} $*"; }
@@ -76,100 +67,6 @@ count_lines() {
fi
}
# --- Parallel runner ---
#
# Pattern: each check function is invoked in an isolated subshell, with its
# stdout and stderr redirected to a per-check temp file. The subshell maintains
# its own copy of PASS/WARN/FAIL counters and JSON_RESULTS; after the check
# returns, the subshell appends marker lines (###HCK###PASS:N, ###HCK###WARN:N,
# ###HCK###FAIL:N, ###HCK###JSON:<entry>) so the parent can re-aggregate state.
#
# Marker prefix is chosen to be unlikely to occur in real check output;
# stray matches in stdout would be invisible to the user (the parent strips
# them on replay) but would inflate counters. If a future check needs to
# print a literal "###HCK###PASS:" etc., change the prefix here.
PARALLEL_MARKER='###HCK###'
run_check_in_subshell() {
    # $1 = numeric index (zero-padded), $2 = check function name
    local idx="$1" fn="$2"
    local outfile="$TMP_DIR/${idx}_${fn}.out"
    (
        # Reset counters/json so we capture only this check's delta.
        PASS_COUNT=0; WARN_COUNT=0; FAIL_COUNT=0
        JSON_RESULTS=()
        # Redirect stdout+stderr into the per-check file.
        exec >"$outfile" 2>&1
        # Run the check. The function inherits all globals (KUBECTL, JSON,
        # QUIET, FIX, HA_CACHE_DIR, ...).
        "$fn"
        # Append state delta as marker lines.
        echo "${PARALLEL_MARKER}PASS:${PASS_COUNT}"
        echo "${PARALLEL_MARKER}WARN:${WARN_COUNT}"
        echo "${PARALLEL_MARKER}FAIL:${FAIL_COUNT}"
        local entry
        for entry in "${JSON_RESULTS[@]}"; do
            # Entries are written verbatim, one per line; newlines are not
            # encoded, so a multi-line entry (rare) would not replay cleanly.
            printf '%sJSON:%s\n' "$PARALLEL_MARKER" "$entry"
        done
    ) &
}
run_checks_parallel() {
    # $@ = check function names, in canonical display order.
    local fn idx=0 active=0 padded
    local -a pids=()
    for fn in "$@"; do
        padded=$(printf '%03d' "$idx")
        run_check_in_subshell "$padded" "$fn"
        pids+=($!)
        idx=$((idx + 1))
        active=$((active + 1))
        if [[ "$active" -ge "$PARALLEL_JOBS" ]]; then
            # Wait for any one job to finish; -n requires bash 4.3+.
            # A non-zero exit from a check would otherwise trip `set -e`;
            # checks signal their findings via PASS/WARN/FAIL counters,
            # not exit codes, so we deliberately ignore the status here.
            wait -n || true
            active=$((active - 1))
        fi
    done
    # Drain remaining jobs (same rationale as above for the `|| true`).
    wait || true
}
replay_check_outputs() {
    # Replay temp files in numeric index order so the report reads exactly
    # like the serial run did. Marker lines re-populate counters; everything
    # else is forwarded to stdout.
    local f line stripped
    while IFS= read -r f; do
        while IFS= read -r line || [[ -n "$line" ]]; do
            case "$line" in
                "${PARALLEL_MARKER}PASS:"*)
                    stripped="${line#${PARALLEL_MARKER}PASS:}"
                    PASS_COUNT=$((PASS_COUNT + stripped))
                    ;;
                "${PARALLEL_MARKER}WARN:"*)
                    stripped="${line#${PARALLEL_MARKER}WARN:}"
                    WARN_COUNT=$((WARN_COUNT + stripped))
                    ;;
                "${PARALLEL_MARKER}FAIL:"*)
                    stripped="${line#${PARALLEL_MARKER}FAIL:}"
                    FAIL_COUNT=$((FAIL_COUNT + stripped))
                    ;;
                "${PARALLEL_MARKER}JSON:"*)
                    JSON_RESULTS+=("${line#${PARALLEL_MARKER}JSON:}")
                    ;;
                *)
                    printf '%s\n' "$line"
                    ;;
            esac
        done < "$f"
    done < <(find "$TMP_DIR" -maxdepth 1 -type f -name '*.out' | sort)
}
# --- Argument parsing ---
parse_args() {
while [[ $# -gt 0 ]]; do
@@ -178,19 +75,15 @@ parse_args() {
--no-fix) FIX=false; shift ;;
--quiet|-q) QUIET=true; shift ;;
--json) JSON=true; shift ;;
--serial) PARALLEL=false; shift ;;
--parallel) PARALLEL=true; PARALLEL_JOBS="$2"; shift 2 ;;
--kubeconfig) KUBECONFIG_PATH="$2"; shift 2 ;;
-h|--help)
echo "Usage: $0 [--fix|--no-fix] [--quiet|-q] [--json] [--serial|--parallel N] [--kubeconfig <path>]"
echo "Usage: $0 [--fix|--no-fix] [--quiet|-q] [--json] [--kubeconfig <path>]"
echo ""
echo "Flags:"
echo " --fix Auto-remediate safe issues (delete evicted pods)"
echo " --no-fix Disable auto-remediation (default)"
echo " --quiet, -q Only show WARN and FAIL sections"
echo " --json Machine-readable JSON output"
echo " --serial Run checks sequentially (default: parallel)"
echo " --parallel N Run up to N checks concurrently (default: 12, env HEALTHCHECK_PARALLEL_JOBS)"
echo " --kubeconfig PATH Override kubeconfig (default: \$(pwd)/config)"
exit 0
;;
@@ -2788,51 +2681,50 @@ main() {
fi
fi
# Canonical check order — also defines the order in the human-readable
# report and the JSON `checks` array.
local checks=(
check_nodes check_resources check_conditions check_pods check_evicted
check_daemonsets check_deployments check_pvcs check_hpa check_cronjobs
check_crowdsec check_ingresses check_alerts check_uptime_kuma
check_resourcequota check_statefulsets check_node_disk check_helm_releases
check_kyverno check_nfs check_dns check_tls_certs check_gpu
check_cloudflare_tunnel check_overcommit check_ha_entities
check_ha_integrations check_ha_automations check_ha_system
check_hardware_exporters check_cert_manager_certificates
check_cert_manager_expiry check_cert_manager_requests check_backup_per_db
check_backup_offsite_sync check_backup_lvm_snapshots
check_monitoring_prom_am check_monitoring_vault check_monitoring_css
check_external_replicas check_external_divergence check_pve_thermals
check_pve_load check_external_traefik_5xx
)
# Auto-fix mutates cluster state inside individual checks — keep that
# path serial so the mutation order matches what an operator would
# expect when reading the report top-to-bottom.
if [[ "$FIX" == true ]]; then
PARALLEL=false
fi
if [[ "$PARALLEL" == true ]]; then
TMP_DIR=$(mktemp -d -t cluster-healthcheck.XXXXXX)
# Pre-populate the HA Sofia cache once in the parent so the four HA
# checks share a single API round-trip instead of each subshell
# re-fetching states/entries/config. ha_sofia_fetch_cache installs
# its own EXIT trap for HA_CACHE_DIR cleanup; we re-install ours
# afterwards so both temp dirs get cleaned.
if ha_sofia_available; then
ha_sofia_fetch_cache || true
fi
trap 'rm -rf "$TMP_DIR" "${HA_CACHE_DIR:-}"' EXIT
run_checks_parallel "${checks[@]}"
replay_check_outputs
else
local fn
for fn in "${checks[@]}"; do
"$fn"
done
fi
check_nodes
check_resources
check_conditions
check_pods
check_evicted
check_daemonsets
check_deployments
check_pvcs
check_hpa
check_cronjobs
check_crowdsec
check_ingresses
check_alerts
check_uptime_kuma
check_resourcequota
check_statefulsets
check_node_disk
check_helm_releases
check_kyverno
check_nfs
check_dns
check_tls_certs
check_gpu
check_cloudflare_tunnel
check_overcommit
check_ha_entities
check_ha_integrations
check_ha_automations
check_ha_system
check_hardware_exporters
check_cert_manager_certificates
check_cert_manager_expiry
check_cert_manager_requests
check_backup_per_db
check_backup_offsite_sync
check_backup_lvm_snapshots
check_monitoring_prom_am
check_monitoring_vault
check_monitoring_css
check_external_replicas
check_external_divergence
check_pve_thermals
check_pve_load
check_external_traefik_5xx
print_summary
# Exit code: 2 for failures, 1 for warnings, 0 for clean

@@ -9,7 +9,7 @@ resource "kubernetes_namespace" "website" {
name = "website"
labels = {
"istio-injection" : "disabled"
tier = local.tiers.aux
tier = local.tiers.aux
"keel.sh/enrolled" = "true"
}
}
@@ -150,24 +150,6 @@ module "ingress" {
}
}
# Carve-out for /net-diag.sh, a curl|bash diagnostic script for macOS.
# Anubis can't gate this path because non-JS clients (curl) can't solve PoW.
# Points at the bare blog nginx service, bypassing the Anubis proxy.
module "ingress_net_diag" {
source = "../../modules/kubernetes/ingress_factory"
# auth = "none": public read-only static file (curl|bash diagnostic script). No login, no PoW.
auth = "none"
namespace = kubernetes_namespace.website.metadata[0].name
name = "blog-net-diag"
service_name = kubernetes_service.blog.metadata[0].name
port = "80"
ingress_path = ["/net-diag.sh"]
full_host = "viktorbarzin.me"
dns_type = "none" # DNS already owned by the main blog ingress.
tls_secret_name = var.tls_secret_name
anti_ai_scraping = false # Single static file; nothing for scrapers to mine.
}
# CI retrigger 2026-05-16T13:42:57+00:00: bulk enrollment apply (pipeline #689 killed)
# CI retrigger v2 2026-05-16T13:46:35+00:00

@@ -164,13 +164,6 @@ resource "kubernetes_cron_job_v1" "trading212" {
}
spec {
restart_policy = "OnFailure"
# See imap cron: without fsGroup=10001 the broker user (uid=10001
# gid=999) can't write the sqlite3 journal next to /data/sync.db
# and the dedup.record() call after a successful WF import crashes
# with "attempt to write a readonly database".
security_context {
fs_group = 10001
}
container {
name = "broker-sync"
image = local.broker_sync_image
@@ -247,130 +240,6 @@ resource "kubernetes_cron_job_v1" "trading212" {
}
}
# IBKR Flex Web Service daily sync. Phase 2c deliverable.
# Pulls the Activity Flex Query (Trades + Cash + OpenPositions), maps to
# broker-sync Activities, runs them through the shared pipeline, then
# reconciles broker-reported OpenPositions against WF-computed quantities.
resource "kubernetes_cron_job_v1" "ibkr" {
metadata {
name = "broker-sync-ibkr"
namespace = kubernetes_namespace.broker_sync.metadata[0].name
labels = { app = "broker-sync", component = "ibkr" }
}
spec {
schedule = "0 2 * * *" # 02:00 UK
concurrency_policy = "Forbid"
starting_deadline_seconds = 300
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
job_template {
metadata {}
spec {
backoff_limit = 2
ttl_seconds_after_finished = 86400
template {
metadata {
labels = { app = "broker-sync", component = "ibkr" }
}
spec {
restart_policy = "OnFailure"
security_context {
fs_group = 10001
}
container {
name = "broker-sync"
image = local.broker_sync_image
command = ["broker-sync", "ibkr"]
env {
name = "BROKER_SYNC_DATA_DIR"
value = "/data"
}
env {
name = "WF_SESSION_PATH"
value = "/data/wealthfolio_session.json"
}
env {
name = "WF_BASE_URL"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "wf_base_url"
}
}
}
env {
name = "WF_USERNAME"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "wf_username"
}
}
}
env {
name = "WF_PASSWORD"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "wf_password"
}
}
}
env {
name = "IBKR_FLEX_TOKEN"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "ibkr_flex_token"
}
}
}
env {
name = "IBKR_FLEX_QUERY_ID"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "ibkr_flex_query_id"
}
}
}
env {
name = "IBKR_ACCOUNT_ID_UPSTREAM"
value_from {
secret_key_ref {
name = "broker-sync-secrets"
key = "ibkr_account_id_upstream"
}
}
}
volume_mount {
name = "data"
mount_path = "/data"
}
resources {
requests = { cpu = "20m", memory = "128Mi" }
limits = { memory = "256Mi" }
}
}
volume {
name = "data"
persistent_volume_claim {
claim_name = kubernetes_persistent_volume_claim.data_encrypted.metadata[0].name
}
}
}
}
}
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
}
}
# IMAP ingest: InvestEngine + Schwab email parsers, one combined pod.
# Phase 2 deliverable. Defined ahead of implementation so the rollout is
# one `tf apply` once the image supports the CLI subcommand.
@@ -385,17 +254,11 @@ resource "kubernetes_cron_job_v1" "imap" {
concurrency_policy = "Forbid"
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
# 2026-05-27: RESUSPENDED. Despite BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS=invest-engine
# being set on the cronjob (commit a4dab03), 39 IMAP-source IE BUYs were
# re-inserted into Wealthfolio at 2026-05-27T09:22:18 UTC: exactly the
# rows I'd deleted yesterday during the £252k dedup. The scheduled cron run
# at 02:30 UTC today logged `ie_skipped=53` (skip is working), so the 09:22
# source is something else we haven't pinpointed yet. Suspending eliminates
# one possible vector (e.g., manual reruns / replay queues / future bugs
# where the exclude env doesn't bind correctly). Schwab vest ingestion is
# the only thing we lose; it can be unsuspended once the IE re-dup root
# cause is fixed (researcher subagent investigating; beads task pending).
# Also see code-9ko8 (pre-existing reliability issues).
# Unsuspended 2026-04-19 for RSU vest ground-truth ingestion: the parser
# now detects Schwab Release Confirmations and scaffolds VestEvents; the
# postgres sink that persists them into payslip_ingest.rsu_vest_events is
# pending a real-email fixture and cross-service DB grant (see
# follow-up beads task filed under the RSU tax spike fix epic).
suspend = false
job_template {
metadata {}
@@ -427,20 +290,14 @@ resource "kubernetes_cron_job_v1" "imap" {
name = "BROKER_SYNC_DATA_DIR"
value = "/data"
}
# 2026-05-27 (afternoon, post-incident): IE-via-IMAP is now
# STRUCTURALLY OPT-IN at the code level: broker_sync.providers.imap
# default-excludes `invest-engine`. The earlier "standardise on IMAP
# for IE" comment was inverted after a sibling Claude session ran
# broker-sync imap-ingest at 09:22 UTC without the EXCLUDE env and
# re-imported the 39 IE BUYs/DEPOSITs the previous day's dedup had
# removed. To re-enable IE-via-IMAP, add:
# env {
# name = "BROKER_SYNC_IMAP_INCLUDE_PROVIDERS"
# value = "invest-engine"
# }
# Until that env is set, only Schwab is parsed (the canonical use
# of the IMAP path, since Schwab has no public API).
# See post-mortem in beads code-dc1b.
# 2026-05-26: skip InvestEngine email parsing. IE has its own
# bearer-token API path (`broker-sync invest-engine`); running
# both produces duplicate BUYs in Wealthfolio because the two
# generate different external_ids for the same fill.
env {
name = "BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS"
value = "invest-engine"
}
env {
name = "WF_SESSION_PATH"
value = "/data/wealthfolio_session.json"
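The comments above describe an opt-in rule: `invest-engine` is excluded from IMAP parsing by default, `BROKER_SYNC_IMAP_INCLUDE_PROVIDERS` re-enables it, and `BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS` switches providers off. A minimal Python sketch of how such a rule could resolve is shown below. The helper `resolve_imap_providers`, the constants, and the choice that an explicit exclude beats an include are all assumptions for illustration; the real `broker_sync.providers.imap` logic is not part of this diff and may differ.

```python
import os

# Hypothetical names for illustration; not taken from broker_sync itself.
KNOWN_PROVIDERS = ("invest-engine", "schwab")
DEFAULT_EXCLUDED = frozenset({"invest-engine"})


def _csv(value):
    """Split a comma-separated env value into a set of trimmed names."""
    return {part.strip() for part in (value or "").split(",") if part.strip()}


def resolve_imap_providers(env=None):
    """Return the providers an IMAP run would parse, in a stable order."""
    env = os.environ if env is None else env
    included = _csv(env.get("BROKER_SYNC_IMAP_INCLUDE_PROVIDERS"))
    excluded = _csv(env.get("BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS"))
    # Default-excluded providers stay off unless explicitly included;
    # an explicit exclude always wins (an assumption of this sketch).
    blocked = (DEFAULT_EXCLUDED - included) | excluded
    return [name for name in KNOWN_PROVIDERS if name not in blocked]
```

With no env set, the sketch parses only Schwab, which matches the "only Schwab is parsed" behaviour the comment describes. A run that forgets the exclude env is then still safe, which is the failure mode the 09:22 incident exposed.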

View file

@ -348,14 +348,9 @@ resource "kubernetes_service" "fire_planner" {
}
}
# Monthly recompute on the 2nd at 09:00 UTC.
#
# This runs `recompute-all` (the Monte Carlo Cartesian sweep), NOT
# `ingest`. The /networth path no longer depends on an ingest CronJob
# as of 2026-05-27: the account_snapshot cache is refreshed lazily on
# every /networth, /networth/history, /progress request when older than
# NETWORTH_CACHE_TTL_DAYS (default 1). See
# fire_planner/ingest/wealthfolio.py :: refresh_account_snapshots_if_stale.
# Monthly recompute on the 2nd at 09:00 UTC. Wealthfolio-sync runs on
# the 1st at 08:00, so account_snapshot is fresh by the time the
# planner picks up.
resource "kubernetes_cron_job_v1" "fire_planner_recompute" {
metadata {
name = "fire-planner-recompute"
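The comment above says the account_snapshot cache is refreshed lazily on read once it is older than `NETWORTH_CACHE_TTL_DAYS` (default 1). A minimal Python sketch of that read-through pattern follows. The names `is_snapshot_stale` and `get_networth`, and the dict-shaped cache, are hypothetical; the body of the real `refresh_account_snapshots_if_stale` is not shown in this diff.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL_DAYS = 1  # mirrors the documented NETWORTH_CACHE_TTL_DAYS default


def is_snapshot_stale(last_refreshed, now, ttl_days=DEFAULT_TTL_DAYS):
    """True when the cache has never been filled or is older than the TTL."""
    if last_refreshed is None:
        return True
    return now - last_refreshed > timedelta(days=ttl_days)


def get_networth(cache, refresh, now=None):
    """Refresh lazily on read, then serve from the cache.

    `cache` is a dict with "refreshed_at" and "snapshots" keys and `refresh`
    is a zero-argument callable returning fresh snapshots (both hypothetical).
    """
    now = now or datetime.now(timezone.utc)
    if is_snapshot_stale(cache.get("refreshed_at"), now):
        cache["snapshots"] = refresh()
        cache["refreshed_at"] = now
    return cache["snapshots"]
```

Under this pattern the monthly CronJob only has to run the recompute sweep, because freshness is enforced on the request path instead of by a scheduled ingest.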

78
stacks/foolery/main.tf Normal file

@ -0,0 +1,78 @@
variable "tls_secret_name" {
type = string
sensitive = true
}
resource "kubernetes_namespace" "foolery" {
metadata {
name = "foolery"
labels = {
"istio-injection" : "disabled"
tier = local.tiers.aux
"keel.sh/enrolled" = "true"
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
}
module "tls_secret" {
source = "../../modules/kubernetes/setup_tls_secret"
namespace = kubernetes_namespace.foolery.metadata[0].name
tls_secret_name = var.tls_secret_name
}
# Service + Endpoints to reverse-proxy to Foolery at 10.0.10.10:3210
resource "kubernetes_service" "foolery" {
metadata {
name = "foolery"
namespace = kubernetes_namespace.foolery.metadata[0].name
labels = {
app = "foolery"
}
}
spec {
port {
name = "http"
port = 80
target_port = 3210
}
}
}
resource "kubernetes_endpoints" "foolery" {
metadata {
name = "foolery"
namespace = kubernetes_namespace.foolery.metadata[0].name
}
subset {
address {
ip = "10.0.10.10"
}
port {
name = "http"
port = 3210
}
}
}
module "ingress" {
source = "../../modules/kubernetes/ingress_factory"
dns_type = "proxied"
namespace = kubernetes_namespace.foolery.metadata[0].name
name = "foolery"
tls_secret_name = var.tls_secret_name
auth = "required"
extra_annotations = {
"gethomepage.dev/enabled" = "true"
"gethomepage.dev/name" = "Foolery"
"gethomepage.dev/description" = "Agent orchestration control room"
"gethomepage.dev/icon" = "mdi-robot"
"gethomepage.dev/group" = "AI"
"gethomepage.dev/pod-selector" = ""
}
}

1
stacks/foolery/secrets Symbolic link

@ -0,0 +1 @@
../../secrets

View file

@ -0,0 +1,3 @@
include "root" {
path = find_in_parent_folders()
}

View file

@ -621,7 +621,7 @@
"h": 11,
"w": 24,
"x": 0,
"y": 70
"y": 40
},
"fieldConfig": {
"defaults": {
@ -686,7 +686,7 @@
"h": 10,
"w": 24,
"x": 0,
"y": 81
"y": 51
},
"fieldConfig": {
"defaults": {
@ -790,7 +790,7 @@
"h": 14,
"w": 24,
"x": 0,
"y": 134
"y": 104
},
"fieldConfig": {
"defaults": {
@ -1565,7 +1565,7 @@
"h": 11,
"w": 24,
"x": 0,
"y": 91
"y": 61
},
"fieldConfig": {
"defaults": {
@ -1659,7 +1659,7 @@
"h": 11,
"w": 24,
"x": 0,
"y": 102
"y": 72
},
"fieldConfig": {
"defaults": {
@ -1777,7 +1777,7 @@
"h": 11,
"w": 24,
"x": 0,
"y": 113
"y": 83
},
"fieldConfig": {
"defaults": {
@ -1882,7 +1882,7 @@
"h": 10,
"w": 24,
"x": 0,
"y": 124
"y": 94
},
"fieldConfig": {
"defaults": {
@ -1962,7 +1962,7 @@
"h": 8,
"w": 12,
"x": 0,
"y": 62
"y": 32
},
"fieldConfig": {
"defaults": {
@ -2156,7 +2156,7 @@
"uid": "wealth-pg"
},
"gridPos": {
"y": 62,
"y": 32,
"x": 12,
"w": 12,
"h": 8
@ -2273,7 +2273,7 @@
"uid": "wealth-pg"
},
"gridPos": {
"y": 148,
"y": 118,
"x": 0,
"w": 24,
"h": 12
@ -2485,357 +2485,6 @@
"rawSql": "WITH lots AS (SELECT id, activity_date::date AS vest_date, quantity, unit_price AS vest_price, SUM(quantity) OVER (ORDER BY activity_date, id) AS lot_end, SUM(quantity) OVER (ORDER BY activity_date, id) - quantity AS lot_start FROM activities WHERE asset_id='4f60833d-0bfb-484f-8ee6-f129af72e137' AND activity_type='BUY'), sells AS (SELECT activity_date::date AS sell_date, quantity AS sell_qty, unit_price AS sell_price, SUM(quantity) OVER (ORDER BY activity_date, id) AS sell_end, SUM(quantity) OVER (ORDER BY activity_date, id) - quantity AS sell_start FROM activities WHERE asset_id='4f60833d-0bfb-484f-8ee6-f129af72e137' AND activity_type='SELL'), matched AS (SELECT l.vest_date, l.vest_price, s.sell_date, s.sell_price, GREATEST(LEAST(l.lot_end, s.sell_end) - GREATEST(l.lot_start, s.sell_start), 0::numeric) AS qty FROM lots l CROSS JOIN sells s WHERE LEAST(l.lot_end, s.sell_end) > GREATEST(l.lot_start, s.sell_start)) SELECT vest_date, SUM(qty) AS \"shares sold\", (SUM(qty*vest_price)/NULLIF(SUM(qty),0)) AS \"vest price\", SUM(qty*vest_price) AS \"vest value\", (SUM(qty*sell_price)/NULLIF(SUM(qty),0)) AS \"avg sell price\", SUM(qty*sell_price) AS \"sell value\", SUM(qty*(sell_price-vest_price)) AS \"realized PNL\", (SUM(qty*(sell_price-vest_price))/NULLIF(SUM(qty*vest_price),0)*100) AS \"PNL %\", AVG((sell_date-vest_date)) AS \"days held (avg)\" FROM matched GROUP BY vest_date ORDER BY vest_date"
}
]
},
{
"id": 32,
"title": "Net pay earned vs market gains (cumulative)",
"description": "Active vs passive income race. Blue = total take-home pay earned from work (cumulative net_pay, payslip_ingest). Green = total market gains = portfolio value minus contributions (cumulative, wealthfolio_sync dav_corrected; matches the 'Growth over time' panel). Set the time range wide (e.g. 2019 → now) to see the full climb from £0.",
"type": "timeseries",
"datasource": {
"type": "datasource",
"uid": "-- Mixed --"
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 32
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "currencyGBP",
"decimals": 0,
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 10,
"gradientMode": "opacity",
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"axisPlacement": "auto",
"stacking": {
"group": "A",
"mode": "none"
}
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "net_pay_cum"
},
"properties": [
{
"id": "displayName",
"value": "Net pay earned (cumulative)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "blue"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "market_gain_cum"
},
"properties": [
{
"id": "displayName",
"value": "Market gains (cumulative)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#56A64B"
}
}
]
}
]
},
"options": {
"legend": {
"calcs": [
"last",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "WITH m AS (SELECT pay_date, SUM(net_pay) AS net_pay FROM payslip_ingest.payslip GROUP BY pay_date), cum AS (SELECT pay_date, SUM(net_pay) OVER (ORDER BY pay_date) AS net_pay_cum FROM m) SELECT pay_date::timestamp AS \"time\", net_pay_cum FROM cum WHERE $__timeFilter(pay_date) ORDER BY pay_date"
},
{
"refId": "B",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "wealth-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "WITH active_count AS (SELECT COUNT(*) AS n FROM accounts), max_complete AS (SELECT MAX(valuation_date) AS d FROM (SELECT d.valuation_date, COUNT(*) AS c FROM dav_corrected d JOIN accounts a ON a.id = d.account_id GROUP BY d.valuation_date) x WHERE c >= (SELECT n FROM active_count)) SELECT valuation_date::timestamp AS \"time\", (SUM(total_value) - SUM(net_contribution)) AS market_gain_cum FROM dav_corrected WHERE $__timeFilter(valuation_date) AND valuation_date <= (SELECT d FROM max_complete) GROUP BY valuation_date ORDER BY valuation_date"
}
]
},
{
"id": 33,
"title": "Net pay vs market gain — per year",
"description": "Each calendar year: take-home pay earned (blue, SUM net_pay) vs market gain generated that year (green, change in portfolio value minus contributions across the year). Shows full history regardless of the time picker.",
"type": "timeseries",
"datasource": {
"type": "datasource",
"uid": "-- Mixed --"
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 42
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "currencyGBP",
"decimals": 0,
"custom": {
"drawStyle": "bars",
"barAlignment": 0,
"lineWidth": 1,
"fillOpacity": 70,
"gradientMode": "none",
"pointSize": 5,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto",
"stacking": {
"group": "A",
"mode": "none"
}
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "net_pay_year"
},
"properties": [
{
"id": "displayName",
"value": "Net pay (year)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "blue"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "market_gain_year"
},
"properties": [
{
"id": "displayName",
"value": "Market gain (year)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#56A64B"
}
}
]
}
]
},
"options": {
"legend": {
"calcs": [
"sum"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT date_trunc('year', pay_date)::timestamp AS \"time\", SUM(net_pay) AS net_pay_year FROM payslip_ingest.payslip GROUP BY 1 ORDER BY 1"
},
{
"refId": "B",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "wealth-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "WITH active_count AS (SELECT COUNT(*) AS n FROM accounts), max_complete AS (SELECT MAX(valuation_date) AS d FROM (SELECT d.valuation_date, COUNT(*) AS c FROM dav_corrected d JOIN accounts a ON a.id = d.account_id GROUP BY d.valuation_date) x WHERE c >= (SELECT n FROM active_count)), yearly AS (SELECT date_trunc('year', valuation_date)::date AS yr, valuation_date, SUM(total_value) AS nw, SUM(net_contribution) AS contrib FROM dav_corrected WHERE valuation_date <= (SELECT d FROM max_complete) GROUP BY valuation_date), endpoints AS (SELECT yr, (array_agg(nw ORDER BY valuation_date ASC))[1] AS nw_start, (array_agg(nw ORDER BY valuation_date DESC))[1] AS nw_end, (array_agg(contrib ORDER BY valuation_date ASC))[1] AS contrib_start, (array_agg(contrib ORDER BY valuation_date DESC))[1] AS contrib_end FROM yearly GROUP BY yr) SELECT yr::timestamp AS \"time\", ROUND((nw_end - nw_start - (contrib_end - contrib_start))::numeric, 0) AS market_gain_year FROM endpoints ORDER BY yr"
}
]
},
{
"id": 34,
"title": "Net pay vs market gain — per month",
"description": "Each month: take-home pay (blue, SUM net_pay) vs market gain that month (green, change in portfolio value minus contributions). Monthly market swings are volatile; the yearly panel above is the smoother read. Shows full history regardless of the time picker.",
"type": "timeseries",
"datasource": {
"type": "datasource",
"uid": "-- Mixed --"
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 52
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "currencyGBP",
"decimals": 0,
"custom": {
"drawStyle": "bars",
"barAlignment": 0,
"lineWidth": 1,
"fillOpacity": 70,
"gradientMode": "none",
"pointSize": 5,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto",
"stacking": {
"group": "A",
"mode": "none"
}
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "net_pay_month"
},
"properties": [
{
"id": "displayName",
"value": "Net pay (month)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "blue"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "market_gain_month"
},
"properties": [
{
"id": "displayName",
"value": "Market gain (month)"
},
{
"id": "color",
"value": {
"mode": "fixed",
"fixedColor": "#56A64B"
}
}
]
}
]
},
"options": {
"legend": {
"calcs": [
"sum"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT date_trunc('month', pay_date)::timestamp AS \"time\", SUM(net_pay) AS net_pay_month FROM payslip_ingest.payslip GROUP BY 1 ORDER BY 1"
},
{
"refId": "B",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "wealth-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "WITH active_count AS (SELECT COUNT(*) AS n FROM accounts), max_complete AS (SELECT MAX(valuation_date) AS d FROM (SELECT d.valuation_date, COUNT(*) AS c FROM dav_corrected d JOIN accounts a ON a.id = d.account_id GROUP BY d.valuation_date) x WHERE c >= (SELECT n FROM active_count)), monthly AS (SELECT date_trunc('month', valuation_date)::date AS month, valuation_date, SUM(total_value) AS nw, SUM(net_contribution) AS contrib FROM dav_corrected WHERE valuation_date <= (SELECT d FROM max_complete) GROUP BY valuation_date), endpoints AS (SELECT month, (array_agg(nw ORDER BY valuation_date ASC))[1] AS nw_start, (array_agg(nw ORDER BY valuation_date DESC))[1] AS nw_end, (array_agg(contrib ORDER BY valuation_date ASC))[1] AS contrib_start, (array_agg(contrib ORDER BY valuation_date DESC))[1] AS contrib_end FROM monthly GROUP BY month) SELECT month::timestamp AS \"time\", ROUND((nw_end - nw_start - (contrib_end - contrib_start))::numeric, 0) AS market_gain_month FROM endpoints ORDER BY month"
}
]
}
],
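The per-year and per-month panels above compute market gain as `nw_end - nw_start - (contrib_end - contrib_start)`, taking the first and last valuation inside each period. A minimal Python sketch of the same arithmetic is shown below, assuming rows are already summed across accounts per date; the function name `market_gain_by_period` and the tuple layout are illustrative and not part of the dashboard.

```python
from collections import defaultdict
from datetime import date


def market_gain_by_period(rows, period_key):
    """Market gain per period from (valuation_date, total_value, net_contribution) rows.

    Gain = change in total value minus change in net contribution, measured
    between the first and last valuation that fall inside each period.
    """
    by_period = defaultdict(list)
    for valuation_date, total_value, net_contribution in rows:
        by_period[period_key(valuation_date)].append(
            (valuation_date, total_value, net_contribution)
        )

    gains = {}
    for period, items in by_period.items():
        items.sort(key=lambda item: item[0])  # order by valuation_date
        _, nw_start, contrib_start = items[0]
        _, nw_end, contrib_end = items[-1]
        gains[period] = (nw_end - nw_start) - (contrib_end - contrib_start)
    return gains


rows = [
    (date(2024, 1, 1), 1000, 1000),
    (date(2024, 12, 31), 1500, 1200),
    (date(2025, 1, 1), 1500, 1200),
    (date(2025, 6, 30), 1400, 1300),
]
yearly = market_gain_by_period(rows, lambda d: d.year)
```

Because each period is measured from its own first valuation, any move between the last valuation of one period and the first of the next is not attributed to either period; the sketch keeps that property of the SQL rather than correcting it.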
"refresh": "5m",

View file

@ -2141,18 +2141,13 @@ serverFiles:
annotations:
summary: "5xx rate on {{ $labels.service }}: {{ $value | printf \"%.1f\" }}% (threshold: 10%)"
- alert: HighService4xxRate
# `.*catchall-error-pages.*` is excluded because that IngressRoute
# is the wildcard `HostRegexp(^(.+\.)?viktorbarzin\.me$)` handler
# — its entire purpose is to return 404 for unmatched hostnames
# (typos + scanner traffic), so its 4xx rate is permanently ~100%.
# Without this exclusion the alert is a perpetual false positive.
expr: |
(
sum(rate(traefik_service_requests_total{code=~"4..", service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*|.*catchall-error-pages.*"}[5m])) by (service)
/ sum(rate(traefik_service_requests_total{service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*|.*catchall-error-pages.*"}[5m])) by (service)
sum(rate(traefik_service_requests_total{code=~"4..", service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*"}[5m])) by (service)
/ sum(rate(traefik_service_requests_total{service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*"}[5m])) by (service)
* 100
) > 30
and sum(rate(traefik_service_requests_total{service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*|.*catchall-error-pages.*"}[5m])) by (service) > 0.1
and sum(rate(traefik_service_requests_total{service!~".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*"}[5m])) by (service) > 0.1
and on() (time() - process_start_time_seconds{job="prometheus"}) > 900
for: 15m
labels:
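The `HighService4xxRate` expression above combines three conditions per service: the service is not on the exclusion list, its 4xx share exceeds 30%, and its total request rate exceeds 0.1 req/s. A minimal Python sketch of that evaluation is shown below, using the variant of the exclusion regex that includes `catchall-error-pages`. The function name, the input shape, and the sample service names are hypothetical, and the sketch leaves out the `for: 15m` hold and the Prometheus-uptime guard.

```python
import re

# Prometheus regex matchers are fully anchored, hence fullmatch below.
EXCLUDED = re.compile(
    r".*nextcloud.*|.*grafana.*|.*linkwarden.*|.*claude-memory.*"
    r"|.*catchall-error-pages.*"
)
FOUR_XX = re.compile(r"4..")


def services_over_4xx_threshold(rates, threshold_pct=30.0, min_rps=0.1):
    """Return {service: 4xx percentage} for services that would match.

    `rates` maps service name -> {status code: requests per second}.
    """
    firing = {}
    for service, by_code in rates.items():
        if EXCLUDED.fullmatch(service):
            continue
        total = sum(by_code.values())
        if total <= min_rps:  # too little traffic for a meaningful ratio
            continue
        four_xx = sum(
            rate for code, rate in by_code.items() if FOUR_XX.fullmatch(str(code))
        )
        pct = four_xx / total * 100
        if pct > threshold_pct:
            firing[service] = pct
    return firing


sample = {
    "ns-app@kubernetes": {"200": 1.0, "404": 1.0},
    "traefik-catchall-error-pages@kubernetes": {"404": 5.0},
    "quiet-app@kubernetes": {"404": 0.05},
}
firing = services_over_4xx_threshold(sample)
```

In the sample, the catch-all service serves nothing but 404s yet is skipped by the exclusion, which is the false positive the comment in the rule describes.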

View file

@ -100,13 +100,7 @@ resource "kubernetes_service" "proxmox-exporter" {
"app" = "proxmox-exporter"
}
annotations = {
# Use scrape_slow (5m interval, 30s timeout in prometheus values) because
# the PVE API endpoint regularly takes ~11s with ~1000 k8s-csi LVs on the
# host, blowing past the default 10s scrape_timeout and flapping the
# ProxmoxMetricsMissing + ScrapeTargetDown alerts. The slow job is gated
# by the `prometheus_io_scrape_slow=true` annotation in
# prometheus_chart_values.tpl and also excludes us from the fast job.
"prometheus.io/scrape_slow" = "true"
"prometheus.io/scrape" = "true"
"prometheus.io/port" = 9221
"prometheus.io/path" = "/pve"
"prometheus.io/param_target" = "192.168.1.127"

View file

@ -128,25 +128,3 @@ provider "registry.terraform.io/hashicorp/vault" {
"zh:ff35fb1ab6add288f0f368981e56f780b50405accd1937131cba1137999c8d83",
]
}
provider "registry.terraform.io/telmate/proxmox" {
version = "3.0.2-rc07"
constraints = "3.0.2-rc07"
hashes = [
"h1:zp5hpQJQ4t4zROSLqdltVpBO+Riy9VugtfFbpyTw1aM=",
"zh:2ee860cd0a368b3eaa53f4a9ea46f16dab8a97929e813ea6ef55183f8112c2ca",
"zh:415965fd915bae2040d7f79e45f64d6e3ae61149c10114efeac1b34687d7296c",
"zh:6584b2055df0e32062561c615e3b6b2c291ca8c959440adda09ef3ec1e1436bd",
"zh:65dcfad71928e0a8dd9befc22524ed686be5020b0024dc5cca5184c7420eeb6b",
"zh:7253dc29bd265d33f2791ac4f779c5413f16720bb717de8e6c5fcb2c858648ea",
"zh:7ec8993da10a47606670f9f67cfd10719a7580641d11c7aa761121c4a2bd66fb",
"zh:999a3f7a9dcf517967fc537e6ec930a8172203642fb01b8e1f78f908373db210",
"zh:a50e6df7280eb6584a5fd2456e3f5b6df13b2ec8a7fa4605511e438e1863be42",
"zh:b25b329a1e42681c509d027fee0365414f0cc5062b65690cfc3386aab16132ae",
"zh:c028877fdb438ece48f7bc02b65bbae9ca7b7befbd260e519ccab6c0cbb39f26",
"zh:cf0eaa3ea9fcc6d62793637947f1b8d7c885b6ad74695ab47e134e4ff132190f",
"zh:d5ade3fae031cc629b7c512a7b60e46570f4c41665e88a595d7efd943dde5ab2",
"zh:f388c15ad1ecfc09e7361e3b98bae9b627a3a85f7b908c9f40650969c949901c",
"zh:f415cc6f735a3971faae6ac24034afdb9ee83373ef8de19a9631c187d5adc7db",
]
}

View file

@ -30,14 +30,7 @@ resource "kubernetes_namespace" "nextcloud" {
tier = local.tiers.edge
"resource-governance/custom-limitrange" = "true"
"resource-governance/custom-quota" = "true"
# Keel disabled for nextcloud: the 2026-05-26 Keel-driven bump
# 32.0.3-apache to 32.0.9-apache left the pod in maintenance mode
# (needsDbUpgrade=true) for ~22h because Keel doesn't run
# `occ upgrade` after rolling the image. Defense-in-depth:
# (a) namespace not enrolled here, (b) workload carries the
# `keel.sh/policy=never` label + annotation below so even if the
# ns label gets re-added, Kyverno excludes this Deployment.
# "keel.sh/enrolled" = "true"
"keel.sh/enrolled" = "true"
}
}
lifecycle {
@ -46,41 +39,6 @@ resource "kubernetes_namespace" "nextcloud" {
}
}
# Workload-level Keel opt-out (see namespace comment above).
# Keel reads the ANNOTATION `keel.sh/policy` (it's what un-tracks the
# image watcher); the LABEL exists for the Kyverno exclude rule in
# `inject-keel-annotations` (defense-in-depth in case the namespace
# label gets re-added later). Both are set via these helper resources
# because the nextcloud chart 8.8.1 doesn't expose Deployment-level
# commonLabels / commonAnnotations.
resource "kubernetes_labels" "nextcloud_keel_optout" {
api_version = "apps/v1"
kind = "Deployment"
metadata {
name = "nextcloud"
namespace = kubernetes_namespace.nextcloud.metadata[0].name
}
labels = {
"keel.sh/policy" = "never"
}
force = true
depends_on = [helm_release.nextcloud]
}
resource "kubernetes_annotations" "nextcloud_keel_optout" {
api_version = "apps/v1"
kind = "Deployment"
metadata {
name = "nextcloud"
namespace = kubernetes_namespace.nextcloud.metadata[0].name
}
annotations = {
"keel.sh/policy" = "never"
}
force = true
depends_on = [helm_release.nextcloud]
}
resource "kubernetes_manifest" "external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"

View file

@ -20,10 +20,6 @@ terraform {
source = "gavinbunney/kubectl"
version = "~> 1.14"
}
proxmox = {
source = "telmate/proxmox"
version = "3.0.2-rc07"
}
}
}

View file

@ -21,5 +21,5 @@ inputs = {
# priority-pass repo HEAD, auto-bumped by GHA `build-and-deploy.yml`
# on every successful build. Manual edits welcome for local trials,
# but CI will overwrite on the next push to main.
image_tag = "4ce9e8e8"
image_tag = "88f18e53"
}

View file

@ -28,7 +28,7 @@ locals {
# trips a sticky multi-hour 429 on Sonnet after 5-10 burst calls.
# Switch to "claude-sonnet-4-5" if/when the Enterprise quota allows.
TRADING_MEET_KEVIN_LLM_MODEL = "claude-haiku-4-5-20251001"
TRADING_MEET_KEVIN_PROMPT_VERSION = "v2"
TRADING_MEET_KEVIN_PROMPT_VERSION = "v1"
}
}
@ -76,8 +76,6 @@ resource "kubernetes_manifest" "external_secret" {
DBAAS_ROOT_PASSWORD = "{{ .dbaas_root_password }}"
TRADING_ANTHROPIC_OAUTH_TOKEN = "{{ .anthropic_oauth_token }}"
TRADING_MEET_KEVIN_CHANNEL_ID = "{{ .meet_kevin_channel_id }}"
TRADING_SLACK_WEBHOOK_URL = "{{ .slack_webhook_url }}"
TRADING_SLACK_BOT_TOKEN = "{{ .slack_bot_token }}"
}
}
}
@ -92,9 +90,6 @@ resource "kubernetes_manifest" "external_secret" {
{ secretKey = "dbaas_root_password", remoteRef = { key = "trading-bot", property = "dbaas_root_password" } },
{ secretKey = "anthropic_oauth_token", remoteRef = { key = "trading-bot", property = "anthropic_oauth_token" } },
{ secretKey = "meet_kevin_channel_id", remoteRef = { key = "trading-bot", property = "meet_kevin_channel_id" } },
{ secretKey = "slack_webhook_url", remoteRef = { key = "trading-bot", property = "slack_webhook_url" } },
# slack_bot_token is sourced from secret/viktor (shared bot identity), NOT secret/trading-bot.
{ secretKey = "slack_bot_token", remoteRef = { key = "viktor", property = "slack_bot_token" } },
]
}
}
@ -593,12 +588,6 @@ resource "kubernetes_deployment" "trading-bot-workers" {
name = "TRADING_KEVIN_DAILY_LOSS_CIRCUIT_PCT"
value = "0.05"
}
# Slack channel for trade alerts. User must create #trading-bot
# in Slack UI; bot uses chat:write.public so no invite needed.
env {
name = "TRADING_SLACK_CHANNEL"
value = "trading-bot"
}
env_from {
secret_ref {
name = "trading-bot-secrets"

View file

@ -29,21 +29,6 @@ provider "registry.terraform.io/gavinbunney/kubectl" {
constraints = "~> 1.14"
hashes = [
"h1:9QkxPjp0x5FZFfJbE+B7hBOoads9gmdfj9aYu5N4Sfc=",
"zh:1dec8766336ac5b00b3d8f62e3fff6390f5f60699c9299920fc9861a76f00c71",
"zh:43f101b56b58d7fead6a511728b4e09f7c41dc2e3963f59cf1c146c4767c6cb7",
"zh:4c4fbaa44f60e722f25cc05ee11dfaec282893c5c0ffa27bc88c382dbfbaa35c",
"zh:51dd23238b7b677b8a1abbfcc7deec53ffa5ec79e58e3b54d6be334d3d01bc0e",
"zh:5afc2ebc75b9d708730dbabdc8f94dd559d7f2fc5a31c5101358bd8d016916ba",
"zh:6be6e72d4663776390a82a37e34f7359f726d0120df622f4a2b46619338a168e",
"zh:72642d5fcf1e3febb6e5d4ae7b592bb9ff3cb220af041dbda893588e4bf30c0c",
"zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425",
"zh:a1da03e3239867b35812ee031a1060fed6e8d8e458e2eaca48b5dd51b35f56f7",
"zh:b98b6a6728fe277fcd133bdfa7237bd733eae233f09653523f14460f608f8ba2",
"zh:bb8b071d0437f4767695c6158a3cb70df9f52e377c67019971d888b99147511f",
"zh:dc89ce4b63bfef708ec29c17e85ad0232a1794336dc54dd88c3ba0b77e764f71",
"zh:dd7dd18f1f8218c6cd19592288fde32dccc743cde05b9feeb2883f37c2ff4b4e",
"zh:ec4bd5ab3872dedb39fe528319b4bba609306e12ee90971495f109e142d66310",
"zh:f610ead42f724c82f5463e0e71fa735a11ffb6101880665d93f48b4a67b9ad82",
]
}
@ -52,20 +37,6 @@ provider "registry.terraform.io/goauthentik/authentik" {
constraints = "~> 2024.10"
hashes = [
"h1:roBMd+gi+TGgikH/bMzEI8JfvJiMAQWt+8FmokCrQIs=",
"zh:090260dc7889ea822ec1d899344e1ee23eba5290461989c0796149c9511f2316",
"zh:13c2655ff824b0dc4b9bb832b5ca6d41dba97cb280330258c5fef4115e236209",
"zh:166a73c3a810c9c895d68a8ff968158f339f8a2c1c03e20ec9fc5ed99cc64e20",
"zh:203777eae1cdc711233315499643180604cff2324411b186b7cf07fdbe16f655",
"zh:3b2f18c9a8d28dac74dc6bbf168c946855ab9c68f053578d4630c50d5eaf30a0",
"zh:4822275985f6b74b6196c47112316a4252db22cf4ceaef7c9ab4c66d488abf2f",
"zh:53ea97562666c8a5a2f6d63d418a302a7f8ee4b7bb7da35dedaa89aa5708b7f0",
"zh:56b8a230901e3550c92a1d3f58ee9dafe9853f30fe4315af3ab28ae63262e15d",
"zh:6293ab7b1fd8206a0c853591f50186aca4a1eff117b2a773e10760a23a2c83e9",
"zh:9433970f79fb92d8aae3ee436db5630ab312c78b6dc9df9c1db3273a18f8aaa1",
"zh:95df406214f79b3b98222d7c7fe8fc319a3d90b7a9d53e1d5abbda5dfb8b9436",
"zh:a85880da0552a42c8f449390fbd7d8b03541d1a13e04bba9f1404fa658754260",
"zh:a95f6e9bd62c67e70eba1b1a14728856b9a6a28cd1e5e3be54a7718882c87e7f",
"zh:dd599b51c5beb34a4c6feece244fde07d2558d69929449ab1fd39a5ebe738781",
]
}
@ -93,18 +64,6 @@ provider "registry.terraform.io/hashicorp/kubernetes" {
version = "3.1.0"
hashes = [
"h1:oodIAuFMikXNmEtil5MQgP4dfSctUBYQiGJfjbsF3NY=",
"zh:0215c5c60be62028c09a2f22458e89cda3ef5830a632299f1d401eb3538874b0",
"zh:09ebb9f442431e278a310a9423f32caf467cb4b3cad3fe59573ca71fa7b14e20",
"zh:0c4e5912f83bb35846ae0a9ae54fc320706ee61894cd21cc6b4181b1c5a2fa5c",
"zh:1678c982853ad461e65ccb5e79d585e13ed109dd47dab2a66d3a7a304faeef65",
"zh:1c050a5c15e330457a9c18caacf61a923c59d663e13f2962e4b32f04fef523a0",
"zh:2c55bcec83be58ec132c7cb0a1ac644758b800d794fdc636d53a0eada0358a3a",
"zh:a062bb0aa316c08d8460c66a5d68da71da40de5d3bc3b31abcf3a1a9a19650f1",
"zh:a26fdea0afaa9b247c73c0b42843ca51ba7db0ac2571f9d3d50dcabd20ca1b98",
"zh:c872c9385a78d502bf5823d61cd3bb0f9a0585030e025eb12585c83451beeaa1",
"zh:f180879af931182beee4c8c0d9dab62b81d86f17ddcbe3786ef4c7cec9163a4e",
"zh:f569b65999264a9416862bca5cd2a6177d94ccb0424f3a4ef424428912b9cb3c",
"zh:f70f5789264069e0eef06f9b5d5fde955ef7206f7d446d1ce51a4c37a3f3e02f",
]
}
@ -154,19 +113,5 @@ provider "registry.terraform.io/telmate/proxmox" {
constraints = "3.0.2-rc07"
hashes = [
"h1:zp5hpQJQ4t4zROSLqdltVpBO+Riy9VugtfFbpyTw1aM=",
"zh:2ee860cd0a368b3eaa53f4a9ea46f16dab8a97929e813ea6ef55183f8112c2ca",
"zh:415965fd915bae2040d7f79e45f64d6e3ae61149c10114efeac1b34687d7296c",
"zh:6584b2055df0e32062561c615e3b6b2c291ca8c959440adda09ef3ec1e1436bd",
"zh:65dcfad71928e0a8dd9befc22524ed686be5020b0024dc5cca5184c7420eeb6b",
"zh:7253dc29bd265d33f2791ac4f779c5413f16720bb717de8e6c5fcb2c858648ea",
"zh:7ec8993da10a47606670f9f67cfd10719a7580641d11c7aa761121c4a2bd66fb",
"zh:999a3f7a9dcf517967fc537e6ec930a8172203642fb01b8e1f78f908373db210",
"zh:a50e6df7280eb6584a5fd2456e3f5b6df13b2ec8a7fa4605511e438e1863be42",
"zh:b25b329a1e42681c509d027fee0365414f0cc5062b65690cfc3386aab16132ae",
"zh:c028877fdb438ece48f7bc02b65bbae9ca7b7befbd260e519ccab6c0cbb39f26",
"zh:cf0eaa3ea9fcc6d62793637947f1b8d7c885b6ad74695ab47e134e4ff132190f",
"zh:d5ade3fae031cc629b7c512a7b60e46570f4c41665e88a595d7efd943dde5ab2",
"zh:f388c15ad1ecfc09e7361e3b98bae9b627a3a85f7b908c9f40650969c949901c",
"zh:f415cc6f735a3971faae6ac24034afdb9ee83373ef8de19a9631c187d5adc7db",
]
}

View file

@ -381,59 +381,32 @@ resource "kubernetes_deployment" "wealthfolio" {
total_cost_basis NUMERIC NOT NULL,
currency TEXT
);
-- Drop-in replacement for daily_account_valuation. Net contribution
-- is corrected for two classes of "synthetic" flows that broker-sync
-- emits to make Wealthfolio's bookkeeping balance, but which do NOT
-- represent real user contributions/withdrawals:
--
-- 1. Fidelity pension `unrealised-gains-offset` DEPOSITs emitted
-- to reconcile WF totals with PlanViewer. Otherwise WF treats
-- the gain as a contribution, growth shows £0.
--
-- 2. Schwab RSU `cash-flow-match` DEPOSITs and WITHDRAWALs
-- emitted to pair each vest BUY with a cash DEPOSIT and each
-- sell-to-cover SELL with a cash WITHDRAWAL. The user never
-- transfers cash to Schwab (RSUs are compensation) and the
-- sell proceeds leave the account to bank (counted elsewhere
-- when redeposited to IE/T212). Without correction, Schwab
-- shows huge negative net_contribution because sell proceeds
-- exceed vest cost basis cumulatively.
--
-- Scope: the cash-flow-match filter targets ONLY the Schwab account
-- (account_id below). For InvestEngine / Trading212 the same note
-- pattern marks REAL user deposits, so they must be preserved.
-- Drop-in replacement for daily_account_valuation that subtracts
-- the cumulative pension gains-offset (DEPOSITs emitted by
-- broker-sync Fidelity provider to reconcile WF totals with the
-- PlanViewer reported pot). Wealthfolio's data model treats the
-- offset as a cash contribution, so without this correction
-- net_contribution is inflated by the gain and growth shows £0
-- for the entire pension. The view re-exports the corrected
-- value AS net_contribution so panels can use it as a drop-in
-- replacement for the base table.
CREATE OR REPLACE VIEW dav_corrected AS
WITH synthetic_flows AS (
-- Fidelity pension unrealised-gains-offsets (always DEPOSIT).
SELECT account_id,
activity_date::date AS effective_date,
COALESCE(amount, 0) AS synthetic_net
WITH all_offsets AS (
SELECT account_id, activity_date::date AS effective_date, amount
FROM activities
WHERE notes LIKE 'fidelity-planviewer:unrealised-gains-offset%'
UNION ALL
-- Schwab RSU cash-flow-match (DEPOSIT positive, WITHDRAWAL negative).
SELECT account_id,
activity_date::date AS effective_date,
CASE
WHEN activity_type='DEPOSIT' THEN COALESCE(amount, 0)
WHEN activity_type='WITHDRAWAL' THEN -COALESCE(amount, 0)
ELSE 0
END AS synthetic_net
FROM activities
WHERE notes LIKE 'cash-flow-match:%'
AND account_id = '72d34e09-c1a6-41aa-99ea-abe3305ecc4a' -- Schwab
)
SELECT
d.id, d.account_id, d.valuation_date, d.account_currency,
d.base_currency, d.fx_rate_to_base, d.cash_balance,
d.investment_market_value, d.total_value, d.cost_basis,
d.net_contribution AS net_contribution_raw,
(d.net_contribution - COALESCE(SUM(s.synthetic_net), 0)) AS net_contribution,
COALESCE(SUM(s.synthetic_net), 0) AS synthetic_adjustment
(d.net_contribution - COALESCE(SUM(o.amount), 0)) AS net_contribution,
COALESCE(SUM(o.amount), 0) AS pension_gains_offset
FROM daily_account_valuation d
LEFT JOIN synthetic_flows s
ON s.account_id = d.account_id
AND s.effective_date <= d.valuation_date
LEFT JOIN all_offsets o
ON o.account_id = d.account_id
AND o.effective_date <= d.valuation_date
GROUP BY d.id, d.account_id, d.valuation_date, d.account_currency,
d.base_currency, d.fx_rate_to_base, d.cash_balance,
d.investment_market_value, d.total_value, d.cost_basis,