Commit graph

15 commits

Author SHA1 Message Date
98c4729622 fidelity: replace snapshot-push with delta gains-offset DEPOSITs
Per-fund snapshot import landed quantities but dropped cost basis +
needed a separate quote-push path we never identified. Snapshotting
also collided with WF's own TOTAL aggregation and ZEROED the Fidelity
cash balance.

Simpler plan: each monthly scrape emits a single DEPOSIT (or
WITHDRAWAL on a market drop) sized to the delta between the live
PlanViewer pot value and Wealthfolio's running total. dav_corrected
PG view continues to subtract these offsets from net_contribution so
the dashboard Growth/ROI math stays right.

- New gains_offset_delta_activity() — current_gain - prior_offset.
- New WealthfolioSink.cumulative_amount_with_notes_prefix() — sums
  the existing fidelity-planviewer:unrealised-gains-offset DEPOSITs
  in WF so we know what's already been emitted.
- CLI runs sync_provider_to_wealthfolio first (cash flows), then
  computes + emits the delta via import_activities.
- 4 new provider tests for the delta logic; full suite (144 + 1
  skipped) green; mypy + ruff clean.

The old fidelity_holdings_to_snapshot helper + push_manual_snapshots
sink method stay for future use but are no longer called.
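
The delta rule above can be sketched as follows. This is a minimal illustration, not the repository's gains_offset_delta_activity(): the function name `sketch_gains_offset_delta`, its tuple return shape, and the choice to emit nothing on a zero delta are assumptions.

```python
from decimal import Decimal
from typing import Optional, Tuple


def sketch_gains_offset_delta(
    current_gain: Decimal, prior_offset: Decimal
) -> Optional[Tuple[str, Decimal]]:
    """Return (activity_type, amount) for this scrape, or None if nothing changed.

    current_gain: unrealised gain implied by the live pot value.
    prior_offset: sum of offset activities already emitted (hypothetical input).
    """
    delta = current_gain - prior_offset
    if delta == 0:
        return None  # assumption: emit nothing when the offset is already correct
    # A positive delta is a DEPOSIT; a market drop becomes a WITHDRAWAL.
    kind = "DEPOSIT" if delta > 0 else "WITHDRAWAL"
    return kind, abs(delta)
```

Emitting only the difference keeps the sum of all offset activities equal to the latest gain, so re-running a scrape with an unchanged pot value should be a no-op.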
2026-05-17 00:35:17 +00:00
cb159e17d9 fidelity: push per-fund manual snapshot instead of gains-offset DEPOSIT
PlanViewer's DisplayValuation.action JSON already gives us current
fund units + unit price; we were parsing it and throwing it away,
emitting only a single 'unrealised-gains-offset' DEPOSIT to make
Wealthfolio's totals match the dashboard. That hack double-counted
the gain as a cash contribution, hiding £35k of pension growth from
every contribution/growth/ROI panel.

New flow:
- FidelityPlanViewerProvider exposes last_holdings + last_total_contribution
  after fetch() drains.
- fidelity-ingest CLI converts to a ManualSnapshotPayload (cost basis
  allocated proportionally by current fund value share) and posts to
  WF /api/v1/snapshots/import. WF auto-creates unknown fund symbols
  with kind=INVESTMENT, quoteMode=MANUAL, quoteCcy=GBP.
- The gains-offset emission is removed entirely. Historical offset
  rows already in WF are corrected at the dashboard layer by the
  dav_corrected view shipped in infra@2841347e.

WealthfolioSink gains push_manual_snapshots() + ManualSnapshotPayload /
SnapshotPosition wire types. 11 sink tests (3 new) + 9 fidelity
provider tests (2 changed, 1 new) all green; mypy + ruff clean.
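
The proportional cost-basis allocation described above could look roughly like this. It is a sketch under assumptions: `allocate_cost_basis` and its dict-based inputs are hypothetical and do not reflect the real ManualSnapshotPayload shape.

```python
from decimal import Decimal
from typing import Dict


def allocate_cost_basis(
    fund_values: Dict[str, Decimal], total_contribution: Decimal
) -> Dict[str, Decimal]:
    """Split total_contribution across funds by each fund's share of current value."""
    total_value = sum(fund_values.values(), Decimal(0))
    if total_value == 0:
        # Assumption: with no value to apportion by, every fund gets zero basis.
        return {symbol: Decimal(0) for symbol in fund_values}
    return {
        symbol: total_contribution * value / total_value
        for symbol, value in fund_values.items()
    }
```

For example, a fund holding 75% of the pot's current value receives 75% of the total contribution as its cost basis.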
2026-05-16 13:56:25 +00:00
Viktor Barzin
c830856ba1 imap: route IE BUYs to ISA first-£20k / GIA overflow per UK tax year
## Context

Viktor's InvestEngine account has both an ISA and a GIA wrapper. Trade
confirmation emails (info@investengine.com) are identical between them —
subject "Here's how your portfolio looks now", body shows "Client name:
Viktor Barzin" with no portfolio/account type. That left the IMAP parser
hardcoded to route every IE BUY to the ISA (invest-engine-primary),
which produced a 2339-share over-count when 2023-24 GIA buys landed in
the ISA during the 2026-04-18 reconciliation.

Viktor's rule: from 6 April each tax year, BUYs fill ISA up to the
£20,000 cap, then overflow to GIA. This commit codifies that rule in a
standalone batch splitter and applies it at the ImapProvider boundary.

Also picks up a silent-drop bug surfaced during the same reconciliation:
WF's /import (unlike /import/check) rejects naive datetimes with
"Invalid date". The sink now coerces tzinfo=UTC defensively so every
provider gets the same guarantee.
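
A minimal sketch of the defensive coercion, assuming naive datetimes are meant as UTC. The helper name `coerce_utc` is hypothetical; the real logic lives in the sink's row conversion.

```python
from datetime import datetime, timezone


def coerce_utc(dt: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave timezone-aware values untouched."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt
```

With this, `datetime(2022, 5, 24)` serialises via `isoformat()` with an explicit `+00:00` offset instead of the bare form that /import rejected.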

## This change

- `_split_ie_by_isa_cap(activities)` — sorts all IE-ISA BUYs by date and
  walks them once per UK tax year (6 April boundary). A BUY whose running
  tax-year total BEFORE it is strictly below £20k stays on the ISA;
  otherwise it flips to a new `invest-engine-gia` account_id. No
  fractional splits — boundary activities go whole to whichever bucket
  their pre-running-total dictates. Non-IE and non-BUY activities pass
  through unchanged.
- `ImapProvider.accounts()` gains an `invest-engine-gia` Account so the
  pipeline's `_ensure_accounts` can resolve both.
- `ImapProvider.fetch()` calls the splitter on the full batch before
  applying the `since`/`before` date filter — batch-level sort
  guarantees consistent routing regardless of the order IMAP returns
  messages.
- `WealthfolioSink._activity_to_import_row` coerces naive datetimes to
  UTC so the row passes WF /import validation.
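
The routing rule in the first bullet can be sketched as below. This is an illustration rather than the actual `_split_ie_by_isa_cap`: the `Buy` dataclass, its field names, and `split_by_isa_cap` are assumptions; only the two account ids and the £20,000 cap come from this commit.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal
from typing import Dict, Iterable, List

ISA_CAP = Decimal("20000")


@dataclass(frozen=True)
class Buy:  # hypothetical stand-in for the project's Activity type
    on: date
    amount_gbp: Decimal
    account_id: str = "invest-engine-primary"


def uk_tax_year_start(d: date) -> date:
    """Return the 6 April that opens the UK tax year containing d."""
    start = date(d.year, 4, 6)
    return start if d >= start else date(d.year - 1, 4, 6)


def split_by_isa_cap(buys: Iterable[Buy]) -> List[Buy]:
    """Route each BUY whole to ISA or GIA based on the pre-BUY tax-year total."""
    running: Dict[date, Decimal] = {}
    routed: List[Buy] = []
    for buy in sorted(buys, key=lambda b: b.on):
        year = uk_tax_year_start(buy.on)
        before = running.get(year, Decimal(0))
        # Strictly below the cap before this BUY: it stays on the ISA, unsplit.
        account = "invest-engine-primary" if before < ISA_CAP else "invest-engine-gia"
        routed.append(replace(buy, account_id=account))
        running[year] = before + buy.amount_gbp
    return routed
```

Because boundary activities are never split, a BUY that starts below the cap can take the ISA total above £20,000; the sketch mirrors that behaviour rather than correcting it.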

## What is NOT in this change

- No retroactive re-routing of data already in WF. Historical
  finance-mysql rows (all lumped to `invest-engine-primary` or
  `invest-engine-gia` by the existing heuristic) keep their current
  account assignment. If a past tax-year was routed "wrong" under the
  new rule, that's corrected manually via the WF API, not here.
- No change to the Schwab or trading212 paths.

## Verification

### Automated

```
$ poetry run pytest tests/providers/test_imap.py -v
tests/providers/test_imap.py::test_uk_tax_year_start_before_april_6_rolls_back PASSED
tests/providers/test_imap.py::test_single_tax_year_under_cap_stays_isa PASSED
tests/providers/test_imap.py::test_overflow_past_cap_flips_to_gia PASSED
tests/providers/test_imap.py::test_tax_year_boundary_resets_cap PASSED
tests/providers/test_imap.py::test_out_of_order_activities_sorted_before_cap_applied PASSED
tests/providers/test_imap.py::test_non_ie_activities_passed_through_unchanged PASSED
6 passed in 0.36s

$ poetry run pytest -q --ignore=tests/test_cli.py
116 passed, 1 skipped in 2.76s

$ poetry run ruff check broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
All checks passed!

$ poetry run mypy broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
Success: no issues found in 2 source files
```

### Manual verification

The tzinfo fix was validated against the live WF instance during the
2026-04-18 reconciliation. Before the fix, /import returned
`"errors": {"symbol": ["Invalid date '2022-05-24T00:00:00'."]}` for
every IMAP activity; after, the same payload imported cleanly.

The splitter was not exercised against live IMAP data because Viktor's
mailbox only has Apr 2022 → Feb 2024 emails, all inside finance.position's
existing coverage. Running IMAP ingest with `since=2024-04-06` yields
fetched=0. The unit tests cover the boundary arithmetic; a live run
will happen when newer emails are parsed (or when finance coverage is
re-scoped).

## Reproduce locally

1. `poetry install`
2. `poetry run pytest tests/providers/test_imap.py`
3. Expected: 6 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:02:49 +00:00
Viktor Barzin
a190875f63 Add finance_mysql provider + CLI for historical backfill
finance.position (171 rows, 2020-06-07 to 2025-12-19) is the only source
of InvestEngine + Schwab trade history pre-dating the broker-sync project.
This provider reads it once and pushes every row into the correct WF
account (.L tickers → IE ISA, others → Schwab).

Dedup: external_id = 'finance-mysql:position:<PK>' — idempotent on re-run.
Auth: aiomysql as MySQL root (user-authorized) against the standalone
mysql:8.4 in-cluster service.
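
The routing and dedup-key rules above can be sketched like this. The helper names are hypothetical, and the Schwab account id `"schwab"` is a placeholder assumption; `invest-engine-primary` and the `finance-mysql:position:<PK>` format come from the commit messages.

```python
def route_account(symbol: str) -> str:
    """London-listed tickers (.L suffix) go to the IE ISA; everything else to Schwab."""
    if symbol.upper().endswith(".L"):
        return "invest-engine-primary"
    return "schwab"  # placeholder id, not confirmed by the source


def external_id(position_pk: int) -> str:
    """Stable dedup key derived from the row's primary key, so re-runs are idempotent."""
    return f"finance-mysql:position:{position_pk}"
```

Keying on the primary key rather than on trade fields means a corrected row keeps the same identity on re-import.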

New CLI: broker-sync finance-mysql-import
New tests: 5 unit tests covering route, symbol normalise, BUY/SELL
detection.

poetry run pytest -q   →  114 passed, 1 skipped
poetry run mypy        →  clean (aiomysql shielded with type: ignore)
poetry run ruff check  →  clean
2026-04-17 22:38:21 +00:00
Viktor Barzin
74b2179c83 sinks: read summary.imported as truth for partial-persist detection
The /import response returns activities=[input echo with errors annotated]
— its length equals input size regardless of actual persistence. The
summary{total,imported,skipped,duplicates} block is the authoritative
signal. When imported<total, raise with errorMessage + skipped + duplicates
counts.
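
A minimal sketch of the check, assuming the response is already parsed into a dict and that `errorMessage` sits inside the summary block (the source does not say where it lives). `ImportValidationError` here is a local stand-in, and `check_import_summary` is a hypothetical name.

```python
class ImportValidationError(Exception):
    """Local stand-in for the sink's error type."""


def check_import_summary(response: dict) -> int:
    """Return the imported count, raising if fewer rows persisted than were sent."""
    summary = response.get("summary") or {}
    total = int(summary.get("total", 0))
    imported = int(summary.get("imported", 0))
    if imported < total:
        raise ImportValidationError(
            f"imported {imported}/{total}; "
            f"skipped={summary.get('skipped', 0)} "
            f"duplicates={summary.get('duplicates', 0)} "
            f"error={summary.get('errorMessage')!r}"
        )
    return imported
```

The length of the echoed `activities` list is deliberately ignored, since it matches the input size whether or not rows were persisted.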
2026-04-17 22:30:24 +00:00
Viktor Barzin
4e2da87637 sinks: detect silent Wealthfolio /import drops
After the check step returns isValid=true + no errors, a row can still
be silently dropped by /import (response returns activities=[] on
200 OK). The root cause is usually a field that /import/check hydrates but
/import re-normalises differently (date string form, asset_id resolution).

When we send N valid rows and get back 0, raise ImportValidationError
with a snippet of the check output + first warning — gives the
operator a concrete hint to fix the producer instead of silently
growing dedup against activities that never landed.

poetry run pytest -q   →  109 passed, 1 skipped
poetry run mypy        →  clean
poetry run ruff check  →  clean
2026-04-17 22:24:36 +00:00
Viktor Barzin
6efd03570a Add imap-ingest CLI + ImapProvider: route emails to IE/Schwab parsers
Wires the IE + Schwab email parsers into an actual runnable sync. Walks
the IMAP mailbox, routes each message by sender domain:
  - *@investengine.com → invest_engine.parse_invest_engine_email
  - *@schwab.com       → schwab.parse_schwab_email
then pushes the resulting Activities through the shared pipeline.
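
Sender-domain routing of this kind can be sketched with the standard library. The function `route_sender` and the parser labels it returns are assumptions for illustration; the real provider dispatches to the parser functions named above.

```python
from email.utils import parseaddr
from typing import Optional

# Maps an exact sender domain to a parser label (labels are hypothetical).
PARSER_BY_DOMAIN = {
    "investengine.com": "invest_engine",
    "schwab.com": "schwab",
}


def route_sender(from_header: str) -> Optional[str]:
    """Pick a parser label from a From header, or None for unknown senders."""
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    return PARSER_BY_DOMAIN.get(domain)
```

This sketch matches the domain exactly, so a subdomain sender would fall through to None; whether the real router accepts subdomains is not stated in the commit.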

broker-sync imap-ingest — new CLI command taking IMAP_HOST/USER/PASSWORD/
DIRECTORY (mirrors the old wealthfolio-sync image's env shape so the
Terraform CronJob's existing env wiring works unchanged).

Verified: poetry run pytest -q → 109 passed + 1 skipped; mypy strict
clean (37 files); ruff + yapf clean.
2026-04-17 22:12:05 +00:00
Viktor Barzin
b363032e42 sinks: feed /import/check enrichment into /import body
/import/check hydrates each ActivityImport with resolved assetId,
exchangeMic, quoteCcy, instrumentType, quoteMode. The /import endpoint
on Wealthfolio 3.2 does NOT re-resolve — passing an un-enriched row
returns 200 OK but silently drops the activity (activities=[] in the
response).

The first live run returned `imported=63 failed=0` but nothing reached
the database. Fixed by posting the hydrated rows from the check response
to /import instead of the original.

The tests' mocked check responses are now list-shaped as well, matching
the upstream Json<Vec<ActivityImport>> signature on the Rust side.

poetry run pytest -q     70 passed
poetry run mypy          clean
poetry run ruff check    clean
2026-04-17 20:54:17 +00:00
Viktor Barzin
80ca009373 Match Wealthfolio accounts by providerAccountId, remap accountId on import
Context: Wealthfolio 3.2 generates its own UUIDs on POST /accounts, ignoring any
`id` we supply. Our logical Account.id lives on as `providerAccountId`, which
WF preserves verbatim.

Live run created six duplicate accounts because ensure_account looked up by
our `id`, never found it, and POSTed a new account on every attempt. Deleted
the duplicates manually via DELETE /accounts/{id}.

This change:
- ensure_account now returns Wealthfolio's UUID; matches existing via
  (provider, providerAccountId)
- pipeline remaps activity.account_id to the WF UUID at submission time
  but keeps dedup keyed on our stable id (WF resets must not blow away
  the whole dedup history)
- test updates to the new account-shape + dedup key expectations
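
The lookup in the first bullet can be sketched as a pure function over the account list. The `provider` key name and the function `find_wf_account_id` are assumptions; `providerAccountId` and the WF-generated `id` come from the context above.

```python
from typing import Iterable, Mapping, Optional


def find_wf_account_id(
    accounts: Iterable[Mapping[str, object]],
    provider: str,
    provider_account_id: str,
) -> Optional[str]:
    """Return Wealthfolio's UUID for our logical account, or None if not created yet."""
    for account in accounts:
        if (
            account.get("provider") == provider
            and account.get("providerAccountId") == provider_account_id
        ):
            return str(account["id"])
    return None
```

A None result is the only case in which a new account should be POSTed, which avoids the duplicate-account loop described above.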

poetry run pytest -q    70 passed
poetry run mypy         clean
poetry run ruff check   clean
2026-04-17 20:44:32 +00:00
Viktor Barzin
ba672a1633 sinks: add required isDraft/isValid fields on ActivityImport
2026-04-17 20:37:38 +00:00
Viktor Barzin
1d23bf6ed7 sinks: switch Wealthfolio import to JSON body (not multipart CSV)
/api/v1/activities/import expects Content-Type: application/json with body
{"activities": [ActivityImport, ...]} where ActivityImport is camelCase:
{date, symbol, activityType, quantity?, unitPrice?, currency, fee?, amount?,
comment?, accountId?, ...}. Source: crates/core/src/activities/activities_model.rs

Live run failure was HTTP 415 Unsupported Media Type because we were uploading
a CSV in multipart/form-data; that endpoint is JSON-only.

Also handle two response shapes on import — older builds return a list, current
build returns {activities: [...]}.
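
Handling both response shapes could look like the sketch below. The helper name `extract_activities` is hypothetical, and raising TypeError on any other shape is an assumption.

```python
from typing import Any, List


def extract_activities(body: Any) -> List[Any]:
    """Normalise an /import response to a plain list of activities."""
    if isinstance(body, list):
        return body  # older builds: bare list
    if isinstance(body, dict):
        return list(body.get("activities", []))  # current build: wrapped object
    raise TypeError(f"unexpected import response shape: {type(body).__name__}")
```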

Test plan:
  poetry run pytest -q   →  70 passed
  poetry run mypy        →  clean
  poetry run ruff check  →  clean
2026-04-17 20:34:12 +00:00
Viktor Barzin
ea881e272b sinks: match Wealthfolio NewAccount camelCase schema + required booleans
Wealthfolio 3.2's POST /api/v1/accounts was 422ing on live traffic — its
NewAccount struct uses camelCase field names and requires isDefault +
isActive as booleans. Reference:
https://github.com/afadil/wealthfolio/blob/main/apps/server/src/models.rs#L~145

Sends trackingMode=TRANSACTIONS so Wealthfolio computes holdings from
our imported activities (vs HOLDINGS mode which requires periodic
holdings snapshots). Populates providerAccountId so the broker account
is traceable back to our sync's id scheme.

Test plan:
  poetry run pytest -q   →  70 passed
  poetry run mypy        →  clean
  poetry run ruff check  →  clean

Live re-run of the backfill Job follows this commit's image rebuild.
2026-04-17 20:29:43 +00:00
Viktor Barzin
66cf0e0399 Fix live Wealthfolio login + Dockerfile poetry path
Context
-------
Two live-integration bugs surfaced during the Phase 0.5 auth-spike
run against the restored production Wealthfolio.

1. Wealthfolio 3.2's LoginRequest schema is `{ password: String }` —
   it rejects any request with an unknown `username` field as HTTP
   400 (empty body, hard to debug). Upstream source:
   https://github.com/afadil/wealthfolio/blob/main/apps/server/src/auth.rs#L86-L88

2. Dockerfile referenced `/opt/poetry/bin/poetry` but pip install
   puts poetry on the normal PATH; POETRY_HOME only affects the
   self-installer, not `pip install`. Exit 127 in GHA build.

This change
-----------
- WealthfolioSink.login() sends `{password}` only; kept `username`
  constructor arg as a stub for the day Wealthfolio adds multi-user.
- Dockerfile drops POETRY_HOME and uses `poetry` on PATH.
- Test: `_login_ok` now asserts body == {"password": "hunter2"}
  ("hunter2" is the XKCD placeholder — not a real credential).

Test plan
---------
## Automated
- poetry run pytest -q  →  70 passed
- poetry run mypy broker_sync tests  →  Success: no issues found in 29 source files
- poetry run ruff check .  →  All checks passed!

## Manual Verification (executed live)
```
kubectl -n wealthfolio port-forward svc/wealthfolio 18080:80 &
WF_BASE_URL=http://localhost:18080 WF_USERNAME=admin \
WF_PASSWORD=<from-vault> \
poetry run broker-sync auth-spike
→ "Logged in. 1 account(s) visible."
```
2026-04-17 20:17:24 +00:00
Viktor Barzin
e7da408a85 Add WealthfolioSink with CSV import + cookie reuse
Context
-------
This is the Phase 0.5 deliverable — the hardest-to-validate unknown
in the plan. Wealthfolio auth is JWT HttpOnly cookie with a 5-req/min
login rate limit. CronJob pods are ephemeral, so we persist cookies
to disk between runs (shared PVC in production).

Plan stress-test also flagged: use the CSV import path, not per-row
JSON POST. Wealthfolio's UI uses /activities/import and its dedup
logic is battle-tested; CSVs double as audit artefacts we can replay.

This change
-----------
- WealthfolioSink (httpx async): login with username/password, persists
  cookie dict to session_path on disk, attaches it as a Cookie header
  on subsequent calls.
- 401 on a non-login endpoint triggers a single re-login + retry.
- ensure_account() is idempotent — GETs the account list first, only
  POSTs /accounts if id is missing.
- import_activities() always runs /activities/import/check first; any
  non-2xx there raises ImportValidationError and we never touch the
  real import endpoint. Protects against half-written state when the
  broker emits a symbol Wealthfolio doesn't know.
- httpx.MockTransport-based tests cover: login persistence, 401 on
  login raises UnauthorizedError, session reuse from disk, 401 retry
  path, ensure_account idempotency + creation, import dry-run-then-real
  sequencing, halt on check failure.

Not yet covered (deferred):
- Multi-process file lock on session_path (single-process enough for
  now; Phase 1 adds it when multiple CronJobs run concurrently).
- 429 jittered backoff (TBD when Wealthfolio actually rate-limits us).

Test plan
---------
## Automated
- poetry run pytest -q  →  31 passed
- poetry run mypy broker_sync tests  →  Success: no issues found in 17 source files
- poetry run ruff check .  →  All checks passed!

## Manual Verification
Live auth spike against https://wealthfolio.viktorbarzin.me deferred
until the password is seeded into Vault at secret/broker-sync/wealthfolio
in a follow-up commit (needs Viktor's Vault session).
2026-04-17 19:22:34 +00:00
Viktor Barzin
f306dc9605 Add Provider protocol and normaliser
Context
-------
Every broker connector needs a uniform shape so the orchestrator can
fan out without knowing provider-specific details. Normalisation (GBP
conversion) lives outside providers on purpose — keeping providers
native-currency-emitters means we can re-normalise historical activity
when HMRC rates land without re-fetching from the broker.

This change
-----------
- providers/base.py: Provider Protocol with `accounts()` and async
  `fetch(since, before)` iterator. No abstract base class — duck-typed
  Protocol so each concrete provider stays independent.
- normaliser.py: takes a native Activity + FxCache, returns a copy
  with amount_gbp/fx_rate_gbp/fx_rate_source filled in. Two modes:
  qty*price for BUY/SELL, amount for DIVIDEND/DEPOSIT/etc.
- Namespace packages for providers/, providers/parsers/, sinks/ so
  future modules slot in cleanly.
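
The normaliser's two modes can be sketched as below. This is an illustration under assumptions: the function names and flat arguments are hypothetical (the real normaliser takes an Activity and an FxCache), and the rate is assumed to be quoted as GBP per unit of native currency.

```python
from decimal import Decimal
from typing import Optional

TRADE_TYPES = {"BUY", "SELL"}


def native_amount(
    activity_type: str,
    quantity: Optional[Decimal],
    unit_price: Optional[Decimal],
    amount: Optional[Decimal],
) -> Decimal:
    """Trades are valued as qty * price; everything else uses the stated amount."""
    if activity_type in TRADE_TYPES:
        if quantity is None or unit_price is None:
            raise ValueError("BUY/SELL needs quantity and unit_price")
        return quantity * unit_price
    if amount is None:
        raise ValueError(f"{activity_type} needs amount")
    return amount


def amount_gbp(native: Decimal, fx_rate_gbp: Decimal) -> Decimal:
    """Convert a native-currency amount to GBP (rate direction is an assumption)."""
    return native * fx_rate_gbp
```

Keeping the conversion separate from the native amount is what allows historical activity to be re-normalised later without re-fetching from the broker.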

Test plan
---------
## Automated
- poetry run pytest -q  →  23 passed
- poetry run mypy broker_sync tests  →  Success: no issues found in 14 source files
- poetry run ruff check .  →  All checks passed!

## Manual Verification
Not applicable at this layer.
2026-04-17 19:20:12 +00:00