broker-sync

Author	SHA1	Message	Date
Viktor Barzin	4e2da87637	sinks: detect silent Wealthfolio /import drops After the check step returns isValid=true + no errors, a row can still be silently dropped by /import (response returns activities=[] on 200 OK). Root-cause is usually a field that check hydrates but /import re-normalises differently (date string form, asset_id resolution). When we send N valid rows and get back 0, raise ImportValidationError with a snippet of the check output + first warning — gives the operator a concrete hint to fix the producer instead of silently growing dedup against activities that never landed. poetry run pytest -q → 109 passed, 1 skipped poetry run mypy → clean poetry run ruff check → clean	2026-04-17 22:24:36 +00:00
Viktor Barzin	6efd03570a	Add imap-ingest CLI + ImapProvider: route emails to IE/Schwab parsers Wires the IE + Schwab email parsers into an actual runnable sync. Walks the IMAP mailbox, routes each message by sender domain: - @investengine.com → invest_engine.parse_invest_engine_email - @schwab.com → schwab.parse_schwab_email then pushes the resulting Activities through the shared pipeline. broker-sync imap-ingest — new CLI command taking IMAP_HOST/USER/PASSWORD/ DIRECTORY (mirrors the old wealthfolio-sync image's env shape so the Terraform CronJob's existing env wiring works unchanged). Verified: poetry run pytest -q → 109 passed + 1 skipped; mypy strict clean (37 files); ruff + yapf clean.	2026-04-17 22:12:05 +00:00
Viktor Barzin	b363032e42	sinks: feed /import/check enrichment into /import body /import/check hydrates each ActivityImport with resolved assetId, exchangeMic, quoteCcy, instrumentType, quoteMode. The /import endpoint on Wealthfolio 3.2 does NOT re-resolve — passing an un-enriched row returns 200 OK but silently drops the activity (activities=[] in the response). The first live run returned `imported=63 failed=0` but nothing reached the database. Fixed by posting the hydrated rows from the check response to /import instead of the original. Requires the test to also return list-shaped check responses (matches the upstream Json<Vec<ActivityImport>> signature on the Rust side). poetry run pytest -q 70 passed poetry run mypy clean poetry run ruff check clean	2026-04-17 20:54:17 +00:00
Viktor Barzin	80ca009373	Match Wealthfolio accounts by providerAccountId, remap accountId on import Context: Wealthfolio 3.2 generates its own UUIDs on POST /accounts, ignoring any `id` we supply. Our logical Account.id lives on as `providerAccountId`, which WF preserves verbatim. Live run created six duplicate accounts because ensure_account looked up by our `id`, never found it, and POSTed a new account on every attempt. Deleted the duplicates manually via DELETE /accounts/{id}. This change: - ensure_account now returns Wealthfolio's UUID; matches existing via (provider, providerAccountId) - pipeline remaps activity.account_id to the WF UUID at submission time but keeps dedup keyed on our stable id (WF resets must not blow away the whole dedup history) - test updates to the new account-shape + dedup key expectations poetry run pytest -q 70 passed poetry run mypy clean poetry run ruff check clean	2026-04-17 20:44:32 +00:00
Viktor Barzin	ba672a1633	sinks: add required isDraft/isValid fields on ActivityImport	2026-04-17 20:37:38 +00:00
Viktor Barzin	1d23bf6ed7	sinks: switch Wealthfolio import to JSON body (not multipart CSV) /api/v1/activities/import expects Content-Type: application/json with body {"activities": [ActivityImport, ...]} where ActivityImport is camelCase: {date, symbol, activityType, quantity?, unitPrice?, currency, fee?, amount?, comment?, accountId?, ...}. Source: crates/core/src/activities/activities_model.rs Live run failure was HTTP 415 Unsupported Media Type because we were uploading a CSV in multipart/form-data; that endpoint is JSON-only. Also handle two response shapes on import — older builds return a list, current build returns {activities: [...]}. Test plan: poetry run pytest -q → 70 passed poetry run mypy → clean poetry run ruff check → clean	2026-04-17 20:34:12 +00:00
Viktor Barzin	ea881e272b	sinks: match Wealthfolio NewAccount camelCase schema + required booleans Wealthfolio 3.2's POST /api/v1/accounts was 422ing on live traffic — its NewAccount struct uses camelCase field names and requires isDefault + isActive as booleans. Reference: https://github.com/afadil/wealthfolio/blob/main/apps/server/src/models.rs#L~145 Sends trackingMode=TRANSACTIONS so Wealthfolio computes holdings from our imported activities (vs HOLDINGS mode which requires periodic holdings snapshots). Populates providerAccountId so the broker account is traceable back to our sync's id scheme. Test plan: poetry run pytest -q → 70 passed poetry run mypy → clean poetry run ruff check → clean Live re-run of the backfill Job follows this commit's image rebuild.	2026-04-17 20:29:43 +00:00
Viktor Barzin	66cf0e0399	Fix live Wealthfolio login + Dockerfile poetry path Context ------- Two live-integration bugs surfaced during the Phase 0.5 auth-spike run against the restored production Wealthfolio. 1. Wealthfolio 3.2's LoginRequest schema is `{ password: String }` — it rejects any request with an unknown `username` field as HTTP 400 (empty body, hard to debug). Upstream source: https://github.com/afadil/wealthfolio/blob/main/apps/server/src/auth.rs#L86-L88 2. Dockerfile referenced `/opt/poetry/bin/poetry` but pip install puts poetry on the normal PATH; POETRY_HOME only affects the self-installer, not `pip install`. Exit 127 in GHA build. This change ----------- - WealthfolioSink.login() sends `{password}` only; kept `username` constructor arg as a stub for the day Wealthfolio adds multi-user. - Dockerfile drops POETRY_HOME and uses `poetry` on PATH. - Test: `_login_ok` now asserts body == {"password": "hunter2"} ("hunter2" is the XKCD placeholder — not a real credential). Test plan --------- ## Automated - poetry run pytest -q → 70 passed - poetry run mypy broker_sync tests → Success: no issues found in 29 source files - poetry run ruff check . → All checks passed! ## Manual Verification (executed live) ``` kubectl -n wealthfolio port-forward svc/wealthfolio 18080:80 & WF_BASE_URL=http://localhost:18080 WF_USERNAME=admin \ WF_PASSWORD=<from-vault> \ poetry run broker-sync auth-spike → "Logged in. 1 account(s) visible." ```	2026-04-17 20:17:24 +00:00
Viktor Barzin	e7da408a85	Add WealthfolioSink with CSV import + cookie reuse Context ------- This is the Phase 0.5 deliverable — the hardest-to-validate unknown in the plan. Wealthfolio auth is JWT HttpOnly cookie with a 5-req/min login rate limit. CronJob pods are ephemeral, so we persist cookies to disk between runs (shared PVC in production). Plan stress-test also flagged: use the CSV import path, not per-row JSON POST. Wealthfolio's UI uses /activities/import and its dedup logic is battle-tested; CSVs double as audit artefacts we can replay. This change ----------- - WealthfolioSink (httpx async): login with username/password, persists cookie dict to session_path on disk, attaches it as a Cookie header on subsequent calls. - 401 on a non-login endpoint triggers a single re-login + retry. - ensure_account() is idempotent — GETs the account list first, only POSTs /accounts if id is missing. - import_activities() always runs /activities/import/check first; any non-2xx there raises ImportValidationError and we never touch the real import endpoint. Protects against half-written state when the broker emits a symbol Wealthfolio doesn't know. - httpx.MockTransport-based tests cover: login persistence, 401 on login raises UnauthorizedError, session reuse from disk, 401 retry path, ensure_account idempotency + creation, import dry-run-then-real sequencing, halt on check failure. Not yet covered (deferred): - Multi-process file lock on session_path (single-process enough for now; Phase 1 adds it when multiple CronJobs run concurrently). - 429 jittered backoff (TBD when Wealthfolio actually rate-limits us). Test plan --------- ## Automated - poetry run pytest -q → 31 passed - poetry run mypy broker_sync tests → Success: no issues found in 17 source files - poetry run ruff check . → All checks passed! ## Manual Verification Live auth spike against https://wealthfolio.viktorbarzin.me deferred until the password is seeded into Vault at secret/broker-sync/wealthfolio in a follow-up commit (needs Viktor's Vault session).	2026-04-17 19:22:34 +00:00
Viktor Barzin	f306dc9605	Add Provider protocol and normaliser Context ------- Every broker connector needs a uniform shape so the orchestrator can fan out without knowing provider-specific details. Normalisation (GBP conversion) lives outside providers on purpose — keeping providers native-currency-emitters means we can re-normalise historical activity when HMRC rates land without re-fetching from the broker. This change ----------- - providers/base.py: Provider Protocol with `accounts()` and async `fetch(since, before)` iterator. No abstract base class — duck-typed Protocol so each concrete provider stays independent. - normaliser.py: takes a native Activity + FxCache, returns a copy with amount_gbp/fx_rate_gbp/fx_rate_source filled in. Two modes: qty*price for BUY/SELL, amount for DIVIDEND/DEPOSIT/etc. - Namespace packages for providers/, providers/parsers/, sinks/ so future modules slot in cleanly. Test plan --------- ## Automated - poetry run pytest -q → 23 passed - poetry run mypy broker_sync tests → Success: no issues found in 14 source files - poetry run ruff check . → All checks passed! ## Manual Verification Not applicable at this layer.	2026-04-17 19:20:12 +00:00
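
The silent-drop guard described in commit 4e2da87637, together with the two /import response shapes mentioned in 1d23bf6ed7, might look roughly like the sketch below. It is not the repository's code: the function names `extract_activities` and `ensure_rows_landed` are hypothetical, `ImportValidationError` is redefined here as a stand-in for the class the commits name, and the error message wording is an assumption.

```python
from typing import Any


class ImportValidationError(Exception):
    """Stand-in for the error class named in the commit messages."""


def extract_activities(response_body: Any) -> list[dict[str, Any]]:
    """Accept both response shapes: a bare list, or {"activities": [...]}."""
    if isinstance(response_body, list):
        return response_body
    if isinstance(response_body, dict):
        activities = response_body.get("activities", [])
        if isinstance(activities, list):
            return activities
    raise ImportValidationError(
        f"unexpected /import response shape: {type(response_body).__name__}"
    )


def ensure_rows_landed(
    sent: list[dict[str, Any]],
    response_body: Any,
    check_output: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Raise if valid rows were sent but the 200 OK response carried none back."""
    imported = extract_activities(response_body)
    if sent and not imported:
        # Include a short snippet of the check output as a hint for the operator.
        snippet = repr(check_output[:1])[:200]
        raise ImportValidationError(
            f"sent {len(sent)} valid rows, /import returned 0; "
            f"check output starts: {snippet}"
        )
    return imported
```

Raising here, rather than recording the rows as imported, keeps the dedup history from growing against activities that never reached the database.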

10 commits
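
The sender-domain routing described in commit 6efd03570a could be sketched as follows. This is an illustration only: `ROUTES` and `route_for_sender` are hypothetical names, and the string labels stand in for the parser functions the commit mentions (`invest_engine.parse_invest_engine_email` and `schwab.parse_schwab_email`), whose signatures are not shown in the log.

```python
from email.utils import parseaddr
from typing import Optional

# Hypothetical routing table: sender domain -> parser label.
ROUTES = {
    "investengine.com": "invest_engine",
    "schwab.com": "schwab",
}


def route_for_sender(from_header: str) -> Optional[str]:
    """Return the parser label for a From header, or None if no route matches."""
    # parseaddr splits "Display Name <user@host>" into (name, address).
    _, address = parseaddr(from_header)
    # Everything after the last "@" is the domain; empty if there is no address.
    domain = address.rpartition("@")[2].lower()
    return ROUTES.get(domain)
```

The lookup is an exact domain match, so a subdomain sender would not be routed; whether the real provider treats subdomains differently is not stated in the commit.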