After the check step returns isValid=true + no errors, a row can still
be silently dropped by /import (response returns activities=[] on
200 OK). Root-cause is usually a field that check hydrates but
/import re-normalises differently (date string form, asset_id resolution).
When we send N valid rows and get back 0, raise ImportValidationError
with a snippet of the check output + first warning — gives the
operator a concrete hint to fix the producer instead of silently
growing dedup against activities that never landed.
poetry run pytest -q → 109 passed, 1 skipped
poetry run mypy → clean
poetry run ruff check → clean
Wires the IE + Schwab email parsers into an actual runnable sync. Walks
the IMAP mailbox, routes each message by sender domain:
- *@investengine.com → invest_engine.parse_invest_engine_email
- *@schwab.com → schwab.parse_schwab_email
then pushes the resulting Activities through the shared pipeline.
broker-sync imap-ingest — new CLI command taking IMAP_HOST/USER/PASSWORD/
DIRECTORY (mirrors the old wealthfolio-sync image's env shape so the
Terraform CronJob's existing env wiring works unchanged).
Verified: poetry run pytest -q → 109 passed + 1 skipped; mypy strict
clean (37 files); ruff + yapf clean.
/import/check hydrates each ActivityImport with resolved assetId,
exchangeMic, quoteCcy, instrumentType, quoteMode. The /import endpoint
on Wealthfolio 3.2 does NOT re-resolve — passing an un-enriched row
returns 200 OK but silently drops the activity (activities=[] in the
response).
The first live run returned `imported=63 failed=0` but nothing reached
the database. Fixed by posting the hydrated rows from the check response
to /import instead of the original.
Requires the test to also return list-shaped check responses (matches
the upstream Json<Vec<ActivityImport>> signature on the Rust side).
poetry run pytest -q 70 passed
poetry run mypy clean
poetry run ruff check clean
Context: Wealthfolio 3.2 generates its own UUIDs on POST /accounts, ignoring any
`id` we supply. Our logical Account.id lives on as `providerAccountId`, which
WF preserves verbatim.
Live run created six duplicate accounts because ensure_account looked up by
our `id`, never found it, and POSTed a new account on every attempt. Deleted
the duplicates manually via DELETE /accounts/{id}.
This change:
- ensure_account now returns Wealthfolio's UUID; matches existing via
(provider, providerAccountId)
- pipeline remaps activity.account_id to the WF UUID at submission time
but keeps dedup keyed on our stable id (WF resets must not blow away
the whole dedup history)
- test updates to the new account-shape + dedup key expectations
poetry run pytest -q 70 passed
poetry run mypy clean
poetry run ruff check clean
/api/v1/activities/import expects Content-Type: application/json with body
{"activities": [ActivityImport, ...]} where ActivityImport is camelCase:
{date, symbol, activityType, quantity?, unitPrice?, currency, fee?, amount?,
comment?, accountId?, ...}. Source: crates/core/src/activities/activities_model.rs
Live run failure was HTTP 415 Unsupported Media Type because we were uploading
a CSV in multipart/form-data; that endpoint is JSON-only.
Also handle two response shapes on import — older builds return a list, current
build returns {activities: [...]}.
Test plan:
poetry run pytest -q → 70 passed
poetry run mypy → clean
poetry run ruff check → clean
Wealthfolio 3.2's POST /api/v1/accounts was 422ing on live traffic — its
NewAccount struct uses camelCase field names and requires isDefault +
isActive as booleans. Reference:
https://github.com/afadil/wealthfolio/blob/main/apps/server/src/models.rs#L~145
Sends trackingMode=TRANSACTIONS so Wealthfolio computes holdings from
our imported activities (vs HOLDINGS mode which requires periodic
holdings snapshots). Populates providerAccountId so the broker account
is traceable back to our sync's id scheme.
Test plan:
poetry run pytest -q → 70 passed
poetry run mypy → clean
poetry run ruff check → clean
Live re-run of the backfill Job follows this commit's image rebuild.
Context
-------
Two live-integration bugs surfaced during the Phase 0.5 auth-spike
run against the restored production Wealthfolio.
1. Wealthfolio 3.2's LoginRequest schema is `{ password: String }` —
it rejects any request with an unknown `username` field as HTTP
400 (empty body, hard to debug). Upstream source:
https://github.com/afadil/wealthfolio/blob/main/apps/server/src/auth.rs#L86-L88
2. Dockerfile referenced `/opt/poetry/bin/poetry` but pip install
puts poetry on the normal PATH; POETRY_HOME only affects the
self-installer, not `pip install`. Exit 127 in GHA build.
This change
-----------
- WealthfolioSink.login() sends `{password}` only; kept `username`
constructor arg as a stub for the day Wealthfolio adds multi-user.
- Dockerfile drops POETRY_HOME and uses `poetry` on PATH.
- Test: `_login_ok` now asserts body == {"password": "hunter2"}
("hunter2" is the XKCD placeholder — not a real credential).
Test plan
---------
## Automated
- poetry run pytest -q → 70 passed
- poetry run mypy broker_sync tests → Success: no issues found in 29 source files
- poetry run ruff check . → All checks passed!
## Manual Verification (executed live)
```
kubectl -n wealthfolio port-forward svc/wealthfolio 18080:80 &
WF_BASE_URL=http://localhost:18080 WF_USERNAME=admin \
WF_PASSWORD=<from-vault> \
poetry run broker-sync auth-spike
→ "Logged in. 1 account(s) visible."
```
Context
-------
This is the Phase 0.5 deliverable — the hardest-to-validate unknown
in the plan. Wealthfolio auth is JWT HttpOnly cookie with a 5-req/min
login rate limit. CronJob pods are ephemeral, so we persist cookies
to disk between runs (shared PVC in production).
Plan stress-test also flagged: use the CSV import path, not per-row
JSON POST. Wealthfolio's UI uses /activities/import and its dedup
logic is battle-tested; CSVs double as audit artefacts we can replay.
This change
-----------
- WealthfolioSink (httpx async): login with username/password, persists
cookie dict to session_path on disk, attaches it as a Cookie header
on subsequent calls.
- 401 on a non-login endpoint triggers a single re-login + retry.
- ensure_account() is idempotent — GETs the account list first, only
POSTs /accounts if id is missing.
- import_activities() always runs /activities/import/check first; any
non-2xx there raises ImportValidationError and we never touch the
real import endpoint. Protects against half-written state when the
broker emits a symbol Wealthfolio doesn't know.
- httpx.MockTransport-based tests cover: login persistence, 401 on
login raises UnauthorizedError, session reuse from disk, 401 retry
path, ensure_account idempotency + creation, import dry-run-then-real
sequencing, halt on check failure.
Not yet covered (deferred):
- Multi-process file lock on session_path (single-process enough for
now; Phase 1 adds it when multiple CronJobs run concurrently).
- 429 jittered backoff (TBD when Wealthfolio actually rate-limits us).
Test plan
---------
## Automated
- poetry run pytest -q → 31 passed
- poetry run mypy broker_sync tests → Success: no issues found in 17 source files
- poetry run ruff check . → All checks passed!
## Manual Verification
Live auth spike against https://wealthfolio.viktorbarzin.me deferred
until the password is seeded into Vault at secret/broker-sync/wealthfolio
in a follow-up commit (needs Viktor's Vault session).
Context
-------
Every broker connector needs a uniform shape so the orchestrator can
fan out without knowing provider-specific details. Normalisation (GBP
conversion) lives outside providers on purpose — keeping providers
native-currency-emitters means we can re-normalise historical activity
when HMRC rates land without re-fetching from the broker.
This change
-----------
- providers/base.py: Provider Protocol with `accounts()` and async
`fetch(since, before)` iterator. No abstract base class — duck-typed
Protocol so each concrete provider stays independent.
- normaliser.py: takes a native Activity + FxCache, returns a copy
with amount_gbp/fx_rate_gbp/fx_rate_source filled in. Two modes:
qty*price for BUY/SELL, amount for DIVIDEND/DEPOSIT/etc.
- Namespace packages for providers/, providers/parsers/, sinks/ so
future modules slot in cleanly.
Test plan
---------
## Automated
- poetry run pytest -q → 23 passed
- poetry run mypy broker_sync tests → Success: no issues found in 14 source files
- poetry run ruff check . → All checks passed!
## Manual Verification
Not applicable at this layer.