31 commits

Author SHA1 Message Date
Viktor Barzin
dfee29fda7 [ci] Add Woodpecker build pushing to forgejo.viktorbarzin.me/viktor/wealthfolio-sync
Companion to the existing GHA pipeline that pushes broker-sync to
DockerHub. The Woodpecker build pushes to Forgejo as wealthfolio-sync
(image name kept to match the existing infra/stacks/wealthfolio/main.tf
CronJob reference, which has been broken since registry-private lost
the image).
2026-05-07 22:33:29 +00:00
Viktor Barzin
1d1e20b72b schwab: detect vest-confirmation emails + emit VestEvent
Extends parse_schwab_email to handle Schwab's RSU Release Confirmation
emails alongside the existing trade confirmations. Adds:

- `VestEvent` dataclass in models.py — carries vest_date, ticker,
  shares_vested, shares_sold_to_cover, fmv_at_vest_usd, tax_withheld_usd.
  Written to payslip_ingest.rsu_vest_events by a postgres sink (pending
  a real email fixture + cross-service DB grant).
- `parse_schwab_email_full()` — new entry point returning both
  `list[Activity]` and `VestEvent | None`. The legacy
  `parse_schwab_email()` shape is preserved for existing callers.
- Vest-release dispatch heuristic: HTML body mentions "Release
  Confirmation" / "Award Vesting" / "RSU Release". On match, extract
  vest fields via label regexes; the full vest becomes a BUY Activity
  and the sell-to-cover slice becomes a SELL Activity at the same FMV
  (net zero cash on the day). Gross vest + sell-to-cover returned so
  Wealthfolio gets the full portfolio picture.
- Tests: 3 new (vest roundtrip, unparseable-vest safety, legacy shape
  preserved); existing 6 unchanged.
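
A minimal sketch of the vest expansion described above, not the code in models.py. The field names come from this message; the types and the `vest_to_trades` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class VestEvent:
    # Field names follow the commit message; the types are assumptions.
    vest_date: date
    ticker: str
    shares_vested: Decimal
    shares_sold_to_cover: Decimal
    fmv_at_vest_usd: Decimal
    tax_withheld_usd: Decimal


def vest_to_trades(v: VestEvent) -> list[tuple[str, Decimal, Decimal]]:
    """Hypothetical helper: return (type, quantity, unit_price) rows.

    The gross vest becomes a BUY and the sell-to-cover slice a SELL,
    both priced at the same FMV.
    """
    rows = [("BUY", v.shares_vested, v.fmv_at_vest_usd)]
    if v.shares_sold_to_cover > 0:
        rows.append(("SELL", v.shares_sold_to_cover, v.fmv_at_vest_usd))
    return rows
```

Pricing both legs at the vest-day FMV means the SELL realises no gain or loss on the shares sold to cover tax.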

The regex heuristics will need tightening once a real email sample
exists — the HTML structure observed in public Schwab emails may
differ in material ways. For now, unmatched vest bodies return
an empty result (no Activity, no VestEvent) rather than crashing the
IMAP batch.

Part of: code-860
2026-04-19 18:27:58 +00:00
Viktor Barzin
6f3bcea23e ci: fix ruff E501 + mypy None-comparison warning
test_imap.py:49 — one-line comment ran past the 100-char line limit
introduced in commit c830856. Split the "£20,000 cap" note onto its
own line above the call.

test_fidelity_planviewer.py:108 — mypy flagged `offset.amount > 0`
where amount is typed Decimal | None. Added an explicit `is not None`
guard; runtime behaviour unchanged (we already check offset is not
None two lines earlier).

$ poetry run ruff check . → All checks passed!
$ poetry run mypy broker_sync tests → Success: no issues found in 43 source files
$ poetry run pytest -q → 133 passed, 1 skipped

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:52:38 +00:00
Viktor Barzin
6450201af0 pipeline: emit matching DEPOSIT/WITHDRAWAL for every BUY/SELL
## Context

The 2026-04-18 reconciliation ended with Wealthfolio's historical Net
Worth chart showing cliff-jumps on 5 dates — the single-day lump cash
offsets we'd posted to "zero out" phantom cash. An operational fix
replaced those 6 lumps with 231 per-BUY/SELL matched DEPOSIT/WITHDRAWAL
rows (see code-r9n note). That made the chart smooth — but only for
today's data. Any future broker-sync run would re-introduce phantom
cash because providers emit BUY/SELL only; nothing on the cash side.

This commit bakes the match into the pipeline so **future syncs
self-balance cash at import time** and the chart stays smooth.

## This change

- broker_sync/pipeline.py
  - New _matched_cash_flow(a): returns a DEPOSIT for a BUY (amount =
    qty * unit_price + fee) or a WITHDRAWAL for a SELL (amount =
    qty * unit_price - fee). Returns None for every other activity
    type — DEPOSIT/WITHDRAWAL/DIVIDEND/etc. already touch cash
    directly. The synthetic activity carries a deterministic
    external_id `cash-flow-match:<buy|sell>:<original external_id>`
    so SyncRecordStore dedup handles idempotency across runs.
  - New _with_cash_flow_match(a): expand helper — returns [a] or
    [a, match]. Pure, testable.
  - sync_provider_to_wealthfolio loops over the expansion, so each
    activity may now contribute up to two rows to the batch. `fetched`
    still counts provider-side activities only; `new_after_dedup` +
    `imported` + `failed` count expanded rows.
- tests/test_pipeline.py
  - Updated two existing pipeline integration tests to reflect the
    now-larger batch shape (3 BUYs become 6 rows after expansion).
  - 5 new unit tests for the helpers: BUY → DEPOSIT with fee,
    SELL → WITHDRAWAL net of fee, DEPOSIT/WITHDRAWAL/DIVIDEND pass
    through, zero-amount trades skipped, _with_cash_flow_match
    returns the right cardinality.
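
A minimal sketch of the matching rule above, not the code in pipeline.py. The `Activity` class here is a hypothetical stand-in for the real model, and how the synthetic row stores its amount is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Activity:
    # Hypothetical stand-in for the real broker_sync Activity model.
    external_id: str
    activity_type: str  # "BUY", "SELL", "DEPOSIT", ...
    quantity: Decimal = Decimal("0")
    unit_price: Decimal = Decimal("0")
    fee: Decimal = Decimal("0")
    amount: Optional[Decimal] = None


def matched_cash_flow(a: Activity) -> Optional[Activity]:
    """Return the offsetting DEPOSIT/WITHDRAWAL for a trade, else None."""
    if a.activity_type == "BUY":
        kind, amount = "DEPOSIT", a.quantity * a.unit_price + a.fee
    elif a.activity_type == "SELL":
        kind, amount = "WITHDRAWAL", a.quantity * a.unit_price - a.fee
    else:
        return None  # other types already touch cash directly
    if amount == 0:
        return None  # zero-amount trades are skipped
    return Activity(
        # Deterministic id, so dedup makes re-runs idempotent.
        external_id=f"cash-flow-match:{a.activity_type.lower()}:{a.external_id}",
        activity_type=kind,
        amount=amount,
    )


def with_cash_flow_match(a: Activity) -> list[Activity]:
    """Expand one activity into [a] or [a, match]."""
    match = matched_cash_flow(a)
    return [a] if match is None else [a, match]
```

Because the synthetic id is derived only from the original external_id, a second sync of the same BUY produces the same DEPOSIT id and is dropped by dedup.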

## What is NOT in this change

- Provider-level opt-out (e.g., Provider.emits_matching_cash_flow =
  True). No current provider emits real cash flows alongside trades
  (Trading212 only calls /orders, not /transactions), so the default
  "always match" is safe. If we ever wire a provider that pulls real
  bank-transfer dates, add the opt-out then.
- Retroactive cleanup of already-imported WF accounts — already done
  operationally today.

## Verification

### Automated

$ poetry run pytest tests/test_pipeline.py -v
7 passed in 0.40s

$ poetry run pytest -q
133 passed, 1 skipped in 8.58s

$ poetry run mypy broker_sync/pipeline.py tests/test_pipeline.py
Success: no issues found in 2 source files

$ poetry run ruff check broker_sync/pipeline.py tests/test_pipeline.py
All checks passed!

### Manual — next sync

Once this image ships and broker-sync-trading212 / broker-sync-imap /
broker-sync-fidelity run, confirm:
1. kubectl -n broker-sync logs job/<next-run> → fetched=N new=2N
   imported=2N failed=0 (doubled due to matches).
2. WF /api/v1/holdings?accountId=<uuid> → cash ≈ £0 for every currency
   after import.
3. Net Worth chart has no new cliff-jumps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:12:49 +00:00
Viktor Barzin
7c9be544dc fidelity-planviewer: bake Chromium into the image for headless Playwright
## Context

The Fidelity provider (commit 804e6a8) drives headless Chromium via
Playwright to refresh the PlanViewer session cookie jar and scrape the
Struts2 transaction history page. The image needs both the Chromium
runtime and the Debian system libs Chromium dynamic-links against.

## This change

- Adds Playwright's documented Debian 12 dependency set
  (fonts-liberation, libnss3, libxkbcommon0, xvfb, etc.).
- Creates /app/.playwright-browsers owned by the broker user so the
  non-root process can write the Chromium install, and runs `playwright
  install chromium` as that user so the browser lands in the right
  cache path (PLAYWRIGHT_BROWSERS_PATH=/app/.playwright-browsers).
- Image size will grow by ~300MB (Chromium headless shell is ~110MB
  compressed, plus libs). Acceptable — broker-sync runs once a day so
  pull cost is a one-shot.

## What is NOT in this change

- Terraform CronJob / monitoring — separate commit in the infra repo.

## Verification

$ docker build -t broker-sync:test . → (will run in CI)
$ docker run --rm broker-sync:test fidelity-seed --help → shows the
  CLI help (can't actually run fidelity-seed headlessly).
$ poetry run pytest -q (local) → 128 passed, 1 skipped.

Reproduce locally:
1. docker build -t broker-sync:fidelity-test .
2. docker run --rm -v $PWD/tests/fixtures/fidelity:/data broker-sync:fidelity-test \
     python -c "from playwright.sync_api import sync_playwright
with sync_playwright() as p: b = p.chromium.launch(); b.close(); print('ok')"
3. Expected: "ok" — Chromium launches successfully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:50:54 +00:00
Viktor Barzin
804e6a89de fidelity-planviewer: wire provider to real PlanViewer session + JSON API
## Context

Prior commit 832732a scaffolded the provider with a stub fetch() that
raised FidelityProviderConfigError. This commit replaces the stub with
the end-to-end ingest flow, validated against the real PlanViewer site
during a live login session on 2026-04-18.

Fidelity UK PlanViewer mixes a legacy Struts2 HTML app
(www.planviewer.fidelity.co.uk) with a React SPA at
pv.planviewer.fidelity.co.uk. Authentication is PingFederate OAuth2 at
id.fidelity.co.uk — password + memorable word + SMS OTP, with a
remember-device cookie that keeps the session alive for weeks. The
transaction history is server-rendered HTML at DisplayMyPlanMemberTransHist.action;
current fund holdings come from the DisplayValuation.action JSON XHR.

Both live behind the same cookie jar, so one Playwright session (seeded
interactively once, kept alive via storage_state) can scrape both.

## This change

- broker_sync/providers/parsers/fidelity.py (NEW)
  - parse_transactions_html: extracts cash-impacting rows from the
    #myplan_member_transhist_support table, skips Bulk Switches (no cash
    movement), emits FidelityCashTx with deterministic external_id for
    dedup.
  - parse_valuation_json: lifts fund code + name + units + price +
    contribution-type breakdown from the JSON payload.
- broker_sync/providers/fidelity_planviewer.py (REWRITTEN)
  - FidelityPlanViewerProvider.fetch() now loads storage_state, boots
    headless Chromium, navigates landing → main page (to hydrate the
    SPA session + capture DisplayValuation XHR) → transactions page
    with a wide 01 Jan 1990 → today window. Raises FidelitySessionError
    if PlanViewer shows the 15-min idle page or redirects back to
    id.fidelity.co.uk.
  - _gains_offset_activity emits a synthetic DEPOSIT/WITHDRAWAL with a
    date-keyed external_id so WF Net Worth reconciles to the
    Fidelity-reported pot value without stacking duplicates across
    monthly runs.
  - Rolls storage_state back to disk after each run, extending session
    TTL.
- tests/providers/test_fidelity_planviewer.py (EXTENDED)
  - 8 tests against a real captured fixture: account shape, guard on
    missing storage_state, full-fixture round-trip (51 txs summing to
    £102,004.15), Bulk Switch filtered, deterministic external_id,
    valuation parse with fund-code resolution, gains-offset direction
    + skip-when-empty.
- tests/fixtures/fidelity/transactions-full.html + valuation.json (NEW)
  - Sanitised captures from the 2026-04-18 live session.

## What is NOT in this change

- CronJob + Vault secret wiring + Prometheus alert in
  infra/stacks/broker-sync/main.tf — next commit.
- Dockerfile Chromium install — next commit.
- The scrape-and-import was already done manually (51 activities +
  1 gains offset imported into WF account a7d6208d); this commit
  productionises the code path so the monthly cron can do the same.

## Verification

### Automated

$ poetry run pytest tests/providers/test_fidelity_planviewer.py -v
8 passed in 0.88s

$ poetry run pytest -q
128 passed, 1 skipped in 1.41s

$ poetry run mypy broker_sync/providers/fidelity_planviewer.py broker_sync/providers/parsers/fidelity.py
Success: no issues found in 2 source files

$ poetry run ruff check broker_sync/providers/fidelity_planviewer.py broker_sync/providers/parsers/fidelity.py
All checks passed!

### Manual verification (2026-04-18 live run)

1. poetry run broker-sync fidelity-seed (headed browser + SMS OTP) —
   captured storage_state, staged to Vault.
2. Inline import script hit the same code paths the provider now runs;
   52 activities imported into a new WF WORKPLACE_PENSION account, WF
   Net Worth jumped from £865,358 → £1,003,083.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 18:47:38 +00:00
Viktor Barzin
832732a419 fidelity-planviewer: scaffold provider + CLI (seed + stub ingest)
## Context

UK workplace pension at planviewer.fidelity.co.uk has no public API; the SPA
calls a private JSON backend at prd.wiciam.fidelity.co.uk/cvmfe/api/*. Viktor
confirmed in DevTools that an OPTIONS preflight lists auth headers
(ch, fid, rid, sid, tbid, theosreferer, ua). Full reverse-engineering of the
endpoint paths is pending Viktor's POST cURL paste for transactions +
holdings views.

Until those endpoints are captured, ship the scaffold: provider module, CLI
commands, tests, docs. This unblocks installing Playwright in the image and
lets Viktor run the one-off seed command on his laptop ahead of the data
integration.

## This change

- broker_sync/providers/fidelity_planviewer.py
  - FidelityCreds namedtuple (storage_state_path, plan_id).
  - FidelitySessionError (401 → re-seed), FidelityProviderConfigError.
  - FidelityPlanViewerProvider: .accounts() returns a single
    WORKPLACE_PENSION account, .fetch() raises until endpoints are wired.
- broker_sync/cli.py
  - fidelity-seed: launches headed Chromium so Viktor can log in and tick
    "Remember device", then dumps storage_state.json.
  - fidelity-ingest: stub matching the invest-engine / trading212 CLI
    shape; reads storage_state + plan_id, pipes through the shared pipeline.
- tests/providers/test_fidelity_planviewer.py
  - Asserts the single-account shape + the loud-failure guard.
- docs/providers/fidelity-planviewer.md
  - Architecture diagram, one-time seed procedure, backfill + monthly
    commands, alert runbook.
- pyproject.toml
  - playwright ^1.47 as a first-class dep (used only by fidelity-seed and
    later by the session-refresh step in fidelity-ingest).

## What is NOT in this change

- Endpoint wiring in provider.fetch() — blocked on DevTools POST cURL.
- Infra CronJob + Vault secret + Prometheus alert — lands once the first
  manual backfill succeeds and we know the Chromium image size is fine.
- Dockerfile Chromium install — same trigger.

## Verification

### Automated

$ poetry run pytest tests/providers/test_fidelity_planviewer.py -v
2 passed in 0.08s

$ poetry run pytest -q
122 passed, 1 skipped in 1.07s

$ poetry run mypy broker_sync/providers/fidelity_planviewer.py broker_sync/cli.py
Success: no issues found in 2 source files

$ poetry run ruff check broker_sync/providers/fidelity_planviewer.py broker_sync/cli.py tests/providers/test_fidelity_planviewer.py
All checks passed!

### Manual (Viktor, later)

1. poetry install && poetry run playwright install chromium
2. poetry run broker-sync fidelity-seed --out /tmp/state.json
3. Chromium opens → log in → tick "Remember device" → press Enter
4. vault kv patch secret/broker-sync fidelity_storage_state=@/tmp/state.json

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:09:04 +00:00
Viktor Barzin
c830856ba1 imap: route IE BUYs to ISA first-£20k / GIA overflow per UK tax year
## Context

Viktor's InvestEngine account has both an ISA and a GIA wrapper. Trade
confirmation emails (info@investengine.com) are identical between them —
subject "Here's how your portfolio looks now", body shows "Client name:
Viktor Barzin" with no portfolio/account type. That left the IMAP parser
hardcoded to route every IE BUY to the ISA (invest-engine-primary),
which produced a 2339-share over-count when 2023-24 GIA buys landed in
the ISA during the 2026-04-18 reconciliation.

Viktor's rule: from 6 April each tax year, BUYs fill ISA up to the
£20,000 cap, then overflow to GIA. This commit codifies that rule in a
standalone batch splitter and applies it at the ImapProvider boundary.

Also picks up a silent-drop bug surfaced during the same reconciliation:
WF's /import (unlike /import/check) rejects naive datetimes with
"Invalid date". The sink now coerces tzinfo=UTC defensively so every
provider gets the same guarantee.
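
A minimal sketch of the defensive coercion, assuming naive datetimes are meant to be read as UTC. `ensure_utc` is a hypothetical name, not the helper in the sink.

```python
from datetime import datetime, timezone


def ensure_utc(dt: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave aware datetimes untouched."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt


# A naive date-only timestamp gains an explicit offset when serialised.
print(ensure_utc(datetime(2022, 5, 24)).isoformat())
```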

## This change

- `_split_ie_by_isa_cap(activities)` — sorts all IE-ISA BUYs by date and
  walks them once per UK tax year (6 April boundary). A BUY whose running
  tax-year total BEFORE it is strictly below £20k stays on the ISA;
  otherwise it flips to a new `invest-engine-gia` account_id. No
  fractional splits — boundary activities go whole to whichever bucket
  their pre-running-total dictates. Non-IE and non-BUY activities pass
  through unchanged.
- `ImapProvider.accounts()` gains an `invest-engine-gia` Account so the
  pipeline's `_ensure_accounts` can resolve both.
- `ImapProvider.fetch()` calls the splitter on the full batch before
  applying the `since`/`before` date filter — batch-level sort
  guarantees consistent routing regardless of the order IMAP returns
  messages.
- `WealthfolioSink._activity_to_import_row` coerces naive datetimes to
  UTC so the row passes WF /import validation.
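
The routing rule can be sketched as below. This is an illustration under stated assumptions, not the code in imap.py: `Buy` is a hypothetical stand-in for an IE-ISA BUY Activity, and the running total is assumed to count every BUY in the tax year regardless of which account it lands in.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal

ISA_CAP = Decimal("20000")


@dataclass(frozen=True)
class Buy:
    # Hypothetical minimal stand-in for an IE BUY Activity.
    on: date
    amount: Decimal
    account_id: str = "invest-engine-primary"


def uk_tax_year_start(d: date) -> int:
    """Calendar year in which d's UK tax year began (6 April boundary)."""
    return d.year if (d.month, d.day) >= (4, 6) else d.year - 1


def split_by_isa_cap(buys: list[Buy]) -> list[Buy]:
    """Route each BUY whole to ISA or GIA based on the total BEFORE it."""
    running: dict[int, Decimal] = {}
    out: list[Buy] = []
    for b in sorted(buys, key=lambda x: x.on):  # IMAP order is not trusted
        year = uk_tax_year_start(b.on)
        before = running.get(year, Decimal("0"))
        if before >= ISA_CAP:
            b = replace(b, account_id="invest-engine-gia")
        running[year] = before + b.amount
        out.append(b)
    return out
```

Note that a BUY that starts below the cap stays on the ISA even if it pushes the total past £20,000, which matches the "no fractional splits" rule.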

## What is NOT in this change

- No retroactive re-routing of data already in WF. Historical
  finance-mysql rows (all lumped to `invest-engine-primary` or
  `invest-engine-gia` by the existing heuristic) keep their current
  account assignment. If a past tax-year was routed "wrong" under the
  new rule, that's corrected manually via the WF API, not here.
- No change to the Schwab or trading212 paths.

## Verification

### Automated

```
$ poetry run pytest tests/providers/test_imap.py -v
tests/providers/test_imap.py::test_uk_tax_year_start_before_april_6_rolls_back PASSED
tests/providers/test_imap.py::test_single_tax_year_under_cap_stays_isa PASSED
tests/providers/test_imap.py::test_overflow_past_cap_flips_to_gia PASSED
tests/providers/test_imap.py::test_tax_year_boundary_resets_cap PASSED
tests/providers/test_imap.py::test_out_of_order_activities_sorted_before_cap_applied PASSED
tests/providers/test_imap.py::test_non_ie_activities_passed_through_unchanged PASSED
6 passed in 0.36s

$ poetry run pytest -q --ignore=tests/test_cli.py
116 passed, 1 skipped in 2.76s

$ poetry run ruff check broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
All checks passed!

$ poetry run mypy broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
Success: no issues found in 2 source files
```

### Manual verification

The tzinfo fix was validated against the live WF instance during the
2026-04-18 reconciliation — before the fix, /import returned
`"errors": {"symbol": ["Invalid date '2022-05-24T00:00:00'."]}` for
every IMAP activity; after, the same payload imported cleanly.

The splitter was not exercised against live IMAP data because Viktor's
mailbox only has Apr 2022 → Feb 2024 emails, all inside finance.position's
existing coverage. Running IMAP ingest with `since=2024-04-06` yields
fetched=0. The unit tests cover the boundary arithmetic; a live run
will happen when newer emails are parsed (or when finance coverage is
re-scoped).

## Reproduce locally

1. `poetry install`
2. `poetry run pytest tests/providers/test_imap.py`
3. Expected: 6 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:02:49 +00:00
Viktor Barzin
a190875f63 Add finance_mysql provider + CLI for historical backfill
finance.position (171 rows, 2020-06-07 to 2025-12-19) is the only source
of InvestEngine + Schwab trade history pre-dating the broker-sync project.
This provider reads it once and pushes every row into the correct WF
account (.L tickers → IE ISA, others → Schwab).

Dedup: external_id = 'finance-mysql:position:<PK>' — idempotent on re-run.
Auth: aiomysql as MySQL root (user-authorized) against the standalone
mysql:8.4 in-cluster service.
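
A minimal sketch of the routing and dedup key described above. The function names are hypothetical, and the account ids are assumed to match the ones the IE and Schwab parsers use elsewhere in this history.

```python
def route_account(ticker: str) -> str:
    """London-listed (.L) tickers go to the IE ISA, everything else to Schwab."""
    if ticker.strip().upper().endswith(".L"):
        return "invest-engine-primary"  # assumed account id for the IE ISA
    return "schwab-workplace"  # assumed account id for Schwab


def position_external_id(pk: int) -> str:
    """Stable key derived from the row's primary key, so re-runs dedup."""
    return f"finance-mysql:position:{pk}"
```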

New CLI: broker-sync finance-mysql-import
New tests: 5 unit tests covering route, symbol normalise, BUY/SELL
detection.

poetry run pytest -q   →  114 passed, 1 skipped
poetry run mypy        →  clean (aiomysql shielded with type: ignore)
poetry run ruff check  →  clean
2026-04-17 22:38:21 +00:00
Viktor Barzin
74b2179c83 sinks: read summary.imported as truth for partial-persist detection
The /import response returns activities=[input echo with errors annotated]
— its length equals input size regardless of actual persistence. The
summary{total,imported,skipped,duplicates} block is the authoritative
signal. When imported<total, raise with errorMessage + skipped + duplicates
counts.
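
A minimal sketch of the check, assuming the response is already decoded to a dict. The placement of `errorMessage` inside `summary` is an assumption, as is the exact wording of the raised message.

```python
class ImportValidationError(Exception):
    """Raised when Wealthfolio persisted fewer rows than were sent."""


def check_import_summary(response: dict) -> int:
    """Return the imported count, or raise if any row was not persisted."""
    summary = response.get("summary") or {}
    total = summary.get("total", 0)
    imported = summary.get("imported", 0)
    if imported < total:
        raise ImportValidationError(
            f"imported {imported}/{total} "
            f"(skipped={summary.get('skipped', 0)}, "
            f"duplicates={summary.get('duplicates', 0)}): "
            f"{summary.get('errorMessage')}"  # assumed field location
        )
    return imported
```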
2026-04-17 22:30:24 +00:00
Viktor Barzin
4e2da87637 sinks: detect silent Wealthfolio /import drops
After the check step returns isValid=true + no errors, a row can still
be silently dropped by /import (response returns activities=[] on
200 OK). Root-cause is usually a field that check hydrates but
/import re-normalises differently (date string form, asset_id resolution).

When we send N valid rows and get back 0, raise ImportValidationError
with a snippet of the check output + first warning — gives the
operator a concrete hint to fix the producer instead of silently
growing dedup against activities that never landed.

poetry run pytest -q   →  109 passed, 1 skipped
poetry run mypy        →  clean
poetry run ruff check  →  clean
2026-04-17 22:24:36 +00:00
Viktor Barzin
6efd03570a Add imap-ingest CLI + ImapProvider: route emails to IE/Schwab parsers
Wires the IE + Schwab email parsers into an actual runnable sync. Walks
the IMAP mailbox, routes each message by sender domain:
  - *@investengine.com → invest_engine.parse_invest_engine_email
  - *@schwab.com       → schwab.parse_schwab_email
then pushes the resulting Activities through the shared pipeline.

broker-sync imap-ingest — new CLI command taking IMAP_HOST/USER/PASSWORD/
DIRECTORY (mirrors the old wealthfolio-sync image's env shape so the
Terraform CronJob's existing env wiring works unchanged).

Verified: poetry run pytest -q → 109 passed + 1 skipped; mypy strict
clean (37 files); ruff + yapf clean.
2026-04-17 22:12:05 +00:00
Viktor Barzin
f089b8b93a Add Schwab email parser (port from finance/)
Schwab's workplace-RSU confirmation emails have 5 data td elements
with class='dark-background-body' align='right': date, direction, qty,
ticker, price-with-currency-sign. One email → one Activity.

- parse_schwab_email(raw_html) -> list[Activity] (1-item or empty)
- Empty on any parse failure (IMAP batch shouldn't crash on one bad mail)
- Deterministic external_id ('schwab:date:ticker:type:qty') — stable
  across re-pulls so dedup works
- Hardcoded to account 'schwab-workplace' / AccountType.GIA / USD
- 6 unit tests: SELL + BUY happy path, malformed, missing cells,
  external-id stability, commas in price

Dropped from the original finance port:
- msg_timestamp-based external id (non-deterministic — would re-import
  on every IMAP walk). Replaced with a hash-stable key.
- Currency.from_sign() currency hack. Schwab US is USD-only; we'll add
  FX when that changes.

poetry run pytest -q   →  109 passed, 1 skipped
poetry run mypy        →  clean (added types-python-dateutil)
poetry run ruff check  →  clean
2026-04-17 22:08:40 +00:00
Viktor Barzin
1aa60ce348 Merge ie-email-parser: HTML + CSV fallbacks + failure-mode tests
# Conflicts:
#	broker_sync/providers/parsers/invest_engine.py
#	tests/providers/parsers/test_invest_engine.py
2026-04-17 22:06:29 +00:00
Viktor Barzin
89e9710d24 Merge ie-bearer-client: IE Bearer-token HTTP client + CLI subcommand
2026-04-17 22:05:11 +00:00
Viktor Barzin
87526898e6 Pin InvestEngine parser failure modes — empty-on-junk + partial-match
Context: The port's graceful-failure contract was implicit in the way
each strategy returns None/[] on malformed input, but without tests it
was an accidental property that could regress silently. Codify it.

Two invariants, each backed by a fixture:

1. Junk email → empty list, never raise.
   `unparseable.eml` is a pure-marketing IE newsletter with no order
   data. All three strategies try and fail; parse_invest_engine_email
   returns []. No exception leaks.

2. Partial HTML email → intact orders only.
   `html_partial_match.eml` has two nested summary tables: one with a
   valid VUAG order, one that is missing both the ticker and "Bought N
   @ £P" rows (simulates IE dropping content mid-render). The parser
   returns just the VUAG order.

No implementation change needed — the behaviour existed as a side
effect of _try_html_summary_table returning None on missing fields.
These tests lock it down so future refactors can't quietly break it.

Test plan:
  poetry run pytest tests/providers/parsers/ -q   →  8 passed in 0.19s
  poetry run mypy broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py   →  clean
  poetry run ruff check broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py   →  All checks passed!
  poetry run yapf --diff   →  clean (no diff)

Manual verification:
- Load unparseable.eml → parse returns [].
- Load html_partial_match.eml → parse returns exactly 1 activity (VUAG).
2026-04-17 22:02:48 +00:00
Viktor Barzin
020ba16723 Add CSV attachment fallback for InvestEngine email parser
Context: IE has not (yet) sent CSV-attached statements in production,
but the upstream parser had _extract_positions_csv as a third fallback
for exactly this case. Keeping the fallback preserves behaviour-parity
with the legacy parser and makes future statement support one fixture
away — the shape is documented by column set, not scraped live.

Unlike the upstream which split the body on whitespace and broke on any
embedded commas in names, this port walks real MIME attachments using
Python's csv.DictReader. A part qualifies as CSV if:
- its Content-Type is text/csv / application/csv / application/vnd.ms-excel, OR
- its filename ends in .csv (defence against IE mis-labelling the part)

Rows missing required columns or containing unparseable numbers/dates
are skipped silently — consistent with the "partial match" contract:
a half-corrupt CSV yields whatever rows were intact. Required columns:
ticker, unit_price, quantity, date (YYYY-MM-DD), currency. Non-GBP
rows are filtered because the IE ISA is strictly sterling — flagging
this assumption in the review notes.
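
The detection and filtering rules above can be sketched with the standard library alone. This is an illustration, not the ported `_parse_csv_attachment`: it returns raw row dicts rather than Activities, and it leaves out the number/date validation.

```python
import csv
import io
from email import message_from_bytes, policy

CSV_TYPES = {"text/csv", "application/csv", "application/vnd.ms-excel"}
REQUIRED = ("ticker", "unit_price", "quantity", "date", "currency")


def csv_rows(raw_email: bytes) -> list[dict]:
    """Return intact GBP rows from every CSV-looking attachment."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    rows: list[dict] = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = (part.get_filename() or "").lower()
        # Accept on Content-Type OR on a .csv filename (mis-labelled parts).
        if part.get_content_type() not in CSV_TYPES and not filename.endswith(".csv"):
            continue
        payload = part.get_payload(decode=True) or b""
        text = payload.decode(part.get_content_charset() or "utf-8", errors="replace")
        for row in csv.DictReader(io.StringIO(text)):
            if any(not row.get(col) for col in REQUIRED):
                continue  # partial-match contract: skip broken rows silently
            if row["currency"].strip().upper() != "GBP":
                continue  # the IE ISA is sterling-only
            rows.append(row)
    return rows
```

Using csv.DictReader on a real MIME part means quoted fields containing commas survive, which a whitespace split of the body cannot guarantee.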

This change:
- Adds `_parse_csv_attachment(raw_email)` as the third strategy after
  text/plain and text/html; it re-parses the raw email bytes so we can
  inspect Content-Type/filename on each part.
- Flags symbols/currencies, filters non-GBP, and runs each row through
  the shared `_build_activity` so external_id formation matches every
  other strategy (dedup stays consistent across strategies).
- Fixture `csv_attachment.eml` has three rows (VUAG, SWDA, VUSA) in a
  `text/csv` part with a `.csv` filename — covers both detection paths.

Test plan:
  poetry run pytest tests/providers/parsers/ -q   →  6 passed in 0.15s
  poetry run mypy broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py  →  clean
  poetry run ruff check broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py  →  All checks passed!
  poetry run yapf --diff  →  clean (no diff)

Manual verification: load csv_attachment.eml, call parse_invest_engine_email,
assert 3 activities each with symbol in {VUAG,SWDA,VUSA}, currency=GBP,
notes containing "csv".
2026-04-17 22:01:46 +00:00
Viktor Barzin
f49918c74d Add broker-sync invest-engine CLI subcommand
Context: Phase 2b wiring — hand the bearer-token InvestEngineProvider
into the existing sync pipeline (sync_provider_to_wealthfolio), mirroring
the trading212 subcommand.

Environment contract:

  WF_BASE_URL, WF_USERNAME, WF_PASSWORD, WF_SESSION_PATH  (shared with trading212)
  IE_BEARER_TOKEN                                         (devtools-pasted)
  IE_TOKEN_EXPIRES_AT                                     (ISO-8601; Viktor sets on paste)
  BROKER_SYNC_DATA_DIR                                    (sync.db + checkpoint state)

Exit codes:

  0 = clean run
  1 = some rows failed to import (mirrors trading212 behaviour)
  2 = token already expired per IE_TOKEN_EXPIRES_AT, or malformed ISO
      timestamp, or live 401 response from IE (InvestEngineTokenExpiredError),
      or unknown --mode flag

The pre-request expiry check is deliberate: a CronJob that runs during
the refresh window would otherwise waste a request on a dead token and
get the same 401 that we already know about from the clock. Exit 2
from the clock-only path also separates "token is old" from "wealthfolio
rejected a batch" in the CronJob alert pipeline.

Mode defaults:

  --mode steady    → since = now - 30d  (bigger window than T212's 7d
                    because the IE sync only runs once a month in steady
                    state; 30d guarantees no gap even after a missed run)
  --mode backfill  → since = None       (full history)

This change:
 - `invest-engine` subcommand added to broker_sync/cli.py
 - Token-expiry pre-check (clock), IE_TOKEN_EXPIRES_AT ISO parsing with
   a UTC default for naive timestamps, and graceful handling of
   InvestEngineTokenExpiredError surfaced during pipeline run
 - 3 new tests in tests/test_cli.py covering the 3 exit-2 paths
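
A minimal sketch of the clock-only pre-check and the mode windows, assuming the helper names below (they are hypothetical, not the ones in cli.py). Only the expiry and malformed-timestamp paths are covered; the live 401 path needs a real request.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_expires_at(raw: str) -> datetime:
    """Parse IE_TOKEN_EXPIRES_AT; naive timestamps default to UTC.

    datetime.fromisoformat raises ValueError on malformed input.
    """
    dt = datetime.fromisoformat(raw)
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)


def precheck_exit_code(raw_expires_at: str, now: datetime) -> int:
    """Return 2 if the token is expired or the timestamp is malformed, else 0."""
    try:
        expires = parse_expires_at(raw_expires_at)
    except ValueError:
        return 2
    return 2 if now >= expires else 0


def since_for_mode(mode: str, now: datetime) -> Optional[datetime]:
    """steady = 30-day window, backfill = full history."""
    if mode == "steady":
        return now - timedelta(days=30)
    if mode == "backfill":
        return None
    raise ValueError(f"unknown mode: {mode}")
```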

## Automated

poetry run pytest tests/test_cli.py -v
======================== 4 passed in 0.28s =========================

poetry run pytest -q
98 passed, 1 skipped in 0.85s

poetry run mypy --strict .
Success: no issues found in 34 source files

poetry run ruff check .
All checks passed!

## Manual Verification

  1. Populate Vault keys per the docstring in
     broker_sync/providers/invest_engine.py (Viktor pastes token + sets
     expires_at to the Monday morning of next month).
  2. Set env:
       export WF_BASE_URL=https://wealthfolio.viktorbarzin.me
       export WF_USERNAME=viktor
       export WF_PASSWORD=<from Vault>
       export IE_BEARER_TOKEN=<from Vault>
       export IE_TOKEN_EXPIRES_AT=<from Vault>
       export BROKER_SYNC_DATA_DIR=/tmp/ie-smoke
  3. poetry run broker-sync invest-engine --mode backfill
     Expected: single line "invest-engine: fetched=N new=M imported=M failed=0"
     on success; exit 2 with "InvestEngine token expired..." if the clock
     or server disagrees; exit 2 with "IE_TOKEN_EXPIRES_AT not a valid
     ISO-8601 timestamp..." if the env var is malformed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:59:31 +00:00
Viktor Barzin
72d348e294 Add HTML table fallback for InvestEngine email parser
Context: Plain-text IE emails vanished around 2024-Q2 when IE switched to
an HTML-only template with per-order nested summary tables. The RFC 2822
line parser returns [] on those modern emails, so we need a fallback
that walks the HTML table structure.

Upstream _extract_from_html parsed a fixed DOM path (table[1].tr[10].
table) and only handled ONE order per email. The real IE HTML template
nests one summary <table> per ticker inside the second top-level table —
multiple orders in a single batched confirmation are common — so this
port walks every leaf table (no child <table>) and interprets each one
as an independent trade summary. Structural (non-leaf) tables are
skipped to avoid double-counting via get_text().

This change:
- `_parse_html_tables(body)` extracts the date once from the full text
  then walks leaf tables looking for "Bought N @ £P" rows.
- `_try_html_summary_table` parses one leaf; returns None on structural
  tables or missing ticker/qty/price — so a partial email yields only
  its intact orders (the "2 orders, 1 parseable → 1 returned" invariant
  works by construction without raising).
- `parse_invest_engine_email` now falls through text/plain → text/html
  in the multipart message, picking the first strategy that returns
  activities. Order matters: text/plain wins when both succeed because
  the RFC 2822 strategy is the more constrained grammar.
- Regexes are module-level constants so they compile once per process.
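
The leaf-table walk can be sketched with the standard library's html.parser. This is an assumption-laden illustration: the port itself may use a different HTML library, the regex is a guess at the "Bought N @ £P" row shape, and ticker and date extraction are left out.

```python
import re
from html.parser import HTMLParser

# Module-level so it compiles once per process.
BOUGHT_RE = re.compile(r"Bought\s+([\d.]+)\s*@\s*£([\d.,]+)")


class LeafTableText(HTMLParser):
    """Collect the text of every <table> that has no nested <table>."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: list[list] = []  # [text_chunks, has_child_table] per open table
        self.leaves: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._stack:
                self._stack[-1][1] = True  # parent is structural, not a leaf
            self._stack.append([[], False])

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            chunks, has_child = self._stack.pop()
            if not has_child:
                self.leaves.append(" ".join(chunks))

    def handle_data(self, data):
        # Text is credited only to the innermost open table.
        if self._stack and data.strip():
            self._stack[-1][0].append(data.strip())


def bought_rows(html: str) -> list[tuple[str, str]]:
    """Return (quantity, price) strings for each leaf table with a Bought row."""
    parser = LeafTableText()
    parser.feed(html)
    parser.close()
    out = []
    for text in parser.leaves:
        m = BOUGHT_RE.search(text)
        if m:  # leaves without a Bought row are dropped, never raised on
            out.append((m.group(1), m.group(2)))
    return out
```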

Fixture `html_two_orders.eml` is a minimal-but-realistic multipart email
with two nested summary tables (VUAG + SWDA), no personal data beyond
tickers/qty/price.

Test plan:
  poetry run pytest tests/providers/parsers/ -q
  → 5 passed in 0.16s
  poetry run mypy broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → Success: no issues found in 2 source files
  poetry run ruff check broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → All checks passed!
  poetry run yapf --diff broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → clean (no diff)

Manual verification: load html_two_orders.eml, call parse_invest_engine_email,
assert len == 2 with both expected tickers (VUAG, SWDA) and numbers,
dates set to 2026-04-01.
2026-04-17 21:58:15 +00:00
Viktor Barzin
9ec8ece2d9 Add InvestEngine email parser — RFC 2822 v1/v2 line format
Context: The old finance/ app had a 324-line IE message parser with four
line-based variants (v1/v2/v3/v4) plus an HTML strategy and a CSV
fallback. Port into broker-sync so we can consume IE trade confirmation
emails as a backup to the live HTTP client (Phase 2b) while IE's undocumented
API remains Bearer-only.

The upstream parser emits storage.model.Position; we emit canonical
Activity with the broker-sync invariants: account_id="invest-engine-primary"
(sink remaps to Wealthfolio UUID), account_type=ISA, currency=GBP, and
external_id="invest-engine:<fingerprint>" where the fingerprint is a
SHA-256 of (date|symbol|quantity|unit_price) — deterministic so repeat
imports of the same email dedup at the sync-record layer.
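
A minimal sketch of the fingerprint, assuming the four fields are joined with `|` in the order given above. The exact serialisation (ISO date, fixed-point decimals, full hex digest) and the function name `invest_engine_external_id` are assumptions, not taken from the module.

```python
import hashlib
from datetime import date
from decimal import Decimal


def invest_engine_external_id(day: date, symbol: str,
                              quantity: Decimal, unit_price: Decimal) -> str:
    """Deterministic id: the same trade always hashes to the same value."""
    # Assumed serialisation: ISO date and fixed-point decimals joined by "|".
    key = "|".join([
        day.isoformat(),
        symbol,
        format(quantity, "f"),
        format(unit_price, "f"),
    ])
    return "invest-engine:" + hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Because the id depends only on the parsed trade fields, re-importing the same email produces the same `external_id` and the sync-record layer can drop it.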

This change:
- Top-level `parse_invest_engine_email(raw_email: bytes) -> list[Activity]`
  extracts the text/plain body from an RFC 2822 message and dispatches to
  the line-based parser.
- `_parse_rfc2822_lines(body)` tries the v2 layout first (newer IE format
  where `Date: DD Month` is on line 2 and the year on line 3), then the
  v1 layout (where the day alone is on line 2 and `Month YYYY` on line 3).
  v3 and v4 variants will be re-added in a follow-up if we find fixtures
  where they matter; the initial fixture coverage hits v2.
- Drops the upstream `_ticker_post_processing` VUAG→VUAG.L hack.
  Wealthfolio's /import/check endpoint resolves exchange suffixes; the
  Trading212 provider also emits suffix-free tickers (e.g. `VUAG`), so
  staying consistent avoids double-mapping.
- Notes field records the parse-strategy tag ("rfc2822-v2") plus the
  matched line for debugging.

Test plan:
  poetry run pytest tests/providers/parsers/ -q
  → 3 passed in 0.03s
  poetry run mypy broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → Success: no issues found in 2 source files
  poetry run ruff check broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → All checks passed!
  poetry run yapf --diff broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → clean (no diff)

Manual verification: load the fixture email, call the parser, inspect the
returned Activity has symbol=VUAG, quantity=59.539562, unit_price=60.46,
date=2023-01-17, external_id starts with invest-engine:.
2026-04-17 21:55:01 +00:00
Viktor Barzin
dc4d3f889d Add InvestEngineProvider — Bearer-token HTTP client
Context: InvestEngine has no public API. The web app uses an undocumented
Django REST backend at /api/v0.3X/*, which requires a Bearer token and
rolls its minor every 4-6 weeks. MFA (push-approval) is mandatory on
every login, so we do NOT automate login — Viktor logs in manually in
a browser, copies the Bearer out of devtools, and pastes it into Vault.
This provider consumes that token.

The response shape is UNVERIFIED (MFA blocks an unauthed probe, so the
research leading into Phase 2b could only confirm endpoint existence via
401 responses on v0.31 and v0.32). `_transaction_to_activity` is written
defensively:

 - accepts both `results`/`data` list wrappers and `next`/`meta.next_page`
   cursor fields for pagination;
 - accepts `symbol`/`ticker`, `price`/`unit_price`, `amount`/`value`,
   `date`/`created_at`/`timestamp` field-name variants;
 - maps exact type strings (BUY, SELL, DIVIDEND, INTEREST, DEPOSIT,
   WITHDRAWAL, FEE, TAX) and substring-matches DEPOSIT/WITHDRAWAL for
   variants like "CASH_DEPOSIT"; refuses to guess on anything else —
   unknown types log WARNING and return None (silent misclassification
   would corrupt tax reporting).
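
The defensive mapping above could look roughly like this. It is a sketch under assumptions: `map_type` and `first_present` are hypothetical helper names, and upper-casing the raw type before matching is a guess, since the real response shape is unverified.

```python
import logging
from typing import Any, Optional

log = logging.getLogger(__name__)

_EXACT_TYPES = {"BUY", "SELL", "DIVIDEND", "INTEREST",
                "DEPOSIT", "WITHDRAWAL", "FEE", "TAX"}
# Only these two are substring-matched (e.g. "CASH_DEPOSIT").
_SUBSTRING_TYPES = ("DEPOSIT", "WITHDRAWAL")


def map_type(raw: str) -> Optional[str]:
    """Map a raw type string to a canonical one, or None if unknown."""
    candidate = raw.strip().upper()  # assumption: case-insensitive match
    if candidate in _EXACT_TYPES:
        return candidate
    for needle in _SUBSTRING_TYPES:
        if needle in candidate:
            return needle
    # Refuse to guess: a silent misclassification is worse than a gap.
    log.warning("unknown InvestEngine transaction type: %r", raw)
    return None


def first_present(txn: dict, *names: str) -> Any:
    """Return the first non-None value among the field-name variants."""
    for name in names:
        if txn.get(name) is not None:
            return txn[name]
    return None
```

For example, `first_present(txn, "symbol", "ticker")` covers both field-name variants without caring which one the backend actually sends.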

Version probe:

  _START_VERSION_MINOR=32 (research: v0.31/v0.32 live, v0.30 Gone)
  GET /api/v0.{n}/ → 410 ? advance : done
  cap at v0.60 so a misconfigured backend doesn't infinite-loop.

A 410 response on a data endpoint triggers exactly one re-probe + retry
against the newer version; the new version is cached on the instance for
the rest of the process.
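
The probe loop can be sketched as below. `get_status` is a hypothetical injected callable (minor version in, HTTP status out) so the sketch carries no HTTP dependency; treating the v0.60 cap as inclusive and the error class name are also assumptions.

```python
from typing import Callable

START_VERSION_MINOR = 32
MAX_VERSION_MINOR = 60  # assumed inclusive cap


class VersionProbeError(RuntimeError):
    """Raised when every probed version answers 410 Gone."""


def probe_version(get_status: Callable[[int], int],
                  start: int = START_VERSION_MINOR,
                  cap: int = MAX_VERSION_MINOR) -> int:
    """Return the first minor version whose root does not answer 410."""
    for minor in range(start, cap + 1):
        # 410 means this version was retired: advance to the next one.
        # Any other status (including 401) means the version exists.
        if get_status(minor) != 410:
            return minor
    raise VersionProbeError(f"no live API version in v0.{start}..v0.{cap}")
```

A caller would cache the returned minor on the instance and re-run the probe once if a data endpoint later answers 410.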

Token expiry is tracked at the Python layer:

 - constructor takes token_expires_at (set by Viktor when he pastes);
 - fetch() fails fast with InvestEngineTokenExpiredError if the clock
   says the token is already dead — cheaper than burning a request for
   a known 401;
 - a real 401 response also raises InvestEngineTokenExpiredError so the
   CLI/pipeline can alert Viktor to paste a new token.

Vault schema expected (consumed by the CLI in the follow-up commit):

  secret/broker-sync
    investengine_bearer_token       <devtools-captured Bearer>
    investengine_token_expires_at   <ISO-8601 set at paste time>
    investengine_refresh_token      <optional, not used yet>

This module does NOT read Vault — the caller hands values in, keeping
the provider testable.

This change:
 - New `broker_sync/providers/invest_engine.py`:
   * InvestEngineProvider with .accounts(), .fetch(), .close()
   * _probe_version / _active_version with 410-retry + cache
   * _transaction_to_activity with defensive type + field-name mapping
   * InvestEngineError / InvestEngineTokenExpiredError / InvestEngineVersionError
 - New `tests/providers/test_invest_engine.py`: 22 tests covering version
   probe, expiry fail-fast, 401→TokenExpired, 410→reprobe, header
   shape, pagination variants, and the full txn→activity mapping. One
   @pytest.mark.skip integration stub for when Viktor has a live token.

Assumptions flagged for verification with a live token:
 - IE id field is castable to str (int or string)
 - Type strings match exactly, or (for DEPOSIT/WITHDRAWAL only) contain as a
   substring, one of: BUY, SELL, DIVIDEND, INTEREST, DEPOSIT, WITHDRAWAL, FEE, TAX
 - Transactions carry numeric quantity/price/amount (Decimal-convertible)
 - Date field is one of: date / created_at / timestamp
 - Pagination shape is {results, next} OR {data, meta.next_page}
 - /transactions/ accepts ?portfolio=<id>&start=YYYY-MM-DD&end=YYYY-MM-DD
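
The pagination assumption in the list above can be handled with two small accessors. This is an illustrative sketch; `page_items` and `next_cursor` are hypothetical names and the real payload shape still needs checking against a live token.

```python
from typing import Any, Optional


def page_items(payload: dict) -> list:
    """Return the transaction list from either wrapper shape."""
    for key in ("results", "data"):
        value = payload.get(key)
        if isinstance(value, list):
            return value
    return []


def next_cursor(payload: dict) -> Optional[Any]:
    """Return the next-page cursor from `next` or `meta.next_page`."""
    cursor = payload.get("next")
    if cursor:
        return cursor
    meta = payload.get("meta")
    if isinstance(meta, dict):
        return meta.get("next_page") or None
    return None
```

A fetch loop would keep requesting pages until `next_cursor` returns None.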

## Automated

poetry run pytest tests/providers/test_invest_engine.py -v
======================== 22 passed, 1 skipped in 0.26s =========================

poetry run pytest -q
95 passed, 1 skipped in 0.84s

poetry run mypy --strict .
Success: no issues found in 34 source files

poetry run ruff check .
All checks passed!

poetry run yapf --diff broker_sync/providers/invest_engine.py tests/providers/test_invest_engine.py
(clean)

## Manual Verification

Once Viktor pastes a live token:

  1. Export:
     export IE_BEARER_TOKEN='<paste>'
     export IE_TOKEN_EXPIRES_AT='2026-05-17T00:00:00+00:00'
  2. Unmark the @pytest.mark.skip on test_live_integration_smoke
  3. poetry run pytest tests/providers/test_invest_engine.py::test_live_integration_smoke -v
     Expected: a successful round-trip that returns an empty-or-populated
     list of Activity objects, proving that the version probe + auth header +
     portfolio enumeration actually work against the real IE backend.
  4. Validate the Assumptions list above against the real transaction JSON.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:52:26 +00:00
Viktor Barzin
ea15b80111 Add InvestEngine email parser — RFC 2822 v1/v2 line format
Context: The old finance/ app had a 324-line IE message parser with four
line-based variants (v1/v2/v3/v4) plus an HTML strategy and a CSV
fallback. Port into broker-sync so we can consume IE trade confirmation
emails as a backup to the live HTTP client (Phase 2b) while IE's undocumented
API remains Bearer-only.

The upstream parser emits storage.model.Position; we emit canonical
Activity with the broker-sync invariants: account_id="invest-engine-primary"
(sink remaps to Wealthfolio UUID), account_type=ISA, currency=GBP, and
external_id="invest-engine:<fingerprint>" where the fingerprint is a
SHA-256 of (date|symbol|quantity|unit_price) — deterministic so repeat
imports of the same email dedup at the sync-record layer.

This change:
- Top-level `parse_invest_engine_email(raw_email: bytes) -> list[Activity]`
  extracts the text/plain body from an RFC 2822 message and dispatches to
  the line-based parser.
- `_parse_rfc2822_lines(body)` tries the v2 layout first (newer IE format
  where `Date: DD Month` is on line 2 and the year on line 3), then the
  v1 layout (where the day alone is on line 2 and `Month YYYY` on line 3).
  v3 and v4 variants will be re-added in a follow-up if we find fixtures
  where they matter; the initial fixture coverage hits v2.
- Drops the upstream `_ticker_post_processing` VUAG→VUAG.L hack.
  Wealthfolio's /import/check endpoint resolves exchange suffixes; the
  Trading212 provider also emits suffix-free tickers (e.g. `VUAG`), so
  staying consistent avoids double-mapping.
- Notes field records the parse-strategy tag ("rfc2822-v2") plus the
  matched line for debugging.

Test plan:
  poetry run pytest tests/providers/parsers/ -q
  → 3 passed in 0.03s
  poetry run mypy broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → Success: no issues found in 2 source files
  poetry run ruff check broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → All checks passed!
  poetry run yapf --diff broker_sync/providers/parsers/invest_engine.py tests/providers/parsers/test_invest_engine.py
  → clean (no diff)

Manual verification: load the fixture email, call the parser, inspect the
returned Activity has symbol=VUAG, quantity=59.539562, unit_price=60.46,
date=2023-01-17, external_id starts with invest-engine:.
2026-04-17 21:49:52 +00:00
Viktor Barzin
b363032e42 sinks: feed /import/check enrichment into /import body
/import/check hydrates each ActivityImport with resolved assetId,
exchangeMic, quoteCcy, instrumentType, quoteMode. The /import endpoint
on Wealthfolio 3.2 does NOT re-resolve — passing an un-enriched row
returns 200 OK but silently drops the activity (activities=[] in the
response).

The first live run returned `imported=63 failed=0` but nothing reached
the database. Fixed by posting the hydrated rows from the check response
to /import instead of the original.
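
In outline, the fix is a two-step call where the second request reuses the first response. The sketch below assumes a hypothetical async `post(path, body)` helper and assumes the check endpoint takes the same `{"activities": [...]}` wrapper as the import endpoint; only the list-shaped check response is confirmed above.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical transport: an async callable taking (path, json_body).
Post = Callable[[str, Any], Awaitable[Any]]


async def import_via_check(post: Post, rows: list) -> Any:
    """POST rows to /import/check, then POST the hydrated rows to /import."""
    # The check response carries assetId, exchangeMic, quoteCcy, etc.
    hydrated = await post("/import/check", {"activities": rows})
    # Send the hydrated rows, not the originals: /import does not
    # re-resolve, and un-enriched rows are dropped without an error.
    return await post("/import", {"activities": hydrated})
```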

The test fake must now also return list-shaped check responses (matching
the upstream Json<Vec<ActivityImport>> signature on the Rust side).

poetry run pytest -q     70 passed
poetry run mypy          clean
poetry run ruff check    clean
2026-04-17 20:54:17 +00:00
Viktor Barzin
80ca009373 Match Wealthfolio accounts by providerAccountId, remap accountId on import
Context: Wealthfolio 3.2 generates its own UUIDs on POST /accounts, ignoring any
`id` we supply. Our logical Account.id lives on as `providerAccountId`, which
WF preserves verbatim.

Live run created six duplicate accounts because ensure_account looked up by
our `id`, never found it, and POSTed a new account on every attempt. Deleted
the duplicates manually via DELETE /accounts/{id}.

This change:
- ensure_account now returns Wealthfolio's UUID; matches existing via
  (provider, providerAccountId)
- pipeline remaps activity.account_id to the WF UUID at submission time
  but keeps dedup keyed on our stable id (WF resets must not blow away
  the whole dedup history)
- test updates to the new account-shape + dedup key expectations
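
A sketch of the lookup and the remap, under assumptions: the `Activity` class here is a trimmed stand-in for the real model, the helper names are hypothetical, and the JSON key `provider` on Wealthfolio's account objects is inferred from the `(provider, providerAccountId)` pair above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Activity:  # hypothetical, trimmed stand-in for broker_sync.models.Activity
    account_id: str
    external_id: str


def find_wf_account_id(existing: list, provider: str,
                       provider_account_id: str) -> Optional[str]:
    """Return Wealthfolio's UUID for our logical account, if it exists."""
    for account in existing:
        if (account.get("provider") == provider
                and account.get("providerAccountId") == provider_account_id):
            return account["id"]
    return None  # caller POSTs a new account only in this case


def remap_for_import(activity: Activity, wf_account_id: str) -> tuple:
    """Pair our stable account id (dedup key) with the row sent to WF."""
    return activity.account_id, replace(activity, account_id=wf_account_id)
```

Keeping the original id alongside the remapped row lets dedup survive a Wealthfolio reset that hands out fresh UUIDs.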

poetry run pytest -q    70 passed
poetry run mypy         clean
poetry run ruff check   clean
2026-04-17 20:44:32 +00:00
Viktor Barzin
ba672a1633 sinks: add required isDraft/isValid fields on ActivityImport 2026-04-17 20:37:38 +00:00
Viktor Barzin
1d23bf6ed7 sinks: switch Wealthfolio import to JSON body (not multipart CSV)
/api/v1/activities/import expects Content-Type: application/json with body
{"activities": [ActivityImport, ...]} where ActivityImport is camelCase:
{date, symbol, activityType, quantity?, unitPrice?, currency, fee?, amount?,
comment?, accountId?, ...}. Source: crates/core/src/activities/activities_model.rs

Live run failure was HTTP 415 Unsupported Media Type because we were uploading
a CSV in multipart/form-data; that endpoint is JSON-only.

Also handle two response shapes on import — older builds return a list, current
build returns {activities: [...]}.
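
Handling both response shapes can be as small as the following sketch. `imported_activities` is a hypothetical name, and raising on any other shape is a design choice of this example, not necessarily what the sink does.

```python
from typing import Any


def imported_activities(response: Any) -> list:
    """Normalise the /import response to a plain list of activities."""
    if isinstance(response, list):
        # Older builds: bare list.
        return response
    if isinstance(response, dict) and isinstance(response.get("activities"), list):
        # Current build: {"activities": [...]}.
        return response["activities"]
    raise ValueError(f"unexpected import response shape: {type(response).__name__}")
```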

Test plan:
  poetry run pytest -q   →  70 passed
  poetry run mypy        →  clean
  poetry run ruff check  →  clean
2026-04-17 20:34:12 +00:00
Viktor Barzin
ea881e272b sinks: match Wealthfolio NewAccount camelCase schema + required booleans
Wealthfolio 3.2's POST /api/v1/accounts was 422ing on live traffic — its
NewAccount struct uses camelCase field names and requires isDefault +
isActive as booleans. Reference:
https://github.com/afadil/wealthfolio/blob/main/apps/server/src/models.rs#L~145

Sends trackingMode=TRANSACTIONS so Wealthfolio computes holdings from
our imported activities (vs HOLDINGS mode which requires periodic
holdings snapshots). Populates providerAccountId so the broker account
is traceable back to our sync's id scheme.

Test plan:
  poetry run pytest -q   →  70 passed
  poetry run mypy        →  clean
  poetry run ruff check  →  clean

Live re-run of the backfill Job follows this commit's image rebuild.
2026-04-17 20:29:43 +00:00
Viktor Barzin
1d0769c9e6 Disable typer rich tracebacks to avoid secret leak in logs
Context
-------
Live run of `broker-sync trading212` hit a PermissionError and typer's
rich traceback printed every local variable, including the cleartext
WF_PASSWORD and the T212 api_key strings, into pod logs. Kubernetes
pod logs are world-readable cluster-wide — that's a security incident.

This change
-----------
- Pass `pretty_exceptions_enable=False` to the typer.Typer constructor.
  Plain stdlib tracebacks don't dump frame locals.
- Rich is still available for help text; only crash formatting changes.

Follow-up in infra/stacks/broker-sync: add `security_context.fs_group = 10001`
to every pod spec so the PVC is owned by the broker user (the original
PermissionError that triggered the traceback was the broker user being
unable to write /data/watermarks).

Test plan
---------
## Automated
- poetry run pytest -q  →  70 passed
- poetry run mypy broker_sync tests  →  clean
- poetry run ruff check .  →  clean

## Manual Verification
Re-run the backfill Job after the image is rebuilt + the infra
fsGroup change is applied.
2026-04-17 20:22:30 +00:00
Viktor Barzin
66cf0e0399 Fix live Wealthfolio login + Dockerfile poetry path
Context
-------
Two live-integration bugs surfaced during the Phase 0.5 auth-spike
run against the restored production Wealthfolio.

1. Wealthfolio 3.2's LoginRequest schema is `{ password: String }` —
   it rejects any request with an unknown `username` field as HTTP
   400 (empty body, hard to debug). Upstream source:
   https://github.com/afadil/wealthfolio/blob/main/apps/server/src/auth.rs#L86-L88

2. Dockerfile referenced `/opt/poetry/bin/poetry` but pip install
   puts poetry on the normal PATH; POETRY_HOME only affects the
   self-installer, not `pip install`. Exit 127 in GHA build.

This change
-----------
- WealthfolioSink.login() sends `{password}` only; kept `username`
  constructor arg as a stub for the day Wealthfolio adds multi-user.
- Dockerfile drops POETRY_HOME and uses `poetry` on PATH.
- Test: `_login_ok` now asserts body == {"password": "hunter2"}
  ("hunter2" is the XKCD placeholder — not a real credential).

Test plan
---------
## Automated
- poetry run pytest -q  →  70 passed
- poetry run mypy broker_sync tests  →  Success: no issues found in 29 source files
- poetry run ruff check .  →  All checks passed!

## Manual Verification (executed live)
```
kubectl -n wealthfolio port-forward svc/wealthfolio 18080:80 &
WF_BASE_URL=http://localhost:18080 WF_USERNAME=admin \
WF_PASSWORD=<from-vault> \
poetry run broker-sync auth-spike
→ "Logged in. 1 account(s) visible."
```
2026-04-17 20:17:24 +00:00
Viktor Barzin
645c765287 CI: build image from phase-0-scaffold branch too (bootstrap) 2026-04-17 19:51:09 +00:00
Viktor Barzin
70275afc06 CI: trigger on phase-0-scaffold + any PR for initial rollout 2026-04-17 19:48:50 +00:00
34 changed files with 5798 additions and 96 deletions

@@ -2,9 +2,8 @@ name: CI
on:
push:
branches: [main]
branches: [main, phase-0-scaffold]
pull_request:
branches: [main]
env:
IMAGE_NAME: broker-sync
@@ -28,7 +27,7 @@ jobs:
build:
runs-on: ubuntu-latest
needs: test
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/phase-0-scaffold')
outputs:
image_tag: ${{ steps.meta.outputs.sha }}
steps:
@@ -54,7 +53,7 @@ jobs:
deploy:
needs: build
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/phase-0-scaffold')
runs-on: ubuntu-latest
steps:
- name: Trigger Woodpecker deploy

.woodpecker/build.yml Normal file

@@ -0,0 +1,45 @@
when:
event: push
branch: [main, master]
clone:
git:
image: woodpeckerci/plugin-git
settings:
attempts: 5
backoff: 10s
steps:
- name: lint-and-test
image: python:3.12-slim
commands:
- pip install --no-cache-dir "poetry==1.8.4"
- poetry install --no-interaction --no-root
- poetry run ruff check .
- poetry run mypy broker_sync tests
- poetry run pytest -q
- name: build-and-push
image: woodpeckerci/plugin-docker-buildx
depends_on:
- lint-and-test
settings:
# Image name is `wealthfolio-sync` to match the deployment in
# infra/stacks/wealthfolio/main.tf (CronJob `wealthfolio-sync`).
# The repo is called `broker-sync` because the source covers
# multiple brokers (Trading 212, Schwab, Fidelity, IMAP-CSV) —
# we just happen to publish it under the wealthfolio name since
# that's the consumer stack.
repo:
- forgejo.viktorbarzin.me/viktor/wealthfolio-sync
logins:
- registry: forgejo.viktorbarzin.me
username:
from_secret: forgejo_user
password:
from_secret: forgejo_push_token
dockerfile: Dockerfile
context: .
auto_tag: true
platforms:
- linux/amd64

@@ -1,32 +1,75 @@
FROM python:3.12-slim AS builder
ENV POETRY_VERSION=1.8.4 \
POETRY_HOME=/opt/poetry \
POETRY_VIRTUALENVS_IN_PROJECT=true \
PIP_NO_CACHE_DIR=1
# `pip install` puts poetry on PATH (/usr/local/bin/poetry) — don't bother
# with POETRY_HOME indirection.
RUN pip install --no-cache-dir "poetry==${POETRY_VERSION}"
WORKDIR /app
COPY pyproject.toml poetry.lock ./
RUN /opt/poetry/bin/poetry install --only main --no-root
RUN poetry install --only main --no-root
COPY broker_sync ./broker_sync
RUN /opt/poetry/bin/poetry install --only main
RUN poetry install --only main
FROM python:3.12-slim
WORKDIR /app
# Playwright needs a big list of system libs for Chromium (fonts, NSS, libs
# for rendering, audio stubs, etc.). Mirror the list Playwright publishes at
# https://playwright.dev/docs/browsers#system-requirements for Debian 12.
# Fidelity PlanViewer is the only consumer today; gated to the fidelity-*
# CronJobs via the provider's explicit Playwright import.
RUN apt-get update && apt-get install --no-install-recommends -y \
ca-certificates \
fonts-liberation \
fonts-noto-color-emoji \
libasound2 \
libatk-bridge2.0-0 \
libatk1.0-0 \
libatspi2.0-0 \
libcairo2 \
libcups2 \
libdbus-1-3 \
libdrm2 \
libexpat1 \
libgbm1 \
libglib2.0-0 \
libnspr4 \
libnss3 \
libpango-1.0-0 \
libx11-6 \
libxcb1 \
libxcomposite1 \
libxdamage1 \
libxext6 \
libxfixes3 \
libxkbcommon0 \
libxrandr2 \
xvfb \
&& rm -rf /var/lib/apt/lists/*
RUN useradd --system --uid 10001 --home /app --shell /usr/sbin/nologin broker && \
mkdir -p /data && chown -R broker:broker /data
COPY --from=builder --chown=broker:broker /app /app
# Install Chromium into a broker-owned directory so Playwright (running as
# broker) can pick it up. `PLAYWRIGHT_BROWSERS_PATH` pins the browser cache
# to a fixed path under /app, which is the simpler option on slim images.
ENV PATH="/app/.venv/bin:${PATH}" \
PYTHONUNBUFFERED=1
PYTHONUNBUFFERED=1 \
PLAYWRIGHT_BROWSERS_PATH=/app/.playwright-browsers
RUN mkdir -p "${PLAYWRIGHT_BROWSERS_PATH}" && \
chown -R broker:broker "${PLAYWRIGHT_BROWSERS_PATH}"
USER broker
RUN playwright install chromium
ENTRYPOINT ["broker-sync"]
CMD ["version"]

@@ -14,7 +14,14 @@ import typer
if TYPE_CHECKING:
from broker_sync.models import Account
app = typer.Typer(help="broker-sync: pull brokerage activity into Wealthfolio")
app = typer.Typer(
help="broker-sync: pull brokerage activity into Wealthfolio",
# CRITICAL: rich tracebacks print all local variables on crash, which
# includes env-sourced credentials (WF_PASSWORD, T212_API_KEYS_JSON).
# Kubernetes pod logs are world-readable — leaking creds there is a
# security incident. Plain tracebacks only.
pretty_exceptions_enable=False,
)
@app.command("version")
@@ -130,6 +137,336 @@ def trading212(
asyncio.run(_run())
@app.command("invest-engine")
def invest_engine(
wf_base_url: str = typer.Option(..., envvar="WF_BASE_URL"),
wf_username: str = typer.Option(..., envvar="WF_USERNAME"),
wf_password: str = typer.Option(..., envvar="WF_PASSWORD"),
wf_session_path: str = typer.Option("/data/wealthfolio_session.json", envvar="WF_SESSION_PATH"),
ie_bearer_token: str = typer.Option(..., envvar="IE_BEARER_TOKEN"),
ie_token_expires_at: str = typer.Option(..., envvar="IE_TOKEN_EXPIRES_AT"),
data_dir: str = typer.Option("/data", envvar="BROKER_SYNC_DATA_DIR"),
mode: str = typer.Option("steady", help="steady = last-30-days; backfill = full history"),
) -> None:
"""Phase 2b — sync InvestEngine activity into Wealthfolio via Bearer token.
The Bearer token is pasted from browser devtools by Viktor (MFA blocks
scripted login). IE_TOKEN_EXPIRES_AT is the ISO-8601 timestamp he sets
when he pastes it; we fail fast with exit=2 if that moment has passed
so a CronJob that runs past the refresh window doesn't burn a request
on a known-dead token.
"""
from broker_sync.dedup import SyncRecordStore
from broker_sync.pipeline import sync_provider_to_wealthfolio
from broker_sync.providers.invest_engine import (
InvestEngineProvider,
InvestEngineTokenExpiredError,
)
from broker_sync.sinks.wealthfolio import WealthfolioSink
_setup_logging()
try:
expires_at = datetime.fromisoformat(ie_token_expires_at)
except ValueError as e:
typer.echo(f"IE_TOKEN_EXPIRES_AT not a valid ISO-8601 timestamp: {e}", err=True)
sys.exit(2)
if expires_at.tzinfo is None:
expires_at = expires_at.replace(tzinfo=UTC)
if expires_at <= datetime.now(UTC):
typer.echo(
f"InvestEngine token expired at {expires_at.isoformat()}. "
"Viktor must paste a fresh Bearer into Vault.",
err=True,
)
sys.exit(2)
data = Path(data_dir)
data.mkdir(parents=True, exist_ok=True)
if mode == "steady":
since: datetime | None = datetime.now(UTC) - timedelta(days=30)
elif mode == "backfill":
since = None
else:
typer.echo(f"Unknown mode: {mode!r}. Use 'steady' or 'backfill'.", err=True)
sys.exit(2)
async def _run() -> None:
sink = WealthfolioSink(
base_url=wf_base_url,
username=wf_username,
password=wf_password,
session_path=wf_session_path,
)
provider = InvestEngineProvider(
bearer_token=ie_bearer_token,
token_expires_at=expires_at,
)
dedup = SyncRecordStore(data / "sync.db")
try:
if not Path(wf_session_path).exists():
await sink.login()
result = await sync_provider_to_wealthfolio(
provider=provider,
sink=sink,
dedup=dedup,
since=since,
)
except InvestEngineTokenExpiredError as e:
typer.echo(f"InvestEngine auth failed: {e}", err=True)
sys.exit(2)
finally:
await provider.close()
await sink.close()
typer.echo(f"invest-engine: fetched={result.fetched} "
f"new={result.new_after_dedup} "
f"imported={result.imported} "
f"failed={result.failed}")
if result.failed > 0:
sys.exit(1)
asyncio.run(_run())
@app.command("finance-mysql-import")
def finance_mysql_import(
wf_base_url: str = typer.Option(..., envvar="WF_BASE_URL"),
wf_username: str = typer.Option(..., envvar="WF_USERNAME"),
wf_password: str = typer.Option(..., envvar="WF_PASSWORD"),
wf_session_path: str = typer.Option("/data/wealthfolio_session.json",
envvar="WF_SESSION_PATH"),
db_host: str = typer.Option(..., envvar="FINANCE_DB_HOST"),
db_port: int = typer.Option(3306, envvar="FINANCE_DB_PORT"),
db_user: str = typer.Option(..., envvar="FINANCE_DB_USER"),
db_password: str = typer.Option(..., envvar="FINANCE_DB_PASSWORD"),
db_name: str = typer.Option("finance", envvar="FINANCE_DB_NAME"),
data_dir: str = typer.Option("/data", envvar="BROKER_SYNC_DATA_DIR"),
) -> None:
"""One-shot backfill: read the retired finance app's MySQL position table
and push every row into the correct Wealthfolio account (IE for .L
tickers, Schwab for US tickers). Idempotent via dedup."""
from broker_sync.dedup import SyncRecordStore
from broker_sync.pipeline import sync_provider_to_wealthfolio
from broker_sync.providers.finance_mysql import (
FinanceMySQLCreds,
FinanceMySQLProvider,
)
from broker_sync.sinks.wealthfolio import WealthfolioSink
_setup_logging()
data = Path(data_dir)
data.mkdir(parents=True, exist_ok=True)
async def _run() -> None:
sink = WealthfolioSink(
base_url=wf_base_url,
username=wf_username,
password=wf_password,
session_path=wf_session_path,
)
provider = FinanceMySQLProvider(
FinanceMySQLCreds(
host=db_host,
port=db_port,
user=db_user,
password=db_password,
database=db_name,
))
dedup = SyncRecordStore(data / "sync.db")
try:
if not Path(wf_session_path).exists():
await sink.login()
result = await sync_provider_to_wealthfolio(
provider=provider,
sink=sink,
dedup=dedup,
)
finally:
await sink.close()
typer.echo(f"finance-mysql: fetched={result.fetched} "
f"new={result.new_after_dedup} "
f"imported={result.imported} "
f"failed={result.failed}")
if result.failed > 0:
sys.exit(1)
asyncio.run(_run())
@app.command("imap-ingest")
def imap_ingest(
wf_base_url: str = typer.Option(..., envvar="WF_BASE_URL"),
wf_username: str = typer.Option(..., envvar="WF_USERNAME"),
wf_password: str = typer.Option(..., envvar="WF_PASSWORD"),
wf_session_path: str = typer.Option("/data/wealthfolio_session.json",
envvar="WF_SESSION_PATH"),
imap_host: str = typer.Option(..., envvar="IMAP_HOST"),
imap_user: str = typer.Option(..., envvar="IMAP_USER"),
imap_password: str = typer.Option(..., envvar="IMAP_PASSWORD"),
imap_directory: str = typer.Option("INBOX", envvar="IMAP_DIRECTORY"),
data_dir: str = typer.Option("/data", envvar="BROKER_SYNC_DATA_DIR"),
) -> None:
"""Phase 2/3 — ingest InvestEngine + Schwab confirmation emails via IMAP.
Walks the mailbox, routes each message by `From:` sender domain to the
matching parser, pushes any resulting activities through the shared
pipeline (dedup, then Wealthfolio's CSV-free JSON import).
"""
from broker_sync.dedup import SyncRecordStore
from broker_sync.pipeline import sync_provider_to_wealthfolio
from broker_sync.providers.imap import ImapCreds, ImapProvider
from broker_sync.sinks.wealthfolio import WealthfolioSink
_setup_logging()
data = Path(data_dir)
data.mkdir(parents=True, exist_ok=True)
async def _run() -> None:
sink = WealthfolioSink(
base_url=wf_base_url,
username=wf_username,
password=wf_password,
session_path=wf_session_path,
)
provider = ImapProvider(
ImapCreds(
host=imap_host,
user=imap_user,
password=imap_password,
directory=imap_directory,
))
dedup = SyncRecordStore(data / "sync.db")
try:
if not Path(wf_session_path).exists():
await sink.login()
result = await sync_provider_to_wealthfolio(
provider=provider,
sink=sink,
dedup=dedup,
)
finally:
await sink.close()
typer.echo(f"imap-ingest: fetched={result.fetched} "
f"new={result.new_after_dedup} "
f"imported={result.imported} "
f"failed={result.failed}")
if result.failed > 0:
sys.exit(1)
asyncio.run(_run())
@app.command("fidelity-seed")
def fidelity_seed(
out: str = typer.Option(
"fidelity_storage_state.json",
help="Where to write the storage_state JSON (stage it to Vault afterwards)",
),
url: str = typer.Option(
"https://pv.planviewer.fidelity.co.uk/",
help="PlanViewer SPA URL — defaults to the production UK landing",
),
) -> None:
"""One-off: launch a headed Chromium so Viktor can log into PlanViewer and
capture a long-lived storage_state (cookies + localStorage) for the monthly
cron.
Expected flow:
1. Chromium opens on the PlanViewer login page.
2. Viktor enters username, password, memorable word, MFA code.
3. Viktor ticks "Remember device" / "Trust this browser" if offered.
4. Viktor waits until the dashboard loads, then presses Enter in the terminal.
5. Script dumps storage_state.json and exits.
6. Viktor runs ``vault kv patch secret/broker-sync fidelity_storage_state=@...``.
"""
_setup_logging()
try:
from playwright.sync_api import sync_playwright
except ImportError as e:
typer.echo(
"Playwright is not installed — run `poetry install` first.", err=True)
raise typer.Exit(code=2) from e
typer.echo(f"Opening {url} in a headed browser — log in, tick "
"'Remember device' if offered, then press Enter here.")
with sync_playwright() as pw:
browser = pw.chromium.launch(headless=False)
context = browser.new_context()
page = context.new_page()
page.goto(url)
input("Press Enter once you're fully logged in and the dashboard is visible… ")
context.storage_state(path=out)
browser.close()
typer.echo(f"Wrote {out} — stage it to Vault:")
typer.echo(f" vault kv patch secret/broker-sync fidelity_storage_state=@{out}")
@app.command("fidelity-ingest")
def fidelity_ingest(
wf_base_url: str = typer.Option(..., envvar="WF_BASE_URL"),
wf_username: str = typer.Option(..., envvar="WF_USERNAME"),
wf_password: str = typer.Option(..., envvar="WF_PASSWORD"),
wf_session_path: str = typer.Option("/data/wealthfolio_session.json", envvar="WF_SESSION_PATH"),
storage_state_path: str = typer.Option(
...,
envvar="FIDELITY_STORAGE_STATE_PATH",
help="Path on disk to storage_state.json (materialised from Vault by the init container)",
),
plan_id: str = typer.Option(..., envvar="FIDELITY_PLAN_ID"),
data_dir: str = typer.Option("/data", envvar="BROKER_SYNC_DATA_DIR"),
mode: str = typer.Option("steady", help="steady = last-60-days; backfill = full history"),
) -> None:
"""Sync Fidelity UK PlanViewer contributions + fund purchases into Wealthfolio."""
from broker_sync.dedup import SyncRecordStore
from broker_sync.pipeline import sync_provider_to_wealthfolio
from broker_sync.providers.fidelity_planviewer import (
FidelityCreds,
FidelityPlanViewerProvider,
)
from broker_sync.sinks.wealthfolio import WealthfolioSink
_setup_logging()
if mode == "steady":
since: datetime | None = datetime.now(UTC) - timedelta(days=60)
elif mode == "backfill":
since = None
else:
typer.echo(f"Unknown mode: {mode!r}. Use 'steady' or 'backfill'.", err=True)
sys.exit(2)
async def _run() -> None:
sink = WealthfolioSink(
base_url=wf_base_url,
username=wf_username,
password=wf_password,
session_path=wf_session_path,
)
provider = FidelityPlanViewerProvider(FidelityCreds(
storage_state_path=storage_state_path,
plan_id=plan_id,
))
dedup = SyncRecordStore(Path(data_dir) / "sync.db")
try:
if not Path(wf_session_path).exists():
await sink.login()
result = await sync_provider_to_wealthfolio(
provider=provider, sink=sink, dedup=dedup, since=since,
)
finally:
await sink.close()
typer.echo(f"fidelity-ingest: fetched={result.fetched} "
f"new={result.new_after_dedup} "
f"imported={result.imported} "
f"failed={result.failed}")
if result.failed > 0:
sys.exit(1)
asyncio.run(_run())
def _setup_logging() -> None:
logging.basicConfig(
level=logging.INFO,

@@ -102,3 +102,27 @@ def _fmt(v: Decimal | None) -> str:
if v is None:
return ""
return format(v, "f")
@dataclass
class VestEvent:
"""Schwab RSU vest event — written to payslip_ingest.rsu_vest_events.
Carries both the gross vest (shares x FMV) and the sell-to-cover portion
(shares withheld for tax x FMV). Sibling Activity records (one BUY for
the full vest, one SELL for the sold-to-cover slice) are produced
separately for Wealthfolio.
USD-only at parse time; FX conversion happens at the postgres sink via
the ECB daily rate so the DB row carries both the raw USD figures and
the GBP-translated values for dashboard joins.
"""
external_id: str # schwab:{date}:{ticker}:VEST:{shares_vested}
vest_date: datetime
ticker: str
shares_vested: Decimal
shares_sold_to_cover: Decimal | None
fmv_at_vest_usd: Decimal
tax_withheld_usd: Decimal | None
source: str = "schwab_email"
raw: dict[str, str] = field(default_factory=dict)
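
The `external_id` layout in the comment above and the two USD figures the docstring mentions (gross vest and sell-to-cover) can be sketched as follows. This is a minimal illustration, not the parser's actual code: `vest_external_id` and `gross_and_cover_usd` are hypothetical helper names, and rendering `{date}` as an ISO calendar date is an assumption.

```python
from __future__ import annotations

from datetime import datetime, timezone
from decimal import Decimal


def vest_external_id(vest_date: datetime, ticker: str, shares_vested: Decimal) -> str:
    # Assumption: {date} in the documented format is an ISO calendar date.
    return f"schwab:{vest_date.date().isoformat()}:{ticker}:VEST:{shares_vested}"


def gross_and_cover_usd(
    shares_vested: Decimal,
    shares_sold_to_cover: Decimal | None,
    fmv_at_vest_usd: Decimal,
) -> tuple[Decimal, Decimal | None]:
    # Gross vest = shares x FMV; sell-to-cover = withheld shares x the same FMV.
    gross = shares_vested * fmv_at_vest_usd
    cover = None if shares_sold_to_cover is None else shares_sold_to_cover * fmv_at_vest_usd
    return gross, cover


vest_day = datetime(2026, 2, 15, tzinfo=timezone.utc)
print(vest_external_id(vest_day, "META", Decimal("100")))
print(gross_and_cover_usd(Decimal("100"), Decimal("47"), Decimal("500.25")))
```

Keeping the amounts as `Decimal` end to end avoids float rounding in figures that later feed tax reporting.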

@ -5,9 +5,10 @@ import logging
from collections.abc import AsyncIterator
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from broker_sync.dedup import SyncRecordStore
from broker_sync.models import Account, Activity
from broker_sync.models import Account, Activity, ActivityType
from broker_sync.providers.base import Provider
from broker_sync.sinks.wealthfolio import WealthfolioSink
@ -39,26 +40,38 @@ async def sync_provider_to_wealthfolio(
Caller owns sink lifecycle (including `login()` and `close()`).
"""
await _ensure_accounts(sink, provider.accounts())
wf_account_ids = await _ensure_accounts(sink, provider.accounts())
fetched = 0
new_after_dedup = 0
imported = 0
failed = 0
batch: list[Activity] = []
# Batches are (original_account_id, remapped_for_import Activity) pairs.
# Dedup keys on our stable account_id; the import row uses Wealthfolio's UUID.
batch: list[tuple[str, Activity]] = []
async for activity in provider.fetch(since=since, before=before):
fetched += 1
if dedup.has_seen(provider.name, activity.account_id, activity.external_id):
continue
new_after_dedup += 1
_tag_notes(activity, provider.name)
batch.append(activity)
if len(batch) >= _BATCH_SIZE:
ok, bad = await _flush_batch(sink, dedup, provider.name, batch)
imported += ok
failed += bad
batch = []
# Expand each BUY/SELL into (original, matching DEPOSIT/WITHDRAWAL).
# See `_matched_cash_flow` — without the match, WF's historical Net
# Worth chart shows phantom spikes because BUYs consume cash that
# was never "deposited" according to the activity log.
for act in _with_cash_flow_match(activity):
if dedup.has_seen(provider.name, act.account_id, act.external_id):
continue
new_after_dedup += 1
_tag_notes(act, provider.name)
original_account_id = act.account_id
# Submit under Wealthfolio's UUID; keep dedup keyed on our id.
wf_id = wf_account_ids.get(original_account_id)
if wf_id:
act.account_id = wf_id
batch.append((original_account_id, act))
if len(batch) >= _BATCH_SIZE:
ok, bad = await _flush_batch(sink, dedup, provider.name, batch)
imported += ok
failed += bad
batch = []
if batch:
ok, bad = await _flush_batch(sink, dedup, provider.name, batch)
@ -82,9 +95,12 @@ async def sync_provider_to_wealthfolio(
)
async def _ensure_accounts(sink: WealthfolioSink, accounts: list[Account]) -> None:
async def _ensure_accounts(sink: WealthfolioSink, accounts: list[Account]) -> dict[str, str]:
"""Return {our_account_id: wealthfolio_uuid}."""
out: dict[str, str] = {}
for account in accounts:
await sink.ensure_account(account)
out[account.id] = await sink.ensure_account(account)
return out
def _tag_notes(activity: Activity, provider_name: str) -> None:
@ -101,18 +117,16 @@ async def _flush_batch(
sink: WealthfolioSink,
dedup: SyncRecordStore,
provider_name: str,
batch: list[Activity],
batch: list[tuple[str, Activity]],
) -> tuple[int, int]:
activities_only = [a for _, a in batch]
try:
created = await sink.import_activities(batch)
created = await sink.import_activities(activities_only)
except Exception:
log.exception("Wealthfolio import failed for batch of %d", len(batch))
return 0, len(batch)
# Map returned Wealthfolio activity ids back to our external_ids.
# Wealthfolio's response shape is under review — we defensively look
# for `external_id` on each returned row but fall back to positional
# matching if the server doesn't echo it.
by_external: dict[str, str | None] = {}
for row in created:
ext = row.get("external_id") if isinstance(row, dict) else None
@ -121,9 +135,14 @@ async def _flush_batch(
by_external[str(ext)] = str(wf_id) if wf_id is not None else None
ok = 0
for a in batch:
for original_account_id, a in batch:
wf_id = by_external.get(a.external_id)
dedup.record(provider_name, a.account_id, a.external_id, wealthfolio_activity_id=wf_id)
dedup.record(
provider_name,
original_account_id,
a.external_id,
wealthfolio_activity_id=wf_id,
)
ok += 1
return ok, 0
@ -131,3 +150,56 @@ async def _flush_batch(
async def collect(iterator: AsyncIterator[Activity]) -> list[Activity]:
"""Tiny helper — drain an async iterator to a list. Mainly for tests."""
return [a async for a in iterator]
# -- Cash-flow matching --------------------------------------------------
# BUY and SELL activities touch shares, not cash. Without an explicit
# DEPOSIT/WITHDRAWAL on the same day, WF models the account as having
# "phantom" cash debt — and its Net Worth chart shows cliff-jumps
# whenever a lump offset is applied after the fact.
#
# The pipeline emits a matching DEPOSIT (for BUY) or WITHDRAWAL (for SELL)
# right alongside each trade so the account's cash balance reconciles to
# ~0 at every point in time. Providers that already emit real cash flows
# (e.g. a Trading212 "deposit" endpoint, if we ever wire it) should set
# `Provider.emits_matching_cash_flow = True` to opt out — no provider
# does today (Trading212 only exposes BUY/SELL via the /orders endpoint).
def _matched_cash_flow(a: Activity) -> Activity | None:
"""Return the DEPOSIT/WITHDRAWAL that funds/receives the BUY/SELL `a`.
Returns None for every other activity type; those already touch cash
directly (DEPOSIT, WITHDRAWAL, DIVIDEND, FEE, TAX, TRANSFER_*,
CONVERSION_*).
"""
if a.activity_type is ActivityType.BUY:
if a.quantity is None or a.unit_price is None:
return None
amount = a.quantity * a.unit_price + (a.fee or Decimal(0))
kind, tag = ActivityType.DEPOSIT, "buy"
elif a.activity_type is ActivityType.SELL:
if a.quantity is None or a.unit_price is None:
return None
amount = a.quantity * a.unit_price - (a.fee or Decimal(0))
kind, tag = ActivityType.WITHDRAWAL, "sell"
else:
return None
if amount <= 0:
return None
return Activity(
external_id=f"cash-flow-match:{tag}:{a.external_id}",
account_id=a.account_id,
account_type=a.account_type,
date=a.date,
activity_type=kind,
currency=a.currency,
amount=amount,
notes=f"cash-flow-match:{tag}:{a.external_id}",
)
def _with_cash_flow_match(a: Activity) -> list[Activity]:
"""Expand one activity into [original] or [original, matching cash flow]."""
match = _matched_cash_flow(a)
return [a] if match is None else [a, match]
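
The cash-flow matching rule above can be checked in isolation with a simplified stand-in. The sketch below assumes a hypothetical `Trade` dataclass in place of the real `Activity` model and plain strings in place of `ActivityType`; only the arithmetic and the `cash-flow-match:{tag}:{external_id}` naming are taken from the code above.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Trade:
    """Hypothetical, trimmed-down stand-in for broker_sync.models.Activity."""
    kind: str  # "BUY", "SELL", or anything else
    external_id: str
    quantity: Decimal
    unit_price: Decimal
    fee: Decimal = Decimal(0)


def matching_cash_flow(t: Trade) -> tuple[str, str, Decimal] | None:
    """Return (flow_type, external_id, amount) for a BUY/SELL, else None."""
    gross = t.quantity * t.unit_price
    if t.kind == "BUY":
        # A BUY consumes gross + fee, so that much cash must be deposited.
        flow, tag, amount = "DEPOSIT", "buy", gross + t.fee
    elif t.kind == "SELL":
        # A SELL releases gross - fee, which is then withdrawn.
        flow, tag, amount = "WITHDRAWAL", "sell", gross - t.fee
    else:
        return None
    if amount <= 0:
        return None
    return flow, f"cash-flow-match:{tag}:{t.external_id}", amount


buy = Trade("BUY", "t212:1", Decimal("10"), Decimal("25.50"), Decimal("1.00"))
print(matching_cash_flow(buy))
```

Because the synthetic entry derives its `external_id` from the trade's own id, it passes through the same dedup check as the trade and is not re-emitted on later runs.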

@ -0,0 +1,253 @@
"""Fidelity UK PlanViewer provider — workplace pension backfill + monthly sync.
PlanViewer has no public individual-member API. The SPA (at
``pv.planviewer.fidelity.co.uk``) and the legacy HTML app (at
``www.planviewer.fidelity.co.uk``) share session cookies via PingFederate
OAuth at ``id.fidelity.co.uk``.
We keep a Playwright-maintained session via ``storage_state.json``:
1. **One-off seed** (``broker-sync fidelity-seed``): Viktor runs a headed
Chromium, logs in (password + memorable word + SMS MFA), clicks
"Remember device". The storage_state is persisted to Vault.
2. **Monthly cron**: loads storage_state, boots headless Chromium, navigates
to the transaction-history page with a wide date range, parses the HTML
table, and intercepts the ``DisplayValuation`` XHR for the current
fund holdings. On 401/idle-timeout we raise
:class:`FidelitySessionError` so Prometheus alerts Viktor to re-seed.
## Emitted Activity shape
- One ``DEPOSIT`` per cash-impacting transaction (Regular Premium, Single
Premium, rebate, etc.). ``external_id = fidelity:tx:<sha256[:16]>``.
- One synthetic ``DEPOSIT`` for unrealised gains so WF's Net Worth matches
the Fidelity dashboard. ``external_id =
fidelity:gains:<YYYY-MM-DD>``.
- Bulk Switches / Fund Switches are skipped (no cash movement).
"""
from __future__ import annotations
import contextlib
import logging
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from decimal import Decimal
from pathlib import Path
from typing import Any, NamedTuple
from broker_sync.models import Account, AccountType, Activity, ActivityType
from broker_sync.providers.parsers.fidelity import (
FidelityCashTx,
FidelityHolding,
parse_transactions_html,
parse_valuation_json,
)
log = logging.getLogger(__name__)
ACCOUNT_ID = "fidelity-workplace-pension"
_CCY = "GBP"
_PV_BASE = "https://www.planviewer.fidelity.co.uk"
_PV_TX_PATH = "/planviewer/DisplayMyPlanMemberTransHist.action"
_PV_VALUATION_PATH = "/planviewer/DisplayValuation.action"
_PV_LANDING = "https://www.planviewer.fidelity.co.uk/"
# A wide backfill cap; scheme can't predate 1990.
_BACKFILL_START = "01 Jan 1990"
class FidelityCreds(NamedTuple):
"""Paths needed to run the provider."""
storage_state_path: str
plan_id: str
headless: bool = True
class FidelitySessionError(Exception):
"""Raised when PlanViewer rejects the saved session — re-seed required."""
class FidelityProviderConfigError(Exception):
"""Raised when provider config is missing or obviously wrong."""
def _tx_to_activity(tx: FidelityCashTx) -> Activity:
"""Map a Fidelity cash transaction to a canonical DEPOSIT."""
return Activity(
external_id=tx.external_id,
account_id=ACCOUNT_ID,
account_type=AccountType.WORKPLACE_PENSION,
date=tx.date,
activity_type=ActivityType.DEPOSIT,
currency=_CCY,
amount=tx.amount,
notes=f"fidelity-planviewer:{tx.tx_type}",
)
def _gains_offset_activity(
holdings: list[FidelityHolding],
transactions: list[FidelityCashTx],
as_of: datetime,
) -> Activity | None:
"""Create a synthetic DEPOSIT/WITHDRAWAL so WF Net Worth matches the
Fidelity dashboard's reported pot value.
The offset carries a date-derived external_id so monthly runs refresh
the same synthetic entry rather than stacking duplicates.
"""
if not holdings:
return None
total_value = sum((h.total_value for h in holdings), Decimal(0))
total_contrib = sum((t.amount for t in transactions), Decimal(0))
gains = total_value - total_contrib
if gains == 0:
return None
return Activity(
external_id=f"fidelity:gains:{as_of.date().isoformat()}",
account_id=ACCOUNT_ID,
account_type=AccountType.WORKPLACE_PENSION,
date=as_of,
activity_type=ActivityType.DEPOSIT if gains > 0 else ActivityType.WITHDRAWAL,
currency=_CCY,
amount=abs(gains),
notes=(f"fidelity-planviewer:unrealised-gains-offset "
f"(pot=£{total_value}, contrib=£{total_contrib})"),
)
class FidelityPlanViewerProvider:
"""Read-only provider against Fidelity UK PlanViewer.
Lifecycle:
- ``accounts()`` advertises the single WF workplace-pension account.
- ``fetch(since, before)`` opens a Playwright session with the saved
storage_state, navigates to the transaction-history page with a wide
date range, scrapes the table, and intercepts the valuation XHR.
"""
name = "fidelity-planviewer"
def __init__(self, creds: FidelityCreds) -> None:
self._creds = creds
def accounts(self) -> list[Account]:
return [
Account(
id=ACCOUNT_ID,
name="Fidelity UK Pension",
account_type=AccountType.WORKPLACE_PENSION,
currency=_CCY,
provider=self.name,
),
]
async def fetch(
self,
*,
since: datetime | None = None,
before: datetime | None = None,
) -> AsyncIterator[Activity]:
state_path = self._creds.storage_state_path
if not Path(state_path).exists():
raise FidelityProviderConfigError(
f"storage_state not found at {state_path}; "
"run `broker-sync fidelity-seed` first")
tx_html, valuation_json = await _scrape_live_session(
state_path=state_path, headless=self._creds.headless,
)
transactions = parse_transactions_html(tx_html)
holdings = parse_valuation_json(valuation_json)
log.info("fidelity: parsed %d transactions, %d holdings",
len(transactions), len(holdings))
for tx in transactions:
if since is not None and tx.date < since:
continue
if before is not None and tx.date >= before:
continue
yield _tx_to_activity(tx)
# The gains offset is always "as of now" so it reflects today's pot.
# Only emit when the caller isn't windowing (full state).
if since is None and before is None:
offset = _gains_offset_activity(holdings, transactions, datetime.now(UTC))
if offset is not None:
yield offset
async def _scrape_live_session(
*,
state_path: str,
headless: bool,
) -> tuple[str, dict[str, Any]]:
"""Load storage_state, navigate the transaction + valuation pages,
return (transactions HTML, valuation JSON payload).
Raises :class:`FidelitySessionError` if the session is dead (15-min idle,
cookie expiry, etc.); Viktor must re-seed.
"""
from playwright.async_api import async_playwright
captured_valuation: dict[str, dict[str, Any]] = {}
async with async_playwright() as pw:
browser = await pw.chromium.launch(headless=headless)
try:
ctx = await browser.new_context(
storage_state=state_path,
user_agent=("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/147.0.0.0 Safari/537.36"),
viewport={"width": 1280, "height": 900},
)
page = await ctx.new_page()
async def on_response(resp: Any) -> None:
if _PV_VALUATION_PATH in resp.url and resp.status < 400:
with contextlib.suppress(Exception):
captured_valuation["payload"] = await resp.json()
page.on("response", on_response)
# Trigger session + capture valuation by navigating through landing
# → main page. The SPA fires DisplayValuation on the main page.
await page.goto(_PV_LANDING, wait_until="networkidle", timeout=30000)
await page.wait_for_timeout(2000)
main_url = f"{_PV_BASE}/planviewer/DisplayMainPage.action"
await page.goto(main_url, wait_until="networkidle", timeout=30000)
await page.wait_for_timeout(3000)
if "idle for more than 15 minutes" in (await page.content()) \
or "id.fidelity.co.uk" in page.url:
raise FidelitySessionError(
"PlanViewer session stale — run `broker-sync fidelity-seed`")
# Now pull the transactions page with a wide date range.
await page.goto(f"{_PV_BASE}{_PV_TX_PATH}",
wait_until="networkidle", timeout=30000)
await page.wait_for_timeout(1500)
await page.fill('input[name="startDate"]', _BACKFILL_START)
today = await page.evaluate(
"new Date().toLocaleDateString('en-GB',"
"{day:'2-digit',month:'short',year:'numeric'}).replace(/,/g,'')")
await page.fill('input[name="endDate"]', today)
await page.focus('input[name="endDate"]')
await page.keyboard.press("Enter")
with contextlib.suppress(Exception):
await page.wait_for_load_state("networkidle", timeout=15000)
await page.wait_for_timeout(2000)
tx_html = await page.content()
# If valuation wasn't picked up on the main page, request directly.
if "payload" not in captured_valuation:
r = await page.request.get(f"{_PV_BASE}{_PV_VALUATION_PATH}")
if r.ok:
with contextlib.suppress(Exception):
captured_valuation["payload"] = await r.json()
# Roll the storage_state so the next run benefits from any refresh.
await ctx.storage_state(path=state_path)
finally:
await browser.close()
valuation: dict[str, Any] = captured_valuation.get("payload") or {}
return tx_html, valuation
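
The unrealised-gains offset from `_gains_offset_activity` reduces to a small calculation, sketched here with hypothetical names (`gains_offset` and the tuple return shape are illustrative, not part of the module). The pot and contribution figures are made up for the example.

```python
from __future__ import annotations

from datetime import date
from decimal import Decimal


def gains_offset(
    total_value: Decimal,
    total_contrib: Decimal,
    as_of: date,
) -> tuple[str, Decimal, str] | None:
    """Return (flow_type, amount, external_id), or None when nothing to offset."""
    gains = total_value - total_contrib
    if gains == 0:
        return None
    flow = "DEPOSIT" if gains > 0 else "WITHDRAWAL"
    # The id depends only on the calendar date, so two runs on the same day
    # produce the same id rather than two separate offsets.
    return flow, abs(gains), f"fidelity:gains:{as_of.isoformat()}"


print(gains_offset(Decimal("12500.00"), Decimal("10000.00"), date(2026, 5, 1)))
```

Note that the id is stable only within a single day: a run on a different date yields a different `external_id`, so how offsets from earlier dates are reconciled depends on the sink and is not shown here.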

@ -0,0 +1,144 @@
"""Backfill-from-finance provider.
The retired `finance` app's MySQL has a `position` table with 5+ years of
InvestEngine + Schwab trade history (2020 onwards) that the broker-sync
pipeline otherwise can't reconstruct (IE's emails only go back to when
Viktor started receiving them; Schwab emails are sparse). This provider
reads that table once and emits canonical Activities so a full-history
backfill into Wealthfolio is possible.
Ticker routing to Wealthfolio accounts:
*.L (VUAG.L, VUSA.L, etc.) -> InvestEngine ISA (GBP)
everything else (META, *_US_EQ) -> Schwab (US workplace, USD)
Deduplication: the finance.position PK (a giant numeric string) goes into
external_id verbatim, so re-runs are idempotent against the sync_record
store.
"""
from __future__ import annotations
import logging
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from decimal import Decimal
from typing import NamedTuple
import aiomysql # type: ignore[import-untyped]
from broker_sync.models import Account, AccountType, Activity, ActivityType
log = logging.getLogger(__name__)
IE_ACCOUNT_ID = "invest-engine-primary"
SCHWAB_ACCOUNT_ID = "schwab-workplace"
class FinanceMySQLCreds(NamedTuple):
host: str
port: int
user: str
password: str
database: str
def _route(ticker: str) -> tuple[str, AccountType, str]:
"""Return (account_id, account_type, currency) for a raw ticker."""
if ticker.endswith(".L"):
return IE_ACCOUNT_ID, AccountType.ISA, "GBP"
return SCHWAB_ACCOUNT_ID, AccountType.GIA, "USD"
def _normalise_symbol(ticker: str) -> str:
"""Strip finance-app quirks so the output symbol matches T212/Wealthfolio."""
# VUAG.L -> VUAG (LSE handled by Wealthfolio's exchange_mic resolution)
if ticker.endswith(".L"):
return ticker[:-2]
# FLME_US_EQ -> FLME (Trading212-style suffix leaked into the old finance DB)
if ticker.endswith("_US_EQ"):
return ticker[:-6]
if ticker.endswith("_EQ"):
return ticker[:-3]
return ticker
def _row_to_activity(row: dict[str, object]) -> Activity:
ticker = str(row["ticker"])
account_id, account_type, default_ccy = _route(ticker)
raw_qty = Decimal(str(row["num_shares"]))
activity_type = ActivityType.BUY if raw_qty > 0 else ActivityType.SELL
# buy_date from MySQL comes back as datetime (aiomysql converts)
dt = row["buy_date"]
if isinstance(dt, datetime):
date = dt if dt.tzinfo else dt.replace(tzinfo=UTC)
else:
date = datetime.fromisoformat(str(dt)).replace(tzinfo=UTC)
currency_raw = row.get("currency")
currency = str(currency_raw) if currency_raw else default_ccy
return Activity(
external_id=f"finance-mysql:position:{row['id']}",
account_id=account_id,
account_type=account_type,
date=date,
activity_type=activity_type,
symbol=_normalise_symbol(ticker),
quantity=abs(raw_qty),
unit_price=Decimal(str(row["buy_price"])),
currency=currency,
notes=f"finance-mysql:{ticker}",
)
class FinanceMySQLProvider:
"""Read-only backfill from the retired finance MySQL `position` table."""
name = "finance-mysql"
def __init__(self, creds: FinanceMySQLCreds) -> None:
self._creds = creds
def accounts(self) -> list[Account]:
return [
Account(
id=IE_ACCOUNT_ID,
name="InvestEngine ISA",
account_type=AccountType.ISA,
currency="GBP",
provider="invest-engine",
),
Account(
id=SCHWAB_ACCOUNT_ID,
name="Schwab (US workplace)",
account_type=AccountType.GIA,
currency="USD",
provider="schwab",
),
]
async def fetch(
self,
*,
since: datetime | None = None,
before: datetime | None = None,
) -> AsyncIterator[Activity]:
conn = await aiomysql.connect(
host=self._creds.host,
port=self._creds.port,
user=self._creds.user,
password=self._creds.password,
db=self._creds.database,
autocommit=True,
)
try:
async with conn.cursor(aiomysql.DictCursor) as cur:
await cur.execute("SELECT id, ticker, buy_price, num_shares, currency, buy_date, "
"account_id FROM position ORDER BY buy_date ASC")
rows = await cur.fetchall()
log.info("finance-mysql: %d position rows", len(rows))
for row in rows:
activity = _row_to_activity(row)
if since is not None and activity.date < since:
continue
if before is not None and activity.date >= before:
continue
yield activity
finally:
conn.close()
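
The idempotency claim in the module docstring (the `position` primary key goes into `external_id` verbatim, so re-runs are no-ops) can be illustrated with an in-memory set standing in for `SyncRecordStore`. This is a sketch under that assumption; `import_once` and the row dicts are hypothetical, and only the `.L` routing rule and the `finance-mysql:position:{id}` format come from the code above.

```python
def import_once(rows: list[dict[str, str]], seen: set[tuple[str, str, str]]) -> list[str]:
    """Return the external_ids that were new on this run."""
    imported: list[str] = []
    for row in rows:
        # Same routing rule as _route: *.L tickers go to the InvestEngine ISA.
        account = "invest-engine-primary" if row["ticker"].endswith(".L") else "schwab-workplace"
        external_id = f"finance-mysql:position:{row['id']}"
        key = ("finance-mysql", account, external_id)
        if key in seen:
            continue  # already recorded by an earlier run
        seen.add(key)
        imported.append(external_id)
    return imported


rows = [{"id": "1001", "ticker": "VUAG.L"}, {"id": "1002", "ticker": "META"}]
seen: set[tuple[str, str, str]] = set()
print(import_once(rows, seen))  # first run: both rows are new
print(import_once(rows, seen))  # second run: nothing left to import
```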

@ -0,0 +1,253 @@
"""IMAP email ingestor: dispatches messages to the matching parser by sender.
Used by the `imap-ingest` CLI command for InvestEngine + Schwab confirmation
emails. Each message passes through:
1. Pull ALL messages from the configured mailbox directory.
2. Route each by `From:` to a parser:
- noreply@investengine.com (+ equivalents) -> invest_engine parser
- Schwab confirmations (equityawards@schwab.com, etc.) -> schwab parser
3. Merge parser output into one list[Activity] with source attribution.
Not imap-idle; runs once per invocation. Designed for a daily CronJob.
"""
from __future__ import annotations
import email
import imaplib
import logging
import re
import ssl
from collections.abc import AsyncIterator, Iterator
from datetime import date, datetime
from decimal import Decimal
from email.message import Message
from typing import NamedTuple
from broker_sync.models import Account, AccountType, Activity, ActivityType
from broker_sync.providers.parsers import invest_engine as ie_parser
from broker_sync.providers.parsers.schwab import parse_schwab_email
_IE_ISA_ACCOUNT_ID = "invest-engine-primary"
_IE_GIA_ACCOUNT_ID = "invest-engine-gia"
_ISA_ANNUAL_CAP = Decimal("20000")
_UK_TAX_YEAR_START = (4, 6) # (month, day) — UK tax year starts 6 April
def _uk_tax_year_start(d: datetime) -> date:
"""Return the start date (6 April of year N) of the UK tax year containing `d`."""
month, day = _UK_TAX_YEAR_START
cutoff = date(d.year, month, day)
return cutoff if d.date() >= cutoff else date(d.year - 1, month, day)
def _split_ie_by_isa_cap(
activities: list[Activity],
*,
isa_cap: Decimal = _ISA_ANNUAL_CAP,
) -> list[Activity]:
"""Re-route IE BUYs: first `isa_cap` GBP of each UK tax year → ISA, rest → GIA.
Viktor's IE account has both an ISA and a GIA wrapper, and his trade
confirmation emails don't indicate which one a given buy hit. Empirically,
he fills the ISA allowance first each tax year (6 April) and any excess
lands in GIA. This function partitions an already-parsed batch of Activity
objects by that rule.
Rule for boundary buys: a BUY is assigned to ISA iff the running tax-year
total BEFORE it is still strictly below the cap; otherwise GIA. Whole-activity
assignment only; no fractional splits.
Non-IE activities and non-BUYs are passed through unchanged.
"""
ie_buys = [
a for a in activities
if a.account_id == _IE_ISA_ACCOUNT_ID and a.activity_type is ActivityType.BUY
]
ie_buys.sort(key=lambda a: a.date)
cumulative: dict[date, Decimal] = {}
for a in ie_buys:
ty = _uk_tax_year_start(a.date)
running = cumulative.get(ty, Decimal(0))
trade_value = (a.quantity or Decimal(0)) * (a.unit_price or Decimal(0))
if running < isa_cap:
a.account_id = _IE_ISA_ACCOUNT_ID
a.account_type = AccountType.ISA
else:
a.account_id = _IE_GIA_ACCOUNT_ID
a.account_type = AccountType.GIA
cumulative[ty] = running + trade_value
return activities
log = logging.getLogger(__name__)
_IE_SENDERS = {"noreply@investengine.com", "hello@investengine.com"}
_SCHWAB_SENDERS = {
"equityawards@schwab.com",
"donotreply@schwab.com",
"wealthnotify@schwab.com",
}
_ADDR_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
class ImapCreds(NamedTuple):
host: str
user: str
password: str
directory: str
def _extract_sender(msg: Message) -> str:
raw = msg.get("From", "")
m = _ADDR_RE.search(raw)
return (m.group(0) if m else "").lower()
def _html_or_text(msg: Message) -> str:
"""Return the richest body available (prefer HTML)."""
if msg.is_multipart():
html = None
plain = None
for part in msg.walk():
ct = part.get_content_type()
if ct == "text/html" and html is None:
html = part.get_payload(decode=True)
elif ct == "text/plain" and plain is None:
plain = part.get_payload(decode=True)
body = html or plain
else:
body = msg.get_payload(decode=True)
if body is None:
return ""
if isinstance(body, bytes):
charset = msg.get_content_charset() or "utf-8"
try:
return body.decode(charset, errors="replace")
except LookupError:
return body.decode("utf-8", errors="replace")
return str(body)
def _fetch_all(creds: ImapCreds) -> Iterator[bytes]:
ctx = ssl.create_default_context()
with imaplib.IMAP4_SSL(creds.host, ssl_context=ctx) as m:
m.login(creds.user, creds.password)
typ, _ = m.select(creds.directory, readonly=True)
if typ != "OK":
raise RuntimeError(f"IMAP select {creds.directory} failed: {typ}")
typ, data = m.search(None, "ALL")
if typ != "OK":
raise RuntimeError(f"IMAP search failed: {typ}")
ids = data[0].split()
log.info("imap: fetching %d messages from %s", len(ids), creds.directory)
for uid in ids:
typ, rsp = m.fetch(uid, "(RFC822)")
if typ != "OK" or not rsp or not rsp[0]:
continue
raw = rsp[0][1]
if isinstance(raw, bytes):
yield raw
def fetch_activities(creds: ImapCreds) -> list[Activity]:
out: list[Activity] = []
ie_parsed = schwab_parsed = skipped = 0
for raw in _fetch_all(creds):
try:
msg = email.message_from_bytes(raw)
except Exception:
skipped += 1
continue
sender = _extract_sender(msg)
if sender in _IE_SENDERS or sender.endswith("@investengine.com"):
out.extend(ie_parser.parse_invest_engine_email(raw))
ie_parsed += 1
elif sender in _SCHWAB_SENDERS or sender.endswith("@schwab.com"):
html = _html_or_text(msg)
out.extend(parse_schwab_email(html))
schwab_parsed += 1
else:
skipped += 1
log.info(
"imap: ie_parsed=%d schwab_parsed=%d skipped=%d -> %d activities",
ie_parsed,
schwab_parsed,
skipped,
len(out),
)
return out
class ImapProvider:
"""Wraps the IMAP fetch + per-sender parse into the Provider protocol.
Yields both InvestEngine AND Schwab activities downstream; the
pipeline's dedup, keyed on (provider, account, external_id), already
isolates them by account_id.
"""
name = "imap"
def __init__(self, creds: ImapCreds) -> None:
self._creds = creds
def accounts(self) -> list[Account]:
return [
Account(
id=_IE_ISA_ACCOUNT_ID,
name="InvestEngine ISA",
account_type=AccountType.ISA,
currency="GBP",
provider="invest-engine",
),
Account(
id=_IE_GIA_ACCOUNT_ID,
name="InvestEngine GIA",
account_type=AccountType.GIA,
currency="GBP",
provider="invest-engine",
),
Account(
id="schwab-workplace",
name="Schwab (US workplace)",
account_type=AccountType.GIA,
currency="USD",
provider="schwab",
),
]
async def fetch(
self,
*,
since: datetime | None = None,
before: datetime | None = None,
) -> AsyncIterator[Activity]:
# IMAP doesn't give us a server-side date range directly without
# constructing IMAP SEARCH criteria; filter client-side.
all_activities = fetch_activities(self._creds)
# Apply ISA/GIA £20k-cap routing in one batch-level pass so each UK tax
# year's cumulative total is computed consistently regardless of email
# order on the server.
routed = _split_ie_by_isa_cap(all_activities)
for a in routed:
if since is not None and a.date < since:
continue
if before is not None and a.date >= before:
continue
yield a
if __name__ == "__main__":
# Local smoke — invoked manually for debug, never from the CronJob.
import os
logging.basicConfig(level=logging.INFO)
c = ImapCreds(
host=os.environ["IMAP_HOST"],
user=os.environ["IMAP_USER"],
password=os.environ["IMAP_PASSWORD"],
directory=os.environ.get("IMAP_DIRECTORY", "INBOX"),
)
acts = fetch_activities(c)
print(f"total={len(acts)}")
for a in acts[:5]:
print(f" {a.activity_type} {a.symbol} {a.date.isoformat()}")
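
The ISA/GIA routing rule in `_split_ie_by_isa_cap` is easiest to follow on a worked example. The sketch below re-implements the rule on plain `(date, value)` pairs instead of `Activity` objects; `uk_tax_year_start` and `route_buys` are illustrative names, and the trade values are invented.

```python
from datetime import date
from decimal import Decimal


def uk_tax_year_start(d: date) -> date:
    """Return 6 April of the UK tax year that contains `d`."""
    cutoff = date(d.year, 4, 6)
    return cutoff if d >= cutoff else date(d.year - 1, 4, 6)


def route_buys(
    buys: list[tuple[date, Decimal]],
    cap: Decimal = Decimal("20000"),
) -> list[tuple[date, Decimal, str]]:
    """Assign each buy to ISA while the running tax-year total is below `cap`."""
    running: dict[date, Decimal] = {}
    routed = []
    for d, value in sorted(buys, key=lambda b: b[0]):
        ty = uk_tax_year_start(d)
        so_far = running.get(ty, Decimal(0))
        # Whole-activity assignment: the buy that crosses the cap still lands in ISA.
        routed.append((d, value, "ISA" if so_far < cap else "GIA"))
        running[ty] = so_far + value
    return routed


buys = [
    (date(2025, 4, 5), Decimal("5000")),   # previous tax year
    (date(2025, 4, 6), Decimal("15000")),  # new tax year starts, total 0
    (date(2025, 6, 1), Decimal("8000")),   # total 15000 < cap, still ISA
    (date(2025, 9, 1), Decimal("1000")),   # total 23000 >= cap, GIA
]
for row in route_buys(buys):
    print(row)
```

One consequence of the whole-activity rule is visible in the third buy: the ISA total for the year ends at 23,000, above the 20,000 cap, because the crossing trade is never split.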

@ -0,0 +1,380 @@
"""InvestEngine Bearer-token provider.
InvestEngine (https://investengine.com) has no public API and requires MFA
(push-approval via the IE mobile app) on every login. We work around that
by having Viktor log in manually in a browser, copy the Bearer token out
of devtools, and paste it into Vault. This module consumes that token.
## Vault schema — `secret/broker-sync`
The following keys are expected in Vault. The caller (CLI/pipeline) reads
them and hands the values to the constructor. This module does NOT read
Vault directly, so it stays testable.
- ``investengine_bearer_token``: the Bearer string Viktor pastes from
devtools. Expires on IE's refresh schedule (~monthly based on observed
behaviour).
- ``investengine_token_expires_at``: ISO-8601 timestamp Viktor sets WHEN
HE PASTES the token. Used to alert 3 days before expiry and to fail
fast before making a request with a known-dead token.
- ``investengine_refresh_token`` *(optional)*: if Viktor's devtools
capture included a refresh token, we may attempt auto-refresh in a
future iteration. Not used yet.
## Version probing
IE rolls the ``/api/v0.3X/`` version every 4-6 weeks. ``v0.29`` and
``v0.30`` were ``410 Gone`` at research time; ``v0.31`` and ``v0.32``
were live (401 without auth, the correct "auth required" signal);
``v0.33`` and later returned 404 (not yet created). The probe starts
at ``v0.32`` and walks forward on 410, stopping at the first version
that returns anything other than 410 (401/404/200/etc).
"""
from __future__ import annotations
import logging
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from decimal import Decimal
from typing import Any
import httpx
from broker_sync.models import Account, AccountType, Activity, ActivityType
log = logging.getLogger(__name__)
_DEFAULT_BASE_URL = "https://investengine.com"
_USER_AGENT = "broker-sync/0.1 (+https://github.com/ViktorBarzin/broker-sync)"
# Version probe starts here. If this minor bumps past a live version,
# update the constant rather than relying on re-probe every process.
_START_VERSION_MINOR = 32
# Hard cap on how far forward we probe before giving up. IE has never
# skipped versions; this is defence against a runaway loop.
_MAX_VERSION_MINOR = 60
# One logical account — IE has only ever had a single ISA per user.
_ACCOUNT_ID = "invest-engine-primary"
_ACCOUNT = Account(
id=_ACCOUNT_ID,
name="InvestEngine ISA",
account_type=AccountType.ISA,
currency="GBP",
provider="invest-engine",
)
# Type-string → ActivityType. Exact IE strings are UNVERIFIED — we match
# case-insensitively and fall back to substring checks for the common
# variants ("DEPOSIT", "WITHDRAWAL").
_EXACT_TYPE_MAP: dict[str, ActivityType] = {
"BUY": ActivityType.BUY,
"SELL": ActivityType.SELL,
"DIVIDEND": ActivityType.DIVIDEND,
"INTEREST": ActivityType.INTEREST,
"DEPOSIT": ActivityType.DEPOSIT,
"WITHDRAWAL": ActivityType.WITHDRAWAL,
"FEE": ActivityType.FEE,
"TAX": ActivityType.TAX,
}
class InvestEngineError(Exception):
"""Any non-retryable InvestEngine API failure."""
class InvestEngineTokenExpiredError(InvestEngineError):
"""Bearer token rejected by IE (401). Viktor must paste a new token."""
class InvestEngineVersionError(InvestEngineError):
"""Could not find a live /api/v0.3X/ version on the IE backend."""
def _version_path(minor: int) -> str:
return f"/api/v0.{minor}/"
async def _probe_version(
client: httpx.AsyncClient,
*,
start_minor: int = _START_VERSION_MINOR,
max_minor: int = _MAX_VERSION_MINOR,
) -> int:
"""Walk forward from ``start_minor`` looking for a live API version.
Returns the minor number of the first version that is NOT 410 Gone.
A live version is one IE currently accepts; a 401 response is the
expected "auth required" signal and confirms the version is serving
traffic. Raises :class:`InvestEngineVersionError` if no live version
is found before ``max_minor``.
"""
for minor in range(start_minor, max_minor + 1):
resp = await client.get(_version_path(minor))
if resp.status_code != 410:
return minor
raise InvestEngineVersionError(
f"No live /api/v0.3X/ between v0.{start_minor} and v0.{max_minor}")
def _parse_iso(ts: str) -> datetime:
"""Accept both ``...Z`` and ``...+HH:MM`` suffixes."""
return datetime.fromisoformat(ts.replace("Z", "+00:00"))
def _opt_decimal(raw: Any) -> Decimal | None:
if raw is None:
return None
return Decimal(str(raw))
def _classify_type(raw_type: str) -> ActivityType | None:
"""Map a raw IE type string to ActivityType, or None if unknown."""
upper = raw_type.upper()
exact = _EXACT_TYPE_MAP.get(upper)
if exact is not None:
return exact
if "DEPOSIT" in upper:
return ActivityType.DEPOSIT
if "WITHDRAWAL" in upper or "WITHDRAW" in upper:
return ActivityType.WITHDRAWAL
return None
def _transaction_to_activity(raw: dict[str, Any]) -> Activity | None:
"""Turn one IE transaction dict into a canonical Activity.
The IE response shape is UNVERIFIED; these assumptions WILL need
review once Viktor pastes a live token:
- ``id``: string or int (cast to str for ``external_id``)
- ``type``: upper-case string ``BUY``/``SELL``/``DIVIDEND``/etc.
- ``symbol``: ticker string (may be missing on cash events)
- ``quantity`` / ``price`` / ``amount``: numeric-in-a-string or float
- ``currency``: ISO code; default ``GBP`` for an ISA
- ``date``: ISO-8601 with ``Z`` or offset suffix
Unknown type strings are logged at WARNING and skipped; silent
misclassification would corrupt tax reporting, so we refuse to guess.
"""
raw_type = str(raw.get("type", ""))
activity_type = _classify_type(raw_type)
if activity_type is None:
log.warning(
"invest-engine: skipping transaction id=%s with unknown type=%r",
raw.get("id"),
raw_type,
)
return None
txn_id = raw.get("id")
if txn_id is None:
log.warning("invest-engine: skipping transaction with missing id: %r", raw)
return None
currency = str(raw.get("currency") or "GBP")
date_str = raw.get("date") or raw.get("created_at") or raw.get("timestamp")
if not isinstance(date_str, str):
log.warning("invest-engine: skipping txn id=%s — no parseable date", txn_id)
return None
quantity = _opt_decimal(raw.get("quantity"))
unit_price = _opt_decimal(raw.get("price") or raw.get("unit_price"))
amount = _opt_decimal(raw.get("amount") or raw.get("value"))
fee = _opt_decimal(raw.get("fee")) or Decimal("0")
symbol_raw = raw.get("symbol") or raw.get("ticker")
symbol = str(symbol_raw) if symbol_raw else None
return Activity(
external_id=f"invest-engine:{txn_id}",
account_id=_ACCOUNT_ID,
account_type=AccountType.ISA,
date=_parse_iso(date_str),
activity_type=activity_type,
currency=currency,
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
amount=amount,
fee=fee,
)
def _extract_list(page: dict[str, Any]) -> list[dict[str, Any]]:
"""Handle both ``{results: [...]}`` and ``{data: [...]}`` shapes."""
for key in ("results", "data"):
items = page.get(key)
if isinstance(items, list):
return [i for i in items if isinstance(i, dict)]
return []
def _extract_next(page: dict[str, Any]) -> str | None:
"""Pull the next-page URL from either DRF ``next`` or JSON:API ``meta.next_page``."""
nxt = page.get("next")
if isinstance(nxt, str) and nxt:
return nxt
meta = page.get("meta")
if isinstance(meta, dict):
meta_nxt = meta.get("next_page") or meta.get("next")
if isinstance(meta_nxt, str) and meta_nxt:
return meta_nxt
return None
class InvestEngineProvider:
"""Concrete Provider for InvestEngine.
Only one logical account per user (one ISA, GBP). The token expiry is
tracked at the Python layer: the CLI alerts 3 days before expiry, and
this class fails fast if the clock says the token is already dead
(cheaper than burning a request for the same 401).
"""
name = "invest-engine"
def __init__(
self,
*,
bearer_token: str,
token_expires_at: datetime,
base_url: str = _DEFAULT_BASE_URL,
transport: httpx.AsyncBaseTransport | None = None,
) -> None:
self._token = bearer_token
self._token_expires_at = token_expires_at
self._client = httpx.AsyncClient(
base_url=base_url,
timeout=30.0,
transport=transport,
headers={
"Authorization": f"Bearer {bearer_token}",
"User-Agent": _USER_AGENT,
},
)
self._version_minor: int | None = None
def accounts(self) -> list[Account]:
return [_ACCOUNT]
async def close(self) -> None:
await self._client.aclose()
async def _active_version(self, *, force: bool = False) -> int:
"""Cache the live API minor. Set ``force=True`` after a 410 to re-probe."""
if self._version_minor is not None and not force:
return self._version_minor
start = (self._version_minor + 1) if force and self._version_minor else _START_VERSION_MINOR
minor = await _probe_version(self._client, start_minor=start)
self._version_minor = minor
return minor
async def _request_json(self, path: str, params: dict[str, str] | None = None) -> Any:
"""GET ``path`` (relative), return JSON. Handles one 410 re-probe retry."""
resp = await self._client.get(path, params=params)
if resp.status_code == 401:
raise InvestEngineTokenExpiredError(
f"InvestEngine rejected Bearer token (HTTP 401 on {path}); "
f"token_expires_at={self._token_expires_at.isoformat()}. "
f"Viktor must paste a new token into Vault.")
if resp.status_code == 410:
# Version rolled mid-session. Re-probe, retarget path, retry once.
old_minor = self._version_minor
new_minor = await self._active_version(force=True)
if old_minor is not None and new_minor != old_minor:
new_path = path.replace(f"/v0.{old_minor}/", f"/v0.{new_minor}/", 1)
retry = await self._client.get(new_path, params=params)
if retry.status_code == 401:
raise InvestEngineTokenExpiredError(f"InvestEngine 401 on retry {new_path}")
if retry.status_code != 200:
raise InvestEngineError(
f"InvestEngine {new_path} HTTP {retry.status_code} after re-probe")
return retry.json()
raise InvestEngineError(f"InvestEngine 410 on {path} and no newer version")
if resp.status_code != 200:
raise InvestEngineError(
f"InvestEngine {path} HTTP {resp.status_code}: {resp.text[:200]}")
return resp.json()
async def fetch(
self,
*,
since: datetime | None = None,
before: datetime | None = None,
) -> AsyncIterator[Activity]:
# Fail fast if the token is already known-dead.
if self._token_expires_at <= datetime.now(UTC):
raise InvestEngineTokenExpiredError(
f"InvestEngine token expired at {self._token_expires_at.isoformat()}. "
f"Viktor must paste a new token.")
version = await self._active_version()
portfolio_ids = await self._list_portfolio_ids(version)
for pid in portfolio_ids:
async for activity in self._fetch_portfolio(version, pid, since, before):
yield activity
async def _list_portfolio_ids(self, version: int) -> list[str]:
"""Walk `/portfolios/` pagination and return the list of ids."""
ids: list[str] = []
path: str | None = f"/api/v0.{version}/portfolios/"
while path is not None:
page = await self._request_json(path)
if not isinstance(page, dict):
log.warning("invest-engine: /portfolios/ returned non-dict %r", type(page))
break
for item in _extract_list(page):
pid = item.get("id")
if pid is None:
continue
ids.append(str(pid))
path = _next_page_path(_extract_next(page), current=path)
return ids
async def _fetch_portfolio(
self,
version: int,
portfolio_id: str,
since: datetime | None,
before: datetime | None,
) -> AsyncIterator[Activity]:
params: dict[str, str] = {"portfolio": portfolio_id}
if since is not None:
params["start"] = since.date().isoformat()
if before is not None:
params["end"] = before.date().isoformat()
path: str | None = f"/api/v0.{version}/transactions/"
while path is not None:
page = await self._request_json(
path, params=params if path.endswith("/transactions/") else None)
if not isinstance(page, dict):
break
for raw in _extract_list(page):
activity = _transaction_to_activity(raw)
if activity is None:
continue
if since is not None and activity.date < since:
continue
if before is not None and activity.date >= before:
continue
yield activity
path = _next_page_path(_extract_next(page), current=path)
def _next_page_path(raw_next: str | None, *, current: str) -> str | None:
"""Normalise a ``next`` URL to a request path.
IE might emit full URLs (``https://investengine.com/api/v0.32/...``)
or relative paths. We return only the path (plus any query string);
the httpx client holds the base URL.
"""
if raw_next is None:
return None
if raw_next.startswith("http://") or raw_next.startswith("https://"):
from urllib.parse import urlparse
parsed = urlparse(raw_next)
path = parsed.path
if parsed.query:
path = f"{path}?{parsed.query}"
return path
return raw_next
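
The pagination and version-retry handling above can be sketched in isolation. This is a minimal, standalone illustration rather than the module's code: `next_page_path` and `retarget_version` are hypothetical names, and the sample URLs are made up to match the `/api/v0.NN/` shape the docstrings describe.

```python
from __future__ import annotations

from urllib.parse import urlparse


def next_page_path(raw_next: str | None) -> str | None:
    """Reduce an absolute or relative ``next`` link to path plus query."""
    if raw_next is None:
        return None
    if raw_next.startswith(("http://", "https://")):
        parsed = urlparse(raw_next)
        # Keep the query string: it carries the page cursor for the next request.
        return f"{parsed.path}?{parsed.query}" if parsed.query else parsed.path
    return raw_next


def retarget_version(path: str, old_minor: int, new_minor: int) -> str:
    """Swap the version segment once, as the 410 retry path does."""
    return path.replace(f"/v0.{old_minor}/", f"/v0.{new_minor}/", 1)
```

Because the normalised path already includes the query string, follow-up pages no longer end in `/transactions/`, which is why `_fetch_portfolio` only passes `params` on the first request.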

@ -0,0 +1,129 @@
"""Parsers for Fidelity UK PlanViewer scraped data.
Two inputs:
- **Transactions HTML** from ``/planviewer/DisplayMyPlanMemberTransHist.action``
rendered with a wide date range. The relevant <table> has
``id="myplan_member_transhist_support"``.
- **Valuation JSON** from the XHR ``/planviewer/DisplayValuation.action``;
the SPA calls this to render the my-investments dashboard. Contains
current unit holdings + price + breakdown by contribution type.
"""
from __future__ import annotations
import hashlib
import re
from dataclasses import dataclass
from datetime import UTC, datetime
from decimal import Decimal
from typing import Any
from bs4 import BeautifulSoup
_AMOUNT_RE = re.compile(r"\u00a3([\d,]+(?:\.\d+)?)")
# Fidelity transaction type strings we care about
_TX_DEPOSIT_TYPES = {
"regular premium",
"single premium",
"investment management rebate",
}
_TX_IGNORE_TYPES = {
"bulk switch", # pure reallocation, no cash impact
"fund switch",
}
@dataclass(frozen=True)
class FidelityCashTx:
"""A single cash-impacting transaction from the transaction history page."""
date: datetime
tx_type: str # raw Fidelity label ("Regular Premium", "Single Premium", …)
amount: Decimal
external_id: str
@dataclass(frozen=True)
class FidelityHolding:
"""A current fund-unit holding from DisplayValuation.action."""
fund_code: str
fund_name: str
units: Decimal
unit_price: Decimal
currency: str
total_value: Decimal
# Contribution-type breakdown ({"SASC": Decimal(...), "ERXS": Decimal(...)})
units_by_source: dict[str, Decimal]
def parse_transactions_html(html: str) -> list[FidelityCashTx]:
"""Extract cash-impacting transactions from the transaction history page.
Skips bulk switches (no cash movement) and header/total rows. Deterministic
external_id so re-runs dedup against the same rows.
"""
soup = BeautifulSoup(html, "html.parser")
out: list[FidelityCashTx] = []
for tr in soup.select("table#myplan_member_transhist_support tr"):
cells = [td.get_text(" ", strip=True) for td in tr.find_all("td")]
if len(cells) != 7:
continue
date_str, tx_type, _f, _c, _u, _p, amount_str = cells
m_date = re.match(r"(\d{2})/(\d{2})/(\d{4})", date_str)
if not m_date:
continue
tx_lower = tx_type.lower()
if tx_lower in _TX_IGNORE_TYPES or tx_type in ("-",):
continue
m_amt = _AMOUNT_RE.search(amount_str)
if not m_amt:
continue
amount = Decimal(m_amt.group(1).replace(",", ""))
if amount == 0:
continue
dd, mm, yyyy = m_date.groups()
dt = datetime(int(yyyy), int(mm), int(dd), tzinfo=UTC)
fp = hashlib.sha256(
f"{dt.isoformat()}|{tx_type}|{amount}".encode()
).hexdigest()[:16]
out.append(FidelityCashTx(
date=dt,
tx_type=tx_type,
amount=amount,
external_id=f"fidelity:tx:{fp}",
))
return out
def parse_valuation_json(payload: Any) -> list[FidelityHolding]:
"""Extract current fund holdings from DisplayValuation.action JSON."""
out: list[FidelityHolding] = []
for v in payload.get("valuations", []):
asset = v.get("asset") or {}
fund_code = next(
(a.get("value") for a in asset.get("assetId", []) if a.get("type") == "FUND_CODE"),
None,
)
if not fund_code:
continue
fund_name = asset.get("name") or fund_code
units = Decimal(str((v.get("units") or {}).get("total") or 0))
price = (v.get("price") or {})
unit_price = Decimal(str(price.get("value") or 0))
currency = price.get("currency") or "GBP"
total = Decimal(str((v.get("valuation") or {}).get("total") or 0))
groups = (v.get("units") or {}).get("group", []) or []
by_src = {}
for g in groups:
if g.get("type") == "CONTRIBUTION_TYPE" and g.get("groupId"):
by_src[g["groupId"]] = Decimal(str(g.get("unit", {}).get("total") or 0))
out.append(FidelityHolding(
fund_code=fund_code,
fund_name=fund_name,
units=units,
unit_price=unit_price,
currency=currency,
total_value=total,
units_by_source=by_src,
))
return out
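
The deterministic `external_id` used by `parse_transactions_html` can be sketched on its own. This is a simplified illustration under stated assumptions: `cash_tx_id` is a hypothetical helper, it takes the already-extracted cell texts instead of HTML, and it uses `timezone.utc` where the module uses `datetime.UTC`.

```python
from __future__ import annotations

import hashlib
import re
from datetime import datetime, timezone
from decimal import Decimal

AMOUNT_RE = re.compile(r"\u00a3([\d,]+(?:\.\d+)?)")
DATE_RE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def cash_tx_id(date_str: str, tx_type: str, amount_str: str) -> str | None:
    """Build a stable id from (date, type, amount), or None if unparseable."""
    m_date = DATE_RE.match(date_str)
    m_amt = AMOUNT_RE.search(amount_str)
    if m_date is None or m_amt is None:
        return None
    dd, mm, yyyy = m_date.groups()
    dt = datetime(int(yyyy), int(mm), int(dd), tzinfo=timezone.utc)
    amount = Decimal(m_amt.group(1).replace(",", ""))
    # Same inputs always hash to the same 16 hex chars, so re-runs dedup.
    key = f"{dt.isoformat()}|{tx_type}|{amount}"
    return "fidelity:tx:" + hashlib.sha256(key.encode()).hexdigest()[:16]
```

One consequence of hashing only these three fields: two genuinely separate transactions with the same date, type, and amount would share an id and be treated as duplicates.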

@ -0,0 +1,329 @@
"""InvestEngine email parser.
IE mails the user after each trade batch. The body shape varies over
the years: IE has sent trade confirmations as plain-text RFC 2822
messages, multipart HTML emails with a summary table, and (for older
statements) CSV attachments. This module tries the three strategies in
order and returns the first that yields at least one Activity.
Every parse strategy produces canonical `Activity` objects with:
- `account_id = "invest-engine-primary"` (sink remaps to Wealthfolio UUID)
- `account_type = AccountType.ISA` (Viktor's IE account is an ISA)
- `currency = "GBP"`
- `external_id = f"invest-engine:{fingerprint}"` where fingerprint hashes
(date, symbol, quantity, unit_price) for deterministic dedup.
"""
from __future__ import annotations
import csv
import email
import hashlib
import io
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation
from email.message import Message
from bs4 import BeautifulSoup
from broker_sync.models import AccountType, Activity, ActivityType
_ACCOUNT_ID = "invest-engine-primary"
_CURRENCY_SIGN = "£"
# HTML trade summary rows have the shape "Bought <qty> @ £<price> per share".
_BOUGHT_RE = re.compile(
r"Bought\s+([0-9]+(?:\.[0-9]+)?)\s*@\s*" + re.escape(_CURRENCY_SIGN) + r"([0-9]+(?:\.[0-9]+)?)",
re.IGNORECASE,
)
# Ticker lines look like "Vanguard S&P 500: VUAG" — we want the last
# all-caps token after the colon.
_TICKER_RE = re.compile(r":\s*([A-Z][A-Z0-9]{1,9})\s*$")
# Date rows contain "Date: DD Month YYYY".
_DATE_RE = re.compile(
r"Date:\s*([0-9]{1,2})\s+([A-Za-z]+)\s+([0-9]{4})",
re.IGNORECASE,
)
def parse_invest_engine_email(raw_email: bytes) -> list[Activity]:
"""Parse an IE trade confirmation email into Activity records.
Tries RFC 2822 body lines first, then HTML tables, then a CSV
attachment. Returns an empty list when nothing matches; never
raises on malformed input.
"""
msg = email.message_from_bytes(raw_email)
text_body = _extract_part_body(msg, "text/plain")
if text_body is not None:
activities = _parse_rfc2822_lines(text_body)
if activities:
return activities
html_body = _extract_part_body(msg, "text/html")
if html_body is not None:
activities = _parse_html_tables(html_body)
if activities:
return activities
csv_activities = _parse_csv_attachment(raw_email)
if csv_activities:
return csv_activities
return []
def _extract_part_body(msg: Message, content_type: str) -> str | None:
"""Return the first sub-part of the given content type, or None."""
if msg.is_multipart():
for part in msg.walk():
if part.get_content_type() == content_type:
return _decode_payload(part)
return None
if msg.get_content_type() == content_type:
return _decode_payload(msg)
return None
def _decode_payload(part: Message) -> str | None:
payload = part.get_payload(decode=True)
if isinstance(payload, bytes):
return payload.decode(part.get_content_charset() or "utf-8", errors="replace")
if isinstance(payload, str):
return payload
return None
def _parse_rfc2822_lines(body: str) -> list[Activity]:
"""Try each line-based body format (v1/v2) and return matches.
Corresponds to `_extract_position_v1` and `_extract_position_v2` in
the upstream parser. Returns a one-element list on success, `[]`
otherwise. v3/v4 are not ported: no surviving fixtures exist and
the HTML fallback covers newer formats.
"""
for parser in (_try_v2, _try_v1):
result = parser(body)
if result is not None:
return [result]
return []
def _try_v2(body: str) -> Activity | None:
"""Parse body with v2 layout: `Date: DD Month` on line 2, year on line 3."""
lines = body.splitlines()
if len(lines) < 6:
return None
try:
day_str, month = lines[2].split()[-2:]
year = lines[3].split()[0]
on_date = datetime.strptime(f"{day_str}-{month}-{year}", "%d-%B-%Y")
symbol = lines[4].split(":")[1].split()[0].strip()
unit_price = Decimal(lines[4].split(_CURRENCY_SIGN)[1].split()[0])
quantity = Decimal(lines[4].split("Bought")[1].split()[0])
except (ValueError, IndexError):
return None
return _build_activity(
on_date=on_date,
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
strategy="rfc2822-v2",
matched=lines[4],
)
def _try_v1(body: str) -> Activity | None:
"""Parse body with v1 layout: `Date: DD` on line 2, `Month YYYY` on line 3."""
lines = body.splitlines()
if len(lines) < 6:
return None
try:
day = int(lines[2].split("Date: ")[1])
# Take the first two whitespace-separated tokens ("Month YYYY").
month, year = lines[3].split()[:2]
on_date = datetime.strptime(f"{day}-{month}-{year}", "%d-%B-%Y")
symbol = lines[4].split(":")[1].split()[0].strip()
quantity = Decimal(lines[4].split("Bought")[1].split()[0])
price_str = lines[4].split("Bought")[1].split("@")[1].split()[0].split(_CURRENCY_SIGN)[1]
unit_price = Decimal(price_str)
except (ValueError, IndexError):
return None
return _build_activity(
on_date=on_date,
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
strategy="rfc2822-v1",
matched=lines[4],
)
def _parse_html_tables(body: str) -> list[Activity]:
"""Parse an HTML body with per-order nested summary tables.
Walks every leaf <table> (a table with no child tables); each leaf
carries one trade summary (ticker, bought line, total, ISIN + order
id). Tables that don't contain the expected shape are skipped, so a
partially corrupted email yields only its intact orders.
"""
soup = BeautifulSoup(body, "html.parser")
on_date = _extract_html_date(soup)
if on_date is None:
return []
activities: list[Activity] = []
for table in soup.find_all("table"):
if table.find("table") is not None:
continue
activity = _try_html_summary_table(table, on_date)
if activity is not None:
activities.append(activity)
return activities
def _extract_html_date(soup: BeautifulSoup) -> datetime | None:
match = _DATE_RE.search(soup.get_text(" ", strip=True))
if match is None:
return None
day, month, year = match.groups()
try:
return datetime.strptime(f"{day}-{month}-{year}", "%d-%B-%Y")
except ValueError:
return None
def _try_html_summary_table(nested: object, on_date: datetime) -> Activity | None:
"""Interpret a leaf <table> as a single trade summary.
Returns None if the table is structural (no "Bought N @ £P" row) or
any required field is missing.
"""
get_text = getattr(nested, "get_text", None)
if get_text is None:
return None
text = get_text(" ", strip=True)
bought = _BOUGHT_RE.search(text)
if bought is None:
return None
symbol = _extract_html_symbol(nested)
if symbol is None:
return None
quantity = Decimal(bought.group(1))
unit_price = Decimal(bought.group(2))
return _build_activity(
on_date=on_date,
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
strategy="html",
matched=text[:200],
)
def _extract_html_symbol(nested: object) -> str | None:
find_all = getattr(nested, "find_all", None)
if find_all is None:
return None
for cell in find_all("td"):
cell_text = cell.get_text(" ", strip=True)
m = _TICKER_RE.search(cell_text)
if m is not None:
return m.group(1)
return None
_CSV_CONTENT_TYPES = {"text/csv", "application/csv", "application/vnd.ms-excel"}
# Required columns for the CSV attachment strategy. IE has not (yet) sent
# CSV-attached statements in production — the column set here mirrors the
# upstream _extract_positions_csv contract (ticker, buy_price, num_shares,
# buy_date, currency) with modern names.
_CSV_COLUMNS = {"ticker", "unit_price", "quantity", "date", "currency"}
def _parse_csv_attachment(raw_email: bytes) -> list[Activity]:
"""Parse a CSV attachment from the email into Activity records.
Walks every MIME part, picks the first one with a CSV-ish content
type OR a `.csv` filename, and iterates its rows. Rows missing a
required column or with an unparseable number/date are skipped.
"""
msg = email.message_from_bytes(raw_email)
csv_text = _extract_csv_attachment_text(msg)
if csv_text is None:
return []
reader = csv.DictReader(io.StringIO(csv_text))
fieldnames = set(reader.fieldnames or [])
if not _CSV_COLUMNS.issubset(fieldnames):
return []
activities: list[Activity] = []
for row in reader:
activity = _csv_row_to_activity(row)
if activity is not None:
activities.append(activity)
return activities
def _extract_csv_attachment_text(msg: Message) -> str | None:
for part in msg.walk():
if not _looks_like_csv_part(part):
continue
payload = part.get_payload(decode=True)
if isinstance(payload, bytes):
return payload.decode(part.get_content_charset() or "utf-8", errors="replace")
if isinstance(payload, str):
return payload
return None
def _looks_like_csv_part(part: Message) -> bool:
if part.get_content_type() in _CSV_CONTENT_TYPES:
return True
filename = part.get_filename()
return isinstance(filename, str) and filename.lower().endswith(".csv")
def _csv_row_to_activity(row: dict[str, str]) -> Activity | None:
try:
on_date = datetime.strptime(row["date"], "%Y-%m-%d")
symbol = row["ticker"].strip()
quantity = Decimal(row["quantity"])
unit_price = Decimal(row["unit_price"])
currency = row["currency"].strip() or "GBP"
except (KeyError, ValueError, InvalidOperation):
return None
if not symbol or currency != "GBP":
return None
return _build_activity(
on_date=on_date,
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
strategy="csv",
matched=f"{symbol},{unit_price},{quantity},{row['date']}",
)
def _build_activity(
*,
on_date: datetime,
symbol: str,
quantity: Decimal,
unit_price: Decimal,
strategy: str,
matched: str,
) -> Activity:
fingerprint = _fingerprint(on_date, symbol, quantity, unit_price)
return Activity(
external_id=f"invest-engine:{fingerprint}",
account_id=_ACCOUNT_ID,
account_type=AccountType.ISA,
date=on_date,
activity_type=ActivityType.BUY,
currency="GBP",
symbol=symbol,
quantity=quantity,
unit_price=unit_price,
notes=f"[{strategy}] {matched.strip()}",
)
def _fingerprint(date: datetime, symbol: str, quantity: Decimal, unit_price: Decimal) -> str:
key = f"{date.isoformat()}|{symbol}|{quantity}|{unit_price}"
return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
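
The three regexes behind the HTML strategy can be exercised without any email plumbing. The sketch below is illustrative only: `parse_summary` is a hypothetical helper, the input strings are made-up samples that follow the shapes named in the comments above, and `%B` month parsing assumes an English locale.

```python
from __future__ import annotations

import re
from datetime import datetime
from decimal import Decimal

BOUGHT_RE = re.compile(
    r"Bought\s+([0-9]+(?:\.[0-9]+)?)\s*@\s*£([0-9]+(?:\.[0-9]+)?)", re.IGNORECASE
)
TICKER_RE = re.compile(r":\s*([A-Z][A-Z0-9]{1,9})\s*$")
DATE_RE = re.compile(r"Date:\s*([0-9]{1,2})\s+([A-Za-z]+)\s+([0-9]{4})", re.IGNORECASE)


def parse_summary(
    date_text: str, cell_text: str, row_text: str
) -> tuple[datetime, str, Decimal, Decimal] | None:
    """Return (date, ticker, quantity, unit_price), or None if any field is missing."""
    date_m = DATE_RE.search(date_text)
    ticker_m = TICKER_RE.search(cell_text)
    bought_m = BOUGHT_RE.search(row_text)
    if date_m is None or ticker_m is None or bought_m is None:
        return None
    day, month, year = date_m.groups()
    try:
        on_date = datetime.strptime(f"{day}-{month}-{year}", "%d-%B-%Y")
    except ValueError:
        # Unknown month name or impossible date: treat as unparseable.
        return None
    return on_date, ticker_m.group(1), Decimal(bought_m.group(1)), Decimal(bought_m.group(2))
```

Returning `None` on any missing field mirrors the module's approach of skipping tables that do not carry a complete trade summary.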

@ -0,0 +1,240 @@
"""Schwab workplace-RSU email parser.
Two email shapes are handled:
1. Trade confirmations (sell-to-cover or user-initiated trades): HTML
with five `<td class="dark-background-body" align="right">` cells
holding date / direction / quantity / ticker / price. Yields one Activity.
2. Release Confirmations (RSU vest events): subject/body mentions
"Release Confirmation" or "Award Vesting"; body lists vest date,
shares released, FMV, shares sold to cover, and USD tax withheld.
Yields two Activities and a VestEvent: the gross vest (BUY at FMV),
the sell-to-cover (SELL at FMV), and a standalone VestEvent for the
payslip-ingest reconciliation pipeline.
On any parse failure we return the neutral empty result (no Activities,
no VestEvent); an unparseable email shouldn't crash the IMAP batch.
"""
from __future__ import annotations
import logging
import re
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from bs4 import BeautifulSoup
from dateutil import parser as dateparser
from broker_sync.models import AccountType, Activity, ActivityType, VestEvent
log = logging.getLogger(__name__)
_ACCOUNT_ID = "schwab-workplace"
_DEFAULT_CURRENCY = "USD"
# Vest-confirmation emails reliably include one of these phrases. Matching
# is case-insensitive and on the raw HTML (cheap — no DOM parse needed).
_VEST_SUBJECT_RE = re.compile(r"Release Confirmation|Award Vesting|RSU Release",
re.IGNORECASE)
@dataclass
class VestParseResult:
activities: list[Activity]
vest_event: VestEvent | None
def parse_schwab_email(raw_html: str) -> list[Activity]:
"""Return a single-item list of Activity on success, empty on failure.
For vest-confirmation emails, returns the two Activity rows (gross
vest + sell-to-cover). Use `parse_schwab_email_full` when the caller
also needs the VestEvent.
"""
return parse_schwab_email_full(raw_html).activities
def parse_schwab_email_full(raw_html: str) -> VestParseResult:
"""Full parse — returns activities + optional VestEvent.
Dispatches: vest-confirmation emails go to `_parse_vest_release`;
everything else goes to the legacy single-row confirmation parser.
"""
if _VEST_SUBJECT_RE.search(raw_html):
result = _parse_vest_release(raw_html)
if result is not None:
return result
log.warning("schwab: detected vest email but could not extract fields; "
"add a real fixture to broker-sync/tests/fixtures/")
return VestParseResult(activities=[], vest_event=None)
return VestParseResult(activities=_parse_trade_confirmation(raw_html), vest_event=None)
def _parse_trade_confirmation(raw_html: str) -> list[Activity]:
"""Legacy 5-cell trade confirmation parser."""
try:
soup = BeautifulSoup(raw_html, "html.parser")
cells = [
td.get_text(strip=True) for td in soup.find_all("td", {
"class": "dark-background-body",
"align": "right"
})
]
if len(cells) < 5:
return []
date_txt, direction_txt, qty_txt, ticker, price_txt = cells[:5]
trade_date = dateparser.parse(date_txt)
direction = (ActivityType.SELL
if direction_txt.strip().lower() == "sold" else ActivityType.BUY)
quantity = Decimal(qty_txt.replace(",", "").strip())
# Price like "$123.45" — strip the currency sign and parse the numeric tail.
# Handle "£", "€", "USD", etc. by taking the last numeric span.
price_clean = price_txt
for sign in ("$", "£", "€", "USD", "GBP", "EUR"):
price_clean = price_clean.replace(sign, "")
unit_price = Decimal(price_clean.replace(",", "").strip())
external_id = (f"schwab:{trade_date.date().isoformat()}:{ticker}:"
f"{direction.value}:{quantity}")
return [
Activity(
external_id=external_id,
account_id=_ACCOUNT_ID,
account_type=AccountType.GIA,
date=trade_date,
activity_type=direction,
symbol=ticker.strip(),
quantity=quantity,
unit_price=unit_price,
currency=_DEFAULT_CURRENCY,
notes=f"schwab-email:{direction_txt}",
)
]
except (ValueError, InvalidOperation, IndexError, AttributeError):
return []
# Heuristic extractors for vest-release emails. Labels observed in public
# Schwab RSU release samples; real fixture needed to tighten these.
_VEST_DATE_RE = re.compile(
r"(?:Release Date|Vest Date|Vesting Date)\s*[:<][^0-9]*"
r"(\d{1,2}[\s/\-][A-Za-z]{3}[\s/\-]\d{2,4}|\d{2}/\d{2}/\d{4}|\d{4}-\d{2}-\d{2})",
re.IGNORECASE)
_VEST_TICKER_RE = re.compile(r"(?:Ticker|Symbol)\s*[:<]\s*([A-Z]{2,5})",
re.IGNORECASE)
_VEST_SHARES_RELEASED_RE = re.compile(
r"(?:Shares Released|Total Shares (?:Released|Vested))\s*[:<]\s*"
r"([\d,]+(?:\.\d+)?)",
re.IGNORECASE)
_VEST_SHARES_WITHHELD_RE = re.compile(
r"(?:Shares (?:Withheld|Sold)(?: for Taxes)?)\s*[:<]\s*"
r"([\d,]+(?:\.\d+)?)",
re.IGNORECASE)
_VEST_FMV_RE = re.compile(
r"(?:Market Price|FMV|Fair Market Value)\s*[:<]\s*"
r"\$?\s*([\d,]+(?:\.\d+)?)",
re.IGNORECASE)
_VEST_TAX_USD_RE = re.compile(
r"(?:Tax Withholding Amount|Total Tax Withholding|Tax Withheld)\s*[:<]\s*"
r"\$?\s*([\d,]+(?:\.\d+)?)",
re.IGNORECASE)
def _parse_vest_release(raw_html: str) -> VestParseResult | None:
"""Best-effort extraction from a Schwab Release Confirmation email.
Runs label regexes on the plain-text view of the HTML. Returns None
(signalling fall-through) if the core four fields (date, ticker,
shares released, FMV) don't all resolve — that's a strong signal the
heuristics need a real fixture before they can be trusted on a live
email.
"""
try:
soup = BeautifulSoup(raw_html, "html.parser")
text = soup.get_text(" ", strip=True)
except Exception:
return None
date_str = _search_group(_VEST_DATE_RE, text)
ticker = _search_group(_VEST_TICKER_RE, text)
shares_released_str = _search_group(_VEST_SHARES_RELEASED_RE, text)
fmv_str = _search_group(_VEST_FMV_RE, text)
if not (date_str and ticker and shares_released_str and fmv_str):
return None
try:
vest_date = dateparser.parse(date_str)
shares_vested = Decimal(shares_released_str.replace(",", ""))
fmv = Decimal(fmv_str.replace(",", ""))
except (ValueError, InvalidOperation):
return None
shares_sold_str = _search_group(_VEST_SHARES_WITHHELD_RE, text)
shares_sold_to_cover = (Decimal(shares_sold_str.replace(",", ""))
if shares_sold_str else None)
tax_usd_str = _search_group(_VEST_TAX_USD_RE, text)
tax_withheld_usd = (Decimal(tax_usd_str.replace(",", ""))
if tax_usd_str else None)
external_id = (f"schwab:{vest_date.date().isoformat()}:{ticker}:VEST:"
f"{shares_vested}")
vest_event = VestEvent(
external_id=external_id,
vest_date=vest_date,
ticker=ticker,
shares_vested=shares_vested,
shares_sold_to_cover=shares_sold_to_cover,
fmv_at_vest_usd=fmv,
tax_withheld_usd=tax_withheld_usd,
source="schwab_email",
raw={
"date": date_str,
"ticker": ticker,
"shares_released": shares_released_str,
"fmv": fmv_str,
"shares_withheld": shares_sold_str or "",
"tax_withheld": tax_usd_str or "",
},
)
# Sibling Activities for Wealthfolio: full vest as BUY, sell-to-cover
# slice as SELL, both at the same FMV so net cash = 0 on that day.
activities: list[Activity] = [
Activity(
external_id=f"{external_id}:BUY",
account_id=_ACCOUNT_ID,
account_type=AccountType.GIA,
date=vest_date,
activity_type=ActivityType.BUY,
symbol=ticker,
quantity=shares_vested,
unit_price=fmv,
currency=_DEFAULT_CURRENCY,
notes="schwab-vest-release",
)
]
if shares_sold_to_cover is not None and shares_sold_to_cover > 0:
activities.append(
Activity(
external_id=f"{external_id}:SELL_TO_COVER",
account_id=_ACCOUNT_ID,
account_type=AccountType.GIA,
date=vest_date,
activity_type=ActivityType.SELL,
symbol=ticker,
quantity=shares_sold_to_cover,
unit_price=fmv,
currency=_DEFAULT_CURRENCY,
notes="schwab-sell-to-cover",
))
return VestParseResult(activities=activities, vest_event=vest_event)
def _search_group(pattern: re.Pattern[str], text: str) -> str | None:
m = pattern.search(text)
return m.group(1).strip() if m else None
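
The label-regex approach used for vest emails can be tried on plain text. This is a rough sketch, not the module itself: `extract_vest_numbers` and `to_decimal` are hypothetical helpers, only three of the six patterns are reproduced, and the sample text in the usage note is invented because no real fixture exists yet.

```python
from __future__ import annotations

import re
from decimal import Decimal

SHARES_RELEASED_RE = re.compile(
    r"(?:Shares Released|Total Shares (?:Released|Vested))\s*[:<]\s*([\d,]+(?:\.\d+)?)",
    re.IGNORECASE)
SHARES_WITHHELD_RE = re.compile(
    r"(?:Shares (?:Withheld|Sold)(?: for Taxes)?)\s*[:<]\s*([\d,]+(?:\.\d+)?)",
    re.IGNORECASE)
FMV_RE = re.compile(
    r"(?:Market Price|FMV|Fair Market Value)\s*[:<]\s*\$?\s*([\d,]+(?:\.\d+)?)",
    re.IGNORECASE)


def search_group(pattern: re.Pattern[str], text: str) -> str | None:
    m = pattern.search(text)
    return m.group(1).strip() if m else None


def to_decimal(raw: str | None) -> Decimal | None:
    """Strip thousands separators; None passes through."""
    return Decimal(raw.replace(",", "")) if raw else None


def extract_vest_numbers(text: str) -> dict[str, Decimal | None] | None:
    """Require shares released and FMV; sell-to-cover stays optional."""
    released = to_decimal(search_group(SHARES_RELEASED_RE, text))
    fmv = to_decimal(search_group(FMV_RE, text))
    if released is None or fmv is None:
        return None
    return {
        "shares_vested": released,
        "fmv": fmv,
        "shares_sold_to_cover": to_decimal(search_group(SHARES_WITHHELD_RE, text)),
    }
```

For example, `extract_vest_numbers("Shares Released: 1,250 Fair Market Value: $1,234.56 Shares Withheld for Taxes: 590")` resolves all three fields, while text lacking the required labels yields `None`, matching the fall-through behaviour described in `_parse_vest_release`.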

@ -1,9 +1,8 @@
from __future__ import annotations
import csv
import io
import json
from collections.abc import Iterable
from datetime import UTC
from pathlib import Path
from typing import Any
@ -80,17 +79,18 @@ class WealthfolioSink:
self._session_path.write_text(json.dumps({"cookies": cookies}))
async def login(self) -> None:
# Wealthfolio 3.2's LoginRequest is `{ password: String }` only — a
# username key is rejected as an unknown field (HTTP 400). The
# `username` constructor arg is kept for a future Wealthfolio
# release that may add multi-user support.
resp = await self._client.post(
_LOGIN_PATH,
json={
"username": self._username,
"password": self._password
},
json={"password": self._password},
)
if resp.status_code == 401:
raise WealthfolioUnauthorizedError("Wealthfolio /auth/login returned 401")
resp.raise_for_status()
cookies = {k: v for k, v in resp.cookies.items()}
cookies = dict(resp.cookies.items())
if not cookies:
raise WealthfolioError("/auth/login returned 2xx but no Set-Cookie")
self._save_cookies(cookies)
@ -121,42 +121,87 @@ class WealthfolioSink:
assert isinstance(raw, list)
return raw
async def ensure_account(self, account: Account) -> None:
async def ensure_account(self, account: Account) -> str:
"""Idempotently create the account and return Wealthfolio's UUID for it.
Wealthfolio generates its own UUIDs on POST /accounts, ignoring any
`id` we supply. We identify accounts by (provider, providerAccountId)
which Wealthfolio DOES preserve verbatim. Our own Account.id is
used as the providerAccountId.
"""
existing = await self.list_accounts()
if any(a.get("id") == account.id for a in existing):
return
for a in existing:
if (a.get("provider") == account.provider and a.get("providerAccountId") == account.id):
wf_id = a.get("id")
assert isinstance(wf_id, str)
return wf_id
# NewAccount is camelCase with required booleans.
# See apps/server/src/models.rs#NewAccount.
resp = await self._request(
"POST",
_ACCOUNTS_PATH,
json={
"id": account.id,
"name": account.name,
"account_type": str(account.account_type),
"accountType": str(account.account_type),
"currency": account.currency,
"isDefault": False,
"isActive": True,
"isArchived": False,
"trackingMode": "TRANSACTIONS",
"provider": account.provider,
"providerAccountId": account.id,
},
)
resp.raise_for_status()
        created = resp.json()
        wf_id = created.get("id")
        if not isinstance(wf_id, str):
            raise WealthfolioError(f"POST /accounts returned no id: {created}")
        return wf_id

    # -- activity import --

    @staticmethod
    def _activities_csv(activities: Iterable[Activity]) -> str:
        activities = list(activities)
        if not activities:
            return ""
        rows = [a.to_wealthfolio_csv_row() for a in activities]
        buf = io.StringIO()
        w = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
        w.writeheader()
        w.writerows(rows)
        return buf.getvalue()
    def _activity_to_import_row(a: Activity) -> dict[str, Any]:
        """Match Wealthfolio's ActivityImport struct (camelCase JSON)."""
        # WF /import rejects naive datetimes with "Invalid date" (even though
        # /import/check accepts them) — coerce to UTC if tzinfo is missing.
        date = a.date if a.date.tzinfo is not None else a.date.replace(tzinfo=UTC)
        row: dict[str, Any] = {
            "date": date.isoformat(),
            "symbol": a.symbol or "$CASH",
            "activityType": str(a.activity_type),
            "currency": a.currency,
            "accountId": a.account_id,
            # Required booleans on ActivityImport (no defaults in Rust struct).
            "isDraft": False,
            "isValid": True,
        }
        if a.quantity is not None:
            row["quantity"] = format(a.quantity, "f")
        if a.unit_price is not None:
            row["unitPrice"] = format(a.unit_price, "f")
        if a.amount is not None:
            row["amount"] = format(a.amount, "f")
        if a.fee:
            row["fee"] = format(a.fee, "f")
        if a.notes:
            row["comment"] = a.notes
        return row

    async def import_activities(self, activities: Iterable[Activity]) -> list[dict[str, Any]]:
        csv_text = self._activities_csv(activities)
        files = {"file": ("activities.csv", csv_text, "text/csv")}
        rows = [self._activity_to_import_row(a) for a in activities]
        if not rows:
            return []
        check = await self._request("POST", _IMPORT_CHECK, files=files)
        # Step 1 — /import/check hydrates each row with resolved asset_id,
        # exchange_mic, quote_ccy, instrument_type, quote_mode (and flags
        # errors). The /import endpoint on Wealthfolio 3.2+ DOES NOT
        # re-resolve — if we send the un-enriched row the activity is
        # silently dropped (import returns 200 OK with activities=[] in
        # the payload). We must feed check's output into import.
        check = await self._request("POST", _IMPORT_CHECK, json={"activities": rows})
        if check.status_code >= 400:
            try:
                payload = check.json()
@@ -164,9 +209,56 @@ class WealthfolioSink:
                payload = {"raw": check.text}
            raise ImportValidationError(f"Wealthfolio /import/check rejected: {payload}")
        # Re-send the same CSV to the real endpoint.
        real = await self._request("POST", _IMPORT_REAL, files=files)
        checked = check.json()
        if not isinstance(checked, list):
            raise ImportValidationError(
                f"Wealthfolio /import/check returned non-list: {type(checked).__name__}")
        invalid = [r for r in checked if isinstance(r, dict) and r.get("errors")]
        if invalid:
            raise ImportValidationError(f"Wealthfolio /import/check flagged {len(invalid)} row(s); "
                                        f"first: {invalid[0]}")
        # Drop any row the server marked is_valid=false (shouldn't happen
        # without errors, but defensive).
        valid_rows = [r for r in checked if isinstance(r, dict) and r.get("isValid")]
        real = await self._request("POST", _IMPORT_REAL, json={"activities": valid_rows})
        real.raise_for_status()
        raw = real.json()
        assert isinstance(raw, list)
        return raw
        # Two observed response shapes:
        # - {activities:[...], importRunId:"...", summary:{total,imported,skipped,...}}
        # - bare list (older builds)
        if isinstance(raw, dict) and "activities" in raw:
            got = raw["activities"]
            summary = raw.get("summary") if isinstance(raw.get("summary"), dict) else None
        elif isinstance(raw, list):
            got = raw
            summary = None
        else:
            got = []
            summary = None
        # Summary.imported is THE truth. The `activities` field echoes input
        # with errors annotated — its length equals input even when zero
        # actually persisted.
        if summary is not None:
            imported_n = int(summary.get("imported", 0))
            total_n = int(summary.get("total", len(valid_rows)))
            if imported_n < total_n:
                err_msg = summary.get("errorMessage") or "no errorMessage"
                skipped = int(summary.get("skipped", 0))
                dupes = int(summary.get("duplicates", 0))
                raise ImportValidationError(f"Wealthfolio /import persisted {imported_n}/{total_n} "
                                            f"(skipped={skipped} duplicates={dupes}). "
                                            f"errorMessage: {err_msg}")
        # Legacy silent-drop guard for no-summary responses.
        elif valid_rows and not got:
            first_warn = next(
                (r.get("warnings") for r in checked if isinstance(r, dict) and r.get("warnings")),
                None,
            )
            raise ImportValidationError(
                f"Wealthfolio /import silently dropped all {len(valid_rows)} rows. "
                f"First checked row: {checked[0] if checked else 'none'}. "
                f"First warning: {first_warn}")
        assert isinstance(got, list)
        return [r for r in got if isinstance(r, dict)]

@@ -0,0 +1,111 @@
# Fidelity UK PlanViewer provider
Viktor's UK workplace pension is hosted at `pv.planviewer.fidelity.co.uk`. There
is no public API for individual members — the provider reverse-engineers the
private JSON backend at `prd.wiciam.fidelity.co.uk/cvmfe/api/*` that the SPA
itself calls, and uses Playwright only to keep a long-lived login session
alive.
## Architecture
```
┌─────────────┐    storage_state.json     ┌──────────────────┐
│ Vault KV    │◀─── (quarterly reseed) ───│ fidelity-seed    │
│ broker-sync │                           │ (headed browser) │
└──────┬──────┘                           └──────────────────┘
       │                                           ▲
       │ loads on start                            │ Viktor runs once
       ▼                                             when session expires
┌──────────────────────┐
│ Monthly CronJob      │
│ broker-sync-fidelity │
└────────────┬─────────┘
             │ headless Chromium
┌─────────────────────────────────┐      ┌────────────────────────────────┐
│ pv.planviewer.fidelity.co.uk    │◀─────│ navigate dashboard → capture   │
│ (SPA)                           │      │ fresh sid/fid/tbid/rid headers │
└─────────────────────────────────┘      └──────────────┬─────────────────┘
                                            ┌───────────▼─────────────┐
                                            │ httpx JSON calls        │
                                            │ prd.wiciam.../cvmfe/api │
                                            └───────────┬─────────────┘
                                   ┌────────────────────▼────────────────────┐
                                   │ DEPOSIT × N (employee + employer)       │
                                   │ BUY × N (fund unit purchases, per date) │
                                   └────────────────────┬────────────────────┘
                                       ┌────────────────▼────────────────┐
                                       │ Wealthfolio account             │
                                       │ type = WORKPLACE_PENSION        │
                                       │ currency = GBP                  │
                                       └─────────────────────────────────┘
```
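
The header-capture step in the diagram can be sketched as a small pure function. This is an illustration only: the helper name `extract_session_headers` is hypothetical, and it assumes the four values travel as request headers literally named `sid`, `fid`, `tbid` and `rid`, which the diagram suggests but does not spell out.

```python
from collections.abc import Mapping

# Names taken from the diagram; treating them as literal header names is an assumption.
SESSION_HEADER_NAMES = ("sid", "fid", "tbid", "rid")


def extract_session_headers(request_headers: Mapping[str, str]) -> dict[str, str]:
    """Pick the session headers out of one captured SPA request (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    missing = [name for name in SESSION_HEADER_NAMES if not lowered.get(name)]
    if missing:
        # Fail loudly rather than call the JSON backend with a partial session.
        raise ValueError(f"captured request lacks session headers: {missing}")
    return {name: lowered[name] for name in SESSION_HEADER_NAMES}
```

The returned dict could then be supplied as the `headers=` argument of an `httpx.AsyncClient` for the JSON calls.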
## One-time seed (Viktor)
```bash
# on your laptop (macOS / Linux with a desktop):
cd broker-sync
poetry install
poetry run playwright install chromium
poetry run broker-sync fidelity-seed --out /tmp/fidelity_storage_state.json
# chromium opens — log in to PlanViewer, tick "Remember device", press Enter
# stage to Vault
vault kv patch secret/broker-sync \
fidelity_storage_state=@/tmp/fidelity_storage_state.json \
fidelity_plan_id=<your-plan-id>
rm /tmp/fidelity_storage_state.json # don't leave credentials lying around
```
Re-seed when the monthly CronJob fails with `FidelitySessionError` (expect
every 30-90 days, depending on how long Fidelity honours the remember-device
cookie).
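
To anticipate a re-seed instead of waiting for the failure, the saved state can be inspected for cookie expiry. A minimal sketch, assuming the file follows Playwright's storage-state layout (a top-level `cookies` list whose entries carry `name` and a Unix-seconds `expires`, with `-1` for session cookies). The helper name is hypothetical, and which cookie Fidelity treats as the remember-device token is not known here, so the soonest expiry is only a rough lower bound.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone


def soonest_cookie_expiry(storage_state_json: str) -> tuple[str, datetime] | None:
    """Return (cookie name, UTC expiry) of the persistent cookie that expires first."""
    state = json.loads(storage_state_json)
    # Session cookies are stored with expires == -1; skip them.
    persistent = [c for c in state.get("cookies", []) if c.get("expires", -1) > 0]
    if not persistent:
        return None
    first = min(persistent, key=lambda c: c["expires"])
    return first["name"], datetime.fromtimestamp(first["expires"], tz=timezone.utc)
```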
## One-time backfill
```bash
kubectl -n broker-sync create job fidelity-backfill \
--from=cronjob/broker-sync-fidelity
kubectl -n broker-sync logs -f job/fidelity-backfill
# expect: fidelity-ingest: fetched=N new=N imported=N failed=0
```
## Monthly cron
- Schedule: `0 3 5 * *` (3am UTC on the 5th of each month — after mid-month payroll settles in Viktor's scheme)
- CronJob: `broker-sync-fidelity` in namespace `broker-sync`
- Resource: small, ≤512 MiB memory (Chromium for ~2 min, then idle)
- Alert: `BrokerSyncFidelityFailed` fires on 2 consecutive failures
## Runbook — `BrokerSyncFidelityFailed`
1. Check pod logs: `kubectl -n broker-sync logs job/broker-sync-fidelity-<timestamp>`.
2. If the error is `FidelitySessionError`: session expired, re-run the seed on
Viktor's laptop (see above).
3. If the error is a 404 / 5xx from `prd.wiciam.fidelity.co.uk`: likely an API
path change. Check DevTools for the new endpoint, update the provider, ship
a new image.
4. If Playwright can't launch Chromium: check that the image still has Chromium
installed (`playwright install chromium` at build time).
## Data model notes
- **Salary sacrifice scheme**: all employee + employer contributions are
pre-tax from gross salary. No HMRC basic-rate relief line.
- Emits two `DEPOSIT` activities per month (employee, employer), with `comment`
carrying the source tag `fidelity:<doc-id>:<source>` for audit.
- Emits one `BUY` per fund unit purchase, `symbol` = Fidelity fund code / ISIN.
Units × unit price should reconcile to the cash deposited ±pennies.
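
The reconciliation rule in the last bullet can be sketched as follows. This is a sketch under stated assumptions: the function name `reconciles` and the 5p tolerance are hypothetical (the note above only says "±pennies"), and the figures in the usage line are made up for illustration.

```python
from decimal import Decimal
from typing import Iterable


def reconciles(
    deposits: Iterable[Decimal],
    buys: Iterable[tuple[Decimal, Decimal]],
    tolerance: Decimal = Decimal("0.05"),  # assumed meaning of "±pennies"
) -> bool:
    """Check that sum(units * unit_price) matches the cash deposited."""
    cash_in = sum(deposits, Decimal("0"))
    invested = sum((units * price for units, price in buys), Decimal("0"))
    return abs(cash_in - invested) <= tolerance


# Illustrative figures only: 200.00 + 400.00 deposited, 195.69 units bought at 3.066.
ok = reconciles(
    [Decimal("200.00"), Decimal("400.00")],
    [(Decimal("195.69"), Decimal("3.066"))],
)
```

Using `Decimal` throughout avoids binary floating-point drift when comparing amounts at penny precision.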
## Not yet implemented
- Endpoint paths: waiting on Viktor's DevTools POST cURL for the transactions +
holdings views. Until those are supplied, `fidelity-ingest` raises
`FidelityProviderConfigError` to fail loudly.
- Infra: CronJob + Vault secret wiring + Prometheus alert in
`infra/stacks/broker-sync/main.tf` — pending first successful manual run.

poetry.lock generated
@@ -1,5 +1,24 @@
# This file is automatically @generated by Poetry 2.1.3 and should not be changed by hand.
[[package]]
name = "aiomysql"
version = "0.3.2"
description = "MySQL driver for asyncio."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "aiomysql-0.3.2-py3-none-any.whl", hash = "sha256:c82c5ba04137d7afd5c693a258bea8ead2aad77101668044143a991e04632eb2"},
{file = "aiomysql-0.3.2.tar.gz", hash = "sha256:72d15ef5cfc34c03468eb41e1b90adb9fd9347b0b589114bd23ead569a02ac1a"},
]
[package.dependencies]
PyMySQL = ">=1.0"
[package.extras]
rsa = ["PyMySQL[rsa] (>=1.0)"]
sa = ["sqlalchemy (>=1.3,<1.4)"]
[[package]]
name = "anyio"
version = "4.13.0"
@@ -82,6 +101,79 @@ files = [
]
markers = {main = "platform_system == \"Windows\"", dev = "sys_platform == \"win32\""}
[[package]]
name = "greenlet"
version = "3.4.0"
description = "Lightweight in-process concurrent programming"
optional = false
python-versions = ">=3.10"
groups = ["main"]
files = [
{file = "greenlet-3.4.0-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:d18eae9a7fb0f499efcd146b8c9750a2e1f6e0e93b5a382b3481875354a430e6"},
{file = "greenlet-3.4.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:636d2f95c309e35f650e421c23297d5011716be15d966e6328b367c9fc513a82"},
{file = "greenlet-3.4.0-cp310-cp310-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:234582c20af9742583c3b2ddfbdbb58a756cfff803763ffaae1ac7990a9fac31"},
{file = "greenlet-3.4.0-cp310-cp310-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:ac6a5f618be581e1e0713aecec8e54093c235e5fa17d6d8eb7ffc487e2300508"},
{file = "greenlet-3.4.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:523677e69cd4711b5a014e37bc1fb3a29947c3e3a5bb6a527e1cc50312e5a398"},
{file = "greenlet-3.4.0-cp310-cp310-manylinux_2_39_riscv64.whl", hash = "sha256:d336d46878e486de7d9458653c722875547ac8d36a1cff9ffaf4a74a3c1f62eb"},
{file = "greenlet-3.4.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:b45e45fe47a19051a396abb22e19e7836a59ee6c5a90f3be427343c37908d65b"},
{file = "greenlet-3.4.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:5434271357be07f3ad0936c312645853b7e689e679e29310e2de09a9ea6c3adf"},
{file = "greenlet-3.4.0-cp310-cp310-win_amd64.whl", hash = "sha256:a19093fbad824ed7c0f355b5ff4214bffda5f1a7f35f29b31fcaa240cc0135ab"},
{file = "greenlet-3.4.0-cp311-cp311-macosx_11_0_universal2.whl", hash = "sha256:805bebb4945094acbab757d34d6e1098be6de8966009ab9ca54f06ff492def58"},
{file = "greenlet-3.4.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:439fc2f12b9b512d9dfa681c5afe5f6b3232c708d13e6f02c845e0d9f4c2d8c6"},
{file = "greenlet-3.4.0-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a70ed1cb0295bee1df57b63bf7f46b4e56a5c93709eea769c1fec1bb23a95875"},
{file = "greenlet-3.4.0-cp311-cp311-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:8c5696c42e6bb5cfb7c6ff4453789081c66b9b91f061e5e9367fa15792644e76"},
{file = "greenlet-3.4.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c660bce1940a1acae5f51f0a064f1bc785d07ea16efcb4bc708090afc4d69e83"},
{file = "greenlet-3.4.0-cp311-cp311-manylinux_2_39_riscv64.whl", hash = "sha256:89995ce5ddcd2896d89615116dd39b9703bfa0c07b583b85b89bf1b5d6eddf81"},
{file = "greenlet-3.4.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:ee407d4d1ca9dc632265aee1c8732c4a2d60adff848057cdebfe5fe94eb2c8a2"},
{file = "greenlet-3.4.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:956215d5e355fffa7c021d168728321fd4d31fd730ac609b1653b450f6a4bc71"},
{file = "greenlet-3.4.0-cp311-cp311-win_amd64.whl", hash = "sha256:5cb614ace7c27571270354e9c9f696554d073f8aa9319079dcba466bbdead711"},
{file = "greenlet-3.4.0-cp311-cp311-win_arm64.whl", hash = "sha256:04403ac74fe295a361f650818de93be11b5038a78f49ccfb64d3b1be8fbf1267"},
{file = "greenlet-3.4.0-cp312-cp312-macosx_11_0_universal2.whl", hash = "sha256:1a54a921561dd9518d31d2d3db4d7f80e589083063ab4d3e2e950756ef809e1a"},
{file = "greenlet-3.4.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:16dec271460a9a2b154e3b1c2fa1050ce6280878430320e85e08c166772e3f97"},
{file = "greenlet-3.4.0-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:90036ce224ed6fe75508c1907a77e4540176dcf0744473627785dd519c6f9996"},
{file = "greenlet-3.4.0-cp312-cp312-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:6f0def07ec9a71d72315cf26c061aceee53b306c36ed38c35caba952ea1b319d"},
{file = "greenlet-3.4.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a1c4f6b453006efb8310affb2d132832e9bbb4fc01ce6df6b70d810d38f1f6dc"},
{file = "greenlet-3.4.0-cp312-cp312-manylinux_2_39_riscv64.whl", hash = "sha256:0e1254cf0cbaa17b04320c3a78575f29f3c161ef38f59c977108f19ffddaf077"},
{file = "greenlet-3.4.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:9b2d9a138ffa0e306d0e2b72976d2fb10b97e690d40ab36a472acaab0838e2de"},
{file = "greenlet-3.4.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:8424683caf46eb0eb6f626cb95e008e8cc30d0cb675bdfa48200925c79b38a08"},
{file = "greenlet-3.4.0-cp312-cp312-win_amd64.whl", hash = "sha256:a0a53fb071531d003b075c444014ff8f8b1a9898d36bb88abd9ac7b3524648a2"},
{file = "greenlet-3.4.0-cp312-cp312-win_arm64.whl", hash = "sha256:f38b81880ba28f232f1f675893a39cf7b6db25b31cc0a09bb50787ecf957e85e"},
{file = "greenlet-3.4.0-cp313-cp313-macosx_11_0_universal2.whl", hash = "sha256:43748988b097f9c6f09364f260741aa73c80747f63389824435c7a50bfdfd5c1"},
{file = "greenlet-3.4.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5566e4e2cd7a880e8c27618e3eab20f3494452d12fd5129edef7b2f7aa9a36d1"},
{file = "greenlet-3.4.0-cp313-cp313-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:1054c5a3c78e2ab599d452f23f7adafef55062a783a8e241d24f3b633ba6ff82"},
{file = "greenlet-3.4.0-cp313-cp313-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:98eedd1803353daf1cd9ef23eef23eda5a4d22f99b1f998d273a8b78b70dd47f"},
{file = "greenlet-3.4.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f82cb6cddc27dd81c96b1506f4aa7def15070c3b2a67d4e46fd19016aacce6cf"},
{file = "greenlet-3.4.0-cp313-cp313-manylinux_2_39_riscv64.whl", hash = "sha256:b7857e2202aae67bc5725e0c1f6403c20a8ff46094ece015e7d474f5f7020b55"},
{file = "greenlet-3.4.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:227a46251ecba4ff46ae742bc5ce95c91d5aceb4b02f885487aff269c127a729"},
{file = "greenlet-3.4.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5b99e87be7eba788dd5b75ba1cde5639edffdec5f91fe0d734a249535ec3408c"},
{file = "greenlet-3.4.0-cp313-cp313-win_amd64.whl", hash = "sha256:849f8bc17acd6295fcb5de8e46d55cc0e52381c56eaf50a2afd258e97bc65940"},
{file = "greenlet-3.4.0-cp313-cp313-win_arm64.whl", hash = "sha256:9390ad88b652b1903814eaabd629ca184db15e0eeb6fe8a390bbf8b9106ae15a"},
{file = "greenlet-3.4.0-cp314-cp314-macosx_11_0_universal2.whl", hash = "sha256:10a07aca6babdd18c16a3f4f8880acfffc2b88dfe431ad6aa5f5740759d7d75e"},
{file = "greenlet-3.4.0-cp314-cp314-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:076e21040b3a917d3ce4ad68fb5c3c6b32f1405616c4a57aa83120979649bd3d"},
{file = "greenlet-3.4.0-cp314-cp314-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e82689eea4a237e530bb5cb41b180ef81fa2160e1f89422a67be7d90da67f615"},
{file = "greenlet-3.4.0-cp314-cp314-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:06c2d3b89e0c62ba50bd7adf491b14f39da9e7e701647cb7b9ff4c99bee04b19"},
{file = "greenlet-3.4.0-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4df3b0b2289ec686d3c821a5fee44259c05cfe824dd5e6e12c8e5f5df23085cf"},
{file = "greenlet-3.4.0-cp314-cp314-manylinux_2_39_riscv64.whl", hash = "sha256:070b8bac2ff3b4d9e0ff36a0d19e42103331d9737e8504747cd1e659f76297bd"},
{file = "greenlet-3.4.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:8bff29d586ea415688f4cec96a591fcc3bf762d046a796cdadc1fdb6e7f2d5bf"},
{file = "greenlet-3.4.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:8a569c2fb840c53c13a2b8967c63621fafbd1a0e015b9c82f408c33d626a2fda"},
{file = "greenlet-3.4.0-cp314-cp314-win_amd64.whl", hash = "sha256:207ba5b97ea8b0b60eb43ffcacf26969dd83726095161d676aac03ff913ee50d"},
{file = "greenlet-3.4.0-cp314-cp314-win_arm64.whl", hash = "sha256:f8296d4e2b92af34ebde81085a01690f26a51eb9ac09a0fcadb331eb36dbc802"},
{file = "greenlet-3.4.0-cp314-cp314t-macosx_11_0_universal2.whl", hash = "sha256:d70012e51df2dbbccfaf63a40aaf9b40c8bed37c3e3a38751c926301ce538ece"},
{file = "greenlet-3.4.0-cp314-cp314t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a58bec0751f43068cd40cff31bb3ca02ad6000b3a51ca81367af4eb5abc480c8"},
{file = "greenlet-3.4.0-cp314-cp314t-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:05fa0803561028f4b2e3b490ee41216a842eaee11aed004cc343a996d9523aa2"},
{file = "greenlet-3.4.0-cp314-cp314t-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:c4cd56a9eb7a6444edbc19062f7b6fbc8f287c663b946e3171d899693b1c19fa"},
{file = "greenlet-3.4.0-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e60d38719cb80b3ab5e85f9f1aed4960acfde09868af6762ccb27b260d68f4ed"},
{file = "greenlet-3.4.0-cp314-cp314t-manylinux_2_39_riscv64.whl", hash = "sha256:1f85f204c4d54134ae850d401fa435c89cd667d5ce9dc567571776b45941af72"},
{file = "greenlet-3.4.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:7f50c804733b43eded05ae694691c9aa68bca7d0a867d67d4a3f514742a2d53f"},
{file = "greenlet-3.4.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:2d4f0635dc4aa638cda4b2f5a07ae9a2cff9280327b581a3fcb6f317b4fbc38a"},
{file = "greenlet-3.4.0-cp314-cp314t-win_amd64.whl", hash = "sha256:1a4a48f24681300c640f143ba7c404270e1ebbbcf34331d7104a4ff40f8ea705"},
{file = "greenlet-3.4.0.tar.gz", hash = "sha256:f50a96b64dafd6169e595a5c56c9146ef80333e67d4476a65a9c55f400fc22ff"},
]
[package.extras]
docs = ["Sphinx", "furo"]
test = ["objgraph", "psutil", "setuptools"]
[[package]]
name = "h11"
version = "0.16.0"
@@ -428,6 +520,28 @@ files = [
{file = "platformdirs-4.9.6.tar.gz", hash = "sha256:3bfa75b0ad0db84096ae777218481852c0ebc6c727b3168c1b9e0118e458cf0a"},
]
[[package]]
name = "playwright"
version = "1.58.0"
description = "A high-level API to automate web browsers"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "playwright-1.58.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:96e3204aac292ee639edbfdef6298b4be2ea0a55a16b7068df91adac077cc606"},
{file = "playwright-1.58.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:70c763694739d28df71ed578b9c8202bb83e8fe8fb9268c04dd13afe36301f71"},
{file = "playwright-1.58.0-py3-none-macosx_11_0_universal2.whl", hash = "sha256:185e0132578733d02802dfddfbbc35f42be23a45ff49ccae5081f25952238117"},
{file = "playwright-1.58.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:c95568ba1eda83812598c1dc9be60b4406dffd60b149bc1536180ad108723d6b"},
{file = "playwright-1.58.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8f9999948f1ab541d98812de25e3a8c410776aa516d948807140aff797b4bffa"},
{file = "playwright-1.58.0-py3-none-win32.whl", hash = "sha256:1e03be090e75a0fabbdaeab65ce17c308c425d879fa48bb1d7986f96bfad0b99"},
{file = "playwright-1.58.0-py3-none-win_amd64.whl", hash = "sha256:a2bf639d0ce33b3ba38de777e08697b0d8f3dc07ab6802e4ac53fb65e3907af8"},
{file = "playwright-1.58.0-py3-none-win_arm64.whl", hash = "sha256:32ffe5c303901a13a0ecab91d1c3f74baf73b84f4bedbb6b935f5bc11cc98e1b"},
]
[package.dependencies]
greenlet = ">=3.1.1,<4.0.0"
pyee = ">=13,<14"
[[package]]
name = "pluggy"
version = "1.6.0"
@@ -444,6 +558,24 @@ files = [
dev = ["pre-commit", "tox"]
testing = ["coverage", "pytest", "pytest-benchmark"]
[[package]]
name = "pyee"
version = "13.0.1"
description = "A rough port of Node.js's EventEmitter to Python with a few tricks of its own"
optional = false
python-versions = ">=3.8"
groups = ["main"]
files = [
{file = "pyee-13.0.1-py3-none-any.whl", hash = "sha256:af2f8fede4171ef667dfded53f96e2ed0d6e6bd7ee3bb46437f77e3b57689228"},
{file = "pyee-13.0.1.tar.gz", hash = "sha256:0b931f7c14535667ed4c7e0d531716368715e860b988770fc7eb8578d1f67fc8"},
]
[package.dependencies]
typing-extensions = "*"
[package.extras]
dev = ["black", "build", "flake8", "flake8-black", "isort", "jupyter-console", "mkdocs", "mkdocs-include-markdown-plugin", "mkdocstrings[python]", "mypy", "pytest", "pytest-asyncio ; python_version >= \"3.4\"", "pytest-trio ; python_version >= \"3.7\"", "sphinx", "toml", "tox", "trio", "trio ; python_version > \"3.6\"", "trio-typing ; python_version > \"3.6\"", "twine", "twisted", "validate-pyproject[all]"]
[[package]]
name = "pygments"
version = "2.20.0"
@@ -459,6 +591,22 @@ files = [
[package.extras]
windows-terminal = ["colorama (>=0.4.6)"]
[[package]]
name = "pymysql"
version = "1.1.2"
description = "Pure Python MySQL Driver"
optional = false
python-versions = ">=3.8"
groups = ["main"]
files = [
{file = "pymysql-1.1.2-py3-none-any.whl", hash = "sha256:e6b1d89711dd51f8f74b1631fe08f039e7d76cf67a42a323d3178f0f25762ed9"},
{file = "pymysql-1.1.2.tar.gz", hash = "sha256:4961d3e165614ae65014e361811a724e2044ad3ea3739de9903ae7c21f539f03"},
]
[package.extras]
ed25519 = ["PyNaCl (>=1.4.0)"]
rsa = ["cryptography"]
[[package]]
name = "pytest"
version = "8.4.2"
@@ -628,6 +776,18 @@ rich = ">=10.11.0"
shellingham = ">=1.3.0"
typing-extensions = ">=3.7.4.3"
[[package]]
name = "types-python-dateutil"
version = "2.9.0.20260408"
description = "Typing stubs for python-dateutil"
optional = false
python-versions = ">=3.10"
groups = ["dev"]
files = [
{file = "types_python_dateutil-2.9.0.20260408-py3-none-any.whl", hash = "sha256:473139d514a71c9d1fbd8bb328974bedcb1cc3dba57aad04ffa4157f483c216f"},
{file = "types_python_dateutil-2.9.0.20260408.tar.gz", hash = "sha256:8b056ec01568674235f64ecbcef928972a5fac412f5aab09c516dfa2acfbb582"},
]
[[package]]
name = "typing-extensions"
version = "4.15.0"
@@ -658,4 +818,4 @@ platformdirs = ">=3.5.1"
[metadata]
lock-version = "2.1"
python-versions = ">=3.11,<3.13"
content-hash = "b9c19ac1963682740a98cd539d3790ff180c2e8195d5cfcc9572da855db3fa7d"
content-hash = "b3896b2258a425cce9498be9ada5bd48a06d5f2bd7c53ead044ad27c53086bd7"

@@ -13,6 +13,11 @@ beautifulsoup4 = "^4.12"
python-dateutil = "^2.9"
typer = "^0.12"
click = "<8.2" # typer 0.12 uses make_metavar() without ctx; click 8.2 made ctx required
aiomysql = "^0.3.2"
# Fidelity UK PlanViewer has no public API — we use Playwright only to keep a
# long-lived session alive (storage_state + device-trust cookie); actual data
# is fetched via httpx against the SPA's private JSON backend.
playwright = "^1.47"
[tool.poetry.group.dev.dependencies]
pytest = "^8.3"
@@ -20,6 +25,7 @@ pytest-asyncio = "^0.23"
mypy = "^1.11"
ruff = "^0.6"
yapf = "^0.43"
types-python-dateutil = "^2.9.0.20260408"
[tool.poetry.scripts]
broker-sync = "broker_sync.cli:app"

File diff suppressed because one or more lines are too long

@@ -0,0 +1,2 @@
{"valuations":[{"asset":{"assetId":[{"type":"FUND_CODE","value":"KDOA"}],"name":"Passive Global Equity Fund - Class 9"},"units":{"total":44920.21,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"BONW","type":"CONTRIBUTION_TYPE","name":"Bonus Waiver","unit":{"total":11490.84,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","unit":{"total":17148.27,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary Sacrifice","unit":{"total":11432.20,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"TREX","type":"CONTRIBUTION_TYPE","name":"Transfer In","unit":{"total":4848.90,"available":null,"crystallised":null,"uncrystallised":null}}]},"price":{"value":3.066,"datetime":"2026-04-17","currency":"GBP"},"valuation":{"total":137725.35,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"BONW","type":"CONTRIBUTION_TYPE","name":"Bonus Waiver","valuation":{"total":35230.91,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","valuation":{"total":52576.60,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary Sacrifice","valuation":{"total":35051.12,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"TREX","type":"CONTRIBUTION_TYPE","name":"Transfer In","valuation":{"total":14866.72,"available":null,"crystallised":null,"uncrystallised":null}}],"valuationType":"Value"},"currency":"GBP"},{"asset":{"assetId":[{"type":"FUND_CODE","value":"KCVT"}],"name":"FutureWise Target 2065 - Class 
10"},"units":{"total":230.02,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","unit":{"total":153.35,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary Sacrifice","unit":{"total":76.67,"available":null,"crystallised":null,"uncrystallised":null}}]},"price":{"value":3.254,"datetime":"2026-04-17","currency":"GBP"},"valuation":{"total":748.48,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","valuation":{"total":498.99,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary Sacrifice","valuation":{"total":249.49,"available":null,"crystallised":null,"uncrystallised":null}}],"valuationType":"Value"},"currency":"GBP"},{"asset":{"assetId":[{"type":"FUND_CODE","value":"LAFC"}],"name":"Volatility Managed Multi Asset Fund"},"units":{"total":106.64,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","unit":{"total":71.09,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary Sacrifice","unit":{"total":35.55,"available":null,"crystallised":null,"uncrystallised":null}}]},"price":{"value":252.9000,"datetime":"2026-04-17","currency":"GBP"},"valuation":{"total":269.70,"available":null,"crystallised":null,"uncrystallised":null,"group":[{"groupId":"ERXS","type":"CONTRIBUTION_TYPE","name":"Company","valuation":{"total":179.80,"available":null,"crystallised":null,"uncrystallised":null}},{"groupId":"SASC","type":"CONTRIBUTION_TYPE","name":"Salary 
Sacrifice","valuation":{"total":89.90,"available":null,"crystallised":null,"uncrystallised":null}}],"valuationType":"Value"},"currency":"GBP"}],"valuationSum":{"total":138743.53,"available":0.0,"crystallised":null,"uncrystallised":null,"currency":"GBP"},"asOfDateTime":"2026-04-17T12:00:00+01:00"}

@@ -0,0 +1,22 @@
From: InvestEngine <no-reply@investengine.com>
To: viktorbarzin@example.com
Subject: Your InvestEngine statement
Date: Mon, 07 Apr 2025 09:00:00 +0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_MIXED_1"
------=_MIXED_1
Content-Type: text/plain; charset=UTF-8
Your monthly statement is attached as a CSV.
------=_MIXED_1
Content-Type: text/csv; charset=UTF-8; name="statement.csv"
Content-Disposition: attachment; filename="statement.csv"
ticker,unit_price,quantity,date,currency
VUAG,63.21,12.5,2025-04-02,GBP
SWDA,86.40,4.75,2025-04-03,GBP
VUSA,90.10,1.0,2025-04-04,GBP
------=_MIXED_1--

@@ -0,0 +1,40 @@
From: InvestEngine <no-reply@investengine.com>
To: viktorbarzin@example.com
Subject: Your portfolio has been updated
Date: Wed, 15 Apr 2026 11:00:00 +0000
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_PM"
------=_Part_PM
Content-Type: text/plain; charset=UTF-8
(HTML-only view — your client does not render HTML emails.)
------=_Part_PM
Content-Type: text/html; charset=UTF-8
<html><body>
<table><tr><td>Logo</td></tr></table>
<table>
<tr><td> Date: 15 April 2026 </td></tr>
<tr>
<td>
<table>
<tr><td>Vanguard S&amp;P 500: VUAG</td></tr>
<tr><td>Bought 3.0 @ &pound;61.25 per share</td></tr>
<tr><td>Total: &pound;183.75</td></tr>
</table>
</td>
</tr>
<tr>
<td>
<table>
<tr><td>Some broken order with no ticker and no bought line</td></tr>
<tr><td>(Malformed — IE dropped a row mid-render)</td></tr>
</table>
</td>
</tr>
</table>
</body></html>
------=_Part_PM--

@@ -0,0 +1,55 @@
From: InvestEngine <no-reply@investengine.com>
To: viktorbarzin@example.com
Subject: Your portfolio has been updated
Date: Wed, 01 Apr 2026 09:15:00 +0000
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_1"
------=_Part_1
Content-Type: text/plain; charset=UTF-8
(HTML-only view — your client does not render HTML emails.)
------=_Part_1
Content-Type: text/html; charset=UTF-8
<html><head><title>InvestEngine</title></head><body>
<table><tr><td>Header logo</td></tr></table>
<table>
<tr><td>Client name: Redacted</td></tr>
<tr><td>Trading venue: London Stock Exchange</td></tr>
<tr><td>Type: Market Order(s)</td></tr>
<tr><td>Here's a summary of the trades we've made for you</td></tr>
<tr>
<td>a</td><td>b</td><td>c</td><td>d</td>
<td> Date: 01 April 2026 </td>
</tr>
<tr><td>filler</td></tr>
<tr><td>filler</td></tr>
<tr><td>filler</td></tr>
<tr><td>filler</td></tr>
<tr><td>filler</td></tr>
<tr>
<td>
<table>
<tr><td>Vanguard S&amp;P 500: VUAG</td></tr>
<tr><td>Bought 10.5 @ &pound;62.10 per share</td></tr>
<tr><td>Total: &pound;652.05</td></tr>
<tr><td>ISIN: IE00BFMXXD54, Order ID: 300000/4000001, Traded at 9:05am GMT</td></tr>
</table>
</td>
</tr>
<tr>
<td>
<table>
<tr><td>iShares Core MSCI World: SWDA</td></tr>
<tr><td>Bought 2.25 @ &pound;85.40 per share</td></tr>
<tr><td>Total: &pound;192.15</td></tr>
<tr><td>ISIN: IE00B4L5Y983, Order ID: 300000/4000002, Traded at 9:06am GMT</td></tr>
</table>
</td>
</tr>
</table>
</body></html>
------=_Part_1--

@@ -0,0 +1,15 @@
From: InvestEngine <no-reply@investengine.com>
To: viktorbarzin@example.com
Subject: Your portfolio has been updated
Date: Tue, 17 Jan 2023 14:48:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
<https://investengine.com/> We've executed your orders and your
portfolio has been updated Client name: Redacted Trading
venue: London Stock Exchange Type: Market Order(s) Date: 17 January
2023 Here's a summary of the trades we've made for you
Vanguard S&P 500: VUAG Bought 59.539562 @ £60.46 per share Total:
£3600.00 ISIN: IE00BFMXXD54, Order ID: 199510/2163746, Traded at
2:48pm GMT/UTC Take me to my updated portfolio

@@ -0,0 +1,15 @@
From: InvestEngine <no-reply@investengine.com>
To: viktorbarzin@example.com
Subject: InvestEngine newsletter
Date: Thu, 10 Apr 2025 12:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Hi Viktor,
This is a newsletter, not a trade confirmation. There is no structured
order data here — just marketing copy and a promo for a new feature we
are rolling out. Thanks for being a customer.
Cheers,
The InvestEngine team

@@ -0,0 +1,108 @@
from __future__ import annotations
from datetime import datetime
from decimal import Decimal
from pathlib import Path
from broker_sync.models import AccountType, ActivityType
from broker_sync.providers.parsers.invest_engine import parse_invest_engine_email
_FIXTURES = Path(__file__).parent.parent.parent / "fixtures" / "invest_engine"
def _load(name: str) -> bytes:
return (_FIXTURES / name).read_bytes()
# -- RFC 2822 body (v2-style, single BUY) --
def test_rfc2822_single_buy_parses_to_one_activity() -> None:
activities = parse_invest_engine_email(_load("rfc2822_v2_single_buy.eml"))
assert len(activities) == 1
a = activities[0]
assert a.activity_type is ActivityType.BUY
assert a.symbol == "VUAG"
assert a.quantity == Decimal("59.539562")
assert a.unit_price == Decimal("60.46")
assert a.currency == "GBP"
assert a.date == datetime(2023, 1, 17)
assert a.account_id == "invest-engine-primary"
assert a.account_type is AccountType.ISA
def test_rfc2822_external_id_is_deterministic() -> None:
a1 = parse_invest_engine_email(_load("rfc2822_v2_single_buy.eml"))[0]
a2 = parse_invest_engine_email(_load("rfc2822_v2_single_buy.eml"))[0]
assert a1.external_id == a2.external_id
assert a1.external_id.startswith("invest-engine:")
def test_rfc2822_notes_record_parse_strategy() -> None:
a = parse_invest_engine_email(_load("rfc2822_v2_single_buy.eml"))[0]
assert a.notes is not None
assert "rfc2822" in a.notes
# -- HTML table body (multipart/alternative, two orders) --
def test_html_body_parses_both_orders() -> None:
activities = parse_invest_engine_email(_load("html_two_orders.eml"))
assert len(activities) == 2
a, b = activities
assert a.symbol == "VUAG"
assert a.quantity == Decimal("10.5")
assert a.unit_price == Decimal("62.10")
assert a.date == datetime(2026, 4, 1)
assert a.account_id == "invest-engine-primary"
assert a.account_type is AccountType.ISA
assert a.activity_type is ActivityType.BUY
assert b.symbol == "SWDA"
assert b.quantity == Decimal("2.25")
assert b.unit_price == Decimal("85.40")
assert b.date == datetime(2026, 4, 1)
def test_html_notes_record_html_strategy() -> None:
a = parse_invest_engine_email(_load("html_two_orders.eml"))[0]
assert a.notes is not None
assert "html" in a.notes
# -- CSV attachment body --
def test_csv_attachment_parses_all_rows() -> None:
activities = parse_invest_engine_email(_load("csv_attachment.eml"))
assert len(activities) == 3
by_symbol = {a.symbol: a for a in activities}
assert by_symbol["VUAG"].quantity == Decimal("12.5")
assert by_symbol["VUAG"].unit_price == Decimal("63.21")
assert by_symbol["VUAG"].date == datetime(2025, 4, 2)
assert by_symbol["SWDA"].quantity == Decimal("4.75")
assert by_symbol["VUSA"].date == datetime(2025, 4, 4)
for a in activities:
assert a.activity_type is ActivityType.BUY
assert a.currency == "GBP"
assert a.account_id == "invest-engine-primary"
assert a.account_type is AccountType.ISA
assert a.notes is not None
assert "csv" in a.notes
# -- graceful failure modes --
def test_unparseable_email_returns_empty_list() -> None:
assert parse_invest_engine_email(_load("unparseable.eml")) == []
def test_html_partial_match_returns_only_parseable_orders() -> None:
activities = parse_invest_engine_email(_load("html_partial_match.eml"))
assert len(activities) == 1
a = activities[0]
assert a.symbol == "VUAG"
assert a.quantity == Decimal("3.0")
assert a.unit_price == Decimal("61.25")
assert a.date == datetime(2026, 4, 15)
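
Reviewer sketch: the determinism tests above only require that re-parsing the same email yields the same `external_id` with an `invest-engine:` prefix. One way to get that property is to hash the order's own fields. The helper name `make_external_id`, the choice of fields, and the SHA-256 truncation are assumptions for illustration, not the parser's actual scheme.

```python
import hashlib
from datetime import datetime
from decimal import Decimal


def make_external_id(
    prefix: str,
    on: datetime,
    symbol: str,
    quantity: Decimal,
    unit_price: Decimal,
) -> str:
    """Hypothetical helper: build an ID from the order's own fields only."""
    # No wall-clock time or random salt goes into the key, so parsing the
    # same email twice always produces the same ID.
    key = "|".join([on.date().isoformat(), symbol, str(quantity), str(unit_price)])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}:{digest}"
```

Note that `str(Decimal("10.5"))` and `str(Decimal("10.50"))` differ, so a real implementation would probably want to normalise quantities before hashing.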


@ -0,0 +1,140 @@
from __future__ import annotations
from decimal import Decimal
from broker_sync.models import AccountType, ActivityType
from broker_sync.providers.parsers.schwab import parse_schwab_email
_SELL = """
<html><body>
<table>
<tr><td>Date</td><td class="dark-background-body" align="right">Jan 23, 2025</td></tr>
<tr><td>Action</td><td class="dark-background-body" align="right">Sold</td></tr>
<tr><td>Quantity</td><td class="dark-background-body" align="right">100.0</td></tr>
<tr><td>Ticker</td><td class="dark-background-body" align="right">META</td></tr>
<tr><td>Price</td><td class="dark-background-body" align="right">$612.34</td></tr>
</table>
</body></html>
"""
_BUY = """
<html><body><table>
<tr><td class="dark-background-body" align="right">2024-11-15</td></tr>
<tr><td class="dark-background-body" align="right">Bought</td></tr>
<tr><td class="dark-background-body" align="right">5.5</td></tr>
<tr><td class="dark-background-body" align="right">AAPL</td></tr>
<tr><td class="dark-background-body" align="right">$225.00</td></tr>
</table></body></html>
"""
_MALFORMED = "<html><body>no transaction here</body></html>"
_MISSING_CELLS = """
<html><body><table>
<tr><td class="dark-background-body" align="right">Jan 23, 2025</td></tr>
<tr><td class="dark-background-body" align="right">Sold</td></tr>
</table></body></html>
"""
def test_sell_email_parses_to_one_sell_activity() -> None:
acts = parse_schwab_email(_SELL)
assert len(acts) == 1
a = acts[0]
assert a.activity_type is ActivityType.SELL
assert a.symbol == "META"
assert a.quantity == Decimal("100.0")
assert a.unit_price == Decimal("612.34")
assert a.currency == "USD"
assert a.account_id == "schwab-workplace"
assert a.account_type is AccountType.GIA
assert a.date.date().isoformat() == "2025-01-23"
def test_buy_email_becomes_buy_activity() -> None:
acts = parse_schwab_email(_BUY)
assert len(acts) == 1
a = acts[0]
assert a.activity_type is ActivityType.BUY
assert a.symbol == "AAPL"
assert a.quantity == Decimal("5.5")
assert a.unit_price == Decimal("225.00")
def test_malformed_email_returns_empty_list() -> None:
# No matching td cells at all.
assert parse_schwab_email(_MALFORMED) == []
def test_missing_cells_returns_empty_list() -> None:
# Only 2 of the 5 required cells — parser must bail cleanly.
assert parse_schwab_email(_MISSING_CELLS) == []
def test_external_id_is_stable_across_reruns() -> None:
# Same email → same external_id (deterministic, not timestamp-based).
a1 = parse_schwab_email(_SELL)[0]
a2 = parse_schwab_email(_SELL)[0]
assert a1.external_id == a2.external_id
def test_price_with_commas_parses() -> None:
html = _SELL.replace("$612.34", "$1,612.34")
a = parse_schwab_email(html)[0]
assert a.unit_price == Decimal("1612.34")
# --- Vest-release parsing -------------------------------------------------
_VEST_RELEASE = """<html><body>
<h2>Release Confirmation</h2>
<p>
Release Date: 15 Mar 2026
Ticker: META
Total Shares Released: 100.0
Market Price: $612.34
Shares Withheld for Taxes: 45
Tax Withholding Amount: $27,555.30
</p>
</body></html>"""
def test_vest_release_returns_two_activities_and_vest_event() -> None:
"""Release Confirmation yields a BUY (full vest) + SELL (sell-to-cover) + VestEvent."""
from broker_sync.providers.parsers.schwab import parse_schwab_email_full
result = parse_schwab_email_full(_VEST_RELEASE)
assert result.vest_event is not None
assert result.vest_event.ticker == "META"
assert result.vest_event.shares_vested == Decimal("100.0")
assert result.vest_event.shares_sold_to_cover == Decimal("45")
assert result.vest_event.fmv_at_vest_usd == Decimal("612.34")
assert result.vest_event.tax_withheld_usd == Decimal("27555.30")
assert result.vest_event.vest_date.date().isoformat() == "2026-03-15"
assert result.vest_event.external_id.startswith("schwab:2026-03-15:META:VEST:")
assert len(result.activities) == 2
buy = result.activities[0]
assert buy.activity_type is ActivityType.BUY
assert buy.quantity == Decimal("100.0")
sell = result.activities[1]
assert sell.activity_type is ActivityType.SELL
assert sell.quantity == Decimal("45")
assert sell.unit_price == Decimal("612.34")
def test_vest_email_with_unparseable_body_returns_empty() -> None:
    """Body mentions Release Confirmation but the vest fields are missing → empty result, no crash."""
    from broker_sync.providers.parsers.schwab import parse_schwab_email_full
html = "<html><body>Release Confirmation — please contact support</body></html>"
result = parse_schwab_email_full(html)
assert result.vest_event is None
assert result.activities == []
def test_back_compat_parse_schwab_email_drops_vest_event() -> None:
"""The legacy list[Activity] shape remains stable for existing callers."""
acts = parse_schwab_email(_VEST_RELEASE)
assert len(acts) == 2
assert all(isinstance(a.activity_type, ActivityType) for a in acts)
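
Reviewer sketch: the commit message describes extracting vest fields "via label regexes" and bailing out when the body cannot be parsed. The snippet below shows that idea against the `_VEST_RELEASE` layout used in the tests. The pattern table, the function name `extract_vest_fields`, and the returned dict shape are assumptions; the real parser returns `Activity` and `VestEvent` objects and its regexes may differ.

```python
import re
from datetime import datetime
from decimal import Decimal
from typing import Any, Optional

# Hypothetical label -> pattern table, modelled on the test fixture only.
_LABELS = {
    "vest_date": r"Release Date:\s*(\d{1,2} \w{3} \d{4})",
    "ticker": r"Ticker:\s*([A-Z.]+)",
    "shares_vested": r"Total Shares Released:\s*([\d,.]+)",
    "fmv": r"Market Price:\s*\$([\d,.]+)",
    "shares_sold_to_cover": r"Shares Withheld for Taxes:\s*([\d,.]+)",
    "tax_withheld": r"Tax Withholding Amount:\s*\$([\d,.]+)",
}


def _num(text: str) -> Decimal:
    # Strip thousands separators before handing the value to Decimal.
    return Decimal(text.replace(",", ""))


def extract_vest_fields(body: str) -> Optional[dict[str, Any]]:
    """Return the parsed fields, or None if any label is missing."""
    raw: dict[str, str] = {}
    for name, pattern in _LABELS.items():
        match = re.search(pattern, body)
        if match is None:
            return None  # fail closed: a partial vest is worse than none
        raw[name] = match.group(1)
    return {
        "vest_date": datetime.strptime(raw["vest_date"], "%d %b %Y"),
        "ticker": raw["ticker"],
        "shares_vested": _num(raw["shares_vested"]),
        "fmv": _num(raw["fmv"]),
        "shares_sold_to_cover": _num(raw["shares_sold_to_cover"]),
        "tax_withheld": _num(raw["tax_withheld"]),
    }
```

Returning `None` on the first missing label mirrors the "unparseable body returns empty" test: the caller can then emit no activities and no vest event.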


@ -0,0 +1,116 @@
from __future__ import annotations
import json
from datetime import UTC, datetime
from decimal import Decimal
from pathlib import Path
import pytest
from broker_sync.models import Account, AccountType, ActivityType
from broker_sync.providers.fidelity_planviewer import (
ACCOUNT_ID,
FidelityCreds,
FidelityPlanViewerProvider,
FidelityProviderConfigError,
_gains_offset_activity,
)
from broker_sync.providers.parsers.fidelity import (
parse_transactions_html,
parse_valuation_json,
)
_FIXTURES = Path(__file__).parent.parent / "fixtures" / "fidelity"
def test_accounts_exposes_single_workplace_pension_account() -> None:
prov = FidelityPlanViewerProvider(FidelityCreds(
storage_state_path="/tmp/x", plan_id="META",
))
assert prov.accounts() == [
Account(
id=ACCOUNT_ID,
name="Fidelity UK Pension",
account_type=AccountType.WORKPLACE_PENSION,
currency="GBP",
provider="fidelity-planviewer",
),
]
async def test_fetch_raises_without_storage_state() -> None:
    prov = FidelityPlanViewerProvider(FidelityCreds(
        storage_state_path="/tmp/does-not-exist-xyzzy.json", plan_id="META",
    ))
    with pytest.raises(FidelityProviderConfigError, match="storage_state"):
        async for _ in prov.fetch():
            pytest.fail("should have raised before yielding")
# -- parser tests against real (captured) fixture --
def test_parse_transactions_real_fixture() -> None:
html = (_FIXTURES / "transactions-full.html").read_text()
txs = parse_transactions_html(html)
# Scheme has ~48 months + a couple of single premiums + 1 rebate;
# Bulk Switches must be filtered out (zero-amount rows).
assert 40 <= len(txs) <= 100
# All dates are within the scheme's lifetime (2022-03 to today-ish).
assert all(tx.date >= datetime(2022, 1, 1, tzinfo=UTC) for tx in txs)
# Sum should match the header total on the page (£102,004.15 at
# fixture time). Allow a £5 tolerance in case the page summary row
# changes in future captures — the unit test primarily guards parsing
# correctness, not drift in the fixture.
total = sum((tx.amount for tx in txs), Decimal(0))
assert abs(total - Decimal("102004.15")) < Decimal("5")
def test_parse_transactions_skips_bulk_switch() -> None:
html = (_FIXTURES / "transactions-full.html").read_text()
txs = parse_transactions_html(html)
assert not any("bulk switch" in tx.tx_type.lower() for tx in txs)
def test_parse_transactions_external_id_deterministic() -> None:
html = (_FIXTURES / "transactions-full.html").read_text()
a = parse_transactions_html(html)
b = parse_transactions_html(html)
assert [tx.external_id for tx in a] == [tx.external_id for tx in b]
assert all(tx.external_id.startswith("fidelity:tx:") for tx in a)
def test_parse_valuation_fixture() -> None:
payload = json.loads((_FIXTURES / "valuation.json").read_text())
holdings = parse_valuation_json(payload)
assert len(holdings) >= 1
h = holdings[0]
assert h.fund_code == "KDOA"
assert "Passive Global Equity" in h.fund_name
assert h.currency == "GBP"
assert h.units > 0
assert h.unit_price > 0
# Value ≈ units * price
assert abs(h.total_value - h.units * h.unit_price) < Decimal("1")
# Contribution-type breakdown must parse
assert set(h.units_by_source.keys()) >= {"SASC", "ERXS"}
def test_gains_offset_emits_deposit_when_pot_exceeds_contributions() -> None:
html = (_FIXTURES / "transactions-full.html").read_text()
valuation = json.loads((_FIXTURES / "valuation.json").read_text())
txs = parse_transactions_html(html)
holdings = parse_valuation_json(valuation)
as_of = datetime(2026, 4, 18, tzinfo=UTC)
offset = _gains_offset_activity(holdings, txs, as_of)
assert offset is not None
assert offset.activity_type in (ActivityType.DEPOSIT, ActivityType.WITHDRAWAL)
assert offset.amount is not None and offset.amount > 0
assert offset.external_id == "fidelity:gains:2026-04-18"
def test_gains_offset_none_when_no_holdings() -> None:
assert _gains_offset_activity(
holdings=[], transactions=[],
as_of=datetime(2026, 4, 18, tzinfo=UTC),
) is None
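
Reviewer sketch: the two gains-offset tests imply a synthetic activity that reconciles the pot value with the sum of contributions, keyed by the valuation date. A minimal version of that arithmetic follows. The function name `gains_offset`, the plain-dict return value, and the decision to return `None` on a zero difference are assumptions; only the `fidelity:gains:<date>` ID format and the DEPOSIT/WITHDRAWAL outcome come from the tests.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, Optional


def gains_offset(
    holding_values: list[Decimal],
    contributions: list[Decimal],
    as_of: datetime,
) -> Optional[dict[str, Any]]:
    """Hypothetical: one balancing entry so pot value == recorded cash flows."""
    if not holding_values:
        return None  # nothing to reconcile against
    pot = sum(holding_values, Decimal(0))
    paid_in = sum(contributions, Decimal(0))
    gain = pot - paid_in
    if gain == 0:
        return None  # assumption: no entry needed when already balanced
    return {
        "activity_type": "DEPOSIT" if gain > 0 else "WITHDRAWAL",
        "amount": abs(gain),  # always positive; direction is in the type
        "external_id": f"fidelity:gains:{as_of.date().isoformat()}",
    }
```

Keying the ID on the date alone means a rerun on the same day replaces rather than duplicates the offset, assuming the sink dedups on `external_id`.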


@ -0,0 +1,66 @@
from __future__ import annotations
from datetime import UTC, datetime
from decimal import Decimal
from broker_sync.models import AccountType, ActivityType
from broker_sync.providers.finance_mysql import _normalise_symbol, _route, _row_to_activity
def test_lse_ticker_routes_to_investengine() -> None:
acct, t, ccy = _route("VUAG.L")
assert acct == "invest-engine-primary"
assert t is AccountType.ISA
assert ccy == "GBP"
def test_us_ticker_routes_to_schwab() -> None:
assert _route("META") == ("schwab-workplace", AccountType.GIA, "USD")
assert _route("FLME_US_EQ") == ("schwab-workplace", AccountType.GIA, "USD")
def test_normalise_symbol() -> None:
assert _normalise_symbol("VUAG.L") == "VUAG"
assert _normalise_symbol("VUSA.L") == "VUSA"
assert _normalise_symbol("META") == "META"
assert _normalise_symbol("FLME_US_EQ") == "FLME"
assert _normalise_symbol("FOO_EQ") == "FOO"
def test_row_to_buy_activity() -> None:
row = {
"id": "123456",
"ticker": "VUAG.L",
"buy_price": 85.5,
"num_shares": 10.0,
"currency": "GBP",
"buy_date": datetime(2022, 3, 15, 10, 30),
"account_id": 1,
}
a = _row_to_activity(row)
assert a.external_id == "finance-mysql:position:123456"
assert a.account_id == "invest-engine-primary"
assert a.account_type is AccountType.ISA
assert a.activity_type is ActivityType.BUY
assert a.symbol == "VUAG" # .L stripped
assert a.quantity == Decimal("10.0")
assert a.unit_price == Decimal("85.5")
assert a.currency == "GBP"
assert a.date == datetime(2022, 3, 15, 10, 30, tzinfo=UTC)
def test_row_to_sell_when_qty_negative() -> None:
row = {
"id": "x",
"ticker": "META",
"buy_price": 450.0,
"num_shares": -2.5, # sell
"currency": "USD",
"buy_date": datetime(2024, 8, 5),
"account_id": 1,
}
a = _row_to_activity(row)
assert a.activity_type is ActivityType.SELL
assert a.quantity == Decimal("2.5") # absolute
assert a.account_id == "schwab-workplace"
assert a.symbol == "META"
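
Reviewer sketch: the routing and normalisation tests above pin down a small suffix-based rule set. The version below satisfies exactly the cases the tests list. The function names without the leading underscore, the string account types, and the suffix ordering are assumptions; the real `_route` returns `AccountType` members.

```python
def normalise_symbol(ticker: str) -> str:
    """Strip exchange/asset-class suffixes seen in the test cases."""
    if ticker.endswith(".L"):
        return ticker[: -len(".L")]
    # Check the longer suffix first so "FLME_US_EQ" does not become "FLME_US".
    for suffix in ("_US_EQ", "_EQ"):
        if ticker.endswith(suffix):
            return ticker[: -len(suffix)]
    return ticker


def route(ticker: str) -> tuple[str, str, str]:
    """Return (account_id, account_type, currency) for a raw ticker."""
    if ticker.endswith(".L"):
        # LSE listings are assumed to belong to the InvestEngine ISA.
        return ("invest-engine-primary", "ISA", "GBP")
    # Everything else is assumed to be a US holding at Schwab.
    return ("schwab-workplace", "GIA", "USD")
```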


@ -0,0 +1,101 @@
from __future__ import annotations
from datetime import UTC, date, datetime
from decimal import Decimal
from broker_sync.models import AccountType, Activity, ActivityType
from broker_sync.providers.imap import (
_IE_GIA_ACCOUNT_ID,
_IE_ISA_ACCOUNT_ID,
_split_ie_by_isa_cap,
_uk_tax_year_start,
)
def _buy(on: datetime, qty: str, price: str) -> Activity:
    return Activity(
        external_id=f"invest-engine:{on.isoformat()}|{qty}|{price}",
        account_id=_IE_ISA_ACCOUNT_ID,
        account_type=AccountType.ISA,
        date=on,
        activity_type=ActivityType.BUY,
        currency="GBP",
        symbol="VUAG",
        quantity=Decimal(qty),
        unit_price=Decimal(price),
    )
def test_uk_tax_year_start_before_april_6_rolls_back() -> None:
assert _uk_tax_year_start(datetime(2025, 4, 5, tzinfo=UTC)) == date(2024, 4, 6)
assert _uk_tax_year_start(datetime(2025, 4, 6, tzinfo=UTC)) == date(2025, 4, 6)
assert _uk_tax_year_start(datetime(2025, 1, 15, tzinfo=UTC)) == date(2024, 4, 6)
assert _uk_tax_year_start(datetime(2024, 4, 7, tzinfo=UTC)) == date(2024, 4, 6)
def test_single_tax_year_under_cap_stays_isa() -> None:
acts = [
_buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "50"), # £5000
_buy(datetime(2024, 8, 1, tzinfo=UTC), "100", "80"), # £8000
]
routed = _split_ie_by_isa_cap(acts)
assert all(a.account_id == _IE_ISA_ACCOUNT_ID for a in routed)
assert all(a.account_type is AccountType.ISA for a in routed)
def test_overflow_past_cap_flips_to_gia() -> None:
acts = [
_buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "80"), # £8,000
# +£12,000 → £20,000 total; prev £8k < cap → ISA
_buy(datetime(2024, 6, 1, tzinfo=UTC), "150", "80"),
_buy(datetime(2024, 7, 1, tzinfo=UTC), "10", "80"), # prev £20,000 ≥ cap → GIA
_buy(datetime(2024, 8, 1, tzinfo=UTC), "10", "80"), # GIA
]
routed = _split_ie_by_isa_cap(acts)
assert routed[0].account_id == _IE_ISA_ACCOUNT_ID
assert routed[1].account_id == _IE_ISA_ACCOUNT_ID
assert routed[2].account_id == _IE_GIA_ACCOUNT_ID
assert routed[2].account_type is AccountType.GIA
assert routed[3].account_id == _IE_GIA_ACCOUNT_ID
def test_tax_year_boundary_resets_cap() -> None:
acts = [
# 2023-24 tax year: £20k in ISA, plus one in GIA
_buy(datetime(2023, 5, 1, tzinfo=UTC), "400", "50"), # £20,000 → ISA (prev 0 < cap)
_buy(datetime(2024, 1, 1, tzinfo=UTC), "100", "50"), # GIA (prev 20k)
# 2024-25 tax year starts 2024-04-06 — cap resets
_buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "50"), # ISA (prev 0 for new year)
]
routed = _split_ie_by_isa_cap(acts)
assert routed[0].account_id == _IE_ISA_ACCOUNT_ID
assert routed[1].account_id == _IE_GIA_ACCOUNT_ID
assert routed[2].account_id == _IE_ISA_ACCOUNT_ID
def test_out_of_order_activities_sorted_before_cap_applied() -> None:
acts = [
_buy(datetime(2024, 8, 1, tzinfo=UTC), "10", "80"), # later date but given first
_buy(datetime(2024, 5, 1, tzinfo=UTC), "250", "80"), # earlier, £20,000 → ISA
]
routed = _split_ie_by_isa_cap(acts)
by_date = {a.date: a for a in routed}
assert by_date[datetime(2024, 5, 1, tzinfo=UTC)].account_id == _IE_ISA_ACCOUNT_ID
assert by_date[datetime(2024, 8, 1, tzinfo=UTC)].account_id == _IE_GIA_ACCOUNT_ID
def test_non_ie_activities_passed_through_unchanged() -> None:
schwab_act = Activity(
external_id="schwab:abc",
account_id="schwab-workplace",
account_type=AccountType.GIA,
date=datetime(2024, 5, 1, tzinfo=UTC),
activity_type=ActivityType.SELL,
currency="USD",
symbol="META",
quantity=Decimal("10"),
unit_price=Decimal("500"),
)
routed = _split_ie_by_isa_cap([schwab_act])
assert routed[0].account_id == "schwab-workplace"
assert routed[0].account_type is AccountType.GIA
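
Reviewer sketch: taken together, the tests above describe a rule where buys are sorted by date, grouped by UK tax year (starting 6 April), and routed to the ISA while the amount already subscribed that year is below the cap. The snippet models that on bare `(date, amount)` pairs. The names, the tuple shapes, and the fixed £20,000 cap constant are assumptions; the real `_split_ie_by_isa_cap` works on `Activity` objects.

```python
from datetime import date, datetime, timezone
from decimal import Decimal

ISA_CAP = Decimal("20000")  # assumption: a single fixed annual allowance


def uk_tax_year_start(when: datetime) -> date:
    """The UK tax year runs from 6 April to 5 April."""
    d = when.date()
    this_year = date(d.year, 4, 6)
    return this_year if d >= this_year else date(d.year - 1, 4, 6)


def split_by_isa_cap(
    buys: list[tuple[datetime, Decimal]],
) -> list[tuple[datetime, Decimal, str]]:
    """Label each buy 'ISA' or 'GIA' based on prior ISA spend that tax year."""
    subscribed: dict[date, Decimal] = {}
    routed: list[tuple[datetime, Decimal, str]] = []
    # Sort first so out-of-order input cannot change the outcome.
    for on, amount in sorted(buys, key=lambda b: b[0]):
        year = uk_tax_year_start(on)
        previous = subscribed.get(year, Decimal(0))
        if previous < ISA_CAP:
            # The whole buy goes to the ISA even if it crosses the cap,
            # matching the "prev < cap" comments in the tests.
            subscribed[year] = previous + amount
            routed.append((on, amount, "ISA"))
        else:
            routed.append((on, amount, "GIA"))
    return routed
```

A buy that straddles the cap is not split in this sketch; whether the production code should split it is a design question the tests leave open.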


@ -0,0 +1,488 @@
from __future__ import annotations
from collections.abc import Callable
from datetime import UTC, datetime, timedelta
from decimal import Decimal
from typing import Any
import httpx
import pytest
from broker_sync.models import AccountType, ActivityType
from broker_sync.providers.invest_engine import (
InvestEngineError,
InvestEngineProvider,
InvestEngineTokenExpiredError,
InvestEngineVersionError,
_probe_version,
_transaction_to_activity,
)
# -- helpers --
def _future() -> datetime:
return datetime.now(UTC) + timedelta(days=30)
def _past() -> datetime:
return datetime.now(UTC) - timedelta(days=1)
def _client(handler: Callable[[httpx.Request], httpx.Response]) -> httpx.AsyncClient:
return httpx.AsyncClient(
base_url="https://investengine.com",
transport=httpx.MockTransport(handler),
)
# -- version probe --
async def test_probe_stops_at_first_live_version() -> None:
    """v0.32 is live (401). Probe should return 32 without touching v0.33."""
    visited: list[str] = []

    def handler(req: httpx.Request) -> httpx.Response:
        visited.append(req.url.path)
        return httpx.Response(401)

    async with _client(handler) as c:
        minor = await _probe_version(c, start_minor=32)
    assert minor == 32
    assert visited == ["/api/v0.32/"]
async def test_probe_skips_410_and_advances() -> None:
"""v0.32 is Gone, v0.33 is live (401). Probe lands on 33."""
visited: list[str] = []
def handler(req: httpx.Request) -> httpx.Response:
visited.append(req.url.path)
if "v0.32" in req.url.path:
return httpx.Response(410)
return httpx.Response(401)
async with _client(handler) as c:
minor = await _probe_version(c, start_minor=32)
assert minor == 33
assert visited == ["/api/v0.32/", "/api/v0.33/"]
async def test_probe_gives_up_after_max_minor() -> None:
"""Every version 410s → explicit error rather than infinite loop."""
def handler(req: httpx.Request) -> httpx.Response:
return httpx.Response(410)
async with _client(handler) as c:
with pytest.raises(InvestEngineVersionError):
await _probe_version(c, start_minor=32, max_minor=34)
# -- token expiry fail-fast --
async def test_expired_token_raises_on_fetch() -> None:
"""If expires_at is in the past, we fail before making any request."""
def handler(req: httpx.Request) -> httpx.Response:
raise AssertionError("should not have called the API")
p = InvestEngineProvider(
bearer_token="x",
token_expires_at=_past(),
transport=httpx.MockTransport(handler),
)
try:
with pytest.raises(InvestEngineTokenExpiredError):
async for _ in p.fetch():
pass
finally:
await p.close()
# -- 401 during fetch --
async def test_401_during_probe_is_live_version() -> None:
"""401 on version-probe GET means version is live — we then request
the portfolios endpoint which, with a bad token, also 401s, and that
second 401 is what should surface as TokenExpired."""
def handler(req: httpx.Request) -> httpx.Response:
return httpx.Response(401)
p = InvestEngineProvider(
bearer_token="dead-token",
token_expires_at=_future(),
transport=httpx.MockTransport(handler),
)
try:
with pytest.raises(InvestEngineTokenExpiredError):
async for _ in p.fetch():
pass
finally:
await p.close()
# -- headers --
async def test_bearer_and_user_agent_headers_attached() -> None:
seen: list[tuple[str | None, str | None]] = []
def handler(req: httpx.Request) -> httpx.Response:
seen.append((req.headers.get("Authorization"), req.headers.get("User-Agent")))
# Probe returns live; portfolios returns empty list shape.
if req.url.path.endswith("/portfolios/"):
return httpx.Response(200, json={"results": []})
return httpx.Response(401)
p = InvestEngineProvider(
bearer_token="abc123",
token_expires_at=_future(),
transport=httpx.MockTransport(handler),
)
try:
async for _ in p.fetch():
pass
finally:
await p.close()
# Probe + portfolios — both should carry the Bearer + UA.
assert len(seen) == 2
for auth, ua in seen:
assert auth == "Bearer abc123"
assert ua is not None and "broker-sync" in ua
# -- accounts contract --
def test_accounts_returns_single_isa() -> None:
p = InvestEngineProvider(
bearer_token="x",
token_expires_at=_future(),
)
accs = p.accounts()
assert [a.id for a in accs] == ["invest-engine-primary"]
assert accs[0].account_type is AccountType.ISA
assert accs[0].currency == "GBP"
assert accs[0].provider == "invest-engine"
def test_provider_name() -> None:
assert InvestEngineProvider.name == "invest-engine"
# -- transaction → activity mapping --
#
# The real IE response shape is UNVERIFIED (MFA blocks authed probes).
# These tests use best-guess shapes based on Django REST conventions.
# `_transaction_to_activity` is written defensively so alternative casings
# and common field names round-trip correctly.
def _mock_txn(
*,
txn_id: str = "txn-1",
txn_type: str = "BUY",
symbol: str = "VUAG",
quantity: str = "10",
price: str = "90.5",
amount: str = "905.00",
currency: str = "GBP",
date: str = "2026-04-01T10:00:00Z",
) -> dict[str, Any]:
return {
"id": txn_id,
"type": txn_type,
"symbol": symbol,
"quantity": quantity,
"price": price,
"amount": amount,
"currency": currency,
"date": date,
}
def test_buy_txn_becomes_buy_activity() -> None:
a = _transaction_to_activity(_mock_txn(txn_type="BUY"))
assert a is not None
assert a.activity_type is ActivityType.BUY
assert a.external_id == "invest-engine:txn-1"
assert a.account_id == "invest-engine-primary"
assert a.account_type is AccountType.ISA
assert a.symbol == "VUAG"
assert a.quantity == Decimal("10")
assert a.unit_price == Decimal("90.5")
assert a.currency == "GBP"
def test_sell_txn_becomes_sell_activity() -> None:
a = _transaction_to_activity(_mock_txn(txn_type="SELL", quantity="5"))
assert a is not None
assert a.activity_type is ActivityType.SELL
assert a.quantity == Decimal("5")
def test_dividend_txn_becomes_dividend_with_amount() -> None:
raw = _mock_txn(txn_type="DIVIDEND", amount="12.34")
raw.pop("quantity")
raw.pop("price")
a = _transaction_to_activity(raw)
assert a is not None
assert a.activity_type is ActivityType.DIVIDEND
assert a.amount == Decimal("12.34")
def test_deposit_txn_mapped() -> None:
raw = _mock_txn(txn_type="DEPOSIT", amount="500.00")
raw.pop("quantity")
raw.pop("price")
a = _transaction_to_activity(raw)
assert a is not None
assert a.activity_type is ActivityType.DEPOSIT
assert a.amount == Decimal("500.00")
def test_withdrawal_txn_mapped() -> None:
raw = _mock_txn(txn_type="WITHDRAWAL", amount="100.00")
raw.pop("quantity")
raw.pop("price")
a = _transaction_to_activity(raw)
assert a is not None
assert a.activity_type is ActivityType.WITHDRAWAL
def test_unknown_txn_type_is_skipped_with_warning(caplog: pytest.LogCaptureFixture) -> None:
    raw = _mock_txn(txn_type="MYSTERY_EVENT")
    a = _transaction_to_activity(raw)
    assert a is None
    assert any("MYSTERY_EVENT" in r.message for r in caplog.records)
def test_date_parsing_handles_z_suffix() -> None:
a = _transaction_to_activity(_mock_txn(date="2026-04-01T10:00:00Z"))
assert a is not None
assert a.date == datetime(2026, 4, 1, 10, 0, tzinfo=UTC)
def test_date_parsing_handles_offset_suffix() -> None:
a = _transaction_to_activity(_mock_txn(date="2026-04-01T10:00:00+00:00"))
assert a is not None
assert a.date == datetime(2026, 4, 1, 10, 0, tzinfo=UTC)
# -- end-to-end fetch (portfolios + transactions happy path) --
async def test_fetch_enumerates_portfolios_and_transactions() -> None:
# Mock Django-REST-style paginated response with results + next.
portfolios = {"results": [{"id": 7, "name": "Viktor's ISA"}], "next": None}
dividend = _mock_txn(txn_id="t2", txn_type="DIVIDEND", amount="5.00")
dividend.pop("quantity")
dividend.pop("price")
transactions: dict[str, Any] = {
"results": [
_mock_txn(txn_id="t1", txn_type="BUY"),
dividend,
],
"next": None,
}
visited: list[str] = []
def handler(req: httpx.Request) -> httpx.Response:
visited.append(req.url.path)
if req.url.path == "/api/v0.32/":
return httpx.Response(401)
if req.url.path == "/api/v0.32/portfolios/":
return httpx.Response(200, json=portfolios)
if req.url.path == "/api/v0.32/transactions/":
assert req.url.params.get("portfolio") == "7"
return httpx.Response(200, json=transactions)
raise AssertionError(f"unexpected path: {req.url.path}")
p = InvestEngineProvider(
bearer_token="good-token",
token_expires_at=_future(),
transport=httpx.MockTransport(handler),
)
try:
out = [a async for a in p.fetch()]
finally:
await p.close()
assert [a.external_id for a in out] == [
"invest-engine:t1",
"invest-engine:t2",
]
async def test_fetch_supports_data_meta_pagination_shape() -> None:
"""Defensive: handle the alternative {data, meta.next_page} shape too."""
portfolios = {"data": [{"id": 9, "name": "ISA"}], "meta": {"next_page": None}}
    transactions = {
        "data": [_mock_txn(txn_id="dm1")],
        "meta": {"next_page": None},
    }
def handler(req: httpx.Request) -> httpx.Response:
if req.url.path == "/api/v0.32/":
return httpx.Response(401)
if req.url.path == "/api/v0.32/portfolios/":
return httpx.Response(200, json=portfolios)
if req.url.path == "/api/v0.32/transactions/":
return httpx.Response(200, json=transactions)
raise AssertionError(f"unexpected: {req.url.path}")
p = InvestEngineProvider(
bearer_token="t",
token_expires_at=_future(),
transport=httpx.MockTransport(handler),
)
try:
out = [a async for a in p.fetch()]
finally:
await p.close()
assert [a.external_id for a in out] == ["invest-engine:dm1"]
# -- since filter --
async def test_since_drops_older_transactions() -> None:
    txns = {
        "results": [
            _mock_txn(txn_id="old", date="2020-01-01T00:00:00Z"),
            _mock_txn(txn_id="new", date="2026-04-01T10:00:00Z"),
        ],
        "next": None,
    }
def handler(req: httpx.Request) -> httpx.Response:
if req.url.path == "/api/v0.32/":
return httpx.Response(401)
if req.url.path == "/api/v0.32/portfolios/":
return httpx.Response(200, json={"results": [{"id": 1}]})
return httpx.Response(200, json=txns)
p = InvestEngineProvider(
bearer_token="t",
token_expires_at=_future(),
transport=httpx.MockTransport(handler),
)
try:
since = datetime(2026, 1, 1, tzinfo=UTC)
out = [a async for a in p.fetch(since=since)]
finally:
await p.close()
assert [a.external_id for a in out] == ["invest-engine:new"]
# -- 401 on a data endpoint → TokenExpired --
async def test_401_on_portfolios_triggers_token_expired() -> None:
def handler(req: httpx.Request) -> httpx.Response:
if req.url.path == "/api/v0.32/":
return httpx.Response(401)
if req.url.path == "/api/v0.32/portfolios/":
return httpx.Response(401, json={"detail": "Invalid token"})
raise AssertionError(f"unexpected: {req.url.path}")
p = InvestEngineProvider(
bearer_token="stale",
token_expires_at=_future(), # clock says alive, server says dead
transport=httpx.MockTransport(handler),
)
try:
with pytest.raises(InvestEngineTokenExpiredError):
async for _ in p.fetch():
pass
finally:
await p.close()
# -- 410 on a data endpoint → one re-probe + retry --
async def test_410_on_data_triggers_reprobe_and_retry() -> None:
    # Scenario: probe lands on v0.32, then the portfolios call 410s because
    # IE just rolled the version mid-session. We re-probe, find v0.33, and
    # retry the call. We verify both versions are hit and we don't loop.
    visited: list[str] = []
    portfolios_call_count = 0

    def handler(req: httpx.Request) -> httpx.Response:
        nonlocal portfolios_call_count
        visited.append(req.url.path)
        # Version probe endpoints: 32 was live when the process started; the
        # re-probe now shows 32 Gone and 33 live.
        if req.url.path == "/api/v0.32/":
            # First probe (process start) → live (401).
            # Second probe (after the 410) → Gone.
            if "/api/v0.32/portfolios/" in visited:
                return httpx.Response(410)
            return httpx.Response(401)
        if req.url.path == "/api/v0.33/":
            return httpx.Response(401)
        if req.url.path == "/api/v0.32/portfolios/":
            portfolios_call_count += 1
            return httpx.Response(410, json={"detail": "Version Gone"})
        if req.url.path == "/api/v0.33/portfolios/":
            return httpx.Response(200, json={"results": []})
        raise AssertionError(f"unexpected: {req.url.path}")

    p = InvestEngineProvider(
        bearer_token="t",
        token_expires_at=_future(),
        transport=httpx.MockTransport(handler),
    )
    try:
        out = [a async for a in p.fetch()]
    finally:
        await p.close()
    assert out == []
    assert "/api/v0.32/portfolios/" in visited
    assert "/api/v0.33/" in visited
    assert "/api/v0.33/portfolios/" in visited
    # Exactly one 410 on v0.32/portfolios/; no repeat loop.
    assert portfolios_call_count == 1
# -- integration stub --
@pytest.mark.skip(reason="needs live token — flip on manually")
async def test_live_integration_smoke() -> None: # pragma: no cover
"""Real API smoke test. Enable manually after Viktor pastes a token."""
import os
token = os.environ.get("IE_BEARER_TOKEN")
if not token:
pytest.skip("IE_BEARER_TOKEN not set")
p = InvestEngineProvider(
bearer_token=token,
token_expires_at=_future(),
)
try:
out = [a async for a in p.fetch(since=_past())]
finally:
await p.close()
# No assertions on content yet — just proves a live round-trip works.
assert isinstance(out, list)
# -- smoke check InvestEngineError is public --
def test_error_types_public() -> None:
assert issubclass(InvestEngineTokenExpiredError, InvestEngineError)
assert issubclass(InvestEngineVersionError, InvestEngineError)
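
Reviewer sketch: the probe tests describe a forward walk over `/api/v0.<minor>/` that skips versions answering 410 Gone and stops at the first one that does not. The loop below captures that with an injected status callable instead of a real HTTP client, so it runs without `httpx`. The names `probe_version`, `VersionProbeError`, and `get_status`, the inclusive `max_minor` bound, and treating every non-410 status as "live" are all assumptions; the tests only show 401 as the live signal.

```python
from typing import Callable


class VersionProbeError(RuntimeError):
    """Hypothetical stand-in for InvestEngineVersionError."""


def probe_version(
    get_status: Callable[[str], int],
    start_minor: int,
    max_minor: int,
) -> int:
    """Return the first minor version whose root path is not 410 Gone."""
    for minor in range(start_minor, max_minor + 1):
        status = get_status(f"/api/v0.{minor}/")
        if status != 410:
            # 401 means "exists, but you are not authenticated": good enough
            # to prove the version is still served.
            return minor
    # Bounded search: fail loudly instead of probing forever.
    raise VersionProbeError(f"no live API version in v0.{start_minor}..v0.{max_minor}")
```

The upper bound is what turns "every version is Gone" into an explicit error, which is the behaviour `test_probe_gives_up_after_max_minor` checks for.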


@ -44,10 +44,14 @@ def _client(handler: httpx.MockTransport, session_path: Path) -> WealthfolioSink
 def _login_ok(req: httpx.Request) -> httpx.Response:
     assert req.url.path == "/api/v1/auth/login"
     body = json.loads(req.content)
-    assert body == {"username": "viktor", "password": "hunter2"}
+    # Wealthfolio 3.2 LoginRequest is password-only.
+    assert body == {"password": "hunter2"}
     return httpx.Response(
         200,
-        json={"ok": True},
+        json={
+            "authenticated": True,
+            "expiresIn": 604800
+        },
         headers={"set-cookie": "wf_token=abc123; Path=/api; HttpOnly"},
     )
@ -126,7 +130,7 @@ async def test_401_triggers_single_reauth_and_retry(tmp_path: Path) -> None:
 # -- Account ensure --
-async def test_ensure_account_no_op_if_exists(tmp_path: Path) -> None:
+async def test_ensure_account_returns_existing_wf_id(tmp_path: Path) -> None:
     sp = tmp_path / "s.json"
     sp.write_text(json.dumps({"cookies": {"wf_token": "fresh"}}))
@ -134,11 +138,15 @@ async def test_ensure_account_no_op_if_exists(tmp_path: Path) -> None:
     async def handler(req: httpx.Request) -> httpx.Response:
         if req.method == "GET" and req.url.path == "/api/v1/accounts":
+            # Wealthfolio stores its own UUID for id; providerAccountId is
+            # what we gave it.
             return httpx.Response(
                 200,
                 json=[{
-                    "id": "t212-isa",
-                    "name": "Trading212 ISA"
+                    "id": "uuid-wf-123",
+                    "name": "Trading212 ISA",
+                    "provider": "trading212",
+                    "providerAccountId": "t212-isa",
                 }],
             )
         if req.method == "POST":
@ -153,7 +161,8 @@ async def test_ensure_account_no_op_if_exists(tmp_path: Path) -> None:
         currency="GBP",
         provider="trading212",
     )
-    await sink.ensure_account(acc)
+    wf_id = await sink.ensure_account(acc)
+    assert wf_id == "uuid-wf-123"
     assert posts == []  # no create
@ -168,7 +177,14 @@ async def test_ensure_account_creates_if_missing(tmp_path: Path) -> None:
             return httpx.Response(200, json=[])
         if req.method == "POST" and req.url.path == "/api/v1/accounts":
             posted.append(json.loads(req.content))
-            return httpx.Response(200, json={"id": "t212-isa"})
+            return httpx.Response(
+                200,
+                json={
+                    "id": "uuid-new-456",
+                    "provider": "trading212",
+                    "providerAccountId": "t212-isa",
+                },
+            )
         return httpx.Response(500)
     sink = _client(httpx.MockTransport(handler), sp)
@ -179,11 +195,18 @@ async def test_ensure_account_creates_if_missing(tmp_path: Path) -> None:
         currency="GBP",
         provider="trading212",
     )
-    await sink.ensure_account(acc)
+    wf_id = await sink.ensure_account(acc)
+    assert wf_id == "uuid-new-456"
     assert len(posted) == 1
-    assert posted[0]["id"] == "t212-isa"
-    assert posted[0]["account_type"] == "ISA"
+    # We no longer send our logical id; Wealthfolio would ignore it. We
+    # DO send our id as providerAccountId so Wealthfolio preserves it for
+    # future matching.
+    assert "id" not in posted[0]
+    assert posted[0]["providerAccountId"] == "t212-isa"
+    assert posted[0]["accountType"] == "ISA"
     assert posted[0]["currency"] == "GBP"
+    assert posted[0]["isActive"] is True
+    assert posted[0]["isDefault"] is False
# -- Activity import --
@ -198,14 +221,28 @@ async def test_import_dry_run_then_real(tmp_path: Path) -> None:
     async def handler(req: httpx.Request) -> httpx.Response:
         calls.append(req.url.path)
         if req.url.path == "/api/v1/activities/import/check":
-            return httpx.Response(200, json={"ok": True, "rows": 1})
+            # /import/check hydrates and returns a list of ActivityImport.
+            return httpx.Response(200,
+                                  json=[
+                                      {
+                                          "symbol": "VUAG",
+                                          "isValid": True,
+                                          "errors": None,
+                                          "assetId": "enriched-asset-uuid",
+                                          "exchangeMic": "XLON",
+                                      },
+                                  ])
         if req.url.path == "/api/v1/activities/import":
             return httpx.Response(
                 200,
-                json=[{
-                    "id": "wf-1",
-                    "external_id": "t212:1"
-                }],
+                json={
+                    "activities": [
+                        {
+                            "id": "wf-1",
+                            "external_id": "t212:1"
+                        },
+                    ],
+                },
             )
         return httpx.Response(500)


@@ -1,3 +1,7 @@
from __future__ import annotations
from datetime import UTC, datetime, timedelta
from typer.testing import CliRunner
from broker_sync import __version__
@@ -10,3 +14,64 @@ def test_version_prints_package_version() -> None:
result = runner.invoke(app, ["version"])
assert result.exit_code == 0
assert __version__ in result.stdout
# -- invest-engine CLI --
def _future_iso() -> str:
return (datetime.now(UTC) + timedelta(days=30)).isoformat()
def _past_iso() -> str:
return (datetime.now(UTC) - timedelta(days=1)).isoformat()
def test_invest_engine_expired_token_exits_2() -> None:
"""Guard against burning a request on a token the user already knows is dead."""
result = runner.invoke(
app,
["invest-engine"],
env={
"WF_BASE_URL": "https://wf.example.com",
"WF_USERNAME": "u",
"WF_PASSWORD": "p",
"IE_BEARER_TOKEN": "anything",
"IE_TOKEN_EXPIRES_AT": _past_iso(),
"BROKER_SYNC_DATA_DIR": "/tmp",
},
)
assert result.exit_code == 2, result.output
assert "expired" in result.output.lower() or "token" in result.output.lower()
def test_invest_engine_unknown_mode_exits_2() -> None:
result = runner.invoke(
app,
["invest-engine", "--mode", "nonsense"],
env={
"WF_BASE_URL": "https://wf.example.com",
"WF_USERNAME": "u",
"WF_PASSWORD": "p",
"IE_BEARER_TOKEN": "t",
"IE_TOKEN_EXPIRES_AT": _future_iso(),
"BROKER_SYNC_DATA_DIR": "/tmp",
},
)
assert result.exit_code == 2
def test_invest_engine_malformed_expires_exits_2() -> None:
result = runner.invoke(
app,
["invest-engine"],
env={
"WF_BASE_URL": "https://wf.example.com",
"WF_USERNAME": "u",
"WF_PASSWORD": "p",
"IE_BEARER_TOKEN": "t",
"IE_TOKEN_EXPIRES_AT": "not-an-iso-date",
"BROKER_SYNC_DATA_DIR": "/tmp",
},
)
assert result.exit_code == 2
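Review note: the three CLI tests above pin down a pre-flight check on `IE_TOKEN_EXPIRES_AT` that rejects both expired and unparseable values before any request is made. The sketch below shows one way such a check could look. `token_expiry_error` is a hypothetical name, and treating a naive timestamp as UTC is an assumption, not something the tests establish.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def token_expiry_error(expires_at: str, now: datetime | None = None) -> str | None:
    """Return an error message if the token must not be used, else None.

    Hypothetical helper; a CLI wrapper would print the message and exit 2.
    """
    try:
        expiry = datetime.fromisoformat(expires_at)
    except ValueError:
        return f"IE_TOKEN_EXPIRES_AT is not an ISO-8601 timestamp: {expires_at!r}"
    if expiry.tzinfo is None:
        # Assumption: a timestamp without an offset is meant as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    if expiry <= current:
        return f"token expired at {expiry.isoformat()}"
    return None


# Inputs shaped like the _past_iso() / _future_iso() helpers in the tests.
past = (datetime.now(timezone.utc) - timedelta(days=1)).isoformat()
future = (datetime.now(timezone.utc) + timedelta(days=30)).isoformat()
past_error = token_expiry_error(past)
future_error = token_expiry_error(future)
malformed_error = token_expiry_error("not-an-iso-date")
```

Returning a message instead of exiting directly keeps the check testable without a `CliRunner`.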


@@ -73,22 +73,35 @@ async def test_pipeline_skips_dedup_then_imports_new(tmp_path: Path) -> None:
async def handler(req: httpx.Request) -> httpx.Response:
if req.method == "GET" and req.url.path == "/api/v1/accounts":
return httpx.Response(200, json=[{"id": "fake-isa"}])
if req.url.path == "/api/v1/activities/import/check":
return httpx.Response(200, json={"ok": True})
if req.url.path == "/api/v1/activities/import":
# The httpx request body is multipart. We don't parse the multipart
# properly — we just scan for our dedup tags to confirm the
# pipeline pushed the rows it should have.
body = req.content.decode()
posted_batches.append(body)
# Echo back external_ids so dedup.record gets the WF activity id.
# Return account with Wealthfolio-assigned UUID + our providerAccountId.
return httpx.Response(
200,
json=[{
"id": f"wf-{i}",
"external_id": ext
} for i, ext in enumerate(["a", "b", "c"])],
"id": "wf-uuid-fake-isa",
"provider": "fake",
"providerAccountId": "fake-isa",
}],
)
if req.url.path == "/api/v1/activities/import/check":
body = json.loads(req.content)
# Echo each activity back marked valid (mimic Wealthfolio's
# hydrate step).
return httpx.Response(200,
json=[{
**a, "isValid": True,
"errors": None
} for a in body["activities"]])
if req.url.path == "/api/v1/activities/import":
body = req.content.decode()
posted_batches.append(body)
return httpx.Response(
200,
json={
"activities": [{
"id": f"wf-{i}",
"external_id": ext
} for i, ext in enumerate(["a", "b", "c"])]
},
)
return httpx.Response(500)
@@ -106,21 +119,31 @@ async def test_pipeline_skips_dedup_then_imports_new(tmp_path: Path) -> None:
finally:
await sink.close()
# 3 provider activities fetched, but pipeline expands each BUY into
# (BUY, matching DEPOSIT). "a" is already-seen → skipped; its match
# "cash-flow-match:buy:a" is NEW since it wasn't seeded.
assert result.fetched == 3
assert result.new_after_dedup == 2
assert result.imported == 2
assert result.new_after_dedup == 5
assert result.imported == 5
assert result.failed == 0
assert len(posted_batches) == 1
body = posted_batches[0]
# Only the new rows (b, c) — NOT the already-seen "a".
# Only the new rows (b, c + the 3 matches) — NOT the already-seen "a".
assert "sync:fake:a" not in body
assert "sync:fake:b" in body
assert "sync:fake:c" in body
# Matching DEPOSITs rode along with their trade.
assert "cash-flow-match:buy:a" in body
assert "cash-flow-match:buy:b" in body
assert "cash-flow-match:buy:c" in body
# All three external_ids are now in dedup after the run.
# All six external_ids are now in dedup after the run.
assert dedup.has_seen("fake", "fake-isa", "a")
assert dedup.has_seen("fake", "fake-isa", "b")
assert dedup.has_seen("fake", "fake-isa", "c")
assert dedup.has_seen("fake", "fake-isa", "cash-flow-match:buy:a")
assert dedup.has_seen("fake", "fake-isa", "cash-flow-match:buy:b")
assert dedup.has_seen("fake", "fake-isa", "cash-flow-match:buy:c")
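Review note: the jump from 2 to 5 in `new_after_dedup` follows from expanding each fetched BUY into a pair before the dedup filter runs. The sketch below reproduces only that arithmetic on bare external ids. Both function names are hypothetical and the real pipeline works on `Activity` objects, not strings.

```python
def expand_with_matches(trade_ids: list[str]) -> list[str]:
    """Pair every BUY id with the id of its matching DEPOSIT (hypothetical)."""
    expanded: list[str] = []
    for trade_id in trade_ids:
        expanded.append(trade_id)
        expanded.append(f"cash-flow-match:buy:{trade_id}")
    return expanded


def drop_seen(external_ids: list[str], seen: set[str]) -> list[str]:
    """Keep only ids that the dedup store has not recorded yet."""
    return [ext for ext in external_ids if ext not in seen]


fetched = ["a", "b", "c"]  # three BUYs from the fake provider
seen = {"a"}               # only the trade itself was seeded, not its match
new_rows = drop_seen(expand_with_matches(fetched), seen)
```

Six ids come out of the expansion and only `"a"` is filtered, which leaves the five rows the test expects to be imported.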
async def test_pipeline_records_failure_when_import_rejects(tmp_path: Path) -> None:
@@ -135,7 +158,14 @@ async def test_pipeline_records_failure_when_import_rejects(tmp_path: Path) -> N
async def handler(req: httpx.Request) -> httpx.Response:
if req.method == "GET" and req.url.path == "/api/v1/accounts":
return httpx.Response(200, json=[{"id": "fake-isa"}])
return httpx.Response(
200,
json=[{
"id": "wf-uuid-fake-isa",
"provider": "fake",
"providerAccountId": "fake-isa",
}],
)
if req.url.path == "/api/v1/activities/import/check":
return httpx.Response(400, json={"errors": ["bad row"]})
return httpx.Response(500)
@@ -152,8 +182,86 @@ async def test_pipeline_records_failure_when_import_rejects(tmp_path: Path) -> N
finally:
await sink.close()
# Pipeline expands 1 BUY into (BUY, matching DEPOSIT). Both are in the
# batch that /import/check rejects, so both are counted as failed.
assert result.fetched == 1
assert result.imported == 0
assert result.failed == 1
# NOT recorded in dedup so the next run retries.
assert result.failed == 2
# NOT recorded in dedup so the next run retries both.
assert not dedup.has_seen("fake", "fake-isa", "a")
assert not dedup.has_seen("fake", "fake-isa", "cash-flow-match:buy:a")
# -- Cash-flow match helpers ---------------------------------------------
from broker_sync.pipeline import _matched_cash_flow, _with_cash_flow_match # noqa: E402
def _make_activity(
activity_type: ActivityType,
*,
quantity: str | None = "1",
unit_price: str | None = "100",
fee: str = "0",
amount: str | None = None,
external_id: str = "x",
) -> Activity:
return Activity(
external_id=external_id,
account_id="acct",
account_type=AccountType.ISA,
date=datetime(2026, 4, 1, tzinfo=UTC),
activity_type=activity_type,
currency="GBP",
quantity=Decimal(quantity) if quantity is not None else None,
unit_price=Decimal(unit_price) if unit_price is not None else None,
fee=Decimal(fee),
amount=Decimal(amount) if amount is not None else None,
)
def test_matched_cash_flow_for_buy_is_deposit_with_total_cost() -> None:
buy = _make_activity(
ActivityType.BUY, quantity="10", unit_price="200.50", fee="1.25",
external_id="buy-1",
)
match = _matched_cash_flow(buy)
assert match is not None
assert match.activity_type is ActivityType.DEPOSIT
assert match.amount == Decimal("2006.25") # 10*200.50 + 1.25
assert match.currency == "GBP"
assert match.account_id == buy.account_id
assert match.date == buy.date
assert match.external_id == "cash-flow-match:buy:buy-1"
def test_matched_cash_flow_for_sell_is_withdrawal_net_of_fee() -> None:
sell = _make_activity(
ActivityType.SELL, quantity="5", unit_price="300", fee="2.50",
external_id="sell-7",
)
match = _matched_cash_flow(sell)
assert match is not None
assert match.activity_type is ActivityType.WITHDRAWAL
assert match.amount == Decimal("1497.50") # 5*300 - 2.50
assert match.external_id == "cash-flow-match:sell:sell-7"
def test_matched_cash_flow_none_for_deposit_withdrawal_dividend() -> None:
dep = _make_activity(ActivityType.DEPOSIT, quantity=None, unit_price=None, amount="100")
wit = _make_activity(ActivityType.WITHDRAWAL, quantity=None, unit_price=None, amount="50")
div = _make_activity(ActivityType.DIVIDEND, quantity=None, unit_price=None, amount="5")
assert _matched_cash_flow(dep) is None
assert _matched_cash_flow(wit) is None
assert _matched_cash_flow(div) is None
def test_matched_cash_flow_skips_zero_amount_trades() -> None:
zero_buy = _make_activity(ActivityType.BUY, quantity="0", unit_price="100")
assert _matched_cash_flow(zero_buy) is None
def test_with_cash_flow_match_returns_pair_for_buy_single_for_deposit() -> None:
buy = _make_activity(ActivityType.BUY, external_id="buy-2")
dep = _make_activity(ActivityType.DEPOSIT, quantity=None, unit_price=None, amount="500")
assert len(_with_cash_flow_match(buy)) == 2
assert len(_with_cash_flow_match(dep)) == 1
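Review note: taken together, the helper tests above specify the matching rule fairly completely. A BUY gets a DEPOSIT for quantity × price plus fee, a SELL gets a WITHDRAWAL for quantity × price minus fee, cash activities get no match, and zero-value trades are skipped. The sketch below is one implementation consistent with those tests, using a trimmed stand-in `Activity` dataclass. It is not the code in `broker_sync.pipeline`, and treating "zero amount" as a zero gross value is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class ActivityType(Enum):
    BUY = "BUY"
    SELL = "SELL"
    DEPOSIT = "DEPOSIT"
    WITHDRAWAL = "WITHDRAWAL"
    DIVIDEND = "DIVIDEND"


@dataclass(frozen=True)
class Activity:
    """Trimmed stand-in for the real model (no account, date or currency)."""
    external_id: str
    activity_type: ActivityType
    quantity: Decimal | None = None
    unit_price: Decimal | None = None
    fee: Decimal = Decimal("0")
    amount: Decimal | None = None


def matched_cash_flow(trade: Activity) -> Activity | None:
    """Build the cash movement that offsets a trade, or None if not needed."""
    if trade.activity_type not in (ActivityType.BUY, ActivityType.SELL):
        return None
    if trade.quantity is None or trade.unit_price is None:
        return None
    gross = trade.quantity * trade.unit_price
    if gross == 0:
        # Assumption: "zero amount" means the trade has no gross value.
        return None
    if trade.activity_type is ActivityType.BUY:
        kind, amount, tag = ActivityType.DEPOSIT, gross + trade.fee, "buy"
    else:
        kind, amount, tag = ActivityType.WITHDRAWAL, gross - trade.fee, "sell"
    return Activity(
        external_id=f"cash-flow-match:{tag}:{trade.external_id}",
        activity_type=kind,
        amount=amount,
    )


def with_cash_flow_match(activity: Activity) -> list[Activity]:
    """Return the activity alone, or followed by its matching cash flow."""
    match = matched_cash_flow(activity)
    return [activity] if match is None else [activity, match]


buy = Activity("buy-1", ActivityType.BUY, Decimal("10"), Decimal("200.50"), Decimal("1.25"))
sell = Activity("sell-7", ActivityType.SELL, Decimal("5"), Decimal("300"), Decimal("2.50"))
dep = Activity("dep-1", ActivityType.DEPOSIT, amount=Decimal("500"))
```

Deriving the match's `external_id` from the trade's id keeps the pair deterministic, so a rerun produces the same ids and dedup can skip both rows.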