## Context
Prior commit 832732a scaffolded the provider with a stub fetch() that
raised FidelityProviderConfigError. This commit replaces the stub with
the end-to-end ingest flow, validated against the real PlanViewer site
during a live login session on 2026-04-18.

Fidelity UK PlanViewer mixes a legacy Struts2 HTML app
(www.planviewer.fidelity.co.uk) with a React SPA at
pv.planviewer.fidelity.co.uk. Authentication is PingFederate OAuth2 at
id.fidelity.co.uk — password + memorable word + SMS OTP, with a
remember-device cookie that keeps the session alive for weeks. The
transaction history is server-rendered HTML at DisplayMyPlanMemberTransHist.action;
current fund holdings come from the DisplayValuation.action JSON XHR.
Both live behind the same cookie jar, so one Playwright session (seeded
interactively once, kept alive via storage_state) can scrape both.
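
Because the remember-device cookie is what keeps the session alive, it may be useful to check how long the saved cookies have left before a scheduled run. The sketch below reads a Playwright storage_state file (a JSON object with a `cookies` list whose entries carry `domain` and `expires`, where `expires` is a Unix timestamp in seconds, or -1 for session cookies) and reports the earliest expiry for a domain suffix. The helper name `earliest_cookie_expiry` and the choice of suffix are assumptions for illustration and are not part of this commit.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path


def earliest_cookie_expiry(state_path: str, domain_suffix: str) -> datetime | None:
    """Return the earliest expiry among persistent cookies for a domain suffix.

    Session cookies (expires == -1) are ignored. Returns None when no
    persistent cookie matches. Hypothetical helper, not part of broker_sync.
    """
    state = json.loads(Path(state_path).read_text(encoding="utf-8"))
    expiries = [
        cookie["expires"]
        for cookie in state.get("cookies", [])
        # Cookie domains may carry a leading dot (".fidelity.co.uk").
        if cookie.get("domain", "").lstrip(".").endswith(domain_suffix)
        and cookie.get("expires", -1) > 0
    ]
    if not expiries:
        return None
    return datetime.fromtimestamp(min(expiries), tz=timezone.utc)
```

A cron wrapper could call this with `"fidelity.co.uk"` and warn a few days before the returned time, rather than waiting for the scrape to fail.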
## This change
- broker_sync/providers/parsers/fidelity.py (NEW)
  - parse_transactions_html: extracts cash-impacting rows from the
    #myplan_member_transhist_support table, skips Bulk Switches (no cash
    movement), emits FidelityCashTx with deterministic external_id for
    dedup.
  - parse_valuation_json: lifts fund code + name + units + price +
    contribution-type breakdown from the JSON payload.
- broker_sync/providers/fidelity_planviewer.py (REWRITTEN)
  - FidelityPlanViewerProvider.fetch() now loads storage_state, boots
    headless Chromium, navigates landing → main page (to hydrate the
    SPA session + capture DisplayValuation XHR) → transactions page
    with a wide 01 Jan 1990 → today window. Raises FidelitySessionError
    if PlanViewer shows the 15-min idle page or redirects back to
    id.fidelity.co.uk.
  - _gains_offset_activity emits a synthetic DEPOSIT/WITHDRAWAL with a
    date-keyed external_id so WF Net Worth reconciles to the
    Fidelity-reported pot value without stacking duplicates across
    monthly runs.
  - Rolls storage_state back to disk after each run, extending session
    TTL.
- tests/providers/test_fidelity_planviewer.py (EXTENDED)
  - 8 tests against a real captured fixture: account shape, guard on
    missing storage_state, full-fixture round-trip (51 txs summing to
    £102,004.15), Bulk Switch filtered, deterministic external_id,
    valuation parse with fund-code resolution, gains-offset direction
    + skip-when-empty.
- tests/fixtures/fidelity/transactions-full.html + valuation.json (NEW)
  - Sanitised captures from the 2026-04-18 live session.
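
A deterministic external_id for dedup can be derived by hashing the fields that identify a transaction. The sketch below assumes the key is the date, transaction type, and amount joined with `|`. The fields actually hashed by parse_transactions_html are not shown here, so `make_external_id` and its inputs are hypothetical; only the `fidelity:tx:<sha256[:16]>` shape is taken from the provider docstring.

```python
import hashlib
from datetime import date
from decimal import Decimal


def make_external_id(tx_date: date, tx_type: str, amount: Decimal) -> str:
    """Build a stable ID: identical inputs always give the same digest.

    Hypothetical helper; the real parser may hash different fields.
    """
    # Normalise each field so formatting differences ("250" vs "250.00",
    # stray whitespace, letter case) do not change the hash.
    key = "|".join([tx_date.isoformat(), tx_type.strip().lower(), f"{amount:.2f}"])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return f"fidelity:tx:{digest}"
```

With this particular key, two genuinely distinct transactions sharing a date, type, and amount would collapse into one ID, so a real implementation would probably need an extra discriminator such as a row reference.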
## What is NOT in this change
- CronJob + Vault secret wiring + Prometheus alert in
infra/stacks/broker-sync/main.tf — next commit.
- Dockerfile Chromium install — next commit.
- The scrape-and-import was already done manually (51 activities +
1 gains offset imported into WF account a7d6208d); this commit
productionises the code path so the monthly cron can do the same.
## Verification
### Automated
    $ poetry run pytest tests/providers/test_fidelity_planviewer.py -v
    8 passed in 0.88s

    $ poetry run pytest -q
    128 passed, 1 skipped in 1.41s

    $ poetry run mypy broker_sync/providers/fidelity_planviewer.py broker_sync/providers/parsers/fidelity.py
    Success: no issues found in 2 source files

    $ poetry run ruff check broker_sync/providers/fidelity_planviewer.py broker_sync/providers/parsers/fidelity.py
    All checks passed!
### Manual verification (2026-04-18 live run)
1. poetry run broker-sync fidelity-seed (headed browser + SMS OTP):
   captured storage_state, staged to Vault.
2. Inline import script hit the same code paths the provider now runs;
   52 activities imported into a new WF WORKPLACE_PENSION account, and WF
   Net Worth rose from £865,358 to £1,003,083.
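
The reconciliation in step 2 relies on the gains offset making contributions plus offset equal the reported pot value. Below is a minimal sketch of that arithmetic using made-up figures rather than the real account data. `gains_offset` is a hypothetical standalone helper that mirrors the sign logic of _gains_offset_activity without the Activity model.

```python
from __future__ import annotations

from decimal import Decimal


def gains_offset(
    pot_value: Decimal, contributions: list[Decimal]
) -> tuple[str, Decimal] | None:
    """Return (activity kind, amount) bridging contributions to pot value.

    Returns None when the pot already equals total contributions.
    """
    total_contrib = sum(contributions, Decimal(0))
    gains = pot_value - total_contrib
    if gains == 0:
        return None
    # A pot above contributions is booked as a DEPOSIT, below as a WITHDRAWAL.
    kind = "DEPOSIT" if gains > 0 else "WITHDRAWAL"
    return kind, abs(gains)


# Illustrative figures only: two £500 contributions, pot now worth £1,200.
print(gains_offset(Decimal("1200.00"), [Decimal("500.00"), Decimal("500.00")]))
```

In this example the helper returns a DEPOSIT of 200.00, so contributions (1,000.00) plus the offset equal the pot (1,200.00).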
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
253 lines
9.7 KiB
Python
"""Fidelity UK PlanViewer provider — workplace pension backfill + monthly sync.

PlanViewer has no public individual-member API. The SPA (at
``pv.planviewer.fidelity.co.uk``) and the legacy HTML app (at
``www.planviewer.fidelity.co.uk``) share session cookies via PingFederate
OAuth at ``id.fidelity.co.uk``.

We keep a Playwright-maintained session via ``storage_state.json``:

1. **One-off seed** (``broker-sync fidelity-seed``): Viktor runs a headed
   Chromium, logs in (password + memorable word + SMS MFA), clicks
   "Remember device". The storage_state is persisted to Vault.
2. **Monthly cron**: loads storage_state, boots headless Chromium, navigates
   to the transaction-history page with a wide date range, parses the HTML
   table, and intercepts the ``DisplayValuation`` XHR for the current
   fund holdings. On 401/idle-timeout we raise
   :class:`FidelitySessionError` so Prometheus alerts Viktor to re-seed.

## Emitted Activity shape

- One ``DEPOSIT`` per cash-impacting transaction (Regular Premium, Single
  Premium, rebate, etc.). ``external_id = fidelity:tx:<sha256[:16]>``.
- One synthetic ``DEPOSIT`` (or ``WITHDRAWAL`` when the pot is worth less
  than total contributions) for unrealised gains so WF's Net Worth matches
  the Fidelity dashboard. ``external_id = fidelity:gains:<YYYY-MM-DD>``.
- Bulk Switches / Fund Switches are skipped (no cash movement).
"""

from __future__ import annotations

import contextlib
import logging
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from decimal import Decimal
from pathlib import Path
from typing import Any, NamedTuple

from broker_sync.models import Account, AccountType, Activity, ActivityType
from broker_sync.providers.parsers.fidelity import (
    FidelityCashTx,
    FidelityHolding,
    parse_transactions_html,
    parse_valuation_json,
)

log = logging.getLogger(__name__)

ACCOUNT_ID = "fidelity-workplace-pension"
_CCY = "GBP"

_PV_BASE = "https://www.planviewer.fidelity.co.uk"
_PV_TX_PATH = "/planviewer/DisplayMyPlanMemberTransHist.action"
_PV_VALUATION_PATH = "/planviewer/DisplayValuation.action"
_PV_LANDING = "https://www.planviewer.fidelity.co.uk/"

# A wide backfill cap; scheme can't predate 1990.
_BACKFILL_START = "01 Jan 1990"


class FidelityCreds(NamedTuple):
    """Paths needed to run the provider."""
    storage_state_path: str
    plan_id: str
    headless: bool = True


class FidelitySessionError(Exception):
    """Raised when PlanViewer rejects the saved session (re-seed required)."""


class FidelityProviderConfigError(Exception):
    """Raised when provider config is missing or obviously wrong."""


def _tx_to_activity(tx: FidelityCashTx) -> Activity:
    """Map a Fidelity cash transaction to a canonical DEPOSIT."""
    return Activity(
        external_id=tx.external_id,
        account_id=ACCOUNT_ID,
        account_type=AccountType.WORKPLACE_PENSION,
        date=tx.date,
        activity_type=ActivityType.DEPOSIT,
        currency=_CCY,
        amount=tx.amount,
        notes=f"fidelity-planviewer:{tx.tx_type}",
    )


def _gains_offset_activity(
    holdings: list[FidelityHolding],
    transactions: list[FidelityCashTx],
    as_of: datetime,
) -> Activity | None:
    """Create a synthetic DEPOSIT/WITHDRAWAL so WF Net Worth matches the
    Fidelity dashboard's reported pot value.

    The offset carries a date-derived external_id so monthly runs refresh
    the same synthetic entry rather than stacking duplicates.
    """
    if not holdings:
        return None
    total_value = sum((h.total_value for h in holdings), Decimal(0))
    total_contrib = sum((t.amount for t in transactions), Decimal(0))
    gains = total_value - total_contrib
    if gains == 0:
        return None
    return Activity(
        external_id=f"fidelity:gains:{as_of.date().isoformat()}",
        account_id=ACCOUNT_ID,
        account_type=AccountType.WORKPLACE_PENSION,
        date=as_of,
        activity_type=ActivityType.DEPOSIT if gains > 0 else ActivityType.WITHDRAWAL,
        currency=_CCY,
        amount=abs(gains),
        notes=(f"fidelity-planviewer:unrealised-gains-offset "
               f"(pot=£{total_value}, contrib=£{total_contrib})"),
    )


class FidelityPlanViewerProvider:
    """Read-only provider against Fidelity UK PlanViewer.

    Lifecycle:
    - ``accounts()`` advertises the single WF workplace-pension account.
    - ``fetch(since, before)`` opens a Playwright session with the saved
      storage_state, navigates to the transaction-history page with a wide
      date range, scrapes the table, and intercepts the valuation XHR.
    """
    name = "fidelity-planviewer"

    def __init__(self, creds: FidelityCreds) -> None:
        self._creds = creds

    def accounts(self) -> list[Account]:
        return [
            Account(
                id=ACCOUNT_ID,
                name="Fidelity UK Pension",
                account_type=AccountType.WORKPLACE_PENSION,
                currency=_CCY,
                provider=self.name,
            ),
        ]

    async def fetch(
        self,
        *,
        since: datetime | None = None,
        before: datetime | None = None,
    ) -> AsyncIterator[Activity]:
        state_path = self._creds.storage_state_path
        if not Path(state_path).exists():
            raise FidelityProviderConfigError(
                f"storage_state not found at {state_path} — "
                "run `broker-sync fidelity-seed` first")

        tx_html, valuation_json = await _scrape_live_session(
            state_path=state_path, headless=self._creds.headless,
        )
        transactions = parse_transactions_html(tx_html)
        holdings = parse_valuation_json(valuation_json)
        log.info("fidelity: parsed %d transactions, %d holdings",
                 len(transactions), len(holdings))

        for tx in transactions:
            if since is not None and tx.date < since:
                continue
            if before is not None and tx.date >= before:
                continue
            yield _tx_to_activity(tx)

        # The gains offset is always "as of now" so it reflects today's pot.
        # Only emit when the caller isn't windowing (full state).
        if since is None and before is None:
            offset = _gains_offset_activity(holdings, transactions, datetime.now(UTC))
            if offset is not None:
                yield offset


async def _scrape_live_session(
    *,
    state_path: str,
    headless: bool,
) -> tuple[str, dict[str, Any]]:
    """Load storage_state, navigate the transaction + valuation pages,
    return (transactions HTML, valuation JSON payload).

    Raises :class:`FidelitySessionError` if the session is dead (15-min idle,
    cookie expiry, etc.); Viktor must re-seed.
    """
    from playwright.async_api import async_playwright

    captured_valuation: dict[str, dict[str, Any]] = {}
    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=headless)
        try:
            ctx = await browser.new_context(
                storage_state=state_path,
                user_agent=("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                            "AppleWebKit/537.36 (KHTML, like Gecko) "
                            "Chrome/147.0.0.0 Safari/537.36"),
                viewport={"width": 1280, "height": 900},
            )
            page = await ctx.new_page()

            async def on_response(resp: Any) -> None:
                if _PV_VALUATION_PATH in resp.url and resp.status < 400:
                    with contextlib.suppress(Exception):
                        captured_valuation["payload"] = await resp.json()
            page.on("response", on_response)

            # Trigger session + capture valuation by navigating through landing
            # → main page. The SPA fires DisplayValuation on the main page.
            await page.goto(_PV_LANDING, wait_until="networkidle", timeout=30000)
            await page.wait_for_timeout(2000)
            main_url = f"{_PV_BASE}/planviewer/DisplayMainPage.action"
            await page.goto(main_url, wait_until="networkidle", timeout=30000)
            await page.wait_for_timeout(3000)
            if "idle for more than 15 minutes" in (await page.content()) \
                    or "id.fidelity.co.uk" in page.url:
                raise FidelitySessionError(
                    "PlanViewer session stale — run `broker-sync fidelity-seed`")

            # Now pull the transactions page with a wide date range.
            await page.goto(f"{_PV_BASE}{_PV_TX_PATH}",
                            wait_until="networkidle", timeout=30000)
            await page.wait_for_timeout(1500)
            await page.fill('input[name="startDate"]', _BACKFILL_START)
            today = await page.evaluate(
                "new Date().toLocaleDateString('en-GB',"
                "{day:'2-digit',month:'short',year:'numeric'}).replace(/,/g,'')")
            await page.fill('input[name="endDate"]', today)
            await page.focus('input[name="endDate"]')
            await page.keyboard.press("Enter")
            with contextlib.suppress(Exception):
                await page.wait_for_load_state("networkidle", timeout=15000)
            await page.wait_for_timeout(2000)
            tx_html = await page.content()

            # If valuation wasn't picked up on the main page, request directly.
            if "payload" not in captured_valuation:
                r = await page.request.get(f"{_PV_BASE}{_PV_VALUATION_PATH}")
                if r.ok:
                    with contextlib.suppress(Exception):
                        captured_valuation["payload"] = await r.json()

            # Roll the storage_state so the next run benefits from any refresh.
            await ctx.storage_state(path=state_path)
        finally:
            await browser.close()

    valuation: dict[str, Any] = captured_valuation.get("payload") or {}
    return tx_html, valuation
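
The stale-session check inside `_scrape_live_session` needs a live browser to exercise. One way to unit-test the same decision is to factor it into a pure function over the page HTML and URL. The sketch below is an assumption about how that might look: `is_stale_session` and the two constants are hypothetical names, and it compares the URL's hostname instead of doing the substring match used in the provider, which is a slightly stricter test.

```python
from urllib.parse import urlparse

# Marker text and login host taken from the provider's inline check.
IDLE_MARKER = "idle for more than 15 minutes"
LOGIN_HOST = "id.fidelity.co.uk"


def is_stale_session(page_html: str, page_url: str) -> bool:
    """Return True when the page looks like an idle-timeout or login redirect.

    Hypothetical helper; the provider performs this check inline.
    """
    if IDLE_MARKER in page_html:
        return True
    # Hostname comparison avoids matching the login host inside a query string.
    return urlparse(page_url).hostname == LOGIN_HOST
```

The provider could then raise FidelitySessionError whenever this returns True for `await page.content()` and `page.url`, and the fixture-based tests could cover both branches without Playwright.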