Context: InvestEngine has no public API. The web app uses an undocumented
Django REST backend at /api/v0.3X/*, which requires a Bearer token and
rolls its minor every 4-6 weeks. MFA (push-approval) is mandatory on
every login, so we do NOT automate login — Viktor logs in manually in
a browser, copies the Bearer out of devtools, and pastes it into Vault.
This provider consumes that token.
The response shape is UNVERIFIED (MFA blocks an unauthed probe, so the
research leading into Phase 2b could only confirm endpoint existence via
401 responses on v0.31 and v0.32). `_transaction_to_activity` is written
defensively:
- accepts both `results`/`data` list wrappers and `next`/`meta.next_page`
cursor fields for pagination;
- accepts `symbol`/`ticker`, `price`/`unit_price`, `amount`/`value`,
`date`/`created_at`/`timestamp` field-name variants;
- maps exact type strings (BUY, SELL, DIVIDEND, INTEREST, DEPOSIT,
WITHDRAWAL, FEE, TAX) and substring-matches DEPOSIT/WITHDRAWAL for
variants like "CASH_DEPOSIT"; refuses to guess on anything else —
unknown types log WARNING and return None (silent misclassification
would corrupt tax reporting).
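
A minimal sketch of the defensive mapping described above, not the module's code. The names `classify`, `first_present`, and `opt_decimal` are hypothetical, and plain strings stand in for `ActivityType`. One deliberate difference: `first_present` checks for `None` rather than truthiness, so a zero amount is kept instead of falling through to the next field-name variant.

```python
from decimal import Decimal
from typing import Any, Optional

# Hypothetical stand-in for the ActivityType members named in this commit.
KNOWN_TYPES = {"BUY", "SELL", "DIVIDEND", "INTEREST",
               "DEPOSIT", "WITHDRAWAL", "FEE", "TAX"}


def classify(raw_type: str) -> Optional[str]:
    """Exact match first, then substring match for cash-movement variants."""
    upper = raw_type.upper()
    if upper in KNOWN_TYPES:
        return upper
    if "DEPOSIT" in upper:
        return "DEPOSIT"
    if "WITHDRAW" in upper:
        return "WITHDRAWAL"
    return None  # refuse to guess; the caller logs a WARNING and skips


def first_present(raw: dict[str, Any], *keys: str) -> Any:
    """Return the first value that is present and not None (zeros survive)."""
    for key in keys:
        value = raw.get(key)
        if value is not None:
            return value
    return None


def opt_decimal(value: Any) -> Optional[Decimal]:
    """Convert via str() so floats do not drag binary noise into Decimal."""
    return None if value is None else Decimal(str(value))


# Assumed payload shape: the real IE transaction JSON is unverified.
txn = {"type": "cash_deposit", "value": 0, "created_at": "2026-01-05T09:00:00Z"}
kind = classify(txn["type"])
amount = opt_decimal(first_present(txn, "amount", "value"))
when = first_present(txn, "date", "created_at", "timestamp")
```
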
Version probe:
  _START_VERSION_MINOR=32 (research: v0.31/v0.32 live, v0.30 Gone)
  GET /api/v0.{n}/ → 410 ? advance : done
  cap at v0.60 so a misconfigured backend doesn't infinite-loop.
A 410 response on a data endpoint triggers exactly one re-probe + retry
against the newer version; the new version is cached on the instance for
the rest of the process.
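
The probe and the path retarget can be sketched synchronously as below. This is an illustration under assumptions: `get_status` is a hypothetical callable standing in for the async httpx client, `RuntimeError` stands in for InvestEngineVersionError, and the fake backend's status codes are invented for the example.

```python
from typing import Callable

START_MINOR = 32  # mirrors _START_VERSION_MINOR
MAX_MINOR = 60    # mirrors the v0.60 cap


def probe_version(get_status: Callable[[str], int],
                  start: int = START_MINOR,
                  cap: int = MAX_MINOR) -> int:
    """Return the first minor whose root path does not answer 410 Gone."""
    for minor in range(start, cap + 1):
        if get_status(f"/api/v0.{minor}/") != 410:
            return minor
    raise RuntimeError(f"no live version between v0.{start} and v0.{cap}")


def retarget(path: str, old_minor: int, new_minor: int) -> str:
    """Point a data-endpoint path at the newer version for the single retry."""
    return path.replace(f"/v0.{old_minor}/", f"/v0.{new_minor}/", 1)


# Fake backend: v0.32 and v0.33 retired, v0.34 live but demanding auth.
statuses = {"/api/v0.32/": 410, "/api/v0.33/": 410, "/api/v0.34/": 401}
live = probe_version(lambda path: statuses.get(path, 404))
retry_path = retarget("/api/v0.32/transactions/", 32, live)
```

Note that any non-410 status ends the walk, so a 404 on a version that does not exist yet would also be returned as the "live" minor.
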
Token expiry is tracked at the Python layer:
- constructor takes token_expires_at (set by Viktor when he pastes);
- fetch() fails fast with InvestEngineTokenExpiredError if the clock
says the token is already dead — cheaper than burning a request for
a known 401;
- a real 401 response also raises InvestEngineTokenExpiredError so the
CLI/pipeline can alert Viktor to paste a new token.
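
The clock-based fail-fast check amounts to something like the following sketch. `TokenExpiredError` and `ensure_token_alive` are hypothetical names used for illustration; the sketch assumes `expires_at` is timezone-aware, since comparing a naive datetime with an aware one raises TypeError.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class TokenExpiredError(Exception):
    """Hypothetical stand-in for InvestEngineTokenExpiredError."""


def ensure_token_alive(expires_at: datetime,
                       now: Optional[datetime] = None) -> None:
    """Raise before any request is sent if the token is already dead."""
    current = now if now is not None else datetime.now(timezone.utc)
    if expires_at <= current:
        raise TokenExpiredError(
            f"token expired at {expires_at.isoformat()}; paste a new one")


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
ensure_token_alive(now + timedelta(days=16), now=now)  # valid: no exception

try:
    ensure_token_alive(now - timedelta(seconds=1), now=now)
    expired = False
except TokenExpiredError:
    expired = True
```
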
Vault schema expected (consumed by the CLI in the follow-up commit):
  secret/broker-sync
    investengine_bearer_token       <devtools-captured Bearer>
    investengine_token_expires_at   <ISO-8601 set at paste time>
    investengine_refresh_token      <optional, not used yet>
This module does NOT read Vault — the caller hands values in, keeping
the provider testable.
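
Since the CLI lands in a follow-up commit, the hand-off below is only a guess at its shape. `provider_kwargs` and `needs_alert` are hypothetical helpers, the secret is a plain dict rather than a real Vault read, and the 3-day window is taken from the module docstring.

```python
from datetime import datetime, timedelta, timezone

ALERT_WINDOW = timedelta(days=3)  # the module docstring mentions a 3-day warning


def provider_kwargs(secret: dict) -> dict:
    """Turn the Vault key/value pairs into constructor keyword arguments."""
    raw_expiry = secret["investengine_token_expires_at"]
    expires_at = datetime.fromisoformat(raw_expiry.replace("Z", "+00:00"))
    return {
        "bearer_token": secret["investengine_bearer_token"],
        "token_expires_at": expires_at,
    }


def needs_alert(expires_at: datetime, now: datetime) -> bool:
    """True once the token is inside the alert window (or already expired)."""
    return expires_at - now <= ALERT_WINDOW


# Placeholder values; a real secret would come from secret/broker-sync.
secret = {
    "investengine_bearer_token": "example-token",
    "investengine_token_expires_at": "2026-05-17T00:00:00Z",
}
kwargs = provider_kwargs(secret)
alert_soon = needs_alert(kwargs["token_expires_at"],
                         datetime(2026, 5, 15, tzinfo=timezone.utc))
alert_early = needs_alert(kwargs["token_expires_at"],
                          datetime(2026, 5, 1, tzinfo=timezone.utc))
```
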
This change:
- New `broker_sync/providers/invest_engine.py`:
  * InvestEngineProvider with .accounts(), .fetch(), .close()
  * _probe_version / _active_version with 410-retry + cache
  * _transaction_to_activity with defensive type + field-name mapping
  * InvestEngineError / InvestEngineTokenExpiredError / InvestEngineVersionError
- New `tests/providers/test_invest_engine.py`: 22 tests covering version
  probe, expiry fail-fast, 401→TokenExpired, 410→reprobe, header
  shape, pagination variants, and the full txn→activity mapping. One
  @pytest.mark.skip integration stub for when Viktor has a live token.
Assumptions flagged for verification with a live token:
- IE id field is castable to str (int or string)
- Type strings are one of BUY, SELL, DIVIDEND, INTEREST, DEPOSIT,
WITHDRAWAL, FEE, TAX, or contain DEPOSIT/WITHDRAWAL as a substring
- Transactions carry numeric quantity/price/amount (Decimal-convertible)
- Date field is one of: date / created_at / timestamp
- Pagination shape is {results, next} OR {data, meta.next_page}
- /transactions/ accepts ?portfolio=<id>&start=YYYY-MM-DD&end=YYYY-MM-DD
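
The two assumed pagination shapes can be walked with one loop, sketched here against an in-memory fake. `items_of`, `next_of`, `to_path`, and `walk` are hypothetical names, and mixing both shapes in one sequence is done only to exercise both branches; a real backend would presumably use one of them.

```python
from typing import Any, Callable, Iterator, Optional
from urllib.parse import urlsplit


def items_of(page: dict[str, Any]) -> list:
    """Accept either a `results` or a `data` list wrapper."""
    for key in ("results", "data"):
        items = page.get(key)
        if isinstance(items, list):
            return items
    return []


def next_of(page: dict[str, Any]) -> Optional[str]:
    """Accept either a top-level `next` or a nested `meta.next_page`."""
    nxt = page.get("next")
    if isinstance(nxt, str) and nxt:
        return nxt
    meta = page.get("meta")
    if isinstance(meta, dict) and isinstance(meta.get("next_page"), str):
        return meta["next_page"] or None
    return None


def to_path(next_url: str) -> str:
    """Strip scheme and host so the client's base URL stays authoritative."""
    parts = urlsplit(next_url)
    return parts.path + (f"?{parts.query}" if parts.query else "")


def walk(fetch_page: Callable[[str], dict], first_path: str) -> Iterator[dict]:
    path: Optional[str] = first_path
    while path is not None:
        page = fetch_page(path)
        yield from items_of(page)
        nxt = next_of(page)
        path = to_path(nxt) if nxt else None


# Fake pages: the first uses {results, next}, the second {data, meta.next_page}.
pages = {
    "/api/v0.32/transactions/": {
        "results": [{"id": 1}],
        "next": "https://investengine.com/api/v0.32/transactions/?page=2",
    },
    "/api/v0.32/transactions/?page=2": {
        "data": [{"id": 2}],
        "meta": {"next_page": None},
    },
}
ids = [item["id"] for item in walk(pages.__getitem__, "/api/v0.32/transactions/")]
```
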
## Automated
poetry run pytest tests/providers/test_invest_engine.py -v
======================== 22 passed, 1 skipped in 0.26s =========================
poetry run pytest -q
95 passed, 1 skipped in 0.84s
poetry run mypy --strict .
Success: no issues found in 34 source files
poetry run ruff check .
All checks passed!
poetry run yapf --diff broker_sync/providers/invest_engine.py tests/providers/test_invest_engine.py
(clean)
## Manual Verification
Once Viktor pastes a live token:
1. Export:
export IE_BEARER_TOKEN='<paste>'
export IE_TOKEN_EXPIRES_AT='2026-05-17T00:00:00+00:00'
2. Remove the @pytest.mark.skip marker from test_live_integration_smoke
3. poetry run pytest tests/providers/test_invest_engine.py::test_live_integration_smoke -v
Expected: a successful round-trip that returns an empty-or-populated
list of Activity objects. This proves that the version probe, auth
header, and portfolio enumeration actually work against the real IE backend.
4. Validate the Assumptions list above against the real transaction JSON.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
380 lines
14 KiB
Python
"""InvestEngine Bearer-token provider.

InvestEngine (https://investengine.com) has no public API and requires MFA
(push-approval via the IE mobile app) on every login. We work around that
by having Viktor log in manually in a browser, copy the Bearer token out
of devtools, and paste it into Vault. This module consumes that token.

## Vault schema — `secret/broker-sync`

The following keys are expected in Vault. The caller (CLI/pipeline) reads
them and hands the values to the constructor. This module does NOT read
Vault directly — it stays testable.

- ``investengine_bearer_token`` — the Bearer string Viktor pastes from
  devtools. Expires on IE's refresh schedule (~monthly based on observed
  behaviour).
- ``investengine_token_expires_at`` — ISO-8601 timestamp Viktor sets WHEN
  HE PASTES the token. Used to alert 3 days before expiry and to fail
  fast before making a request with a known-dead token.
- ``investengine_refresh_token`` *(optional)* — if Viktor's devtools
  capture included a refresh token, we may attempt auto-refresh in a
  future iteration. Not used yet.

## Version probing

IE rolls the ``/api/v0.3X/`` version every 4-6 weeks. ``v0.29`` and
``v0.30`` were ``410 Gone`` at research time; ``v0.31`` and ``v0.32``
were live (401 without auth — the correct "auth required" signal);
``v0.33`` and later returned 404 (not yet created). The probe starts
at ``v0.32`` and walks forward on 410, stopping at the first version
that returns anything other than 410 (401/404/200/etc).
"""
from __future__ import annotations

import logging
from collections.abc import AsyncIterator
from datetime import UTC, datetime
from decimal import Decimal
from typing import Any

import httpx

from broker_sync.models import Account, AccountType, Activity, ActivityType

log = logging.getLogger(__name__)

_DEFAULT_BASE_URL = "https://investengine.com"
_USER_AGENT = "broker-sync/0.1 (+https://github.com/ViktorBarzin/broker-sync)"

# Version probe starts here. If IE retires this minor (it starts returning
# 410), bump the constant rather than relying on a re-probe in every process.
_START_VERSION_MINOR = 32
# Hard cap on how far forward we probe before giving up. IE has never
# skipped versions; this is defence against a runaway loop.
_MAX_VERSION_MINOR = 60

# One logical account — IE has only ever had a single ISA per user.
_ACCOUNT_ID = "invest-engine-primary"
_ACCOUNT = Account(
    id=_ACCOUNT_ID,
    name="InvestEngine ISA",
    account_type=AccountType.ISA,
    currency="GBP",
    provider="invest-engine",
)

# Type-string → ActivityType. Exact IE strings are UNVERIFIED — we match
# case-insensitively and fall back to substring checks for the common
# variants ("DEPOSIT", "WITHDRAWAL").
_EXACT_TYPE_MAP: dict[str, ActivityType] = {
    "BUY": ActivityType.BUY,
    "SELL": ActivityType.SELL,
    "DIVIDEND": ActivityType.DIVIDEND,
    "INTEREST": ActivityType.INTEREST,
    "DEPOSIT": ActivityType.DEPOSIT,
    "WITHDRAWAL": ActivityType.WITHDRAWAL,
    "FEE": ActivityType.FEE,
    "TAX": ActivityType.TAX,
}


class InvestEngineError(Exception):
    """Any non-retryable InvestEngine API failure."""


class InvestEngineTokenExpiredError(InvestEngineError):
    """Bearer token expired by the clock or rejected by IE (401). Viktor must paste a new token."""


class InvestEngineVersionError(InvestEngineError):
    """Could not find a live /api/v0.3X/ version on the IE backend."""


def _version_path(minor: int) -> str:
    return f"/api/v0.{minor}/"


async def _probe_version(
    client: httpx.AsyncClient,
    *,
    start_minor: int = _START_VERSION_MINOR,
    max_minor: int = _MAX_VERSION_MINOR,
) -> int:
    """Walk forward from ``start_minor`` looking for a live API version.

    Returns the minor number of the first version that is NOT 410 Gone.
    A live version is one IE currently accepts; a 401 response is the
    expected "auth required" signal and confirms the version is serving
    traffic. Raises :class:`InvestEngineVersionError` if no live version
    is found before ``max_minor``.
    """
    for minor in range(start_minor, max_minor + 1):
        resp = await client.get(_version_path(minor))
        if resp.status_code != 410:
            return minor
    raise InvestEngineVersionError(
        f"No live /api/v0.3X/ between v0.{start_minor} and v0.{max_minor}")


def _parse_iso(ts: str) -> datetime:
    """Accept both ``...Z`` and ``...+HH:MM`` suffixes."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def _opt_decimal(raw: Any) -> Decimal | None:
    if raw is None:
        return None
    return Decimal(str(raw))


def _classify_type(raw_type: str) -> ActivityType | None:
    """Map a raw IE type string to ActivityType, or None if unknown."""
    upper = raw_type.upper()
    exact = _EXACT_TYPE_MAP.get(upper)
    if exact is not None:
        return exact
    if "DEPOSIT" in upper:
        return ActivityType.DEPOSIT
    if "WITHDRAWAL" in upper or "WITHDRAW" in upper:
        return ActivityType.WITHDRAWAL
    return None


def _transaction_to_activity(raw: dict[str, Any]) -> Activity | None:
    """Turn one IE transaction dict into a canonical Activity.

    The IE response shape is UNVERIFIED — these assumptions WILL need
    review once Viktor pastes a live token:

    - ``id``: string or int (cast to str for ``external_id``)
    - ``type``: upper-case string — ``BUY``/``SELL``/``DIVIDEND``/etc.
    - ``symbol``: ticker string (may be missing on cash events)
    - ``quantity`` / ``price`` / ``amount``: numeric-in-a-string or float
    - ``currency``: ISO code; default ``GBP`` for an ISA
    - ``date``: ISO-8601 with ``Z`` or offset suffix

    Unknown type strings are logged at WARNING and skipped — silent
    misclassification would corrupt tax reporting, so we refuse to guess.
    """
    raw_type = str(raw.get("type", ""))
    activity_type = _classify_type(raw_type)
    if activity_type is None:
        log.warning(
            "invest-engine: skipping transaction id=%s with unknown type=%r",
            raw.get("id"),
            raw_type,
        )
        return None

    txn_id = raw.get("id")
    if txn_id is None:
        log.warning("invest-engine: skipping transaction with missing id: %r", raw)
        return None

    currency = str(raw.get("currency") or "GBP")
    date_str = raw.get("date") or raw.get("created_at") or raw.get("timestamp")
    if not isinstance(date_str, str):
        log.warning("invest-engine: skipping txn id=%s — no parseable date", txn_id)
        return None

    quantity = _opt_decimal(raw.get("quantity"))
    unit_price = _opt_decimal(raw.get("price") or raw.get("unit_price"))
    amount = _opt_decimal(raw.get("amount") or raw.get("value"))
    fee = _opt_decimal(raw.get("fee")) or Decimal("0")
    symbol_raw = raw.get("symbol") or raw.get("ticker")
    symbol = str(symbol_raw) if symbol_raw else None

    return Activity(
        external_id=f"invest-engine:{txn_id}",
        account_id=_ACCOUNT_ID,
        account_type=AccountType.ISA,
        date=_parse_iso(date_str),
        activity_type=activity_type,
        currency=currency,
        symbol=symbol,
        quantity=quantity,
        unit_price=unit_price,
        amount=amount,
        fee=fee,
    )


def _extract_list(page: dict[str, Any]) -> list[dict[str, Any]]:
    """Handle both ``{results: [...]}`` and ``{data: [...]}`` shapes."""
    for key in ("results", "data"):
        items = page.get(key)
        if isinstance(items, list):
            return [i for i in items if isinstance(i, dict)]
    return []


def _extract_next(page: dict[str, Any]) -> str | None:
    """Pull the next-page URL from either DRF ``next`` or JSON:API ``meta.next_page``."""
    nxt = page.get("next")
    if isinstance(nxt, str) and nxt:
        return nxt
    meta = page.get("meta")
    if isinstance(meta, dict):
        meta_nxt = meta.get("next_page") or meta.get("next")
        if isinstance(meta_nxt, str) and meta_nxt:
            return meta_nxt
    return None


class InvestEngineProvider:
    """Concrete Provider for InvestEngine.

    Only one logical account per user (one ISA, GBP). The token expiry is
    tracked at the Python layer: the CLI alerts 3 days before expiry, and
    this class fails fast if the clock says the token is already dead
    (cheaper than burning a request for the same 401).
    """

    name = "invest-engine"

    def __init__(
        self,
        *,
        bearer_token: str,
        token_expires_at: datetime,
        base_url: str = _DEFAULT_BASE_URL,
        transport: httpx.AsyncBaseTransport | None = None,
    ) -> None:
        self._token = bearer_token
        self._token_expires_at = token_expires_at
        self._client = httpx.AsyncClient(
            base_url=base_url,
            timeout=30.0,
            transport=transport,
            headers={
                "Authorization": f"Bearer {bearer_token}",
                "User-Agent": _USER_AGENT,
            },
        )
        self._version_minor: int | None = None

    def accounts(self) -> list[Account]:
        return [_ACCOUNT]

    async def close(self) -> None:
        await self._client.aclose()

    async def _active_version(self, *, force: bool = False) -> int:
        """Cache the live API minor. Set ``force=True`` after a 410 to re-probe."""
        if self._version_minor is not None and not force:
            return self._version_minor
        start = (self._version_minor + 1) if force and self._version_minor else _START_VERSION_MINOR
        minor = await _probe_version(self._client, start_minor=start)
        self._version_minor = minor
        return minor

    async def _request_json(self, path: str, params: dict[str, str] | None = None) -> Any:
        """GET ``path`` (relative), return JSON. Handles one 410 re-probe retry."""
        resp = await self._client.get(path, params=params)
        if resp.status_code == 401:
            raise InvestEngineTokenExpiredError(
                f"InvestEngine rejected Bearer token (HTTP 401 on {path}); "
                f"token_expires_at={self._token_expires_at.isoformat()}. "
                f"Viktor must paste a new token into Vault.")
        if resp.status_code == 410:
            # Version rolled mid-session. Re-probe, retarget path, retry once.
            old_minor = self._version_minor
            new_minor = await self._active_version(force=True)
            if old_minor is not None and new_minor != old_minor:
                new_path = path.replace(f"/v0.{old_minor}/", f"/v0.{new_minor}/", 1)
                retry = await self._client.get(new_path, params=params)
                if retry.status_code == 401:
                    raise InvestEngineTokenExpiredError(f"InvestEngine 401 on retry {new_path}")
                if retry.status_code != 200:
                    raise InvestEngineError(
                        f"InvestEngine {new_path} HTTP {retry.status_code} after re-probe")
                return retry.json()
            raise InvestEngineError(f"InvestEngine 410 on {path} and no newer version")
        if resp.status_code != 200:
            raise InvestEngineError(
                f"InvestEngine {path} HTTP {resp.status_code}: {resp.text[:200]}")
        return resp.json()

    async def fetch(
        self,
        *,
        since: datetime | None = None,
        before: datetime | None = None,
    ) -> AsyncIterator[Activity]:
        # Fail fast if the token is already known-dead.
        if self._token_expires_at <= datetime.now(UTC):
            raise InvestEngineTokenExpiredError(
                f"InvestEngine token expired at {self._token_expires_at.isoformat()} — "
                f"Viktor must paste a new token.")

        version = await self._active_version()
        portfolio_ids = await self._list_portfolio_ids(version)
        for pid in portfolio_ids:
            async for activity in self._fetch_portfolio(version, pid, since, before):
                yield activity

    async def _list_portfolio_ids(self, version: int) -> list[str]:
        """Walk `/portfolios/` pagination and return the list of ids."""
        ids: list[str] = []
        path: str | None = f"/api/v0.{version}/portfolios/"
        while path is not None:
            page = await self._request_json(path)
            if not isinstance(page, dict):
                log.warning("invest-engine: /portfolios/ returned non-dict %r", type(page))
                break
            for item in _extract_list(page):
                pid = item.get("id")
                if pid is None:
                    continue
                ids.append(str(pid))
            path = _next_page_path(_extract_next(page), current=path)
        return ids

    async def _fetch_portfolio(
        self,
        version: int,
        portfolio_id: str,
        since: datetime | None,
        before: datetime | None,
    ) -> AsyncIterator[Activity]:
        params: dict[str, str] = {"portfolio": portfolio_id}
        if since is not None:
            params["start"] = since.date().isoformat()
        if before is not None:
            params["end"] = before.date().isoformat()
        path: str | None = f"/api/v0.{version}/transactions/"
        while path is not None:
            page = await self._request_json(
                path, params=params if path.endswith("/transactions/") else None)
            if not isinstance(page, dict):
                break
            for raw in _extract_list(page):
                activity = _transaction_to_activity(raw)
                if activity is None:
                    continue
                if since is not None and activity.date < since:
                    continue
                if before is not None and activity.date >= before:
                    continue
                yield activity
            path = _next_page_path(_extract_next(page), current=path)


def _next_page_path(raw_next: str | None, *, current: str) -> str | None:
    """Normalise a ``next`` URL to a request path.

    IE might emit full URLs (``https://investengine.com/api/v0.32/...``)
    or relative paths. We return the path component only — the httpx
    client holds the base URL.
    """
    if raw_next is None:
        return None
    if raw_next.startswith("http://") or raw_next.startswith("https://"):
        from urllib.parse import urlparse
        parsed = urlparse(raw_next)
        path = parsed.path
        if parsed.query:
            path = f"{path}?{parsed.query}"
        return path
    return raw_next