## Context

Viktor's InvestEngine account has both an ISA and a GIA wrapper. Trade confirmation emails (info@investengine.com) are identical between them — subject "Here's how your portfolio looks now", body shows "Client name: Viktor Barzin" with no portfolio/account type. That left the IMAP parser hardcoded to route every IE BUY to the ISA (invest-engine-primary), which produced a 2339-share over-count when 2023-24 GIA buys landed in the ISA during the 2026-04-18 reconciliation.

Viktor's rule: from 6 April each tax year, BUYs fill the ISA up to the £20,000 cap, then overflow to the GIA. This commit codifies that rule in a standalone batch splitter and applies it at the ImapProvider boundary.

It also picks up a silent-drop bug surfaced during the same reconciliation: WF's /import (unlike /import/check) rejects naive datetimes with "Invalid date". The sink now coerces tzinfo=UTC defensively so every provider gets the same guarantee.

## This change

- `_split_ie_by_isa_cap(activities)` — sorts all IE-ISA BUYs by date and walks them once per UK tax year (6 April boundary). A BUY whose running tax-year total BEFORE it is strictly below £20k stays on the ISA; otherwise it flips to a new `invest-engine-gia` account_id. No fractional splits — boundary activities go whole to whichever bucket their pre-BUY running total dictates. Non-IE and non-BUY activities pass through unchanged.
- `ImapProvider.accounts()` gains an `invest-engine-gia` Account so the pipeline's `_ensure_accounts` can resolve both.
- `ImapProvider.fetch()` calls the splitter on the full batch before applying the `since`/`before` date filter — batch-level sort guarantees consistent routing regardless of the order IMAP returns messages.
- `WealthfolioSink._activity_to_import_row` coerces naive datetimes to UTC so the row passes WF /import validation.

## What is NOT in this change

- No retroactive re-routing of data already in WF. Historical finance-mysql rows (all lumped to `invest-engine-primary` or `invest-engine-gia` by the existing heuristic) keep their current account assignment. If a past tax year was routed "wrong" under the new rule, that is corrected manually via the WF API, not here.
- No change to the Schwab or trading212 paths.

## Verification

### Automated

```
$ poetry run pytest tests/providers/test_imap.py -v
tests/providers/test_imap.py::test_uk_tax_year_start_before_april_6_rolls_back PASSED
tests/providers/test_imap.py::test_single_tax_year_under_cap_stays_isa PASSED
tests/providers/test_imap.py::test_overflow_past_cap_flips_to_gia PASSED
tests/providers/test_imap.py::test_tax_year_boundary_resets_cap PASSED
tests/providers/test_imap.py::test_out_of_order_activities_sorted_before_cap_applied PASSED
tests/providers/test_imap.py::test_non_ie_activities_passed_through_unchanged PASSED
6 passed in 0.36s

$ poetry run pytest -q --ignore=tests/test_cli.py
116 passed, 1 skipped in 2.76s

$ poetry run ruff check broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
All checks passed!

$ poetry run mypy broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
Success: no issues found in 2 source files
```

### Manual verification

The tzinfo fix was validated against the live WF instance during the 2026-04-18 reconciliation — before the fix, /import returned `"errors": {"symbol": ["Invalid date '2022-05-24T00:00:00'."]}` for every IMAP activity; after, the same payload imported cleanly.

The splitter was not exercised against live IMAP data because Viktor's mailbox only has Apr 2022 → Feb 2024 emails, all inside finance.position's existing coverage. Running IMAP ingest with `since=2024-04-06` yields fetched=0. The unit tests cover the boundary arithmetic; a live run will happen when newer emails are parsed (or when finance coverage is re-scoped).

## Reproduce locally

1. `poetry install`
2. `poetry run pytest tests/providers/test_imap.py`
3. Expected: 6 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
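The splitter itself lives in `broker_sync/providers/imap.py` and is not reproduced in this file, but the cap rule the bullets above describe can be sketched standalone. Treat everything beyond the 6-April/£20k arithmetic as an assumption: the names `uk_tax_year_start` and `split_by_isa_cap` and the `(date, amount)` tuple input are illustrative stand-ins, not the real `_split_ie_by_isa_cap` signature, which operates on `Activity` objects.

```python
from datetime import date
from decimal import Decimal

ISA_CAP = Decimal("20000")


def uk_tax_year_start(d: date) -> date:
    """6 April of the UK tax year containing `d` (before 6 Apr -> previous year)."""
    start = date(d.year, 4, 6)
    return start if d >= start else date(d.year - 1, 4, 6)


def split_by_isa_cap(buys: list[tuple[date, Decimal]]) -> list[str]:
    """Route each (date, amount) BUY to 'isa' or 'gia', in sorted input order.

    A BUY stays in the ISA iff the running tax-year total BEFORE it is
    strictly below the cap; no fractional splits.
    """
    totals: dict[date, Decimal] = {}  # running total per tax-year start date
    routed: list[str] = []
    for d, amount in sorted(buys):  # batch-level sort, as the commit describes
        ty = uk_tax_year_start(d)
        before = totals.get(ty, Decimal("0"))
        routed.append("isa" if before < ISA_CAP else "gia")
        totals[ty] = before + amount
    return routed
```

Note the strict `before < ISA_CAP` comparison: a BUY that crosses the cap still lands whole on the ISA, matching the no-fractional-splits rule, and a new tax year resets the running total.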
264 lines
10 KiB
Python
from __future__ import annotations

import json
from collections.abc import Iterable
from datetime import UTC
from pathlib import Path
from typing import Any

import httpx

from broker_sync.models import Account, Activity

_LOGIN_PATH = "/api/v1/auth/login"
_ACCOUNTS_PATH = "/api/v1/accounts"
_IMPORT_CHECK = "/api/v1/activities/import/check"
_IMPORT_REAL = "/api/v1/activities/import"


class WealthfolioError(Exception):
    pass


class WealthfolioUnauthorizedError(WealthfolioError):
    """Raised when login itself fails (bad creds or Wealthfolio down).

    Distinct from a 401 on a random endpoint — those trigger an
    automatic re-login attempt.
    """


class ImportValidationError(WealthfolioError):
    """`/activities/import/check` returned a non-2xx. We never reach the real import."""


class WealthfolioSink:
    """Push canonical Activities to Wealthfolio via its CSV import endpoint.

    Auth is JWT-cookie via POST /api/v1/auth/login. Cookies are persisted
    to disk so CronJob pods can reuse them across runs (Wealthfolio's
    /auth/login is 5-req/min rate-limited).

    Not multi-process safe — file locking is added in Phase 1 when we
    fan out to multiple CronJobs.
    """

    def __init__(
        self,
        *,
        base_url: str,
        username: str,
        password: str,
        session_path: Path | str,
        transport: httpx.AsyncBaseTransport | None = None,
    ) -> None:
        self._username = username
        self._password = password
        self._session_path = Path(session_path)
        self._client = httpx.AsyncClient(
            base_url=base_url.rstrip("/"),
            timeout=30.0,
            transport=transport,
        )

    async def close(self) -> None:
        await self._client.aclose()

    # -- session --

    def _load_cookies(self) -> dict[str, str]:
        if not self._session_path.exists():
            return {}
        raw = json.loads(self._session_path.read_text())
        got = raw.get("cookies", {})
        assert isinstance(got, dict)
        return got

    def _save_cookies(self, cookies: dict[str, str]) -> None:
        self._session_path.parent.mkdir(parents=True, exist_ok=True)
        self._session_path.write_text(json.dumps({"cookies": cookies}))

    async def login(self) -> None:
        # Wealthfolio 3.2's LoginRequest is `{ password: String }` only — a
        # username key is rejected as an unknown field (HTTP 400). The
        # `username` constructor arg is kept for a future Wealthfolio
        # release that may add multi-user support.
        resp = await self._client.post(
            _LOGIN_PATH,
            json={"password": self._password},
        )
        if resp.status_code == 401:
            raise WealthfolioUnauthorizedError("Wealthfolio /auth/login returned 401")
        resp.raise_for_status()
        cookies = dict(resp.cookies.items())
        if not cookies:
            raise WealthfolioError("/auth/login returned 2xx but no Set-Cookie")
        self._save_cookies(cookies)

    @staticmethod
    def _cookie_header(cookies: dict[str, str]) -> str:
        return "; ".join(f"{k}={v}" for k, v in cookies.items())

    async def _request(self, method: str, path: str, **kw: Any) -> httpx.Response:
        cookies = self._load_cookies()
        headers = dict(kw.pop("headers", {}) or {})
        if cookies:
            headers["Cookie"] = self._cookie_header(cookies)
        resp = await self._client.request(method, path, headers=headers, **kw)
        if resp.status_code == 401 and path != _LOGIN_PATH:
            await self.login()
            cookies = self._load_cookies()
            headers["Cookie"] = self._cookie_header(cookies)
            resp = await self._client.request(method, path, headers=headers, **kw)
        return resp

    # -- accounts --

    async def list_accounts(self) -> list[dict[str, Any]]:
        resp = await self._request("GET", _ACCOUNTS_PATH)
        resp.raise_for_status()
        raw = resp.json()
        assert isinstance(raw, list)
        return raw

    async def ensure_account(self, account: Account) -> str:
        """Idempotently create the account and return Wealthfolio's UUID for it.

        Wealthfolio generates its own UUIDs on POST /accounts, ignoring any
        `id` we supply. We identify accounts by (provider, providerAccountId)
        which Wealthfolio DOES preserve verbatim. Our own Account.id is
        used as the providerAccountId.
        """
        existing = await self.list_accounts()
        for a in existing:
            if (
                a.get("provider") == account.provider
                and a.get("providerAccountId") == account.id
            ):
                wf_id = a.get("id")
                assert isinstance(wf_id, str)
                return wf_id

        # NewAccount is camelCase with required booleans.
        # See apps/server/src/models.rs#NewAccount.
        resp = await self._request(
            "POST",
            _ACCOUNTS_PATH,
            json={
                "name": account.name,
                "accountType": str(account.account_type),
                "currency": account.currency,
                "isDefault": False,
                "isActive": True,
                "isArchived": False,
                "trackingMode": "TRANSACTIONS",
                "provider": account.provider,
                "providerAccountId": account.id,
            },
        )
        resp.raise_for_status()
        created = resp.json()
        wf_id = created.get("id")
        if not isinstance(wf_id, str):
            raise WealthfolioError(f"POST /accounts returned no id: {created}")
        return wf_id

    # -- activity import --

    @staticmethod
    def _activity_to_import_row(a: Activity) -> dict[str, Any]:
        """Match Wealthfolio's ActivityImport struct (camelCase JSON)."""
        # WF /import rejects naive datetimes with "Invalid date" (even though
        # /import/check accepts them) — coerce to UTC if tzinfo is missing.
        date = a.date if a.date.tzinfo is not None else a.date.replace(tzinfo=UTC)
        row: dict[str, Any] = {
            "date": date.isoformat(),
            "symbol": a.symbol or "$CASH",
            "activityType": str(a.activity_type),
            "currency": a.currency,
            "accountId": a.account_id,
            # Required booleans on ActivityImport (no defaults in Rust struct).
            "isDraft": False,
            "isValid": True,
        }
        if a.quantity is not None:
            row["quantity"] = format(a.quantity, "f")
        if a.unit_price is not None:
            row["unitPrice"] = format(a.unit_price, "f")
        if a.amount is not None:
            row["amount"] = format(a.amount, "f")
        if a.fee:
            row["fee"] = format(a.fee, "f")
        if a.notes:
            row["comment"] = a.notes
        return row

    async def import_activities(self, activities: Iterable[Activity]) -> list[dict[str, Any]]:
        rows = [self._activity_to_import_row(a) for a in activities]
        if not rows:
            return []

        # Step 1 — /import/check hydrates each row with resolved asset_id,
        # exchange_mic, quote_ccy, instrument_type, quote_mode (and flags
        # errors). The /import endpoint on Wealthfolio 3.2+ DOES NOT
        # re-resolve — if we send the un-enriched row the activity is
        # silently dropped (import returns 200 OK with activities=[] in
        # the payload). We must feed check's output into import.
        check = await self._request("POST", _IMPORT_CHECK, json={"activities": rows})
        if check.status_code >= 400:
            try:
                payload = check.json()
            except Exception:
                payload = {"raw": check.text}
            raise ImportValidationError(f"Wealthfolio /import/check rejected: {payload}")

        checked = check.json()
        if not isinstance(checked, list):
            raise ImportValidationError(
                f"Wealthfolio /import/check returned non-list: {type(checked).__name__}"
            )

        invalid = [r for r in checked if isinstance(r, dict) and r.get("errors")]
        if invalid:
            raise ImportValidationError(
                f"Wealthfolio /import/check flagged {len(invalid)} row(s); "
                f"first: {invalid[0]}"
            )
        # Drop any row the server marked is_valid=false (shouldn't happen
        # without errors, but defensive).
        valid_rows = [r for r in checked if isinstance(r, dict) and r.get("isValid")]

        real = await self._request("POST", _IMPORT_REAL, json={"activities": valid_rows})
        real.raise_for_status()
        raw = real.json()
        # Two observed response shapes:
        #   - {activities: [...], importRunId: "...", summary: {total, imported, skipped, ...}}
        #   - bare list (older builds)
        if isinstance(raw, dict) and "activities" in raw:
            got = raw["activities"]
            summary = raw.get("summary") if isinstance(raw.get("summary"), dict) else None
        elif isinstance(raw, list):
            got = raw
            summary = None
        else:
            got = []
            summary = None
        # summary.imported is THE truth. The `activities` field echoes input
        # with errors annotated — its length equals input even when zero
        # actually persisted.
        if summary is not None:
            imported_n = int(summary.get("imported", 0))
            total_n = int(summary.get("total", len(valid_rows)))
            if imported_n < total_n:
                err_msg = summary.get("errorMessage") or "no errorMessage"
                skipped = int(summary.get("skipped", 0))
                dupes = int(summary.get("duplicates", 0))
                raise ImportValidationError(
                    f"Wealthfolio /import persisted {imported_n}/{total_n} "
                    f"(skipped={skipped} duplicates={dupes}). "
                    f"errorMessage: {err_msg}"
                )
        # Legacy silent-drop guard for no-summary responses.
        elif valid_rows and not got:
            first_warn = next(
                (r.get("warnings") for r in checked if isinstance(r, dict) and r.get("warnings")),
                None,
            )
            raise ImportValidationError(
                f"Wealthfolio /import silently dropped all {len(valid_rows)} rows. "
                f"First checked row: {checked[0] if checked else 'none'}. "
                f"First warning: {first_warn}"
            )
        assert isinstance(got, list)
        return [r for r in got if isinstance(r, dict)]