broker-sync/tests/providers/test_imap.py


imap: route IE BUYs to ISA first-£20k / GIA overflow per UK tax year

## Context

Viktor's InvestEngine account has both an ISA and a GIA wrapper. Trade confirmation emails (info@investengine.com) are identical between them: the subject is "Here's how your portfolio looks now" and the body shows "Client name: Viktor Barzin" with no portfolio/account type. That left the IMAP parser hardcoded to route every IE BUY to the ISA (invest-engine-primary), which produced a 2339-share over-count when 2023-24 GIA buys landed in the ISA during the 2026-04-18 reconciliation.

Viktor's rule: from 6 April each tax year, BUYs fill ISA up to the £20,000 cap, then overflow to GIA. This commit codifies that rule in a standalone batch splitter and applies it at the ImapProvider boundary.

Also picks up a silent-drop bug surfaced during the same reconciliation: WF's /import (unlike /import/check) rejects naive datetimes with "Invalid date". The sink now coerces tzinfo=UTC defensively so every provider gets the same guarantee.

## This change

- `_split_ie_by_isa_cap(activities)`: sorts all IE-ISA BUYs by date and walks them once per UK tax year (6 April boundary). A BUY whose running tax-year total BEFORE it is strictly below £20k stays on the ISA; otherwise it flips to a new `invest-engine-gia` account_id. No fractional splits: boundary activities go whole to whichever bucket their pre-running-total dictates. Non-IE and non-BUY activities pass through unchanged.
- `ImapProvider.accounts()` gains an `invest-engine-gia` Account so the pipeline's `_ensure_accounts` can resolve both.
- `ImapProvider.fetch()` calls the splitter on the full batch before applying the `since`/`before` date filter; the batch-level sort guarantees consistent routing regardless of the order IMAP returns messages.
- `WealthfolioSink._activity_to_import_row` coerces naive datetimes to UTC so the row passes WF /import validation.

## What is NOT in this change

- No retroactive re-routing of data already in WF. Historical finance-mysql rows (all lumped to `invest-engine-primary` or `invest-engine-gia` by the existing heuristic) keep their current account assignment. If a past tax-year was routed "wrong" under the new rule, that's corrected manually via the WF API, not here.
- No change to the Schwab or trading212 paths.

## Verification

### Automated

```
$ poetry run pytest tests/providers/test_imap.py -v
tests/providers/test_imap.py::test_uk_tax_year_start_before_april_6_rolls_back PASSED
tests/providers/test_imap.py::test_single_tax_year_under_cap_stays_isa PASSED
tests/providers/test_imap.py::test_overflow_past_cap_flips_to_gia PASSED
tests/providers/test_imap.py::test_tax_year_boundary_resets_cap PASSED
tests/providers/test_imap.py::test_out_of_order_activities_sorted_before_cap_applied PASSED
tests/providers/test_imap.py::test_non_ie_activities_passed_through_unchanged PASSED
6 passed in 0.36s

$ poetry run pytest -q --ignore=tests/test_cli.py
116 passed, 1 skipped in 2.76s

$ poetry run ruff check broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
All checks passed!

$ poetry run mypy broker_sync/providers/imap.py broker_sync/sinks/wealthfolio.py
Success: no issues found in 2 source files
```

### Manual verification

The tzinfo fix was validated against the live WF instance during the 2026-04-18 reconciliation. Before the fix, /import returned `"errors": {"symbol": ["Invalid date '2022-05-24T00:00:00'."]}` for every IMAP activity; after, the same payload imported cleanly.

The splitter was not exercised against live IMAP data because Viktor's mailbox only has Apr 2022 → Feb 2024 emails, all inside finance.position's existing coverage. Running IMAP ingest with `since=2024-04-06` yields fetched=0. The unit tests cover the boundary arithmetic; a live run will happen when newer emails are parsed (or when finance coverage is re-scoped).

## Reproduce locally

1. `poetry install`
2. `poetry run pytest tests/providers/test_imap.py`
3. Expected: 6 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:02:49 +00:00
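The routing rule and the tzinfo coercion described in the commit message can be sketched as below. This is a minimal illustration, not the code in `broker_sync/providers/imap.py`: the `Buy` dataclass, `ISA_CAP`, `uk_tax_year_start`, `split_by_isa_cap`, and `ensure_utc` are hypothetical names standing in for the real `Activity` model and private helpers. The sketch also assumes that GIA-routed buys keep counting toward the running total and that the result comes back in date order, neither of which the commit message states.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import date, datetime, timezone
from decimal import Decimal

ISA_CAP = Decimal("20000")  # £20,000 per the commit message; the constant name is assumed


@dataclass(frozen=True)
class Buy:
    """Hypothetical stand-in for broker_sync.models.Activity (BUYs only)."""

    on: datetime
    quantity: Decimal
    unit_price: Decimal
    account_id: str = "invest-engine-primary"


def uk_tax_year_start(on: datetime) -> date:
    """Return the 6 April that opens the UK tax year containing `on`."""
    year = on.year if (on.month, on.day) >= (4, 6) else on.year - 1
    return date(year, 4, 6)


def split_by_isa_cap(buys: list[Buy]) -> list[Buy]:
    """Route each BUY whole: ISA while the pre-trade total is below the cap, else GIA."""
    totals: dict[date, Decimal] = {}
    routed: list[Buy] = []
    for buy in sorted(buys, key=lambda b: b.on):  # sort so input order cannot change routing
        year = uk_tax_year_start(buy.on)
        before = totals.get(year, Decimal("0"))
        if before >= ISA_CAP:
            buy = replace(buy, account_id="invest-engine-gia")
        # Assumption: every BUY counts toward the running total, whichever bucket it lands in.
        totals[year] = before + buy.quantity * buy.unit_price
        routed.append(buy)
    return routed


def ensure_utc(dt: datetime) -> datetime:
    """Attach UTC to a naive datetime; leave aware datetimes untouched."""
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)
```

Because the check uses the total before the trade, a single BUY that crosses the cap still lands whole in the ISA, which matches the "no fractional splits" note above.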
from __future__ import annotations
from datetime import UTC, date, datetime
from decimal import Decimal
from typing import TYPE_CHECKING
from broker_sync.models import AccountType, Activity, ActivityType

if TYPE_CHECKING:
    from pytest import MonkeyPatch
from broker_sync.providers.imap import (
    _IE_GIA_ACCOUNT_ID,
    _IE_ISA_ACCOUNT_ID,
    _split_ie_by_isa_cap,
    _uk_tax_year_start,
)


def _buy(on: datetime, qty: str, price: str) -> Activity:
    return Activity(
        external_id=f"invest-engine:{on.isoformat()}|{qty}|{price}",
        account_id=_IE_ISA_ACCOUNT_ID,
        account_type=AccountType.ISA,
        date=on,
        activity_type=ActivityType.BUY,
        currency="GBP",
        symbol="VUAG",
        quantity=Decimal(qty),
        unit_price=Decimal(price),
    )


def test_uk_tax_year_start_before_april_6_rolls_back() -> None:
    assert _uk_tax_year_start(datetime(2025, 4, 5, tzinfo=UTC)) == date(2024, 4, 6)
    assert _uk_tax_year_start(datetime(2025, 4, 6, tzinfo=UTC)) == date(2025, 4, 6)
    assert _uk_tax_year_start(datetime(2025, 1, 15, tzinfo=UTC)) == date(2024, 4, 6)
    assert _uk_tax_year_start(datetime(2024, 4, 7, tzinfo=UTC)) == date(2024, 4, 6)


def test_single_tax_year_under_cap_stays_isa() -> None:
    acts = [
        _buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "50"),  # £5,000
        _buy(datetime(2024, 8, 1, tzinfo=UTC), "100", "80"),  # £8,000
    ]
    routed = _split_ie_by_isa_cap(acts)
    assert all(a.account_id == _IE_ISA_ACCOUNT_ID for a in routed)
    assert all(a.account_type is AccountType.ISA for a in routed)


def test_overflow_past_cap_flips_to_gia() -> None:
    acts = [
        _buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "80"),  # £8,000
        # +£12,000 → £20,000 total; prev £8k < cap → ISA
        _buy(datetime(2024, 6, 1, tzinfo=UTC), "150", "80"),
        _buy(datetime(2024, 7, 1, tzinfo=UTC), "10", "80"),  # prev £20,000 ≥ cap → GIA
        _buy(datetime(2024, 8, 1, tzinfo=UTC), "10", "80"),  # GIA
    ]
    routed = _split_ie_by_isa_cap(acts)
    assert routed[0].account_id == _IE_ISA_ACCOUNT_ID
    assert routed[1].account_id == _IE_ISA_ACCOUNT_ID
    assert routed[2].account_id == _IE_GIA_ACCOUNT_ID
    assert routed[2].account_type is AccountType.GIA
    assert routed[3].account_id == _IE_GIA_ACCOUNT_ID


def test_tax_year_boundary_resets_cap() -> None:
    acts = [
        # 2023-24 tax year: £20k in ISA, plus one in GIA
        _buy(datetime(2023, 5, 1, tzinfo=UTC), "400", "50"),  # £20,000 → ISA (prev 0 < cap)
        _buy(datetime(2024, 1, 1, tzinfo=UTC), "100", "50"),  # GIA (prev 20k)
        # 2024-25 tax year starts 2024-04-06, so the cap resets
        _buy(datetime(2024, 5, 1, tzinfo=UTC), "100", "50"),  # ISA (prev 0 for new year)
    ]
    routed = _split_ie_by_isa_cap(acts)
    assert routed[0].account_id == _IE_ISA_ACCOUNT_ID
    assert routed[1].account_id == _IE_GIA_ACCOUNT_ID
    assert routed[2].account_id == _IE_ISA_ACCOUNT_ID


def test_out_of_order_activities_sorted_before_cap_applied() -> None:
    acts = [
        _buy(datetime(2024, 8, 1, tzinfo=UTC), "10", "80"),  # later date but given first
        _buy(datetime(2024, 5, 1, tzinfo=UTC), "250", "80"),  # earlier, £20,000 → ISA
    ]
    routed = _split_ie_by_isa_cap(acts)
    by_date = {a.date: a for a in routed}
    assert by_date[datetime(2024, 5, 1, tzinfo=UTC)].account_id == _IE_ISA_ACCOUNT_ID
    assert by_date[datetime(2024, 8, 1, tzinfo=UTC)].account_id == _IE_GIA_ACCOUNT_ID


def test_non_ie_activities_passed_through_unchanged() -> None:
    schwab_act = Activity(
        external_id="schwab:abc",
        account_id="schwab-workplace",
        account_type=AccountType.GIA,
        date=datetime(2024, 5, 1, tzinfo=UTC),
        activity_type=ActivityType.SELL,
        currency="USD",
        symbol="META",
        quantity=Decimal("10"),
        unit_price=Decimal("500"),
    )
    routed = _split_ie_by_isa_cap([schwab_act])
    assert routed[0].account_id == "schwab-workplace"
    assert routed[0].account_type is AccountType.GIA


def test_exclude_invest_engine_skips_ie_emails(monkeypatch: MonkeyPatch) -> None:
    """BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS=invest-engine should skip IE messages
    so we don't duplicate IE buys already ingested via the bearer-token API path.
    Schwab routing must remain unaffected."""
    from broker_sync.providers import imap as imap_mod
    from broker_sync.providers.parsers import invest_engine as ie_parser

    ie_email = (
        b"From: noreply@investengine.com\r\n"
        b"Subject: VUAG Bought\r\n"
        b"Content-Type: text/plain\r\n\r\n"
        b"Vanguard S&P 500: VUAG Bought 10.0 @ 100.0 per share Total: 1000.00\r\n"
    )
    schwab_email = (
        b"From: donotreply@schwab.com\r\n"
        b"Subject: Order Confirmed\r\n"
        b"Content-Type: text/html\r\n\r\n"
        b"<html><body>no-op</body></html>\r\n"
    )
    monkeypatch.setattr(imap_mod, "_fetch_all", lambda _: [ie_email, schwab_email])
    monkeypatch.setattr(ie_parser, "parse_invest_engine_email", lambda raw: [object()])
    monkeypatch.setattr(imap_mod, "parse_schwab_email", lambda html: [object()])
    creds = imap_mod.ImapCreds(host="h", user="u", password="p", directory="d")

    monkeypatch.setenv("BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS", "invest-engine")
    out_excluded = imap_mod.fetch_activities(creds)
    # IE skipped → only the schwab activity is emitted
    assert len(out_excluded) == 1

    monkeypatch.delenv("BROKER_SYNC_IMAP_EXCLUDE_PROVIDERS", raising=False)
    out_default = imap_mod.fetch_activities(creds)
    # Both providers fire when env unset
    assert len(out_default) == 2


def test_schwab_subdomain_sender_matches() -> None:
    """Real Schwab trade emails come from `donotreply@mail.schwab.com`
    (subdomain), not just `donotreply@schwab.com`. The matcher must
    accept either form."""
    from broker_sync.providers.imap import _SCHWAB_SENDERS

    # Verify the static set works
    assert "donotreply@schwab.com" in _SCHWAB_SENDERS
    # Verify the subdomain suffix check
    for addr in (
        "donotreply@mail.schwab.com",
        "wealthnotify@equityawards.schwab.com",
    ):
        assert addr.endswith(".schwab.com"), addr
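
The last test checks the static set and the suffix rule separately, without calling a matcher. A matcher that accepts either form could look like the sketch below. `is_schwab_sender` and `SCHWAB_SENDERS` are hypothetical names, not the helpers in `broker_sync/providers/imap.py`; the only detail taken from the tests is that the static set contains `donotreply@schwab.com`.

```python
from email.utils import parseaddr

# Hypothetical mirror of _SCHWAB_SENDERS; only this one entry is confirmed by the test above.
SCHWAB_SENDERS = frozenset({"donotreply@schwab.com"})


def is_schwab_sender(from_header: str) -> bool:
    """Accept exact known senders plus any address on a *.schwab.com subdomain."""
    _, addr = parseaddr(from_header)  # drops a display name such as "Schwab <...>"
    addr = addr.lower()
    if addr in SCHWAB_SENDERS:
        return True
    _, _, domain = addr.rpartition("@")
    # The leading dot keeps look-alike domains such as notschwab.com from matching.
    return domain.endswith(".schwab.com")
```

Checking the domain part instead of the whole address keeps the suffix rule from being satisfied by text in the local part or the display name.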