Schwab's workplace-RSU confirmation emails have 5 data td elements
with class='dark-background-body' align='right': date, direction, qty,
ticker, price-with-currency-sign. One email → one Activity.
- parse_schwab_email(raw_html) -> list[Activity] (1-item or empty)
- Empty on any parse failure (IMAP batch shouldn't crash on one bad mail)
- Deterministic external_id ('schwab:date:ticker:type:qty'), stable
across re-pulls so dedup works
- Hardcoded to account 'schwab-workplace' / AccountType.GIA / USD
- 6 unit tests: SELL + BUY happy path, malformed, missing cells,
external-id stability, commas in price
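
The five-cell layout described above can be illustrated with a stdlib-only
sketch. This is not the module's implementation (which uses BeautifulSoup);
the class name, the helper, and the sample values (ACME, 10 shares,
$1,234.56) are hypothetical, and the class attribute is matched by exact
string equality rather than per-token as BeautifulSoup does.

```python
from __future__ import annotations

from html.parser import HTMLParser


class RightAlignedCells(HTMLParser):
    """Collect text from <td class="dark-background-body" align="right">."""

    def __init__(self) -> None:
        super().__init__()
        self.cells: list[str] = []
        self._in_cell = False

    def handle_starttag(self, tag: str, attrs: list[tuple[str, str | None]]) -> None:
        attr_map = dict(attrs)
        if (tag == "td" and attr_map.get("class") == "dark-background-body"
                and attr_map.get("align") == "right"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag: str) -> None:
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data: str) -> None:
        # Only text inside a matching cell is kept.
        if self._in_cell:
            self.cells[-1] += data


def extract_cells(html: str) -> list[str]:
    """Return the stripped text of every matching data cell, in document order."""
    collector = RightAlignedCells()
    collector.feed(html)
    collector.close()
    return [cell.strip() for cell in collector.cells]


# Hypothetical fixture: label cells are ignored, data cells are collected.
SAMPLE_HTML = """
<table>
  <tr><td class="label">Trade date</td>
      <td class="dark-background-body" align="right">Jan 23, 2025</td></tr>
  <tr><td class="dark-background-body" align="right">Sold</td></tr>
  <tr><td class="dark-background-body" align="right">10</td></tr>
  <tr><td class="dark-background-body" align="right">ACME</td></tr>
  <tr><td class="dark-background-body" align="right">$1,234.56</td></tr>
</table>
"""

print(extract_cells(SAMPLE_HTML))
```

A fixture shaped like SAMPLE_HTML is also a reasonable starting point for
the happy-path and missing-cells tests: dropping one data row leaves fewer
than five cells, which the parser treats as a failure.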
Dropped from the original finance port:
- msg_timestamp-based external id (non-deterministic: would re-import
on every IMAP walk). Replaced with a deterministic key built from the
trade date, ticker, type, and qty.
- Currency.from_sign() currency hack. Schwab US is USD-only; we'll add
FX when that changes.
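
Why the deterministic key matters for dedup can be sketched as follows.
The helper names are hypothetical, "SELL" is an assumed value for the
activity type, and the timestamp variant is only a stand-in for any id
that depends on when the message was processed, not a copy of the old code.

```python
from datetime import date, datetime, timedelta, timezone
from decimal import Decimal


def stable_external_id(trade_date: date, ticker: str, activity_type: str,
                       qty: Decimal) -> str:
    """Key built only from fields that are in the email body."""
    return f"schwab:{trade_date.isoformat()}:{ticker}:{activity_type}:{qty}"


def timestamp_external_id(ticker: str, processed_at: datetime) -> str:
    """Stand-in for an id that includes a processing-time timestamp."""
    return f"schwab:{ticker}:{processed_at.timestamp()}"


# Simulate two IMAP walks over the same email, one hour apart.
first_walk = datetime(2025, 1, 24, 9, 0, tzinfo=timezone.utc)
second_walk = first_walk + timedelta(hours=1)

stable_ids = {
    stable_external_id(date(2025, 1, 23), "ACME", "SELL", Decimal("10")),
    stable_external_id(date(2025, 1, 23), "ACME", "SELL", Decimal("10")),
}
timestamp_ids = {
    timestamp_external_id("ACME", first_walk),
    timestamp_external_id("ACME", second_walk),
}

print(len(stable_ids))     # one entry: the second pull dedups against the first
print(len(timestamp_ids))  # two entries: the same trade would be imported twice
```

One caveat: the key embeds the quantity as written, so Decimal("10") and
Decimal("10.0") produce different ids. That is fine when re-pulling the
same email, since the text does not change between pulls.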
poetry run pytest -q → 109 passed, 1 skipped
poetry run mypy → clean (added types-python-dateutil)
poetry run ruff check → clean
75 lines
2.9 KiB
Python
"""Schwab workplace-RSU email parser.

Schwab sends HTML transaction-confirmation emails with the core fields in
five `<td class="dark-background-body" align="right">` elements:
  1. Trade date (human format, e.g. "Jan 23, 2025")
  2. Direction word ("Sold" for SELL; anything else is BUY)
  3. Quantity (share count, parsed as Decimal)
  4. Ticker
  5. Price ("$123.45", currency-sign-prefixed)

One email → one Activity. On any parse failure we return an empty list
(same as the original finance/ behaviour: an unparseable email shouldn't
crash the whole IMAP batch).

Ported from finance/position/provider/schwab/message_parser.py (39 lines).
Dropped: per-row timestamp id suffix (we use ISO date + ticker + direction +
qty, which is stable across re-pulls), currency-from-sign hackery (US Schwab
is USD-only in practice; if that ever changes we'll add FX on parse).
"""
from __future__ import annotations

from decimal import Decimal, InvalidOperation

from bs4 import BeautifulSoup
from dateutil import parser as dateparser

from broker_sync.models import AccountType, Activity, ActivityType

_ACCOUNT_ID = "schwab-workplace"
_DEFAULT_CURRENCY = "USD"


def parse_schwab_email(raw_html: str) -> list[Activity]:
    """Return a single-item list of Activity on success, empty on failure."""
    try:
        soup = BeautifulSoup(raw_html, "html.parser")
        cells = [
            td.get_text(strip=True) for td in soup.find_all("td", {
                "class": "dark-background-body",
                "align": "right"
            })
        ]
        if len(cells) < 5:
            return []

        date_txt, direction_txt, qty_txt, ticker, price_txt = cells[:5]
        trade_date = dateparser.parse(date_txt)
        direction = (ActivityType.SELL
                     if direction_txt.strip().lower() == "sold" else ActivityType.BUY)
        quantity = Decimal(qty_txt.replace(",", "").strip())
        # Price like "$123.45": remove the currency marker, then parse the number.
        # "£", "€", "USD", etc. are tolerated by stripping every known marker.
        price_clean = price_txt
        for sign in ("$", "£", "€", "USD", "GBP", "EUR"):
            price_clean = price_clean.replace(sign, "")
        unit_price = Decimal(price_clean.replace(",", "").strip())

        external_id = (f"schwab:{trade_date.date().isoformat()}:{ticker}:"
                       f"{direction.value}:{quantity}")
        return [
            Activity(
                external_id=external_id,
                account_id=_ACCOUNT_ID,
                account_type=AccountType.GIA,
                date=trade_date,
                activity_type=direction,
                symbol=ticker.strip(),
                quantity=quantity,
                unit_price=unit_price,
                currency=_DEFAULT_CURRENCY,
                notes=f"schwab-email:{direction_txt}",
            )
        ]
    # dateutil can raise OverflowError as well as ValueError on bad dates.
    except (ValueError, InvalidOperation, IndexError, AttributeError,
            OverflowError):
        return []
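
The price-cleaning step above can be exercised in isolation. The sketch
below lifts the same strip-then-parse logic into a hypothetical helper,
clean_price, which is not part of the module, to show how commas and
currency markers are handled and what a failure looks like.

```python
from decimal import Decimal, InvalidOperation

# Same markers the parser strips before converting to Decimal.
_CURRENCY_MARKERS = ("$", "£", "€", "USD", "GBP", "EUR")


def clean_price(price_txt: str) -> Decimal:
    """Strip known currency markers and thousands separators, then parse."""
    cleaned = price_txt
    for marker in _CURRENCY_MARKERS:
        cleaned = cleaned.replace(marker, "")
    return Decimal(cleaned.replace(",", "").strip())


print(clean_price("$123.45"))
print(clean_price("$1,234.56"))
print(clean_price("USD 99.10"))

try:
    clean_price("n/a")
except InvalidOperation:
    # In parse_schwab_email this exception is caught and turned into [].
    print("unparseable price")
```

Stripping the marker discards the currency, which is why the Activity is
always tagged USD regardless of the sign that appeared in the email.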