From ca5f98f77168e047b4afd512311c988ae9b42ec6 Mon Sep 17 00:00:00 2001 From: Viktor Barzin Date: Tue, 26 May 2026 21:52:52 +0000 Subject: [PATCH] docs: add IBKR Flex ingestion design spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a daily IBKR Flex Web Service → Wealthfolio ingestion path alongside the existing T212 / IE / Fidelity providers. Uses the ibflex library; mandatory broker-vs-WF position reconciliation built in from day one to prevent the silent-drift class of bug we hit with InvestEngine on 2026-05-26. Account is currently empty so the initial backfill step is a no-op until the first IBKR trade lands. Co-Authored-By: Claude Opus 4.7 --- docs/specs/2026-05-26-ibkr-ingest-design.md | 322 ++++++++++++++++++++ 1 file changed, 322 insertions(+) create mode 100644 docs/specs/2026-05-26-ibkr-ingest-design.md diff --git a/docs/specs/2026-05-26-ibkr-ingest-design.md b/docs/specs/2026-05-26-ibkr-ingest-design.md new file mode 100644 index 0000000..40bb9df --- /dev/null +++ b/docs/specs/2026-05-26-ibkr-ingest-design.md @@ -0,0 +1,322 @@ +# IBKR Flex Ingestion — Design + +**Date:** 2026-05-26 +**Status:** Approved (brainstorming session 2026-05-26) +**Author:** Viktor + Claude (Opus 4.7) +**Implementation plan:** TBD (will be written next session via writing-plans skill) + +## Context + +Adds Interactive Brokers (IBKR UK / IE — stocks/ETFs only) as a new +broker-sync provider, pushing activities to Wealthfolio on a daily +schedule alongside the existing Trading 212 / InvestEngine / Fidelity +pipelines. + +The user's IBKR account is **currently empty** (no positions, no trades). +This design covers the integration as it will run once the account is +funded and active. The initial backfill step in the setup checklist is a +no-op until the first IBKR trade. + +This work is the structural follow-on from the 2026-05-26 Wealthfolio +dedup session, in which £252k of duplicated InvestEngine positions +accumulated silently in WF because the IMAP and API ingestion paths +emitted different `external_id` schemes and never reconciled against +broker-reported truth. The IBKR design bakes in **broker-vs-WF position +reconciliation from day one** — the missing capability that allowed the +IE drift to grow undetected. + +## Decisions + +### D1 — Use IBKR Flex Web Service (not Client Portal API / TWS) + +Flex Web Service is a token-authenticated REST endpoint returning XML +statements. Suits unattended cron because: +- One-year token validity (no daily re-auth, unlike Client Portal Gateway). +- No sidecar / GUI / Java runtime needed. +- Designed for periodic batch reporting — the exact shape of our pipeline. + +Client Portal Web API + `ibind` was considered and rejected: its Gateway +sidecar requires browser-based re-auth roughly every 24 hours, which is +incompatible with unattended scheduling. + +### D2 — Library: `ibflex` (`csingley/ibflex` on PyPI) + +Adds `ibflex = "^0.16"` to `pyproject.toml`. The library provides: +- `client.download(token, query_id) -> bytes` — handles Flex's 2-step + async API (`SendRequest` → `GetStatement` polling). +- `parser.parse(xml) -> FlexQueryResponse` — typed dataclasses for + `Trades`, `CashTransactions`, `OpenPositions`, `SecuritiesInfo`. + +Fallback (Approach B): if `ibflex` proves to lag IBKR schema changes, drop +in raw `httpx` + `xml.etree`. Same provider shape; only the parsing +internals change. + +### D3 — One CronJob, daily 02:00 UK, in `broker-sync` namespace + +Matches the existing `broker-sync-trading212` cadence and placement. No +new namespace, no new image. + +### D4 — Reconciliation is mandatory, not optional + +Every run computes a per-asset quantity from the Flex +`OpenPositions` section and compares against WF's computed quantity from +activities. Drift is published as a Pushgateway metric. Cross-checking +broker truth is the line of defense against the IE-style silent +divergence we saw on 2026-05-26. + +### D5 — One account, one query + +Single Flex Activity Query covering Trades + Cash + Open Positions + +Securities. Single `Interactive Brokers (UK)` account in Wealthfolio. +Multiple accounts can be added later by parameterising the CLI command; +not in scope now. + +## Architecture + +``` +broker-sync K8s namespace +├── CronJob broker-sync-ibkr (schedule: 0 2 * * *) +│ ├── env from broker-sync-secrets: +│ │ IBKR_FLEX_TOKEN, IBKR_FLEX_QUERY_ID, IBKR_ACCOUNT_ID, +│ │ WF_BASE_URL, WF_USERNAME, WF_PASSWORD +│ ├── PVC broker-sync-data-encrypted (shared with other broker-sync jobs) +│ └── image viktorbarzin/broker-sync: command = ["broker-sync", "ibkr"] +│ +│ External calls +│ ├── HTTPS → ndcdyn.interactivebrokers.com (Flex Web Service) +│ ├── HTTP → wealthfolio.wealthfolio.svc (activities import + position read) +│ └── HTTP → pushgateway.monitoring.svc (drift + last-success metrics) +``` + +The provider is structurally identical to `broker-sync-trading212` and +the IE bearer-token path — same Vault → CronJob → provider → pipeline → +WF flow. Existing alerting (CronJob-failed, ExternalSecret-stale, +WF-sync-stale) applies transitively; we only add IBKR-specific alerts on +top. + +## Components + +| Path | Action | Description | +|---|---|---| +| `broker_sync/providers/ibkr.py` | NEW | `IBKRProvider` class implementing the `Provider` protocol. Maps Flex XML to `Activity[]`. ~200 LOC. | +| `broker_sync/cli.py` | MODIFY | New `@app.command("ibkr")` typer command, parallel to `trading212` and `invest-engine`. ~60 LOC. | +| `pyproject.toml` | MODIFY | Add `ibflex = "^0.16"` dependency. | +| `tests/providers/test_ibkr.py` | NEW | Fixture-based parsing tests, sign-conventions, position-drift math, account-id guard. | +| `infra/stacks/broker-sync/main.tf` | MODIFY | New `kubernetes_cron_job_v1.ibkr` resource. | +| Vault `secret/broker-sync` | MODIFY | Add `ibkr_flex_token`, `ibkr_flex_query_id`, `ibkr_account_id`. | +| Wealthfolio (one-time, manual) | NEW data | Create `Interactive Brokers (UK)` account; record its UUID in Vault. | +| `docs/providers/ibkr.md` | NEW | Production-facing provider docs (setup, query design, troubleshooting). Written after first successful run. | + +## Data flow (per CronJob run) + +1. **02:00 UK** — CronJob fires, pod starts with env from `broker-sync-secrets`. +2. **Download** — `ibflex.client.download(token, query_id)` calls Flex + Web Service `SendRequest` + `GetStatement`. Typical 5–20 s. Library + handles retry/polling. +3. **Parse** — `ibflex.parser.parse(xml)` produces a + `FlexQueryResponse`. +4. **Account guard** — two distinct identifiers exist: + - **IBKR_ACCOUNT_ID_UPSTREAM**: the IBKR-side account number + (e.g. `U12345678`), used to validate that the Flex report belongs to + the right account. + - **IBKR_ACCOUNT_ID** (alias: `ibkr_account_id` in Vault): the + Wealthfolio account UUID (e.g. `8a3f...`), used when posting + activities to WF. + Validate `stmt.accountId == os.environ["IBKR_ACCOUNT_ID_UPSTREAM"]`. + Refuse to ingest on mismatch — prevents wrong-account writes from a + misconfigured query. +5. **Map Trades → Activities**: + + | Flex | Activity | Notes | + |---|---|---| + | `Trade.tradeID` | `external_id = "ibkr:trade:" + tradeID` | dedup key | + | `Trade.tradeDate + tradeTime` | `date` (UTC) | timezone normalised | + | `Trade.symbol` | `symbol` | canonicalised — LSE tickers get `.L` suffix | + | `Trade.buySell` (BUY / SELL) | `activity_type` | direct | + | `Trade.quantity` | `quantity` | always positive (broker-sync convention) | + | `Trade.tradePrice` | `unit_price` | | + | `Trade.currency` | `currency` | per-trade, multi-ccy supported | + | `Trade.ibCommission` | `fee = abs(ibCommission)` | always positive | + | `Trade.assetCategory` | (sanity check; skip if not in {STK, ETF}) | + +6. **Map CashTransactions → Activities**: + + | Flex `CashTransaction.type` | Activity `activity_type` | Notes | + |---|---|---| + | `Dividends` | `DIVIDEND` | | + | `Withholding Tax` | `FEE` | tag with `notes="wht:..."` | + | `Broker Interest Paid` | `FEE` | negative direction | + | `Broker Interest Received` | `DIVIDEND` | interest treated as income | + | `Deposits & Withdrawals` | `DEPOSIT` (amount > 0) or `WITHDRAWAL` (amount < 0) | | + | `Commission Adjustments` | `FEE` | | + | anything else | skip + log WARNING with the unknown type | refuse to guess, same convention as IE provider | + + external_id = `"ibkr:cash:" + transactionID`. + +7. **Cash-flow match** — `_with_cash_flow_match(a)` from the shared + pipeline emits a matching DEPOSIT for every BUY (and WITHDRAWAL for + every SELL) so WF cash balance stays consistent. This is the existing + pattern used by T212 + IE; IBKR slots in identically. + +8. **Dedup** — `SyncRecordStore(/data/sync.db)` skips any `external_id` + already synced. Idempotent re-runs are safe. + +9. **Import** — `WealthfolioSink.import_activities(...)` POSTs to + `/api/v1/activities/import`. Existing 401 retry logic applies. + +10. **Reconciliation** — for each `OpenPositions` row: + + ```python + # compute_wf_position_qty: NEW helper in WealthfolioSink. + # Queries POST /api/v1/activities/search filtered by accountId, sums + # BUY/SELL/ADD_HOLDING/REMOVE_HOLDING quantities per asset. + wf_qty_by_asset = wf_sink.compute_position_qty(IBKR_ACCOUNT_ID) + for pos in flex_response.OpenPositions: + symbol = canonical_symbol(pos.symbol) + drift = float(pos.position) - wf_qty_by_asset.get(symbol, Decimal(0)) + push_metric( + "ibkr_position_drift_shares", + labels={"symbol": symbol, "account": "ibkr-uk"}, + value=float(drift), + ) + push_metric("ibkr_sync_last_success_timestamp_seconds", time.time()) + ``` + +11. **Exit 0** on success, non-zero on any unrecoverable error. + +## Error handling + +| Failure | Detection | Response | Alert | +|---|---|---|---| +| Token expired (Flex code 1003) | `ibflex.client.ResponseCodeError` | Exit non-zero with explicit log | `IBKRFlexTokenExpired` Loki rule + stale-success Prom alert | +| Statement generation timeout | `ibflex.client.StatementGenerationTimeout` | Retry once after 60 s, then exit non-zero | Stale-success alert catches it after 24 h | +| Empty report (quiet day) | Zero Trades + zero CashTxns | Log "no new activity", still update success timestamp, still reconcile | (none — happy path) | +| WF API 401 | HTTP status | Re-login via `WealthfolioSink` (existing logic) | (existing) | +| WF rejects an activity row | `summary.skipped > 0` | Log per-row + exit non-zero | `IBKRImportRejected` Loki rule | +| Network / DNS fail | httpx exception | Retry once with 30 s backoff | `KubeJobFailed` (existing) | +| **Position drift > 0.01 share for >24h** | Pushgateway non-zero across runs | Prom alert `IBKRPositionDrift{symbol}` warning → Slack `#security` | **NEW capability** | +| Account ID mismatch | Flex `accountId` != env var | Exit 2 immediately, write nothing | `IBKRAccountMismatch` urgent Loki rule | + +## Setup checklist (one-time) + +### Step 1 — IBKR Client Portal (manual, ~5 min) + +1. Sign in at `https://www.interactivebrokers.co.uk/` → **Account + Settings**. +2. **Reports → Settings → Flex Web Service** → Enable → copy the + one-time-displayed **Token** (1 year validity). +3. **Reports → Flex Queries → Activity Flex Query → Create New**: + - Name: `broker-sync-activity` + - Sections: `Account Information`, `Trades`, `Cash Transactions`, + `Open Positions`, `Securities Information` + - Date Format: `yyyy-MM-dd` · Time Format: `HH:mm:ss TimeZone` + - Date Range: `Last Business Day` (for daily runs; flip to + `Year to Date` only for the initial backfill — irrelevant while + account is empty) + - Format: XML + - Trade fields: ensure `tradeID`, `tradeDate`, `tradeTime`, `symbol`, + `buySell`, `quantity`, `tradePrice`, `currency`, `ibCommission`, + `assetCategory` selected. + - CashTransaction fields: `transactionID`, `dateTime`, `type`, + `amount`, `currency`, `description`. + - OpenPositions fields: `symbol`, `position`, `markPrice`, `currency`, + `assetCategory`. + - Save → copy the **Query ID** (5–7 digit number). + +### Step 2 — Vault + +```bash +vault kv patch secret/broker-sync \ + ibkr_flex_token='YOUR_TOKEN' \ + ibkr_flex_query_id='YOUR_QUERY_ID' \ + ibkr_account_id='WF_UUID_FROM_STEP_3' \ + ibkr_account_id_upstream='YOUR_IBKR_ACCOUNT_NUMBER' +``` + +### Step 3 — Create WF account (script + paste UUID back) + +```bash +# Login → POST /accounts → capture id +curl -sS -c /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/auth/login" \ + -H 'Content-Type: application/json' -d "{\"password\":\"$WF_PASSWORD\"}" +curl -sS -b /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/accounts" \ + -H 'Content-Type: application/json' \ + -d '{"name":"Interactive Brokers (UK)","accountType":"GIA","currency":"GBP","isActive":true}' \ + | jq -r '.id' +# Paste the UUID back into Vault under ibkr_account_id +``` + +### Step 4 — Initial backfill (skip while account is empty) + +When the IBKR account first holds positions, switch the Flex query +Date Range to `Year to Date`, run the CronJob manually once, verify WF +totals match the broker app, then switch the Flex query back to +`Last Business Day` for daily incremental. + +### Step 5 — Deploy + +1. Push to broker-sync `main` (direct push — personal repo convention, + no PR) → GHA builds `viktorbarzin/broker-sync:latest`. +2. `cd infra/stacks/broker-sync && scripts/tg apply` creates the new + CronJob. +3. Wait for the 02:00 UK run, or trigger manually: + `kubectl -n broker-sync create job --from=cronjob/broker-sync-ibkr broker-sync-ibkr-test-1`. +4. Verify in WF UI: account exists, activities present (if any), + reconciliation drift metric showing zero. + +## Testing + +**Unit tests** in `tests/providers/test_ibkr.py`: + +- `test_parse_trades_maps_to_activities` — canned 3-trade XML, verify + external_id, symbol mapping, quantity sign, fee sign. +- `test_parse_dividend_maps_to_dividend_activity`. +- `test_parse_unknown_cash_type_logs_warning_and_skips`. +- `test_account_id_mismatch_raises` — Flex returns a different + `accountId` than env, refuse to ingest. +- `test_position_drift_computed_correctly` — three-asset scenario, two + match, one drifts. +- `test_canonical_symbol_lse_suffix` — `VUAG` → `VUAG.L`, + `AAPL` → `AAPL` (US, no suffix), etc. + +All tests mock `ibflex.client.download` to avoid network. + +**Integration test** (manual, post-deploy): +- Trigger CronJob manually. +- Inspect logs. +- Verify in WF UI and Pushgateway. + +## Acceptance criteria + +- [ ] `broker-sync ibkr` command runs end-to-end against the real Flex Web + Service with the user's token. +- [ ] WF accepts the resulting activity imports (no `summary.skipped`). +- [ ] `ibkr_position_drift_shares` is published for every asset; drift = 0 + on a steady-state run. +- [ ] Re-running the command is idempotent — no duplicate activities + written to WF. +- [ ] CronJob completes successfully on its schedule for 7 consecutive days + before the design is marked Done. + +## Out of scope + +- Multi-account support (only one IBKR account designed in). +- Real-time data / order placement (Flex is batch-only). +- Stock split / corporate action handling — IBKR reports these in the + Flex `CorporateActions` section but we're not enabling that section + yet; revisit if it becomes needed. +- Multi-currency FX conversion math — we record per-trade currency + faithfully and let Wealthfolio do FX. If WF's FX handling proves + inadequate, a separate spec covers that. + +## Open questions + +(None at design-approval time. Captured here for future amendments.) + +## References + +- `ibflex` library docs (csingley/ibflex) +- Existing patterns in `broker_sync/providers/trading212.py` and + `broker_sync/providers/invest_engine.py` +- `~/code/infra/stacks/broker-sync/main.tf` (CronJob structure to mirror) +- 2026-05-26 Wealthfolio dedup session (motivates the reconciliation step)