docs: add IBKR Flex ingestion design spec
Some checks failed
CI / test (push) Waiting to run
CI / build (push) Blocked by required conditions
CI / deploy (push) Blocked by required conditions
ci/woodpecker/push/build Pipeline failed

Adds a daily IBKR Flex Web Service → Wealthfolio ingestion path alongside
the existing T212 / IE / Fidelity providers. Uses the ibflex library;
mandatory broker-vs-WF position reconciliation built in from day one to
prevent the silent-drift class of bug we hit with InvestEngine on 2026-05-26.

Account is currently empty so the initial backfill step is a no-op until
the first IBKR trade lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-05-26 21:52:52 +00:00
parent e6ef1fce97
commit ca5f98f771

View file

@ -0,0 +1,322 @@
# IBKR Flex Ingestion — Design
**Date:** 2026-05-26
**Status:** Approved (brainstorming session 2026-05-26)
**Author:** Viktor + Claude (Opus 4.7)
**Implementation plan:** TBD (will be written next session via writing-plans skill)
## Context
Adds Interactive Brokers (IBKR UK / IE — stocks/ETFs only) as a new
broker-sync provider, pushing activities to Wealthfolio on a daily
schedule alongside the existing Trading 212 / InvestEngine / Fidelity
pipelines.
The user's IBKR account is **currently empty** (no positions, no trades).
This design covers the integration as it will run once the account is
funded and active. The initial backfill step in the setup checklist is a
no-op until the first IBKR trade.
This work is the structural follow-on from the 2026-05-26 Wealthfolio
dedup session, in which £252k of duplicated InvestEngine positions
accumulated silently in WF because the IMAP and API ingestion paths
emitted different `external_id` schemes and never reconciled against
broker-reported truth. The IBKR design bakes in **broker-vs-WF position
reconciliation from day one** — the missing capability that allowed the
IE drift to grow undetected.
## Decisions
### D1 — Use IBKR Flex Web Service (not Client Portal API / TWS)
Flex Web Service is a token-authenticated REST endpoint returning XML
statements. Suits unattended cron because:
- One-year token validity (no daily re-auth, unlike Client Portal Gateway).
- No sidecar / GUI / Java runtime needed.
- Designed for periodic batch reporting — the exact shape of our pipeline.
Client Portal Web API + `ibind` was considered and rejected: its Gateway
sidecar requires browser-based re-auth roughly every 24 hours, which is
incompatible with unattended scheduling.
### D2 — Library: `ibflex` (`csingley/ibflex` on PyPI)
Adds `ibflex = "^0.16"` to `pyproject.toml`. The library provides:
- `client.download(token, query_id) -> bytes` — handles Flex's 2-step
async API (`SendRequest``GetStatement` polling).
- `parser.parse(xml) -> FlexQueryResponse` — typed dataclasses for
`Trades`, `CashTransactions`, `OpenPositions`, `SecuritiesInfo`.
Fallback (Approach B): if `ibflex` proves to lag IBKR schema changes, drop
in raw `httpx` + `xml.etree`. Same provider shape; only the parsing
internals change.
### D3 — One CronJob, daily 02:00 UK, in `broker-sync` namespace
Matches the existing `broker-sync-trading212` cadence and placement. No
new namespace, no new image.
### D4 — Reconciliation is mandatory, not optional
Every run computes a per-asset quantity from the Flex
`OpenPositions` section and compares against WF's computed quantity from
activities. Drift is published as a Pushgateway metric. Cross-checking
broker truth is the line of defense against the IE-style silent
divergence we saw on 2026-05-26.
### D5 — One account, one query
Single Flex Activity Query covering Trades + Cash + Open Positions +
Securities. Single `Interactive Brokers (UK)` account in Wealthfolio.
Multiple accounts can be added later by parameterising the CLI command;
not in scope now.
## Architecture
```
broker-sync K8s namespace
├── CronJob broker-sync-ibkr (schedule: 0 2 * * *)
│ ├── env from broker-sync-secrets:
│ │ IBKR_FLEX_TOKEN, IBKR_FLEX_QUERY_ID, IBKR_ACCOUNT_ID,
│ │ WF_BASE_URL, WF_USERNAME, WF_PASSWORD
│ ├── PVC broker-sync-data-encrypted (shared with other broker-sync jobs)
│ └── image viktorbarzin/broker-sync:<tag> command = ["broker-sync", "ibkr"]
│ External calls
│ ├── HTTPS → ndcdyn.interactivebrokers.com (Flex Web Service)
│ ├── HTTP → wealthfolio.wealthfolio.svc (activities import + position read)
│ └── HTTP → pushgateway.monitoring.svc (drift + last-success metrics)
```
The provider is structurally identical to `broker-sync-trading212` and
the IE bearer-token path — same Vault → CronJob → provider → pipeline →
WF flow. Existing alerting (CronJob-failed, ExternalSecret-stale,
WF-sync-stale) applies transitively; we only add IBKR-specific alerts on
top.
## Components
| Path | Action | Description |
|---|---|---|
| `broker_sync/providers/ibkr.py` | NEW | `IBKRProvider` class implementing the `Provider` protocol. Maps Flex XML to `Activity[]`. ~200 LOC. |
| `broker_sync/cli.py` | MODIFY | New `@app.command("ibkr")` typer command, parallel to `trading212` and `invest-engine`. ~60 LOC. |
| `pyproject.toml` | MODIFY | Add `ibflex = "^0.16"` dependency. |
| `tests/providers/test_ibkr.py` | NEW | Fixture-based parsing tests, sign-conventions, position-drift math, account-id guard. |
| `infra/stacks/broker-sync/main.tf` | MODIFY | New `kubernetes_cron_job_v1.ibkr` resource. |
| Vault `secret/broker-sync` | MODIFY | Add `ibkr_flex_token`, `ibkr_flex_query_id`, `ibkr_account_id`. |
| Wealthfolio (one-time, manual) | NEW data | Create `Interactive Brokers (UK)` account; record its UUID in Vault. |
| `docs/providers/ibkr.md` | NEW | Production-facing provider docs (setup, query design, troubleshooting). Written after first successful run. |
## Data flow (per CronJob run)
1. **02:00 UK** — CronJob fires, pod starts with env from `broker-sync-secrets`.
2. **Download**`ibflex.client.download(token, query_id)` calls Flex
Web Service `SendRequest` + `GetStatement`. Typical 520 s. Library
handles retry/polling.
3. **Parse**`ibflex.parser.parse(xml)` produces a
`FlexQueryResponse`.
4. **Account guard** — two distinct identifiers exist:
- **IBKR_ACCOUNT_ID_UPSTREAM**: the IBKR-side account number
(e.g. `U12345678`), used to validate that the Flex report belongs to
the right account.
- **IBKR_ACCOUNT_ID** (alias: `ibkr_account_id` in Vault): the
Wealthfolio account UUID (e.g. `8a3f...`), used when posting
activities to WF.
Validate `stmt.accountId == os.environ["IBKR_ACCOUNT_ID_UPSTREAM"]`.
Refuse to ingest on mismatch — prevents wrong-account writes from a
misconfigured query.
5. **Map Trades → Activities**:
| Flex | Activity | Notes |
|---|---|---|
| `Trade.tradeID` | `external_id = "ibkr:trade:" + tradeID` | dedup key |
| `Trade.tradeDate + tradeTime` | `date` (UTC) | timezone normalised |
| `Trade.symbol` | `symbol` | canonicalised — LSE tickers get `.L` suffix |
| `Trade.buySell` (BUY / SELL) | `activity_type` | direct |
| `Trade.quantity` | `quantity` | always positive (broker-sync convention) |
| `Trade.tradePrice` | `unit_price` | |
| `Trade.currency` | `currency` | per-trade, multi-ccy supported |
| `Trade.ibCommission` | `fee = abs(ibCommission)` | always positive |
| `Trade.assetCategory` | (sanity check; skip if not in {STK, ETF}) |
6. **Map CashTransactions → Activities**:
| Flex `CashTransaction.type` | Activity `activity_type` | Notes |
|---|---|---|
| `Dividends` | `DIVIDEND` | |
| `Withholding Tax` | `FEE` | tag with `notes="wht:..."` |
| `Broker Interest Paid` | `FEE` | negative direction |
| `Broker Interest Received` | `DIVIDEND` | interest treated as income |
| `Deposits & Withdrawals` | `DEPOSIT` (amount > 0) or `WITHDRAWAL` (amount < 0) | |
| `Commission Adjustments` | `FEE` | |
| anything else | skip + log WARNING with the unknown type | refuse to guess, same convention as IE provider |
external_id = `"ibkr:cash:" + transactionID`.
7. **Cash-flow match**`_with_cash_flow_match(a)` from the shared
pipeline emits a matching DEPOSIT for every BUY (and WITHDRAWAL for
every SELL) so WF cash balance stays consistent. This is the existing
pattern used by T212 + IE; IBKR slots in identically.
8. **Dedup**`SyncRecordStore(/data/sync.db)` skips any `external_id`
already synced. Idempotent re-runs are safe.
9. **Import**`WealthfolioSink.import_activities(...)` POSTs to
`/api/v1/activities/import`. Existing 401 retry logic applies.
10. **Reconciliation** — for each `OpenPositions` row:
```python
# compute_wf_position_qty: NEW helper in WealthfolioSink.
# Queries POST /api/v1/activities/search filtered by accountId, sums
# BUY/SELL/ADD_HOLDING/REMOVE_HOLDING quantities per asset.
wf_qty_by_asset = wf_sink.compute_position_qty(IBKR_ACCOUNT_ID)
for pos in flex_response.OpenPositions:
symbol = canonical_symbol(pos.symbol)
drift = float(pos.position) - wf_qty_by_asset.get(symbol, Decimal(0))
push_metric(
"ibkr_position_drift_shares",
labels={"symbol": symbol, "account": "ibkr-uk"},
value=float(drift),
)
push_metric("ibkr_sync_last_success_timestamp_seconds", time.time())
```
11. **Exit 0** on success, non-zero on any unrecoverable error.
## Error handling
| Failure | Detection | Response | Alert |
|---|---|---|---|
| Token expired (Flex code 1003) | `ibflex.client.ResponseCodeError` | Exit non-zero with explicit log | `IBKRFlexTokenExpired` Loki rule + stale-success Prom alert |
| Statement generation timeout | `ibflex.client.StatementGenerationTimeout` | Retry once after 60 s, then exit non-zero | Stale-success alert catches it after 24 h |
| Empty report (quiet day) | Zero Trades + zero CashTxns | Log "no new activity", still update success timestamp, still reconcile | (none — happy path) |
| WF API 401 | HTTP status | Re-login via `WealthfolioSink` (existing logic) | (existing) |
| WF rejects an activity row | `summary.skipped > 0` | Log per-row + exit non-zero | `IBKRImportRejected` Loki rule |
| Network / DNS fail | httpx exception | Retry once with 30 s backoff | `KubeJobFailed` (existing) |
| **Position drift > 0.01 share for >24h** | Pushgateway non-zero across runs | Prom alert `IBKRPositionDrift{symbol}` warning → Slack `#security` | **NEW capability** |
| Account ID mismatch | Flex `accountId` != env var | Exit 2 immediately, write nothing | `IBKRAccountMismatch` urgent Loki rule |
## Setup checklist (one-time)
### Step 1 — IBKR Client Portal (manual, ~5 min)
1. Sign in at `https://www.interactivebrokers.co.uk/` → **Account
Settings**.
2. **Reports → Settings → Flex Web Service** → Enable → copy the
one-time-displayed **Token** (1 year validity).
3. **Reports → Flex Queries → Activity Flex Query → Create New**:
- Name: `broker-sync-activity`
- Sections: `Account Information`, `Trades`, `Cash Transactions`,
`Open Positions`, `Securities Information`
- Date Format: `yyyy-MM-dd` · Time Format: `HH:mm:ss TimeZone`
- Date Range: `Last Business Day` (for daily runs; flip to
`Year to Date` only for the initial backfill — irrelevant while
account is empty)
- Format: XML
- Trade fields: ensure `tradeID`, `tradeDate`, `tradeTime`, `symbol`,
`buySell`, `quantity`, `tradePrice`, `currency`, `ibCommission`,
`assetCategory` selected.
- CashTransaction fields: `transactionID`, `dateTime`, `type`,
`amount`, `currency`, `description`.
- OpenPositions fields: `symbol`, `position`, `markPrice`, `currency`,
`assetCategory`.
- Save → copy the **Query ID** (57 digit number).
### Step 2 — Vault
```bash
vault kv patch secret/broker-sync \
ibkr_flex_token='YOUR_TOKEN' \
ibkr_flex_query_id='YOUR_QUERY_ID' \
ibkr_account_id='WF_UUID_FROM_STEP_3' \
ibkr_account_id_upstream='YOUR_IBKR_ACCOUNT_NUMBER'
```
### Step 3 — Create WF account (script + paste UUID back)
```bash
# Login → POST /accounts → capture id
curl -sS -c /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/auth/login" \
-H 'Content-Type: application/json' -d "{\"password\":\"$WF_PASSWORD\"}"
curl -sS -b /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/accounts" \
-H 'Content-Type: application/json' \
-d '{"name":"Interactive Brokers (UK)","accountType":"GIA","currency":"GBP","isActive":true}' \
| jq -r '.id'
# Paste the UUID back into Vault under ibkr_account_id
```
### Step 4 — Initial backfill (skip while account is empty)
When the IBKR account first holds positions, switch the Flex query
Date Range to `Year to Date`, run the CronJob manually once, verify WF
totals match the broker app, then switch the Flex query back to
`Last Business Day` for daily incremental.
### Step 5 — Deploy
1. Push to broker-sync `main` (direct push — personal repo convention,
no PR) → GHA builds `viktorbarzin/broker-sync:latest`.
2. `cd infra/stacks/broker-sync && scripts/tg apply` creates the new
CronJob.
3. Wait for the 02:00 UK run, or trigger manually:
`kubectl -n broker-sync create job --from=cronjob/broker-sync-ibkr broker-sync-ibkr-test-1`.
4. Verify in WF UI: account exists, activities present (if any),
reconciliation drift metric showing zero.
## Testing
**Unit tests** in `tests/providers/test_ibkr.py`:
- `test_parse_trades_maps_to_activities` — canned 3-trade XML, verify
external_id, symbol mapping, quantity sign, fee sign.
- `test_parse_dividend_maps_to_dividend_activity`.
- `test_parse_unknown_cash_type_logs_warning_and_skips`.
- `test_account_id_mismatch_raises` — Flex returns a different
`accountId` than env, refuse to ingest.
- `test_position_drift_computed_correctly` — three-asset scenario, two
match, one drifts.
- `test_canonical_symbol_lse_suffix``VUAG``VUAG.L`,
`AAPL``AAPL` (US, no suffix), etc.
All tests mock `ibflex.client.download` to avoid network.
**Integration test** (manual, post-deploy):
- Trigger CronJob manually.
- Inspect logs.
- Verify in WF UI and Pushgateway.
## Acceptance criteria
- [ ] `broker-sync ibkr` command runs end-to-end against the real Flex Web
Service with the user's token.
- [ ] WF accepts the resulting activity imports (no `summary.skipped`).
- [ ] `ibkr_position_drift_shares` is published for every asset; drift = 0
on a steady-state run.
- [ ] Re-running the command is idempotent — no duplicate activities
written to WF.
- [ ] CronJob completes successfully on its schedule for 7 consecutive days
before the design is marked Done.
## Out of scope
- Multi-account support (only one IBKR account designed in).
- Real-time data / order placement (Flex is batch-only).
- Stock split / corporate action handling — IBKR reports these in the
Flex `CorporateActions` section but we're not enabling that section
yet; revisit if it becomes needed.
- Multi-currency FX conversion math — we record per-trade currency
faithfully and let Wealthfolio do FX. If WF's FX handling proves
inadequate, a separate spec covers that.
## Open questions
(None at design-approval time. Captured here for future amendments.)
## References
- `ibflex` library docs (csingley/ibflex)
- Existing patterns in `broker_sync/providers/trading212.py` and
`broker_sync/providers/invest_engine.py`
- `~/code/infra/stacks/broker-sync/main.tf` (CronJob structure to mirror)
- 2026-05-26 Wealthfolio dedup session (motivates the reconciliation step)