broker-sync/docs/specs/2026-05-26-ibkr-ingest-design.md
Viktor Barzin 30af5fe2c9
Some checks failed
CI / test (push) Waiting to run
CI / build (push) Blocked by required conditions
CI / deploy (push) Blocked by required conditions
ci/woodpecker/push/build Pipeline was canceled
docs(ibkr): change Flex date range from Last Business Day → Last 90 Days
Trailing window backed by SyncRecordStore dedup is strictly better than
a single-day window — a single missed cron run with Last Business Day
loses that day's activity permanently. SyncRecordStore is keyed by
ibkr:trade:<tradeID> / ibkr:cash:<transactionID>, so overlapping pulls
are no-ops.

Caught during the brainstorming review.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 09:11:57 +00:00

15 KiB
Raw Blame History

IBKR Flex Ingestion — Design

Date: 2026-05-26 Status: Approved (brainstorming session 2026-05-26) Author: Viktor + Claude (Opus 4.7) Implementation plan: TBD (will be written next session via writing-plans skill)

Context

Adds Interactive Brokers (IBKR UK / IE — stocks/ETFs only) as a new broker-sync provider, pushing activities to Wealthfolio on a daily schedule alongside the existing Trading 212 / InvestEngine / Fidelity pipelines.

The user's IBKR account is currently empty (no positions, no trades). This design covers the integration as it will run once the account is funded and active. The initial backfill step in the setup checklist is a no-op until the first IBKR trade.

This work is the structural follow-on from the 2026-05-26 Wealthfolio dedup session, in which £252k of duplicated InvestEngine positions accumulated silently in WF because the IMAP and API ingestion paths emitted different external_id schemes and never reconciled against broker-reported truth. The IBKR design bakes in broker-vs-WF position reconciliation from day one — the missing capability that allowed the IE drift to grow undetected.

Decisions

D1 — Use IBKR Flex Web Service (not Client Portal API / TWS)

Flex Web Service is a token-authenticated REST endpoint returning XML statements. Suits unattended cron because:

  • One-year token validity (no daily re-auth, unlike Client Portal Gateway).
  • No sidecar / GUI / Java runtime needed.
  • Designed for periodic batch reporting — the exact shape of our pipeline.

Client Portal Web API + ibind was considered and rejected: its Gateway sidecar requires browser-based re-auth roughly every 24 hours, which is incompatible with unattended scheduling.

D2 — Library: ibflex (csingley/ibflex on PyPI)

Adds ibflex = "^0.16" to pyproject.toml. The library provides:

  • client.download(token, query_id) -> bytes — handles Flex's 2-step async API (SendRequestGetStatement polling).
  • parser.parse(xml) -> FlexQueryResponse — typed dataclasses for Trades, CashTransactions, OpenPositions, SecuritiesInfo.

Fallback (Approach B): if ibflex proves to lag IBKR schema changes, drop in raw httpx + xml.etree. Same provider shape; only the parsing internals change.

D3 — One CronJob, daily 02:00 UK, in broker-sync namespace

Matches the existing broker-sync-trading212 cadence and placement. No new namespace, no new image.

D4 — Reconciliation is mandatory, not optional

Every run computes a per-asset quantity from the Flex OpenPositions section and compares against WF's computed quantity from activities. Drift is published as a Pushgateway metric. Cross-checking broker truth is the line of defense against the IE-style silent divergence we saw on 2026-05-26.

D5 — One account, one query

Single Flex Activity Query covering Trades + Cash + Open Positions + Securities. Single Interactive Brokers (UK) account in Wealthfolio. Multiple accounts can be added later by parameterising the CLI command; not in scope now.

Architecture

broker-sync K8s namespace
├── CronJob broker-sync-ibkr   (schedule: 0 2 * * *)
│   ├── env from broker-sync-secrets:
│   │     IBKR_FLEX_TOKEN, IBKR_FLEX_QUERY_ID, IBKR_ACCOUNT_ID,
│   │     WF_BASE_URL, WF_USERNAME, WF_PASSWORD
│   ├── PVC broker-sync-data-encrypted (shared with other broker-sync jobs)
│   └── image viktorbarzin/broker-sync:<tag>  command = ["broker-sync", "ibkr"]
│
│ External calls
│  ├── HTTPS → ndcdyn.interactivebrokers.com (Flex Web Service)
│  ├── HTTP  → wealthfolio.wealthfolio.svc (activities import + position read)
│  └── HTTP  → pushgateway.monitoring.svc (drift + last-success metrics)

The provider is structurally identical to broker-sync-trading212 and the IE bearer-token path — same Vault → CronJob → provider → pipeline → WF flow. Existing alerting (CronJob-failed, ExternalSecret-stale, WF-sync-stale) applies transitively; we only add IBKR-specific alerts on top.

Components

Path Action Description
broker_sync/providers/ibkr.py NEW IBKRProvider class implementing the Provider protocol. Maps Flex XML to Activity[]. ~200 LOC.
broker_sync/cli.py MODIFY New @app.command("ibkr") typer command, parallel to trading212 and invest-engine. ~60 LOC.
pyproject.toml MODIFY Add ibflex = "^0.16" dependency.
tests/providers/test_ibkr.py NEW Fixture-based parsing tests, sign-conventions, position-drift math, account-id guard.
infra/stacks/broker-sync/main.tf MODIFY New kubernetes_cron_job_v1.ibkr resource.
Vault secret/broker-sync MODIFY Add ibkr_flex_token, ibkr_flex_query_id, ibkr_account_id.
Wealthfolio (one-time, manual) NEW data Create Interactive Brokers (UK) account; record its UUID in Vault.
docs/providers/ibkr.md NEW Production-facing provider docs (setup, query design, troubleshooting). Written after first successful run.

Data flow (per CronJob run)

  1. 02:00 UK — CronJob fires, pod starts with env from broker-sync-secrets.

  2. Downloadibflex.client.download(token, query_id) calls Flex Web Service SendRequest + GetStatement. Typical 520 s. Library handles retry/polling.

  3. Parseibflex.parser.parse(xml) produces a FlexQueryResponse.

  4. Account guard — two distinct identifiers exist:

    • IBKR_ACCOUNT_ID_UPSTREAM: the IBKR-side account number (e.g. U12345678), used to validate that the Flex report belongs to the right account.
    • IBKR_ACCOUNT_ID (alias: ibkr_account_id in Vault): the Wealthfolio account UUID (e.g. 8a3f...), used when posting activities to WF. Validate stmt.accountId == os.environ["IBKR_ACCOUNT_ID_UPSTREAM"]. Refuse to ingest on mismatch — prevents wrong-account writes from a misconfigured query.
  5. Map Trades → Activities:

    Flex Activity Notes
    Trade.tradeID external_id = "ibkr:trade:" + tradeID dedup key
    Trade.tradeDate + tradeTime date (UTC) timezone normalised
    Trade.symbol symbol canonicalised — LSE tickers get .L suffix
    Trade.buySell (BUY / SELL) activity_type direct
    Trade.quantity quantity always positive (broker-sync convention)
    Trade.tradePrice unit_price
    Trade.currency currency per-trade, multi-ccy supported
    Trade.ibCommission fee = abs(ibCommission) always positive
    Trade.assetCategory (sanity check; skip if not in {STK, ETF})
  6. Map CashTransactions → Activities:

    Flex CashTransaction.type Activity activity_type Notes
    Dividends DIVIDEND
    Withholding Tax FEE tag with notes="wht:..."
    Broker Interest Paid FEE negative direction
    Broker Interest Received DIVIDEND interest treated as income
    Deposits & Withdrawals DEPOSIT (amount > 0) or WITHDRAWAL (amount < 0)
    Commission Adjustments FEE
    anything else skip + log WARNING with the unknown type refuse to guess, same convention as IE provider

    external_id = "ibkr:cash:" + transactionID.

  7. Cash-flow match_with_cash_flow_match(a) from the shared pipeline emits a matching DEPOSIT for every BUY (and WITHDRAWAL for every SELL) so WF cash balance stays consistent. This is the existing pattern used by T212 + IE; IBKR slots in identically.

  8. DedupSyncRecordStore(/data/sync.db) skips any external_id already synced. Idempotent re-runs are safe.

  9. ImportWealthfolioSink.import_activities(...) POSTs to /api/v1/activities/import. Existing 401 retry logic applies.

  10. Reconciliation — for each OpenPositions row:

    # compute_wf_position_qty: NEW helper in WealthfolioSink.
    # Queries POST /api/v1/activities/search filtered by accountId, sums
    # BUY/SELL/ADD_HOLDING/REMOVE_HOLDING quantities per asset.
    wf_qty_by_asset = wf_sink.compute_position_qty(IBKR_ACCOUNT_ID)
    for pos in flex_response.OpenPositions:
        symbol = canonical_symbol(pos.symbol)
        drift = float(pos.position) - wf_qty_by_asset.get(symbol, Decimal(0))
        push_metric(
            "ibkr_position_drift_shares",
            labels={"symbol": symbol, "account": "ibkr-uk"},
            value=float(drift),
        )
    push_metric("ibkr_sync_last_success_timestamp_seconds", time.time())
    
  11. Exit 0 on success, non-zero on any unrecoverable error.

Error handling

Failure Detection Response Alert
Token expired (Flex code 1003) ibflex.client.ResponseCodeError Exit non-zero with explicit log IBKRFlexTokenExpired Loki rule + stale-success Prom alert
Statement generation timeout ibflex.client.StatementGenerationTimeout Retry once after 60 s, then exit non-zero Stale-success alert catches it after 24 h
Empty report (quiet day) Zero Trades + zero CashTxns Log "no new activity", still update success timestamp, still reconcile (none — happy path)
WF API 401 HTTP status Re-login via WealthfolioSink (existing logic) (existing)
WF rejects an activity row summary.skipped > 0 Log per-row + exit non-zero IBKRImportRejected Loki rule
Network / DNS fail httpx exception Retry once with 30 s backoff KubeJobFailed (existing)
Position drift > 0.01 share for >24h Pushgateway non-zero across runs Prom alert IBKRPositionDrift{symbol} warning → Slack #security NEW capability
Account ID mismatch Flex accountId != env var Exit 2 immediately, write nothing IBKRAccountMismatch urgent Loki rule

Setup checklist (one-time)

Step 1 — IBKR Client Portal (manual, ~5 min)

  1. Sign in at https://www.interactivebrokers.co.uk/Account Settings.
  2. Reports → Settings → Flex Web Service → Enable → copy the one-time-displayed Token (1 year validity).
  3. Reports → Flex Queries → Activity Flex Query → Create New:
    • Name: broker-sync-activity
    • Sections: Account Information, Trades, Cash Transactions, Open Positions, Securities Information
    • Date Format: yyyy-MM-dd · Time Format: HH:mm:ss TimeZone
    • Date Range: Last 90 Days — trailing window so a missed cron run (failed pod, outage, vacation) doesn't lose data. SyncRecordStore keys on ibkr:trade:<tradeID> / ibkr:cash:<transactionID>, so overlapping pulls are no-ops. Last Business Day was the original choice but creates a "single missed run = permanent data loss" failure mode — rejected in favour of dedup-backed resync window.
    • Format: XML
    • Trade fields: ensure tradeID, tradeDate, tradeTime, symbol, buySell, quantity, tradePrice, currency, ibCommission, assetCategory selected.
    • CashTransaction fields: transactionID, dateTime, type, amount, currency, description.
    • OpenPositions fields: symbol, position, markPrice, currency, assetCategory.
    • Save → copy the Query ID (57 digit number).

Step 2 — Vault

vault kv patch secret/broker-sync \
  ibkr_flex_token='YOUR_TOKEN' \
  ibkr_flex_query_id='YOUR_QUERY_ID' \
  ibkr_account_id='WF_UUID_FROM_STEP_3' \
  ibkr_account_id_upstream='YOUR_IBKR_ACCOUNT_NUMBER'

Step 3 — Create WF account (script + paste UUID back)

# Login → POST /accounts → capture id
curl -sS -c /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/auth/login" \
  -H 'Content-Type: application/json' -d "{\"password\":\"$WF_PASSWORD\"}"
curl -sS -b /tmp/wf-jar -X POST "$WF_BASE_URL/api/v1/accounts" \
  -H 'Content-Type: application/json' \
  -d '{"name":"Interactive Brokers (UK)","accountType":"GIA","currency":"GBP","isActive":true}' \
  | jq -r '.id'
# Paste the UUID back into Vault under ibkr_account_id

Step 4 — Initial backfill (skip while account is empty)

When the IBKR account first holds positions, the daily CronJob will backfill automatically up to the 90-day trailing window. For older history, temporarily switch the Flex query Date Range to Year to Date (or Custom Date Range with a 1-year window), run the CronJob manually once, verify WF totals match the broker app, then switch the Flex query back to Last 90 Days for daily incremental. Dedup makes the temporary widening safe — already-synced rows are no-ops.

Step 5 — Deploy

  1. Push to broker-sync main (direct push — personal repo convention, no PR) → GHA builds viktorbarzin/broker-sync:latest.
  2. cd infra/stacks/broker-sync && scripts/tg apply creates the new CronJob.
  3. Wait for the 02:00 UK run, or trigger manually: kubectl -n broker-sync create job --from=cronjob/broker-sync-ibkr broker-sync-ibkr-test-1.
  4. Verify in WF UI: account exists, activities present (if any), reconciliation drift metric showing zero.

Testing

Unit tests in tests/providers/test_ibkr.py:

  • test_parse_trades_maps_to_activities — canned 3-trade XML, verify external_id, symbol mapping, quantity sign, fee sign.
  • test_parse_dividend_maps_to_dividend_activity.
  • test_parse_unknown_cash_type_logs_warning_and_skips.
  • test_account_id_mismatch_raises — Flex returns a different accountId than env, refuse to ingest.
  • test_position_drift_computed_correctly — three-asset scenario, two match, one drifts.
  • test_canonical_symbol_lse_suffixVUAGVUAG.L, AAPLAAPL (US, no suffix), etc.

All tests mock ibflex.client.download to avoid network.

Integration test (manual, post-deploy):

  • Trigger CronJob manually.
  • Inspect logs.
  • Verify in WF UI and Pushgateway.

Acceptance criteria

  • broker-sync ibkr command runs end-to-end against the real Flex Web Service with the user's token.
  • WF accepts the resulting activity imports (no summary.skipped).
  • ibkr_position_drift_shares is published for every asset; drift = 0 on a steady-state run.
  • Re-running the command is idempotent — no duplicate activities written to WF.
  • CronJob completes successfully on its schedule for 7 consecutive days before the design is marked Done.

Out of scope

  • Multi-account support (only one IBKR account designed in).
  • Real-time data / order placement (Flex is batch-only).
  • Stock split / corporate action handling — IBKR reports these in the Flex CorporateActions section but we're not enabling that section yet; revisit if it becomes needed.
  • Multi-currency FX conversion math — we record per-trade currency faithfully and let Wealthfolio do FX. If WF's FX handling proves inadequate, a separate spec covers that.

Open questions

(None at design-approval time. Captured here for future amendments.)

References

  • ibflex library docs (csingley/ibflex)
  • Existing patterns in broker_sync/providers/trading212.py and broker_sync/providers/invest_engine.py
  • ~/code/infra/stacks/broker-sync/main.tf (CronJob structure to mirror)
  • 2026-05-26 Wealthfolio dedup session (motivates the reconciliation step)