add Meet Kevin v2 paper-trading + backtest + UI design

Synthesizes work of two parallel architect agents (strategy +
paper-trading rules / backtest + UI surface) and the subsequent
challenger review. Resolves 11 issues the challenger raised:

- KevinStrategy is standalone, not BaseStrategy subclass (signature
  mismatch — BaseStrategy.evaluate is bar-driven, Kevin is event-driven)
- backtester/kevin_backtest.py as parallel mention-driven mini-engine,
  not a fake adapter onto BacktestEngine
- AlpacaBroker BRACKET support specified (OrderRequest schema + broker
  _build_order_request extensions)
- Filtering paper-account trades via strategy_id FK (the actual field;
  Trade.strategy_name doesn't exist) — migration seeds a 'kevin' row
- Cursor advance race fixed (XADD success → cursor advance)
- Daily counter mechanics specified (Redis INCR + audit dedupe)
- kevin_signal_bridge_state table added to data model (3 new tables now)
- All PKs UUID for consistency with Trade/Position
- StrategyVsBenchmarkCurve.tsx promoted from contingent to definitely-new
- 'avoid' policy split into AVOID_CLOSES_LONGS + AVOID_BLOCKS_DAYS knobs
- Phasing collapsed A+B into Phase 1 (ticker scorecard needs bridge
  audit rows to render WOULD-TRADE badges)
Viktor Barzin 2026-05-23 10:04:04 +00:00
parent ed2195d879
commit 280f807236

@ -0,0 +1,633 @@
# Meet Kevin v2 — Paper Trading + Backtest + Strategy UI — Design
> Synthesises the work of two parallel architect agents:
> - `kevin-strategy-design.md` (strategy + paper-trading rules) — Architect A
> - `kevin-backtest-ui-design.md` (backtest + UI surface) — Architect B
## Goal
Translate Meet Kevin's per-mention signals into automated Alpaca paper trades, surface the strategy's performance against a benchmark in a dedicated UI, and validate it with a backtest before any real money is considered.
## Premise
A new `kevin_stock_mentions` row is the primary stimulus. Everything else is plumbing. **Long-only on Alpaca paper for v1.**
## Scope summary
| Area | Owner | New / Existing |
|---|---|---|
| Strategy logic (signal → trade) | `shared/strategies/kevin.py` (NEW) | Standalone class (not a `BaseStrategy` subclass) |
| Mention → signal bridging | `services/kevin_signal_bridge/` (NEW) | New service container |
| Trade execution | `services/trade_executor/` | EXISTING — re-enable, no behavior change |
| Risk controls | `services/trade_executor/risk_manager.py` | EXISTING — extend in place |
| Backtest engine | `backtester/kevin_backtest.py` (NEW) | Parallel mention-driven mini-engine; existing `BacktestEngine` untouched |
| Metrics | `backtester/metrics.py` | EXISTING — extend `compute_metrics` |
| Backtest API + persistence | `services/api_gateway/routes/meet_kevin_backtest.py` (NEW) + 2 tables | New |
| Strategy UI page | `dashboard/src/pages/meetKevin/Strategy.tsx` (NEW) | New |
| Paper account UI page | `dashboard/src/pages/meetKevin/PaperAccount.tsx` (NEW) | ~50-line clone of `Portfolio.tsx` |
Total new files: ~12. Total modified files: ~6. Estimated implementer time: 68 hours across a subagent team.
---
## Architecture
```
Postgres
┌──────────────────────┐
│ kevin_stock_mentions │ ← (v1 watcher writes; unchanged)
└──────────┬───────────┘
│ poll every 60s
┌──────────────────────────────────────────────────────────┐
│ services/kevin_signal_bridge/ NEW │
│ - cursor in Redis (kevin:bridge:last_mention_id) │
│ - applies signal filter (action, conviction, age) │
│ - aggregates multi-mention boost │
│ - checks Redis blocklist + per-day caps │
│ - calls KevinStrategy.evaluate_mention(mention, account_state) │
│ - daily exit-scan job @ 09:35 ET (time-based exits) │
│ - watches trades:executed for fill confirmations │
│ - master kill-switch: TRADING_KEVIN_ENABLE_TRADING=false │
└────────────────┬─────────────────────────────────────────┘
│ publishes TradeSignal with
│ strategy_sources=["kevin:<action>:<conviction>"]
Redis Stream signals:generated
┌──────────────────────────────────────────────────────────┐
│ services/trade_executor/ EXISTING │
│ - RiskManager.check_risk (extended with new caps) │
│ - AlpacaBroker.submit_order (BRACKET: stop+take-profit)│
│ - Trade persistence with strategy_id = seeded 'kevin' row │
│ - publishes trades:executed │
└────────────────┬─────────────────────────────────────────┘
Alpaca Paper (long-only, BRACKET orders)
Redis Stream trades:executed
portfolio_sync (existing) — equity curve, positions
Parallel observability path
┌──────────────────────────────────────────────────────────┐
│ services/api_gateway/routes/meet_kevin_backtest.py NEW │
│ - POST /backtest/run → async job, polls Redis status │
│ - calls the mention-driven mini-engine in │
│ backtester/kevin_backtest.py (uses KevinStrategy directly) │
│ - persists final run to kevin_backtest_runs │
│ + per-trade detail in kevin_backtest_trades │
│ - meet_kevin_watcher gains a daily post-poll cron │
│ stage that triggers the scheduled run │
└──────────────────────────────────────────────────────────┘
```
### Key architectural rule
**One source of truth for strategy behavior: `shared/strategies/kevin.py::KevinStrategy`.**
`KevinStrategy` is a **stand-alone class**, NOT a `BaseStrategy` subclass — the existing `BaseStrategy.evaluate(ticker, market: MarketSnapshot, sentiment)` signature is bar/article-driven and would force the Kevin code to fabricate fake `MarketSnapshot` objects. Instead `KevinStrategy` exposes:
```python
async def evaluate_mention(
    self,
    mention: KevinStockMention,
    account_state: KevinAccountState,
) -> KevinDecision | None: ...
```
Both the live `kevin_signal_bridge` AND the backtest engine call this method directly. The unification guarantee comes from "one class, one method, called by both" — not from an inheritance decoration. This applies to:
- Action filtering (which actions trigger trades)
- Conviction floor
- Position sizing (conviction-weighted)
- Holding period (time_horizon-derived)
- Multi-mention aggregation (latest-wins + conviction boost)
- Long-only treatment (sell/avoid semantics)
The backtest *parameterises* slippage + commission + initial capital (per-run knobs) but **never overrides** strategy logic.
---
## Signal translation rules
A `kevin_stock_mentions` row produces (or cancels) exactly one `TradeSignal` when **all** of the following hold:
| Condition | Threshold | Rationale |
|---|---|---|
| `action` ∈ {buy, sell, avoid} | Mandatory | hold/watch never trade — UI-only |
| `conviction` | `>= 0.60` | Kevin makes many low-conviction asides |
| `time_horizon` | not `intraday` | 3h poll cadence can't catch intraday |
| Age of mention | `<= 48h` since `created_at` | Past 48h, thesis is stale; backtest handles older |
| Symbol tradable on Alpaca | `broker.get_asset(symbol).tradable == True` | Skip OTC / foreign / crypto |
| `asset.class_ == 'us_equity'` | Hardcoded | v1: no crypto, no options |
**`time_horizon` → target hold period** (used for time-based exit):
| `time_horizon` | Trading days held |
|---|---|
| `days` | 3 |
| `weeks` | 10 |
| `months` | 45 |
| `long_term` | 90 (re-enter via fresh mention if Kevin re-affirms) |
| `unspecified` | 10 (default to weeks — Kevin's modal horizon) |
| `intraday` | filtered out |
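
The filter and hold-period tables above can be read as two small pure functions. The following is a minimal sketch only: the `Mention` dataclass and the function names are hypothetical stand-ins rather than the real `KevinStockMention` model, the Alpaca tradability checks are omitted because they need a broker call, and the fallback for unknown horizons is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Thresholds copied from the tables above; constant names are illustrative.
TRADABLE_ACTIONS = {"buy", "sell", "avoid"}
MIN_CONVICTION = 0.60
MAX_MENTION_AGE = timedelta(hours=48)
HOLD_DAYS = {"days": 3, "weeks": 10, "months": 45, "long_term": 90, "unspecified": 10}


@dataclass(frozen=True)
class Mention:
    """Hypothetical stand-in for a kevin_stock_mentions row."""
    symbol: str
    action: str
    conviction: float
    time_horizon: str
    created_at: datetime


def passes_filter(mention: Mention, now: datetime) -> bool:
    """Apply the action, conviction, horizon, and age conditions."""
    return (
        mention.action in TRADABLE_ACTIONS
        and mention.conviction >= MIN_CONVICTION
        and mention.time_horizon != "intraday"
        and now - mention.created_at <= MAX_MENTION_AGE
    )


def holding_days(mention: Mention) -> int:
    """Trading days to hold. Assumption: unknown horizons use the 'unspecified' default."""
    return HOLD_DAYS.get(mention.time_horizon, HOLD_DAYS["unspecified"])
```

Keeping both functions free of I/O would let the live bridge and the backtest share them unchanged.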
**Long-only direction mapping:**
| Action | Currently held? | Behavior |
|---|---|---|
| `buy` | no | Open at sized target |
| `buy` | yes, below per-ticker cap | Top up to cap |
| `buy` | yes, at cap | No-op |
| `sell` | yes | Emit market SELL for full position |
| `sell` | no | No-op (never short in v1) |
| `avoid` | yes | Emit market SELL (controlled by `AVOID_CLOSES_LONGS=true`) + add Redis blocklist `kevin:blocked:{symbol}` with TTL `AVOID_BLOCKS_DAYS=7` |
| `avoid` | no | Add to blocklist; no trade |
| `hold` | — | Never trade |
| `watch` | — | Never trade |

**Note on `avoid` policy:** the close-action and the blocklist are split into two env knobs (`TRADING_KEVIN_AVOID_CLOSES_LONGS`, `TRADING_KEVIN_AVOID_BLOCKS_DAYS`). A subsequent `buy` mention on the same ticker **clears** the blocklist early — "latest mention wins" applies symmetrically.
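
The long-only direction mapping can be sketched as a single decision function. The `Behavior` enum and `resolve` are hypothetical names for illustration. The table does not spell out what happens to a held position on `avoid` when `AVOID_CLOSES_LONGS` is false, so the block-only branch for that case is an assumption.

```python
from enum import Enum


class Behavior(Enum):
    OPEN = "open"                        # open at sized target
    TOP_UP = "top_up"                    # top up to the per-ticker cap
    CLOSE = "close"                      # market SELL for the full position
    CLOSE_AND_BLOCK = "close_and_block"  # market SELL plus blocklist entry
    BLOCK_ONLY = "block_only"            # blocklist entry, no trade
    NO_OP = "no_op"


def resolve(action: str, held: bool, at_cap: bool = False,
            avoid_closes_longs: bool = True) -> Behavior:
    """Map (action, position state) to a behaviour per the direction table."""
    if action == "buy":
        if not held:
            return Behavior.OPEN
        return Behavior.NO_OP if at_cap else Behavior.TOP_UP
    if action == "sell":
        # Never short in v1: a sell without a position does nothing.
        return Behavior.CLOSE if held else Behavior.NO_OP
    if action == "avoid":
        if held and avoid_closes_longs:
            return Behavior.CLOSE_AND_BLOCK
        # Assumption: with the close knob off, a held position is only blocklisted.
        return Behavior.BLOCK_ONLY
    # hold / watch (and anything unrecognised) never trade.
    return Behavior.NO_OP
```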
---
## Position sizing
Conviction-weighted fixed-fractional:
```
base_pct = 0.04 # 4% of equity for max-conviction
conviction_mult = (conviction - 0.6) / 0.4 # 0.6 → 0.0; 1.0 → 1.0
target_pct = base_pct * (0.5 + 0.5 * conviction_mult)
target_dollars = clamp(account.equity * target_pct, MIN_USD, MAX_USD)
qty = floor(target_dollars / current_price)
```
| Conviction | `target_pct` | $ for $100k equity |
|---|---|---|
| 0.60 | 2.0% | $2,000 |
| 0.80 | 3.0% | $3,000 |
| 1.00 | 4.0% | $4,000 |
**Hard caps:**
- Per-trade floor: $500
- Per-trade ceiling: $5,000 (5% of $100k starting equity)
- Per-ticker total cost basis ceiling: $7,500 (absorbs multi-mention boost)
Rejected Kelly sizing (no historical edge data yet). Revisit after backtest reports per-conviction-bucket hit rates.
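
As a worked check of the sizing formula and the per-trade floor and ceiling, here is a minimal sketch using `Decimal` to avoid float rounding on dollar amounts. The function names are hypothetical; the constants come from the formula and hard caps above.

```python
from decimal import Decimal

BASE_PCT = Decimal("0.04")          # 4% of equity at max conviction
CONVICTION_FLOOR = Decimal("0.60")
MIN_USD = Decimal("500")            # per-trade floor
MAX_USD = Decimal("5000")           # per-trade ceiling


def target_dollars(equity: Decimal, conviction: Decimal) -> Decimal:
    """Conviction-weighted fixed-fractional target, clamped to the per-trade caps."""
    conviction_mult = (conviction - CONVICTION_FLOOR) / (Decimal("1") - CONVICTION_FLOOR)
    target_pct = BASE_PCT * (Decimal("0.5") + Decimal("0.5") * conviction_mult)
    return min(max(equity * target_pct, MIN_USD), MAX_USD)


def share_qty(equity: Decimal, conviction: Decimal, price: Decimal) -> int:
    """Whole shares only: floor(target_dollars / current_price)."""
    return int(target_dollars(equity, conviction) // price)
```

For $100k equity this reproduces the table: conviction 0.60, 0.80, and 1.00 give $2,000, $3,000, and $4,000. The floor only matters for small accounts, for example $10k equity at conviction 0.60 is lifted from $200 to $500.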
---
## Exit logic
Closes on **first** of:
1. **Time-based:** `holding_days` for that mention's `time_horizon` (table above). Daily cron in `kevin_signal_bridge` at 09:35 ET scans open Kevin trades, emits SELL.
2. **Stop-loss:** -8% from entry. Implemented as Alpaca BRACKET order (`order_class=BRACKET, stop_loss=...`). **Requires extending the broker abstraction** — see §"Broker extension" below.
3. **Take-profit:** +20% from entry. Implemented as Alpaca BRACKET order's `take_profit` leg.
4. **Kevin reverses:** new `sell` or `avoid` mention on the same ticker → cancel bracket legs + market SELL.
5. **Manual close:** `POST /api/meet-kevin/positions/{symbol}/close` (sets a `kevin:manual_close:{symbol}` Redis flag the bridge reads).
**Not implemented (deliberate YAGNI):** trailing stops, partial scaling out, end-of-week rebalance.
### Broker extension (required for BRACKET orders)
`AlpacaBroker._build_order_request` currently emits only `MarketOrderRequest`/`LimitOrderRequest`/`StopOrderRequest` (no BRACKET). `OrderRequest` schema in `shared/schemas/trading.py` lacks `take_profit_price` / `stop_loss_price` / `order_class` fields.
Required changes:
- Extend `OrderRequest` with optional `take_profit_price: Decimal | None`, `stop_loss_price: Decimal | None`, `order_class: Literal["simple","bracket"] = "simple"`.
- `_build_order_request` branches on `order_class`: when `bracket`, returns `MarketOrderRequest(order_class=OrderClass.BRACKET, take_profit=TakeProfitRequest(limit_price=...), stop_loss=StopLossRequest(stop_price=...))`.
- Cancellation: `broker.cancel_order` already exists; for Kevin-reverses we cancel the parent bracket which auto-cancels the legs.
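
A minimal sketch of the bracket-leg arithmetic and the proposed `OrderRequest` fields. `OrderRequestSketch` and `bracket_prices` are hypothetical, the real schema in `shared/schemas/trading.py` may be a Pydantic model with more fields, and rounding the legs to whole cents is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Literal, Optional, Tuple

CENT = Decimal("0.01")


def bracket_prices(entry: Decimal,
                   stop_loss_pct: Decimal = Decimal("0.08"),
                   take_profit_pct: Decimal = Decimal("0.20")) -> Tuple[Decimal, Decimal]:
    """Return (stop_loss_price, take_profit_price) for a long entry, rounded to cents."""
    stop = (entry * (1 - stop_loss_pct)).quantize(CENT, rounding=ROUND_HALF_UP)
    take = (entry * (1 + take_profit_pct)).quantize(CENT, rounding=ROUND_HALF_UP)
    return stop, take


@dataclass(frozen=True)
class OrderRequestSketch:
    """Illustrative subset of OrderRequest with the three proposed optional fields."""
    symbol: str
    qty: int
    order_class: Literal["simple", "bracket"] = "simple"
    take_profit_price: Optional[Decimal] = None
    stop_loss_price: Optional[Decimal] = None

    def __post_init__(self) -> None:
        # Assumed rule: a bracket order needs both legs to be priced.
        if self.order_class == "bracket" and (
            self.take_profit_price is None or self.stop_loss_price is None
        ):
            raise ValueError("bracket orders need take_profit_price and stop_loss_price")
```

Defaulting `order_class` to `"simple"` keeps every existing caller valid without changes.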
---
## Multi-mention aggregation
Kevin mentions NVDA `buy` three times in 48 hours:
- **Latest mention wins** the action + time_horizon (most recent thesis is canonical).
- **Conviction boost:** `effective_conviction = min(1.0, latest.conviction + min(0.20, 0.05 * extra_mentions_in_48h))`. The boost is capped at +0.20, i.e. it saturates at 4 extra mentions.
- **One position per ticker per active window.** No 3x position size.
- If the latest is `sell`/`avoid`, earlier `buy`s are dead — exit immediately.
Aggregation is done in `kevin_signal_bridge` (windowed SQL over `kevin_stock_mentions`); `KevinStrategy` itself stays stateless and per-mention.
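
The conviction boost can be expressed as a small pure function. This is a sketch: the function name is hypothetical, and rounding to three decimals is an assumption chosen to match the `numeric(4,3)` column used for `effective_conviction`.

```python
def effective_conviction(latest_conviction: float, extra_mentions: int,
                         boost_per_repeat: float = 0.05,
                         max_boost: float = 0.20) -> float:
    """Latest conviction plus a per-repeat boost, with the boost and the total both capped."""
    boost = min(max_boost, boost_per_repeat * max(0, extra_mentions))
    # Round to 3 decimals so float noise (e.g. 0.7999999...) does not leak out.
    return round(min(1.0, latest_conviction + boost), 3)
```

With three `buy` mentions in the window, the latest at conviction 0.70, there are two extra mentions and the effective conviction is 0.80.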
---
## Risk controls
Extends `services/trade_executor/risk_manager.py::RiskManager.check_risk`. **All hard caps, all blocking, all configurable:**
| Control | Default | Notes |
|---|---|---|
| Max position size per ticker | 7.5% of equity | Per-ticker cap absorbs multi-mention boost |
| Max total long exposure | 60% of equity | Existing `max_total_exposure_pct` |
| Max concurrent positions | 15 | Existing `max_positions` (lowered from 20) |
| Max NEW trades / day | 5 | New Redis counter `kevin:daily_trades:{YYYYMMDD}` (atomic `INCR`) |
| Max NEW $ allocated / day | $15,000 | New Redis sum counter `kevin:daily_alloc:{YYYYMMDD}` (atomic `INCRBY`) |
| Daily P&L circuit breaker | -3% of equity | Sets `trading:paused=1` until next market open |
| Account drawdown halt | Equity < 80% of starting | Sets `trading:paused=PERMANENT` until manually cleared |
| Per-ticker cooldown after exit | 60 min | Bumped from 30 |
| Sector concentration | NOT enforced v1 | Alpaca asset metadata lacks sector — YAGNI |

**Daily counter mechanics:**
- Owner: `kevin_signal_bridge` writes the counters **after** `trades:executed` confirms a fill (consumes the stream).
- Reset: keys have `EXPIRE 172800` (48h) so they self-clean. `RiskManager.check_risk` reads today's key, `kevin:daily_trades:{YYYYMMDD}`, where the date is the current UTC date.
- Idempotency: counter increment is keyed off `Trade.id`, deduplicated by an `INSERT … ON CONFLICT DO NOTHING` audit row in `kevin_signal_bridge_state`.

The circuit breaker + drawdown halt are new periodic checks running inside `kevin_signal_bridge` (writing `trading:paused` which `RiskManager.check_risk` already honours).
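
A minimal sketch of the daily counter keys and the cap check, kept free of any Redis client so it runs standalone. The function names are hypothetical. The choice to block a trade that would push the day's allocation above the cap, while allowing one that lands exactly on it, is an assumption.

```python
from datetime import datetime, timezone
from typing import Tuple


def daily_keys(now: datetime) -> Tuple[str, str]:
    """Build today's counter keys from the UTC date of a timezone-aware datetime."""
    day = now.astimezone(timezone.utc).strftime("%Y%m%d")
    return f"kevin:daily_trades:{day}", f"kevin:daily_alloc:{day}"


def within_daily_caps(trades_today: int, alloc_today_usd: float, new_trade_usd: float,
                      trade_cap: int = 5, alloc_cap_usd: float = 15_000) -> bool:
    """True if one more trade of new_trade_usd stays inside both daily caps."""
    return trades_today < trade_cap and alloc_today_usd + new_trade_usd <= alloc_cap_usd
```

Because the keys use the UTC date, a fill at 23:30 US Eastern is counted against the following UTC day.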
---
## Backtest engine
### Methodology
| Knob | Default | Rationale |
|---|---|---|
| Mention window | `[channel_first_polled_at, run_end)` | Use everything we have |
| Entry fill | next trading session OPEN (T+1) | Avoids look-ahead bias |
| Exit fill | OPEN of `entry_session + holding_days_for_that_mention` | `holding_days` is per-mention from strategy spec |
| Price source | Alpaca daily bars (IEX feed) via `market_data` cache, lazy back-fetch | Same wrapper as existing `routes/backtest.py::_fetch_alpaca_bars` |
| Slippage | 5 bps symmetric | `slippage_pct = 0.0005` |
| Commission | $0 (Alpaca paper) | |
| Initial capital | $100,000 | Match Alpaca paper default |
| Benchmark | SPY buy-and-hold over `[first_mention_at, run_end]` window | |
| Walk-forward vs single-pass | Single-pass | With <30 days of data, walk-forward is statistically meaningless |
| Dedupe policy | `roll` (latest mention extends exit) | Mirrors live `KevinStrategy` aggregation |
### Backtest implementation: separate mention-driven engine
The existing `BacktestEngine` in `backtester/engine.py` is **bar-driven** (iterates daily bars, evaluates all strategies per tick). The Kevin backtest is **event-driven** (mention → T+1 open entry → time-based exit). Rather than fabricate per-bar evaluations or graft a mention loop onto a bar engine, write a parallel mini-engine:
```
backtester/kevin_backtest.py ~150 lines
- takes (KevinStrategy, list[KevinStockMention], price_loader, params)
- walks mentions chronologically
- calls strategy.evaluate_mention() per mention to get KevinDecision
- simulates fills at T+1 open (entry) and exit_date open (exit)
- tracks portfolio state, equity curve, trade log
- emits BacktestResult (reusing existing dataclass)
- reuses compute_metrics() for headline numbers
```
Keep `BacktestEngine` untouched — the 9 existing strategies still backtest there. `compute_metrics` is the shared utility both engines call.
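
The core of the mention-driven loop is the fill schedule: enter at the open of the first session after the mention, exit at the open `holding_days` sessions later. A minimal sketch, assuming a pre-built sorted list of trading sessions and hypothetical function names; the real engine would also track portfolio state and call `evaluate_mention`.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional, Tuple


def entry_and_exit_sessions(mention_day: date, holding_days: int,
                            sessions: List[date]) -> Optional[Tuple[date, date]]:
    """Return (entry_session, exit_session), or None if the calendar ends too early.

    `sessions` must be sorted ascending. Entry is the first session strictly
    after the mention day (T+1), which also covers weekend and holiday mentions.
    """
    entry_idx = bisect_right(sessions, mention_day)
    exit_idx = entry_idx + holding_days
    if exit_idx >= len(sessions):
        return None
    return sessions[entry_idx], sessions[exit_idx]


def trade_pnl_usd(entry_open: float, exit_open: float, qty: int,
                  slippage_pct: float = 0.0005) -> float:
    """P&L for a long round trip with symmetric slippage and zero commission."""
    buy_price = entry_open * (1 + slippage_pct)   # pay up on entry
    sell_price = exit_open * (1 - slippage_pct)   # give up on exit
    return (sell_price - buy_price) * qty
```

Returning `None` when the exit session lies beyond the available calendar leaves the caller to decide whether to keep the trade open at run end or skip it.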
### Metrics
`backtester/metrics.py::compute_metrics` extended **in place** (no fork) with these additional fields:
| Field | Formula |
|---|---|
| `total_return_pct` | `(final - initial) / initial * 100` |
| `annualized_return_pct` | `((final/initial)^(365.25/days) - 1) * 100` |
| `sharpe_ratio` | `mean(daily_ret) / stddev(daily_ret) * sqrt(252)`, rf=0 |
| `max_drawdown_pct` | `max((peak - trough) / peak)` over equity curve |
| `win_rate_pct` | `wins / closed_trades * 100` |
| `avg_winner_pct` | `mean(pnl_pct where pnl > 0)` |
| `avg_loser_pct` | `mean(pnl_pct where pnl <= 0)` |
| `best_trade` / `worst_trade` | `{symbol, pnl_pct}` |
| `alpha_vs_spy_pct` | `total_return_pct - spy_return_pct` |
| `beta_vs_spy` | `cov(daily_ret, spy_daily_ret) / var(spy_daily_ret)` |
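
A minimal standard-library sketch of four of the formulas above. The function names are hypothetical and this is not the existing `compute_metrics` implementation. It assumes sample (n-1) statistics for standard deviation, covariance, and variance, and reports drawdown as a percentage.

```python
from math import sqrt
from statistics import mean, stdev, variance
from typing import List


def daily_returns(curve: List[float]) -> List[float]:
    """Simple day-over-day returns from an equity curve."""
    return [today / prev - 1.0 for prev, today in zip(curve, curve[1:])]


def sharpe_ratio(returns: List[float]) -> float:
    """Annualised Sharpe with rf=0; defined as 0.0 for a flat return series."""
    sd = stdev(returns)
    return 0.0 if sd == 0 else mean(returns) / sd * sqrt(252)


def max_drawdown_pct(curve: List[float]) -> float:
    """Largest peak-to-trough decline, as a percentage of the running peak."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst * 100.0


def beta_vs_spy(strategy_ret: List[float], spy_ret: List[float]) -> float:
    """cov(strategy, spy) / var(spy), both with the n-1 denominator."""
    mean_s, mean_b = mean(strategy_ret), mean(spy_ret)
    cov = sum((s - mean_s) * (b - mean_b)
              for s, b in zip(strategy_ret, spy_ret)) / (len(spy_ret) - 1)
    return cov / variance(spy_ret)
```

A strategy whose daily returns are exactly twice SPY's has a beta of 2 under this definition, which makes a convenient unit test.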
### Execution surface
- **On-demand:** `POST /api/meet-kevin/backtest/run` → 202 `{run_uuid, status: running}`. Async via `asyncio.Task`; Redis stores `kevin:backtest:status:{uuid}` with 24h TTL for the dashboard's 2s poll.
- **Scheduled:** the `meet_kevin_watcher` service gains a post-analysis cron stage that calls the same internal `run_backtest_for_kevin_strategy()` helper. This produces the canonical "latest" run the UI defaults to.
- **Persistence:** final result lives in `kevin_backtest_runs` + `kevin_backtest_trades` (Postgres). Redis is only the in-flight surface.
### Schema additions (1 Alembic migration)
Three new tables (all `uuid` PKs for consistency with `Trade` / `Position` / `Signal`):
```sql
kevin_signal_bridge_state
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  mention_id            bigint UNIQUE NOT NULL REFERENCES kevin_stock_mentions(id),
  bridge_status         enum('emitted','skipped_non_tradable','skipped_blocklist',
                             'skipped_caps','deferred','broker_rejected','dry_run') NOT NULL,
  signal_id             uuid REFERENCES signals(id),  -- nullable; null on skip
  trade_id              uuid REFERENCES trades(id),   -- nullable; null until fill
  effective_conviction  numeric(4,3),                 -- after multi-mention boost
  decided_at            timestamptz NOT NULL DEFAULT now(),
  notes                 text,
  INDEX (mention_id),
  INDEX (bridge_status, decided_at DESC);

kevin_backtest_runs
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  run_uuid              uuid UNIQUE NOT NULL,
  started_at            timestamptz NOT NULL DEFAULT now(),
  finished_at           timestamptz,
  status                enum('running','completed','failed') NOT NULL,
  trigger_source        enum('manual','scheduled') NOT NULL DEFAULT 'manual',
  params_json           jsonb NOT NULL,
  metrics_json          jsonb,
  equity_curve_json     jsonb,
  benchmark_curve_json  jsonb,
  error_message         text,
  INDEX (started_at DESC),
  INDEX (status, started_at DESC);

kevin_backtest_trades
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  run_id                uuid NOT NULL REFERENCES kevin_backtest_runs(id) ON DELETE CASCADE,
  symbol                varchar(16) NOT NULL,
  source_mention_id     bigint REFERENCES kevin_stock_mentions(id),
  entry_at              timestamptz NOT NULL,
  entry_price           numeric(12,4) NOT NULL,
  exit_at               timestamptz,
  exit_price            numeric(12,4),
  qty                   numeric(14,4) NOT NULL,
  pnl_usd               numeric(14,4),
  pnl_pct               numeric(8,4),
  holding_days_actual   int,
  INDEX (run_id, symbol),
  INDEX (run_id, entry_at);
```
**Not reusing the existing `trades` table** — those are real fills; backtest trades are hypothetical history; mixing them breaks consumer assumptions.
---
## UI surface
Two new pages under the existing `/meet-kevin/*` group.
### Filtering paper-account trades to "kevin" — strategy_id FK
`Trade` has `strategy_id` (UUID FK to `strategies` table) NOT `strategy_name`. To filter paper-account UI:
1. Migration **seeds** a row in `strategies`: `('kevin', 'Meet Kevin Strategy', current_weight=1.0, active=true)` — known UUID generated and pinned as a constant in code.
2. `kevin_signal_bridge` sets `strategy_id` to this UUID on every emitted `TradeSignal`.
3. `trade_executor` already passes `signal.strategy_id` through to `Trade.strategy_id` during persistence.
4. The paper-account API filters: `WHERE trades.strategy_id = $kevin_strategy_uuid`.
### `/meet-kevin/strategy` — backtest + tracking (PRIMARY)
```
┌───────────────────────────────────────────────────────────────────────────────┐
│ Meet Kevin / Strategy [Run backtest ▾] [Refresh] │
├──────────────────┬──────────────────┬──────────────────┬──────────────────────┤
│ Strategy Equity │ Total P&L │ vs SPY (alpha) │ Win rate / # trades │
│ $103,420 +3.4% │ +$3,420 (+3.4%) │ +1.8% │ 64.3% · 14 trades │
└──────────────────┴──────────────────┴──────────────────┴──────────────────────┘
┌───────────────────────────────────────────────────────────────────────────────┐
│ Equity curve — Strategy (blue) · SPY benchmark (gray) │
│ [recharts/lightweight-charts dual-line] │
│ [1W] [2W] [1M] [ALL] │
└───────────────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────────────┐
│ Ticker scorecard [filter: ____] │
├──────┬────────┬──────┬─────────────┬─────────┬─────────┬─────────┬───────────┤
│ Sym │ Action │ Conv │ Entry date │ Live $ │ Unreal. │ Realized│ Status │
│ NVDA │ BUY │ 0.85 │ 2026-05-15 │ $182.40 │ +$420 │ — │ HOLDING │
│ INTC │ BUY │ 0.70 │ 2026-05-14 │ $24.10 │ — │ +$85 │ CLOSED │
│ AMD │ HOLD │ 0.40 │ — │ $145.20 │ — │ — │ WATCH ONLY│
│ ... │ │ │ │ │ │ │ │
└──────┴────────┴──────┴─────────────┴─────────┴─────────┴─────────┴───────────┘
▲ click row → drill-down panel below (per-mention list + per-trade P&L + price chart with mention markers)
┌───────────────────────────────────────────────────────────────────────────────┐
│ Backtest run history (last 5) │
│ [table: run_uuid_short, started_at, holding_days, return, sharpe, status] │
└───────────────────────────────────────────────────────────────────────────────┘
```
**Live prices** come from the `market_data` table (Alpaca-cached daily closes), never from the dashboard's direct Alpaca calls. The API gateway is the single source.
**Empty state** (mention count < 5 or span < 2 trading days): show "Insufficient mention history" but keep the ticker table visible.
### `/meet-kevin/paper-account` — real paper-trading P&L
A separate page, ~50 lines, near-clone of `Portfolio.tsx` with the data filtered to `Trade.strategy_id` = the seeded `kevin` strategy UUID (`Trade` has no `strategy_name` field). Reuses `EquityCurve`, `MetricsRow`, `PositionsTable` as-is.
Justification for two pages, not one: `/strategy` is "what if" (backtest history + signal universe); `/paper-account` is "what actually happened" (real fills, real positions). Overloading muddles the lenses.
### Sidebar nav
Adds two entries under the existing "Meet Kevin" group:
- Strategy (after Stocks)
- Paper Account
### Dual-line chart component (NEW, not contingent)
`EquityCurve.tsx` renders exactly one `LineSeries`. The strategy-vs-SPY chart needs two series with a legend. **Definitely create a new component** `dashboard/src/components/meetKevin/StrategyVsBenchmarkCurve.tsx` (~80 lines, lightweight-charts) — don't extend `EquityCurve` in place (it's used by other pages).
---
## API surface
All under existing `/api/meet-kevin/*` prefix. New router file `routes/meet_kevin_backtest.py` (keeps `meet_kevin.py` from ballooning).
```
POST /api/meet-kevin/backtest/run → 202 {run_uuid, status}
GET /api/meet-kevin/backtest/runs?limit=20 → run list (headline metrics)
GET /api/meet-kevin/backtest/runs/{run_uuid} → full result
GET /api/meet-kevin/backtest/latest → latest scheduled (or most recent)
GET /api/meet-kevin/strategy/tickers → scorecard table data
GET /api/meet-kevin/strategy/equity-curve?from=&to=&include_benchmark=spy
GET /api/meet-kevin/strategy/performance → headline metrics
POST /api/meet-kevin/positions/{symbol}/close → manual close (sets Redis flag)
```
Frontend `dashboard/src/api/meetKevin.ts` gains 7 methods using the same axios `client`.
---
## Code locations
```
NEW
shared/strategies/kevin.py ~120 lines (KevinStrategy)
services/kevin_signal_bridge/__init__.py
services/kevin_signal_bridge/config.py
services/kevin_signal_bridge/main.py ~250 lines (async loop + exit-scan cron)
services/kevin_signal_bridge/aggregator.py multi-mention window logic
services/api_gateway/routes/meet_kevin_backtest.py backtest routes
backtester/kevin_backtest.py ~150 lines (mention-driven mini-engine)
shared/market_data/alpaca_fetch.py extracted from existing routes/backtest.py
shared/models/trading.py +KevinTrade audit-link table
shared/models/meet_kevin.py +KevinBacktestRun, KevinBacktestTrade models
alembic/versions/<ts>_kevin_backtest_tables.py migration
dashboard/src/pages/meetKevin/Strategy.tsx
dashboard/src/pages/meetKevin/PaperAccount.tsx ~50 lines
dashboard/src/components/meetKevin/TickerScorecardTable.tsx
dashboard/src/components/meetKevin/BacktestRunHistory.tsx
dashboard/src/components/meetKevin/StrategyVsBenchmarkCurve.tsx ~80 lines (dual-line chart)
MODIFIED
services/trade_executor/risk_manager.py +daily_trade_count, +daily_alloc_sum, +drawdown halt
services/trade_executor/config.py +new env knobs
shared/strategies/__init__.py export KevinStrategy
backtester/metrics.py extend compute_metrics with alpha, beta, winners, losers
dashboard/src/App.tsx +2 routes
dashboard/src/components/Layout.tsx +2 sidebar entries
dashboard/src/api/meetKevin.ts +7 methods
INFRA (separately owned commit)
infra/stacks/trading-bot/main.tf +kevin_signal_bridge + trade_executor containers
```
---
## Config (env vars, all `TRADING_` prefix)
```
# Signal translation
TRADING_KEVIN_MIN_CONVICTION = 0.60
TRADING_KEVIN_MAX_MENTION_AGE_HOURS = 48
TRADING_KEVIN_HOLD_DAYS = '{"days":3,"weeks":10,"months":45,"long_term":90,"unspecified":10}'
# Sizing
TRADING_KEVIN_BASE_POSITION_PCT = 0.04
TRADING_KEVIN_MIN_TRADE_USD = 500
TRADING_KEVIN_MAX_TRADE_USD = 5000
TRADING_KEVIN_MAX_PER_TICKER_USD = 7500
# Exits
TRADING_KEVIN_STOP_LOSS_PCT = 0.08
TRADING_KEVIN_TAKE_PROFIT_PCT = 0.20
TRADING_KEVIN_AVOID_CLOSES_LONGS = "true"
TRADING_KEVIN_AVOID_BLOCKS_DAYS = 7
# Aggregation
TRADING_KEVIN_MENTION_BOOST_PER_REPEAT = 0.05
TRADING_KEVIN_MAX_MENTION_BOOST = 0.20
# Risk
TRADING_KEVIN_MAX_POSITION_PCT = 0.075
TRADING_KEVIN_DAILY_TRADE_CAP = 5
TRADING_KEVIN_DAILY_ALLOC_CAP_USD = 15000
TRADING_KEVIN_DAILY_LOSS_CIRCUIT_PCT = 0.03
TRADING_KEVIN_EQUITY_DRAWDOWN_HALT_PCT = 0.80
# Backtest
TRADING_KEVIN_BACKTEST_SLIPPAGE_PCT = 0.0005
TRADING_KEVIN_BACKTEST_INITIAL_CAPITAL = 100000
TRADING_KEVIN_BACKTEST_DAILY_CRON = "30 16 * * 1-5" # 4:30pm ET post-close
# Bridge plumbing
TRADING_KEVIN_BRIDGE_POLL_INTERVAL_SECONDS = 60
TRADING_KEVIN_BRIDGE_EXIT_SCAN_CRON = "35 9 * * 1-5" # 9:35 ET market open
TRADING_KEVIN_ENABLE_TRADING = "false" # MASTER KILL-SWITCH — default OFF
```
**Master kill-switch: `TRADING_KEVIN_ENABLE_TRADING=false` by default.** Bridge writes audit rows but never publishes to `signals:generated`. Flip to `true` only after the backtest looks sane.
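
A minimal sketch of how a service might read a few of these variables, with the kill-switch failing closed. `load_kevin_settings` and the set of accepted truthy strings are assumptions, and the real `config.py` may use a settings library instead.

```python
import json
from typing import Any, Dict, Mapping

# Assumption: these strings (case-insensitive) count as "enabled"; anything else is off.
_TRUTHY = {"1", "true", "yes", "on"}
_DEFAULT_HOLD_DAYS = '{"days":3,"weeks":10,"months":45,"long_term":90,"unspecified":10}'


def load_kevin_settings(env: Mapping[str, str]) -> Dict[str, Any]:
    """Parse a subset of the TRADING_KEVIN_* variables from an env-like mapping."""
    raw_enable = env.get("TRADING_KEVIN_ENABLE_TRADING", "false")
    raw_hold_days = env.get("TRADING_KEVIN_HOLD_DAYS", _DEFAULT_HOLD_DAYS)
    return {
        # Missing or unrecognised values leave trading disabled.
        "enable_trading": raw_enable.strip().lower() in _TRUTHY,
        "min_conviction": float(env.get("TRADING_KEVIN_MIN_CONVICTION", "0.60")),
        # The hold-days knob is a JSON object mapping time_horizon to trading days.
        "hold_days": {k: int(v) for k, v in json.loads(raw_hold_days).items()},
    }
```

In a service this would be called as `load_kevin_settings(os.environ)`. Taking a mapping rather than reading `os.environ` directly keeps the parser easy to test.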
---
## Edge cases
| Case | Handling |
|---|---|
| Symbol not tradable on Alpaca | `kevin_signal_bridge_state.bridge_status='skipped_non_tradable'` |
| Crypto / OTC / pink sheet | Filtered by `asset.class_ == 'us_equity'` |
| Conviction = 0 | Filtered by 0.60 floor |
| `time_horizon='unspecified'` | Default to 10 trading days |
| Mention created after market close | Re-queued with `bridge_status='deferred'`; next morning's poll picks up if still <48h |
| Alpaca 5xx | `submit_order` returns `OrderStatus.REJECTED`; bridge logs to `kevin_signal_bridge_state` |
| Partial fill | BRACKET order's `filled_avg_price` is used; stop/TP legs attach to filled qty |
| Same mention processed twice | Redis cursor + `INSERT…ON CONFLICT DO NOTHING` audit |
| Kevin reverses while stop-loss order is open | Cancel bracket legs via `broker.cancel_order`, then market SELL |
| Backtest: ticker has no `market_data` rows | Lazy back-fetch from Alpaca for `[entry-1d, end+1d]`; if Alpaca returns nothing, mark trade `skipped` and exclude from denominators |
| Backtest: <5 mentions OR <2 trading-day span | API returns 400 `insufficient_history` |
| Backtest: weekend/holiday mention | Entry shifts to next trading session open |
| Backtest: SPY price gap | Forward-fill from previous trading day |
| Backtest >60s | Status remains `running`; dashboard keeps polling. Watchdog kills at 5min, marks `failed` |
---
## Testing strategy
Mirrors v1's bar (404 tests, 1 behavior per test, no internal mocking of own code):
```
tests/strategies/test_kevin_strategy.py ~25 tests
- per-action signal generation (buy/sell/hold/watch/avoid × held/not-held)
- conviction threshold filtering
- mention age filtering
- time_horizon → holding_days mapping
- non-tradable symbol skip
- blocklist hit short-circuits LONG
tests/services/kevin_signal_bridge/test_main.py ~15 tests
- cursor advances, dedupe via ON CONFLICT
- multi-mention aggregation (latest wins, conviction boost capped)
- daily cap counter increments + resets at UTC midnight
- master kill-switch suppresses publish
- exit-scan emits SELL for elapsed holds
tests/services/kevin_signal_bridge/test_aggregator.py ~10 tests
tests/backtester/test_kevin_adapter.py ~15 tests
- happy path: 3 mentions → 3 backtest trades with correct entry/exit
- dedupe roll: 2nd buy on held ticker extends exit
- SPY benchmark math (alpha/beta)
- extended metrics correctness (winners, losers, best, worst)
tests/api_gateway/routes/test_meet_kevin_backtest.py ~8 tests
- POST /backtest/run returns 202 with run_uuid
- GET /backtest/runs/{uuid} returns full result
- GET /strategy/tickers shapes correctly
- insufficient_history 400
tests/integration/test_kevin_e2e_paper.py [@integration] ~3 tests
- full pipeline against docker-compose Postgres + Redis, mocked Alpaca
Manual QA
- dashboard /meet-kevin/strategy renders backtest + ticker table
- dashboard /meet-kevin/paper-account renders Kevin-only positions
```
Target: ~76 new unit + integration tests.
---
## Phasing (3 phases, staged)
Adjusted after challenger review — Phase A was leaking live-state into the ticker scorecard (which expects HOLDING positions). Merged A+B into Phase 1.
1. **Phase 1 — Strategy + backtest + bridge (audit-only).** Land `KevinStrategy` (standalone), `backtester/kevin_backtest.py`, all 3 new tables, backtest API routes, `kevin_signal_bridge` service. Bridge runs with `TRADING_KEVIN_ENABLE_TRADING=false` — writes `kevin_signal_bridge_state` audit rows showing what it WOULD trade, but never publishes to `signals:generated`. `/meet-kevin/strategy` page shows backtest history + a ticker scorecard with "WOULD-TRADE" status badges (no live positions yet). User reviews the bridge's decisions against the backtest before flipping the switch.
2. **Phase 2 — Trade executor re-enabled, trading switched on.** Add `trade_executor` container back to the K8s Pod. Extend `OrderRequest`/`AlpacaBroker` for BRACKET orders. Seed the `kevin` row in `strategies`. Flip `TRADING_KEVIN_ENABLE_TRADING=true`. Live paper trades begin. Ticker scorecard now shows real HOLDING/CLOSED rows.
3. **Phase 3 — Paper-account UI.** Land `/meet-kevin/paper-account` page (Kevin-strategy-only slice of `Portfolio.tsx` filtered by `strategy_id`).
Each phase is independently deployable. User reviews after each.
---
## Out of scope (intentional)
- Real money (live brokerage account)
- Short selling
- Options
- Crypto
- Multi-account allocation
- Per-mention attribution analytics ("high-conviction-only" filter view)
- CSV export of trades
- Strategy parameter sweeps in the UI (grid search via `scripts/` for now)
- "What-if" UI for tweaking conviction thresholds — the live strategy owns that
- Trailing stops
- End-of-week rebalances
- Sector concentration limits
---
## Success criteria
- Backtest runs against the current 63+ mentions and produces a coherent equity curve + metrics card
- The strategy page renders the ticker scorecard with live prices, action chips, and conviction bars
- After flipping the kill-switch, at least one paper trade executes via Alpaca within 60s of a new Kevin mention that passes the filters
- `/meet-kevin/paper-account` shows the Kevin-attributed equity curve diverging from the unified `/portfolio` view
- All hard-cap risk controls fire correctly in tests (verified by integration test that synthesises a 5-mention burst)
---
## Open questions for the user
1. **Phasing:** ship Phase 1 → 2 → 3 with review checkpoints, or hand all three to one big subagent push?
2. **Initial paper capital:** $100k Alpaca default — or override to a smaller number ($10k?) for less abstract dollar signs?
3. **`avoid` policy split:** the design splits into `AVOID_CLOSES_LONGS` (default `true`) + `AVOID_BLOCKS_DAYS` (default `7`). Subsequent `buy` clears the blocklist. Acceptable?
4. **`/meet-kevin/paper-account` vs reusing `/portfolio`:** two pages (Kevin slice) or one with a Kevin filter?
Defaults if no override: stage all 3 phases, $100k, split-knob avoid, two pages.