trading/docs/plans/2026-05-23-meet-kevin-paper-trading-design.md
Viktor Barzin 280f807236 add Meet Kevin v2 paper-trading + backtest + UI design
Synthesizes work of two parallel architect agents (strategy +
paper-trading rules / backtest + UI surface) and the subsequent
challenger review. Resolves 11 issues the challenger raised:

- KevinStrategy is standalone, not BaseStrategy subclass (signature
  mismatch — BaseStrategy.evaluate is bar-driven, Kevin is event-driven)
- backtester/kevin_backtest.py as parallel mention-driven mini-engine,
  not a fake adapter onto BacktestEngine
- AlpacaBroker BRACKET support specified (OrderRequest schema + broker
  _build_order_request extensions)
- Filtering paper-account trades via strategy_id FK (the actual field;
  Trade.strategy_name doesn't exist) — migration seeds a 'kevin' row
- Cursor advance race fixed (XADD success → cursor advance)
- Daily counter mechanics specified (Redis INCR + audit dedupe)
- kevin_signal_bridge_state table added to data model (3 new tables now)
- All PKs UUID for consistency with Trade/Position
- StrategyVsBenchmarkCurve.tsx promoted from contingent to definitely-new
- 'avoid' policy split into AVOID_CLOSES_LONGS + AVOID_BLOCKS_DAYS knobs
- Phasing collapsed A+B into Phase 1 (ticker scorecard needs bridge
  audit rows to render WOULD-TRADE badges)
2026-05-23 10:04:04 +00:00

# Meet Kevin v2 — Paper Trading + Backtest + Strategy UI — Design
> Synthesises the work of two parallel architect agents:
> - `kevin-strategy-design.md` (strategy + paper-trading rules) — Architect A
> - `kevin-backtest-ui-design.md` (backtest + UI surface) — Architect B
## Goal
Translate Meet Kevin's per-mention signals into automated Alpaca paper trades, surface the strategy's performance against a benchmark in a dedicated UI, and prove it works through a backtest before any real money would be considered.
## Premise
A new `kevin_stock_mentions` row is the primary stimulus. Everything else is plumbing. **Long-only on Alpaca paper for v1.**
## Scope summary
| Area | Owner | New / Existing |
|---|---|---|
| Strategy logic (signal → trade) | `shared/strategies/kevin.py` (NEW) | New standalone class (not a `BaseStrategy` subclass) |
| Mention → signal bridging | `services/kevin_signal_bridge/` (NEW) | New service container |
| Trade execution | `services/trade_executor/` | EXISTING — re-enable, no behavior change |
| Risk controls | `services/trade_executor/risk_manager.py` | EXISTING — extend in place |
| Backtest engine | `backtester/kevin_backtest.py` (NEW) | New mention-driven mini-engine; existing `BacktestEngine` untouched |
| Metrics | `backtester/metrics.py` | EXISTING — extend `compute_metrics` |
| Backtest API + persistence | `services/api_gateway/routes/meet_kevin_backtest.py` (NEW) + 2 tables | New |
| Strategy UI page | `dashboard/src/pages/meetKevin/Strategy.tsx` (NEW) | New |
| Paper account UI page | `dashboard/src/pages/meetKevin/PaperAccount.tsx` (NEW) | ~50-line clone of `Portfolio.tsx` |
Total new files: ~12. Total modified files: ~6. Estimated implementer time: 68 hours across a subagent team.
---
## Architecture
```
              Postgres
      ┌──────────────────────┐
      │ kevin_stock_mentions │  ← (v1 watcher writes; unchanged)
      └──────────┬───────────┘
                 │ poll every 60s
                 ▼
┌──────────────────────────────────────────────────────────┐
│  services/kevin_signal_bridge/                      NEW  │
│   - cursor in Redis (kevin:bridge:last_mention_id)       │
│   - applies signal filter (action, conviction, age)      │
│   - aggregates multi-mention boost                       │
│   - checks Redis blocklist + per-day caps                │
│   - calls KevinStrategy.evaluate_mention()               │
│   - daily exit-scan job @ 09:35 ET (time-based exits)    │
│   - watches trades:executed for fill confirmations       │
│   - master kill-switch: TRADING_KEVIN_ENABLE_TRADING=off │
└────────────────┬─────────────────────────────────────────┘
                 │ publishes TradeSignal with
                 │ strategy_sources=["kevin:<action>:<conviction>"]
                 ▼
  Redis Stream signals:generated
                 │
                 ▼
┌──────────────────────────────────────────────────────────┐
│  services/trade_executor/                      EXISTING  │
│   - RiskManager.check_risk (extended with new caps)      │
│   - AlpacaBroker.submit_order (BRACKET: stop+take-profit)│
│   - Trade persistence with strategy_id=<kevin UUID>      │
│   - publishes trades:executed                            │
└────────────────┬─────────────────────────────────────────┘
                 ▼
  Alpaca Paper (long-only, BRACKET orders)
                 │
                 ▼
  Redis Stream trades:executed
                 │
                 ▼
  portfolio_sync (existing): equity curve, positions

Parallel observability path
┌──────────────────────────────────────────────────────────┐
│  services/api_gateway/routes/meet_kevin_backtest.py NEW  │
│   - POST /backtest/run → async job, polls Redis status   │
│   - calls backtester/kevin_backtest.py, which drives     │
│     KevinStrategy.evaluate_mention() directly            │
│   - persists final run to kevin_backtest_runs            │
│     + per-trade detail in kevin_backtest_trades          │
│   - meet_kevin_watcher gains a daily post-poll cron      │
│     stage that triggers the scheduled run                │
└──────────────────────────────────────────────────────────┘
```
### Key architectural rule
**One source of truth for strategy behavior: `shared/strategies/kevin.py::KevinStrategy`.**
`KevinStrategy` is a **stand-alone class**, NOT a `BaseStrategy` subclass — the existing `BaseStrategy.evaluate(ticker, market: MarketSnapshot, sentiment)` signature is bar/article-driven and would force the Kevin code to fabricate fake `MarketSnapshot` objects. Instead `KevinStrategy` exposes:
```python
async def evaluate_mention(
    self,
    mention: KevinStockMention,
    account_state: KevinAccountState,
) -> KevinDecision | None: ...  # signature only; body elided
```
Both the live `kevin_signal_bridge` AND the backtest engine call this method directly. The unification guarantee comes from "one class, one method, called by both" — not from an inheritance decoration. This applies to:
- Action filtering (which actions trigger trades)
- Conviction floor
- Position sizing (conviction-weighted)
- Holding period (time_horizon-derived)
- Multi-mention aggregation (latest-wins + conviction boost)
- Long-only treatment (sell/avoid semantics)
The backtest *parameterises* slippage + commission + initial capital (per-run knobs) but **never overrides** strategy logic.
---
## Signal translation rules
A `kevin_stock_mentions` row produces (or cancels) exactly one `TradeSignal` when **all** of:
| Condition | Threshold | Rationale |
|---|---|---|
| `action` ∈ {buy, sell, avoid} | Mandatory | hold/watch never trade — UI-only |
| `conviction` | `>= 0.60` | Kevin makes many low-conviction asides |
| `time_horizon` | not `intraday` | 3h poll cadence can't catch intraday |
| Age of mention | `<= 48h` since `created_at` | Past 48h, thesis is stale; backtest handles older |
| Symbol tradable on Alpaca | `broker.get_asset(symbol).tradable == True` | Skip OTC / foreign / crypto |
| `asset.class_ == 'us_equity'` | Hardcoded | v1: no crypto, no options |
**`time_horizon` → target hold period** (used for time-based exit):
| `time_horizon` | Trading days held |
|---|---|
| `days` | 3 |
| `weeks` | 10 |
| `months` | 45 |
| `long_term` | 90 (re-enter via fresh mention if Kevin re-affirms) |
| `unspecified` | 10 (default to weeks — Kevin's modal horizon) |
| `intraday` | filtered out |
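
The filter and horizon tables above could be expressed as a pair of pure functions. This is a minimal sketch, not the actual `KevinStrategy` code: the `Mention` dataclass is a hypothetical stand-in for `KevinStockMention`, the constant names are illustrative, and the Alpaca tradability check is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Thresholds copied from the tables above; the names are illustrative.
TRADABLE_ACTIONS = {"buy", "sell", "avoid"}
MIN_CONVICTION = 0.60
MAX_AGE = timedelta(hours=48)
HOLD_DAYS = {"days": 3, "weeks": 10, "months": 45, "long_term": 90, "unspecified": 10}


@dataclass(frozen=True)
class Mention:  # hypothetical stand-in for KevinStockMention
    symbol: str
    action: str
    conviction: float
    time_horizon: str
    created_at: datetime


def holding_days(time_horizon: str) -> Optional[int]:
    """Target hold in trading days; None means filtered out (e.g. intraday)."""
    return HOLD_DAYS.get(time_horizon)


def passes_filters(mention: Mention, now: datetime) -> bool:
    """Apply the action, conviction, horizon, and age rules (not tradability)."""
    return (
        mention.action in TRADABLE_ACTIONS
        and mention.conviction >= MIN_CONVICTION
        and holding_days(mention.time_horizon) is not None
        and now - mention.created_at <= MAX_AGE
    )
```

Note that this sketch also filters any horizon value missing from `HOLD_DAYS`, which is stricter than the table (it only names `intraday`).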
**Long-only direction mapping:**
| Action | Currently held? | Behavior |
|---|---|---|
| `buy` | no | Open at sized target |
| `buy` | yes, below per-ticker cap | Top up to cap |
| `buy` | yes, at cap | No-op |
| `sell` | yes | Emit market SELL for full position |
| `sell` | no | No-op (never short in v1) |
| `avoid` | yes | Emit market SELL (controlled by `AVOID_CLOSES_LONGS=true`) + add Redis blocklist `kevin:blocked:{symbol}` with TTL `AVOID_BLOCKS_DAYS=7` |
| `avoid` | no | Add to blocklist; no trade |
| `hold` | — | Never trade |
| `watch` | — | Never trade |

**Note on `avoid` policy:** the close-action and the blocklist are split into two env knobs (`TRADING_KEVIN_AVOID_CLOSES_LONGS`, `TRADING_KEVIN_AVOID_BLOCKS_DAYS`). A subsequent `buy` mention on the same ticker **clears** the blocklist early: "latest mention wins" applies symmetrically.
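
The direction mapping can be read as a small decision function. The sketch below is illustrative only; the function name and the returned labels are assumptions, not identifiers from the codebase.

```python
def long_only_behavior(
    action: str,
    held: bool,
    at_cap: bool = False,
    avoid_closes_longs: bool = True,
) -> str:
    """Map (action, position state) to the behavior in the table above."""
    if action == "buy":
        if not held:
            return "open"
        return "no_op" if at_cap else "top_up"
    if action == "sell":
        # Never short in v1: a sell without a position does nothing.
        return "close" if held else "no_op"
    if action == "avoid":
        # The blocklist entry is added either way; the close is optional.
        if held and avoid_closes_longs:
            return "close_and_block"
        return "block"
    return "no_op"  # hold / watch never trade
```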
---
## Position sizing
Conviction-weighted fixed-fractional:
```
base_pct = 0.04 # 4% of equity for max-conviction
conviction_mult = (conviction - 0.6) / 0.4 # 0.6 → 0.0; 1.0 → 1.0
target_pct = base_pct * (0.5 + 0.5 * conviction_mult)
target_dollars = clamp(account.equity * target_pct, MIN_USD, MAX_USD)
qty = floor(target_dollars / current_price)
```
| Conviction | `target_pct` | $ for $100k equity |
|---|---|---|
| 0.60 | 2.0% | $2,000 |
| 0.80 | 3.0% | $3,000 |
| 1.00 | 4.0% | $4,000 |
**Hard caps:**
- Per-trade floor: $500
- Per-trade ceiling: $5,000 (5% of $100k starting equity)
- Per-ticker total cost basis ceiling: $7,500 (absorbs multi-mention boost)
Rejected Kelly sizing (no historical edge data yet). Revisit after backtest reports per-conviction-bucket hit rates.
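
As a check on the sizing table, the formula can be written out with `Decimal` so the percentages come out exact. This is a sketch under the defaults stated above; the function names are hypothetical.

```python
from decimal import Decimal, ROUND_FLOOR

BASE_PCT = Decimal("0.04")   # 4% of equity at max conviction
MIN_USD = Decimal("500")     # per-trade floor
MAX_USD = Decimal("5000")    # per-trade ceiling


def target_dollars(equity: Decimal, conviction: Decimal) -> Decimal:
    """Conviction-weighted dollar target, clamped to the per-trade floor/ceiling."""
    conviction_mult = (conviction - Decimal("0.6")) / Decimal("0.4")
    target_pct = BASE_PCT * (Decimal("0.5") + Decimal("0.5") * conviction_mult)
    return max(MIN_USD, min(MAX_USD, equity * target_pct))


def share_qty(equity: Decimal, conviction: Decimal, price: Decimal) -> int:
    """Whole shares only; can be 0 when one share costs more than the target."""
    shares = target_dollars(equity, conviction) / price
    return int(shares.to_integral_value(rounding=ROUND_FLOOR))
```

With $100k equity this reproduces the $2,000 / $3,000 / $4,000 rows. On a much smaller account the $500 floor dominates, so the effective percentage rises above 4%.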
---
## Exit logic
Closes on **first** of:
1. **Time-based:** `holding_days` for that mention's `time_horizon` (table above). Daily cron in `kevin_signal_bridge` at 09:35 ET scans open Kevin trades, emits SELL.
2. **Stop-loss:** -8% from entry. Implemented as Alpaca BRACKET order (`order_class=BRACKET, stop_loss=...`). **Requires extending the broker abstraction** — see §"Broker extension" below.
3. **Take-profit:** +20% from entry. Implemented as Alpaca BRACKET order's `take_profit` leg.
4. **Kevin reverses:** new `sell` or `avoid` mention on the same ticker → cancel bracket legs + market SELL.
5. **Manual close:** `POST /api/meet-kevin/positions/{symbol}/close` (sets a `kevin:manual_close:{symbol}` Redis flag the bridge reads).
**Not implemented (deliberate YAGNI):** trailing stops, partial scaling out, end-of-week rebalance.
### Broker extension (required for BRACKET orders)
`AlpacaBroker._build_order_request` currently emits only `MarketOrderRequest`/`LimitOrderRequest`/`StopOrderRequest` (no BRACKET). `OrderRequest` schema in `shared/schemas/trading.py` lacks `take_profit_price` / `stop_loss_price` / `order_class` fields.
Required changes:
- Extend `OrderRequest` with optional `take_profit_price: Decimal | None`, `stop_loss_price: Decimal | None`, `order_class: Literal["simple","bracket"] = "simple"`.
- `_build_order_request` branches on `order_class`: when `bracket`, returns `MarketOrderRequest(order_class=OrderClass.BRACKET, take_profit=TakeProfitRequest(limit_price=...), stop_loss=StopLossRequest(stop_price=...))`.
- Cancellation: `broker.cancel_order` already exists; for Kevin-reverses we cancel the parent bracket which auto-cancels the legs.
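
The bracket legs follow directly from the exit percentages. The sketch below uses a hypothetical `OrderRequestSketch` dataclass that mirrors the proposed `OrderRequest` fields (the real schema may differ), and it assumes a reference price is available at submission time; for a market order the actual fill price is only known afterwards.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Literal, Optional

CENT = Decimal("0.01")


@dataclass(frozen=True)
class OrderRequestSketch:  # hypothetical; mirrors the proposed OrderRequest fields
    symbol: str
    qty: int
    order_class: Literal["simple", "bracket"] = "simple"
    take_profit_price: Optional[Decimal] = None
    stop_loss_price: Optional[Decimal] = None


def bracket_buy(
    symbol: str,
    qty: int,
    reference_price: Decimal,
    stop_loss_pct: Decimal = Decimal("0.08"),
    take_profit_pct: Decimal = Decimal("0.20"),
) -> OrderRequestSketch:
    """Build a long bracket request: stop at -8%, take-profit at +20% by default."""
    stop = (reference_price * (1 - stop_loss_pct)).quantize(CENT, rounding=ROUND_HALF_UP)
    take = (reference_price * (1 + take_profit_pct)).quantize(CENT, rounding=ROUND_HALF_UP)
    return OrderRequestSketch(
        symbol=symbol,
        qty=qty,
        order_class="bracket",
        take_profit_price=take,
        stop_loss_price=stop,
    )
```

Keeping `order_class` defaulted to `"simple"` means existing callers of the schema would not need to change.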
---
## Multi-mention aggregation
Kevin mentions NVDA `buy` three times in 48 hours:
- **Latest mention wins** the action + time_horizon (most recent thesis is canonical).
- **Conviction boost:** `effective_conviction = min(1.0, latest.conviction + min(0.20, 0.05 * extra_mentions_in_48h))`. The boost is capped at +0.20 (reached at 4 extra mentions), and the result never exceeds 1.0.
- **One position per ticker per active window.** No 3x position size.
- If the latest is `sell`/`avoid`, earlier `buy`s are dead — exit immediately.
Aggregation is done in `kevin_signal_bridge` (windowed SQL over `kevin_stock_mentions`); `KevinStrategy` itself stays stateless and per-mention.
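
The boost rule is small enough to state as one function. A sketch, assuming the bridge has already counted the extra mentions inside the 48h window (how "extra" is counted, e.g. same-action only, is not specified here and is left to the caller):

```python
def effective_conviction(
    latest_conviction: float,
    extra_mentions_in_48h: int,
    boost_per_repeat: float = 0.05,
    max_boost: float = 0.20,
) -> float:
    """Latest mention's conviction plus a capped repeat boost, never above 1.0."""
    boost = min(max_boost, boost_per_repeat * max(0, extra_mentions_in_48h))
    return min(1.0, latest_conviction + boost)
```

For the NVDA example (three `buy` mentions, so two extra), a latest conviction of 0.70 becomes roughly 0.80.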
---
## Risk controls
Extends `services/trade_executor/risk_manager.py::RiskManager.check_risk`. **All hard caps, all blocking, all configurable:**
| Control | Default | Notes |
|---|---|---|
| Max position size per ticker | 7.5% of equity | Per-ticker cap absorbs multi-mention boost |
| Max total long exposure | 60% of equity | Existing `max_total_exposure_pct` |
| Max concurrent positions | 15 | Existing `max_positions` (lowered from 20) |
| Max NEW trades / day | 5 | New Redis counter `kevin:daily_trades:{YYYYMMDD}` (atomic `INCR`) |
| Max NEW $ allocated / day | $15,000 | New Redis sum counter `kevin:daily_alloc:{YYYYMMDD}` (atomic `INCRBY`) |
| Daily P&L circuit breaker | -3% of equity | Sets `trading:paused=1` until next market open |
| Account drawdown halt | Equity < 80% of starting | Sets `trading:paused=PERMANENT` until manually cleared |
| Per-ticker cooldown after exit | 60 min | Raised from 30 min |
| Sector concentration | NOT enforced in v1 | Alpaca asset metadata lacks sector data; YAGNI |

**Daily counter mechanics:**
- Owner: `kevin_signal_bridge` writes the counters **after** `trades:executed` confirms a fill (consumes the stream).
- Reset: keys have `EXPIRE 172800` (48h) so they self-clean. `RiskManager.check_risk` reads today's key, `kevin:daily_trades:{YYYYMMDD}`, where the date is the UTC date.
- Idempotency: counter increment is keyed off `Trade.id`, deduplicated by an `INSERT … ON CONFLICT DO NOTHING` audit row in `kevin_signal_bridge_state`.

The circuit breaker + drawdown halt are new periodic checks running inside `kevin_signal_bridge` (writing `trading:paused` which `RiskManager.check_risk` already honours).
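
The daily counter rules (UTC-dated keys, increment only on confirmed fills, dedupe by trade id) can be illustrated without Redis. The class below is an in-memory stand-in written for illustration only: a dict plays the role of the `INCR`/`INCRBY` keys and a set plays the role of the `ON CONFLICT DO NOTHING` audit row. None of these names exist in the codebase.

```python
from datetime import datetime, timezone


class DailyCounters:
    """In-memory stand-in for the Redis daily counters (illustration only)."""

    def __init__(self, trade_cap: int = 5, alloc_cap_usd: float = 15_000):
        self.trade_cap = trade_cap
        self.alloc_cap_usd = alloc_cap_usd
        self._values = {}   # stands in for the Redis keys
        self._seen = set()  # stands in for the audit-row dedupe

    @staticmethod
    def key(prefix: str, now: datetime) -> str:
        """Build e.g. kevin:daily_trades:20260522 from the UTC date."""
        return f"{prefix}:{now.astimezone(timezone.utc):%Y%m%d}"

    def record_fill(self, trade_id: str, usd: float, now: datetime) -> bool:
        """Count a confirmed fill once; replays of the same trade id are ignored."""
        if trade_id in self._seen:
            return False
        self._seen.add(trade_id)
        for prefix, amount in (("kevin:daily_trades", 1), ("kevin:daily_alloc", usd)):
            k = self.key(prefix, now)
            self._values[k] = self._values.get(k, 0) + amount
        return True

    def allows(self, usd: float, now: datetime) -> bool:
        """Would a new trade of `usd` stay within both daily caps?"""
        trades = self._values.get(self.key("kevin:daily_trades", now), 0)
        alloc = self._values.get(self.key("kevin:daily_alloc", now), 0)
        return trades < self.trade_cap and alloc + usd <= self.alloc_cap_usd
```

Because the key uses the UTC date, a fill at 21:00 US Eastern lands in the next day's bucket; that matters for trades near the session close.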
---
## Backtest engine
### Methodology
| Knob | Default | Rationale |
|---|---|---|
| Mention window | `[channel_first_polled_at, run_end)` | Use everything we have |
| Entry fill | next trading session OPEN (T+1) | Avoids look-ahead bias |
| Exit fill | OPEN of `entry_session + holding_days_for_that_mention` | `holding_days` is per-mention from strategy spec |
| Price source | Alpaca daily bars (IEX feed) via `market_data` cache, lazy back-fetch | Same wrapper as existing `routes/backtest.py::_fetch_alpaca_bars` |
| Slippage | 5 bps symmetric | `slippage_pct = 0.0005` |
| Commission | $0 (Alpaca paper) | |
| Initial capital | $100,000 | Match Alpaca paper default |
| Benchmark | SPY buy-and-hold over `[first_mention_at, run_end]` window | |
| Walk-forward vs single-pass | Single-pass | With <30 days of data, walk-forward is statistically meaningless |
| Dedupe policy | `roll` (latest mention extends exit) | Mirrors live `KevinStrategy` aggregation |
### Backtest implementation: separate mention-driven engine
The existing `BacktestEngine` in `backtester/engine.py` is **bar-driven** (iterates daily bars, evaluates all strategies per tick). The Kevin backtest is **event-driven** (mention → T+1 open entry → time-based exit). Rather than fabricate per-bar evaluations or graft a mention loop onto a bar engine, write a parallel mini-engine:
```
backtester/kevin_backtest.py    ~150 lines
  - takes (KevinStrategy, list[KevinStockMention], price_loader, params)
  - walks mentions chronologically
  - calls strategy.evaluate_mention() per mention to get KevinDecision
  - simulates fills at T+1 open (entry) and exit_date open (exit)
  - tracks portfolio state, equity curve, trade log
  - emits BacktestResult (reusing existing dataclass)
  - reuses compute_metrics() for headline numbers
```
Keep `BacktestEngine` untouched; the 9 existing strategies still backtest there. `compute_metrics` is the shared utility both engines call.
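
The fill-timing rules in the methodology table (entry at the next session's open, exit `holding_days` sessions later, 5 bps symmetric slippage) reduce to an index lookup over a sorted list of trading sessions. The helper names below are hypothetical, and returning `None` when the exit falls beyond the available data is an assumption of this sketch; the real engine might mark such positions to market instead.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional, Sequence, Tuple


def entry_exit_sessions(
    mention_date: date,
    holding_days: int,
    sessions: Sequence[date],
) -> Optional[Tuple[date, date]]:
    """Entry is the first session strictly after the mention date (T+1 open);
    exit is `holding_days` sessions after entry. `sessions` must be sorted."""
    entry_idx = bisect_right(sessions, mention_date)
    exit_idx = entry_idx + holding_days
    if exit_idx >= len(sessions):
        return None  # not enough price history to close the trade
    return sessions[entry_idx], sessions[exit_idx]


def fill_price(open_price: float, side: str, slippage_pct: float = 0.0005) -> float:
    """Symmetric slippage: buys fill slightly above the open, sells slightly below."""
    if side == "buy":
        return open_price * (1 + slippage_pct)
    return open_price * (1 - slippage_pct)
```

Weekend and holiday mentions need no special case: `bisect_right` simply lands on the next date present in `sessions`.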
### Metrics
`backtester/metrics.py::compute_metrics` extended **in place** (no fork) with these additional fields:
| Field | Formula |
|---|---|
| `total_return_pct` | `(final - initial) / initial * 100` |
| `annualized_return_pct` | `((final/initial)^(365.25/days) - 1) * 100` |
| `sharpe_ratio` | `mean(daily_ret) / stddev(daily_ret) * sqrt(252)`, rf=0 |
| `max_drawdown_pct` | `max((peak - trough) / peak)` over equity curve |
| `win_rate_pct` | `wins / closed_trades * 100` |
| `avg_winner_pct` | `mean(pnl_pct where pnl > 0)` |
| `avg_loser_pct` | `mean(pnl_pct where pnl <= 0)` |
| `best_trade` / `worst_trade` | `{symbol, pnl_pct}` |
| `alpha_vs_spy_pct` | `total_return_pct - spy_return_pct` |
| `beta_vs_spy` | `cov(daily_ret, spy_daily_ret) / var(spy_daily_ret)` |
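
For reference, the return, risk, and benchmark formulas in the table can be sketched in plain Python. This is not the existing `compute_metrics` code: the function names are illustrative, the use of the sample standard deviation in the Sharpe ratio is an assumption, and `max_drawdown_pct` is scaled by 100 here to match its `_pct` suffix.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def total_return_pct(initial: float, final: float) -> float:
    return (final - initial) / initial * 100


def annualized_return_pct(initial: float, final: float, days: float) -> float:
    """((final/initial)^(365.25/days) - 1) * 100, with `days` in calendar days."""
    return ((final / initial) ** (365.25 / days) - 1) * 100


def sharpe_ratio(daily_ret: Sequence[float]) -> float:
    """Annualized Sharpe with rf = 0; returns 0.0 when returns have no variance."""
    sd = stdev(daily_ret)  # sample stddev (assumption); needs >= 2 points
    return mean(daily_ret) / sd * math.sqrt(252) if sd else 0.0


def max_drawdown_pct(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline over the curve, as a percentage."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst * 100


def beta(daily_ret: Sequence[float], bench_ret: Sequence[float]) -> float:
    """cov(strategy, benchmark) / var(benchmark); normalisation cancels out."""
    m_s, m_b = mean(daily_ret), mean(bench_ret)
    cov = sum((s - m_s) * (b - m_b) for s, b in zip(daily_ret, bench_ret))
    var = sum((b - m_b) ** 2 for b in bench_ret)
    return cov / var
```

`alpha_vs_spy_pct` is then just the difference of two `total_return_pct` values over the same window.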
### Execution surface
- **On-demand:** `POST /api/meet-kevin/backtest/run` → 202 `{run_uuid, status: running}`. Async via `asyncio.Task`; Redis stores `kevin:backtest:status:{uuid}` with 24h TTL for the dashboard's 2s poll.
- **Scheduled:** the `meet_kevin_watcher` service gains a post-analysis cron stage that calls the same internal `run_backtest_for_kevin_strategy()` helper. This produces the canonical "latest" run the UI defaults to.
- **Persistence:** final result lives in `kevin_backtest_runs` + `kevin_backtest_trades` (Postgres). Redis is only the in-flight surface.
### Schema additions (1 Alembic migration)
Three new tables (all `uuid` PKs for consistency with `Trade` / `Position` / `Signal`):
```sql
kevin_signal_bridge_state
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  mention_id            bigint UNIQUE NOT NULL REFERENCES kevin_stock_mentions(id),
  bridge_status         enum('emitted','skipped_non_tradable','skipped_blocklist',
                             'skipped_caps','deferred','broker_rejected','dry_run') NOT NULL,
  signal_id             uuid REFERENCES signals(id),   -- nullable; null on skip
  trade_id              uuid REFERENCES trades(id),    -- nullable; null until fill
  effective_conviction  numeric(4,3),                  -- after multi-mention boost
  decided_at            timestamptz NOT NULL DEFAULT now(),
  notes                 text,
  INDEX (mention_id),
  INDEX (bridge_status, decided_at DESC);

kevin_backtest_runs
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  run_uuid              uuid UNIQUE NOT NULL,
  started_at            timestamptz NOT NULL DEFAULT now(),
  finished_at           timestamptz,
  status                enum('running','completed','failed') NOT NULL,
  trigger_source        enum('manual','scheduled') NOT NULL DEFAULT 'manual',
  params_json           jsonb NOT NULL,
  metrics_json          jsonb,
  equity_curve_json     jsonb,
  benchmark_curve_json  jsonb,
  error_message         text,
  INDEX (started_at DESC),
  INDEX (status, started_at DESC);

kevin_backtest_trades
  id                    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  run_id                uuid NOT NULL REFERENCES kevin_backtest_runs(id) ON DELETE CASCADE,
  symbol                varchar(16) NOT NULL,
  source_mention_id     bigint REFERENCES kevin_stock_mentions(id),
  entry_at              timestamptz NOT NULL,
  entry_price           numeric(12,4) NOT NULL,
  exit_at               timestamptz,
  exit_price            numeric(12,4),
  qty                   numeric(14,4) NOT NULL,
  pnl_usd               numeric(14,4),
  pnl_pct               numeric(8,4),
  holding_days_actual   int,
  INDEX (run_id, symbol),
  INDEX (run_id, entry_at);
```
**Not reusing the existing `trades` table:** those rows are real fills, whereas backtest trades are hypothetical history; mixing them breaks consumer assumptions.
---
## UI surface
Two new pages under the existing `/meet-kevin/*` group.
### Filtering paper-account trades to "kevin" — strategy_id FK
`Trade` has `strategy_id` (UUID FK to the `strategies` table), NOT `strategy_name`. To filter the paper-account UI:
1. Migration **seeds** a row in `strategies`: `('kevin', 'Meet Kevin Strategy', current_weight=1.0, active=true)`, with a known UUID generated once and pinned as a constant in code.
2. `kevin_signal_bridge` sets `strategy_id` to this UUID on every emitted `TradeSignal`.
3. `trade_executor` already passes `signal.strategy_id` through to `Trade.strategy_id` during persistence.
4. The paper-account API filters: `WHERE trades.strategy_id = $kevin_strategy_uuid`.
### `/meet-kevin/strategy` — backtest + tracking (PRIMARY)
```
┌───────────────────────────────────────────────────────────────────────────────┐
│ Meet Kevin / Strategy [Run backtest ▾] [Refresh] │
├──────────────────┬──────────────────┬──────────────────┬──────────────────────┤
│ Strategy Equity │ Total P&L │ vs SPY (alpha) │ Win rate / # trades │
│ $103,420 +3.4% │ +$3,420 (+3.4%) │ +1.8% │ 64.3% · 14 trades │
└──────────────────┴──────────────────┴──────────────────┴──────────────────────┘
┌───────────────────────────────────────────────────────────────────────────────┐
│ Equity curve — Strategy (blue) · SPY benchmark (gray) │
│ [recharts/lightweight-charts dual-line] │
│ [1W] [2W] [1M] [ALL] │
└───────────────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────────────┐
│ Ticker scorecard [filter: ____] │
├──────┬────────┬──────┬─────────────┬─────────┬─────────┬─────────┬───────────┤
│ Sym │ Action │ Conv │ Entry date │ Live $ │ Unreal. │ Realized│ Status │
│ NVDA │ BUY │ 0.85 │ 2026-05-15 │ $182.40 │ +$420 │ — │ HOLDING │
│ INTC │ BUY │ 0.70 │ 2026-05-14 │ $24.10 │ — │ +$85 │ CLOSED │
│ AMD │ HOLD │ 0.40 │ — │ $145.20 │ — │ — │ WATCH ONLY│
│ ... │ │ │ │ │ │ │ │
└──────┴────────┴──────┴─────────────┴─────────┴─────────┴─────────┴───────────┘
▲ click row → drill-down panel below (per-mention list + per-trade P&L + price chart with mention markers)
┌───────────────────────────────────────────────────────────────────────────────┐
│ Backtest run history (last 5) │
│ [table: run_uuid_short, started_at, holding_days, return, sharpe, status] │
└───────────────────────────────────────────────────────────────────────────────┘
```
**Live prices** come from the `market_data` table (Alpaca-cached daily closes), never from the dashboard's direct Alpaca calls. The API gateway is the single source.
**Empty state** (mention count < 5 or span < 2 trading days): show "Insufficient mention history" but keep the ticker table visible.
### `/meet-kevin/paper-account` — real paper-trading P&L
A separate page, ~50 lines, near-clone of `Portfolio.tsx` with the data filtered to `Trade.strategy_id = <kevin strategy UUID>` (see the strategy_id FK section above). Reuses `EquityCurve`, `MetricsRow`, `PositionsTable` as-is.
Justification for two pages, not one: `/strategy` is "what if" (backtest history + signal universe); `/paper-account` is "what actually happened" (real fills, real positions). Overloading muddles the lenses.
### Sidebar nav
Adds two entries under the existing "Meet Kevin" group:
- Strategy (after Stocks)
- Paper Account
### Dual-line chart component (NEW, not contingent)
`EquityCurve.tsx` renders exactly one `LineSeries`. The strategy-vs-SPY chart needs two series with a legend. **Definitely create a new component** `dashboard/src/components/meetKevin/StrategyVsBenchmarkCurve.tsx` (~80 lines, lightweight-charts); don't extend `EquityCurve` in place (it's used by other pages).
---
## API surface
All under existing `/api/meet-kevin/*` prefix. New router file `routes/meet_kevin_backtest.py` (keeps `meet_kevin.py` from ballooning).
```
POST /api/meet-kevin/backtest/run → 202 {run_uuid, status}
GET /api/meet-kevin/backtest/runs?limit=20 → run list (headline metrics)
GET /api/meet-kevin/backtest/runs/{run_uuid} → full result
GET /api/meet-kevin/backtest/latest → latest scheduled (or most recent)
GET /api/meet-kevin/strategy/tickers → scorecard table data
GET /api/meet-kevin/strategy/equity-curve?from=&to=&include_benchmark=spy
GET /api/meet-kevin/strategy/performance → headline metrics
POST /api/meet-kevin/positions/{symbol}/close → manual close (sets Redis flag)
```
Frontend `dashboard/src/api/meetKevin.ts` gains 7 methods using the same axios `client`.
---
## Code locations
```
NEW
shared/strategies/kevin.py ~120 lines (KevinStrategy)
services/kevin_signal_bridge/__init__.py
services/kevin_signal_bridge/config.py
services/kevin_signal_bridge/main.py ~250 lines (async loop + exit-scan cron)
services/kevin_signal_bridge/aggregator.py multi-mention window logic
services/api_gateway/routes/meet_kevin_backtest.py backtest routes
backtester/kevin_backtest.py ~150 lines (mention-driven mini-engine)
shared/market_data/alpaca_fetch.py extracted from existing routes/backtest.py
shared/models/trading.py +KevinTrade audit-link table
shared/models/meet_kevin.py +KevinBacktestRun, KevinBacktestTrade models
alembic/versions/<ts>_kevin_backtest_tables.py migration
dashboard/src/pages/meetKevin/Strategy.tsx
dashboard/src/pages/meetKevin/PaperAccount.tsx ~50 lines
dashboard/src/components/meetKevin/TickerScorecardTable.tsx
dashboard/src/components/meetKevin/BacktestRunHistory.tsx
dashboard/src/components/meetKevin/StrategyVsBenchmarkCurve.tsx ~80 lines (dual-line chart)
MODIFIED
services/trade_executor/risk_manager.py +daily_trade_count, +daily_alloc_sum, +drawdown halt
services/trade_executor/config.py +new env knobs
shared/strategies/__init__.py export KevinStrategy
backtester/metrics.py extend compute_metrics with alpha, beta, winners, losers
dashboard/src/App.tsx +2 routes
dashboard/src/components/Layout.tsx +2 sidebar entries
dashboard/src/api/meetKevin.ts +7 methods
INFRA (separately owned commit)
infra/stacks/trading-bot/main.tf +kevin_signal_bridge + trade_executor containers
```
---
## Config (env vars, all `TRADING_` prefix)
```
# Signal translation
TRADING_KEVIN_MIN_CONVICTION = 0.60
TRADING_KEVIN_MAX_MENTION_AGE_HOURS = 48
TRADING_KEVIN_HOLD_DAYS = '{"days":3,"weeks":10,"months":45,"long_term":90,"unspecified":10}'
# Sizing
TRADING_KEVIN_BASE_POSITION_PCT = 0.04
TRADING_KEVIN_MIN_TRADE_USD = 500
TRADING_KEVIN_MAX_TRADE_USD = 5000
TRADING_KEVIN_MAX_PER_TICKER_USD = 7500
# Exits
TRADING_KEVIN_STOP_LOSS_PCT = 0.08
TRADING_KEVIN_TAKE_PROFIT_PCT = 0.20
TRADING_KEVIN_AVOID_CLOSES_LONGS = "true"
TRADING_KEVIN_AVOID_BLOCKS_DAYS = 7
# Aggregation
TRADING_KEVIN_MENTION_BOOST_PER_REPEAT = 0.05
TRADING_KEVIN_MAX_MENTION_BOOST = 0.20
# Risk
TRADING_KEVIN_MAX_POSITION_PCT = 0.075
TRADING_KEVIN_DAILY_TRADE_CAP = 5
TRADING_KEVIN_DAILY_ALLOC_CAP_USD = 15000
TRADING_KEVIN_DAILY_LOSS_CIRCUIT_PCT = 0.03
TRADING_KEVIN_EQUITY_DRAWDOWN_HALT_PCT = 0.80
# Backtest
TRADING_KEVIN_BACKTEST_SLIPPAGE_PCT = 0.0005
TRADING_KEVIN_BACKTEST_INITIAL_CAPITAL = 100000
TRADING_KEVIN_BACKTEST_DAILY_CRON = "30 16 * * 1-5" # 4:30pm ET post-close
# Bridge plumbing
TRADING_KEVIN_BRIDGE_POLL_INTERVAL_SECONDS = 60
TRADING_KEVIN_BRIDGE_EXIT_SCAN_CRON = "35 9 * * 1-5" # 9:35 ET market open
TRADING_KEVIN_ENABLE_TRADING = "false" # MASTER KILL-SWITCH — default OFF
```
**Master kill-switch: `TRADING_KEVIN_ENABLE_TRADING=false` by default.** Bridge writes audit rows but never publishes to `signals:generated`. Flip to `true` only after the backtest looks sane.
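
A loader for a few of these knobs might look like the sketch below. It is illustrative only: `load_kevin_config` is a hypothetical helper, and treating anything other than the literal string `true` (case-insensitive) as "off" is an assumption chosen so that a typo fails closed.

```python
import json
import os
from typing import Any, Dict, Mapping

DEFAULT_HOLD_DAYS = '{"days":3,"weeks":10,"months":45,"long_term":90,"unspecified":10}'


def load_kevin_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read a subset of the TRADING_KEVIN_* knobs, falling back to the defaults above."""
    return {
        "min_conviction": float(env.get("TRADING_KEVIN_MIN_CONVICTION", "0.60")),
        "max_mention_age_hours": int(env.get("TRADING_KEVIN_MAX_MENTION_AGE_HOURS", "48")),
        # The hold-days knob is a JSON object encoded in a single env var.
        "hold_days": json.loads(env.get("TRADING_KEVIN_HOLD_DAYS", DEFAULT_HOLD_DAYS)),
        # Kill-switch: only an explicit "true" enables publishing.
        "enable_trading": env.get("TRADING_KEVIN_ENABLE_TRADING", "false").strip().lower()
        == "true",
    }
```

Passing the environment in as a mapping keeps the function testable without touching `os.environ`.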
---
## Edge cases
| Case | Handling |
|---|---|
| Symbol not tradable on Alpaca | `kevin_signal_bridge_state.bridge_status='skipped_non_tradable'` |
| Crypto / OTC / pink sheet | Filtered by `asset.class_ == 'us_equity'` |
| Conviction = 0 | Filtered by 0.60 floor |
| `time_horizon='unspecified'` | Default to 10 trading days |
| Mention created after market close | Re-queued with `bridge_status='deferred'`; next morning's poll picks up if still <48h |
| Alpaca 5xx | `submit_order` returns `OrderStatus.REJECTED`; bridge logs to `kevin_signal_bridge_state` |
| Partial fill | BRACKET order's `filled_avg_price` is used; stop/TP legs attach to filled qty |
| Same mention processed twice | Redis cursor + `INSERT…ON CONFLICT DO NOTHING` audit |
| Kevin reverses while stop-loss order is open | Cancel bracket legs via `broker.cancel_order`, then market SELL |
| Backtest: ticker has no `market_data` rows | Lazy back-fetch from Alpaca for `[entry-1d, end+1d]`; if Alpaca returns nothing, mark trade `skipped` and exclude from denominators |
| Backtest: <5 mentions OR <2 trading-day span | API returns 400 `insufficient_history` |
| Backtest: weekend/holiday mention | Entry shifts to next trading session open |
| Backtest: SPY price gap | Forward-fill from previous trading day |
| Backtest >60s | Status remains `running`; dashboard keeps polling. Watchdog kills at 5min, marks `failed` |
---
## Testing strategy
Mirrors v1's bar (404 tests, 1 behavior per test, no internal mocking of own code):
```
tests/strategies/test_kevin_strategy.py ~25 tests
- per-action signal generation (buy/sell/hold/watch/avoid × held/not-held)
- conviction threshold filtering
- mention age filtering
- time_horizon → holding_days mapping
- non-tradable symbol skip
- blocklist hit short-circuits LONG
tests/services/kevin_signal_bridge/test_main.py ~15 tests
- cursor advances, dedupe via ON CONFLICT
- multi-mention aggregation (latest wins, conviction boost capped)
- daily cap counter increments + resets at UTC midnight
- master kill-switch suppresses publish
- exit-scan emits SELL for elapsed holds
tests/services/kevin_signal_bridge/test_aggregator.py ~10 tests
tests/backtester/test_kevin_adapter.py ~15 tests
- happy path: 3 mentions → 3 backtest trades with correct entry/exit
- dedupe roll: 2nd buy on held ticker extends exit
- SPY benchmark math (alpha/beta)
- extended metrics correctness (winners, losers, best, worst)
tests/api_gateway/routes/test_meet_kevin_backtest.py ~8 tests
- POST /backtest/run returns 202 with run_uuid
- GET /backtest/runs/{uuid} returns full result
- GET /strategy/tickers shapes correctly
- insufficient_history 400
tests/integration/test_kevin_e2e_paper.py [@integration] ~3 tests
- full pipeline against docker-compose Postgres + Redis, mocked Alpaca
Manual QA
- dashboard /meet-kevin/strategy renders backtest + ticker table
- dashboard /meet-kevin/paper-account renders Kevin-only positions
```
Target: ~76 new unit + integration tests.
---
## Phasing (3 phases, staged)
Adjusted after challenger review — Phase A was leaking live-state into the ticker scorecard (which expects HOLDING positions). Merged A+B into Phase 1.
1. **Phase 1 — Strategy + backtest + bridge (audit-only).** Land `KevinStrategy` (standalone), `backtester/kevin_backtest.py`, all 3 new tables, backtest API routes, `kevin_signal_bridge` service. Bridge runs with `TRADING_KEVIN_ENABLE_TRADING=false` — writes `kevin_signal_bridge_state` audit rows showing what it WOULD trade, but never publishes to `signals:generated`. `/meet-kevin/strategy` page shows backtest history + a ticker scorecard with "WOULD-TRADE" status badges (no live positions yet). User reviews the bridge's decisions against the backtest before flipping the switch.
2. **Phase 2 — Trade executor re-enabled, trading switched on.** Add `trade_executor` container back to the K8s Pod. Extend `OrderRequest`/`AlpacaBroker` for BRACKET orders. Seed the `kevin` row in `strategies`. Flip `TRADING_KEVIN_ENABLE_TRADING=true`. Live paper trades begin. Ticker scorecard now shows real HOLDING/CLOSED rows.
3. **Phase 3 — Paper-account UI.** Land `/meet-kevin/paper-account` page (Kevin-strategy-only slice of `Portfolio.tsx` filtered by `strategy_id`).
Each phase is independently deployable. User reviews after each.
---
## Out of scope (intentional)
- Real money (live brokerage account)
- Short selling
- Options
- Crypto
- Multi-account allocation
- Per-mention attribution analytics ("high-conviction-only" filter view)
- CSV export of trades
- Strategy parameter sweeps in the UI (grid search via `scripts/` for now)
- "What-if" UI for tweaking conviction thresholds — the live strategy owns that
- Trailing stops
- End-of-week rebalances
- Sector concentration limits
---
## Success criteria
- Backtest runs against the current 63+ mentions and produces a coherent equity curve + metrics card
- The strategy page renders the ticker scorecard with live prices, action chips, and conviction bars
- After flipping the kill-switch, at least one paper trade executes via Alpaca within 60s of a new Kevin mention that passes the filters
- `/meet-kevin/paper-account` shows the Kevin-attributed equity curve diverging from the unified `/portfolio` view
- All hard-cap risk controls fire correctly in tests (verified by integration test that synthesises a 5-mention burst)
---
## Open questions for the user
1. **Phasing:** ship Phase 1 → 2 → 3 with review checkpoints, or hand all three to one big subagent push?
2. **Initial paper capital:** $100k Alpaca default — or override to a smaller number ($10k?) for less abstract dollar signs?
3. **`avoid` policy split:** the design splits into `AVOID_CLOSES_LONGS` (default `true`) + `AVOID_BLOCKS_DAYS` (default `7`). Subsequent `buy` clears the blocklist. Acceptable?
4. **`/meet-kevin/paper-account` vs reusing `/portfolio`:** two pages (Kevin slice) or one with a Kevin filter?
Defaults if no override: stage all 3 phases, $100k, split-knob avoid, two pages.