Composable: cursor/aggregator/strategy/publisher/audit_writer/broker
all injected. Master kill-switch (kevin_enable_trading=false) routes to
audit-only path. Cursor advances ONLY after XADD succeeds (race fix).
Concrete collaborators wired in subsequent tasks.
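A minimal sketch of the ordering described above, assuming a hypothetical Bridge class and an in-memory FakePublisher in place of the real Redis stream client. Whether the cursor also advances on the audit-only path is an assumption here, not something this message states.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakePublisher:
    """Stand-in for the stream publisher; records XADD payloads."""
    fail: bool = False
    sent: list = field(default_factory=list)

    async def xadd(self, payload: dict) -> str:
        if self.fail:
            raise ConnectionError("XADD failed")
        self.sent.append(payload)
        return f"{len(self.sent)}-0"


@dataclass
class Bridge:
    publisher: FakePublisher
    enable_trading: bool = False  # mirrors kevin_enable_trading
    cursor: int = 0               # id of the last mention fully handled
    audit: list = field(default_factory=list)

    async def handle(self, mention_id: int, signal: dict) -> None:
        if not self.enable_trading:
            # Kill-switch off: record the would-be trade, publish nothing.
            self.audit.append(("would_trade", mention_id))
            self.cursor = mention_id  # assumption: audit-only also advances
            return
        # Publish first. If XADD raises, the cursor stays put, so the
        # mention is retried on the next pass instead of being skipped.
        await self.publisher.xadd(signal)
        self.audit.append(("published", mention_id))
        self.cursor = mention_id
```

With a failing publisher the exception propagates and the cursor keeps its old value, which is the property the race fix is after.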
Also extends TradeSignal + SignalDirection.EXIT with the optional
fields Kevin paths need (strategy_id, target_dollars, stop_loss_pct,
take_profit_pct).
Stateless: mention + account_state -> KevinDecision. Conviction-weighted
sizing, time_horizon-derived hold periods, hard per-ticker cap. The
bridge and the backtest mini-engine both call evaluate_mention so
behaviour cannot drift.
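A rough sketch of what a stateless evaluate_mention could look like. The field names, the sizing fractions, and the hold-period table are all illustrative assumptions, not the strategy's real parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    symbol: str
    action: str        # e.g. "buy" or "avoid" (hypothetical labels)
    conviction: float  # 0.0 .. 1.0
    time_horizon: str  # "short" | "medium" | "long" (hypothetical labels)


@dataclass(frozen=True)
class AccountState:
    equity: float
    exposure: dict  # symbol -> dollars currently held


@dataclass(frozen=True)
class KevinDecision:
    symbol: str
    target_dollars: float
    hold_days: int


# Illustrative numbers only.
BASE_FRACTION = 0.05            # of equity at full conviction
PER_TICKER_CAP_FRACTION = 0.10  # hard cap per ticker
HOLD_DAYS = {"short": 5, "medium": 30, "long": 90}


def evaluate_mention(mention: Mention, account: AccountState) -> KevinDecision:
    """Pure function: the same inputs always give the same decision."""
    if mention.action != "buy":
        return KevinDecision(mention.symbol, 0.0, 0)
    # Conviction-weighted size, then clipped to the room left under the cap.
    wanted = account.equity * BASE_FRACTION * mention.conviction
    cap = account.equity * PER_TICKER_CAP_FRACTION
    room = max(0.0, cap - account.exposure.get(mention.symbol, 0.0))
    return KevinDecision(
        mention.symbol, min(wanted, room), HOLD_DAYS[mention.time_horizon]
    )
```

Because the function holds no state, the bridge and the backtest mini-engine can call it with their own AccountState and get identical decisions for identical inputs.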
Adds 3 new tables and seeds the 'kevin' row in strategies with a pinned
UUID constant, so Trade.strategy_id can be joined back to the strategy
across the live and backtest paths.
3 tables (kevin_signal_bridge_state, kevin_backtest_runs,
kevin_backtest_trades) all UUID-keyed for consistency with Trade/Position.
KEVIN_STRATEGY_UUID constant pinned for FK joins from Trade.strategy_id.
Strategy.tsx composes 6-up metrics header, StrategyVsBenchmarkCurve
equity chart, TickerScorecardTable and BacktestRunHistory. Selecting a
backtest run replaces the chart with that run's equity curve. Run
Backtest button fires POST and polls for completion.
App.tsx: +route /meet-kevin/strategy.
Layout.tsx: +sidebar entry 'MK Strategy' under Meet Kevin group.
TickerScorecardTable: bridge-status badges (WOULD-TRADE/HOLDING/etc),
conviction bar, unrealised P&L, manual-close button.
BacktestRunHistory: sortable run list with return/sharpe/alpha columns,
row click selects a run for detail view.
StrategyVsBenchmarkCurve: dual lightweight-charts line (strategy blue,
SPY dashed grey) with legend.
BacktestRun, BacktestRunDetail, StrategyTicker, StrategyEquityCurve,
StrategyPerformance types added to meetKevin.ts. New meetKevinStrategy.ts
with 8 axios methods covering the backtest run/list/get/latest and
strategy tickers/equity-curve/performance/close endpoints.
Maps the design doc (commit 280f807) to bite-sized tasks. Phase 1 ships
strategy + backtest + bridge in audit-only mode; Phase 2 extends
OrderRequest/AlpacaBroker for BRACKET orders, extends RiskManager, and
flips the kill-switch; Phase 3 ships the paper-account UI page.
Each task has Test → Run-and-fail → Implement → Run-and-pass → Commit
steps with concrete code in every step. Implementer can pick up any
task without prior session context.
Synthesizes work of two parallel architect agents (strategy +
paper-trading rules / backtest + UI surface) and the subsequent
challenger review. Resolves 11 issues the challenger raised:
- KevinStrategy is standalone, not BaseStrategy subclass (signature
mismatch — BaseStrategy.evaluate is bar-driven, Kevin is event-driven)
- backtester/kevin_backtest.py as parallel mention-driven mini-engine,
not a fake adapter onto BacktestEngine
- AlpacaBroker BRACKET support specified (OrderRequest schema + broker
_build_order_request extensions)
- Filtering paper-account trades via strategy_id FK (the actual field;
Trade.strategy_name doesn't exist) — migration seeds a 'kevin' row
- Cursor advance race fixed (XADD success → cursor advance)
- Daily counter mechanics specified (Redis INCR + audit dedupe)
- kevin_signal_bridge_state table added to data model (3 new tables now)
- All PKs UUID for consistency with Trade/Position
- StrategyVsBenchmarkCurve.tsx promoted from contingent to definitely-new
- 'avoid' policy split into AVOID_CLOSES_LONGS + AVOID_BLOCKS_DAYS knobs
- Phasing collapsed A+B into Phase 1 (ticker scorecard needs bridge
audit rows to render WOULD-TRADE badges)
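One way the daily counter item could work, sketched with an in-memory FakeRedis that only mimics INCR. The key format, the try_count_signal helper, and the way dedupe and counting are combined are assumptions; the design doc holds the actual mechanics.

```python
from datetime import date


class FakeRedis:
    """In-memory stand-in exposing only INCR semantics."""

    def __init__(self) -> None:
        self.store: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]


def try_count_signal(
    redis: FakeRedis,
    audited_ids: set,
    mention_id: str,
    day: date,
    daily_limit: int,
) -> bool:
    """Return True if this mention may trade today.

    A mention that already has an audit row is rejected before INCR, so
    reprocessing after a restart does not inflate the daily count.
    """
    if mention_id in audited_ids:
        return False
    audited_ids.add(mention_id)
    count = redis.incr(f"kevin:signals:{day.isoformat()}")  # hypothetical key
    return count <= daily_limit
```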
The first production run hit Anthropic's per-account rate_limit_error (429)
while trying to process 16 backfill videos within seconds. The SDK's built-in
retry can't recover because the rate-limit window takes longer to reset than
its 3 retry attempts last.
Added meet_kevin_inter_video_sleep_seconds (default 30s) to PipelineDeps and
main's _process_pending_videos loop. 16 backfill videos now take ~16 min
(16 * 30s sleeps + 16 * ~30s LLM calls) instead of bursting into the rate limit.
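A sketch of the pacing loop, assuming a hypothetical analyze coroutine and an injectable sleep so it can be tested without waiting. Sleeping only between videos (not after the last one) is an assumption about the real loop.

```python
import asyncio


async def process_pending_videos(
    videos,
    analyze,
    inter_video_sleep_seconds: float = 30.0,
    sleep=asyncio.sleep,
):
    """Analyze videos one at a time, pausing between consecutive LLM calls."""
    results = []
    for index, video in enumerate(videos):
        if index > 0:
            # Spread calls out so a backfill does not burst into the 429.
            await sleep(inter_video_sleep_seconds)
        results.append(await analyze(video))
    return results
```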
sentiment-analyzer service is disabled in the K8s revival (removed from
the workers Pod spec). Its torch+transformers dependency adds ~2GB to
the image with no runtime benefit. Tests still install it via the
.[dev] extras for the test step.
Image size: ~3GB -> ~1GB.
Previous refactor (89f01ad) moved to OpenRouter because no sk-ant-api-* key
was found in Vault. Turns out claude-agent-service-spare-{1,2} hold
sk-ant-oat01-* OAuth tokens (108 chars, scope user:inference, 1-year TTL,
minted via 'claude setup-token' — see memory id=832).
These tokens work with the Anthropic SDK via the auth_token= constructor
argument (routes to Authorization: Bearer ... instead of x-api-key: ...).
They consume the Enterprise Claude subscription quota rather than
per-call billing, so the OpenRouter zero-credit problem goes away.
- llm_analyzer.py: revert OpenAI client to AsyncAnthropic; tool-use API
+ cache_control restored
- config.py: openrouter_api_key -> anthropic_oauth_token; model slug
reverted from anthropic/claude-sonnet-4.5 -> claude-sonnet-4-5
- main.py: AsyncOpenAI -> AsyncAnthropic(auth_token=...), drop OpenRouter
attribution headers
- pyproject: openai>=1.50 -> anthropic>=0.40 in meet_kevin extras
- tests: mocks ported back to messages.create + tool_use blocks
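To make the header difference concrete, here is a small illustration. The auth_headers helper is hypothetical and is not how the SDK decides; in the SDK the choice follows from which constructor argument (auth_token= or api_key=) is passed.

```python
def auth_headers(credential: str) -> dict:
    """Illustrative only: map a credential to the header it travels in.

    OAuth tokens (sk-ant-oat...) go out as a Bearer token; anything else
    is treated here as a classic API key sent via x-api-key.
    """
    if credential.startswith("sk-ant-oat"):
        return {"Authorization": f"Bearer {credential}"}
    return {"x-api-key": credential}
```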
yt-dlp requires:
- ffmpeg for sub-conversion (--convert-subs srt) and any format mux
- A JS runtime for YouTube player decryption (deno or node)
Without these, every caption extraction attempt in the meet-kevin-watcher
container fails with 'ffmpeg not found' + 'No supported JavaScript runtime
could be found'. Adding both to the runtime stage of Dockerfile.service.
Size impact: ~50 MB.
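A startup preflight along these lines could surface the missing binaries before the first extraction attempt. The missing_ytdlp_tools helper is hypothetical and not part of this change; it only uses shutil.which from the standard library.

```python
import shutil


def missing_ytdlp_tools(which=shutil.which) -> list:
    """Return the external tools yt-dlp needs that are not on PATH."""
    missing = []
    if which("ffmpeg") is None:
        missing.append("ffmpeg")
    # Either JS runtime is enough for YouTube player decryption.
    if which("deno") is None and which("node") is None:
        missing.append("deno or node")
    return missing
```

Calling it once at service start and logging a non-empty result would turn two per-video errors into one clear message.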
User's Vault has openrouter_api_key but no direct sk-ant-* Anthropic key.
OpenRouter passes through Claude Sonnet 4.6 (~3% markup over Anthropic
list pricing) and matches the existing gpt_mini_endpoint pattern used
by recruiter-responder.
- Replace anthropic.AsyncAnthropic with openai.AsyncOpenAI + base_url
- Convert Anthropic tool-use API to OpenAI function-calling
- System prompt unchanged (analyst instructions are model-agnostic)
- Drop cache_control (not in OpenAI API); revisit later if cost matters
- Model slug: anthropic/claude-sonnet-4.5 (OpenRouter's current Claude tier)
- Pricing: $3.10/M input, $15.50/M output (OpenRouter pass-through)
- Config field anthropic_api_key -> openrouter_api_key
- pyproject extras: anthropic>=0.40 -> openai>=1.50
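The tool-use to function-calling conversion is mostly a reshaping of the tool definition. A sketch, where the record_analysis tool and its schema are made-up examples rather than the analyzer's real tool:

```python
def anthropic_tool_to_openai(tool: dict) -> dict:
    """Reshape an Anthropic tool definition into an OpenAI function tool.

    Anthropic keeps the JSON Schema under "input_schema"; OpenAI nests
    the same schema under function.parameters.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],
        },
    }


# Hypothetical tool definition, for illustration only.
EXAMPLE_TOOL = {
    "name": "record_analysis",
    "description": "Record per-ticker recommendations for one video.",
    "input_schema": {
        "type": "object",
        "properties": {"symbol": {"type": "string"}},
        "required": ["symbol"],
    },
}
```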
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three issues caught during end-to-end manual QA against docker-compose:
1. SAEnum field columns serialized to Python enum NAMES ('DISCOVERED')
but the DB enum had VALUES ('discovered'). Added `values_callable`
to all 5 SAEnum() declarations in shared/models/meet_kevin.py so they
emit values, matching the migration's enum literals.
2. /dashboard's "last 7 days" / "last 14 days" filters used
`func.cast("7 days", type_=None)` which produced NullType DDL.
Replaced with `text("now() - interval '7 days'")`.
3. /dashboard's outlook trend query repeated `func.date_trunc("day", col)`
in SELECT, GROUP BY and ORDER BY. Each call renders "day" as its own
bind parameter, so Postgres sees three different expressions and cannot
match the SELECT column to the GROUP BY. Hoisted into a single
`day_trunc` variable so all three clauses reference the same SQL fragment.
All 11 /api/meet-kevin/* endpoints now return valid JSON against a
docker-compose Postgres seeded with one analyzed video + NVDA mention.
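The name/value mismatch from issue 1 can be shown without a database. In this sketch the ANALYZED member is an assumed example, and enum_values has the shape of callable that would be passed as `values_callable`:

```python
import enum


class VideoStatus(enum.Enum):
    DISCOVERED = "discovered"
    ANALYZED = "analyzed"  # hypothetical member, for illustration


def enum_values(enum_cls) -> list:
    """Return member VALUES, matching the lowercase literals in the migration."""
    return [member.value for member in enum_cls]


# What gets persisted by default (names) vs. with the callable (values).
DEFAULT_LABELS = [member.name for member in VideoStatus]
FIXED_LABELS = enum_values(VideoStatus)
```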
- Implement CaptionResult frozen dataclass for structured caption data
- Add parse_srt() to parse SubRip format with flexible timestamp handling
- Add extract_captions() async function using yt-dlp subprocess wrapper
- Prefer manual captions over auto-generated; clean up SRT files after parsing
- Add 16 comprehensive tests covering edge cases (empty input, malformed SRT,
timestamp variations, language extraction, manual vs auto selection)
- Type-safe implementation with full mypy --strict compliance
- Add sample.srt fixture with 3 segments mentioning NVDA for test reference
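For reference, a simplified sketch of SRT parsing with tolerant timestamps. This is not the committed parse_srt; the Segment type and the choice to skip malformed blocks silently are assumptions.

```python
import re
from dataclasses import dataclass

# Accepts "," or "." before the fraction, and a missing fraction.
_TS = re.compile(r"(\d+):(\d{2}):(\d{2})(?:[.,](\d{1,3}))?")


@dataclass(frozen=True)
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(stamp: str) -> float:
    match = _TS.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    fraction = int((millis or "0").ljust(3, "0")) / 1000
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + fraction


def parse_srt(raw: str) -> list:
    """Parse SubRip text into segments, skipping malformed blocks."""
    segments = []
    for block in re.split(r"\n\s*\n", raw.replace("\r\n", "\n").strip()):
        lines = [line for line in block.split("\n") if line.strip()]
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue  # no timing line in this block
        start_raw, end_raw = lines[timing].split("-->", 1)
        try:
            start, end = _seconds(start_raw), _seconds(end_raw)
        except ValueError:
            continue  # unparseable timestamp
        text = " ".join(line.strip() for line in lines[timing + 1:])
        if text:
            segments.append(Segment(start, end, text))
    return segments
```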
- Replace all Literal[...] type annotations with corresponding enum classes
(TickerAction, TimeHorizon, MarketOutlook, VideoStatus, TranscriptSource)
for MeetKevinTickerMention, MeetKevinAnalysis, and API response models
(VideoSummary, VideoDetail, StockMention, StockSummary, TimelineBucket)
- Add min_length=1, max_length=10 validation to MeetKevinTickerMention.symbol
- Split test_conviction_edge_cases into two separate boundary tests
- Strengthen test_valid_ticker_mention with assertions for all 6 fields
- Trim no-information docstrings from TranscriptSegment, StockTimeline
- All 60 schema tests pass
22-task plan across 6 phases (data layer, watcher service stages,
API gateway, dashboard, container/deps, K8s revival). Each task
includes TDD checkboxes, file paths, acceptance criteria, and
commit commands. References the design doc (ab382af) for schemas,
prompts, and the Terraform diff.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Design for reviving the trading-bot K8s stack with a new Meet Kevin
YouTube watcher pipeline. v1 scope: poll RSS every 3h, extract
captions via yt-dlp, run transcript through Claude Sonnet 4.6 for
structured per-ticker recommendations and market outlook, surface
in a new ticker-centric UI under /meet-kevin/*. Bot integration
(signal_generator) and auto-trading deferred to v2.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- TradeLog: use created_at (from API) instead of timestamp for date display
- EquityCurve: deduplicate data by day before passing to lightweight-charts,
preventing "data must be asc ordered by time" assertion failure when
multiple snapshots exist on the same day
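The dedupe logic, sketched in Python for brevity (the actual fix lives in the TypeScript component). The snapshot shape with timestamp and equity keys, and keeping the latest snapshot of each day, are assumptions.

```python
from datetime import datetime


def dedupe_by_day(snapshots: list) -> list:
    """Keep the latest snapshot per calendar day, in ascending day order."""
    latest = {}
    for snap in sorted(snapshots, key=lambda s: s["timestamp"]):
        # Later snapshots on the same day overwrite earlier ones.
        latest[snap["timestamp"].date()] = snap
    return [
        {"time": day.isoformat(), "value": snap["equity"]}
        for day, snap in sorted(latest.items())
    ]
```

The result has exactly one point per day with strictly increasing time values, which is what the chart's ordering assertion requires.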
Made-with: Cursor