trading/docs/plans/2026-05-21-meet-kevin-revival-design.md
Viktor Barzin ab382af3f5 add Meet Kevin revival design document
Design for reviving the trading-bot K8s stack with a new Meet Kevin
YouTube watcher pipeline. v1 scope: poll RSS every 3h, extract
captions via yt-dlp, run transcript through Claude Sonnet 4.6 for
structured per-ticker recommendations and market outlook, surface
in a new ticker-centric UI under /meet-kevin/*. Bot integration
(signal_generator) and auto-trading deferred to v2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 18:42:46 +00:00

366 lines
20 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Meet Kevin Revival — Design
## Overview
Revive the trading bot from its current fully-disabled state and add a new pipeline that watches the Meet Kevin (Kevin Paffrath) YouTube channel, extracts captions from every new upload, runs the transcript through Claude Sonnet 4.6 to infer per-ticker recommendations and overall market outlook, and surfaces the results in a ticker-centric UI inside the existing dashboard. Bot integration (feeding the recommendations into `signal_generator`) is explicitly v2; auto-trading stays off in v1.
## Goals
- Subscribe to the Meet Kevin channel and process each new upload end-to-end without human intervention.
- Present sentiment and recommendations keyed on the **stock ticker** (the user's stated primary need), with the video feed as a secondary surface.
- Revive the bot in K8s with the lowest reasonable footprint — keep the scaffolding warm, leave the resource-heavy scrapers and auto-trading off.
## Scope
### In scope (v1)
- New `services/meet_kevin_watcher/` Python service inside `viktor/trading`.
- 5 new Postgres tables via a single Alembic migration.
- `/api/meet-kevin/*` routes added to the existing `api_gateway`.
- 5 new dashboard pages under `/meet-kevin/*`.
- Terraform revival of `infra/stacks/trading-bot/` with 3 services held at `replicas=0` (news-fetcher, sentiment-analyzer, trade-executor) and the new watcher container added to the workers Deployment.
- 1 new pyproject extras group (`meet_kevin`) pinning a current `yt-dlp`.
### Out of scope (v2 or later)
- Signal injection into `signal_generator` (will be wired via a new `kevin:signals` Redis stream + a "kevin" strategy weight).
- Multi-channel support in the UI (schema is multi-channel-ready, surface is single-channel).
- Audio archival or local Whisper fallback (captions-only).
- Slack notification on high-conviction signals.
- Auto-reanalysis when the LLM prompt version changes (manual "Reprocess" only).
## Architecture
```
┌────────────────────────── viktor/trading (revived K8s stack) ──────────────────────────────┐
│ │
│ NEW services/meet_kevin_watcher/ single async service, sequential stages │
│ ├── rss_poller.py every 3h, GET YouTube /feeds/videos.xml, dedupe │
│ ├── caption_extractor.py yt-dlp --write-auto-sub --skip-download │
│ ├── llm_analyzer.py Claude Sonnet 4.6, structured JSON output │
│ └── main.py orchestrates 3 stages, writes Postgres │
│ │
│ EXTEND │
│ services/api_gateway/ +1 router meet_kevin.py + 1 schemas module │
│ dashboard/ +1 page group /meet-kevin/* + components │
│ shared/models/ +kevin_channel, kevin_video, kevin_transcript, │
│ kevin_analysis, kevin_stock_mention │
│ shared/schemas/ +meet_kevin.py │
│ alembic/ +1 migration with 5 tables │
│ │
│ DISABLED (container removed from `trading-bot-workers` Pod): │
│ news_fetcher, sentiment_analyzer, trade_executor │
│ ACTIVE dashboard, api_gateway, (frontend Pod)│
│ signal_generator, learning_engine, market_data, (workers Pod) │
│ meet_kevin_watcher (NEW) (workers Pod) │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────┘
```
### Key architectural choices
- **No Redis streams in v1.** With ~3 videos/day the coordination cost of streams would dwarf the value. The watcher walks each video through all three stages and writes directly to Postgres. A `kevin:signals` stream gets added in v2 when wiring into `signal_generator`.
- **Single long-running async service** (matching the `news_fetcher` pattern). Reuses the same `shared.config.BaseConfig`, `shared.db`, `shared.telemetry` setup as the other services.
- **K8s revival is a single Terraform edit.** No new namespace, no new PVC, no new ingress — transcripts live in Postgres, the embedded YouTube iframe handles video display, no file storage needed.
## Channel
- **Channel**: Meet Kevin (Kevin Paffrath)
- **YouTube channel ID**: `UCUvvj5lwue7PspotMDjk5UA`
- **Feed**: `https://www.youtube.com/feeds/videos.xml?channel_id=UCUvvj5lwue7PspotMDjk5UA`
- **Verified 2026-05-21**: feed parses (Atom), returns 15 most recent uploads, ~23 videos/day.
- Multi-channel support is schema-ready; only this one channel is wired in v1.
## Data model
New Postgres tables (single Alembic migration). All `id` columns are `bigserial`; all timestamp columns are `timestamptz NOT NULL DEFAULT now()`.
```
kevin_channels
id, youtube_channel_id (unique), title,
poll_enabled (bool, default true), poll_interval_seconds (int, default 10800),
daily_cost_cap_usd (numeric(8,2), default 5.00),
last_polled_at (nullable), created_at, updated_at
kevin_videos
id, channel_id (FK kevin_channels.id), youtube_video_id (unique, indexed),
title, description, published_at (indexed), duration_seconds (nullable),
thumbnail_url, status (enum {discovered, captioned, analyzed, failed, skipped}),
failure_reason (nullable), retry_count (int, default 0),
processed_at (nullable), created_at, updated_at
kevin_transcripts
id, video_id (FK kevin_videos.id, unique), source (enum {captions_manual, captions_auto, none}),
language (varchar(8)), raw_text (text), segments_json (jsonb), word_count, created_at
kevin_analyses -- one row per LLM run; supports re-runs
id, video_id (FK kevin_videos.id, indexed), model, prompt_version,
market_outlook_direction (enum {bullish, neutral, bearish, mixed}),
market_outlook_reasoning (text),
macro_themes_json (jsonb), -- array of strings
key_risks_json (jsonb), -- array of strings
summary (text), raw_response_json (jsonb),
prompt_tokens, completion_tokens, cost_usd (numeric(10,4)),
created_at
kevin_stock_mentions -- denormalized per-ticker
id, video_id (FK kevin_videos.id, indexed),
analysis_id (FK kevin_analyses.id),
symbol (varchar(16), indexed), action (enum {buy, sell, hold, watch, avoid}),
conviction (numeric(4,3)), -- 0.000 to 1.000
time_horizon (enum {intraday, days, weeks, months, long_term, unspecified}),
rationale_quote (text),
video_timestamp_seconds (int, nullable), -- deep-link target
created_at
```
Indexes: `(symbol, created_at desc)` on `kevin_stock_mentions` (ticker timeline queries), `(published_at desc)` on `kevin_videos` (video feed), `(status)` partial index on `kevin_videos WHERE status IN ('discovered', 'captioned')` (pipeline scan).
## Pipeline
Three stages, run sequentially per video by a single async loop.
### Stage 1 — RSS poll (every 3h)
1. For each `kevin_channels.poll_enabled = true` row, GET `/feeds/videos.xml?channel_id=<id>`.
2. Parse the Atom feed with `feedparser`.
3. For each entry, `INSERT … ON CONFLICT (youtube_video_id) DO NOTHING` into `kevin_videos` with `status = 'discovered'`.
4. Update `kevin_channels.last_polled_at`.
5. Failure: log warning, increment metric, retry on next 3h cycle. No DLQ — RSS replays the same 15 entries on every poll.
**First-run backfill**: the feed returns the 15 most recent uploads at once, so the first poll inserts up to 15 `discovered` rows. The pipeline processes them sequentially (~60s each, ~$1.50 in Claude calls total). One-off burst, contained by the daily cost cap.
### Stage 2 — Caption extraction
For each video with `status = 'discovered'`, in `published_at` ascending order:
1. `yt-dlp --write-auto-sub --write-sub --sub-lang 'en.*' --skip-download --convert-subs srt -o '<workdir>/<vid>.%(ext)s' 'https://www.youtube.com/watch?v=<vid>'`
2. If the SRT file exists, parse into segments `[{start, end, text}]` and a flat `raw_text`.
3. Insert `kevin_transcripts` row, advance `kevin_videos.status` to `captioned`.
4. Cleanup the workdir.
**Failure modes**:
- No captions available → `status = 'failed'`, `failure_reason = 'no_captions'`. Surfaces in the dashboard with a "Reprocess" button (no-op for now; meaningful when we add Whisper fallback in v2).
- yt-dlp bot-detection / network → increment `retry_count`, exp-backoff (1m, 5m, 15m, 1h, 6h), keep `status = 'discovered'`. After 5 retries → `failed`.
**yt-dlp version**: the system's `apt`-installed yt-dlp (`2024.04.09`) trips YouTube's bot-detection. The Dockerfile pip-installs a current PyPI release (pinned `>=2026.04`).
### Stage 3 — LLM analysis
For each video with `status = 'captioned'`:
1. Check daily cost: `SELECT SUM(cost_usd) FROM kevin_analyses WHERE created_at >= date_trunc('day', now() at time zone 'UTC')`. If `>= daily_cost_cap_usd`, log + skip (video remains `captioned`, resumes tomorrow).
2. Build the Claude Sonnet 4.6 request:
- System prompt (cached via Anthropic prompt caching — 23K tokens describing schema, examples, output format)
- User message: `published_at`, `title`, `description`, plus the full transcript with segment timestamps preserved
- Tool definition that constrains response to the structured JSON schema below
3. Call the API. On HTTP error, exp-backoff retry up to 3x, then `failed`.
4. Validate the tool-input JSON via Pydantic. On parse error, `failed` with `raw_response_json` captured for debugging.
5. Insert one `kevin_analyses` row + N `kevin_stock_mentions` rows in a single transaction. Advance `kevin_videos.status` to `analyzed`.
### LLM output schema (Pydantic, also the tool-input schema)
```python
class MeetKevinTickerMention(BaseModel):
symbol: str # uppercased, validated [A-Z]{1,6}
action: Literal["buy","sell","hold","watch","avoid"]
conviction: float # 0.01.0
time_horizon: Literal["intraday","days","weeks","months","long_term","unspecified"]
rationale_quote: str # short verbatim or paraphrased quote
video_timestamp_seconds: int | None # deep-link target
class MeetKevinAnalysis(BaseModel):
market_outlook_direction: Literal["bullish","neutral","bearish","mixed"]
market_outlook_reasoning: str
macro_themes: list[str]
key_risks: list[str]
summary: str # ~200 words
tickers: list[MeetKevinTickerMention]
```
### Cost & rate
- ~3 videos/day × ~715K input tokens × Sonnet 4.6 → ~$0.070.10 per video
- Prompt cache: the 23K-token system prompt is reused across videos in a 5-min window → ~95% cache hit on the system portion
- Default daily cap **$5/day**, configurable per channel via `daily_cost_cap_usd`
- Expected monthly spend: $310
## API surface
All routes added to `services/api_gateway/routes/meet_kevin.py`. Behind the existing JWT/passkey auth.
```
GET /api/meet-kevin/health
GET /api/meet-kevin/channels
PATCH /api/meet-kevin/channels/{id}
GET /api/meet-kevin/videos ?status=&from=&to=&q=&page=&limit=
GET /api/meet-kevin/videos/{id}
GET /api/meet-kevin/videos/{id}/transcript
POST /api/meet-kevin/videos/{id}/reprocess ?stage={captions|analysis}
GET /api/meet-kevin/stocks
GET /api/meet-kevin/stocks/{symbol}
GET /api/meet-kevin/stocks/{symbol}/timeline ?bucket=day|week
GET /api/meet-kevin/dashboard home aggregate
```
Response schemas live in `shared/schemas/meet_kevin.py` so both backend and frontend (via codegen or hand-written TS interfaces) consume the same shape.
## Dashboard pages
New page group under `dashboard/src/pages/meet-kevin/` with React Router routes. Sidebar gains a "Meet Kevin" section.
| Route | Purpose | Key components |
|---|---|---|
| `/meet-kevin` | Home — latest video card, top-conviction tickers this week, outlook trendline (14d), pipeline status pill | `LatestVideoHero`, `TopConvictionTable`, `OutlookTrendChart` |
| `/meet-kevin/videos` | Feed with filters (status, date range, ticker search) | `VideoFeed`, `VideoCard`, `FilterBar` |
| `/meet-kevin/videos/:id` | Detail — embedded YouTube iframe + tabs (Analysis / Transcript / Raw) | `YouTubeEmbed`, `AnalysisPanel`, `TranscriptPanel`, `TickerCard` |
| `/meet-kevin/stocks` | **Primary surface** — sortable table: symbol, mention count, last seen, latest action, avg conviction, 14d sparkline | `StocksTable`, `Sparkline` |
| `/meet-kevin/stocks/:symbol` | Drill-down — sentiment trendline + chronological mention list + (optional) market-data price overlay | `TickerHeader`, `SentimentTrendChart`, `MentionList` |
**Stack reuse** (all already in `dashboard/package.json`): TanStack Query, Tailwind 4, recharts, lightweight-charts (for the optional price overlay), React Router 7. No new top-level dependencies.
**Empty states**: home shows "Waiting for first poll" with a manual-trigger button. Stocks list shows "No analyses yet" until Claude returns first response.
**Frontend stack note**: `~/.claude/CLAUDE.md` defaults new web apps to Svelte; this is extending the existing React 19 dashboard, so React stays. Documented for traceability.
## K8s revival changes
Single edit to `~/code/infra/stacks/trading-bot/main.tf` (file is currently inside a `/* … */` block-comment, disabled 2026-04-06):
1. **Uncomment the entire `/* … */` block.**
2. In the `trading-bot-workers` Deployment **remove the 3 disabled containers** from the Pod spec: `news-fetcher`, `sentiment-analyzer`, `trade-executor`. (Note: all 6 workers share one Pod, so "disable" means delete the container block — there is no per-container `replicas` knob.)
3. **Add a new `meet-kevin-watcher` container** to the same Pod spec:
```hcl
container {
name = "meet-kevin-watcher"
image = "viktorbarzin/trading-bot-service:latest"
image_pull_policy = "Always"
command = ["python", "-m", "services.meet_kevin_watcher.main"]
dynamic "env" {
for_each = local.common_env
content { name = env.key, value = env.value }
}
env { name = "TRADING_OTEL_METRICS_PORT", value = "9097" }
env_from { secret_ref { name = "trading-bot-secrets" } }
env_from { secret_ref { name = "trading-bot-db-creds" } }
resources {
requests = { cpu = "10m", memory = "128Mi" }
limits = { memory = "256Mi" }
}
}
```
4. **Final container order** in the `trading-bot-workers` Pod (4 containers):
- `[0]` signal-generator
- `[1]` learning-engine
- `[2]` market-data
- `[3]` meet-kevin-watcher (NEW)
Update `lifecycle.ignore_changes` accordingly: 4 image-index lines (`spec[0].template[0].spec[0].container[0..3].image`) plus the existing `dns_config` line.
5. Extend the `external_secret` ExternalSecret template with two new keys:
```
TRADING_ANTHROPIC_API_KEY = "{{ .anthropic_api_key }}"
TRADING_MEET_KEVIN_CHANNEL_ID = "{{ .meet_kevin_channel_id }}"
```
and add the matching `data` entries.
6. Extend `locals.common_env`:
```
TRADING_MEET_KEVIN_POLL_INTERVAL_SECONDS = "10800" # 3h
TRADING_MEET_KEVIN_DAILY_COST_CAP_USD = "5"
TRADING_MEET_KEVIN_LLM_MODEL = "claude-sonnet-4-6"
TRADING_MEET_KEVIN_PROMPT_VERSION = "v1"
```
No Helm changes, no ingress changes, no new namespace, no new PVC, no new Service.
### Vault secrets
Two new keys at `secret/trading-bot` (Vault KV v2):
```
anthropic_api_key = sk-ant-... # Claude API key (pay-per-use)
meet_kevin_channel_id = UCUvvj5lwue7PspotMDjk5UA
```
Populate via:
```
vault kv patch secret/trading-bot \
anthropic_api_key=sk-ant-... \
meet_kevin_channel_id=UCUvvj5lwue7PspotMDjk5UA
```
The existing ExternalSecret refresh interval (15m) picks them up.
## Container & dependency changes
`pyproject.toml` — new optional dependency group:
```toml
meet_kevin = [
"yt-dlp>=2026.04", # avoid the apt-shipped 2024.04.09 (bot-detected by YouTube)
"feedparser>=6.0",
"anthropic>=0.40",
]
```
`docker/Dockerfile.service` — add `EXTRAS=meet_kevin` to the install matrix so the watcher container has its deps.
## Observability
Reuses the existing OpenTelemetry → Prometheus pattern. New metrics (port 9097):
```
meet_kevin_videos_discovered_total{channel}
meet_kevin_captions_extracted_total{result="ok|no_captions|error"}
meet_kevin_llm_calls_total{result="ok|api_error|parse_error|cost_capped"}
meet_kevin_llm_cost_usd_total # counter
meet_kevin_llm_cost_usd_today # gauge, resets at UTC midnight
meet_kevin_pipeline_lag_seconds{stage} # published_at → stage transition
```
Existing pod-annotation scrape config auto-discovers the new port.
## Failure modes (consolidated)
| Stage | Failure | Behavior | User-visible signal |
|---|---|---|---|
| RSS | network / 5xx | log + retry next 3h | `/health` flags `last_poll_age > 6h` |
| RSS | XML parse error | log + skip cycle | same |
| Caption | no captions | `failed` / `no_captions` | "Reprocess" button (no-op v1) |
| Caption | yt-dlp / network | exp-backoff to 5x → `failed` | per-row "Reprocess" |
| LLM | API error | exp-backoff 3x → `failed` (raw response captured) | "Reprocess"; Raw JSON tab |
| LLM | malformed JSON | `failed`, raw response retained | Raw JSON tab |
| LLM | daily cost cap hit | stop processing, leave `captioned` | pipeline-status badge: "Cost cap reached, resumes 00:00 UTC" |
## Testing strategy
Match the existing test-quality bar (1 behavior per test, no internal mocking of own code).
```
tests/services/meet_kevin_watcher/
test_rss_parser.py fixture: committed Meet Kevin RSS XML; parse + dedupe paths
test_caption_extractor.py mock yt-dlp subprocess; SRT parse, no_captions path
test_llm_analyzer.py mock anthropic.Anthropic.messages.create; prompt structure,
response validation, cost calc, daily cap logic
test_pipeline.py status transitions, retry, failure paths
(pytest-asyncio + in-memory SQLite where viable)
tests/integration/test_meet_kevin_e2e.py [@pytest.mark.integration]
full Postgres + Redis; mock the external HTTP layer only (RSS, yt-dlp, Anthropic);
assert discovered → captioned → analyzed flow + ticker mentions row count
tests/api_gateway/routes/test_meet_kevin.py
FastAPI TestClient per endpoint, auth bypass via existing dev-mode flag
dashboard/src/pages/meet-kevin/__tests__/ (manual QA only — repo has no FE test suite)
```
Target: ~30 unit tests + 1 integration test + 8 API tests.
## Future work (v2 candidates)
1. **Signal injection into `signal_generator`**: new `kevin:signals` Redis stream, new "kevin" strategy weight in the ensemble, opt-in approval queue in the UI.
2. **Whisper API fallback** when captions are missing — currently `no_captions` is terminal.
3. **Multi-channel support** in the UI (schema already supports it).
4. **Slack notification** on new high-conviction mention (`conviction > 0.8`).
5. **Auto re-analysis on prompt version bump** — currently manual via "Reprocess" button.
6. **Audio archival** to NFS for offline reprocessing or training data.