docs: map existing codebase
This commit is contained in:
parent
d36ae40df1
commit
bc34c78072
7 changed files with 2003 additions and 0 deletions
246
.planning/codebase/ARCHITECTURE.md
Normal file
246
.planning/codebase/ARCHITECTURE.md
Normal file
|
|
@ -0,0 +1,246 @@
|
|||
# Architecture
|
||||
|
||||
**Analysis Date:** 2025-02-23
|
||||
|
||||
## Pattern Overview
|
||||
|
||||
**Overall:** Event-driven microservices architecture using Redis Streams for inter-service communication.
|
||||
|
||||
**Key Characteristics:**
|
||||
- 8 independent Python services communicating asynchronously via Redis Streams
|
||||
- Layered abstraction for extensibility: broker abstraction (`BaseBroker`), strategy abstraction (`BaseStrategy`), source abstraction for news
|
||||
- Each service consumes from one or more Redis Streams, processes data, persists to PostgreSQL, and publishes downstream results
|
||||
- Shared libraries provide domain models, database layer, telemetry, and abstractions
|
||||
- React TypeScript frontend connected via WebSocket and REST API
|
||||
|
||||
## Layers
|
||||
|
||||
**Service Layer (Event Processors):**
|
||||
- Purpose: Independent microservices that transform data across the trading pipeline
|
||||
- Location: `services/*/main.py` (entry points)
|
||||
- Contains: RSS/Reddit fetchers, sentiment analysis, signal generation, trade execution, learning engine, market data subscriptions, API gateway
|
||||
- Depends on: `shared/` libraries, external APIs (Alpaca, Reddit, Ollama)
|
||||
- Used by: Each other (via Redis Streams), dashboard (via API gateway)
|
||||
|
||||
**Infrastructure/Abstraction Layer:**
|
||||
- Purpose: Pluggable interfaces and database access
|
||||
- Location: `shared/broker/`, `shared/strategies/`, `shared/db.py`
|
||||
- Contains: `BaseBroker` (Alpaca implementation), `BaseStrategy` (Momentum/MeanReversion/NewsDriven), database engine setup
|
||||
- Depends on: SQLAlchemy, alpaca-py, external ML models
|
||||
- Used by: Trade executor, signal generator, backtester
|
||||
|
||||
**Data Layer:**
|
||||
- Purpose: Persistent storage of trades, signals, articles, market data, user credentials
|
||||
- Location: `shared/models/` (SQLAlchemy ORM models)
|
||||
- Contains: 14 tables including `trades`, `signals`, `articles`, `article_sentiments`, `market_data`, `positions`, `users`, `strategies`, strategy weights history
|
||||
- Depends on: PostgreSQL + TimescaleDB extension
|
||||
- Used by: All services for read/write persistence
|
||||
|
||||
**Messaging Layer:**
|
||||
- Purpose: Asynchronous event streaming between services
|
||||
- Location: `shared/redis_streams.py`
|
||||
- Contains: `StreamPublisher` (XADD), `StreamConsumer` (XREADGROUP with consumer groups)
|
||||
- Depends on: Redis Streams
|
||||
- Used by: All services for pub/sub communication
|
||||
|
||||
**Frontend Layer:**
|
||||
- Purpose: User-facing dashboard for monitoring and control
|
||||
- Location: `dashboard/src/`
|
||||
- Contains: React pages, components, API client, WebAuthn auth
|
||||
- Depends on: TanStack Query, TradingView charts, Recharts, React Router
|
||||
- Used by: End users for real-time monitoring
|
||||
|
||||
**Utility Layer:**
|
||||
- Purpose: Cross-cutting concerns
|
||||
- Location: `shared/config.py`, `shared/telemetry.py`, `shared/schemas/`
|
||||
- Contains: Configuration management (Pydantic BaseSettings), OpenTelemetry/Prometheus metrics setup, message schemas (Pydantic v2)
|
||||
- Depends on: pydantic-settings, opentelemetry-sdk, prometheus-client
|
||||
- Used by: All services
|
||||
|
||||
## Data Flow
|
||||
|
||||
**News Ingestion Pipeline:**
|
||||
|
||||
1. `services/news_fetcher/main.py` polls RSS feeds and Reddit on independent schedules
|
||||
2. Uses `RSSSource` and `RedditSource` to fetch articles
|
||||
3. Deduplicates via Redis SET (`news:seen_hashes`)
|
||||
4. Publishes `RawArticle` messages to `news:raw` Redis Stream
|
||||
|
||||
**Sentiment Analysis:**
|
||||
|
||||
1. `services/sentiment_analyzer/main.py` consumes from `news:raw`
|
||||
2. Runs articles through `FinBERTAnalyzer` for sentiment scores
|
||||
3. Falls back to `OllamaAnalyzer` if confidence < threshold (configured via `SentimentAnalyzerConfig.finbert_confidence_threshold`)
|
||||
4. Extracts ticker mentions via regex in `ticker_extractor.py`
|
||||
5. Persists `Article` + `ArticleSentiment` to database
|
||||
6. Publishes `ScoredArticle` messages (one per ticker per article) to `news:scored` Redis Stream
|
||||
|
||||
**Market Data Collection:**
|
||||
|
||||
1. `services/market_data/main.py` fetches OHLCV bars from Alpaca (historical + live)
|
||||
2. Publishes bars to `market:bars` Redis Stream as ticker-specific snapshots
|
||||
3. Consumed by signal generator for technical indicators
|
||||
|
||||
**Signal Generation:**
|
||||
|
||||
1. `services/signal_generator/main.py` consumes from both `news:scored` and `market:bars`
|
||||
2. Maintains in-memory buffers: per-ticker sentiment context (deque of last 50 scores) and market data (SMA, RSI)
|
||||
3. Runs `WeightedEnsemble` combining three strategies:
|
||||
- `MomentumStrategy`: SMA cross-over based
|
||||
- `MeanReversionStrategy`: RSI-based
|
||||
- `NewsDrivenStrategy`: Sentiment-aware
|
||||
4. Weights loaded from Redis cache (default equal weights: 0.333 each)
|
||||
5. Only emits `TradeSignal` when combined strength > threshold (default 0.3)
|
||||
6. Publishes to `signals:generated` Redis Stream, persists `Signal` model to database
|
||||
|
||||
**Trade Execution:**
|
||||
|
||||
1. `services/trade_executor/main.py` consumes from `signals:generated`
|
||||
2. `RiskManager` performs pre-trade checks:
|
||||
- Market hours validation (9:30 AM - 4:00 PM ET)
|
||||
- Max daily loss limit
|
||||
- Per-ticker position limits
|
||||
- Cooldown period after exits
|
||||
3. Calculates position size based on account equity and risk parameters
|
||||
4. Submits order via `AlpacaBroker` (abstraction over alpaca-py)
|
||||
5. Records `Trade` model with order details
|
||||
6. Publishes `TradeExecution` to `trades:executed` Redis Stream
|
||||
|
||||
**Learning & Weight Adjustment:**
|
||||
|
||||
1. `services/learning_engine/main.py` consumes from `trades:executed`
|
||||
2. Identifies position closes by tracking opening trades in Redis hash (`positions:history`)
|
||||
3. `TradeEvaluator` calculates realized P&L
|
||||
4. `WeightAdjuster` uses multi-armed bandit approach:
|
||||
- Attributes P&L credit to contributing strategies
|
||||
- Adjusts weights (with guardrails: min trades threshold, max shift limit, weight floor)
|
||||
- Persists adjustments to `LearningAdjustment` model for auditability
|
||||
5. Saves updated weights to Redis cache
|
||||
|
||||
**Portfolio Sync (Background Task):**
|
||||
|
||||
1. `services/api_gateway/tasks/portfolio_sync.py` runs as background coroutine
|
||||
2. Periodically queries Alpaca for current positions, cash, equity
|
||||
3. Stores `PortfolioSnapshot` in database for historical tracking
|
||||
4. Dashboard polls portfolio endpoint for real-time data
|
||||
|
||||
**API Gateway & Dashboard:**
|
||||
|
||||
1. `services/api_gateway/main.py` FastAPI app with:
|
||||
- REST endpoints (`/api/signals`, `/api/trades`, `/api/portfolio`, `/api/news`, `/api/strategies`, `/api/backtest`, `/api/controls`)
|
||||
- WebSocket endpoint for real-time updates
|
||||
- WebAuthn/JWT authentication
|
||||
2. Dashboard (`dashboard/src/App.tsx`) consumed by authenticated users
|
||||
3. Real-time market data, trades, signals streamed via WebSocket
|
||||
|
||||
**State Management:**
|
||||
|
||||
- **Redis Streams**: Messages flow unidirectionally; consumer groups ensure at-least-once processing
|
||||
- **Redis Cache**: Strategy weights, position history, deduplication hashes
|
||||
- **PostgreSQL**: Long-term storage of trades, signals, articles, market snapshots, users
|
||||
- **In-Memory**: Services maintain per-ticker buffers (sentiment scores, market data) for current processing cycle; cleared/rebuilt as data arrives
|
||||
- **JWT Sessions**: Issued at login, renewed via refresh token endpoint
|
||||
|
||||
## Key Abstractions
|
||||
|
||||
**BaseBroker:**
|
||||
- Purpose: Swap brokerage providers without changing services
|
||||
- Examples: `shared/broker/base.py` (ABC), `shared/broker/alpaca_broker.py` (implementation)
|
||||
- Pattern: Abstract methods for `submit_order()`, `get_account()`, `get_positions()`, `cancel_order()`
|
||||
- Trade executor only knows `BaseBroker` interface; `AlpacaBroker` is a concrete implementation
|
||||
|
||||
**BaseStrategy:**
|
||||
- Purpose: Pluggable trading strategy implementations
|
||||
- Examples: `shared/strategies/base.py` (ABC), `momentum.py`, `mean_reversion.py`, `news_driven.py`
|
||||
- Pattern: Each strategy implements `async evaluate(ticker, market, sentiment)` returning `TradeSignal | None`
|
||||
- Signal generator instantiates all three at startup, runs via `WeightedEnsemble.evaluate()`
|
||||
|
||||
**StreamPublisher / StreamConsumer:**
|
||||
- Purpose: Simplified Redis Streams wrapper with JSON serialization
|
||||
- Examples: `shared/redis_streams.py`
|
||||
- Pattern: `Publisher.publish(dict)` → `XADD`, `Consumer.consume()` → async iterator yielding `(msg_id, data)`
|
||||
- Consumer groups ensure each message is processed once per group
|
||||
|
||||
**Pydantic Schemas:**
|
||||
- Purpose: Message validation and serialization
|
||||
- Examples: `shared/schemas/news.py`, `shared/schemas/trading.py`, `shared/schemas/learning.py`
|
||||
- Pattern: Each stream has a schema (e.g., `RawArticle`, `ScoredArticle`, `TradeSignal`, `TradeExecution`)
|
||||
- Services validate on consume, serialize with `model_dump(mode="json")` before publish
|
||||
|
||||
## Entry Points
|
||||
|
||||
**News Fetcher:**
|
||||
- Location: `services/news_fetcher/main.py`
|
||||
- Triggers: Starts as standalone service, polling on timers
|
||||
- Responsibilities: Fetch RSS/Reddit articles, deduplicate, publish to `news:raw`
|
||||
|
||||
**Sentiment Analyzer:**
|
||||
- Location: `services/sentiment_analyzer/main.py`
|
||||
- Triggers: Consumes messages from `news:raw` stream
|
||||
- Responsibilities: Score sentiment (FinBERT + Ollama fallback), extract tickers, persist articles, publish to `news:scored`
|
||||
|
||||
**Market Data Service:**
|
||||
- Location: `services/market_data/main.py`
|
||||
- Triggers: Starts as standalone service, fetches on startup + live streaming
|
||||
- Responsibilities: Fetch OHLCV bars from Alpaca, publish to `market:bars`
|
||||
|
||||
**Signal Generator:**
|
||||
- Location: `services/signal_generator/main.py`
|
||||
- Triggers: Consumes from both `news:scored` and `market:bars` streams
|
||||
- Responsibilities: Combine sentiment + technical data, run strategy ensemble, publish to `signals:generated`
|
||||
|
||||
**Trade Executor:**
|
||||
- Location: `services/trade_executor/main.py`
|
||||
- Triggers: Consumes from `signals:generated` stream
|
||||
- Responsibilities: Risk check, position sizing, order submission, trade recording, publish to `trades:executed`
|
||||
|
||||
**Learning Engine:**
|
||||
- Location: `services/learning_engine/main.py`
|
||||
- Triggers: Consumes from `trades:executed` stream
|
||||
- Responsibilities: Evaluate closed positions, adjust strategy weights, audit adjustments
|
||||
|
||||
**API Gateway:**
|
||||
- Location: `services/api_gateway/main.py`
|
||||
- Triggers: HTTP server listening on configurable port (default 8000)
|
||||
- Responsibilities: REST/WebSocket API, user authentication (WebAuthn), portfolio sync task, query database
|
||||
|
||||
**Backtester:**
|
||||
- Location: `backtester/engine.py`
|
||||
- Triggers: Called from API endpoint `/api/backtest` (POST)
|
||||
- Responsibilities: Historical replay with `SimulatedBroker`, metrics calculation, equity curve generation
|
||||
|
||||
## Error Handling
|
||||
|
||||
**Strategy:** Graceful degradation and event loop resilience.
|
||||
|
||||
**Patterns:**
|
||||
- **Service-level**: Catch broad `Exception` in polling loops (news fetcher), log and continue; metrics counters increment for monitoring
|
||||
- **Fallback chains**: Sentiment analyzer tries FinBERT first, falls back to Ollama on low confidence
|
||||
- **Risk checks**: Trade executor rejects signals that fail risk checks (logs reason, continues)
|
||||
- **Database retries**: If `IntegrityError` on article insert (duplicate), continue without re-raising
|
||||
- **Stream consumption**: Consumer groups handle delivery guarantees; manual `XACK` after processing ensures fault tolerance
|
||||
- **Shutdown**: `asyncio.TaskGroup` and signal handlers ensure graceful termination; cleanup of DB/Redis connections in FastAPI lifespan
|
||||
|
||||
## Cross-Cutting Concerns
|
||||
|
||||
**Logging:**
|
||||
- Each service uses Python `logging` module with service name
|
||||
- All exceptions logged via `logger.exception()` for stack traces
|
||||
- Key events logged at INFO level (article counts, trades executed, weight updates)
|
||||
|
||||
**Validation:**
|
||||
- Pydantic v2 schemas validate all messages on consume
|
||||
- Database models use SQLAlchemy constraints (NOT NULL, UNIQUE, FOREIGN KEY)
|
||||
- Config classes extend `BaseSettings` with environment variable prefixes (`TRADING_*`)
|
||||
|
||||
**Authentication:**
|
||||
- WebAuthn (passkey) via `webauthn` PyPI package (`services/api_gateway/auth/`)
|
||||
- JWT tokens issued at login, verified on every protected endpoint via `get_current_user` dependency
|
||||
- Credentials stored as `UserCredential` model (credential ID, public key, counter)
|
||||
|
||||
**Telemetry:**
|
||||
- OpenTelemetry meters instantiated per service via `setup_telemetry(service_name, metrics_port)`
|
||||
- Counters: `articles_fetched`, `finbert_count`, `ollama_count`, `rejections`, `trades_executed`
|
||||
- Histograms: `fill_latency`, sentiment analysis duration, strategy evaluation time
|
||||
- Prometheus scrapes `/metrics` endpoint (default port 9090) from each service
|
||||
- No Prometheus/Grafana deployed; uses existing infrastructure for metrics collection
|
||||
385
.planning/codebase/CONCERNS.md
Normal file
385
.planning/codebase/CONCERNS.md
Normal file
|
|
@ -0,0 +1,385 @@
|
|||
# Codebase Concerns
|
||||
|
||||
**Analysis Date:** 2026-02-23
|
||||
|
||||
## Tech Debt
|
||||
|
||||
**Incomplete API Portfolio Metrics:**
|
||||
- Issue: Portfolio metrics endpoint returns hardcoded placeholder values instead of computed analytics
|
||||
- Files: `services/api_gateway/routes/portfolio.py` (lines 90-92, 170, 172)
|
||||
- Impact: Dashboard displays incorrect P&L, drawdown, and hold duration metrics; client decisions based on false data
|
||||
- Fix approach: Implement cumulative P&L tracking from first portfolio snapshot, compute max drawdown from historical snapshots, calculate average hold duration from trade outcomes table, implement Redis trading pause flag integration
|
||||
|
||||
**Missing Portfolio Sync Credentials Handling:**
|
||||
- Issue: Portfolio sync task silently disables when Alpaca credentials are missing, with no alerting mechanism
|
||||
- Files: `services/api_gateway/tasks/portfolio_sync.py` (lines 125-129)
|
||||
- Impact: Dashboard shows stale or zero portfolio data without user awareness; critical data source failure goes unnoticed
|
||||
- Fix approach: Add explicit warning log at startup, implement health check endpoint that reports sync status, consider sending alerts to dashboard via WebSocket
|
||||
|
||||
**Broad Exception Handling:**
|
||||
- Issue: Multiple services catch bare `Exception` without specific error classification
|
||||
- Files: `services/sentiment_analyzer/main.py` (lines 150, 207, 226), `services/trade_executor/main.py` (lines 136, 203, 222), `services/signal_generator/main.py` (lines 87, 178, 194, 257), `services/learning_engine/main.py` (lines 304), `services/market_data/main.py` (lines 108, 158, 253), `services/api_gateway/tasks/portfolio_sync.py` (lines 152)
|
||||
- Impact: Errors silently logged without retry logic, potential data loss, operational issues invisible to monitoring
|
||||
- Fix approach: Implement specific exception handling (IntegrityError, APIError, ValidationError, etc.), add telemetry counters for error categories, implement exponential backoff for transient failures
|
||||
|
||||
**Implicit Database Persistence Fallback:**
|
||||
- Issue: Multiple services (sentiment-analyzer, trade-executor) silently continue if DB initialization fails
|
||||
- Files: `services/sentiment_analyzer/main.py` (lines 203-208), `services/trade_executor/main.py` (lines 199-204), `services/signal_generator/main.py` (initial DB setup)
|
||||
- Impact: Services run without persistence layer — no audit trail, no dashboard data, potential duplicate processing if service restarts
|
||||
- Fix approach: Fail fast on DB connection errors, or implement persistent message queue (file-based) as fallback to ensure no trades are lost
|
||||
|
||||
**Learning Engine Position State In Redis Only:**
|
||||
- Issue: Opening trades stored in Redis `positions:history` hash with no persistence — loss of intermediate state if service crashes
|
||||
- Files: `services/learning_engine/main.py` (lines 65-82, 121-138)
|
||||
- Impact: Position close detection fails after service restart; partial trades with no close detected = unrealistic P&L
|
||||
- Fix approach: Move position history to database with persistent transaction log, implement consistent read-modify-write with transactions
|
||||
|
||||
---
|
||||
|
||||
## Known Bugs
|
||||
|
||||
**Portfolio Sync N+1 Query Pattern:**
|
||||
- Symptoms: One SELECT per position during portfolio sync; if 100 positions, 100 database queries per cycle
|
||||
- Files: `services/api_gateway/tasks/portfolio_sync.py` (lines 74-92)
|
||||
- Trigger: Any portfolio with multiple open positions
|
||||
- Workaround: None; will degrade performance as position count grows
|
||||
- Fix: Use `SELECT * WHERE ticker IN (...)` to fetch all positions in one query, then update in-memory dict
|
||||
|
||||
**Daily P&L Calculation Bug:**
|
||||
- Symptoms: Portfolio endpoint returns same value for `daily_pnl_pct` and `total_pnl_pct`
|
||||
- Files: `services/api_gateway/routes/portfolio.py` (lines 90-91)
|
||||
- Trigger: Every portfolio request
|
||||
- Root cause: Code reuses `daily_pnl_pct` for `total_pnl_pct` instead of computing cumulative metric
|
||||
- Workaround: Manually inspect database portfolio_snapshots for correct metrics
|
||||
- Fix: Compute cumulative P&L from first snapshot timestamp to now
|
||||
|
||||
**Position Unrealized P&L Division by Zero:**
|
||||
- Symptoms: Potential division by zero when `p.qty == 0` (closed position)
|
||||
- Files: `services/api_gateway/routes/portfolio.py` (lines 116-117, 120-121)
|
||||
- Trigger: When position is exactly 0 shares (edge case in reconciliation)
|
||||
- Workaround: Guards in place (`if p.qty`) but logic is error-prone
|
||||
- Fix: Refactor to use `if p.qty and p.avg_entry` guard early, simplify arithmetic
|
||||
|
||||
**Signal Sentiment Context Missing Price Field:**
|
||||
- Symptoms: Risk manager reads `sentiment_context["current_price"]` but signal may not include it
|
||||
- Files: `services/trade_executor/risk_manager.py` (lines 118-120)
|
||||
- Trigger: If signal is generated without market snapshot or price is omitted
|
||||
- Workaround: Falls back to 0.0, position sizing becomes incorrect
|
||||
- Fix: Ensure signal generator always includes current_price in sentiment_context, add assertion in executor
|
||||
|
||||
---
|
||||
|
||||
## Security Considerations
|
||||
|
||||
**JWT Secret Key Not Enforced in Development:**
|
||||
- Risk: Default development deployments may use weak or missing JWT secret keys
|
||||
- Files: `services/api_gateway/config.py` (lines 13-15), `services/api_gateway/auth/jwt.py` (lines 41, 96)
|
||||
- Current mitigation: Config requires `jwt_secret_key` but does not validate length/entropy
|
||||
- Recommendations:
|
||||
- Add validation in ApiGatewayConfig that secret is >= 32 bytes
|
||||
- Fail startup if secret is not set (don't use default)
|
||||
- Add warning log if secret is weak (< 256 bits)
|
||||
- Document minimum secret requirements in README
|
||||
|
||||
**Test Keys Insufficient for Production:**
|
||||
- Risk: Test suites use hardcoded short keys that would be cryptographically weak in production
|
||||
- Files: Multiple test files use test keys like `"test-secret-key"`
|
||||
- Current mitigation: Only in test environment
|
||||
- Recommendations:
|
||||
- Use pytest fixture to generate strong keys per test
|
||||
- Document that tests should not be deployed as-is
|
||||
- Consider environment variable override in CI
|
||||
|
||||
**CORS Configuration Hardcoded for Localhost:**
|
||||
- Risk: Default CORS allows only `http://localhost:5173`, but production deployment requires environment override
|
||||
- Files: `services/api_gateway/config.py` (line 26)
|
||||
- Current mitigation: CORS is at least restricted (not `*`), but easy to miss in production deploy
|
||||
- Recommendations:
|
||||
- Make CORS_ORIGINS a required config variable (no default)
|
||||
- Add validation that origins use HTTPS in production
|
||||
- Log warning if origin is localhost in production environment
|
||||
|
||||
**Passkey Registration Not Rate-Limited:**
|
||||
- Risk: WebAuthn registration endpoint (if present) has no rate limiting, allowing brute-force attacks
|
||||
- Files: Not fully reviewed; check `services/api_gateway/auth/routes.py`
|
||||
- Current mitigation: Unknown
|
||||
- Recommendations:
|
||||
- Add rate limiting (redis-backed) on registration endpoints
|
||||
- Implement CAPTCHA or email verification for new accounts
|
||||
- Log failed registration attempts for monitoring
|
||||
|
||||
**No HTTPS Enforcement in API Gateway:**
|
||||
- Risk: JWT tokens transmitted over plaintext HTTP in development config
|
||||
- Files: `services/api_gateway/config.py` (line 31 uses `http://localhost:5173`)
|
||||
- Current mitigation: Only affects localhost development
|
||||
- Recommendations:
|
||||
- Add `enforce_https` config flag; reject insecure origins in production
|
||||
- Set `secure` flag on JWT cookies
|
||||
- Use HSTS headers in FastAPI
|
||||
|
||||
---
|
||||
|
||||
## Performance Bottlenecks
|
||||
|
||||
**Sentiment Scores Stored in Unbounded Deque:**
|
||||
- Problem: Signal generator stores up to 50 sentiment scores per ticker in memory (never evicted from running process)
|
||||
- Files: `services/signal_generator/main.py` (line 35, lines 107-111)
|
||||
- Cause: deques are bounded but not persisted; high-frequency tickers accumulate memory
|
||||
- Current: 50 * ~5000 tickers = ~250K floats in memory during high volume
|
||||
- Improvement path:
|
||||
- Store sentiment history in database (timeseries table) instead of memory
|
||||
- Query last N scores on-demand per signal generation
|
||||
- Add TTL/expiration for old scores
|
||||
|
||||
**Market Data Manager Holds Full OHLCV History:**
|
||||
- Problem: `MarketDataManager` buffers all bars received for each ticker in memory
|
||||
- Files: `services/signal_generator/main.py` (referenced from main)
|
||||
- Cause: No bounded retention; running service accumulates bars indefinitely
|
||||
- Impact: Memory leak over weeks of operation
|
||||
- Improvement path:
|
||||
- Implement sliding window (keep last 100 bars only)
|
||||
- Persist bars to database for strategy backtesting
|
||||
- Use TimescaleDB hypertable compression for efficient historical queries
|
||||
|
||||
**Portfolio Sync Deletes and Re-inserts All Positions:**
|
||||
- Problem: Every sync cycle deletes all local positions and re-inserts from broker
|
||||
- Files: `services/api_gateway/tasks/portfolio_sync.py` (lines 94-101)
|
||||
- Cause: Simplistic approach; doesn't diff changes
|
||||
- Impact: Every 60s delete cascade on Position table (triggers cascade on related trades?)
|
||||
- Improvement path:
|
||||
- Implement diff-based upsert: only update changed tickers
|
||||
- Delete positions not in broker snapshot
|
||||
- Reduces churn, avoids potential foreign key cascade issues
|
||||
|
||||
**No Connection Pooling Tuning:**
|
||||
- Problem: AsyncPG default pool size may be insufficient for 7 services + API load
|
||||
- Files: `shared/db.py` (line 13-16)
|
||||
- Cause: `create_async_engine` uses default pool_size
|
||||
- Impact: Connection starvation under load, query timeouts
|
||||
- Improvement path:
|
||||
- Add `pool_size` and `max_overflow` config to BaseConfig
|
||||
- Monitor pool utilization (add telemetry)
|
||||
- Right-size pools per environment
|
||||
|
||||
**Alpaca API Calls Not Cached:**
|
||||
- Problem: Every trade executor and portfolio sync makes synchronous Alpaca API calls for account/positions
|
||||
- Files: `services/trade_executor/main.py` (line 74), `services/api_gateway/tasks/portfolio_sync.py` (line 64)
|
||||
- Cause: No request deduplication or caching layer
|
||||
- Impact: Slow order submission, API rate limit risks
|
||||
- Improvement path:
|
||||
- Cache account snapshot in Redis (TTL 5s)
|
||||
- Batch position queries where possible
|
||||
- Implement exponential backoff for API failures
|
||||
|
||||
---
|
||||
|
||||
## Fragile Areas
|
||||
|
||||
**Sentiment Analyzer Ollama Fallback Untested:**
|
||||
- Files: `services/sentiment_analyzer/analyzers/ollama_analyzer.py`, `services/sentiment_analyzer/main.py` (lines 71-79)
|
||||
- Why fragile: Fallback path (Ollama) rarely exercised if FinBERT always has high confidence
|
||||
- Safe modification: Add integration test that mocks FinBERT confidence < threshold, verifies Ollama called
|
||||
- Test coverage: No explicit test for fallback path
|
||||
- Impact: If FinBERT fails completely, Ollama fallback could silently degrade or crash
|
||||
|
||||
**Signal Generator Ensemble Evaluation:**
|
||||
- Files: `services/signal_generator/main.py` (line 149), `services/signal_generator/ensemble.py`
|
||||
- Why fragile: Three strategies weighted by Redis weights; if weights become NaN or negative, ensemble breaks silently
|
||||
- Safe modification: Add assertions that weights are valid (sum~1, all in [0,1]), implement weight validation on load
|
||||
- Test coverage: No explicit test for weight anomalies (e.g., all weights corrupted)
|
||||
- Impact: Invalid weights could produce signals with undefined strength
|
||||
|
||||
**Risk Manager Market Hours Check Time Zone:**
|
||||
- Files: `services/trade_executor/risk_manager.py` (lines 19-25, 59-61)
|
||||
- Why fragile: Hard-coded ET timezone; DST transitions not handled explicitly
|
||||
- Safe modification: Use ZoneInfo with explicit DST handling (Python 3.9+), add tests for DST boundaries
|
||||
- Test coverage: No explicit DST test
|
||||
- Impact: Orders placed 1 hour off during DST transition
|
||||
|
||||
**Learning Engine Position Matching Logic:**
|
||||
- Files: `services/learning_engine/main.py` (lines 85-98, 121-138)
|
||||
- Why fragile: String comparison of side enum (`opening_side == OrderSide.BUY.value`)
|
||||
- Safe modification: Use enum comparison directly, add type hints, add unit tests for all side combinations
|
||||
- Test coverage: Tests exist but may not cover edge cases (e.g., SELL -> BUY -> SELL)
|
||||
- Impact: Position close detection fails, P&L not recorded, weights never adjusted
|
||||
|
||||
**WebAuthn Credential Validation:**
|
||||
- Files: `services/api_gateway/auth/routes.py` (check WebAuthn implementation)
|
||||
- Why fragile: WebAuthn implementation may not validate all required fields (origin, user verification, etc.)
|
||||
- Safe modification: Review against OWASP WebAuthn checklist, add explicit assertion tests
|
||||
- Test coverage: Limited test coverage for security properties
|
||||
- Impact: Weak authentication allows account takeover
|
||||
|
||||
---
|
||||
|
||||
## Scaling Limits
|
||||
|
||||
**Redis Single Node No Persistence:**
|
||||
- Current capacity: In-memory Redis instance; loses all data on restart
|
||||
- Limit: All stream data (news, signals, trades) lost if Redis crashes
|
||||
- Blocks: Reliable message delivery, audit trails, service recovery
|
||||
- Scaling path:
|
||||
- Enable AOF (append-only file) persistence
|
||||
- Consider Redis Cluster for HA
|
||||
- Alternatively, add database-backed message queue (e.g., TimescaleDB FIFO table)
|
||||
|
||||
**PostgreSQL Shared Memory Not Tuned:**
|
||||
- Current capacity: Default shared_buffers, work_mem, etc.
|
||||
- Limit: Query performance degrades with 7 concurrent services
|
||||
- Scaling path:
|
||||
- Tune PostgreSQL config: shared_buffers, work_mem, effective_cache_size per machine
|
||||
- Add read replicas for reporting queries
|
||||
- Partition portfolio snapshots by date (TimescaleDB hypertables)
|
||||
|
||||
**Dashboard WebSocket No Backpressure:**
|
||||
- Current capacity: Single WebSocket connection broadcasts all events
|
||||
- Limit: Slow clients blocked, queue unbounded memory growth
|
||||
- Scaling path:
|
||||
- Implement client subscription topics (only send relevant tickers)
|
||||
- Add message batching (send updates every 100ms)
|
||||
- Implement client-side connection pooling
|
||||
|
||||
**No Circuit Breaker for Alpaca API:**
|
||||
- Current capacity: Alpaca API calls block entire trade executor on timeout
|
||||
- Limit: One slow Alpaca response stalls all signal processing
|
||||
- Scaling path:
|
||||
- Implement circuit breaker pattern (fail-open after N failures)
|
||||
- Add request timeout with retry logic
|
||||
- Queue unprocessable signals for retry after recovery
|
||||
|
||||
---
|
||||
|
||||
## Dependencies at Risk
|
||||
|
||||
**Ollama Local LLM Not Containerized with Stack:**
|
||||
- Risk: Ollama runs separately; must be manually started for sentiment analyzer to work
|
||||
- Files: `docker-compose.yml` (check if Ollama service present)
|
||||
- Impact: Sentiment analyzer fails silently if Ollama not running; fallback depends on FinBERT
|
||||
- Migration plan: Add Ollama to docker-compose.yml, or make Ollama truly optional with better error handling
|
||||
|
||||
**FinBERT Model Download Not Cached:**
|
||||
- Risk: FinBERT downloads model from HuggingFace on first run (~600MB)
|
||||
- Files: `services/sentiment_analyzer/analyzers/finbert.py`
|
||||
- Impact: Long startup time, may timeout in resource-constrained environments
|
||||
- Migration plan: Pre-download model into Docker image, add model cache invalidation strategy
|
||||
|
||||
**Alpaca SDK Direct HTTP Calls Not Mocked in Integration Tests:**
|
||||
- Risk: Integration tests mock Alpaca SDK, so real behavior never tested
|
||||
- Files: `tests/services/test_trade_executor.py` (check mocks), `tests/integration/`
|
||||
- Impact: Order submission bugs discovered only in production
|
||||
- Migration plan: Implement live integration tests against Alpaca paper trading, verify in CI/CD
|
||||
|
||||
**SQLAlchemy 2.0 Async Mode Requires Specific Dialect:**
|
||||
- Risk: AsyncPG optional dependency; falls back to sync driver if missing
|
||||
- Files: `shared/db.py` (line 13 uses `postgresql+asyncpg://`)
|
||||
- Impact: If asyncpg not installed, services run in sync mode, defeating async benefits
|
||||
- Migration plan: Add explicit check in `create_db()` to fail if asyncpg missing
|
||||
|
||||
---
|
||||
|
||||
## Missing Critical Features
|
||||
|
||||
**No Duplicate Trade Detection:**
|
||||
- Problem: If signal generator processes same article twice (due to Redis consumer group lag), executor may place duplicate trades
|
||||
- Blocks: Multi-service reliability, audit accuracy
|
||||
- Scope: Implement idempotency key in TradeExecution, check signal_id before placing order
|
||||
|
||||
**No Circuit Break / Trading Pause Mechanism:**
|
||||
- Problem: If models degrade, bot continues trading losses
|
||||
- Blocks: Risk mitigation, manual emergency stop
|
||||
- Current: `TRADING_PAUSED_KEY` in Redis (`services/api_gateway/routes/controls.py`), but not enforced in executor
|
||||
- Scope: Check executor consumes trading pause flag before submitting orders
|
||||
|
||||
**No Backtester Trade Slippage Model:**
|
||||
- Problem: Backtester assumes fills at exact signal price; unrealistic
|
||||
- Blocks: Accurate backtest results, risk estimation
|
||||
- Scope: Add configurable slippage model (fixed basis points + volume-based)
|
||||
|
||||
**No Strategy Parameter Sensitivity Analysis:**
|
||||
- Problem: Can't understand which parameters drive P&L
|
||||
- Blocks: Scientific strategy optimization, risk understanding
|
||||
- Scope: Implement parameter sweep in backtester, record sensitivity metrics
|
||||
|
||||
**No Live Trade Reconciliation:**
|
||||
- Problem: If Alpaca and local DB diverge, no auto-correction
|
||||
- Blocks: Accurate position tracking, tax reporting
|
||||
- Scope: Implement daily reconciliation job that compares broker vs. DB, alerts on mismatch
|
||||
|
||||
---
|
||||
|
||||
## Test Coverage Gaps
|
||||
|
||||
**Dashboard Components Not Unit Tested:**
|
||||
- What's not tested: React components (portfolio display, trade table, charts, alerts)
|
||||
- Files: `dashboard/src/components/`, `dashboard/src/pages/`
|
||||
- Risk: Component bugs, broken UI layouts only caught in E2E (if at all)
|
||||
- Priority: High — dashboard is user-facing; regressions impact usability
|
||||
|
||||
**API Routes Happy Path Only:**
|
||||
- What's not tested: Edge cases (missing fields, invalid enum values, concurrent requests)
|
||||
- Files: `tests/services/test_api_routes.py` (13 tests covering basic CRUD)
|
||||
- Risk: API crashes on edge inputs, poor error messages
|
||||
- Priority: Medium — most users use dashboard, not direct API
|
||||
|
||||
**Market Data Manager Not Unit Tested:**
|
||||
- What's not tested: Deque overflow behavior, snapshot generation with missing bars
|
||||
- Files: `services/signal_generator/market_data.py` (no test file)
|
||||
- Risk: Ensemble generates signals with incomplete data
|
||||
- Priority: High — directly impacts signal quality
|
||||
|
||||
**Learning Engine Weight Adjustment Edge Cases:**
|
||||
- What's not tested: Weight convergence, NaN/inf propagation, simultaneous rewards for same strategy
|
||||
- Files: `services/learning_engine/weight_adjuster.py` (test coverage unknown)
|
||||
- Risk: Weights diverge to extremes, strategies disabled unintentionally
|
||||
- Priority: High — learning is core feature
|
||||
|
||||
**WebAuthn Registration and Verification:**
|
||||
- What's not tested: Invalid registration data, replay attacks, malformed challenge responses
|
||||
- Files: `services/api_gateway/auth/routes.py` (WebAuthn handlers)
|
||||
- Risk: Authentication bypass, account takeover
|
||||
- Priority: Critical — security sensitive
|
||||
|
||||
**Redis Streams Consumer Group Edge Cases:**
|
||||
- What's not tested: Consumer group recovery after crash, message redelivery, offset management
|
||||
- Files: `shared/redis_streams.py` (5 tests total)
|
||||
- Risk: Messages lost or duplicated after service restart
|
||||
- Priority: High — data reliability depends on it
|
||||
|
||||
**Concurrent Signal and Trade Execution:**
|
||||
- What's not tested: Race conditions when multiple signals processed simultaneously
|
||||
- Files: Integration tests mock this with single thread
|
||||
- Risk: Phantom positions, incorrect risk calculations
|
||||
- Priority: High — real production has concurrent load
|
||||
|
||||
---
|
||||
|
||||
## Recommendations for Production Readiness
|
||||
|
||||
**Before going live, address at minimum:**
|
||||
|
||||
1. **Critical (blocks production):**
|
||||
- Implement portfolio metrics computation (cumulative P&L, max drawdown)
|
||||
- Fix portfolio sync query performance (N+1 issue)
|
||||
- Validate JWT secret key entropy at startup
|
||||
- Add rate limiting to auth endpoints
|
||||
- Implement circuit breaker for Alpaca API calls
|
||||
- Add duplicate trade detection (idempotency)
|
||||
|
||||
2. **High (should address soon):**
|
||||
- Implement trade reconciliation job
|
||||
- Move position state from Redis to database
|
||||
- Add specific exception handling (not bare Exception)
|
||||
- Implement backpressure in WebSocket broadcasting
|
||||
- Test Ollama fallback path end-to-end
|
||||
- Add comprehensive dashboard unit tests
|
||||
|
||||
3. **Medium (can defer):**
|
||||
- Optimize portfolio sync with diff-based approach
|
||||
- Persist market data and sentiment to TimescaleDB
|
||||
- Tune PostgreSQL connection pool
|
||||
- Add DST test coverage for market hours
|
||||
- Implement parameter sensitivity analysis in backtester
|
||||
|
||||
---
|
||||
|
||||
*Concerns audit: 2026-02-23*
|
||||
276
.planning/codebase/CONVENTIONS.md
Normal file
276
.planning/codebase/CONVENTIONS.md
Normal file
|
|
@ -0,0 +1,276 @@
|
|||
# Coding Conventions
|
||||
|
||||
**Analysis Date:** 2026-02-23
|
||||
|
||||
## Naming Patterns
|
||||
|
||||
**Files (Python):**
|
||||
- Modules: lowercase with underscores (e.g., `redis_streams.py`, `finbert_analyzer.py`)
|
||||
- Packages: underscores, not hyphens (e.g., `news_fetcher`, `sentiment_analyzer`)
|
||||
- Private modules: prefix with underscore (e.g., `_deduplicate_and_publish`)
|
||||
|
||||
**Files (TypeScript/React):**
|
||||
- Components: PascalCase (e.g., `MetricsRow.tsx`, `ProtectedRoute.tsx`)
|
||||
- Hooks: camelCase with `use` prefix (e.g., `useAuth.ts`, `useWebSocket.ts`)
|
||||
- Utilities/APIs: camelCase (e.g., `client.ts`, `auth.ts`)
|
||||
- Config files: camelCase with dots (e.g., `tsconfig.app.json`, `vite.config.ts`)
|
||||
|
||||
**Functions and Methods:**
|
||||
- Snake_case in Python (e.g., `_deduplicate_and_publish`, `ensure_group`)
|
||||
- CamelCase in TypeScript/React (e.g., `startRegistration`, `useCallback`)
|
||||
- Async functions: prefix with `async` keyword, no special naming convention
|
||||
- Private/internal functions: prefix with underscore in Python (e.g., `_make_config`, `_make_signal`)
|
||||
|
||||
**Variables:**
|
||||
- Snake_case in Python for all variables (e.g., `session_factory`, `confidence_threshold`)
|
||||
- CamelCase in TypeScript (e.g., `isAuthenticated`, `loading`, `token`)
|
||||
- Constants: UPPER_CASE with underscores (e.g., `SEEN_HASHES_KEY = "news:seen_hashes"`, `RAW_STREAM = "test:news:raw"`)
|
||||
- Loop variables and temporaries: follow standard conventions (e.g., `msg_id`, `score`, `ticker`)
|
||||
|
||||
**Types and Classes:**
|
||||
- Classes: PascalCase (e.g., `StreamPublisher`, `TradeEvaluator`, `BaseStrategy`)
|
||||
- Enums: PascalCase (e.g., `TradeSide`, `TradeStatus`, `SignalDirection`)
|
||||
- Type hints: use full import path or import explicitly (e.g., `list[PositionInfo]`, `dict[str, str]`)
|
||||
- Generic types: use Python 3.12+ syntax without `typing.` prefix (e.g., `dict`, `list`, `tuple`)
|
||||
|
||||
**Interfaces (TypeScript):**
|
||||
- PascalCase with `Props` suffix for component props (e.g., `MetricsRowProps`)
|
||||
- PascalCase for return types (e.g., `UseAuthReturn`)
|
||||
- Prefix common interfaces: `I` is not used; use descriptive names instead
|
||||
|
||||
## Code Style
|
||||
|
||||
**Formatting (Python):**
|
||||
- Line length: 120 characters (configured in `pyproject.toml` under `[tool.ruff]`)
|
||||
- Target Python version: 3.12 (configured under `[tool.ruff] target-version`)
|
||||
- Formatter: Ruff (for linting and formatting)
|
||||
- No Black configuration — Ruff is the primary tool
|
||||
|
||||
**Formatting (TypeScript/React):**
|
||||
- Target: ES2022 (configured in `tsconfig.app.json`)
|
||||
- Module resolution: bundler
|
||||
- Strict mode: enabled (`"strict": true`)
|
||||
- JSX mode: react-jsx
|
||||
|
||||
**Linting (Python):**
|
||||
- Tool: Ruff (`ruff>=0.3`)
|
||||
- Key rules: Follow Python 3.12 conventions, enforce async functions
|
||||
- MyPy: enabled (`mypy>=1.8`)
|
||||
- `warn_return_any = true`
|
||||
- `warn_unused_configs = true`
|
||||
|
||||
**Linting (TypeScript):**
|
||||
- ESLint with flat config (`eslint.config.js` in `dashboard/`)
|
||||
- Extends:
|
||||
- `@eslint/js` (recommended)
|
||||
- `typescript-eslint` (recommended)
|
||||
- `react-hooks` (react-hooks/exhaustive-deps)
|
||||
- `react-refresh` (for Vite)
|
||||
- Global ignores: `dist/`
|
||||
- Files: `**/*.{ts,tsx}`
|
||||
|
||||
## Import Organization
|
||||
|
||||
**Python Import Order:**
|
||||
1. `from __future__ import annotations` (always first if present)
|
||||
2. Standard library: `import asyncio`, `import json`, `import logging`, etc.
|
||||
3. Third-party packages: `from redis.asyncio import Redis`, `from pydantic import BaseModel`, etc.
|
||||
4. Shared modules: `from shared.config import BaseConfig`, `from shared.redis_streams import StreamPublisher`
|
||||
5. Service-specific modules: `from services.news_fetcher.config import NewsFetcherConfig`
|
||||
6. Absolute imports preferred; relative imports acceptable within a service
|
||||
|
||||
**TypeScript Import Order:**
|
||||
1. React and core libraries: `import { useState, useCallback } from 'react'`
|
||||
2. Routing: `import { Routes, Route } from 'react-router-dom'`
|
||||
3. Third-party UI/packages: `import { startRegistration } from '@simplewebauthn/browser'`
|
||||
4. Internal utilities/API: `import { client } from '../api/client'`
|
||||
5. Components: `import { MetricsRow } from '../components/MetricsRow'`
|
||||
6. Hooks: `import { useAuth } from '../hooks/useAuth'`
|
||||
|
||||
**Path Aliases:**
|
||||
- Python: No path aliases configured; use absolute imports from project root
|
||||
- TypeScript: None configured in `tsconfig.app.json`; use relative paths with `../`
|
||||
|
||||
## Error Handling
|
||||
|
||||
**Python Error Handling:**
|
||||
- Use `try/except` for external service calls (Redis, database, API calls)
|
||||
- Log all exceptions at the point of handling: `logger.exception("Context: %s", variable)`
|
||||
- Specific exception handling before generic: `except IntegrityError:` before `except Exception:`
|
||||
- Re-raise specific errors or return default/None on fallback paths
|
||||
- Example from `sentiment_analyzer/main.py`:
|
||||
```python
|
||||
try:
|
||||
async with db_session_factory() as session:
|
||||
# persist to DB
|
||||
except IntegrityError:
|
||||
# Handle duplicate
|
||||
pass
|
||||
except Exception:
|
||||
logger.exception("Error processing article: %s", data.get("title", "<unknown>"))
|
||||
```
|
||||
- Async functions use `try/except` around `await` calls
|
||||
- Use `asyncio.TimeoutError` for timeout handling
|
||||
|
||||
**TypeScript/React Error Handling:**
|
||||
- Catch errors from API calls and set error state: `err?.response?.data?.detail || err?.message`
|
||||
- Always provide fallback messages: `'Registration failed'` as default
|
||||
- Log error context: `throw err` after setting error state to propagate to caller if needed
|
||||
- Example from `useAuth.ts`:
|
||||
```typescript
|
||||
catch (err: any) {
|
||||
const message = err?.response?.data?.detail || err?.message || 'Registration failed';
|
||||
setError(message);
|
||||
throw err;
|
||||
}
|
||||
```
|
||||
|
||||
## Logging
|
||||
|
||||
**Framework:** Python uses `logging` module; no structured logging library
|
||||
|
||||
**Patterns:**
|
||||
- Create logger at module level: `logger = logging.getLogger(__name__)`
|
||||
- Use appropriate levels:
|
||||
- `logger.debug()` — Detailed debug info (e.g., "Published to stream:xxx")
|
||||
- `logger.info()` — General informational (e.g., "Created consumer group…")
|
||||
- `logger.warning()` — Warning conditions (e.g., "Order rejected by broker")
|
||||
- `logger.exception()` — Log exceptions with stack trace (always use at exception point)
|
||||
- Always include context in log messages using `%` formatting: `logger.info("Message with %s", variable)`
|
||||
- Do NOT use f-strings in logging (allows lazy evaluation)
|
||||
- Log level configuration: `log_level` in `BaseConfig` (default: "INFO")
|
||||
|
||||
**Examples from codebase:**
|
||||
```python
|
||||
logger.debug("Published to %s: %s", self.stream, msg_id)
|
||||
logger.info("Created consumer group %s on %s", self.group, self.stream)
|
||||
logger.exception("Ollama analysis failed") # Includes stack trace
|
||||
logger.warning("Order rejected by Alpaca: %s", exc)
|
||||
```
|
||||
|
||||
## Comments
|
||||
|
||||
**When to Comment:**
|
||||
- Complex business logic that isn't self-documenting (e.g., weight adjustment formulas, P&L calculations)
|
||||
- Non-obvious design decisions (e.g., "SADD returns 1 if the member was added (i.e. not already present)")
|
||||
- Caveats or workarounds (e.g., "BUSYGROUP means group already exists — expected on subsequent starts.")
|
||||
- Public APIs and exported functions should have docstrings
|
||||
|
||||
**When NOT to Comment:**
|
||||
- Self-explanatory code (variable names, clear function logic)
|
||||
- Redundant comments that repeat the code
|
||||
- Outdated comments (remove or update)
|
||||
|
||||
**Docstring/Comments:**
|
||||
- Module docstrings: Present at top of file, describe purpose
|
||||
```python
|
||||
"""Thin wrappers around redis-py Streams for publish/consume with JSON serialization."""
|
||||
```
|
||||
- Class docstrings: Describe the class purpose and key behavior
|
||||
```python
|
||||
class StreamPublisher:
|
||||
"""Publishes JSON-encoded messages to a Redis Stream."""
|
||||
```
|
||||
- Method docstrings: Use Google/NumPy style for async methods
|
||||
```python
|
||||
async def ensure_group(self) -> None:
|
||||
"""Create the consumer group if it does not already exist."""
|
||||
```
|
||||
- Parameters/Returns documented in docstrings for public APIs:
|
||||
```python
|
||||
"""Score a single article and publish one ScoredArticle per extracted ticker.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
article:
|
||||
The raw article consumed from the ``news:raw`` stream.
|
||||
"""
|
||||
```
|
||||
|
||||
**Inline Comments:**
|
||||
- Prefix with `#` and space: `# This explains why…`
|
||||
- Place above the code line (not at end of line unless very brief)
|
||||
- Example: `# SADD returns 1 if the member was added (i.e. not already present)`
|
||||
|
||||
## Function Design
|
||||
|
||||
**Size Guidelines:**
|
||||
- Functions should be focused and single-responsibility
|
||||
- Async functions in services are typically 5-30 lines (after docstring)
|
||||
- Helper functions (prefixed with `_`) are 3-20 lines
|
||||
- Example: `_deduplicate_and_publish` is ~20 lines, handles one concern
|
||||
|
||||
**Parameters:**
|
||||
- Use keyword arguments for clarity in calls: `await publish(data=article, index=stream)`
|
||||
- Type hints are required for all parameters and returns
|
||||
- Default parameters allowed (e.g., `batch_size: int = 10`)
|
||||
- For configuration, pass config objects rather than many flags: `config: MyConfig` not `threshold=0.5, timeout=10`
|
||||
|
||||
**Return Values:**
|
||||
- Use union types for optional/multiple returns: `TradeSignal | None` (Python 3.10+ syntax)
|
||||
- Return early for error cases (guard clauses):
|
||||
```python
|
||||
if not tickers:
|
||||
logger.debug("No tickers found")
|
||||
return
|
||||
```
|
||||
- For async functions, return types should use `Awaitable` or direct annotation: `async def x() -> PositionInfo:`
|
||||
|
||||
## Module Design
|
||||
|
||||
**Exports:**
|
||||
- Use `__all__` in modules that re-export from submodules
|
||||
- Example from `shared/models/__init__.py`:
|
||||
```python
|
||||
__all__ = [
|
||||
"Base",
|
||||
"TimestampMixin",
|
||||
# Trading
|
||||
"Strategy",
|
||||
"Signal",
|
||||
# ...
|
||||
]
|
||||
```
|
||||
- Imports are grouped by category (Trading, News, Learning, Auth, Timeseries) with comments
|
||||
|
||||
**Barrel Files:**
|
||||
- `shared/models/__init__.py` acts as a barrel, importing all models so Alembic can discover them
|
||||
- Comment each import group for clarity
|
||||
- Do NOT use wildcard imports (`from x import *`)
|
||||
|
||||
**Config Classes:**
|
||||
- All config classes extend `shared.config.BaseConfig` (Pydantic BaseSettings)
|
||||
- Use `model_config = {"env_prefix": "TRADING_"}` for environment variable loading
|
||||
- Example: `services/news_fetcher/config.py` extends `BaseConfig`
|
||||
|
||||
**Service Entry Points:**
|
||||
- Located at `services/{service_name}/main.py`
|
||||
- Contains `async def main()` or async generator functions
|
||||
- Modules imported from shared libraries and service-specific modules
|
||||
- Signal handling for graceful shutdown
|
||||
|
||||
## Special Conventions
|
||||
|
||||
**Redis Stream Constants:**
|
||||
- Define at module level as uppercase strings
|
||||
- Example: `NEWS_RAW_STREAM = "news:raw"`, `SEEN_HASHES_KEY = "news:seen_hashes"`
|
||||
|
||||
**Pydantic Models:**
|
||||
- Use `model_dump(mode="json")` when serializing to JSON (v2 syntax)
|
||||
- Set `model_config = {"from_attributes": True}` for ORM mapping
|
||||
- Enum fields use value comparison: `TradeSide.BUY == "BUY"`
|
||||
|
||||
**SQLAlchemy (2.0+):**
|
||||
- Use async sessions: `async_sessionmaker` with async context managers
|
||||
- Use `mapped_column` style (not `Column`)
|
||||
- Type hints required on model fields
|
||||
- Example: models in `shared/models/trading.py`
|
||||
|
||||
**Test Naming:**
|
||||
- Test functions: `test_{component}_{scenario}` (e.g., `test_publisher_publishes_json`)
|
||||
- Test classes: `Test{ComponentName}` (e.g., `TestEvaluateProfitableTrade`)
|
||||
- Helper functions in tests: prefix with `_make_` (e.g., `_make_config`, `_make_signal`)
|
||||
|
||||
---
|
||||
|
||||
*Convention analysis: 2026-02-23*
|
||||
221
.planning/codebase/INTEGRATIONS.md
Normal file
221
.planning/codebase/INTEGRATIONS.md
Normal file
|
|
@ -0,0 +1,221 @@
|
|||
# External Integrations
|
||||
|
||||
**Analysis Date:** 2025-02-23
|
||||
|
||||
## APIs & External Services
|
||||
|
||||
**Brokerage Trading:**
|
||||
- Alpaca Markets - Paper and live trading
|
||||
- SDK/Client: `alpaca-py` (21.0+)
|
||||
- API Key: `TRADING_ALPACA_API_KEY` env var
|
||||
- Secret: `TRADING_ALPACA_SECRET_KEY` env var
|
||||
- Base URL: `TRADING_ALPACA_BASE_URL` (defaults to paper trading endpoint)
|
||||
- Paper trading: `TRADING_PAPER_TRADING=true` flag in config
|
||||
- Used by:
|
||||
- `shared/broker/alpaca_broker.py` - Order execution, position tracking, account info via `TradingClient`
|
||||
- `services/market_data/main.py` - Historical bars via `StockHistoricalDataClient` and live streaming
|
||||
- `services/api_gateway/tasks/portfolio_sync.py` - Portfolio synchronization background task
|
||||
- `services/trade_executor/main.py` - Order placement and status monitoring
|
||||
|
||||
**News & Content Sources:**
|
||||
- Yahoo Finance RSS - Financial news feed
|
||||
- Feed URL: `https://finance.yahoo.com/news/rssindex`
|
||||
- Client: feedparser 6.0+
|
||||
- Polling: `TRADING_RSS_POLL_INTERVAL_SECONDS` (default 300s)
|
||||
- Configured in: `services/news_fetcher/config.py`
|
||||
|
||||
- Reuters RSS - Business news feed
|
||||
- Feed URL: `https://feeds.reuters.com/reuters/businessNews`
|
||||
- Client: feedparser 6.0+
|
||||
|
||||
- Dow Jones RSS - Market news feed
|
||||
- Feed URL: `https://feeds.content.dowjones.io/public/rss/mw_topstories`
|
||||
- Client: feedparser 6.0+
|
||||
|
||||
- Reddit - Wallstreetbets, stocks, investing communities
|
||||
- SDK/Client: praw 7.7+ (blocking) and asyncpraw 7.7+ (async)
|
||||
- Client ID: `TRADING_REDDIT_CLIENT_ID` env var
|
||||
- Client Secret: `TRADING_REDDIT_CLIENT_SECRET` env var
|
||||
- User Agent: `TRADING_REDDIT_USER_AGENT` (default: `trading-bot/0.1`)
|
||||
- Subreddits: `["wallstreetbets", "stocks", "investing"]` (configurable via `TRADING_REDDIT_SUBREDDITS`)
|
||||
- Min score filter: `TRADING_REDDIT_MIN_SCORE` (default 10)
|
||||
- Polling: `TRADING_REDDIT_POLL_INTERVAL_SECONDS` (default 600s)
|
||||
- Used by: `services/news_fetcher/main.py`
|
||||
|
||||
**Machine Learning Models:**
|
||||
- FinBERT (Hugging Face) - Financial sentiment classification
|
||||
- Model: `ProsusAI/finbert` (transformer)
|
||||
- Confidence threshold: `TRADING_FINBERT_CONFIDENCE_THRESHOLD` (default 0.4)
|
||||
- Library: transformers 4.38+
|
||||
- Used by: `services/sentiment_analyzer/analyzers/finbert.py`
|
||||
- Location: Loaded from Hugging Face model hub
|
||||
|
||||
- Ollama - Local LLM for fallback sentiment analysis
|
||||
- Host: `TRADING_OLLAMA_HOST` (default `http://localhost:11434`)
|
||||
- Model: `TRADING_OLLAMA_MODEL` (default `gemma3`)
|
||||
- Used when: FinBERT confidence below threshold
|
||||
- Library: ollama-python 0.1+
|
||||
- Used by: `services/sentiment_analyzer/analyzers/ollama_analyzer.py`
|
||||
|
||||
## Data Storage
|
||||
|
||||
**Databases:**
|
||||
- PostgreSQL 16 + TimescaleDB extension (Docker: `timescale/timescaledb:latest-pg16`)
|
||||
- Connection: Async via asyncpg driver
|
||||
- URL: `TRADING_DATABASE_URL` env var (default: `postgresql+asyncpg://trading:trading@localhost:5432/trading`)
|
||||
- Username: `trading` (env: `POSTGRES_USER`)
|
||||
- Password: `POSTGRES_PASSWORD` env var
|
||||
- Database: `trading`
|
||||
- ORM: SQLAlchemy 2.0+ with mapped_column style
|
||||
- Migrations: Alembic (auto-run via `python -m alembic upgrade head`)
|
||||
- Tables (14 models in `shared/models/`):
|
||||
- Auth: `users`, `webauthn_credentials`
|
||||
- News: `articles`, `article_sentiments`
|
||||
- Trading: `strategies`, `trade_signals`, `executed_trades`, `positions`
|
||||
- Learning: `strategy_weights`, `weight_history`
|
||||
- Market data: OHLCV timeseries via TimescaleDB hypertables
|
||||
- Used by: All services via `shared/db.py` async sessionmaker
|
||||
|
||||
**File Storage:**
|
||||
- Local filesystem only (no cloud storage)
|
||||
- Docker volumes for persistent data:
|
||||
- `pgdata` - PostgreSQL persistent storage
|
||||
- `redisdata` - Redis persistent storage
|
||||
|
||||
**Caching & Message Broker:**
|
||||
- Redis 7-alpine (Docker image)
|
||||
- Connection: `TRADING_REDIS_URL` env var (default: `redis://localhost:6379/0`)
|
||||
- Default port: 6379
|
||||
- Persistent volume: `redisdata`
|
||||
- Features used:
|
||||
- Streams (consumer groups) for inter-service messaging
|
||||
- Key-value cache for rate limiting
|
||||
- Stream topics (producer/consumer contracts):
|
||||
- `news:raw` - Raw articles from news fetcher → consumed by sentiment analyzer
|
||||
- `news:scored` - Scored articles from sentiment analyzer → consumed by signal generator
|
||||
- `signals:generated` - Trade signals from signal generator → consumed by trade executor and learning engine
|
||||
- `trades:executed` - Executed trades from trade executor → consumed by learning engine
|
||||
- Client library: redis 5.0+ with async support
|
||||
- Consumer group per service (e.g., `sentiment-analyzer-group` on `news:raw` stream)
|
||||
|
||||
## Authentication & Identity
|
||||
|
||||
**Auth Provider:**
|
||||
- Custom WebAuthn/Passkey implementation (no third-party OAuth provider)
|
||||
- Library: webauthn 2.0+ (Python backend)
|
||||
- Frontend: @simplewebauthn/browser 13.2.2 (JavaScript/TypeScript)
|
||||
- Flow: WebAuthn registration → credential storage → authentication challenge verification
|
||||
- Credentials storage: `webauthn_credentials` table in PostgreSQL
|
||||
- Implementation: `services/api_gateway/auth/routes.py` (register/login/verify endpoints)
|
||||
|
||||
**Session Management:**
|
||||
- JWT tokens with HS256 (HMAC-SHA256)
|
||||
- Library: PyJWT 2.8+ with crypto support
|
||||
- Secret key: `TRADING_JWT_SECRET_KEY` env var (REQUIRED, must be 32+ bytes in production)
|
||||
- Algorithm: `TRADING_JWT_ALGORITHM` (default: HS256)
|
||||
- Access token expiry: `TRADING_ACCESS_TOKEN_EXPIRE_MINUTES` (default 15)
|
||||
- Refresh token expiry: `TRADING_REFRESH_TOKEN_EXPIRE_DAYS` (default 7)
|
||||
- Token verification: `services/api_gateway/auth/middleware.py` (Bearer token extraction and validation)
|
||||
- Implementation: `services/api_gateway/auth/jwt.py`
|
||||
|
||||
## Monitoring & Observability
|
||||
|
||||
**Error Tracking:**
|
||||
- None detected - No Sentry or other APM integration
|
||||
- Logging: Standard Python logging module with `TRADING_LOG_LEVEL` env var
|
||||
|
||||
**Logs:**
|
||||
- Console logging to stdout/stderr
|
||||
- Log level controlled by: `TRADING_LOG_LEVEL` env var (default: INFO)
|
||||
- Each service logs to standard output (captured by Docker/container orchestration)
|
||||
|
||||
**Metrics & Telemetry:**
|
||||
- OpenTelemetry 1.20+ - Metrics collection and export
|
||||
- PrometheusMetricReader - Metrics exporter
|
||||
- prometheus-client - HTTP `/metrics` endpoint
|
||||
- Setup: `shared/telemetry.py` - `setup_telemetry()` called by each service
|
||||
- Metrics port: `TRADING_OTEL_METRICS_PORT` env var (default 9090)
|
||||
- Service name: `TRADING_OTEL_SERVICE_NAME` env var (default: `trading-bot`)
|
||||
- Scrape endpoint: `http://<service>:9090/metrics` for each service
|
||||
- Note: Prometheus/Grafana not deployed; metrics available for external monitoring
|
||||
|
||||
## CI/CD & Deployment
|
||||
|
||||
**Hosting:**
|
||||
- Docker & Docker Compose (local development and self-hosted deployment)
|
||||
- No managed hosting detected (e.g., AWS, Heroku, GCP)
|
||||
|
||||
**CI Pipeline:**
|
||||
- None detected - No GitHub Actions, GitLab CI, or other CI/CD pipeline
|
||||
|
||||
**Docker Composition:**
|
||||
- Services defined in `docker-compose.yml`:
|
||||
- `postgres` - Database
|
||||
- `redis` - Message broker
|
||||
- `migrations` - Database schema initialization
|
||||
- `news-fetcher` - Service
|
||||
- `sentiment-analyzer` - Service
|
||||
- `signal-generator` - Service
|
||||
- `trade-executor` - Service
|
||||
- `learning-engine` - Service
|
||||
- `market-data` - Service
|
||||
- `api-gateway` - FastAPI + WebSocket server (port 8000)
|
||||
- `dashboard` - Nginx serving React build (port 3000)
|
||||
|
||||
## Environment Configuration
|
||||
|
||||
**Required env vars:**
|
||||
- `TRADING_JWT_SECRET_KEY` - **REQUIRED** for API Gateway JWT signing (must be set, generate with `python -c "import secrets; print(secrets.token_hex(32))"`)
|
||||
- `TRADING_ALPACA_API_KEY` - Alpaca account API key
|
||||
- `TRADING_ALPACA_SECRET_KEY` - Alpaca account secret
|
||||
- `TRADING_REDDIT_CLIENT_ID` - Reddit app client ID
|
||||
- `TRADING_REDDIT_CLIENT_SECRET` - Reddit app secret
|
||||
|
||||
**Optional env vars with defaults:**
|
||||
- `TRADING_DATABASE_URL` - Default: `postgresql+asyncpg://trading:trading@localhost:5432/trading`
|
||||
- `TRADING_REDIS_URL` - Default: `redis://localhost:6379/0`
|
||||
- `TRADING_LOG_LEVEL` - Default: `INFO`
|
||||
- `TRADING_ALPACA_BASE_URL` - Default: Paper trading endpoint
|
||||
- `TRADING_PAPER_TRADING` - Default: `true`
|
||||
- `TRADING_OLLAMA_HOST` - Default: `http://localhost:11434`
|
||||
- `TRADING_OLLAMA_MODEL` - Default: `gemma3`
|
||||
- `TRADING_FINBERT_CONFIDENCE_THRESHOLD` - Default: `0.4`
|
||||
- `TRADING_RSS_POLL_INTERVAL_SECONDS` - Default: `300`
|
||||
- `TRADING_REDDIT_POLL_INTERVAL_SECONDS` - Default: `600`
|
||||
- `TRADING_RP_ID` - Default: `localhost`
|
||||
- `TRADING_RP_NAME` - Default: `Trading Bot`
|
||||
- `TRADING_RP_ORIGIN` - Default: `http://localhost:5173`
|
||||
- `TRADING_CORS_ORIGINS` - Default: `["http://localhost:5173"]`
|
||||
- `POSTGRES_PASSWORD` - Default: `trading`
|
||||
|
||||
**Secrets location:**
|
||||
- `.env` file (git-ignored, loads via docker-compose `env_file` directive)
|
||||
- Environment variables passed at container runtime
|
||||
- No secrets management service (Vault, Secrets Manager) integrated
|
||||
|
||||
## Webhooks & Callbacks
|
||||
|
||||
**Incoming:**
|
||||
- None detected - No webhook endpoints for external services
|
||||
|
||||
**Outgoing:**
|
||||
- None detected - No outbound webhooks to external systems
|
||||
|
||||
**Real-time Communication:**
|
||||
- WebSocket support via `websockets` library for API Gateway
|
||||
- Used for: Real-time dashboard updates (TBD implementation in routes)
|
||||
- Endpoint: `/ws` proxy configured in Vite dev server
|
||||
|
||||
## Service Dependencies (Internal Messaging)
|
||||
|
||||
**Message Flows:**
|
||||
1. News Fetcher → (`news:raw` stream) → Sentiment Analyzer
|
||||
2. Sentiment Analyzer → (`news:scored` stream) → Signal Generator
|
||||
3. Signal Generator → (`signals:generated` stream) → Trade Executor + Learning Engine
|
||||
4. Trade Executor → (`trades:executed` stream) → Learning Engine
|
||||
5. Market Data → Alpaca API → Database (timeseries persisted)
|
||||
6. API Gateway → Alpaca API → Portfolio sync background task
|
||||
|
||||
---
|
||||
|
||||
*Integration audit: 2025-02-23*
|
||||
169
.planning/codebase/STACK.md
Normal file
169
.planning/codebase/STACK.md
Normal file
|
|
@ -0,0 +1,169 @@
|
|||
# Technology Stack
|
||||
|
||||
**Analysis Date:** 2025-02-23
|
||||
|
||||
## Languages
|
||||
|
||||
**Primary:**
|
||||
- Python 3.12 - Backend microservices, data processing, model serving
|
||||
- TypeScript 5.9 - React frontend, type safety
|
||||
- JavaScript - React components, utilities
|
||||
|
||||
**Secondary:**
|
||||
- Bash - Docker entrypoints, scripts
|
||||
|
||||
## Runtime
|
||||
|
||||
**Environment:**
|
||||
- Python 3.12 runtime (Docker: `python:3.12-slim`)
|
||||
- Node.js 20 (Docker: `node:20-alpine`)
|
||||
|
||||
**Package Manager:**
|
||||
- pip (Python) with `pyproject.toml` configuration
|
||||
- npm (Node) with `package.json` and `package-lock.json`
|
||||
- Lockfile: `package-lock.json` present for frontend
|
||||
|
||||
## Frameworks
|
||||
|
||||
**Core Backend:**
|
||||
- FastAPI 0.110+ - HTTP server for API gateway at `services/api_gateway/main.py`
|
||||
- SQLAlchemy 2.0+ with asyncio - ORM for PostgreSQL at `shared/db.py`
|
||||
- Alembic 1.13+ - Database migrations in `alembic/` directory
|
||||
- Uvicorn 0.27+ - ASGI server for FastAPI
|
||||
|
||||
**Frontend:**
|
||||
- React 19.2 - UI framework
|
||||
- Vite 7.3 - Build tool and dev server
|
||||
- Tailwind CSS 4.2 - Styling framework
|
||||
- React Router 7.13 - Client-side routing
|
||||
|
||||
**Data & State:**
|
||||
- TanStack Query 5.90+ - Server state management for API calls
|
||||
- Axios 1.13+ - HTTP client for backend communication
|
||||
- Recharts 3.7 - Charts for portfolio analytics
|
||||
- TradingView lightweight-charts 5.1 - Professional candlestick charts
|
||||
|
||||
**Testing:**
|
||||
- pytest 8.0+ - Python test runner with `asyncio_mode = "auto"`
|
||||
- pytest-asyncio 0.23+ - Async test support
|
||||
- pytest-cov 4.1+ - Coverage reporting
|
||||
- React Testing Library - Frontend component testing (imported but not actively tested)
|
||||
|
||||
**Build/Dev:**
|
||||
- Ruff 0.3+ - Python linter and formatter (line-length: 120)
|
||||
- MyPy 1.8+ - Python type checker
|
||||
- TypeScript Compiler (tsc) - TypeScript compilation
|
||||
- ESLint 9.39+ - JavaScript/TypeScript linting with React hooks and refresh plugins
|
||||
|
||||
## Key Dependencies
|
||||
|
||||
**Critical Backend:**
|
||||
- redis 5.0+ - Message broker via Redis Streams at `shared/redis_streams.py`
|
||||
- asyncpg 0.29+ - PostgreSQL async driver (via SQLAlchemy)
|
||||
- pydantic 2.0+ - Data validation and settings management at `shared/config.py`
|
||||
- pydantic-settings 2.0+ - Environment-based configuration
|
||||
|
||||
**Machine Learning & NLP:**
|
||||
- transformers 4.38+ - FinBERT model for sentiment analysis at `services/sentiment_analyzer/analyzers/finbert.py`
|
||||
- torch 2.2+ - PyTorch runtime for transformers
|
||||
- ollama 0.1+ - Local LLM inference client (fallback analyzer) at `services/sentiment_analyzer/analyzers/ollama_analyzer.py`
|
||||
|
||||
**News & Data Sources:**
|
||||
- feedparser 6.0+ - RSS feed parsing at `services/news_fetcher/main.py`
|
||||
- praw 7.7+ - Reddit API client (blocking)
|
||||
- asyncpraw 7.7+ - Async Reddit API client for non-blocking Reddit fetches
|
||||
- httpx 0.27+ - Async HTTP client for web requests
|
||||
|
||||
**Brokerage Integration:**
|
||||
- alpaca-py 0.21+ - Alpaca trading API client at `shared/broker/alpaca_broker.py` and `services/market_data/main.py`
|
||||
- Provides: Order management, position tracking, market data (bars), account info
|
||||
- Used by: Trade executor, market data service, API gateway portfolio sync
|
||||
|
||||
**Authentication & Security:**
|
||||
- webauthn 2.0+ - WebAuthn/passkey registration and verification at `services/api_gateway/auth/routes.py`
|
||||
- PyJWT 2.8+ with crypto support - JWT token creation/verification at `services/api_gateway/auth/jwt.py`
|
||||
|
||||
**Observability:**
|
||||
- opentelemetry-sdk 1.20+ - Metrics collection framework
|
||||
- opentelemetry-exporter-prometheus 0.45b+ - Prometheus metrics exporter
|
||||
- opentelemetry-api 1.20+ - Telemetry instrumentation
|
||||
- prometheus-client - HTTP metrics server (imported in `shared/telemetry.py`)
|
||||
|
||||
**Data Processing:**
|
||||
- numpy 1.26+ - Numeric operations for backtester
|
||||
- pandas 2.2+ - Time series data manipulation
|
||||
- pytz 2024.1+ - Timezone support for market data
|
||||
|
||||
**Utilities:**
|
||||
- websockets 12.0+ - WebSocket support for real-time APIs (in api_gateway optional-dependencies)
|
||||
|
||||
## Configuration
|
||||
|
||||
**Environment:**
|
||||
- Pydantic BaseSettings with `TRADING_` prefix for all environment variables
|
||||
- Service-specific config classes extend `BaseConfig` at:
|
||||
- `services/sentiment_analyzer/config.py` - FinBERT threshold, Ollama host
|
||||
- `services/news_fetcher/config.py` - RSS feeds, Reddit credentials
|
||||
- `services/trade_executor/config.py` - Risk limits, Alpaca credentials
|
||||
- `services/api_gateway/config.py` - JWT settings, CORS, WebAuthn RP
|
||||
- `services/market_data/config.py` - Watchlist, bar timeframe, poll intervals
|
||||
|
||||
**Configuration File:**
|
||||
- `.env` - Runtime environment variables (not tracked, example: `.env.example`)
|
||||
- `pyproject.toml` - Project metadata and optional dependency groups:
|
||||
- `[api]` - FastAPI, WebAuthn, JWT
|
||||
- `[news]` - feedparser, praw, asyncpraw, httpx
|
||||
- `[sentiment]` - transformers, torch, ollama
|
||||
- `[trading]` - alpaca-py, pytz
|
||||
- `[backtester]` - numpy, pandas
|
||||
- `[dev]` - pytest, coverage, linters
|
||||
|
||||
**Database:**
|
||||
- PostgreSQL 16 + TimescaleDB extension (Docker: `timescale/timescaledb:latest-pg16`)
|
||||
- Connection: Async with asyncpg driver
|
||||
- Migrations: Alembic with `alembic upgrade head` (auto-run at startup)
|
||||
|
||||
**Cache/Message Broker:**
|
||||
- Redis 7-alpine - In-memory data store
|
||||
- Streams feature for inter-service messaging (consumer groups)
|
||||
- Persistent volumes for both PostgreSQL and Redis
|
||||
|
||||
## Build Configuration
|
||||
|
||||
**Backend:**
|
||||
- `pyproject.toml` - Single source of truth for Python dependencies
|
||||
- Setuptools 70.0+ - Build backend
|
||||
- Finds packages under: `shared*`, `services*`, `backtester*`, `scripts*`, `tests*`
|
||||
|
||||
**Frontend:**
|
||||
- `dashboard/vite.config.ts` - Vite configuration with React plugin and Tailwind CSS
|
||||
- `dashboard/tsconfig.json` - TypeScript configuration (references app and node configs)
|
||||
- Build: `npm run build` produces static files served by Nginx
|
||||
|
||||
**Docker:**
|
||||
- `docker/Dockerfile.service` - Multi-stage build for all Python services
|
||||
- Stage 1: Builder installs dependencies via pip
|
||||
- Stage 2: Runtime with slim Python image
|
||||
- `docker/Dockerfile.dashboard` - Multi-stage Node+Nginx build
|
||||
- Stage 1: Node 20 builds Vite app
|
||||
- Stage 2: Nginx serves static files
|
||||
- `docker-compose.yml` - Orchestrates: PostgreSQL, Redis, 7 microservices, dashboard
|
||||
|
||||
## Platform Requirements
|
||||
|
||||
**Development:**
|
||||
- Python 3.12 with venv or virtualenv
|
||||
- Node.js 20+ with npm
|
||||
- Docker & Docker Compose (for integration tests and full-stack deployment)
|
||||
- Optional: Ollama (local LLM, defaults to `http://localhost:11434`)
|
||||
|
||||
**Production:**
|
||||
- Docker and Docker Compose
|
||||
- PostgreSQL 16+ (external managed service or container)
|
||||
- Redis 7+ (external managed service or container)
|
||||
- Environment variables set via `.env` file or container environment
|
||||
- Optional: External Ollama service for sentiment fallback
|
||||
|
||||
---
|
||||
|
||||
*Stack analysis: 2025-02-23*
|
||||
309
.planning/codebase/STRUCTURE.md
Normal file
309
.planning/codebase/STRUCTURE.md
Normal file
|
|
@ -0,0 +1,309 @@
|
|||
# Codebase Structure
|
||||
|
||||
**Analysis Date:** 2025-02-23
|
||||
|
||||
## Directory Layout
|
||||
|
||||
```
|
||||
trading-bot/
|
||||
├── alembic/ # Database migration versions and configuration
|
||||
│ ├── env.py # Alembic runtime environment
|
||||
│ ├── script.py.mako # Migration template
|
||||
│ └── versions/ # Individual migration files
|
||||
├── backtester/ # Historical replay engine
|
||||
│ ├── __init__.py
|
||||
│ ├── config.py # Backtester configuration
|
||||
│ ├── data_loader.py # Load bars from database
|
||||
│ ├── engine.py # Main replay loop with SimulatedBroker
|
||||
│ ├── metrics.py # Equity curve, Sharpe, drawdown calculations
|
||||
│ └── simulated_broker.py # Paper trading for backtests
|
||||
├── dashboard/ # React/TypeScript frontend
|
||||
│ ├── public/ # Static assets (favicon, manifest)
|
||||
│ ├── src/
|
||||
│ │ ├── api/ # API client (auth.ts, client.ts)
|
||||
│ │ ├── assets/ # Images, fonts
|
||||
│ │ ├── components/ # Reusable UI (PositionsTable, EquityCurve, etc.)
|
||||
│ │ ├── hooks/ # Custom React hooks (useAuth, useWebSocket, usePortfolio)
|
||||
│ │ ├── pages/ # Page components (Login, Portfolio, TradeLog, etc.)
|
||||
│ │ ├── App.tsx # Router and layout
|
||||
│ │ ├── main.tsx # Entry point
|
||||
│ │ └── index.css # Tailwind styles
|
||||
│ ├── index.html # HTML template
|
||||
│ ├── package.json # npm dependencies (Vite, React, TypeScript)
|
||||
│ └── vite.config.ts # Vite build config
|
||||
├── docker/ # Dockerfiles and compose configs
|
||||
│ ├── Dockerfile.service # Python service base image
|
||||
│ ├── Dockerfile.dashboard # Node.js build + Nginx serve
|
||||
│ └── nginx.conf # Nginx reverse proxy config
|
||||
├── docs/ # Documentation and planning
|
||||
│ └── plans/ # Implementation plans from GSD phases
|
||||
├── scripts/ # Utility scripts
|
||||
│ ├── seed_strategies.py # Initialize default strategies in DB
|
||||
│ ├── smoke_test.sh # Health check / integration test script
|
||||
│ └── __init__.py
|
||||
├── services/ # Microservices (Python packages with underscores)
|
||||
│ ├── api_gateway/ # FastAPI HTTP/WebSocket server
|
||||
│ │ ├── main.py # App factory, lifespan, router registration
|
||||
│ │ ├── config.py # ApiGatewayConfig (CORS, JWT secret, etc.)
|
||||
│ │ ├── ws.py # WebSocket endpoint
|
||||
│ │ ├── auth/
|
||||
│ │ │ ├── routes.py # Register/login/logout endpoints
|
||||
│ │ │ ├── jwt.py # JWT token encoding/decoding
|
||||
│ │ │ └── middleware.py # get_current_user dependency
|
||||
│ │ ├── routes/ # REST endpoints organized by domain
|
||||
│ │ │ ├── news.py # GET /api/news
|
||||
│ │ │ ├── signals.py # GET /api/signals
|
||||
│ │ │ ├── trades.py # GET /api/trades
|
||||
│ │ │ ├── portfolio.py # GET /api/portfolio (live account)
|
||||
│ │ │ ├── strategies.py # GET/PUT /api/strategies
|
||||
│ │ │ ├── controls.py # POST /api/controls/* (pause/resume/etc)
|
||||
│ │ │ └── backtest.py # POST /api/backtest
|
||||
│ │ └── tasks/
|
||||
│ │ └── portfolio_sync.py # Background task: sync account data
|
||||
│ ├── news_fetcher/ # Polls RSS + Reddit
|
||||
│ │ ├── main.py # Entry point: fetch + deduplicate + publish
|
||||
│ │ ├── config.py # NewsFetcherConfig (feed URLs, poll intervals)
|
||||
│ │ └── sources/
|
||||
│ │ ├── rss.py # RSSSource.fetch() via feedparser
|
||||
│ │ └── reddit.py # RedditSource.fetch() via asyncpraw
|
||||
│ ├── sentiment_analyzer/ # Score sentiment + extract tickers
|
||||
│ │ ├── main.py # Entry point: consume raw → score → publish
|
||||
│ │ ├── config.py # SentimentAnalyzerConfig (thresholds, model paths)
|
||||
│ │ ├── ticker_extractor.py # Regex-based company name → ticker
|
||||
│ │ └── analyzers/
|
||||
│ │ ├── finbert.py # FinBERTAnalyzer (transformers model)
|
||||
│ │ └── ollama_analyzer.py # OllamaAnalyzer (fallback LLM)
|
||||
│ ├── signal_generator/ # Combine sentiment + market → signals
|
||||
│ │ ├── main.py # Entry point: consume scored articles + bars → ensemble
|
||||
│ │ ├── config.py # SignalGeneratorConfig (ensemble threshold)
|
||||
│ │ ├── ensemble.py # WeightedEnsemble orchestration
|
||||
│ │ └── market_data.py # MarketDataManager: SMA/RSI calculation per ticker
|
||||
│ ├── trade_executor/ # Risk checks + order submission
|
||||
│ │ ├── main.py # Entry point: consume signals → risk check → submit
|
||||
│ │ ├── config.py # TradeExecutorConfig (risk limits, position size formula)
|
||||
│ │ └── risk_manager.py # RiskManager: position limits, market hours, cooldowns
|
||||
│ ├── learning_engine/ # Evaluate trades + adjust weights
|
||||
│ │ ├── main.py # Entry point: consume executions → evaluate → adjust weights
|
||||
│ │ ├── config.py # LearningEngineConfig (adjustment parameters)
|
||||
│ │ ├── evaluator.py # TradeEvaluator: match opens to closes, calc P&L
|
||||
│ │ └── weight_adjuster.py # WeightAdjuster: multi-armed bandit logic
|
||||
│ ├── market_data/ # Fetch + stream OHLCV bars
|
||||
│ │ ├── main.py # Entry point: fetch from Alpaca → publish bars
|
||||
│ │ ├── config.py # MarketDataConfig (watchlist, timeframe, limits)
|
||||
│ │ └── __main__.py # Allows `python -m services.market_data`
|
||||
│ └── __init__.py
|
||||
├── shared/ # Shared Python libraries (imported by all services)
|
||||
│ ├── __init__.py
|
||||
│ ├── config.py # BaseConfig: common settings (DB, Redis, logging)
|
||||
│ ├── db.py # create_db(): async engine + sessionmaker
|
||||
│ ├── redis_streams.py # StreamPublisher, StreamConsumer
|
||||
│ ├── telemetry.py # setup_telemetry(): OpenTelemetry + Prometheus
|
||||
│ ├── broker/ # Brokerage abstraction layer
|
||||
│ │ ├── base.py # BaseBroker ABC (abstract interface)
|
||||
│ │ └── alpaca_broker.py # AlpacaBroker: alpaca-py implementation
|
||||
│ ├── models/ # SQLAlchemy ORM models
|
||||
│ │ ├── __init__.py # Imports all models so Alembic discovers them
|
||||
│ │ ├── base.py # Base (declarative), TimestampMixin
|
||||
│ │ ├── auth.py # User, UserCredential
|
||||
│ │ ├── news.py # Article, ArticleSentiment
|
||||
│ │ ├── trading.py # Signal, Trade, Position, Strategy, StrategyWeightHistory
|
||||
│ │ ├── learning.py # TradeOutcome, LearningAdjustment
|
||||
│ │ └── timeseries.py # MarketData, PortfolioSnapshot, StrategyMetric
|
||||
│ ├── schemas/ # Pydantic v2 message schemas
|
||||
│ │ ├── __init__.py # Imports all schemas
|
||||
│ │ ├── base.py # BaseSchema, TimestampSchema
|
||||
│ │ ├── news.py # RawArticle, ScoredArticle
|
||||
│ │ ├── trading.py # TradeSignal, TradeExecution, OrderRequest, etc.
|
||||
│ │ └── learning.py # TradeOutcomeSchema, WeightAdjustment
|
||||
│ └── strategies/ # Trading strategy implementations
|
||||
│ ├── __init__.py # Exports BaseStrategy, concrete implementations
|
||||
│ ├── base.py # BaseStrategy ABC
|
||||
│ ├── momentum.py # MomentumStrategy
|
||||
│ ├── mean_reversion.py # MeanReversionStrategy
|
||||
│ └── news_driven.py # NewsDrivenStrategy
|
||||
├── tests/ # Test suites
|
||||
│ ├── __init__.py
|
||||
│ ├── conftest.py # Pytest fixtures (fake Redis, DB, configs)
|
||||
│ ├── test_*.py # Unit tests: models, schemas, broker, strategies, backtester
|
||||
│ ├── services/ # Service-specific tests
|
||||
│ │ ├── test_news_fetcher.py
|
||||
│ │ ├── test_sentiment_analyzer.py
|
||||
│ │ ├── test_signal_generator.py
|
||||
│ │ ├── test_trade_executor.py
|
||||
│ │ ├── test_learning_engine.py
|
||||
│ │ ├── test_api_auth.py
|
||||
│ │ ├── test_api_routes.py
|
||||
│ │ ├── test_market_data.py
|
||||
│ │ └── test_portfolio_sync.py
|
||||
│ └── integration/ # Integration tests (require docker)
|
||||
│ ├── test_news_pipeline.py
|
||||
│ └── test_trading_flow.py
|
||||
├── .claude/ # Project-specific Claude knowledge
|
||||
│ └── CLAUDE.md # Architecture notes, patterns, known issues
|
||||
├── .env.example # Example environment variables
|
||||
├── alembic.ini # Alembic configuration
|
||||
├── docker-compose.yml # Service orchestration (8 services + infra)
|
||||
├── pyproject.toml # Python package config, dependencies, test config
|
||||
└── README.md # Project overview
|
||||
```
|
||||
|
||||
## Directory Purposes
|
||||
|
||||
**`alembic/`:**
|
||||
- Purpose: Database schema migrations using Alembic (SQLAlchemy migration tool)
|
||||
- Contains: Migration scripts (timestamped Python files in `versions/`)
|
||||
- Workflow: `alembic revision -m "message"` creates migration, `alembic upgrade head` applies to DB
|
||||
|
||||
**`backtester/`:**
|
||||
- Purpose: Historical replay engine for strategy testing
|
||||
- Contains: Engine loop, simulated broker, metrics calculation
|
||||
- Used by: API endpoint `/api/backtest` (triggered by dashboard)
|
||||
- Workflow: Load historical bars from DB, replay trades with `SimulatedBroker`, output equity curve + metrics
|
||||
|
||||
**`dashboard/`:**
|
||||
- Purpose: React TypeScript frontend for user interaction
|
||||
- Contains: Pages (Login, Portfolio, TradeLog, News, etc.), components (charts, tables), API client, hooks
|
||||
- Entry: `src/main.tsx` → `src/App.tsx`
|
||||
- Build: `npm run build` outputs to `dist/`, served by Nginx in Docker
|
||||
|
||||
**`docker/`:**
|
||||
- Purpose: Container images and deployment config
|
||||
- Contains: `Dockerfile.service` (Python services base), `Dockerfile.dashboard` (Node build + Nginx)
|
||||
- Used by: docker-compose.yml to build and run all services
|
||||
|
||||
**`scripts/`:**
|
||||
- Purpose: Utility and setup scripts
|
||||
- Key files:
|
||||
- `seed_strategies.py`: Inserts default strategies into DB (Momentum, MeanReversion, NewsDriven)
|
||||
- `smoke_test.sh`: Checks all service health endpoints
|
||||
|
||||
**`services/*/main.py`:**
|
||||
- Purpose: Entry point for each microservice
|
||||
- Pattern: Each service has own `main.py` that can be run as `python -m services.SERVICE_NAME`
|
||||
- Lifecycle: Setup config → connect to Redis/DB → start polling/consuming → handle shutdown gracefully
|
||||
|
||||
**`services/*/config.py`:**
|
||||
- Purpose: Service-specific configuration
|
||||
- Pattern: Extends `BaseConfig`, adds service-specific settings
|
||||
- Example: `NewsFetcherConfig` adds `rss_urls`, `reddit_subreddits`, `rss_poll_interval_seconds`
|
||||
|
||||
**`shared/models/`:**
|
||||
- Purpose: SQLAlchemy ORM models (database schema)
|
||||
- Key tables: `trades`, `signals`, `articles`, `article_sentiments`, `market_data`, `positions`, `users`, `strategies`, `strategy_weight_history`, `learning_adjustments`, `portfolio_snapshots`
|
||||
- Pattern: Each model imports in `__init__.py` so Alembic can auto-discover for migrations
|
||||
|
||||
**`shared/schemas/`:**
|
||||
- Purpose: Pydantic v2 message schemas for validation
|
||||
- Used by: Services to validate messages on consume/publish via Redis Streams
|
||||
- Naming: Schemas mirror the Redis Stream message types (RawArticle, ScoredArticle, TradeSignal, TradeExecution, etc.)
|
||||
|
||||
**`shared/strategies/`:**
|
||||
- Purpose: Pluggable trading strategy implementations
|
||||
- Pattern: All extend `BaseStrategy`, implement `async evaluate(ticker, market, sentiment) → TradeSignal | None`
|
||||
- Used by: Signal generator via `WeightedEnsemble`
|
||||
|
||||
**`tests/`:**
|
||||
- Purpose: Unit and integration tests
|
||||
- Pattern: `test_*.py` files match modules in `shared/` and `services/`
|
||||
- Marker: `@pytest.mark.integration` for tests requiring live services (redis, postgres)
|
||||
- Run: `pytest tests/ -v -m "not integration"` for unit, or with docker running for integration
|
||||
|
||||
## Key File Locations
|
||||
|
||||
**Entry Points:**
|
||||
- `services/news_fetcher/main.py`: Polls RSS/Reddit, publishes to `news:raw`
|
||||
- `services/sentiment_analyzer/main.py`: Consumes `news:raw`, publishes to `news:scored`
|
||||
- `services/signal_generator/main.py`: Consumes `news:scored` + `market:bars`, publishes to `signals:generated`
|
||||
- `services/trade_executor/main.py`: Consumes `signals:generated`, submits orders, publishes to `trades:executed`
|
||||
- `services/learning_engine/main.py`: Consumes `trades:executed`, adjusts strategy weights
|
||||
- `services/market_data/main.py`: Fetches OHLCV bars, publishes to `market:bars`
|
||||
- `services/api_gateway/main.py`: HTTP server listening on port 8000 (configurable)
|
||||
- `dashboard/src/main.tsx`: React app entry, connects to API gateway
|
||||
|
||||
**Configuration:**
|
||||
- `.env`: Runtime environment variables (database URL, Redis URL, API keys, etc.)
|
||||
- `pyproject.toml`: Python project metadata, dependencies (split into optional groups per service)
|
||||
- `docker-compose.yml`: Service orchestration, port mappings, environment variables
|
||||
- `alembic.ini`: Database migration tool configuration
|
||||
|
||||
**Core Logic:**
|
||||
- `services/signal_generator/ensemble.py`: Weighted ensemble combining strategies
|
||||
- `services/trade_executor/risk_manager.py`: Risk checks before order submission
|
||||
- `services/learning_engine/weight_adjuster.py`: Multi-armed bandit weight adjustment
|
||||
- `backtester/engine.py`: Historical replay loop
|
||||
- `shared/broker/alpaca_broker.py`: Alpaca order submission and account queries
|
||||
|
||||
**Testing:**
|
||||
- `tests/conftest.py`: Pytest fixtures for mock Redis, DB, service configs
|
||||
- `tests/test_models.py`: ORM model tests
|
||||
- `tests/test_schemas.py`: Pydantic schema validation tests
|
||||
- `tests/services/`: Service-specific unit tests
|
||||
- `tests/integration/`: End-to-end pipeline tests (require docker)
|
||||
|
||||
## Naming Conventions
|
||||
|
||||
**Files:**
|
||||
- Python service directories use underscores: `news_fetcher`, `api_gateway`, `signal_generator` (matches Docker Compose service names with hyphens replaced)
|
||||
- Test files: `test_*.py` or `*_test.py` (pytest convention)
|
||||
- Database migration files: `{timestamp}_{description}.py` (Alembic auto-generated)
|
||||
- React components: PascalCase `.tsx` files (Login.tsx, Portfolio.tsx, PositionsTable.tsx)
|
||||
- React hooks: `use*.ts` prefix (useAuth.ts, useWebSocket.ts, usePortfolio.ts)
|
||||
- Config classes: `*Config` suffix extending `BaseConfig` (NewsFetcherConfig, ApiGatewayConfig)
|
||||
|
||||
**Directories:**
|
||||
- Services under `services/`: lowercase with underscores (news_fetcher, api_gateway)
|
||||
- Shared libraries under `shared/`: logical grouping (broker/, models/, schemas/, strategies/)
|
||||
- Tests mirror source structure: `tests/services/test_*.py` for `services/*/` modules
|
||||
- React pages: match route names (Portfolio.tsx for /portfolio, TradeLog.tsx for /trades)
|
||||
|
||||
## Where to Add New Code
|
||||
|
||||
**New Feature (End-to-End):**
|
||||
- Service-specific code: `services/{SERVICE_NAME}/{module}.py`
|
||||
- Shared models: `shared/models/{domain}.py` (if new entity type)
|
||||
- Shared schemas: `shared/schemas/{domain}.py` (if new message type)
|
||||
- Tests: `tests/services/test_{SERVICE_NAME}.py` and/or `tests/integration/`
|
||||
|
||||
**New Component/Module:**
|
||||
- Service: Create `services/{service_name}/` directory with `__init__.py`, `main.py`, `config.py`
|
||||
- Strategy: Extend `BaseStrategy` in `shared/strategies/{strategy_name}.py`
|
||||
- Broker adapter: Extend `BaseBroker` in `shared/broker/{provider_name}_broker.py`
|
||||
- API route: Add to `services/api_gateway/routes/{domain}.py`
|
||||
- Dashboard page: Add `dashboard/src/pages/{PageName}.tsx` and route in `App.tsx`
|
||||
|
||||
**Utilities:**
|
||||
- Shared helpers: `shared/{module}.py` (e.g., `shared/redis_streams.py`)
|
||||
- Service-specific helpers: `services/{SERVICE_NAME}/{helper}.py`
|
||||
- API utilities: `services/api_gateway/{utility}.py`
|
||||
|
||||
**Database:**
|
||||
- New table: Add model in `shared/models/{domain}.py`
|
||||
- Schema migration: Run `alembic revision -m "description"`, fill in migration file
|
||||
- Persist data: Service calls `session.add(model_instance)` then `session.commit()`
|
||||
|
||||
**Tests:**
|
||||
- Unit test: `tests/test_{module}.py` for shared, `tests/services/test_{service}.py` for services
|
||||
- Integration test: `tests/integration/test_{flow}.py`
|
||||
- Fixtures: Add to `tests/conftest.py` (database, Redis, mocked API responses)
|
||||
|
||||
## Special Directories
|
||||
|
||||
**`.planning/codebase/`:**
|
||||
- Purpose: Generated codebase mapping documents (ARCHITECTURE.md, STRUCTURE.md, etc.)
|
||||
- Generated by: `/gsd:map-codebase` command
|
||||
- Committed: Yes, part of repo for CI/CD reference
|
||||
|
||||
**`.pytest_cache/`, `__pycache__/`:**
|
||||
- Purpose: Pytest and Python runtime caches
|
||||
- Generated: Yes (auto-created by pytest/Python)
|
||||
- Committed: No (in .gitignore)
|
||||
|
||||
**`trading_bot.egg-info/`:**
|
||||
- Purpose: Build metadata from `setuptools.build_meta`
|
||||
- Generated: Yes (auto-created during install)
|
||||
- Committed: No (in .gitignore)
|
||||
|
||||
**`dashboard/dist/`:**
|
||||
- Purpose: Built React app (HTML, JS, CSS bundles)
|
||||
- Generated: Yes (by `npm run build`)
|
||||
- Committed: No (in .gitignore)
|
||||
397
.planning/codebase/TESTING.md
Normal file
397
.planning/codebase/TESTING.md
Normal file
|
|
@ -0,0 +1,397 @@
|
|||
# Testing Patterns
|
||||
|
||||
**Analysis Date:** 2026-02-23
|
||||
|
||||
## Test Framework
|
||||
|
||||
**Runner:**
|
||||
- pytest 8.0+ (`pytest>=8.0` in `pyproject.toml`)
|
||||
- Config: `pyproject.toml` under `[tool.pytest.ini_options]`
|
||||
- `asyncio_mode = "auto"` — Automatically handles async test discovery and execution
|
||||
- `testpaths = ["tests"]` — Tests located in root-level `tests/` directory
|
||||
- Test markers: `integration` marks tests requiring docker services (redis, postgres)
|
||||
|
||||
**Assertion Library:**
|
||||
- Built-in pytest assertions (`assert`, `assert x == y`)
|
||||
- `pytest.approx()` for floating-point comparisons (e.g., `assert outcome.realized_pnl == pytest.approx(100.0)`)
|
||||
|
||||
**Run Commands:**
|
||||
```bash
|
||||
python -m pytest tests/ -v # Run all tests
|
||||
python -m pytest tests/ -v -m "not integration" # Run unit tests only
|
||||
python -m pytest tests/ -v -m integration # Run integration tests only (requires docker)
|
||||
python -m pytest tests/ --cov # Run with coverage report
|
||||
python -m pytest tests/ -x # Stop on first failure
|
||||
python -m pytest tests/ -k test_name # Run tests matching pattern
|
||||
```
|
||||
|
||||
**Test Execution:**
|
||||
- Async tests automatically discovered and run via `asyncio_mode="auto"`
|
||||
- No `@pytest.mark.asyncio` decorator needed (though present in some tests for clarity)
|
||||
- Integration tests require `docker-compose up -d` with Redis and PostgreSQL running
|
||||
|
||||
## Test File Organization
|
||||
|
||||
**Location:**
|
||||
- Co-located tests pattern: Tests in `tests/` directory mirroring `services/` and `shared/` structure
|
||||
- Structure:
|
||||
```
|
||||
tests/
|
||||
├── test_redis_streams.py # Tests for shared/redis_streams.py
|
||||
├── test_models.py # Tests for shared/models/
|
||||
├── test_schemas.py # Tests for shared/schemas/
|
||||
├── test_broker.py # Tests for shared/broker/
|
||||
├── test_strategies.py # Tests for shared/strategies/
|
||||
├── test_backtester.py # Tests for backtester/
|
||||
├── services/
|
||||
│ ├── test_news_fetcher.py
|
||||
│ ├── test_sentiment_analyzer.py
|
||||
│ ├── test_signal_generator.py
|
||||
│ ├── test_trade_executor.py
|
||||
│ ├── test_learning_engine.py
|
||||
│ ├── test_api_auth.py
|
||||
│ ├── test_api_routes.py
|
||||
│ ├── test_market_data.py
|
||||
│ └── test_portfolio_sync.py
|
||||
└── integration/
|
||||
├── test_news_pipeline.py
|
||||
└── test_trading_flow.py
|
||||
```
|
||||
|
||||
**Naming:**
|
||||
- Test files: `test_{module}.py` (e.g., `test_redis_streams.py`)
|
||||
- Test functions: `test_{component}_{scenario}` (e.g., `test_publisher_publishes_json`)
|
||||
- Test classes: `Test{Scenario}` (e.g., `TestEvaluateProfitableTrade`)
|
||||
- Helper functions: `_make_{object}` (e.g., `_make_config`, `_make_signal`, `_make_trade_id`)
|
||||
|
||||
## Test Structure
|
||||
|
||||
**Suite Organization:**
|
||||
```python
|
||||
# Module docstring describing test scope
|
||||
"""Tests for the Redis Streams publish/consume helpers."""
|
||||
|
||||
# Imports (pytest first, then unittest.mock, then project imports)
|
||||
import json
|
||||
from unittest.mock import AsyncMock
|
||||
import pytest
|
||||
from shared.redis_streams import StreamConsumer, StreamPublisher
|
||||
|
||||
# Fixtures (if any)
|
||||
@pytest.fixture
|
||||
async def redis_client():
|
||||
"""Provide a clean Redis connection and clean up after."""
|
||||
client = Redis.from_url(REDIS_URL)
|
||||
yield client
|
||||
await client.aclose()
|
||||
|
||||
# Test functions or classes
|
||||
@pytest.mark.asyncio
|
||||
async def test_publisher_publishes_json():
|
||||
"""StreamPublisher should XADD a JSON-serialised payload."""
|
||||
redis = AsyncMock()
|
||||
# ... test implementation
|
||||
|
||||
class TestEvaluateProfitableTrade:
|
||||
"""A long trade that gains in price should have positive PnL and ROI."""
|
||||
|
||||
def test_evaluate_profitable_trade(self):
|
||||
# ... test implementation
|
||||
```
|
||||
|
||||
**Section Comments:**
|
||||
- Use comment separators: `# ---------------------------------------------------------------------------`
|
||||
- Group tests by concern: Enums, Fixtures, RSS tests, Reddit tests, Integration tests, etc.
|
||||
- Example from `test_models.py`:
|
||||
```python
|
||||
# ---------------------------------------------------------------------------
|
||||
# Enum tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestEnums:
|
||||
```
|
||||
|
||||
**Patterns:**
|
||||
- **Setup pattern**: Create fixtures as pytest `@pytest.fixture` decorators
|
||||
- Can be module-level (reused) or function-level (isolated)
|
||||
- Async fixtures use `async def` and `yield` or `yield from`
|
||||
- Example:
|
||||
```python
|
||||
@pytest.fixture
|
||||
async def redis_client():
|
||||
client = Redis.from_url(REDIS_URL)
|
||||
yield client
|
||||
await client.aclose()
|
||||
```
|
||||
|
||||
- **Teardown pattern**: Use `yield` in fixtures for cleanup
|
||||
- Code after `yield` runs after the test completes
|
||||
- Example from `test_news_pipeline.py`:
|
||||
```python
|
||||
@pytest.fixture
|
||||
async def redis_client():
|
||||
client = Redis.from_url(REDIS_URL)
|
||||
await client.delete(RAW_STREAM, SCORED_STREAM) # Setup
|
||||
yield client
|
||||
await client.delete(RAW_STREAM, SCORED_STREAM) # Teardown
|
||||
await client.aclose()
|
||||
```
|
||||
|
||||
- **Assertion pattern**: Use pytest assertions directly
|
||||
- For equality: `assert x == y`
|
||||
- For calls: `redis.xadd.assert_called_once_with(...)`
|
||||
- For floating point: `assert value == pytest.approx(expected)`
|
||||
- Example from `test_redis_streams.py`:
|
||||
```python
|
||||
redis.xadd.assert_called_once_with(
|
||||
"test:stream",
|
||||
{"data": json.dumps({"ticker": "AAPL", "score": 0.8})},
|
||||
)
|
||||
assert msg_id == b"1-0"
|
||||
```
|
||||
|
||||
## Mocking
|
||||
|
||||
**Framework:** `unittest.mock` (built-in)
|
||||
|
||||
**Patterns:**
|
||||
- AsyncMock for async functions: `AsyncMock(return_value=...)`
|
||||
- MagicMock for sync functions and objects: `MagicMock()`
|
||||
- SimpleNamespace for lightweight objects: `SimpleNamespace(title=..., score=...)`
|
||||
|
||||
**Example from `test_redis_streams.py`:**
|
||||
```python
|
||||
redis = AsyncMock()
|
||||
redis.xadd = AsyncMock(return_value=b"1-0")
|
||||
|
||||
pub = StreamPublisher(redis, "test:stream")
|
||||
msg_id = await pub.publish({"ticker": "AAPL"})
|
||||
|
||||
redis.xadd.assert_called_once_with(...)
|
||||
assert msg_id == b"1-0"
|
||||
```
|
||||
|
||||
**Example from `test_news_fetcher.py` (multi-call behavior):**
|
||||
```python
|
||||
redis.xreadgroup = AsyncMock(
|
||||
side_effect=[
|
||||
[("test:stream", [(b"1-0", {b"data": json.dumps(payload).encode()})])],
|
||||
KeyboardInterrupt, # Break loop on second call
|
||||
]
|
||||
)
|
||||
```
|
||||
|
||||
**What to Mock:**
|
||||
- External services: Redis, database (use AsyncMock with return values)
|
||||
- API calls: HTTP requests, OpenTelemetry counters
|
||||
- ML models: FinBERT and Ollama analysis (patch and return synthetic scores)
|
||||
- Broker connections: Alpaca API (return fake order results)
|
||||
- File I/O and network operations
|
||||
|
||||
**What NOT to Mock:**
|
||||
- Core business logic (RiskManager, TradeEvaluator, WeightAdjuster)
|
||||
- Data structures and schemas
|
||||
- Internal function calls within a module
|
||||
- Time-based operations in unit tests (use fixtures for time-dependent tests)
|
||||
|
||||
**Patching Example from `test_sentiment_analyzer.py`:**
|
||||
```python
|
||||
with patch("services.sentiment_analyzer.analyzers.finbert.FinBERTAnalyzer") as mock_finbert:
|
||||
mock_instance = AsyncMock()
|
||||
mock_instance.analyze = AsyncMock(return_value=(0.8, 0.95))
|
||||
mock_finbert.return_value = mock_instance
|
||||
# ... run test
|
||||
```
|
||||
|
||||
## Fixtures and Factories
|
||||
|
||||
**Test Data Patterns:**
|
||||
|
||||
Helper functions to create test objects:
|
||||
```python
|
||||
def _make_config(**overrides) -> LearningEngineConfig:
|
||||
"""Create a LearningEngineConfig with sensible defaults + overrides."""
|
||||
defaults = dict(
|
||||
learning_rate=0.1,
|
||||
min_trades_before_adjustment=20,
|
||||
max_weight_shift_pct=0.10,
|
||||
)
|
||||
defaults.update(overrides)
|
||||
return LearningEngineConfig(**defaults)
|
||||
|
||||
def _make_signal(
|
||||
ticker: str = "AAPL",
|
||||
direction: SignalDirection = SignalDirection.LONG,
|
||||
) -> TradeSignal:
|
||||
return TradeSignal(
|
||||
ticker=ticker,
|
||||
direction=direction,
|
||||
strength=0.8,
|
||||
strategy_sources=["test"],
|
||||
timestamp=datetime.now(timezone.utc),
|
||||
)
|
||||
```
|
||||
|
||||
**Pytest Fixtures:**
|
||||
```python
|
||||
@pytest.fixture
|
||||
def sample_article() -> RawArticle:
|
||||
"""Return a sample RawArticle mentioning AAPL."""
|
||||
return RawArticle(
|
||||
source="rss",
|
||||
url="https://example.com/aapl-news",
|
||||
title="Apple Inc AAPL reports record quarterly earnings",
|
||||
content="...",
|
||||
published_at=datetime.now(timezone.utc),
|
||||
fetched_at=datetime.now(timezone.utc),
|
||||
content_hash="test-hash-aapl-001",
|
||||
)
|
||||
|
||||
@pytest.fixture()
|
||||
def config() -> ApiGatewayConfig:
|
||||
return ApiGatewayConfig(
|
||||
jwt_secret_key="test-secret-for-routes",
|
||||
database_url="sqlite+aiosqlite:///:memory:",
|
||||
redis_url="redis://localhost:6379/0",
|
||||
)
|
||||
```
|
||||
|
||||
**Location:**
|
||||
- Helper functions: Top of test file, marked with `_make_` prefix, after docstring and imports
|
||||
- Pytest fixtures: After helpers, before test classes/functions, decorated with `@pytest.fixture`
|
||||
- Shared fixtures: In separate test files if reused across multiple tests
|
||||
- Integration test fixtures: `redis_client` (cleanup with delete and close), database fixtures
|
||||
|
||||
## Coverage
|
||||
|
||||
**Requirements:** Not enforced by default; 246 unit tests pass with zero failures (as of last sprint)
|
||||
|
||||
**View Coverage:**
|
||||
```bash
|
||||
python -m pytest tests/ --cov=shared --cov=services --cov-report=term-missing
|
||||
python -m pytest tests/ --cov --cov-report=html # Generate HTML report
|
||||
```
|
||||
|
||||
**Coverage Statistics (approximate):**
|
||||
- `tests/test_redis_streams.py` — 5 tests (complete coverage of StreamPublisher/Consumer)
|
||||
- `tests/test_models.py` — 21 tests (enums, relationships)
|
||||
- `tests/test_schemas.py` — 49 tests (Pydantic schema validation)
|
||||
- `tests/test_broker.py` — 18 tests (AlpacaBroker implementation)
|
||||
- `tests/test_strategies.py` — 24 tests (RSI, EMA, Momentum strategies)
|
||||
- `tests/test_backtester.py` — 13 tests (backtest simulation)
|
||||
- `tests/services/test_news_fetcher.py` — 10 tests (RSS, Reddit, deduplication)
|
||||
- `tests/services/test_sentiment_analyzer.py` — 19 tests (FinBERT, Ollama, tickers)
|
||||
- `tests/services/test_signal_generator.py` — 17 tests (weighted ensemble)
|
||||
- `tests/services/test_trade_executor.py` — 16 tests (RiskManager, order flow)
|
||||
- `tests/services/test_learning_engine.py` — 28 tests (trade evaluation, weight adjustment)
|
||||
- `tests/services/test_api_auth.py` — 13 tests (WebAuthn, JWT)
|
||||
- `tests/services/test_api_routes.py` — 13 tests (endpoint responses)
|
||||
- `tests/integration/` — 9 integration tests (news pipeline, trading flow)
|
||||
|
||||
## Test Types
|
||||
|
||||
**Unit Tests:**
|
||||
- Scope: Single function, class, or module
|
||||
- Strategy: Mock all external dependencies (Redis, DB, API calls, ML models)
|
||||
- Location: `tests/test_*.py` and `tests/services/test_*.py`
|
||||
- Execution: Runs in isolation without services
|
||||
- Examples: `test_publisher_publishes_json`, `test_evaluate_profitable_trade`
|
||||
|
||||
**Integration Tests:**
|
||||
- Scope: Multi-service interaction (e.g., news fetcher → sentiment analyzer pipeline)
|
||||
- Strategy: Real Redis streams, real database, mocked ML models and external APIs
|
||||
- Location: `tests/integration/test_*.py`
|
||||
- Execution: Requires `docker-compose up -d` with Redis and PostgreSQL running
|
||||
- Marker: `@pytest.mark.integration` (separate via `pytest -m integration`)
|
||||
- Examples: `test_news_pipeline.py` (publishes to `news:raw`, reads from `news:scored`), `test_trading_flow.py`
|
||||
|
||||
**E2E Tests:**
|
||||
- Not implemented; would require running full docker-compose stack with live Alpaca paper trading
|
||||
- Could be added for smoke testing production deployments
|
||||
|
||||
## Common Patterns
|
||||
|
||||
**Async Testing:**
|
||||
```python
|
||||
@pytest.mark.asyncio
|
||||
async def test_consumer_consume_yields_and_acks() -> None:
|
||||
"""consume() should yield deserialised data and ACK each message."""
|
||||
redis = AsyncMock()
|
||||
redis.xgroup_create = AsyncMock()
|
||||
redis.xreadgroup = AsyncMock(side_effect=[
|
||||
[("test:stream", [(b"1-0", {b"data": json.dumps(payload).encode()})])],
|
||||
KeyboardInterrupt,
|
||||
])
|
||||
|
||||
consumer = StreamConsumer(redis, "test:stream", "grp", "c1")
|
||||
results = []
|
||||
try:
|
||||
async for msg_id, data in consumer.consume():
|
||||
results.append((msg_id, data))
|
||||
except KeyboardInterrupt:
|
||||
pass
|
||||
|
||||
assert len(results) == 1
|
||||
assert results[0] == (b"1-0", payload)
|
||||
```
|
||||
|
||||
**Error Testing:**
|
||||
```python
|
||||
def test_consumer_ensure_group_ignores_existing() -> None:
|
||||
"""If the group already exists the exception should be swallowed."""
|
||||
redis = AsyncMock()
|
||||
redis.xgroup_create = AsyncMock(side_effect=Exception("BUSYGROUP"))
|
||||
|
||||
consumer = StreamConsumer(redis, "test:stream", "my-group", "worker-1")
|
||||
# Should not raise
|
||||
await consumer.ensure_group() # No assertion; test passes if no exception
|
||||
```
|
||||
|
||||
**Parametrized Tests:**
|
||||
- Not heavily used in current codebase
|
||||
- Could be added for testing multiple input scenarios (e.g., different signal directions)
|
||||
- Use `@pytest.mark.parametrize` if needed
|
||||
|
||||
**Floating-Point Assertions:**
|
||||
```python
|
||||
assert outcome.realized_pnl == pytest.approx(100.0) # Allows small differences
|
||||
assert outcome.roi_pct == pytest.approx(10.0, rel=1e-5) # With tolerance
|
||||
```
|
||||
|
||||
**Class-Based Test Organization:**
|
||||
```python
|
||||
class TestEvaluateProfitableTrade:
|
||||
"""A long trade that gains in price should have positive PnL and ROI."""
|
||||
|
||||
def test_evaluate_profitable_trade(self):
|
||||
evaluator = TradeEvaluator()
|
||||
outcome = evaluator.evaluate_trade(...)
|
||||
|
||||
assert outcome.realized_pnl == pytest.approx(100.0)
|
||||
assert outcome.was_profitable is True
|
||||
|
||||
class TestEvaluateLosingTrade:
|
||||
"""A long trade that drops should have negative PnL."""
|
||||
|
||||
def test_evaluate_losing_trade(self):
|
||||
# ... different scenario
|
||||
```
|
||||
|
||||
## Test Configuration
|
||||
|
||||
**pytest.ini_options (from pyproject.toml):**
|
||||
```toml
|
||||
[tool.pytest.ini_options]
|
||||
asyncio_mode = "auto"
|
||||
testpaths = ["tests"]
|
||||
markers = ["integration: marks tests requiring docker services (redis, postgres)"]
|
||||
```
|
||||
|
||||
**Environment:**
|
||||
- Database URL: Tests use `sqlite+aiosqlite:///:memory:` for in-memory databases
|
||||
- Redis: Tests use `redis://localhost:6379/1` (DB 1) for integration tests to avoid conflicts
|
||||
- Async mode: "auto" handles all async test discovery automatically
|
||||
|
||||
---
|
||||
|
||||
*Testing analysis: 2026-02-23*
|
||||
Loading…
Add table
Add a link
Reference in a new issue