[ci skip] Flatten module wrappers into stack roots

Remove the module "xxx" { source = "./module" } indirection layer
from all 66 service stacks. Resources are now defined directly in
each stack's main.tf instead of through a wrapper module.

- Merge module/main.tf contents into stack main.tf
- Apply variable replacements (var.tier -> local.tiers.X, renamed vars)
- Fix shared module paths (one fewer ../ at each level)
- Move extra files/dirs (factory/, chart_values, subdirs) to stack root
- Update state files to strip module.<name>. prefix
- Update CLAUDE.md to reflect flat structure

Verified: terragrunt plan shows 0 add, 0 destroy across all stacks.
Viktor Barzin 2026-02-22 15:13:55 +00:00
parent b0499a7f31
commit c7c7047f1c
245 changed files with 11733 additions and 12432 deletions


@ -0,0 +1,191 @@
# Architecture
**Analysis Date:** 2026-02-17
## Pattern Overview
**Overall:** Layered monolithic service with clear separation between HTTP API layer, business logic, and persistent storage layer.
**Key Characteristics:**
- Single Go binary serving both API and static frontend
- File-based JSON persistence (no database)
- Modular internal packages for distinct concerns
- WebAuthn-based passwordless authentication
- Background scraper for content aggregation
- Rate-limited proxy service
## Layers
**HTTP Handler Layer:**
- Purpose: Accept and route HTTP requests, apply middleware, respond to clients
- Location: `internal/server/`
- Contains: Route registration, handler functions, middleware chains
- Depends on: Auth, Store, Proxy, Scraper packages
- Used by: HTTP clients (browser, mobile)
**Authentication & Authorization Layer:**
- Purpose: Manage user registration, login, sessions, and permission checks
- Location: `internal/auth/`
- Contains: WebAuthn ceremony implementations, session management, context helpers
- Depends on: Store, go-webauthn library
- Used by: Server middleware and handlers
**Business Logic Layer:**
- Purpose: Core domain operations (stream management, scraping, proxying)
- Location: `internal/scraper/`, `internal/proxy/`
- Contains: Scraper service (Reddit polling), Proxy service (content fetching with rate limiting)
- Depends on: Store for persistence
- Used by: Server, main entry point for orchestration
**Data Model Layer:**
- Purpose: Define domain types and interfaces
- Location: `internal/models/models.go`
- Contains: `User`, `Stream`, `ScrapedLink`, `Session` types
- Depends on: External WebAuthn library for credential types
- Used by: All layers
**Persistence Layer:**
- Purpose: Provide file-based storage abstraction
- Location: `internal/store/`
- Contains: JSON read/write helpers, file-based storage per entity type (streams, users, sessions, scraped links)
- Depends on: Models, filesystem
- Used by: All business logic layers
## Data Flow
**Stream Submission Flow:**
1. Client submits stream URL and title via `POST /api/streams`
2. Server handler validates URL format and length
3. Optional: If authenticated user, stream marked as unpublished; if anonymous, marked as published
4. Stream stored via `Store.AddStream()` which reads current `streams.json`, appends new stream, writes atomically
5. Response returned with stream metadata
**Authentication Flow (WebAuthn):**
1. User initiates registration with `POST /api/auth/register/begin` sending username
2. Server validates username format, checks uniqueness, creates temporary user
3. Server generates WebAuthn registration options via go-webauthn library
4. Server stores session data in memory with 5-minute expiry
5. Client performs attestation ceremony, sends credential via `POST /api/auth/register/finish?username=...`
6. Server retrieves in-memory session, validates with go-webauthn
7. Credential appended to user in `users.json`
8. Session token created in `sessions.json`, set as HttpOnly cookie
**Scraper Flow:**
1. Scraper runs on timer (default 15 minutes) or on manual trigger
2. Calls `scrapeReddit()` to poll r/motorsportsstreams2 new posts
3. Extracts URLs using regex, filters by F1-related keywords
4. Merges with existing `scraped.json`, deduplicating by normalized URL
5. Writes updated list atomically
6. Stale entries cleaned up, active ones returned via `GET /api/scraped`
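The merge-and-deduplicate step (step 4) can be sketched as follows. This is a minimal illustration, not the repository's code: the normalization rules (lowercase scheme and host, drop fragment, trim trailing slash) and the `mergeLinks` helper are assumptions, and links are plain strings here rather than `ScrapedLink` values.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeURL is a hypothetical normalizer: it lowercases scheme and host,
// drops the fragment, and trims a trailing slash from the path.
func normalizeURL(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", err
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	u.Path = strings.TrimSuffix(u.Path, "/")
	return u.String(), nil
}

// mergeLinks appends links from fresh whose normalized form is not already
// present in existing, and reports how many were added.
func mergeLinks(existing, fresh []string) ([]string, int) {
	seen := make(map[string]struct{}, len(existing))
	for _, l := range existing {
		if n, err := normalizeURL(l); err == nil {
			seen[n] = struct{}{}
		}
	}
	added := 0
	for _, l := range fresh {
		n, err := normalizeURL(l)
		if err != nil {
			continue // skip URLs that do not parse
		}
		if _, dup := seen[n]; dup {
			continue
		}
		seen[n] = struct{}{}
		existing = append(existing, l)
		added++
	}
	return existing, added
}

func main() {
	merged, added := mergeLinks(
		[]string{"https://example.com/stream/"},
		[]string{"HTTPS://Example.com/stream", "https://example.org/live"},
	)
	fmt.Println(len(merged), added)
}
```

Keeping the set keyed by the normalized form while storing the original string means the file retains what was scraped, and only the comparison is normalized.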
**Proxy Flow:**
1. Client requests `GET /proxy?url=https://...`
2. Server validates URL scheme (must be HTTPS), length, and target is not private IP
3. Applies rate limiting via token bucket per client IP
4. Fetches URL with timeout, limits response body to 5MB
5. Injects `<base>` tag into HTML response for relative URL resolution
6. Strips X-Frame-Options and CSP headers to allow iframe embedding
7. Returns modified content
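Step 2 of the proxy flow can be sketched roughly as below. The function names, error messages, and the exact set of blocked ranges are assumptions for illustration. Note that this sketch only inspects literal IP addresses in the URL; a hostname that resolves to a private address would need a separate check at dial time.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

const maxURLLen = 2048 // length cap described in the validation notes

// isBlockedIP reports whether ip is loopback, private, link-local, or unspecified.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified()
}

// validateTarget is a hypothetical validator for the proxy's url parameter.
func validateTarget(raw string) error {
	if len(raw) > maxURLLen {
		return errors.New("url too long")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "https" {
		return errors.New("only https targets are allowed")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("missing host")
	}
	// Only literal IPs are checked here; DNS names are not resolved.
	if ip := net.ParseIP(host); ip != nil && isBlockedIP(ip) {
		return errors.New("private or loopback address")
	}
	return nil
}

func main() {
	targets := []string{
		"https://example.com/live",
		"http://example.com/live",
		"https://10.0.0.5/admin",
		"https://[::1]/",
	}
	for _, raw := range targets {
		fmt.Printf("%-28s -> %v\n", raw, validateTarget(raw))
	}
}
```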
**Admin Approval Flow:**
1. Anonymous streams created with `Published: false`
2. Admin views all streams via `GET /api/admin/streams`
3. Admin toggles publication status via `PUT /api/streams/{id}/publish`
4. Published streams visible in `GET /api/streams/public`
**State Management:**
- **User Sessions:** In-memory WebAuthn ceremony sessions (5-minute TTL), persistent sessions in `sessions.json` with configurable TTL
- **Streams:** Fully loaded into memory from `streams.json` on each read/write, entire file rewritten atomically
- **Scraped Links:** Similar full-file pattern, deduplicated during scrape merge
- **Users:** Fully loaded per query, updated atomically per write
- **Cleanup:** Hourly cleanup of expired sessions via background goroutine
## Key Abstractions
**Store Interface (implicit):**
- Purpose: Encapsulate all file-based persistence operations
- Examples: `store.AddStream()`, `store.GetUserByName()`, `store.CreateSession()`
- Pattern: Each entity type has dedicated file; reads are lock-protected; writes are atomic (temp-file-then-rename)
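The lock-plus-atomic-write pattern can be illustrated with a small sketch. The `fileStore` type and its method names are hypothetical; only the temp-file-then-rename idea comes from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// fileStore guards a single JSON file with a mutex (hypothetical type).
type fileStore struct {
	mu   sync.RWMutex
	path string
}

// read loads and decodes the whole file under a read lock.
func (s *fileStore) read(v any) error {
	s.mu.RLock()
	defer s.mu.RUnlock()
	data, err := os.ReadFile(s.path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, v)
}

// write encodes v to a temp file in the same directory, then renames it
// over the target so readers never observe a half-written file.
func (s *fileStore) write(v any) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	tmp, err := os.CreateTemp(filepath.Dir(s.path), filepath.Base(s.path)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded
	if err := json.NewEncoder(tmp).Encode(v); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), s.path)
}

func main() {
	s := &fileStore{path: filepath.Join(os.TempDir(), "streams-sketch.json")}
	defer os.Remove(s.path)
	if err := s.write([]string{"https://example.com/live"}); err != nil {
		fmt.Println("write:", err)
		return
	}
	var got []string
	if err := s.read(&got); err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Println(got)
}
```

The rename makes each write atomic, but a read-modify-write such as appending a stream is only safe if one lock is held across both the read and the write.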
**Auth Middleware Chain:**
- Purpose: Extract and validate user from session cookie, inject into request context
- Examples: `AuthMiddleware()`, `RequireAuth()`, `RequireAdmin()`
- Pattern: Composable handler functions that wrap next handler
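A composable chain of this kind might look like the sketch below. The function names match the ones listed above, but the signatures, the `session` cookie name, the `lookup` callback, and the `statusFor` test helper are assumptions, not the repository's actual API.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type contextKey string

const userKey contextKey = "user"

type User struct {
	Username string
	IsAdmin  bool
}

// AuthMiddleware resolves the session cookie to a user (if any) and stores
// it in the request context. lookup stands in for the session store.
func AuthMiddleware(lookup func(token string) *User) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if c, err := r.Cookie("session"); err == nil {
				if u := lookup(c.Value); u != nil {
					r = r.WithContext(context.WithValue(r.Context(), userKey, u))
				}
			}
			next.ServeHTTP(w, r)
		})
	}
}

// RequireAuth rejects requests that carry no authenticated user.
func RequireAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, ok := r.Context().Value(userKey).(*User); !ok {
			http.Error(w, `{"error":"unauthorized"}`, http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// RequireAdmin rejects users without the admin flag.
func RequireAdmin(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if u, ok := r.Context().Value(userKey).(*User); !ok || !u.IsAdmin {
			http.Error(w, `{"error":"forbidden"}`, http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request with the given token through the full chain.
func statusFor(token string) int {
	sessions := map[string]*User{
		"alice-token": {Username: "alice"},
		"root-token":  {Username: "root", IsAdmin: true},
	}
	lookup := func(t string) *User { return sessions[t] }
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := AuthMiddleware(lookup)(RequireAuth(RequireAdmin(final)))
	req := httptest.NewRequest(http.MethodGet, "/api/admin/streams", nil)
	if token != "" {
		req.AddCookie(&http.Cookie{Name: "session", Value: token})
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(""), statusFor("alice-token"), statusFor("root-token"))
}
```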
**Scraper Service:**
- Purpose: Periodically fetch and aggregate content from external sources
- Examples: Background goroutine running on interval, triggered scrape
- Pattern: Mutex-protected scrape operations to prevent concurrent executions
**Proxy Handler:**
- Purpose: Fetch external content safely with rate limiting and framing bypass
- Examples: URL validation, private IP blocking, rate limiting per IP, HTML base tag injection
- Pattern: Implements `http.Handler` interface, maintains per-IP token bucket state
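The per-IP token bucket can be sketched as follows. The type and function names are illustrative, and the clock is passed in explicitly so the behavior is easy to reason about; the 30 requests/minute and 5-request burst are the figures quoted in the integrations notes.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type bucket struct {
	tokens float64
	last   time.Time
}

type limiter struct {
	mu      sync.Mutex
	buckets map[string]*bucket
	rate    float64 // tokens added per second
	burst   float64 // maximum tokens a bucket can hold
}

func newLimiter(perMinute, burst int) *limiter {
	return &limiter{
		buckets: make(map[string]*bucket),
		rate:    float64(perMinute) / 60,
		burst:   float64(burst),
	}
}

// allow refills the caller's bucket for the elapsed time, then spends one
// token if available.
func (l *limiter) allow(ip string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[ip] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo fires 7 requests at the same instant, then one more 2 seconds later.
func demo() (allowedInBurst int, allowedAfterWait bool) {
	l := newLimiter(30, 5)
	start := time.Unix(0, 0)
	for i := 0; i < 7; i++ {
		if l.allow("203.0.113.7", start) {
			allowedInBurst++
		}
	}
	return allowedInBurst, l.allow("203.0.113.7", start.Add(2*time.Second))
}

func main() {
	burst, later := demo()
	fmt.Println(burst, later)
}
```

A periodic cleanup pass would delete buckets whose `last` timestamp is older than some threshold; that part is omitted here.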
## Entry Points
**HTTP Server (`main.go`):**
- Location: `main.go`
- Triggers: Process start
- Responsibilities: Initialize all services, configure routes, handle graceful shutdown on SIGTERM/SIGINT
**Handler Routes (`internal/server/server.go`):**
- Location: `internal/server/server.go:registerRoutes()`
- Pattern: All routes defined in single function, middleware applied uniformly
- Public endpoints: Health, public streams, public scraped links
- Authenticated endpoints: Personal streams, submit stream, delete stream
- Admin endpoints: All streams, toggle publish, trigger scrape
**Background Services:**
- Scraper: Started in goroutine at startup via `scraper.Run(ctx)`
- Session cleanup: Goroutine with hourly ticker
- Proxy rate-limit cleanup: Goroutine with 10-minute ticker
## Error Handling
**Strategy:** Error strings returned in JSON responses with appropriate HTTP status codes. Panics caught and logged by recovery middleware.
**Patterns:**
- Validation errors: `400 Bad Request`
- Authentication failures: `401 Unauthorized`
- Permission denied: `403 Forbidden`
- Resource not found: `404 Not Found`
- Duplicate entries: `409 Conflict`
- Server errors: `500 Internal Server Error`
- Rate limit exceeded: `429 Too Many Requests`
Errors include descriptive messages: `{"error":"username must be 3-30 chars, alphanumeric or underscore"}`
## Cross-Cutting Concerns
**Logging:** stdlib log package
- Request logging: Method, path, remote address via `LoggingMiddleware`
- Scraper logging: Intervals, timing, link counts
- Proxy logging: Fetch errors
- All goes to stdout
**Validation:**
- Username: 3-30 chars, alphanumeric + underscore
- URLs: Must be HTTP(S), max 2048 chars, proxy-only supports HTTPS
- HTML escaping on stream titles to prevent injection
**Authentication:**
- WebAuthn for registration/login (passwordless)
- Session tokens as HttpOnly, Secure, SameSite=Strict cookies
- Configurable session TTL (default 720 hours)
- First registered user becomes admin unless ADMIN_USERNAME env var set
**CORS/Origin Check:**
- Origin header validated on mutation requests (POST, PUT, DELETE)
- Allowed origins configurable via WEBAUTHN_ORIGIN env var (comma-separated)
- CSRF protection via origin validation
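An origin check along these lines might look like the following sketch. The constructor signature and the `statusFor` helper are assumptions; in the real service the comma-separated list would come from `WEBAUTHN_ORIGIN`. Rejecting mutation requests that carry no `Origin` header at all is a choice made in this sketch, not something the notes above confirm.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// OriginCheck builds a middleware from a comma-separated allow-list and
// enforces it on mutating methods only.
func OriginCheck(allowedCSV string) func(http.Handler) http.Handler {
	allowed := make(map[string]struct{})
	for _, o := range strings.Split(allowedCSV, ",") {
		if o = strings.TrimSpace(o); o != "" {
			allowed[o] = struct{}{}
		}
	}
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			switch r.Method {
			case http.MethodPost, http.MethodPut, http.MethodDelete:
				if _, ok := allowed[r.Header.Get("Origin")]; !ok {
					http.Error(w, `{"error":"origin not allowed"}`, http.StatusForbidden)
					return
				}
			}
			next.ServeHTTP(w, r)
		})
	}
}

// statusFor sends one request through the middleware and returns the status.
func statusFor(method, origin string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := OriginCheck("https://f1.example.com, https://www.f1.example.com")(ok)
	req := httptest.NewRequest(method, "/api/streams", nil)
	if origin != "" {
		req.Header.Set("Origin", origin)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("POST", "https://f1.example.com"))
	fmt.Println(statusFor("POST", "https://evil.example"))
	fmt.Println(statusFor("GET", ""))
}
```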
---
*Architecture analysis: 2026-02-17*


@ -0,0 +1,232 @@
# Codebase Concerns
**Analysis Date:** 2026-02-17
## Tech Debt
**File-based JSON storage as primary data persistence:**
- Issue: All data (users, streams, sessions, scraped links) are stored as JSON files on disk with file-level locking. This is a fundamental scalability constraint.
- Files: `internal/store/store.go`, `internal/store/streams.go`, `internal/store/sessions.go`, `internal/store/users.go`, `internal/store/scraped.go`
- Impact:
- Non-atomic read-modify-write operations (e.g., DeleteStream reads all streams, filters, writes back). Race conditions possible if two deletes happen simultaneously.
- Entire file loaded into memory for any operation, even reads. With thousands of streams/sessions, this becomes slow and memory-inefficient.
- Sessions file grows unbounded until manual cleanup (CleanExpiredSessions runs hourly). Could cause memory/disk pressure.
- No transaction support, no rollback capability on failure.
- Fix approach: Migrate to a proper database (SQLite for simplicity, PostgreSQL for production). Keep JSON file for backup/export purposes only.
**In-memory WebAuthn ceremony session storage with no cleanup guarantee:**
- Issue: Registration and login ceremony session data stored in `Auth.regSessions` and `Auth.loginSessions` maps. Cleanup relies on goroutines that may not execute if server crashes.
- Files: `internal/auth/auth.go` (lines 27-29, 107-117, 230-239)
- Impact:
- Memory leak on server restarts: orphaned sessions never cleaned up.
- No recovery mechanism if goroutine misses cleanup window.
- Session hijacking if an attacker can predict/guess the cleanup timing.
- Fix approach: Either move ceremony sessions to persistent store or use a time.AfterFunc with guaranteed cleanup (still risky). Better: use signed JWTs for ceremony state instead of server-side storage.
**Scraper loads entire scraped links list into memory on every scrape:**
- Issue: `Scraper.scrape()` loads all existing links, filters and deduplicates them, then rewrites entire file.
- Files: `internal/scraper/scraper.go` (lines 46-92)
- Impact: With thousands of links, each 15-minute scrape cycle causes a large memory spike and full file rewrite. Inefficient deduplication logic (O(n) map lookups on every new link).
- Fix approach: With database migration, use INSERT OR IGNORE / upsert patterns. For now, batch process links in chunks and use database indexes for deduplication.
**No input validation on URL lengths beyond basic checks:**
- Issue: URL length limited to 2048 chars in two places (`internal/server/server.go` line 153, `internal/proxy/proxy.go` line 72), but no validation of URL structure beyond "starts with http/https" and HTTPS-only in proxy.
- Files: `internal/server/server.go` (lines 146-160), `internal/proxy/proxy.go` (lines 54-80)
- Impact: Malformed URLs could bypass checks and cause unexpected behavior in downstream systems. User submission streams could contain typos/malware links.
- Fix approach: Use a proper URL parsing library with validation. Whitelist domains for stream submissions. Consider regex validation for known stream site patterns.
**Hardcoded default streams in main.go:**
- Issue: Default stream URLs are hardcoded and point to external streaming sites that may become unavailable, redirect, or change terms of service.
- Files: `main.go` (lines 100-123)
- Impact: If any of these URLs break, users get broken default content. Sites could shut down or get legal takedown notices. Application appears to endorse/support these sites.
- Fix approach: Move to configuration file. Make seeding optional. Add stream validation/health checks before serving. Consider removing entirely if this is a liability concern.
**Proxy strips CSP headers without replacement:**
- Issue: `internal/proxy/proxy.go` deliberately strips `X-Frame-Options` and CSP headers (line 123) to allow iframe-based proxying. No security headers added back.
- Files: `internal/proxy/proxy.go` (lines 121-125)
- Impact: Proxied content loses all origin security protections. Could allow downstream attacks to run XSS, clickjacking, etc. in the proxy context. Injected `<base>` tag doesn't prevent all attacks.
- Fix approach: Add back a strict CSP policy scoped to the proxy origin. Implement iframe sandbox attributes. Add additional security headers (X-Content-Type-Options: nosniff, etc.).
## Security Considerations
**Authentication ceremony session fixation vulnerability:**
- Risk: Username used as session key for WebAuthn ceremonies (`Auth.BeginRegistration`, `Auth.BeginLogin`). Attacker could start ceremony for victim's account, then victim continues from attacker's session state.
- Files: `internal/auth/auth.go` (lines 107-108, 230-231)
- Current mitigation: None. Ceremony session stored in-memory and deleted after 5 minutes, but no CSRF token or state validation.
- Recommendations: Use cryptographically random state tokens for ceremony sessions instead of username. Store state in secure HTTP-only cookies or database. Validate state on finish.
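The recommendation to key ceremonies by a random state token instead of the username can be sketched as below. The `ceremonyStore` type and its methods are hypothetical; the real WebAuthn session data would be stored alongside the username, and the token would be handed to the client (for example in an HttpOnly cookie).

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"sync"
	"time"
)

const ceremonyTTL = 5 * time.Minute // matches the 5-minute expiry described above

type ceremony struct {
	username string
	expires  time.Time
}

// ceremonyStore maps unguessable tokens to in-flight ceremonies.
type ceremonyStore struct {
	mu       sync.Mutex
	sessions map[string]ceremony
}

func newCeremonyStore() *ceremonyStore {
	return &ceremonyStore{sessions: make(map[string]ceremony)}
}

// begin creates a ceremony and returns its 256-bit random token.
func (s *ceremonyStore) begin(username string) (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := base64.RawURLEncoding.EncodeToString(buf)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[token] = ceremony{username: username, expires: time.Now().Add(ceremonyTTL)}
	return token, nil
}

// finish consumes the token (single use) and checks expiry.
func (s *ceremonyStore) finish(token string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c, ok := s.sessions[token]
	delete(s.sessions, token)
	if !ok || time.Now().After(c.expires) {
		return "", false
	}
	return c.username, true
}

func main() {
	s := newCeremonyStore()
	token, err := s.begin("alice")
	if err != nil {
		fmt.Println("begin:", err)
		return
	}
	user, ok := s.finish(token)
	fmt.Println(user, ok)
	_, again := s.finish(token)
	fmt.Println(again)
}
```

Because `finish` deletes the entry before checking it, a token cannot be replayed, and expiry is enforced on use rather than depending only on a cleanup goroutine.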
**Rate limiting per-IP but no account lockout for failed authentication:**
- Risk: Brute-force attacks on specific usernames are possible. An attacker can make many login attempts (using different IPs) against a single account without consequence.
- Files: `internal/proxy/proxy.go` implements rate limiting (per-IP token bucket), but no equivalent exists for auth endpoints (`internal/auth/auth.go`).
- Current mitigation: WebAuthn makes guessing harder (passkeys), but early attack surface (BeginLogin endpoint) has no protection. Leaked user list could enable targeted attacks.
- Recommendations: Add per-username failure tracking. Lock account after N failed attempts. Add exponential backoff. Require captcha after threshold.
**CORS Origin validation incomplete:**
- Risk: `OriginCheck` middleware in `internal/server/middleware.go` (lines 71-93) only checks on non-GET requests. GET requests can still trigger state-changing operations (e.g., visiting a crafted link that proxies through the app).
- Files: `internal/server/middleware.go` (line 74)
- Current mitigation: Proxy request uses query param, but no SameSite cookie attribute on proxy endpoint (only on session cookie).
- Recommendations: Require Origin header on all mutation requests. Consider using POST for scrape trigger. Add X-CSRF-Token validation.
**Admin user initialization has race condition:**
- Risk: First user to register becomes admin if `ADMIN_USERNAME` not set. Two concurrent registration requests could both see 0 users and both become admin.
- Files: `internal/auth/auth.go` (lines 83-91)
- Current mitigation: Relies on file-level locking in store operations, but store operations are done after the check (line 121), not atomic.
- Recommendations: Move first-user-is-admin logic into CreateUser transaction, or seed admin during initialization phase before accepting requests.
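Moving the first-user check inside the create operation can be sketched as follows. This is an illustration under stated assumptions: an in-memory map stands in for `users.json`, and the `CreateUser` signature and `countAdmins` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrUserExists = errors.New("username already taken")

type User struct {
	Username string
	IsAdmin  bool
}

// Store keeps users in memory here; the map stands in for the users file.
type Store struct {
	mu    sync.Mutex
	users map[string]*User
}

// CreateUser checks uniqueness, decides the admin flag, and inserts the
// user while holding one lock, so no two callers can both see zero users.
func (s *Store) CreateUser(username string) (*User, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.users[username]; ok {
		return nil, ErrUserExists
	}
	u := &User{Username: username, IsAdmin: len(s.users) == 0}
	s.users[username] = u
	return u, nil
}

// countAdmins registers n distinct users concurrently and counts admins.
func countAdmins(n int) int {
	s := &Store{users: make(map[string]*User)}
	var wg sync.WaitGroup
	var mu sync.Mutex
	admins := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			u, err := s.CreateUser(fmt.Sprintf("user%d", i))
			if err == nil && u.IsAdmin {
				mu.Lock()
				admins++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return admins
}

func main() {
	fmt.Println("admins after 50 concurrent registrations:", countAdmins(50))
}
```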
**Session token stored in http-only cookie but not marked Secure in non-HTTPS:**
- Risk: Cookie marked `Secure: r.TLS != nil` (lines 187 and 300). In development or non-HTTPS deployments, session token sent over plaintext HTTP.
- Files: `internal/auth/auth.go` (lines 187, 300)
- Current mitigation: None for non-HTTPS. Relies on deployment to enforce HTTPS.
- Recommendations: Always set Secure=true. Force HTTPS in production via HSTS header. Log warning if TLS is nil.
**Proxy does not validate Content-Type before injecting `<base>` tag:**
- Risk: Non-HTML responses (PDFs, images, binaries) could be corrupted by injecting `<base>` tag. Base64 encoded binary data could break.
- Files: `internal/proxy/proxy.go` (lines 104-119)
- Current mitigation: 5MB body size limit, but no content-type validation.
- Recommendations: Check Content-Type header before modification. Skip injection for non-HTML types. Use proper HTML parsing (e.g., golang.org/x/net/html) instead of string manipulation.
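The Content-Type check can be sketched as below. The helper names are hypothetical, and the regex-based insertion is only a stopgap that handles case and attributes on `<head>`; a real parser such as golang.org/x/net/html, as recommended above, would be more robust.

```go
package main

import (
	"fmt"
	"html"
	"mime"
	"regexp"
)

// headOpen matches an opening <head> tag in any case, with optional
// attributes, without matching tags such as <header>.
var headOpen = regexp.MustCompile(`(?i)<head(\s[^>]*)?>`)

// isHTML reports whether a Content-Type header value denotes an HTML document.
func isHTML(contentType string) bool {
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mt == "text/html" || mt == "application/xhtml+xml"
}

// injectBase inserts a <base> tag right after <head> for HTML responses and
// returns every other body unchanged.
func injectBase(contentType, body, baseURL string) string {
	if !isHTML(contentType) {
		return body
	}
	loc := headOpen.FindStringIndex(body)
	if loc == nil {
		return body // no head tag found: leave the document alone
	}
	tag := `<base href="` + html.EscapeString(baseURL) + `">`
	return body[:loc[1]] + tag + body[loc[1]:]
}

func main() {
	page := "<HTML><HEAD><title>x</title></HEAD></HTML>"
	fmt.Println(injectBase("text/html; charset=utf-8", page, "https://example.com/"))
	fmt.Println(injectBase("application/pdf", "%PDF-1.7", "https://example.com/"))
}
```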
## Performance Bottlenecks
**Scraper Reddit parsing with inefficient comment recursion:**
- Problem: `walkComments` in `internal/scraper/reddit.go` (lines 245-260) recursively walks comment trees using JSON unmarshaling in each recursion level. Could cause O(n^2) behavior on deep comment threads.
- Files: `internal/scraper/reddit.go` (lines 245-260, 132-142)
- Cause: Each comment reply is unmarshaled separately. For a thread with 1000 nested replies, this could create 1000 unmarshaling operations.
- Improvement path: Pre-flatten comment tree or use iterative traversal instead of recursion. Cache unmarshaled comments during initial fetch.
**O(n) lookups on every store operation:**
- Problem: All store methods (GetUserByName, GetUserByID, FindStream by ID) iterate through entire in-memory list.
- Files: `internal/store/users.go` (lines 21-49), `internal/store/streams.go` (lines 12-52)
- Cause: File-based storage forces full-file loads. Even with caching, no indexing.
- Improvement path: With database migration, use indexed lookups. For now, maintain in-process cache with invalidation on updates.
**Rate limiter token bucket not garbage collected properly:**
- Problem: Buckets for old IPs are deleted every 10 minutes (bucketCleanup), but inactive users' buckets accumulate until cleanup cycle.
- Files: `internal/proxy/proxy.go` (lines 170-181)
- Cause: Cleanup is reactive, not triggered on write. High-traffic scenarios could have thousands of stale buckets in memory.
- Improvement path: Use sync.Map for lock-free reads. Implement heap-based cleanup timer per bucket instead of global interval.
**Entire streams/sessions list rewritten on every add/delete:**
- Problem: Adding one stream requires reading all streams, appending, and rewriting entire file. Deleting a session does the same.
- Files: `internal/store/streams.go` (lines 54-78, 80-103), `internal/store/sessions.go` (lines 22-44, 61-81)
- Cause: Atomic write pattern (writeJSON uses temp-file-then-rename), but forces full serialization.
- Improvement path: Migrate to database with transaction support. Implement write-ahead logging if staying with files.
## Fragile Areas
**Proxy string-based HTML manipulation is fragile:**
- Files: `internal/proxy/proxy.go` (lines 107-119)
- Why fragile: Uses `strings.Index` to find `<head>` and `<html>` tags with `strings.ToLower` comparisons. Cases like `<HEAD>` would be missed. Malformed HTML (missing closing tags, nested structures) could place `<base>` tag in wrong location.
- Safe modification: Use golang.org/x/net/html parser. Insert `<base>` into head node properly. Handle edge cases (no head, multiple heads, xhtml).
- Test coverage: No tests for proxy HTML injection logic. Edge cases untested.
**Auth ceremony cleanup relies on goroutines:**
- Files: `internal/auth/auth.go` (lines 112-117, 234-239)
- Why fragile: If goroutine is blocked or delayed, cleanup doesn't happen. No guarantee cleanup runs at correct time. Server crash loses all in-flight ceremonies.
- Safe modification: Use context deadlines instead of sleep timers. Implement cleanup on FinishRegistration/FinishLogin regardless of goroutine. Store ceremonies in database with TTL.
- Test coverage: No tests for ceremony timeout behavior. Hard to test goroutine cleanup timing.
**DeleteStream and related operations use `strings.Contains` for error classification:**
- Files: `internal/server/server.go` (lines 196-203)
- Why fragile: Error messages must contain specific strings ("not authorized", "not found") for proper HTTP status mapping. Changing error text breaks error handling.
- Safe modification: Use error types (custom errors or error wrapping with errors.Is/As). Map error types to status codes centrally.
- Test coverage: No tests for error status code mapping.
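The move from string matching to typed errors can be sketched as follows. The sentinel names, the `statusFor` mapper, and the stand-in `deleteStream` function are hypothetical; only `errors.Is` and `%w` wrapping are standard library behavior.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Sentinel errors that store operations could return (hypothetical names).
var (
	ErrNotFound      = errors.New("not found")
	ErrNotAuthorized = errors.New("not authorized")
)

// statusFor maps an error to an HTTP status in one central place.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	case errors.Is(err, ErrNotAuthorized):
		return http.StatusForbidden
	default:
		return http.StatusInternalServerError
	}
}

// deleteStream is a stand-in that wraps a sentinel with extra context.
func deleteStream(id string) error {
	return fmt.Errorf("delete stream %q: %w", id, ErrNotFound)
}

func main() {
	err := deleteStream("abc123")
	fmt.Println(err, "->", statusFor(err))
}
```

With this shape, rewording an error message no longer changes the status code, because classification follows the wrapped sentinel rather than the text.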
**Scraper is single-threaded with mutex but TriggerScrape starts new goroutine:**
- Files: `internal/scraper/scraper.go` (lines 42-44, 46-92)
- Why fragile: Calling TriggerScrape while scrape() is running (locked) will queue a second scrape. If scrapes take >15 minutes, queue grows. No bounds on concurrent scrapes.
- Safe modification: Use atomic flag to prevent concurrent scrapes. Queue only one pending scrape. Timeout long-running scrapes.
- Test coverage: No tests for concurrent scrape behavior or queue limits.
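The atomic-flag suggestion can be sketched as below. The `Scraper` fields and the `demo` helper are assumptions for illustration. This variant simply drops a trigger that arrives while a scrape is running; it does not queue a pending scrape or add a timeout.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Scraper runs at most one scrape at a time (hypothetical shape).
type Scraper struct {
	running atomic.Bool
	wg      sync.WaitGroup
	scrape  func()
}

// TriggerScrape starts a scrape in the background unless one is already
// running, and reports whether it started one.
func (s *Scraper) TriggerScrape() bool {
	if !s.running.CompareAndSwap(false, true) {
		return false
	}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		defer s.running.Store(false) // runs before wg.Done
		s.scrape()
	}()
	return true
}

// Wait blocks until any in-flight scrape has finished.
func (s *Scraper) Wait() { s.wg.Wait() }

// demo triggers twice while the first scrape is blocked, then once more
// after it has finished.
func demo() (first, second, third bool) {
	release := make(chan struct{})
	s := &Scraper{scrape: func() { <-release }}
	first = s.TriggerScrape()
	second = s.TriggerScrape()
	close(release)
	s.Wait()
	third = s.TriggerScrape()
	s.Wait()
	return first, second, third
}

func main() {
	fmt.Println(demo())
}
```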
**Admin check depends on user count atomicity:**
- Files: `internal/auth/auth.go` (lines 83-91)
- Why fragile: Check user count, then create user are separate operations. Two concurrent registrations both see count=0, both get admin. Later operation fails due to username uniqueness check, but by then both claimed to be admin.
- Safe modification: Move atomicity into CreateUser. Use database transaction.
- Test coverage: No concurrency tests for admin initialization.
## Scaling Limits
**All data files live on single filesystem:**
- Current capacity: Depends on disk size. Assuming 1GB available, JSON files with generous spacing could hold ~100k streams, users, or sessions before performance degrades.
- Limit: At 10k active users with 5 sessions each (50k sessions), sessions.json alone is >50MB uncompressed. Each read loads entire file.
- Scaling path: Migrate to database. Use SQLite for single-node, PostgreSQL for distributed. Implement sharding for sessions by user_id.
**In-memory rate limit buckets per IP:**
- Current capacity: ~100k unique IPs can be tracked before memory pressure (each bucket ~48 bytes).
- Limit: Behind a proxy/load balancer, all traffic appears from proxy IP, making per-IP limiting useless. Map grows indefinitely per proxy.
- Scaling path: Move rate limiting to reverse proxy/load balancer layer (nginx, Envoy). Or, extract real IP from X-Forwarded-For more carefully (currently does this, but assumes trust).
**Scraper single-threaded, only hits one subreddit:**
- Current capacity: 25 posts per run × 4 runs/hour (15-minute interval) = 100 posts/hour. Each post processes comments once. Total throughput ~1000-5000 URLs/hour depending on post depth.
- Limit: If stream demand increases or multiple subreddits need scraping, single scraper becomes bottleneck. No parallelism.
- Scaling path: Implement scraper pool. Scrape multiple subreddits in parallel. Move scraper to separate service. Implement distributed job queue.
**WebAuthn session storage grows unbounded until server restart:**
- Current capacity: Each ceremony session is ~1KB. 1000 concurrent registrations = 1MB. 100k in-flight = 100MB.
- Limit: Memory exhaustion if registrations are started but not finished (or attacker starts many ceremonies).
- Scaling path: Use database for ceremony sessions. Implement hard timeout (e.g., 5 min) enforced by scheduled cleanup task. Set max concurrent ceremonies.
## Dependencies at Risk
**go-webauthn/webauthn v0.15.0:**
- Risk: Security library. May have vulnerabilities. Check for updates regularly.
- Impact: Passkey authentication could be compromised if library has bugs.
- Migration plan: Keep updated. Monitor GitHub releases. Test updates before deploying.
**Hardcoded subreddit URL (reddit.com API):**
- Risk: Reddit API could change, add authentication requirements, or shut down /r/motorsportsstreams2 community.
- Impact: Scraper stops working entirely. No fallback stream sources.
- Migration plan: Implement abstraction for stream sources. Support multiple scraper backends (Reddit, Discord, Twitter, etc.). Add health checks for scraper endpoints.
## Test Coverage Gaps
**No tests for HTTP error handling:**
- What's not tested: Error status code mapping, error response formatting, error logging.
- Files: `internal/server/server.go` (all handlers), `internal/auth/auth.go` (all endpoints)
- Risk: Error responses could be inconsistent or leaky (exposing internal details). Status codes could be wrong.
- Priority: High
**No tests for concurrent store operations:**
- What's not tested: Race conditions in add/delete/update. Concurrent reads while write in progress.
- Files: `internal/store/streams.go`, `internal/store/sessions.go`, `internal/store/users.go`
- Risk: Data corruption or loss under load. Auth bypass if race condition allows duplicate users.
- Priority: High
**No tests for WebAuthn ceremony timeouts:**
- What's not tested: Behavior when ceremony session expires. Cleanup of orphaned sessions.
- Files: `internal/auth/auth.go` (BeginRegistration, FinishRegistration, BeginLogin, FinishLogin)
- Risk: Session fixation, orphaned memory, unexpected behavior on retry.
- Priority: Medium
**No tests for proxy HTML injection:**
- What's not tested: Edge cases (malformed HTML, no head tag, nested structures). Security implications (XSS prevention, CSP).
- Files: `internal/proxy/proxy.go` (ServeHTTP)
- Risk: Injected tags could be placed incorrectly. Proxied content could break. Security headers could be ineffective.
- Priority: Medium
**No tests for rate limiter token bucket algorithm:**
- What's not tested: Burst capacity behavior, refill rate, edge cases (high request volume, time skew).
- Files: `internal/proxy/proxy.go` (allowRequest, cleanBuckets)
- Risk: Rate limiting could be too strict or too lenient. Cleanup could fail to run.
- Priority: Medium
**No tests for admin initialization logic:**
- What's not tested: First user gets admin flag. Edge cases with concurrent registrations. Behavior when ADMIN_USERNAME is set.
- Files: `internal/auth/auth.go` (BeginRegistration, lines 83-91)
- Risk: Non-admin user gets admin flag (privilege escalation). Two admins created unexpectedly.
- Priority: High
**No integration tests for full auth flow:**
- What's not tested: Complete registration + login + logout cycle. Error recovery. Session expiration.
- Files: All of `internal/auth/auth.go` and `internal/server/server.go` auth endpoints.
- Risk: Subtle bugs in ceremony sequencing. Auth logic could break without being detected.
- Priority: High
**No tests for scraper Reddit parsing:**
- What's not tested: Comment tree recursion. URL extraction. F1 keyword matching. Deduplication logic.
- Files: `internal/scraper/reddit.go`
- Risk: Scraper could miss streams, extract bad URLs, or fail on unexpected Reddit response format.
- Priority: Medium
---
*Concerns audit: 2026-02-17*


@ -0,0 +1,159 @@
# Coding Conventions
**Analysis Date:** 2026-02-17
## Naming Patterns
**Files:**
- Go packages: lowercase, single word when possible (e.g., `auth`, `store`, `proxy`)
- Go files: lowercase with descriptive names (e.g., `server.go`, `middleware.go`, `reddit.go`)
- JSON files: snake_case (e.g., `users.json`, `sessions.json`, `scraped_links.json`)
- JavaScript files: camelCase (e.g., `app.js`, `auth.js`, `streams.js`)
**Functions and Methods:**
- Go: PascalCase for exported functions (e.g., `New`, `BeginRegistration`, `ServeHTTP`)
- Go: camelCase for unexported functions (e.g., `randomID`, `isF1Post`, `normalizeURL`)
- JavaScript: camelCase for all functions (e.g., `showToast`, `switchTab`, `doRegister`)
**Variables and Fields:**
- Go: camelCase for local variables (e.g., `streams`, `userID`, `sessionTTL`)
- Go: PascalCase for exported struct fields (e.g., `ID`, `Username`, `IsAdmin`)
- Go: prefixed mutex pattern: `resourceMu` for mutex protecting resource (e.g., `streamsMu`, `usersMu`, `sessionsMu`)
- JavaScript: camelCase for all variables (e.g., `currentUser`, `beginResp`, `container`)
**Types and Constants:**
- Go: PascalCase for exported types (e.g., `Server`, `Auth`, `Store`, `User`)
- Go: camelCase for unexported types (e.g., `contextKey`, `bucket`, `redditListing`)
- Go: camelCase for unexported constants, not SCREAMING_SNAKE_CASE (e.g., `maxBodySize`, `rateLimit`, `bucketCleanup`)
**Interfaces:**
- Go context keys use private types with unexported constants (e.g., `type contextKey string; const userKey contextKey = "user"`)
## Code Style
**Formatting:**
- Language: Go (no automated formatter config detected, using standard gofmt conventions)
- Import organization: Standard library → local packages → third-party packages (groups separated by blank lines)
- File layout: Package declaration → Imports → Constants/Variables → Types → Functions
**Linting:**
- No ESLint or golangci-lint (`.golangci.yml`) configuration found
- Go code follows idiomatic Go conventions: error checking, defer cleanup, interface composition
## Import Organization
**Go Order:**
1. Standard library imports (context, encoding/json, fmt, log, etc.)
2. Blank line
3. Local f1-stream packages (internal/auth, internal/models, etc.)
4. Blank line
5. External third-party packages (github.com/...)
**Example from `internal/auth/auth.go`:**
```go
import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"regexp"
	"sync"
	"time"

	"f1-stream/internal/models"
	"f1-stream/internal/store"

	"github.com/go-webauthn/webauthn/webauthn"
)
```
**JavaScript:**
- No explicit import organization (vanilla JavaScript, no modules)
- HTML file loads scripts in order: utils → app → auth/streams
## Error Handling
**Patterns:**
- Go: Explicit error return as the last return value, checked immediately (e.g., `err := operation(); if err != nil { return err }`)
- Go: Wrapping errors with context: `fmt.Errorf("operation failed: %w", err)`
- Go: String matching on error messages for classification (see `internal/server/server.go` lines 196-205)
- Go: Logging errors with `log.Printf()` for non-critical failures, `log.Fatalf()` for startup errors
- Go: HTTP errors returned via `http.Error(w, message, statusCode)` for API endpoints
- JavaScript: Try-catch blocks for async operations, error fields in UI (e.g., `errEl.textContent = err.error || 'Operation failed'`)
**HTTP Error Responses:**
- Standard JSON format: `{"error":"description"}`
- Success responses vary by endpoint (JSON arrays, `{"ok":true}`, encoded objects via `json.NewEncoder`)
## Logging
**Framework:** `log` package (standard library)
**Patterns:**
- Informational: `log.Printf("message with %v context", value)`
- Errors: `log.Printf("operation failed: %v", err)`
- Startup: `log.Fatalf("critical: %v", err)` for initialization failures
- Component prefixes: `log.Printf("scraper: action description")`
**Example from `internal/scraper/scraper.go`:**
```go
log.Printf("scraper: starting scrape")
log.Printf("scraper: error after %v: %v", time.Since(start).Round(time.Millisecond), err)
log.Printf("scraper: done in %v, added %d new links (total: %d)", time.Since(start).Round(time.Millisecond), added, len(existing))
```
## Comments
**When to Comment:**
- Explain WHY, not WHAT (code shows what, comments explain reasoning)
- Used for non-obvious logic or security concerns
- Example from `internal/proxy/proxy.go` line 123: `// Explicitly do NOT copy X-Frame-Options or CSP`
- Example from `internal/auth/auth.go` line 120: `// Store user temporarily - will be committed on finish`
**Patterns:**
- Short inline comments before complex sections
- Package-level comments before exported types explaining purpose
- Security/business logic gets explained
## Function Design
**Size:** Functions keep complexity low, typically 20-50 lines; larger operations split across helpers
**Parameters:**
- Receiver methods use pointer receivers: `func (s *Store) GetSession(token string) ...`
- Constructor pattern returns initialized type and error: `func New(...) (*Type, error)`
- HTTP handlers follow signature: `func(w http.ResponseWriter, r *http.Request)`
**Return Values:**
- Errors always returned as last value: `(result, error)`
- Multiple return values when needed: `(*Type, error)` or `([]Type, error)`
- HTTP handlers write directly to ResponseWriter, return via `http.Error()` or direct writes
- Query methods return nil for "not found" rather than error (see `internal/store/users.go` line 33)
## Module Design
**Exports:**
- Exported names start with capital letter (e.g., `New`, `User`, `Server`)
- Unexported helpers start with lowercase
- Types exported when they're part of public API
- Helper functions (e.g., `isF1Post`, `normalizeURL`) kept unexported
**Barrel Files:** Not used; each file covers a single concern
**Package Organization:**
- `internal/auth/`: Authentication and WebAuthn implementation
- `internal/store/`: Data persistence (users.go, streams.go, sessions.go, scraped.go, store.go)
- `internal/server/`: HTTP routing and middleware
- `internal/scraper/`: Reddit scraping logic
- `internal/proxy/`: HTTP proxy with rate limiting
- `internal/models/`: Type definitions only
**Struct Composition:**
- `Server` struct holds dependencies injected at construction (line 15-21 in `internal/server/server.go`)
- Methods extend functionality through receiver pattern
- No inheritance (Go has none); composition via embedded types is used sparingly
---
*Convention analysis: 2026-02-17*

@ -0,0 +1,121 @@
# External Integrations
**Analysis Date:** 2026-02-17
## APIs & External Services
**Reddit API:**
- Service: Reddit public JSON API (no authentication required)
- What it's used for: Scraping F1 stream links from r/motorsportsstreams2 subreddit
- Fetches 25 most recent posts: `https://www.reddit.com/r/motorsportsstreams2/new.json?limit=25`
- Extracts URLs from post titles and comment bodies
- Filters by F1-related keywords
- Runs on configurable interval (default 15 minutes)
- Implementation: `internal/scraper/reddit.go`
- Authentication: None - public API endpoint
**HTTP Proxy Service:**
- Service: Internal HTTP proxy for accessing external streams
- What it's used for: Fetching and proxying external stream pages while enforcing security policies
- Rate limiting: 30 requests/minute per IP with 5-request burst capacity
- URL validation: Only HTTPS URLs allowed, 2048-character limit
- Private IP blocking: Blocks requests to loopback, private, and link-local addresses
- Content transformation: Injects `<base>` tag for relative URL resolution
- Strips X-Frame-Options and CSP headers to allow iframe embedding
- Implementation: `internal/proxy/proxy.go`
- Endpoint: `GET /proxy?url=[url]`
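The URL validation and private-address rules listed above could look roughly like the following. This is a sketch under stated assumptions: `validateTarget` and `isPrivateIP` are hypothetical names, and only literal IP hosts are checked here, whereas a real proxy would also need to resolve hostnames before applying the same check.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// maxURLLen mirrors the documented 2048-character limit.
const maxURLLen = 2048

// isPrivateIP reports whether ip is loopback, private, or link-local.
func isPrivateIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast()
}

// validateTarget applies the documented rules to a raw URL string.
// Assumption: hostnames are not resolved in this sketch; only IP literals are inspected.
func validateTarget(raw string) error {
	if len(raw) > maxURLLen {
		return errors.New("url too long")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "https" || u.Hostname() == "" {
		return errors.New("only https URLs are allowed")
	}
	if ip := net.ParseIP(u.Hostname()); ip != nil && isPrivateIP(ip) {
		return errors.New("private address not allowed")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://example.com/stream",
		"http://example.com/stream",
		"https://127.0.0.1/admin",
		"https://10.0.0.5/",
	} {
		fmt.Printf("%-30s -> %v\n", raw, validateTarget(raw))
	}
}
```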
## Data Storage
**File Storage:**
- Type: Local filesystem (JSON files)
- Location: Configurable via `DATA_DIR` environment variable (default: `/data`)
- Persistence mechanism:
- Atomic writes using temp-file-then-rename pattern
- No database server required
- Files stored in flat structure:
- `streams.json` - User-submitted and scraped stream links
- `users.json` - User accounts with WebAuthn credentials
- `sessions.json` - Active user sessions
- `scraped.json` - Reddit-scraped links
- Client: Go `encoding/json` standard library with sync.RWMutex for thread-safe access
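The temp-file-then-rename pattern mentioned above can be sketched as follows. `writeJSONAtomic` and `roundTrip` are hypothetical names used for illustration; the store's own `writeJSON` may differ in detail, and the `sync.RWMutex` locking is left out to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeJSONAtomic writes v to path so readers never observe a half-written file.
func writeJSONAtomic(path string, v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	// The temp file lives in the target directory so the rename stays on one filesystem.
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename replaces the destination in a single step on POSIX filesystems.
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes a small value to a scratch directory and reads it back.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "store-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "streams.json")
	if err := writeJSONAtomic(path, []string{"a", "b"}); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Println(roundTrip())
}
```

Creating the temp file next to the destination matters: a rename across filesystems is not atomic and may fail outright.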
**Caching:**
- Type: None - file-based storage only
- Session cleanup: Automatic garbage collection every 1 hour
## Authentication & Identity
**Auth Provider:**
- Type: Custom WebAuthn/FIDO2 implementation
- Library: github.com/go-webauthn/webauthn v0.15.0
- Implementation details:
- Passwordless authentication using WebAuthn standard
- Registration ceremony: `POST /api/auth/register/begin` → `POST /api/auth/register/finish`
- Login ceremony: `POST /api/auth/login/begin` → `POST /api/auth/login/finish`
- Session tokens stored in HTTP-only, SameSite-strict cookies
- In-memory ceremony data storage with 5-minute expiration
- Manual admin assignment via `ADMIN_USERNAME` env var
- First user automatically becomes admin if no `ADMIN_USERNAME` set
- Files: `internal/auth/auth.go`, `internal/auth/context.go`
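As an illustration of the cookie attributes described above, a session cookie might be built like this. The cookie name `session`, the `Secure` flag, and the function name `sessionCookie` are assumptions; the source only states that the cookie is HTTP-only and SameSite-strict.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// sessionCookie builds an HTTP-only, SameSite=Strict cookie carrying the session token.
// Assumptions: the cookie name and the use of MaxAge derived from the session TTL.
func sessionCookie(token string, ttl time.Duration, secure bool) *http.Cookie {
	return &http.Cookie{
		Name:     "session", // hypothetical name
		Value:    token,
		Path:     "/",
		MaxAge:   int(ttl.Seconds()),
		HttpOnly: true,                    // not readable from JavaScript
		SameSite: http.SameSiteStrictMode, // never sent on cross-site requests
		Secure:   secure,                  // should be true whenever the site is served over HTTPS
	}
}

func main() {
	c := sessionCookie("token-value", 720*time.Hour, true)
	fmt.Println(c.String())
}
```

In a handler the cookie would be attached with `http.SetCookie(w, c)`.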
## Monitoring & Observability
**Error Tracking:**
- Type: None - no external error tracking service
- Implementation: Standard Go logging with `log` package
**Logs:**
- Format: Standard Go log output (stdout)
- Level: Info and error messages
- No centralized logging, no external integration
## CI/CD & Deployment
**Hosting:**
- Platform: Kubernetes (Terraform module at `infra/modules/kubernetes/f1-stream/`)
- Deployment method: Container image
**CI Pipeline:**
- Type: Not detected in this codebase
- Build method: Dockerfile multi-stage build
- Builder: golang:1.23-alpine with `go mod download`
- Runtime: alpine:3.20 with minimal dependencies
## Environment Configuration
**Required env vars (with defaults):**
- `LISTEN_ADDR` - Server listen address (default: `:8080`)
- `DATA_DIR` - Data storage directory (default: `/data`)
- `SCRAPE_INTERVAL` - Reddit scraper frequency (default: 15m)
- `SESSION_TTL` - Session expiration (default: 720h)
- `PROXY_TIMEOUT` - Proxy request timeout (default: 10s)
- `WEBAUTHN_RPID` - Relying party ID (default: `localhost`)
- `WEBAUTHN_ORIGIN` - Origin URL list, comma-separated (default: `http://localhost:8080`)
- `WEBAUTHN_DISPLAY_NAME` - UI display name (default: `F1 Stream`)
- `ADMIN_USERNAME` - Optional: pre-set admin username (no default)
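Reading these variables with their defaults could be sketched as below. `envOr` and `envDuration` are hypothetical helpers, and parsing with `time.ParseDuration` is an assumption based on the documented default formats (`15m`, `720h`, `10s`); `main.go` may do this differently.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// envOr returns the value of key, or def when the variable is unset or empty.
func envOr(key, def string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return def
}

// envDuration parses key as a Go duration string, falling back to def when unset.
func envDuration(key string, def time.Duration) (time.Duration, error) {
	v := envOr(key, "")
	if v == "" {
		return def, nil
	}
	return time.ParseDuration(v)
}

func main() {
	listenAddr := envOr("LISTEN_ADDR", ":8080")
	dataDir := envOr("DATA_DIR", "/data")
	scrapeInterval, err := envDuration("SCRAPE_INTERVAL", 15*time.Minute)
	if err != nil {
		fmt.Println("invalid SCRAPE_INTERVAL:", err)
		return
	}
	fmt.Println(listenAddr, dataDir, scrapeInterval)
}
```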
**Secrets location:**
- No secrets required - uses WebAuthn credentials stored locally
- CORS origin validation via `WEBAUTHN_ORIGIN` env var
## Webhooks & Callbacks
**Incoming:**
- None detected
**Outgoing:**
- None detected
## Stream Link Sources
**Default Stream URLs (hardcoded in main.go):**
1. `https://wearechecking.live/streams-pages/motorsports` - WeAreChecking Motorsports
2. `https://vipleague.im/formula-1-schedule-streaming-links` - VIPLeague F1
3. `https://www.vipbox.lc/` - VIPBox
4. `https://f1box.me/` - F1Box
5. `https://1stream.vip/formula-1-streams/` - 1Stream F1
---
*Integration audit: 2026-02-17*

@ -0,0 +1,109 @@
# Technology Stack
**Analysis Date:** 2026-02-17
## Languages
**Primary:**
- Go 1.24.1 - Backend application and main server logic
**Secondary:**
- HTML/CSS/JavaScript - Frontend UI
## Runtime
**Environment:**
- Go runtime (compiled binary)
**Container Runtime:**
- Docker/Alpine Linux (3.20) - Production deployment target
- Multi-stage Dockerfile with golang:1.23-alpine builder
**Package Manager:**
- Go modules (go.mod/go.sum)
## Frameworks
**Core:**
- Standard Go `net/http` - HTTP server and routing
- Native http.ServeMux for route handling (Go 1.22+ pattern routing)
- Native http.FileServer for static file serving
- Native http.Handler interface for middleware
**Authentication:**
- github.com/go-webauthn/webauthn v0.15.0 - WebAuthn/FIDO2 authentication
- Handles registration and login ceremonies
- Supports multiple credential types
**Frontend:**
- HTML5 - Markup
- CSS - Styling (Pico CSS framework for minimal styling)
- Vanilla JavaScript - Client-side interactivity (no framework detected)
## Key Dependencies
**Critical:**
- github.com/go-webauthn/webauthn v0.15.0 - Passwordless authentication via WebAuthn
- Includes transitive dependencies:
- github.com/go-webauthn/x v0.1.26 - WebAuthn extension support
- github.com/golang-jwt/jwt/v5 v5.3.0 - JWT token handling
- github.com/google/go-tpm v0.9.6 - TPM support for credentials
- github.com/fxamacker/cbor/v2 v2.9.0 - CBOR encoding/decoding
- github.com/go-viper/mapstructure/v2 v2.4.0 - Configuration mapping
- github.com/google/uuid v1.6.0 - UUID generation
- golang.org/x/crypto v0.43.0 - Cryptographic primitives
- golang.org/x/sys v0.37.0 - System-level primitives
**Infrastructure:**
- None detected (no external databases, queues, or third-party services in go.mod)
- File-based storage only
## Configuration
**Environment Variables:**
- `LISTEN_ADDR` - Server listen address (default: `:8080`)
- `DATA_DIR` - Data storage directory (default: `/data`)
- `SCRAPE_INTERVAL` - Reddit scraper interval (default: 15 minutes)
- `ADMIN_USERNAME` - Admin account username (optional)
- `SESSION_TTL` - Session expiration time (default: 720 hours)
- `PROXY_TIMEOUT` - HTTP proxy request timeout (default: 10 seconds)
- `WEBAUTHN_RPID` - WebAuthn relying party ID (default: `localhost`)
- `WEBAUTHN_ORIGIN` - WebAuthn origin URL (default: `http://localhost:8080`)
- `WEBAUTHN_DISPLAY_NAME` - WebAuthn display name (default: `F1 Stream`)
**Build:**
- `Dockerfile` - Multi-stage Docker build
- Builder stage: golang:1.23-alpine with CGO_ENABLED=0
- Runtime stage: alpine:3.20 with ca-certificates
- Exposes port 8080
## Platform Requirements
**Development:**
- Go 1.24.1 or compatible
- Unix-like shell (bash/zsh) for build scripts
- Optional: Docker for containerized development
**Production:**
- Kubernetes cluster (Terraform module structure suggests K8s deployment)
- Persistent volume for `/data` directory
- Port 8080 exposed for HTTP traffic
- ca-certificates for HTTPS proxying
## Storage
**Data Persistence:**
- File-based JSON storage in `DATA_DIR`
- Files: `streams.json`, `users.json`, `sessions.json`, `scraped.json`
- Atomic writes using temp-file-then-rename pattern (`writeJSON` function in `internal/store/store.go`)
## External Data Sources
**Reddit API:**
- URL: `https://www.reddit.com/r/motorsportsstreams2/new.json?limit=25`
- No authentication required (public subreddit)
- Used for scraping F1 stream links
---
*Stack analysis: 2026-02-17*

@ -0,0 +1,202 @@
# Codebase Structure
**Analysis Date:** 2026-02-17
## Directory Layout
```
f1-stream/
├── main.go # Entry point, service initialization, signal handling
├── go.mod # Go module definition
├── go.sum # Dependency lock file
├── Dockerfile # Container image definition
├── redeploy.sh # Kubernetes redeployment script
├── index.html # HTML template served at root
├── internal/ # Private Go packages
│ ├── auth/ # WebAuthn authentication and session management
│ ├── models/ # Domain data types
│ ├── server/ # HTTP handlers, routes, middleware
│ ├── store/ # File-based persistence layer
│ ├── scraper/ # Reddit content scraper
│ └── proxy/ # HTTP proxy with rate limiting
├── static/ # Frontend assets served to clients
│ ├── index.html # Main SPA template
│ ├── css/ # Stylesheets
│ └── js/ # Client-side JavaScript modules
└── .planning/ # Planning/documentation directory
└── codebase/ # Architecture analysis documents
```
## Directory Purposes
**Root Level:**
- Purpose: Service configuration and entry point
- Contains: Go module, main executable, Docker configuration, shell scripts
- Key files: `main.go` (service bootstrap), `go.mod` (dependencies)
**`internal/`:**
- Purpose: Private packages (not importable by external code)
- Contains: All business logic, separated by concern
- Key pattern: Each subdirectory is a distinct Go package with clear responsibility
**`internal/auth/`:**
- Purpose: User authentication, session management, context helpers
- Contains: WebAuthn ceremony handlers, session token management, user-in-context utilities
- Key files:
- `auth.go`: Registration/login handlers, ceremony session storage, credential validation
- `context.go`: Request context helpers for passing user data between middleware and handlers
**`internal/models/`:**
- Purpose: Domain model definitions
- Contains: User, Stream, ScrapedLink, Session type definitions
- Key files: `models.go` (all types, includes WebAuthn interface implementations)
**`internal/server/`:**
- Purpose: HTTP API and routing layer
- Contains: Handler functions, route registration, middleware implementations
- Key files:
- `server.go`: Server struct, route registration, API handlers (streams, admin, public endpoints)
- `middleware.go`: LoggingMiddleware, RecoveryMiddleware, AuthMiddleware, RequireAuth, RequireAdmin, OriginCheck
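The middleware listed above follows the usual `func(http.Handler) http.Handler` shape. The sketch below shows how a logging and a panic-recovery wrapper compose; `logRequests`, `recoverPanics`, `boom`, and `probe` are hypothetical names, and the real `LoggingMiddleware` and `RecoveryMiddleware` may behave differently.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// logRequests logs the method and path of every request before passing it on.
func logRequests(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

// recoverPanics turns a panic in a downstream handler into a 500 response.
func recoverPanics(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				log.Printf("panic recovered: %v", rec)
				http.Error(w, `{"error":"internal server error"}`, http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// boom is a handler that always panics, used to exercise recoverPanics.
var boom = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	panic("boom")
})

// probe sends a GET request through h and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	// The outermost wrapper runs first: logging, then recovery, then the handler.
	fmt.Println(probe(logRequests(recoverPanics(boom))))
}
```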
**`internal/store/`:**
- Purpose: Persistent storage abstraction over file system
- Contains: JSON file operations, per-entity storage methods, atomic write patterns
- Key files:
- `store.go`: Store struct, directory initialization, JSON helper functions (readJSON, writeJSON)
- `streams.go`: Stream CRUD operations, publish toggle, seeding
- `users.go`: User lookup, credential updates, admin count
- `sessions.go`: Session creation, validation, expiry cleanup
- `scraped.go`: Scraped link persistence, active link filtering
**`internal/scraper/`:**
- Purpose: Background content aggregation
- Contains: Interval-based scraper, Reddit-specific scraper logic
- Key files:
- `scraper.go`: Scraper service, interval-based run loop, manual trigger mechanism, deduplication logic
- `reddit.go`: Reddit API polling, F1 keyword filtering, URL extraction (not included in sample reads but referenced)
**`internal/proxy/`:**
- Purpose: HTTP content fetching with security controls and rate limiting
- Contains: Rate limiter, private IP validation, response modification
- Key files: `proxy.go` (implements http.Handler, rate limiting, content fetching, base tag injection)
**`static/`:**
- Purpose: Frontend assets served to browser
- Contains: HTML template and client-side code
- Key files:
- `index.html`: SPA HTML template (includes script tags loading js/)
- `js/app.js`: Toast notifications, dialog system, tab switching, initialization
- `js/auth.js`: Registration/login UI, WebAuthn client ceremony
- `js/streams.js`: Stream display, filtering, admin operations
- `js/utils.js`: Shared utilities (HTML escaping)
- `css/`: Stylesheets for app UI
## Key File Locations
**Entry Points:**
- `main.go`: Service initialization, dependency injection, signal handling, goroutine startup
**Configuration:**
- Environment variables read in `main.go` (LISTEN_ADDR, DATA_DIR, SCRAPE_INTERVAL, etc.)
- WebAuthn config passed to `auth.New()`
- `.env` files not tracked (see .gitignore)
**Core Logic:**
- Request routing: `internal/server/server.go:registerRoutes()`
- Auth logic: `internal/auth/auth.go`
- Data storage: `internal/store/store.go` and per-entity files
- Scraping: `internal/scraper/scraper.go`
- Proxying: `internal/proxy/proxy.go`
**Testing:**
- No test files present in codebase (see TESTING.md concerns section)
## Naming Conventions
**Files:**
- Go source files: lowercase, with underscores separating words where needed (e.g., `auth.go`, `middleware.go`)
- JavaScript files: lowercase, with hyphens or underscores separating words where needed (e.g., `app.js`, `auth.js`)
- JSON data files: lowercase (e.g., `streams.json`, `users.json`, `sessions.json`)
**Directories:**
- Go packages: lowercase, single word preferred (e.g., `auth`, `store`, `models`)
- Frontend assets: plural nouns (e.g., `static`, `css`, `js`)
**Functions:**
- Go: CamelCase (exported), camelCase (unexported)
- JavaScript: camelCase throughout (e.g., `loadPublicStreams()`, `showToast()`)
**Types:**
- Go structs: CamelCase (e.g., `User`, `Stream`, `Store`, `Auth`)
- Methods: CamelCase (e.g., `BeginLogin()`, `AddStream()`)
**Variables:**
- Go: camelCase (e.g., `listenAddr`, `dataDir`, `adminUsername`)
- JavaScript: camelCase (e.g., `container`, `userID`, `sessionToken`)
## Where to Add New Code
**New Feature (e.g., new stream filter):**
- Primary code: Add handler in `internal/server/server.go`, register route in `registerRoutes()`
- Store operations: Add method to appropriate file in `internal/store/` (likely `streams.go`)
- Frontend: Add UI in `static/` and API call in `static/js/streams.js` or new module
- Models: Extend types in `internal/models/models.go` if new fields needed
**New Authentication Method:**
- Core implementation: New file in `internal/auth/` (e.g., `oauth.go`)
- Handlers: Add methods following WebAuthn pattern (BeginXxx, FinishXxx)
- Routes: Register in `registerRoutes()`
- Frontend: Add form/button in `static/js/auth.js`
**New Background Service (e.g., content validator):**
- Implementation: New file in `internal/` or new package `internal/validator/`
- Integration: Initialize in `main()` alongside `scraper.New()`
- Lifecycle: Use context pattern from scraper's `Run(ctx)` method
- Storage: Use existing `Store` instance
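The context-driven lifecycle referred to above can be sketched as a ticker loop that exits when the context is cancelled. `Validator` and `runFor` are hypothetical, and the loop body is a placeholder; the scraper's actual `Run(ctx)` was not shown in the source.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// Validator is a hypothetical background service that does work on a fixed interval.
type Validator struct {
	interval time.Duration
	runs     atomic.Int64 // number of completed ticks, for observation only
}

// Run blocks until ctx is cancelled, performing one unit of work per tick.
func (v *Validator) Run(ctx context.Context) {
	ticker := time.NewTicker(v.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return // shutdown requested, e.g. on SIGTERM
		case <-ticker.C:
			v.runs.Add(1) // placeholder for the real validation work
		}
	}
}

// runFor runs a Validator for the given total time and reports how many ticks fired.
func runFor(interval, total time.Duration) int64 {
	ctx, cancel := context.WithTimeout(context.Background(), total)
	defer cancel()
	v := &Validator{interval: interval}
	v.Run(ctx)
	return v.runs.Load()
}

func main() {
	fmt.Println("ticks:", runFor(10*time.Millisecond, 100*time.Millisecond))
}
```

In `main()` the service would typically be started with `go v.Run(ctx)`, sharing the same context that signal handling cancels.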
**Utilities/Helpers:**
- Shared by Go packages: Add to package where most useful, or create new `internal/util/` package
- Shared by frontend: Add to `static/js/utils.js` or create new module
- Shared helpers pattern: Functions not tied to single package, used across multiple
## Special Directories
**`internal/`:**
- Purpose: Enforce package privacy (cannot be imported by external code)
- Generated: No
- Committed: Yes
**`static/`:**
- Purpose: Served directly to clients via `http.FileServer`
- Generated: No (hand-written frontend)
- Committed: Yes
**`.planning/codebase/`:**
- Purpose: Architecture documentation for development guidance
- Generated: No (manually created by mapping process)
- Committed: Yes
**Data Directory (runtime):**
- Purpose: Persistent JSON files (streams.json, users.json, sessions.json, scraped.json)
- Location: Specified by DATA_DIR env var (default `/data`)
- Generated: Yes (created on first run)
- Committed: No (varies per deployment environment)
## Import Patterns
**Go Package Imports:**
- Standard library first: `import ("context" "fmt" "log")`
- Internal packages second: `import ("f1-stream/internal/auth" "f1-stream/internal/store")`
- External third-party last: `import ("github.com/go-webauthn/webauthn/webauthn")`
**Cross-Package Dependencies:**
- Server depends on: Auth, Store, Proxy, Scraper, Models
- Auth depends on: Store, Models
- Scraper depends on: Store, Models
- Proxy depends on: none (standalone service)
- Store depends on: Models
- Models depends on: external WebAuthn library only
---
*Structure analysis: 2026-02-17*

@ -0,0 +1,256 @@
# Testing Patterns
**Analysis Date:** 2026-02-17
## Test Framework
**Status:** No testing infrastructure present
**Runner:** Not detected
**Assertion Library:** Not applicable
**Run Commands:** Not applicable
## Test File Organization
**Current State:** Zero test files found
After scanning the codebase:
- No `*_test.go` files in `internal/` packages
- No `*.test.js` or `*.spec.js` files in static assets
- No test configuration files (jest.config.js, vitest.config.ts, etc.)
- No test runners in go.mod dependencies
## Test Coverage
**Requirements:** Not enforced; no test infrastructure
**Current Coverage:** 0% - no tests exist
## Test Types Present in Codebase
### Unit Test Candidates (Not Currently Tested)
**`internal/models/models.go`:**
- User and Stream model struct definitions
- WebAuthn interface implementations (lines 18-21)
**`internal/auth/auth.go`:**
- Username validation via regex `usernameRe` (line 19)
- Registration/login ceremony steps
- Session creation and token generation
- Admin user detection logic (lines 83-91)
**`internal/store/*.go` (all files):**
- JSON read/write operations with file locking
- User lookup and stream operations
- Session creation, validation, and cleanup
- Scraped link filtering and deduplication
**`internal/scraper/reddit.go`:**
- F1 post detection: `isF1Post()` function (line 272-285)
- URL normalization: `normalizeURL()` function (line 262-270)
- URL extraction: `extractURLs()` function (line 210-243)
- Comment walking: `walkComments()` function (line 245-260)
- Keyword matching logic (lines 29-45)
- Retry logic with backoff (lines 183-208)
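To make the URL-normalization candidate concrete, here is one plausible shape for such a function. This is an assumption, not the codebase's `normalizeURL`: the name `normalizeSketch` and the specific rules (lowercase host, no trailing slash, no fragment) are chosen for illustration only.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeSketch canonicalizes a URL so that trivially different spellings compare equal.
// Assumed rules: lowercase host, trailing slashes trimmed, fragment dropped.
func normalizeSketch(raw string) string {
	raw = strings.TrimSpace(raw)
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return raw // leave unparseable input untouched
	}
	u.Host = strings.ToLower(u.Host)
	u.Path = strings.TrimRight(u.Path, "/")
	u.Fragment = ""
	return u.String()
}

func main() {
	seen := map[string]bool{}
	for _, raw := range []string{
		"https://Example.com/Stream/",
		"https://example.com/Stream",
		"https://example.com/Stream#player",
	} {
		key := normalizeSketch(raw)
		fmt.Printf("%-36s -> %s (duplicate: %v)\n", raw, key, seen[key])
		seen[key] = true
	}
}
```

Using the normalized string as a map key is a simple way to deduplicate links before they are persisted.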
**`internal/proxy/proxy.go`:**
- Rate limiting with token bucket algorithm (lines 145-168)
- Private host detection: `isPrivateHost()` function (line 128-143)
- Client IP extraction: `clientAddr()` function (line 184-191)
- Bucket cleanup mechanism (lines 170-182)
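A per-client token bucket of the kind named above can be sketched as follows, using the documented figures of 30 requests per minute with a burst of 5. The types `limiter` and `bucket`, the explicit `now` parameter, and the `simulate` helper are hypothetical; passing the clock in makes the logic testable without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket tracks the remaining tokens for one client and when they were last refilled.
type bucket struct {
	tokens float64
	last   time.Time
}

// limiter is a hypothetical per-key token bucket rate limiter.
type limiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // maximum tokens a bucket can hold
	buckets map[string]*bucket
}

func newLimiter(perMinute, burst int) *limiter {
	return &limiter{
		rate:    float64(perMinute) / 60,
		burst:   float64(burst),
		buckets: map[string]*bucket{},
	}
}

// allow refills the caller's bucket for the elapsed time, then spends one token if available.
func (l *limiter) allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.burst, last: now} // new clients start with a full bucket
		l.buckets[key] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// simulate replays requests from one client at the given offsets (in seconds).
func simulate(offsets []float64) []bool {
	l := newLimiter(30, 5)
	base := time.Unix(0, 0)
	out := make([]bool, len(offsets))
	for i, s := range offsets {
		out[i] = l.allow("203.0.113.7", base.Add(time.Duration(s*float64(time.Second))))
	}
	return out
}

func main() {
	fmt.Println(simulate([]float64{0, 0, 0, 0, 0, 0, 2}))
}
```

With these numbers a token is restored every two seconds, so the sixth request at the same instant is rejected and a request two seconds later is allowed again.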
**`internal/server/middleware.go`:**
- Auth middleware context injection
- Authorization checks (RequireAuth, RequireAdmin)
- Origin validation for CSRF protection
- Panic recovery middleware
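The origin validation mentioned above might work along these lines. This is a sketch under assumptions: `originCheck`, `ok`, and `probe` are hypothetical names, and the rule "safe methods pass, all others need an allowed `Origin` header" is a common CSRF defence rather than a confirmed description of the real `OriginCheck`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// originCheck rejects state-changing requests whose Origin header is not in the allow list.
func originCheck(allowed []string, next http.Handler) http.Handler {
	allow := make(map[string]bool, len(allowed))
	for _, o := range allowed {
		allow[o] = true
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodGet, http.MethodHead, http.MethodOptions:
			// Safe methods are not expected to change state, so they pass through.
		default:
			if !allow[r.Header.Get("Origin")] {
				http.Error(w, `{"error":"forbidden origin"}`, http.StatusForbidden)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// ok is a trivial downstream handler that replies 200.
var ok = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "ok")
})

// probe sends a request with the given method and Origin header and returns the status.
func probe(h http.Handler, method, origin string) int {
	req := httptest.NewRequest(method, "/api/streams", nil)
	if origin != "" {
		req.Header.Set("Origin", origin)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := originCheck([]string{"http://localhost:8080"}, ok)
	fmt.Println(probe(h, "POST", "http://localhost:8080"))
	fmt.Println(probe(h, "POST", "https://evil.example"))
	fmt.Println(probe(h, "GET", ""))
}
```

The allow list would naturally come from the comma-separated `WEBAUTHN_ORIGIN` value.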
### Integration Test Candidates (Not Currently Tested)
**Authentication Flow:**
- Begin registration → finish registration → session creation
- Begin login → finish login → session creation
- Session validation and expiration
- WebAuthn ceremony with mock credentials
**Stream Management:**
- Add stream → save to JSON → retrieve
- Delete stream with authorization checks
- Toggle publish status
- Filter streams by visibility/ownership
**Scraping Pipeline:**
- Fetch Reddit listing
- Extract F1 posts
- Walk comments recursively
- Deduplicate URLs
- Merge with existing links
### E2E Test Candidates (Not Currently Tested)
**HTTP Endpoints:**
- Full registration flow (POST /api/auth/register/begin, /api/auth/register/finish)
- Full login flow (POST /api/auth/login/begin, /api/auth/login/finish)
- Stream CRUD operations
- Public stream viewing
- Scrape triggering and result retrieval
## Critical Untested Paths
**High Risk - Security:**
- Authentication middleware context injection (`internal/server/middleware.go` lines 33-43)
- Admin authorization checks (line 62)
- CSRF origin validation (line 78-88)
- Private address filtering in proxy (line 77-80 in `internal/proxy/proxy.go`)
- Rate limiting enforcement (line 62-65 in `internal/proxy/proxy.go`)
**High Risk - Data Integrity:**
- Concurrent access to store files via mutex protection (no verification that race conditions are prevented)
- JSON read/write atomicity with temp files (lines 41-52 in `internal/store/store.go`)
- Session expiration cleanup (lines 83-98 in `internal/store/sessions.go`)
- Stream deduplication during scraping (lines 65-85 in `internal/scraper/scraper.go`)
**Medium Risk - Business Logic:**
- F1 post detection with negative keywords (lines 272-285 in `internal/scraper/reddit.go`)
- URL normalization for deduplication (line 262-270)
- Retry logic with rate limit backoff (line 183-208)
## What Needs Testing
### Unit Test Suggestions
```go
// Each example belongs in a _test.go file of the package it exercises
// and needs `import "testing"` (plus "time" for the rate-limit test).

// Example: Test username validation
func TestUsernameValidation(t *testing.T) {
	tests := []struct {
		username string
		valid    bool
	}{
		{"valid123", true},
		{"valid_name", true},
		{"ab", false},           // too short
		{"invalid-char", false}, // invalid character
		{"", false},             // empty
	}
	for _, tt := range tests {
		if got := usernameRe.MatchString(tt.username); got != tt.valid {
			t.Errorf("usernameRe.MatchString(%q) = %v, want %v", tt.username, got, tt.valid)
		}
	}
}

// Example: Test F1 post detection
func TestIsF1Post(t *testing.T) {
	tests := []struct {
		title    string
		expected bool
	}{
		{"F1 GP Race - Monaco", true},
		{"Formula 1 Practice", true},
		{"Help with F1 key binding", false}, // negative keyword
		{"Random post about cars", false},
	}
	for _, tt := range tests {
		if got := isF1Post(tt.title); got != tt.expected {
			t.Errorf("isF1Post(%q) = %v, want %v", tt.title, got, tt.expected)
		}
	}
}

// Example: Test URL normalization
func TestNormalizeURL(t *testing.T) {
	// Check that different URL formats normalize to same string
	// Check case-insensitivity and trailing slash handling
}

// Example: Test rate limiting
func TestRateLimiting(t *testing.T) {
	p := New(10 * time.Second)
	ip := "192.168.1.1"
	// First burst allowed
	for i := 0; i < 5; i++ {
		if !p.allowRequest(ip) {
			t.Fatalf("request %d within the burst was rejected", i+1)
		}
	}
	// Burst exhausted
	if p.allowRequest(ip) {
		t.Fatal("request beyond the burst was allowed")
	}
	// Wait and verify replenishment (at 30 requests/minute a token returns every 2s)
	time.Sleep(10 * time.Second)
	if !p.allowRequest(ip) {
		t.Fatal("request after the refill wait was rejected")
	}
}
```
### Integration Test Suggestions
```go
// Example: Test store operations with concurrency
func TestConcurrentStreamOperations(t *testing.T) {
	st, err := store.New(t.TempDir())
	if err != nil {
		t.Fatalf("store.New: %v", err)
	}
	// Concurrent adds to st from multiple goroutines
	// Verify no data corruption
	// Verify final count is correct
}

// Example: Test scraper deduplication
func TestScraperDeduplication(t *testing.T) {
	// Create scraper with test store
	// Mock Reddit response with duplicate URLs
	// Verify only unique URLs are stored
	// Verify normalization works (http vs https, trailing slashes)
}

// Example: Test auth middleware
func TestAuthMiddleware(t *testing.T) {
	st, err := store.New(t.TempDir())
	if err != nil {
		t.Fatalf("store.New: %v", err)
	}
	// Named a (not auth) so the variable does not shadow the auth package.
	a, _ := auth.New(st, ...)
	// Create test token
	// Make request with session cookie
	// Verify user injected into context
}
```
## Recommended Testing Strategy
1. **Phase 1 - Unit Tests (Highest Priority):**
- Validation functions (username regex, F1 keywords)
- String utilities (URL normalization, truncate)
- Rate limiting algorithm
- Private host detection
2. **Phase 2 - Integration Tests:**
- Store operations with concurrency (verify mutex protection)
- Scraper pipeline (Reddit fetch → parse → deduplicate → save)
- Auth ceremony flow with mock WebAuthn
- Stream CRUD with permission checks
3. **Phase 3 - E2E Tests:**
- Full HTTP request flows
- Middleware chain validation
- Session management across endpoints
## Testing Patterns to Establish
**Once a framework is chosen (Go: standard `testing` package or testify):**
- Use `t.TempDir()` for store tests to avoid file conflicts
- Mock HTTP responses for scraper tests
- Use `net/http/httptest` for handler testing
- Mock WebAuthn responses for auth tests
- Table-driven tests for validation logic
- Run tests with the `-race` flag (and `t.Parallel()` where safe) to detect data races
**Coverage gaps to close:**
- All error paths in store operations
- Session expiration edge cases
- Concurrent access scenarios
- HTTP header validation
- CORS/origin validation
---
*Testing analysis: 2026-02-17*