[ci skip] Flatten module wrappers into stack roots

Remove the module "xxx" { source = "./module" } indirection layer
from all 66 service stacks. Resources are now defined directly in
each stack's main.tf instead of through a wrapper module.

- Merge module/main.tf contents into stack main.tf
- Apply variable replacements (var.tier -> local.tiers.X, renamed vars)
- Fix shared module paths (one fewer ../ at each level)
- Move extra files/dirs (factory/, chart_values, subdirs) to stack root
- Update state files to strip module.<name>. prefix
- Update CLAUDE.md to reflect flat structure

Verified: terragrunt plan shows 0 add, 0 destroy across all stacks.
This commit is contained in:
Viktor Barzin 2026-02-22 15:13:55 +00:00
parent b0499a7f31
commit c7c7047f1c
245 changed files with 11733 additions and 12432 deletions

View file

@ -0,0 +1,107 @@
---
phase: 01-scraper-validation
plan: 01
subsystem: scraper
tags: [go, http, video-detection, content-validation, streaming]
# Dependency graph
requires: []
provides:
- "URL validation pipeline with video marker detection (validateLinks)"
- "Configurable validation timeout via SCRAPER_VALIDATE_TIMEOUT env var"
- "Video content type and HTML marker detection functions"
affects: [02-health-checks, 04-link-extraction]
# Tech tracking
tech-stack:
added: []
patterns:
- "Pipeline filter pattern: scrapeReddit -> validateLinks -> merge"
- "String-match video detection (no DOM parsing) for Phase 1 speed"
- "2MB body limit for HTML inspection to prevent memory issues"
key-files:
created:
- internal/scraper/validate.go
- internal/scraper/validate_test.go
modified:
- internal/scraper/scraper.go
- main.go
key-decisions:
- "String matching over DOM parsing for video detection (DOM reserved for Phase 4)"
- "2MB body limit to prevent memory issues on large pages"
- "3 redirect limit to avoid infinite redirect chains"
patterns-established:
- "Pipeline filter: validate scraped links before merge into store"
- "Env var config pattern: envDuration for timeout configuration"
requirements-completed: [SCRP-01, SCRP-02, SCRP-03, SCRP-04]
# Metrics
duration: 3min
completed: 2026-02-17
---
# Phase 1 Plan 1: Scraper Validation Summary
**URL validation pipeline with 18 video/player markers filtering scraped links before store merge, configurable via SCRAPER_VALIDATE_TIMEOUT**
## Performance
- **Duration:** 3 min
- **Started:** 2026-02-17T20:49:16Z
- **Completed:** 2026-02-17T20:51:54Z
- **Tasks:** 3
- **Files modified:** 4
## Accomplishments
- Created validate.go with 18 video/player markers covering HTML5, HLS, DASH, and 10+ player libraries
- Wired validateLinks into scrape() pipeline between URL extraction and store merge
- Added SCRAPER_VALIDATE_TIMEOUT env var (default 10s) following existing config patterns
- Added 25 unit tests (10 positive + 4 negative marker tests, 6 positive + 5 negative content type tests)
## Task Commits
Each task was committed atomically:
1. **Task 1: Create validate.go with video marker detection** - `adeb478` (feat)
2. **Task 2: Wire validation into scraper pipeline and add config** - `22d29db` (feat)
3. **Task 3: Add unit tests for validation functions** - `6c5cc02` (test)
## Files Created/Modified
- `internal/scraper/validate.go` - URL validation with video marker detection (validateLinks, hasVideoContent, containsVideoMarkers, isDirectVideoContentType)
- `internal/scraper/validate_test.go` - Table-driven unit tests for marker detection and content type checks (25 cases)
- `internal/scraper/scraper.go` - Added validateTimeout field and validateLinks call in scrape()
- `main.go` - Added SCRAPER_VALIDATE_TIMEOUT env var read (default 10s)
## Decisions Made
- Used string matching (not DOM parsing) for video detection -- DOM parsing reserved for Phase 4 link extraction
- Set 2MB body read limit to prevent memory issues on large streaming pages
- Limited redirects to 3 to avoid infinite redirect chains on sketchy stream sites
- Validation runs sequentially (not concurrent) to avoid overwhelming target sites
## Deviations from Plan
None - plan executed exactly as written.
## Issues Encountered
None.
## User Setup Required
None - no external service configuration required.
## Next Phase Readiness
- Validation pipeline is integrated and tested, ready for health check layer (Phase 2)
- The validateLinks function provides the filtering foundation that health checks will build upon
- No blockers or concerns
## Self-Check: PASSED
All 5 files verified present. All 3 task commits verified in git log.
---
*Phase: 01-scraper-validation*
*Completed: 2026-02-17*