[ci skip] Flatten module wrappers into stack roots
Remove the module "xxx" { source = "./module" } indirection layer
from all 66 service stacks. Resources are now defined directly in
each stack's main.tf instead of through a wrapper module.
- Merge module/main.tf contents into stack main.tf
- Apply variable replacements (var.tier -> local.tiers.X, renamed vars)
- Fix shared module paths (one fewer ../ at each level)
- Move extra files/dirs (factory/, chart_values, subdirs) to stack root
- Update state files to strip module.<name>. prefix
- Update CLAUDE.md to reflect flat structure
Verified: terragrunt plan shows 0 add, 0 destroy across all stacks.
This commit is contained in:
parent
b0499a7f31
commit
c7c7047f1c
245 changed files with 11733 additions and 12432 deletions
|
|
@ -0,0 +1,107 @@
|
|||
---
|
||||
phase: 01-scraper-validation
|
||||
plan: 01
|
||||
subsystem: scraper
|
||||
tags: [go, http, video-detection, content-validation, streaming]
|
||||
|
||||
# Dependency graph
|
||||
requires: []
|
||||
provides:
|
||||
- "URL validation pipeline with video marker detection (validateLinks)"
|
||||
- "Configurable validation timeout via SCRAPER_VALIDATE_TIMEOUT env var"
|
||||
- "Video content type and HTML marker detection functions"
|
||||
affects: [02-health-checks, 04-link-extraction]
|
||||
|
||||
# Tech tracking
|
||||
tech-stack:
|
||||
added: []
|
||||
patterns:
|
||||
- "Pipeline filter pattern: scrapeReddit -> validateLinks -> merge"
|
||||
- "String-match video detection (no DOM parsing) for Phase 1 speed"
|
||||
- "2MB body limit for HTML inspection to prevent memory issues"
|
||||
|
||||
key-files:
|
||||
created:
|
||||
- internal/scraper/validate.go
|
||||
- internal/scraper/validate_test.go
|
||||
modified:
|
||||
- internal/scraper/scraper.go
|
||||
- main.go
|
||||
|
||||
key-decisions:
|
||||
- "String matching over DOM parsing for video detection (DOM reserved for Phase 4)"
|
||||
- "2MB body limit to prevent memory issues on large pages"
|
||||
- "3 redirect limit to avoid infinite redirect chains"
|
||||
|
||||
patterns-established:
|
||||
- "Pipeline filter: validate scraped links before merge into store"
|
||||
- "Env var config pattern: envDuration for timeout configuration"
|
||||
|
||||
requirements-completed: [SCRP-01, SCRP-02, SCRP-03, SCRP-04]
|
||||
|
||||
# Metrics
|
||||
duration: 3min
|
||||
completed: 2026-02-17
|
||||
---
|
||||
|
||||
# Phase 1 Plan 1: Scraper Validation Summary
|
||||
|
||||
**URL validation pipeline with 18 video/player markers filtering scraped links before store merge, configurable via SCRAPER_VALIDATE_TIMEOUT**
|
||||
|
||||
## Performance
|
||||
|
||||
- **Duration:** 3 min
|
||||
- **Started:** 2026-02-17T20:49:16Z
|
||||
- **Completed:** 2026-02-17T20:51:54Z
|
||||
- **Tasks:** 3
|
||||
- **Files modified:** 4
|
||||
|
||||
## Accomplishments
|
||||
- Created validate.go with 18 video/player markers covering HTML5, HLS, DASH, and 10+ player libraries
|
||||
- Wired validateLinks into scrape() pipeline between URL extraction and store merge
|
||||
- Added SCRAPER_VALIDATE_TIMEOUT env var (default 10s) following existing config patterns
|
||||
- Added 25 unit tests (10 positive + 4 negative marker tests, 6 positive + 5 negative content type tests)
|
||||
|
||||
## Task Commits
|
||||
|
||||
Each task was committed atomically:
|
||||
|
||||
1. **Task 1: Create validate.go with video marker detection** - `adeb478` (feat)
|
||||
2. **Task 2: Wire validation into scraper pipeline and add config** - `22d29db` (feat)
|
||||
3. **Task 3: Add unit tests for validation functions** - `6c5cc02` (test)
|
||||
|
||||
## Files Created/Modified
|
||||
- `internal/scraper/validate.go` - URL validation with video marker detection (validateLinks, hasVideoContent, containsVideoMarkers, isDirectVideoContentType)
|
||||
- `internal/scraper/validate_test.go` - Table-driven unit tests for marker detection and content type checks (25 cases)
|
||||
- `internal/scraper/scraper.go` - Added validateTimeout field and validateLinks call in scrape()
|
||||
- `main.go` - Added SCRAPER_VALIDATE_TIMEOUT env var read (default 10s)
|
||||
|
||||
## Decisions Made
|
||||
- Used string matching (not DOM parsing) for video detection -- DOM parsing reserved for Phase 4 link extraction
|
||||
- Set 2MB body read limit to prevent memory issues on large streaming pages
|
||||
- Limited redirects to 3 to avoid infinite redirect chains on sketchy stream sites
|
||||
- Validation runs sequentially (not concurrent) to avoid overwhelming target sites
|
||||
|
||||
## Deviations from Plan
|
||||
|
||||
None - plan executed exactly as written.
|
||||
|
||||
## Issues Encountered
|
||||
None.
|
||||
|
||||
## User Setup Required
|
||||
|
||||
None - no external service configuration required.
|
||||
|
||||
## Next Phase Readiness
|
||||
- Validation pipeline is integrated and tested, ready for health check layer (Phase 2)
|
||||
- The validateLinks function provides the filtering foundation that health checks will build upon
|
||||
- No blockers or concerns
|
||||
|
||||
## Self-Check: PASSED
|
||||
|
||||
All 5 files verified present. All 3 task commits verified in git log.
|
||||
|
||||
---
|
||||
*Phase: 01-scraper-validation*
|
||||
*Completed: 2026-02-17*
|
||||
Loading…
Add table
Add a link
Reference in a new issue