From 980ec554183635c787589f5b2cece43b09fa0ff5 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Thu, 11 Jun 2026 14:23:53 +0000
Subject: [PATCH] tripit: enable live flight-fare scrape via shared chrome-service CDP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Sets FARE_PROVIDER=playwright + FARE_CDP_URL on the tripit deployment so
the planning workspace's flight_fare cells auto-fetch live Google Flights
quotes through the existing in-cluster headed browser (tripit issue #18,
ADR-0007 — rate-limited, cached, degrades to manual entry).

Viktor asked to complete the trip-planning tickets; this is the infra leg
of the fare-scrape slice.

Docs: chrome-service architecture + service catalog updated (tripit is now
the second active CDP caller; catalog's legacy :3000 WS pool line corrected
to CDP :9222).

HOLD-ORDER NOTE: pushed only after the tripit image containing
FareMode.playwright rolled out (older images crash-loop on the unknown
enum).

Co-Authored-By: Claude Opus 4.8
---
 .claude/reference/service-catalog.md |  2 +-
 docs/architecture/chrome-service.md  | 11 ++++++++---
 stacks/tripit/main.tf                |  9 +++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/.claude/reference/service-catalog.md b/.claude/reference/service-catalog.md
index 218f7af7..eb442d61 100644
--- a/.claude/reference/service-catalog.md
+++ b/.claude/reference/service-catalog.md
@@ -47,7 +47,7 @@
 | calibre | E-book management (may be merged into ebooks stack) | calibre |
 | onlyoffice | Document editing | onlyoffice |
 | f1-stream | F1 streaming (uses chrome-service for hmembeds verifier); source in own repo `viktor/f1-stream` (Forgejo, extracted 2026-06-05), Woodpecker-native build->deploy (repo id 166) | f1-stream |
-| chrome-service | Headed Chromium WebSocket pool (`ws://chrome-service.chrome-service.svc:3000/`) for sibling services driving anti-bot embeds | chrome-service |
+| chrome-service | Headed Chromium over CDP (`http://chrome-service.chrome-service.svc:9222`, `connect_over_cdp`; legacy `:3000/` WS pool removed 2026-06-04) for sibling services driving anti-bot pages — snapshot-harvester CronJob + tripit fare scrape | chrome-service |
 | rybbit | Analytics | rybbit |
 | isponsorblocktv | SponsorBlock for TV | isponsorblocktv |
 | actualbudget | Budgeting (factory pattern) | actualbudget |
diff --git a/docs/architecture/chrome-service.md b/docs/architecture/chrome-service.md
index b70fe185..7b95d4a0 100644
--- a/docs/architecture/chrome-service.md
+++ b/docs/architecture/chrome-service.md
@@ -10,9 +10,14 @@ serves two distinct populations:
   `chromium.connect_over_cdp("http://chrome-service.chrome-service.svc:9222")`
   to drive a real browser when upstream anti-bot trips a headless one
   (`disable-devtool.js` redirect-to-google trap, `navigator.webdriver`
-  checks, console-clear timing tricks). The only currently-active
-  in-cluster caller is the `chrome-service-snapshot-harvester` CronJob;
-  the `stacks/f1-stream/files/backend/playback_verifier.py` +
+  checks, console-clear timing tricks). Currently-active in-cluster
+  callers: the `chrome-service-snapshot-harvester` CronJob, and
+  **tripit's `PlaywrightFareProvider`** (since 2026-06-11, tripit issue
+  #18 / ADR-0007) — the flight-fare scrape connects per quote, opens a
+  fresh incognito context, scrapes Google Flights, and closes the
+  context; rate-limited to one attempt per 30s with a 6h fare cache, so
+  browser load is negligible. The
+  `stacks/f1-stream/files/backend/playback_verifier.py` +
   `chrome_browser.py` tree is a vestigial design — the deployed
   f1-stream image (built from `github.com/ViktorBarzin/f1-stream`) does
   not use this code path.
diff --git a/stacks/tripit/main.tf b/stacks/tripit/main.tf
index 2c099fc6..49ae3037 100644
--- a/stacks/tripit/main.tf
+++ b/stacks/tripit/main.tf
@@ -79,6 +79,15 @@ locals {
     TTS_MODE = "openai_compatible"
     TTS_BASE_URL = "http://chatterbox-tts.tts.svc.cluster.local:8000"
     TTS_MODEL = "chatterbox"
+    # Live flight-fare scrape (tripit ADR-0007, issue #18): Playwright driving
+    # the SHARED chrome-service browser over CDP (no per-pod browser). The
+    # provider rate-limits (30s min interval), caches 6h, backs off 300s on
+    # failure, and degrades to manual entry — never blocks the grid. NOTE:
+    # FareMode `playwright` only exists in images >= the #18 slice; setting
+    # this against an older image crash-loops on the unknown enum (old pods
+    # keep serving), so the env landed AFTER that image rolled out.
+    FARE_PROVIDER = "playwright"
+    FARE_CDP_URL = "http://chrome-service.chrome-service.svc.cluster.local:9222"
   }
 }
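
Reviewer note: the gating behaviour described in the `main.tf` comment (30s minimum interval, 6h cache, 300s backoff on failure, degrade to manual entry) can be sketched roughly as below. This is only an illustration under assumptions, not tripit's actual `PlaywrightFareProvider`: the `FareGate` class, its field names, and the injected `fetch` and `clock` callables are all hypothetical. Only the three timing values come from the patch.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class FareGate:
    """Hypothetical gate: rate-limit, cache, back off, degrade to None."""

    fetch: Callable[[str], float]              # performs the scrape; may raise
    clock: Callable[[], float] = time.monotonic
    min_interval_s: float = 30.0               # at most one attempt per 30s
    cache_ttl_s: float = 6 * 3600.0            # quotes stay fresh for 6h
    backoff_s: float = 300.0                   # pause after a failed scrape
    _cache: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    _next_allowed: float = float("-inf")

    def quote(self, route: str) -> Optional[float]:
        """Return a fare, or None meaning 'fall back to manual entry'."""
        now = self.clock()
        hit = self._cache.get(route)
        if hit is not None and now - hit[0] < self.cache_ttl_s:
            return hit[1]                      # fresh cache hit, no browser work
        if now < self._next_allowed:
            return None                        # rate-limited or backing off
        try:
            fare = self.fetch(route)
        except Exception:
            self._next_allowed = now + self.backoff_s
            return None                        # never block the caller
        self._next_allowed = now + self.min_interval_s
        self._cache[route] = (now, fare)
        return fare
```

In a real provider the `fetch` callable is presumably where the per-quote `chromium.connect_over_cdp(FARE_CDP_URL)` connection, fresh browser context, scrape, and context close would happen. Returning `None` rather than raising is one way to model "degrades to manual entry, never blocks the grid".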