The chrome-service stack ran `playwright launch-server`, which creates
ephemeral browser contexts per `connect()`. Despite the encrypted PVC
mounted at /profile, no chromium user-data ever persisted — only npm
cache + fontconfig. Logging in via noVNC was effectively a no-op.
Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal),
fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host
header to bypass Chrome's hardcoded DNS-rebinding protection (no
`--remote-allow-hosts` flag exists in stock Chrome 130; verified by
binary string grep). Bridge also forces Connection: close on HTTP
responses so Node ws opens a fresh TCP for the WS upgrade rather than
trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage
actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves
GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot,
bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via
CDP, dumps storage_state() (cookies + localStorage), writes atomically
to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.
Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`,
env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no
longer used by code; ExternalSecret kept for symmetry with the snapshot
endpoint).
Dev-box side (out of scope for this commit — see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...`
so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the
snapshot via the bearer-gated HTTPS endpoint.
Docs updated:
- docs/architecture/chrome-service.md — new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md — day-2 ops (refresh, rotation,
failure modes, restore).
- stacks/chrome-service/README.md — connect_over_cdp recipe.
Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.
Two fixes for the previously-dormant subreddit extractor + a chrome-browser TARGETS pivot to MotoGP weekend live URLs.
1. **Reddit fetch was 403'd by `Accept: application/json`**. Cluster IP +
that header trips Reddit's anti-bot fingerprint and returns HTML 403.
Removing the explicit Accept (default `*/*`) restores HTTP 200 with
JSON. Confirmed via direct httpx test from the f1-stream pod.
2. **Search the right things**. The community uses a stable
`[Watch / Download] <Series> <Year> - <Round> | <Event>` post pattern
with selftext links to admin-curated WordPress sites (motomundo.net
for MotoGP, sister sites for F1 when active). New extractor:
- Hits both /new.json and /search.json across r/MotorsportsReplays
and three smaller motorsport subs.
- Filters posts where title contains `[watch`, `watch online`, or
flair = `live`.
- Extracts URLs from selftext (regex), filters to a positive
`_INTERESTING_HOSTS` allowlist (motomundo, freemotorsports,
pitsport, rerace, dd12, etc.) so we don't drown the verifier in
YouTube/Discord/gofile links.
- Returns each as embed-type so the chrome-service verifier visits.
3. **chrome_browser.TARGETS pivoted** to the live MotoMundo MotoGP
French GP iframes (motomundo.top/e/<id> + motomundo.upns.xyz/#<id>)
while the weekend is on. The previous DD12 NASCAR + Acestrlms F1
targets were both broken JW Player paths anyway.
State after deploy:
- /streams: 3 verified live (WRC Rally Portugal, NASCAR 24/7, Premier League Darts) — Darts is currently active because UK is mid-match.
- Subreddit extractor surfaces the live MotoMundo URL but the verifier
marks the WordPress wrapper page playable=False (no top-level <video>
element; the m3u8 lives in nested iframes). Next iteration: drill the
verifier into iframe contentDocument and capture from there.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User asked to broaden the source pipeline so f1-stream can find F1 (and
adjacent motorsport) streams from Sky Sports / DAZN / Reddit / etc.,
using the in-cluster chrome-service headed browser where needed. Four
changes:
1. **streamed.py**: BASE_URL streamed.su → streamed.pk. The .su domain
stopped serving the API host in 2026 (only the marketing page is
left); .pk hosts the JSON API now. Adds 3 events/round (currently
all routed through embedsports.top — see #2 caveat).
2. **chrome_browser.py** (new): generic chrome-service-driven extractor.
Connects to the existing chrome-service WS (CHROME_WS_URL +
CHROME_WS_TOKEN env), navigates a list of TARGETS, captures any HLS
playlist URL the page fetches at runtime, returns one ExtractedStream
per discovery. Uses the same stealth init script as the verifier so
anti-bot checks don't trip the page. Handles iframes (DD12-style
/nas → /new-nas/jwplayer) and probes child-frame <video>/source
elements after settle. Caveat: most aggregator sites (pooembed,
embedsports, hmembeds, even DD12's JW Player path) use a broken
runtime decoder that produces no m3u8 in our environment, so the
TARGETS list is currently 0-yielding; the framework is the
contribution and concrete sites can be added as they're discovered.
3. **subreddit.py** (new): scans r/MotorsportsReplays, r/motorsports,
r/formula1, r/motogp via the public old.reddit.com JSON API for
posts whose flair/title indicates a live stream. Discovered URLs
are returned as embed-type streams; the verifier visits each via
chrome-service to confirm playability. Note: Reddit currently HTTP
403's our cluster outbound IP for anonymous JSON requests; the
extractor returns 0 in that state and logs a debug message. Will
work from any IP Reddit isn't blocking.
4. **dd12.py** (new): inline-HTML scraper for DD12Streams. The site
embeds `playerInstance.setup({file: "..."})` directly in HTML — no
JS decoder needed. Currently surfaces NASCAR Cup Series 24/7 (clean
BunnyCDN-hosted HLS at w9329432hnf3h34.b-cdn.net/pdfs/master.m3u8);
add new `(path, label, title)` tuples to CHANNELS as DD12 expands.
Result: /streams now shows 2 verified live streams (Rally TV via
pitsport + DD12 NASCAR Cup 24/7). When the next F1 weekend (Canadian
GP, May 22-24) goes live, pitsport will surface F1 sessions
automatically via the existing pushembdz path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>