From a91bbe189e21ee8c59957c98fbf2f20f5804e34f Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Thu, 7 May 2026 18:37:11 +0000
Subject: [PATCH] f1-stream: subreddit extractor finds Reddit '[Watch / Download]' threads
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two fixes for the previously-dormant subreddit extractor + a
chrome-browser TARGETS pivot to MotoGP weekend live URLs.

1. **Reddit fetch was 403'd by `Accept: application/json`**. Cluster IP
   + that header trips Reddit's anti-bot fingerprint and returns HTML
   403. Removing the explicit Accept (default `*/*`) restores HTTP 200
   with JSON. Confirmed via direct httpx test from the f1-stream pod.

2. **Search the right things**. The community uses a stable
   `[Watch / Download] - | ` post pattern with selftext links to
   admin-curated WordPress sites (motomundo.net for MotoGP, sister
   sites for F1 when active). New extractor:
   - Hits both /new.json and /search.json across r/MotorsportsReplays
     and three smaller motorsport subs.
   - Filters posts where title contains `[watch`, `watch online`, or
     flair = `live`.
   - Extracts URLs from selftext (regex), filters to a positive
     `_INTERESTING_HOSTS` allowlist (motomundo, freemotorsports,
     pitsport, rerace, dd12, etc.) so we don't drown the verifier in
     YouTube/Discord/gofile links.
   - Returns each as embed-type so the chrome-service verifier visits.

3. **chrome_browser.TARGETS pivoted** to the live MotoMundo MotoGP
   French GP iframes (motomundo.top/e/ + motomundo.upns.xyz/#) while
   the weekend is on. The previous DD12 NASCAR + Acestrlms F1 targets
   were both broken JW Player paths anyway.

State after deploy:
- /streams: 3 verified live (WRC Rally Portugal, NASCAR 24/7, Premier
  League Darts); Darts is currently active because UK is mid-match.
- Subreddit extractor surfaces the live MotoMundo URL but the verifier
  marks the WordPress wrapper page playable=False (no top-level