User feedback: every stream on /watch shows ads but the player fails
to load. Three causes, three fixes:
1. CuratedExtractor's two hmembeds 24/7 channels (Sky F1, DAZN F1)
sat at the top of the list and ALWAYS failed: they load the
upstream's ad overlay then JW Player throws error 102630 (empty
playlist; the obfuscated decoder produces no fileURL in our
environment). Disabled the registration in extractors/__init__.py
until/unless we find a working bypass — leaving the existing
`CURATED_BYPASS = {"curated"}` shim in service.py so the swap is
reversible.
2. Pitsport surfaces every WRC stage / MotoGP session as its own
/watch UUID, but they all resolve to the same upstream m3u8 URL
(e.g. RallyTV one master.m3u8 across all 22 Rally de Portugal
stages). Added URL-keyed dedupe in service.run_extraction so the
/streams response shows one row per actual stream.
3. The pitsport category filter was still narrowed to motorsport.
Pitsport.xyz only lists curated sports broadcasts (WRC, MotoGP,
IndyCar, NASCAR, Premier League Darts, Premier League football…),
so the site's own selection is the right filter. Replaced the
hand-maintained MOTORSPORT_KEYWORDS list with `bool(category or
title)` — anything pitsport returns goes through. Streams that
aren't actually live get filtered out downstream when the embed
API returns an empty manifest.
Frontend: hls.js `lowLatencyMode` was on by default but RallyTV (and
most non-LL-HLS providers) don't ship the LL-HLS extensions, which
broke playback in real browsers. Default to `lowLatencyMode: false`.
Result: /streams is now 1 verified live entry (Rally TV WRC stage
currently airing); was 24 with the top 2 always broken + 22 dupes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous extractor only surfaced Formula 1/2/3 and never returned
anything outside race weekends. Two fixes:
1. Broadened category filter from {formula 1/2/3} to a motorsport set
(MotoGP/Moto2/Moto3, WRC/WEC/IndyCar/NASCAR + the F1 series).
Replaces the NON_F1_KEYWORDS exclusion list with a positive-match
MOTORSPORT_KEYWORDS set; removes the F1-specific filter on title
keywords. Old `_is_f1_*` aliases retained as compat shims.
2. Updated `_parse_stream_config` for the current pushembdz.store embed
payload — Next.js now serves `safeStream` (just title + method) and
the actual stream URL is fetched at runtime from
`pushembdz.store/api/stream/<slug>`. Extractor now hits that endpoint
when the inline link is missing. Treats `method=jwp` as HLS and
accepts URLs ending in `.css` (pushembdz disguises some HLS playlists
with a `.css` extension).
End-to-end result: /streams went from 2 (curated, broken JW decoder) to
24 streams marked `is_live=True`. The verifier confirms each via
`manifest_parsed_codec_missing_in_verifier` (Playwright Chromium has no
H.264 — manifest fetch alone is the codec-independent positive signal).
Currently surfaces Rally de Portugal SS1–SS22 (WRC); MotoGP starts
appearing once the French GP weekend goes live tomorrow.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Cuts the stream list from 23 mostly-broken entries to ~6 confirmed-playable
ones, and adds an iframe-stripping proxy so embed sources (hmembeds, etc.)
load through our origin without X-Frame-Options / CSP / JS frame-buster
blocks.
Why: the previous list was dominated by Discord-shared news article URLs,
hardcoded aggregator landing pages, and other non-stream URLs that all sat
at is_live=true because embed streams skipped the health check entirely.
Users could not tell which links would actually play.
What:
- backend/playback_verifier.py: new headless-Chromium verifier (Playwright)
that polls each candidate stream for a codec-independent "playable" signal
(hls.js MANIFEST_PARSED for m3u8; <video>/player div for embed). Replaces
the unconditional is_live=True for embed streams in service.py.
- backend/embed_proxy.py: new /embed and /embed-asset routes that fetch
upstream embed pages, strip X-Frame-Options/CSP/Set-Cookie, and inject a
<base href> + frame-buster-defeat <script> that locks down window.top,
document.referrer, console.clear/table, and window.location so the
hmembeds disable-devtool.js redirect-to-google trap can't fire.
- extractors/curated.py: new always-on extractor with two known-good 24/7
hmembeds embeds (Sky Sports F1, DAZN F1) so the list isn't empty between
race weekends.
- extractors/__init__.py: register CuratedExtractor first; drop
FallbackExtractor (its 10 aggregator landing-pages can't iframe-play).
- extractors/discord_source.py: positive-match path filter (must look like
/embed/, /stream, /watch, /live, /player, *.m3u8, *.php) plus expanded
domain blocklist for news sites — was 10 noise URLs, now ~1.
- extractors/service.py: run_extraction now health-checks AND verifier-
checks both stream types; only verified-playable streams reach is_live.
- main.py: register /embed + /embed-asset routes; defer initial extraction
by 8s so the verifier can reach the local /embed proxy on 127.0.0.1:8000.
- frontend/lib/api.js + watch/+page.svelte: route embed iframes through
/embed proxy instead of the upstream URL, so X-Frame-Options/CSP can't
block them.
- Dockerfile: install Playwright chromium + system codec-runtime libs.
- main.tf: bump pod memory 256Mi → 1Gi for chromium.
Verified end-to-end with Playwright against
https://f1.viktorbarzin.me/watch — 6/6 streams reach a player UI; the 3
demo m3u8s actually play (codec-bearing browser); the 3 embeds (Sky
Sports F1, DAZN F1, sportsurge) render iframes through the proxy.
Image: viktorbarzin/f1-stream:v6.0.5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>