f1-stream: subreddit extractor scans r/motorsportsstreams2 (active sub)

User asked specifically for r/motorsportstreams. Reddit banned that sub
years ago; the active 12.5k-subscriber successor is r/motorsportsstreams2.
Added it to SUBREDDITS plus r/f1streams (709 subs, public).
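
For orientation, a minimal sketch of how the new SUBREDDITS entries could turn into "newest posts" listing requests. The helper name `listing_url` and the limit of 50 are assumptions for illustration; the extractor's real request code is not shown in this commit.

```python
# Trimmed copy of the tuple this commit extends.
SUBREDDITS = ("motorsportsstreams2", "MotorsportsReplays", "f1streams")


def listing_url(subreddit, limit=50):
    """Hypothetical helper: URL of a subreddit's newest posts as JSON."""
    return f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"


# One listing URL per subreddit, in scan order.
LISTING_URLS = [listing_url(name) for name in SUBREDDITS]
```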

Also extended:
- SEARCH_QUERIES with three Sky Sports F1 / live-stream phrases that
  catch the `[F1 STREAM]` post pattern the community uses on race
  weekends (titles like "[F1 STREAM] Bahrain GP - Live Race | No Buffer
  | Mobile Friendly" linking to boxboxbox.pro/stream-1).
- _INTERESTING_HOSTS allowlist with boxboxbox.{pro,live,lol},
  pitsport.live, ppv.to, streamed.pk, acestrlms/aceztrims, and the
  Super Formula direct CDNs (racelive.jp, cdn.sfgo.jp) — all observed
  in the last 50 posts on r/motorsportsstreams2.
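
A rough sketch of how the two extended settings above might be applied: a query becomes a subreddit-restricted search URL, and a candidate link is kept only if its host matches the allowlist. The helper names `search_url` and `is_interesting` are hypothetical, and substring matching on the hostname is an assumption, inferred from allowlist entries such as "skystreams" that are not full hostnames.

```python
from urllib.parse import urlencode, urlsplit

# Trimmed copy of the allowlist; the full tuple is in the diff.
_INTERESTING_HOSTS = ("boxboxbox.pro", "pitsport.live", "ppv.to", "skystreams")


def search_url(subreddit, query):
    """Hypothetical helper: subreddit-restricted search, newest first."""
    params = urlencode({"q": query, "restrict_sr": 1, "sort": "new"})
    return f"https://www.reddit.com/r/{subreddit}/search.json?{params}"


def is_interesting(url):
    """True if the URL's hostname contains any allowlist entry.

    Assumption: entries are matched as substrings of the hostname only,
    so a host name that merely appears in the path does not count.
    """
    host = (urlsplit(url).hostname or "").lower()
    return any(entry in host for entry in _INTERESTING_HOSTS)
```

Substring matching is deliberately loose: it lets a fresh mirror such as a new "skystreams" domain through without a code change, at the cost of occasional false positives on unrelated hosts that happen to contain an entry.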

Where this leaves us, honestly:
- The r/motorsportsstreams2 megathread "Where to watch every F1 race"
  recommends EXACTLY the four sites we already pull from: pitsport.xyz,
  streamed.pk, ppv.to, acestrlms. The community has the same broken JW
  Player chain we have for Sky Sports F1 24/7 streams. There is no
  free-and-working alternative they know about.
- boxboxbox.pro (the most-promoted F1 stream domain in race-weekend
  posts) is currently NXDOMAIN; .live is parked, .lol unreachable. The
  domain rotates after takedowns; Reddit posts will surface fresh ones
  when posters share them.
- For F1 specifically: the extractor currently surfaces 2 motomundo.net
  candidates (MotoGP wrappers); that should rise to roughly 6 or more
  during F1 race weekends, as posters share fresh boxboxbox/equivalent
  URLs.
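
Because the promoted domains rotate, a quick resolvability check can separate dead hosts from live ones before fetching. The following is only a sketch: `resolves` and `live_hosts` are hypothetical helpers, not part of the extractor, and the resolver is injectable so the logic can be exercised without network access.

```python
import socket


def resolves(host, resolver=socket.getaddrinfo):
    """Return False when name resolution fails (for example NXDOMAIN)."""
    try:
        resolver(host, 443)
    except socket.gaierror:
        return False
    return True


def live_hosts(hosts, resolver=socket.getaddrinfo):
    """Keep only the hosts that currently resolve, preserving order."""
    return [host for host in hosts if resolves(host, resolver)]
```

A parked domain still resolves, so this only filters out the NXDOMAIN case; telling a parked page from a working stream page would need an HTTP fetch on top.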

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Viktor Barzin 2026-05-07 22:42:51 +00:00
parent 4326f09baa
commit c5a0424571

@@ -38,37 +38,62 @@ USER_AGENT = (
     "Version/17.4 Safari/605.1.15"
 )
-# Subreddits to scan. r/MotorsportsReplays is the main signal; the others
-# rarely have stream posts but cost nothing to skim.
+# Subreddits to scan.
+# - r/motorsportsstreams2 is the active 12.5k-sub successor to the banned
+#   r/motorsportstreams; race-weekend "[F1 STREAM]" posts include
+#   `boxboxbox.pro/stream-1` URLs and similar fresh aggregator links.
+# - r/MotorsportsReplays runs the [Watch / Download] mod-post pattern
+#   linking to motomundo.net (MotoGP) and sister sites.
+# - The rest are low-yield but cost nothing.
 SUBREDDITS: tuple[str, ...] = (
+    "motorsportsstreams2",
     "MotorsportsReplays",
+    "f1streams",
     "motorsports",
     "formula1",
     "motogp",
 )
-# Search queries to fire against r/MotorsportsReplays (the ones below
-# capture the consistent mod-post pattern). Encoded into the JSON
-# search endpoint.
+# Search queries fired against r/motorsportsstreams2 + r/MotorsportsReplays.
+# The first set captures the [Watch / Download] mod posts; the second set
+# catches race-weekend live discussion threads.
 SEARCH_QUERIES: tuple[str, ...] = (
     "Watch Download F1 2026",
     "Watch Download MotoGP 2026",
     "Watch Online F1 2026",
-    "Watch Online MotoGP 2026",
+    "F1 STREAM live",
+    "Sky Sports F1 live",
+    "Sky F1 stream",
 )
 # Hosts we accept as "interesting" stream-page URLs. These are the
 # admin-curated WordPress / aggregator sites the community links to.
-# motomundo.net hosts MotoGP; new entries can be added freely.
+# Anchored to what r/motorsportsstreams2 currently posts (May 2026 sweep).
 _INTERESTING_HOSTS = (
-    "motomundo.net",  # MotoGP
+    # WordPress wrappers / community-run sites
+    "motomundo.net",  # MotoGP — admin-curated WP
     "motomundo.top",  # MotoMundo embed host
     "motomundo.upns.xyz",  # MotoMundo embed host (newer)
-    "freemotorsports.com",  # community curated link list
-    "pitsport.xyz",  # in case a Reddit poster links it
-    "rerace.io",  # F1 archives + live (when up)
-    "dd12streams.com",  # live aggregator
-    "f1mundo.net",  # speculative F1 sister to motomundo
+    "freemotorsports.com",  # WAC successor curated link list
+    "boxboxbox.pro",  # F1 race-weekend aggregator (community fav)
+    "boxboxbox.live",  # boxboxbox sister
+    "boxboxbox.lol",
+    # Aggregators we already have direct extractors for, but Reddit may
+    # surface event-specific deeplinks (e.g. /watch/<UUID>) we'd miss
+    # otherwise.
+    "pitsport.xyz",
+    "pitsport.live",
+    "rerace.io",
+    "dd12streams.com",
+    "ppv.to",
+    "streamed.pk",
+    "acestrlms.pages.dev",
+    "aceztrims.pages.dev",
+    # Sport-specific direct CDNs that occasionally appear in posts
+    "racelive.jp",  # Super Formula
+    "cdn.sfgo.jp",  # Super Formula CDN
+    # Speculative F1 sister sites — pattern likely if motomundo for MotoGP
+    "f1mundo.net",
     "f1.live",
     "f1live",
     "skystreams",