From bc866d53faaa098b59d6a5de24adfff993e8f8e7 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Sun, 19 Apr 2026 15:46:46 +0000
Subject: [PATCH] [servarr/mam-farming] Tune grabber for MAM's real catalogue
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Context

After the Mouse-class unblock on 2026-04-19, end-to-end testing of the
grabber revealed three issues with the plan's original filter values:

1. **`SEEDER_CEILING=50` rejects ~99% of MAM's catalogue.** MAM is a
   well-seeded private tracker; 100-700 seeders per torrent is normal.
   A ceiling of 50 makes the filter too tight: across 140 FL torrents
   sampled in one loop, only 0-1 matched. The intent ("avoid
   oversupplied swarms") is still valid; the threshold was wrong for
   MAM's shape.

2. **`RATIO_FLOOR=1.2` was sized for Mouse-class defence and is now
   over-tight.** Its job is preventing the death spiral where
   Mouse-class accounts can't announce, so any grab deepens the ratio
   hole. Once class > Mouse, MAM serves peer lists normally and
   demand-first filtering (`leechers>=1`) keeps new grabs
   upload-positive on average. With ratio sitting at 0.7 post-recovery
   (we over-downloaded while unblocking), 1.2 was preventing the very
   grabs that would earn us back to a healthy ratio.

3. **`parse_size` crashed on `"1,002.9 MiB"`.** MAM's pretty-printed
   sizes use thousands separators; `float("1,002.9")` raises
   `ValueError`. Every grabber run that hit a ≥1000-MiB candidate on
   the page crashed with a traceback instead of skipping the size.

## This change

- `SEEDER_CEILING`: 50 → 200. Live catalogue evidence showed 50 was
  rejecting viable demand-first candidates like `Zen and the Art of
  Motorcycle Maintenance` (S=156, L=1, score=125).
- `RATIO_FLOOR`: 1.2 → 0.5. Still a tripwire for catastrophic dips,
  but no longer a steady-state block. Class == Mouse remains an
  absolute skip (separate branch).
- `parse_size`: `s.replace(",", "").split()`, so the thousands
  separator is stripped before the `float()` parse.
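
For reference, the `parse_size` failure and fix described above can be reproduced in isolation. This is a minimal sketch that copies the patched function from the diff in this commit; the type hints, docstring, and the sample strings other than `"1,002.9 MiB"` and `"830.3 MiB"` are illustrative additions, not taken from the grabber.

```python
def parse_size(s: str) -> int:
    """Convert a pretty-printed size such as "1,002.9 MiB" to bytes.

    Returns 0 when the string is not of the form "<number> <unit>".
    """
    units = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
    # Strip thousands separators first: float("1,002.9") raises ValueError.
    parts = s.replace(",", "").split()
    if len(parts) != 2:
        return 0
    # Unknown units fall back to a multiplier of 1 (bytes).
    return int(float(parts[0]) * units.get(parts[1], 1))


if __name__ == "__main__":
    # The pre-fix behaviour: the raw string cannot be parsed as a float.
    try:
        float("1,002.9")
    except ValueError as exc:
        print("unpatched parse fails:", exc)

    for sample in ("1,002.9 MiB", "830.3 MiB", "malformed"):
        print(sample, "->", parse_size(sample))
```

Note that a string with two tokens but a non-numeric first token would still raise `ValueError`; the patch only addresses the thousands-separator case.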

## Verified post-change

Manual grabber loop (5 runs at random offsets) after applying:

    run=1    parse_size crash on "1,002.9" (this crash motivated fix #3)
    run=2    GRABBED 3 torrents:
               Dean and Me: A Love Story (240.7 MiB, S:18, L:1) score=194
               Digital Nature Photography (83.7 MiB, S:42, L:1) score=182
               Zen and the Art of Motorcycle (830.3 MiB, S:156, L:1) score=125
    run=3-5  grabbed=0 at offsets that landed on pages with no matches
             (expected: MAM returns 20/page, many offsets yield nothing)

MAM profile: class=User, ratio=0.7 (recovering from the Mouse unblock),
BP=24,053. 28 mam-farming torrents in forcedUP state, actively uploading
~8 MiB to MAM this session across 2 of the Maxximized comic issues.

## What is NOT in this change

- No alert threshold changes. `MAMRatioBelowOne` (24h) and
  `MAMMouseClass` (1h) already handle the "going back to Mouse" case;
  lowering the floor on the grabber doesn't change alerting.
- No janitor changes. The janitor rules are H&R-based and independent
  of ratio/class state.

## Test plan

### Automated

    $ cd infra/stacks/servarr && ../../scripts/tg apply --non-interactive
    Apply complete! Resources: 0 added, 2 changed, 0 destroyed.

    $ python3 -c 'import ast; ast.parse(open(
        "infra/stacks/servarr/mam-farming/files/freeleech-grabber.py").read())'

### Manual Verification

1. Trigger the grabber and confirm it doesn't skip-for-ratio at ratio 0.7:

       $ kubectl -n servarr create job --from=cronjob/mam-freeleech-grabber g1
       $ kubectl -n servarr logs job/g1 | head -5
       Profile: ratio=0.7 class=User | Farming: 33, 2.0 GiB, tracked IDs: 4
       Search offset=, found=1323, page_results=20
       Added (score=...) ...

2. Repeat 3-5× at different random offsets. At the 30-min cron cadence,
   expect 2-5 grabs per day, given MAM's catalogue churn and how few
   torrents pass every filter at once.

## Reproduce locally

    cd infra/stacks/servarr
    ../../scripts/tg plan      # expect: 0 to add, 2 to change (configmap + cronjob)
    ../../scripts/tg apply --non-interactive
    kubectl -n servarr create job --from=cronjob/mam-freeleech-grabber g1
    kubectl -n servarr logs job/g1

Follow-up: `bd close code-qfs` already completed in the parent commit;
this is a post-shipping tune, no beads action needed.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../mam-farming/files/freeleech-grabber.py | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/stacks/servarr/mam-farming/files/freeleech-grabber.py b/stacks/servarr/mam-farming/files/freeleech-grabber.py
index 8bee1b26..3c7064c7 100644
--- a/stacks/servarr/mam-farming/files/freeleech-grabber.py
+++ b/stacks/servarr/mam-farming/files/freeleech-grabber.py
@@ -26,10 +26,18 @@ GRABBED_IDS_FILE = "/data/grabbed_ids.txt"
 MIN_MB = int(os.environ.get("MIN_MB", "50"))
 MAX_MB = int(os.environ.get("MAX_MB", "1024"))
 LEECHER_FLOOR = int(os.environ.get("LEECHER_FLOOR", "1"))
-SEEDER_CEILING = int(os.environ.get("SEEDER_CEILING", "50"))
+# MAM's catalogue is well-seeded by design — a ceiling of 50 rejected ~99%
+# of candidates in live testing. 200 still filters out truly oversupplied
+# swarms while keeping enough working-set to grab 3-5 titles per run.
+SEEDER_CEILING = int(os.environ.get("SEEDER_CEILING", "200"))
 GRAB_PER_RUN = int(os.environ.get("GRAB_PER_RUN", "5"))
 MAX_TORRENTS = int(os.environ.get("MAX_TORRENTS", "500"))
-RATIO_FLOOR = float(os.environ.get("RATIO_FLOOR", "1.2"))
+# The guard's real job is to prevent the Mouse-class death spiral (see RC1
+# in the original recovery plan). Once class > Mouse, MAM serves peer
+# lists normally and demand-first filtering (leechers>=1) keeps new grabs
+# upload-positive. Keep a low floor as a tripwire for catastrophic dips
+# rather than a steady-state block.
+RATIO_FLOOR = float(os.environ.get("RATIO_FLOOR", "0.5"))
 REQUEST_SLEEP = float(os.environ.get("REQUEST_SLEEP", "3"))
 
 CLASS_CODES = {
@@ -46,8 +54,9 @@ CLASS_CODES = {
 
 
 def parse_size(s):
+    # MAM pretty-prints sizes with thousands separators (e.g. "1,002.9 MiB").
     units = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
-    parts = s.split()
+    parts = s.replace(",", "").split()
     if len(parts) != 2:
         return 0
     return int(float(parts[0]) * units.get(parts[1], 1))