[servarr/mam-farming] Tune grabber for MAM's real catalogue
## Context
After the Mouse-class unblock on 2026-04-19, end-to-end testing of the
grabber revealed three issues with the plan's original filter values:
1. **`SEEDER_CEILING=50` rejects ~99% of MAM's catalogue.** MAM is a
well-seeded private tracker — 100-700 seeders per torrent is normal.
A ceiling of 50 makes the filter too tight: across 140 FL torrents
sampled in one loop, only 0-1 matched. The intent ("avoid oversupplied
swarms") is still valid; the threshold was wrong for MAM's shape.
2. **`RATIO_FLOOR=1.2` was sized for Mouse-class defence and is now
over-tight.** Its job is preventing the death spiral where Mouse-class
accounts can't announce, so any grab deepens the ratio hole. Once
class > Mouse, MAM serves peer lists normally and demand-first
filtering (`leechers>=1`) keeps new grabs upload-positive on average.
With the ratio sitting at 0.7 post-recovery (we over-downloaded while
unblocking), a floor of 1.2 blocked the very grabs that would earn our
way back to a healthy ratio.
3. **`parse_size` crashed on `"1,002.9 MiB"`.** MAM's pretty-printed
sizes use thousands separators; `float("1,002.9")` raises
`ValueError`. Every grabber run that hit a ≥1000-MiB candidate on
the page crashed with a traceback instead of skipping the size.
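
To make issues 1 and 2 concrete, here is a minimal sketch of the two gates as described above. It is not the grabber's actual code: `Candidate`, `passes_swarm_filter` and `should_skip_run` are hypothetical names, and the inclusive `<=` comparison against the ceiling is an assumption. The seeder/leecher figures are the ones from the verification run in this message.

```python
from dataclasses import dataclass

# Old and new thresholds from this commit, kept side by side for comparison.
OLD_SEEDER_CEILING, NEW_SEEDER_CEILING = 50, 200
OLD_RATIO_FLOOR, NEW_RATIO_FLOOR = 1.2, 0.5
LEECHER_FLOOR = 1


@dataclass
class Candidate:
    """Hypothetical record; the real grabber's data model may differ."""
    title: str
    seeders: int
    leechers: int


def passes_swarm_filter(c: Candidate, seeder_ceiling: int) -> bool:
    # Demand-first: require at least one leecher and reject oversupplied swarms.
    # Assumption: the ceiling is inclusive.
    return c.leechers >= LEECHER_FLOOR and c.seeders <= seeder_ceiling


def should_skip_run(user_class: str, ratio: float, ratio_floor: float) -> bool:
    # Mouse class is an absolute skip; otherwise the floor is only a tripwire.
    return user_class == "Mouse" or ratio < ratio_floor


candidates = [
    Candidate("Dean and Me: A Love Story", 18, 1),
    Candidate("Digital Nature Photography", 42, 1),
    Candidate("Zen and the Art of Motorcycle", 156, 1),
]

old_pass = [c.title for c in candidates if passes_swarm_filter(c, OLD_SEEDER_CEILING)]
new_pass = [c.title for c in candidates if passes_swarm_filter(c, NEW_SEEDER_CEILING)]

print("ceiling=50 :", old_pass)
print("ceiling=200:", new_pass)
print("skip at ratio 0.7, floor 1.2:", should_skip_run("User", 0.7, OLD_RATIO_FLOOR))
print("skip at ratio 0.7, floor 0.5:", should_skip_run("User", 0.7, NEW_RATIO_FLOOR))
```

Under these assumptions the S=156 title only passes with the raised ceiling, and a User-class account at ratio 0.7 is only allowed to grab with the lowered floor.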
## This change
- `SEEDER_CEILING`: 50 → 200 — live catalogue evidence showed 50 was
rejecting viable demand-first candidates like `Zen and the Art of
Motorcycle Maintenance` (S=156, L=1, score=125).
- `RATIO_FLOOR`: 1.2 → 0.5 — still a tripwire for catastrophic dips,
but no longer a steady-state block. Class == Mouse remains an
absolute skip (separate branch).
- `parse_size`: strip thousands separators with `s.replace(",", "").split()`
before the float parse.
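
The `parse_size` fix can be checked in isolation. The sketch below copies both versions of the function from the diff in this commit; the name `parse_size_old` and the module-level `UNITS` constant are introduced here only to compare them side by side.

```python
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size_old(s):
    # Pre-fix behaviour: float("1,002.9") raises ValueError.
    parts = s.split()
    if len(parts) != 2:
        return 0
    return int(float(parts[0]) * UNITS.get(parts[1], 1))


def parse_size(s):
    # Post-fix behaviour: drop thousands separators before splitting.
    parts = s.replace(",", "").split()
    if len(parts) != 2:
        return 0
    return int(float(parts[0]) * UNITS.get(parts[1], 1))


try:
    parse_size_old("1,002.9 MiB")
    old_error = None
except ValueError as exc:
    old_error = exc

print("old:", repr(old_error))
print("new:", parse_size("1,002.9 MiB"), "bytes")
```

Sizes without a separator, such as `"240.7 MiB"`, give the same result in both versions, so the change only affects inputs that previously crashed.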
## Verified post-change
Manual grabber loop (5 runs at random offsets) after applying:
    run=1    parse_size crash on "1,002.9" (this crash motivated fix #3)
    run=2    GRABBED 3 torrents:
               Dean and Me: A Love Story (240.7 MiB, S:18, L:1) score=194
               Digital Nature Photography (83.7 MiB, S:42, L:1) score=182
               Zen and the Art of Motorcycle (830.3 MiB, S:156, L:1) score=125
    run=3-5  grabbed=0 at offsets that landed on pages with no matches
             (expected: MAM returns 20/page, many offsets yield nothing)
MAM profile: class=User, ratio=0.7 (recovering from the Mouse unblock),
BP=24,053. 28 mam-farming torrents in forcedUP state, actively uploading
~8 MiB to MAM this session across 2 of the Maxximized comic issues.
## What is NOT in this change
- No alert threshold changes — `MAMRatioBelowOne` (24h) and `MAMMouseClass`
(1h) already handle the "going back to Mouse" case; lowering the floor
on the grabber doesn't change alerting.
- No janitor changes — the janitor rules are H&R-based and independent
of ratio/class state.
## Test plan
### Automated
    $ cd infra/stacks/servarr && ../../scripts/tg apply --non-interactive
    Apply complete! Resources: 0 added, 2 changed, 0 destroyed.
    $ python3 -c 'import ast; ast.parse(open(
        "infra/stacks/servarr/mam-farming/files/freeleech-grabber.py").read())'
### Manual Verification
1. Trigger the grabber and confirm it doesn't skip-for-ratio at ratio 0.7:
       $ kubectl -n servarr create job --from=cronjob/mam-freeleech-grabber g1
       $ kubectl -n servarr logs job/g1 | head -5
       Profile: ratio=0.7 class=User | Farming: 33, 2.0 GiB, tracked IDs: 4
       Search offset=<random>, found=1323, page_results=20
       Added (score=...) ...
2. Repeat 3-5× at different random offsets. At the 30-min cron cadence,
expect 2-5 grabs per day, given MAM's catalogue churn and the narrow
intersection of our filters.
## Reproduce locally
    cd infra/stacks/servarr
    ../../scripts/tg plan   # expect: 0 to add, 2 to change (configmap + cronjob)
    ../../scripts/tg apply --non-interactive
    kubectl -n servarr create job --from=cronjob/mam-freeleech-grabber g1
    kubectl -n servarr logs job/g1
Follow-up: `bd close code-qfs` already completed in the parent commit;
this is a post-shipping tune, no beads action needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parent: 0f6321ce86
Commit: bc866d53fa
1 changed file with 12 additions and 3 deletions

@@ -26,10 +26,18 @@ GRABBED_IDS_FILE = "/data/grabbed_ids.txt"
 MIN_MB = int(os.environ.get("MIN_MB", "50"))
 MAX_MB = int(os.environ.get("MAX_MB", "1024"))
 LEECHER_FLOOR = int(os.environ.get("LEECHER_FLOOR", "1"))
-SEEDER_CEILING = int(os.environ.get("SEEDER_CEILING", "50"))
+# MAM's catalogue is well-seeded by design — a ceiling of 50 rejected ~99%
+# of candidates in live testing. 200 still filters out truly oversupplied
+# swarms while keeping enough working-set to grab 3-5 titles per run.
+SEEDER_CEILING = int(os.environ.get("SEEDER_CEILING", "200"))
 GRAB_PER_RUN = int(os.environ.get("GRAB_PER_RUN", "5"))
 MAX_TORRENTS = int(os.environ.get("MAX_TORRENTS", "500"))
-RATIO_FLOOR = float(os.environ.get("RATIO_FLOOR", "1.2"))
+# The guard's real job is to prevent the Mouse-class death spiral (see RC1
+# in the original recovery plan). Once class > Mouse, MAM serves peer
+# lists normally and demand-first filtering (leechers>=1) keeps new grabs
+# upload-positive. Keep a low floor as a tripwire for catastrophic dips
+# rather than a steady-state block.
+RATIO_FLOOR = float(os.environ.get("RATIO_FLOOR", "0.5"))
 REQUEST_SLEEP = float(os.environ.get("REQUEST_SLEEP", "3"))
 
 CLASS_CODES = {
@@ -46,8 +54,9 @@ CLASS_CODES = {
 
 
 def parse_size(s):
+    # MAM pretty-prints sizes with thousands separators (e.g. "1,002.9 MiB").
     units = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
-    parts = s.split()
+    parts = s.replace(",", "").split()
     if len(parts) != 2:
         return 0
     return int(float(parts[0]) * units.get(parts[1], 1))