anubis: only challenge GET requests; allow everything else

PrivateBin's XHR `POST /` (paste creation) was the trigger: Anubis's
catch-all CHALLENGE rule served an HTML challenge page where the JS
expected JSON, breaking paste creation entirely. The same failure will
hit any SPA XHR or CORS preflight on the other 8 Anubis-fronted sites
(homepage actions, kms upload-then-poll, wrongmove search refresh,
jsoncrack share, etc.) the moment one is exercised.

Add an `ALLOW` rule keyed on `method != "GET"` between the AI/UA-block
imports and the catch-all CHALLENGE. Rationale:

  * AI scrapers consume GET response bodies — they don't POST.
  * State-mutating XHRs and OPTIONS preflight need to bypass the
    challenge or the app breaks.
  * CrowdSec + per-route rate-limit + app-level auth already cover
    abuse on mutating methods, so this gives up nothing.
  * Hard-deny rules for known-bad bots run first, so a declared bad
    bot can't sneak through by sending a POST.
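
A minimal sketch of the rule ordering described above, written as a
Terraform local holding the policy YAML. The local name `policy_sketch`
and the two `import` paths are illustrative assumptions, not the exact
entries in this repo; only the last two rules are taken from this
commit's diff.

```hcl
locals {
  # Hypothetical local; the real stacks pass their policy to module "anubis".
  policy_sketch = <<-EOT
    bots:
      # 1. Hard-deny imports run first (paths assumed for illustration),
      #    so a declared bad bot is rejected whatever method it uses.
      - import: (data)/bots/_deny-pathological.yaml
      - import: (data)/meta/ai-block-aggressive.yaml
      # 2. Anything that is not a GET skips the challenge.
      - name: allow-non-get-methods
        action: ALLOW
        expression: method != "GET"
      # 3. Everything left (effectively GET) is challenged.
      - name: catchall-challenge
        path_regex: .*
        action: CHALLENGE
  EOT
}
```

Rules are evaluated top to bottom and the first match wins, which is
why the ALLOW has to sit after the deny imports and before the
catch-all.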

Also added a `checksum/policy` annotation on the Anubis pod template
sourced from `sha256(coalesce(var.policy_yaml, default_policy_yaml))`
so future policy changes auto-roll the deployment instead of needing
a manual `kubectl rollout restart`.
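
Roughly, the checksum wiring could look like the sketch below. It
assumes `default_policy_yaml` is a local and uses a hypothetical
`pod_annotations` local as a stand-in for the pod template's
`annotations` map; the real module layout may differ.

```hcl
variable "policy_yaml" {
  type    = string
  default = null # null means "use the module's default policy"
}

locals {
  # Placeholder content; the real default policy lives in the module.
  default_policy_yaml = "bots: []"

  # coalesce() returns the first non-null, non-empty argument, so an
  # override wins when one is set and the default is hashed otherwise.
  policy_checksum = sha256(coalesce(var.policy_yaml, local.default_policy_yaml))

  # Hypothetical local: merge this into the pod template's annotations.
  pod_annotations = {
    "checksum/policy" = local.policy_checksum
  }
}
```

Because the annotation is part of the pod template, any change to the
policy text changes the hash, which changes the template and triggers
a normal rolling update.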

f1-stream had its own policy override (path carve-outs for SvelteKit
asset hashes and JSON data routes); mirrored the new rule there too.

Applied to all 8 Anubis-fronted stacks: blog, kms, f1-stream,
travel_blog, real-estate-crawler, homepage, cyberchef, jsoncrack.
Verified per stack: GET / returns the Anubis challenge page; POST,
PUT, DELETE, OPTIONS pass through to the backend (HTTP 301/405/502
from the upstream app, never the Anubis "not a bot" HTML).
Viktor Barzin 2026-05-10 14:55:50 +00:00
parent ff3d64159a
commit 3da01e6e1e
2 changed files with 26 additions and 4 deletions

@@ -262,6 +262,11 @@ module "anubis" {
   - name: f1-data-routes
     path_regex: ^/(embed|embed-asset|extract|extractors|health|proxy|relay|schedule|streams)(/|\?|$)
     action: ALLOW
+  # Allow non-GET methods unconditionally. AI scrapers GET the body,
+  # they don't POST. Mutating XHRs and CORS preflight need to bypass.
+  - name: allow-non-get-methods
+    action: ALLOW
+    expression: method != "GET"
   - name: catchall-challenge
     path_regex: .*
     action: CHALLENGE