viktor/infra

Commit graph

Author	SHA1	Message	Date

Viktor Barzin

12fbc404ec

anubis: strict bot policy — catch-all CHALLENGE for unmatched UAs

The default upstream policy only WEIGHs Mozilla|Opera UAs and lets
everything else (curl, wget, python-requests, scrapy, headless CLI
scrapers) fall through to the implicit ALLOW. On non-CDN-fronted
hosts (kms, anything dns_type=non-proxied) this meant a plain
`curl https://kms.viktorbarzin.me/` returned the real backend
content with no challenge — defeating the whole point of the
"avoid casual scrapers" intent.

Now the module ships a custom POLICY_FNAME mounted via ConfigMap:
- Imports the upstream deny-pathological / ai-block-aggressive /
  allow-good-crawlers / keep-internet-working snippets unchanged
- Adds a final `path_regex: .*` → action: CHALLENGE catch-all

Result: only IP-verified search engines (Googlebot from Google IPs,
Bingbot, etc.) and well-known paths (robots.txt, .well-known,
favicon, sitemap) skip the challenge. Everything else — including
spoofed-Googlebot-UA-from-random-IP — solves PoW or gets nothing.
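
To illustrate why the catch-all matters, here is a minimal Python model of first-match rule evaluation. It is a sketch, not Anubis's implementation: the `Rule` class, the `evaluate` function, the rule names, and the regexes are all hypothetical, the upstream WEIGH step is simplified to a direct CHALLENGE, and IP-verified crawler rules are left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for one policy entry."""
    name: str
    action: str  # "ALLOW", "DENY", or "CHALLENGE"
    user_agent_regex: Optional[str] = None
    path_regex: Optional[str] = None

    def matches(self, user_agent: str, path: str) -> bool:
        # A rule matches only if every regex it defines matches.
        if self.user_agent_regex is not None and not re.search(self.user_agent_regex, user_agent):
            return False
        if self.path_regex is not None and not re.search(self.path_regex, path):
            return False
        return True


def evaluate(rules: List[Rule], user_agent: str, path: str, default: str = "ALLOW") -> str:
    """Return the action of the first matching rule, else the implicit default."""
    for rule in rules:
        if rule.matches(user_agent, path):
            return rule.action
    return default


# Simplified picture of the upstream-like policy: well-known paths pass,
# browser-looking UAs are challenged, everything else falls through.
UPSTREAM_LIKE = [
    Rule("well-known-paths", "ALLOW",
         path_regex=r"^/(robots\.txt|favicon\.ico|sitemap\.xml|\.well-known/.*)$"),
    Rule("generic-browser", "CHALLENGE", user_agent_regex=r"Mozilla|Opera"),
]

# Strict variant: same rules plus a final catch-all, as the commit describes.
STRICT = UPSTREAM_LIKE + [Rule("catch-all", "CHALLENGE", path_regex=r".*")]

if __name__ == "__main__":
    for policy_name, policy in (("upstream-like", UPSTREAM_LIKE), ("strict", STRICT)):
        print(policy_name, evaluate(policy, "curl/8.5.0", "/"))
```

Because evaluation stops at the first match, the catch-all has to be the last entry; placed any earlier it would shadow the allow rules for robots.txt and the other well-known paths.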

Verified post-apply: curl default UA on viktorbarzin.me + kms +
travel returns the Anubis challenge HTML; /robots.txt still 200s
straight through.

2026-05-10 00:21:56 +00:00

Viktor Barzin

f48da84770

anubis: per-site PoW reverse proxy on blog + kms + travel-blog

Adds modules/kubernetes/anubis_instance/ — a per-site reverse proxy
instance pinned to ghcr.io/techarohq/anubis:v1.25.0. Each instance
issues a 30-day JWT cookie scoped to viktorbarzin.me after a tiny
proof-of-work (difficulty 2 ≈ 250 ms desktop / 700 ms mobile). The
shared ed25519 signing key (Vault: secret/viktor → anubis_ed25519_key)
makes a single solve good across every Anubis-fronted subdomain.
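
As a rough illustration of what a difficulty-2 proof-of-work involves, the sketch below brute-forces a nonce. It assumes the scheme is SHA-256 over the challenge string concatenated with the nonce, and that difficulty counts leading zero hex digits of the digest; the exact input encoding Anubis uses may differ, and the names `solve` and `is_valid` are hypothetical.

```python
import hashlib
from itertools import count


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check that sha256(challenge + nonce) starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Try nonces 0, 1, 2, ... until one satisfies the difficulty target."""
    for nonce in count():
        if is_valid(challenge, nonce, difficulty):
            return nonce
    raise RuntimeError("unreachable: count() is unbounded")


if __name__ == "__main__":
    challenge = "example-challenge"  # assumption: any server-issued string
    nonce = solve(challenge, difficulty=2)
    print("nonce:", nonce, "valid:", is_valid(challenge, nonce, 2))
```

Under that assumption, each hash succeeds with probability 1/16^2, so a solve takes about 256 attempts on average, while the server verifies it with a single hash.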

Wired into blog (viktorbarzin.me + www), kms.viktorbarzin.me, and
travel.viktorbarzin.me — each with anti_ai_scraping=false on the
ingress so the redundant ai-bot-block forwardAuth is dropped from the
chain. Skipped forgejo (Git/API clients can't solve PoW) and resume
(replicas=0).

Also tightens bot-block-proxy nginx timeouts (3s/5s → 100ms/200ms) so
any ingress still using the ai-bot-block forwardAuth pays at most
~150 ms when poison-fountain is scaled down, instead of 3 s.

End-to-end TTFB on viktorbarzin.me dropped from ~3.2 s to ~150-200 ms.

Docs: .claude/reference/patterns.md "Anti-AI Scraping" updated to
4 layers; .claude/CLAUDE.md adds the Anubis usage paragraph and
Forgejo/API caveat.

2026-05-10 00:06:21 +00:00

2 commits