Browser visits to viktorbarzin.me started returning HTTP 500 with
`store: key not found: "challenge:..."` in pod logs. Root cause:
each Anubis pod stores in-flight challenges in process memory; with
2 replicas behind a ClusterIP, the PoW-solved request can be
routed to a different pod than the one that issued the challenge.
Anubis upstream documents the same caveat ("when running multiple
instances on the same base domain, the key must be the same across
all instances" — true for the ed25519 signing key, but the
challenge store is still pod-local without a shared backend).
Mitigation: drop module default replicas 2 → 1. Worst case: ~1 s of
cold start while the lone pod restarts. Real fix (Redis-backed
challenge store) noted as a follow-up in CLAUDE.md.
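A minimal sketch of the rendered Deployment after the change; the
instance name is illustrative and the replica count is the point:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anubis-blog                  # illustrative instance name
spec:
  replicas: 1                        # was 2: the challenge store is pod-local,
                                     # so a solve routed to the other replica 500s
  selector:
    matchLabels: { app: anubis-blog }
  template:
    metadata:
      labels: { app: anubis-blog }
    spec:
      containers:
        - name: anubis
          image: ghcr.io/techarohq/anubis:v1.25.0
```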
Roll Anubis out to: f1-stream, cyberchef (cc), jsoncrack (json),
privatebin (pb), homepage (home), and real-estate-crawler (wrongmove
UI only; `/api` stays direct via a path-based ingress carve-out so
the SPA's XHRs bypass the challenge).
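A sketch of that carve-out as the rendered Ingress; hostname,
service names, and ports are illustrative (8923 is Anubis's default
bind port), and the real object is Terraform-rendered:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wrongmove
spec:
  rules:
    - host: wrongmove.viktorbarzin.me   # assumed hostname
      http:
        paths:
          - path: /api                  # SPA XHRs hit the backend directly,
            pathType: Prefix            # skipping the PoW challenge
            backend:
              service:
                name: real-estate-crawler
                port:
                  number: 80
          - path: /                     # everything else goes through Anubis
            pathType: Prefix
            backend:
              service:
                name: anubis-wrongmove
                port:
                  number: 8923
```

Longest-prefix matching means `/api` requests never touch the Anubis
service.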
End-state: 9 public services (10 hostnames) now Anubis-fronted:
blog (viktorbarzin.me + www), kms, travel, f1, cc, json, pb, home,
wrongmove. All return the challenge HTML to a bare curl or browser
request; IP-verified search engines and /robots.txt + /.well-known
still skip via the strict-policy allowlist.
The default upstream policy only WEIGHs Mozilla|Opera UAs and lets
everything else (curl, wget, python-requests, scrapy, headless CLI
scrapers) fall through to the implicit ALLOW. On non-CDN-fronted
hosts (kms, anything with dns_type=non-proxied) this meant a plain
`curl https://kms.viktorbarzin.me/` returned the real backend
content with no challenge, defeating the "avoid casual scrapers"
intent entirely.
Now the module ships a custom policy file, mounted via ConfigMap
and wired in through POLICY_FNAME:
- Imports the upstream deny-pathological / ai-block-aggressive /
allow-good-crawlers / keep-internet-working snippets unchanged
- Adds a final `path_regex: .*` → action: CHALLENGE catch-all
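Roughly what the mounted policy looks like; the `(data)/` import
paths follow upstream's snippet layout and may differ across Anubis
versions:

```yaml
bots:
  - import: (data)/bots/_deny-pathological.yaml
  - import: (data)/meta/ai-block-aggressive.yaml
  - import: (data)/crawlers/_allow-good.yaml
  - import: (data)/common/keep-internet-working.yaml
  # final catch-all: anything not already allowed or denied above
  # must solve the proof-of-work
  - name: challenge-everything-else
    path_regex: .*
    action: CHALLENGE
```

Rule order matters: the catch-all sits last so the allow/deny
snippets above it still win for verified crawlers and well-known
paths.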
Result: only IP-verified search engines (Googlebot from Google IPs,
Bingbot, etc.) and well-known paths (robots.txt, .well-known,
favicon, sitemap) skip the challenge. Everything else, including a
spoofed Googlebot UA from a random IP, solves the PoW or gets
nothing.
Verified post-apply: curl with its default UA against
viktorbarzin.me, kms, and travel returns the Anubis challenge HTML;
/robots.txt still 200s straight through.
Adds modules/kubernetes/anubis_instance/ — a per-site reverse proxy
instance pinned to ghcr.io/techarohq/anubis:v1.25.0. Each instance
issues a 30-day JWT cookie scoped to viktorbarzin.me after a tiny
proof-of-work (difficulty 2 ≈ 250 ms desktop / 700 ms mobile). The
shared ed25519 signing key (Vault: secret/viktor → anubis_ed25519_key)
makes a single solve good across every Anubis-fronted subdomain.
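Per-instance behavior is env-driven; a sketch of the rendered
container env, where the variable names are Anubis's own but the
TARGET value and Secret name are illustrative:

```yaml
env:
  - name: BIND
    value: ":8923"                   # Anubis's default listen port
  - name: TARGET                     # backend this instance fronts (illustrative)
    value: "http://blog.blog.svc.cluster.local:80"
  - name: DIFFICULTY
    value: "2"                       # ~250 ms desktop / ~700 ms mobile PoW
  - name: COOKIE_DOMAIN              # one solve is honored on every subdomain
    value: "viktorbarzin.me"
  - name: ED25519_PRIVATE_KEY_HEX    # shared signing key from Vault secret/viktor
    valueFrom:
      secretKeyRef:
        name: anubis                 # illustrative Secret name
        key: anubis_ed25519_key
```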
Wired into blog (viktorbarzin.me + www), kms.viktorbarzin.me, and
travel.viktorbarzin.me — each with anti_ai_scraping=false on the
ingress so the redundant ai-bot-block forwardAuth is dropped from the
chain. Skipped forgejo (Git/API clients can't solve PoW) and resume
(replicas=0).
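For context, a sketch of what the flag toggles, assuming the
ingress module emits ingress-nginx's external-auth annotation (the
annotation value is an assumption):

```yaml
# anti_ai_scraping=true renders something like:
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-url: "http://ai-bot-block.svc.cluster.local/verify"
# anti_ai_scraping=false omits it, so Anubis-fronted hosts skip the
# redundant forwardAuth round-trip on top of the PoW challenge.
```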
Also tightens bot-block-proxy nginx timeouts (3s/5s → 100ms/200ms) so
any ingress still using the ai-bot-block forwardAuth pays at most
~150 ms when poison-fountain is scaled down, instead of 3 s.
End-to-end TTFB on viktorbarzin.me dropped from ~3.2 s to ~150-200 ms.
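The tightened timeouts, sketched as a ConfigMap fragment; mapping
the 100 ms/200 ms pair onto connect/read directives is an assumption
(nginx accepts ms-suffixed durations):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: bot-block-proxy
data:
  nginx.conf: |
    # was 3s / 5s; fail fast when poison-fountain is scaled down
    proxy_connect_timeout 100ms;
    proxy_read_timeout    200ms;
```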
Docs: .claude/reference/patterns.md "Anti-AI Scraping" updated to
4 layers; .claude/CLAUDE.md adds the Anubis usage paragraph and
Forgejo/API caveat.