Compare commits

..

No commits in common. "6c2c56ab3b7a769c0fb30a77a5dea296f252c121" and "4df741f6de76a6e8d50fed15627e65fe160cebae" have entirely different histories.

3 changed files with 111 additions and 176 deletions

View file

@ -202,7 +202,7 @@ the workflow's built-in `GITHUB_TOKEN` (`packages: write`).
- **Critical path services scaled to 3**: Traefik, Authentik, CrowdSec LAPI, PgBouncer, Cloudflared.
- **PDBs**: minAvailable=2 on Traefik and Authentik.
- **Fallback proxies**: basicAuth when Authentik is down, fail-open when poison-fountain is down.
- **CrowdSec enforcement is out-of-band** (no Traefik plugin/middleware — the dead Yaegi `crowdsec-bouncer-traefik-plugin` was removed on Traefik 3.7.5): banned IPs are dropped **in-kernel via nftables** by the `cs-firewall-bouncer` DaemonSet on **direct** hosts (drops in BOTH the `input` and `forward` hooks — Traefik is ETP=Local so client traffic is DNAT'd to the pod via `forward`; pulls ALL decisions incl. the ~31k CAPI blocklist), and **blocked at the Cloudflare edge** for **proxied** hosts (one `crowdsec_ban` Rules List + a zone WAF block rule, fed by the `crowdsec-cf-sync` CronJob in `rybbit` ns every 2 min — excludes CAPI). Zero per-request latency; **fails open** (LAPI down → no new bans, existing drops persist, legit traffic never blocked). Whitelist covers RFC1918 + tailnet + internal CIDRs. Full as-built: `docs/architecture/security.md`.
- **CrowdSec bouncer**: graceful degradation mode (fail-open on error).
- **Rate limiting**: Return 429 (not 503). Per-service tuning via dedicated middleware + `skip_default_rate_limit` (default 10/s burst 50): Immich 1000/20000, ActualBudget 50/300 (app boot = ~70 parallel revalidations).
- **Retry middleware**: 2 attempts, 100ms — in default ingress chain.
- **Entrypoint transport timeouts** (`websecure` `respondingTimeouts`): `writeTimeout=0` (unlimited download duration), `readTimeout=3600s` (uploads ≤1h), `idleTimeout=600s`. These are **HARD total-duration caps**, not nginx-style per-read idle timeouts — a finite `writeTimeout` truncates *any* large download at that wall-clock mark (a prior `writeTimeout=60s` silently cut Immich videos at 60s). **Do NOT re-tighten `writeTimeout`**; keep `readTimeout` finite (slow-loris backstop) but ≥ longest expected upload. Full rationale: `docs/architecture/networking.md` → "Entrypoint Transport Timeouts".
@ -216,7 +216,7 @@ the workflow's built-in `GITHUB_TOKEN` (`packages: write`).
|---------|--------------------------|
| Nextcloud | MaxRequestWorkers=150, needs 8Gi limit (Apache transient memory spikes, see commit eb94144), very generous startup probe |
| Immich | ML on SSD (CUDA), disable ModSecurity (breaks streaming), frequent upgrades. **`immich-machine-learning` MUST run with `MACHINE_LEARNING_MODEL_TTL > 0`** (set to `600` in `stacks/immich/main.tf`, env on the `immich-machine-learning` deployment). At `0`, no model ever unloads and onnxruntime's CUDA arena (OCR's dynamic input shapes inflate it to ~10 GB) is held forever on the **time-sliced T4 it shares with llama-swap/frigate/immich-server** — which has no VRAM isolation, so immich-ml starved llama-swap (qwen3-8b) and silently broke recruiter-responder triage for ~5 h on 2026-06-02 (post-mortem `docs/post-mortems/2026-06-02-immich-ml-ttl-gpu-oom-recruiter.md`). TTL>0 lets idle models (OCR, face — AND CLIP) free VRAM. The TTL is a single GLOBAL knob (no per-model pin), so CLIP would also unload after 600s idle; the `clip-keepalive` CronJob (`*/5 * * * *`, same stack) pings the CLIP textual encoder so smart-search stays warm without pinning the ad-hoc models. **Smart search has a SECOND warmth layer in Postgres** (don't conflate it with the ML model): the ~665MB vchord `clip_index` must stay resident in PG `shared_buffers`, else an ANN probe that lands on an evicted list pays a ~1.8s cold storage read vs ~4ms warm. The `postStart` hook prewarms it ONCE at pod start and `pg_prewarm.autoprewarm` only re-warms at *startup*, so the index decays out of cache over days under job buffer-pressure (observed ~33% resident after 9d uptime → slow context search, easily misattributed to the ML model). The `clip-index-prewarm` CronJob (`*/5`, same stack) re-runs `pg_prewarm('clip_index')` to pin it hot; `immich-search-probe` (`*/5`) measures live latency + residency → Pushgateway gauges (`immich_smart_search_db_seconds`, `immich_clip_index_cached_pct`) → alerts `ImmichSmartSearchSlow`/`ImmichClipIndexColdCache`/`ImmichSearchProbeStale` + cluster-health check #46 (`check_immich_search`). immich PG role is a superuser so the CronJobs can run `pg_prewarm`/`pg_buffercache`. **Video transcoding is GPU-accelerated**: `immich-server` is pinned to GPU node1 (nodeSelector `nvidia.com/gpu.present` + NoSchedule toleration + `gpu-workload` priority) with a time-sliced `nvidia.com/gpu=1` slice — the stock immich-server image's ffmpeg already ships h264/hevc_nvenc + NVDEC. Activated via `ffmpeg.accel=nvenc` + `accelDecode=true` in the **DB** system-config (`system_metadata` table, key `system-config`, JSONB — NOT Terraform; app config is DB-managed here like oauth/smtp). Direct DB edits need a pod **recreate** to reload (config is cached at boot; only API-driven changes broadcast a reload). **Streaming bitrate is capped** to keep 4K playback smooth on the contended HDD and over remote uplinks: `ffmpeg.maxBitrate=20000k` + `preset=medium` + `transcode=bitrate` (set 2026-06-01 — was uncapped `maxBitrate=0` + `ultrafast` + `targetResolution=original`, which produced 77264 Mbps 4K transcodes that stuttered for every client, local and remote, since even a single stream needs ~1013.5 MB/s off the shared `sdc` spindle). 4K resolution is preserved (`targetResolution=original`); originals are NEVER modified — only the `encoded-video/` streaming copy. To re-apply transcode settings to EXISTING videos (config changes only affect new/missing ones): delete the offenders' `asset_file` rows `WHERE type='encoded_video'` (derived/regenerable — never touches originals) then run videoConversion `force=false` (admin Jobs API → "Missing"); it regenerates them to the deterministic `<assetId>.mp4` path at concurrency 1 (gentle on sdc). See `docs/runbooks/immich-transcode-bitrate.md`. If Immich is ever reinstalled fresh (not restored), re-set these keys (accel, accelDecode, **maxBitrate=20000k, preset=medium, transcode=bitrate**). Thumbnails/previews live on SSD NFS (sdb) — do NOT move to block storage (HDD sdc = slower + the contended IO domain). **Background-job concurrency is capped to protect sdc** (DB-managed system-config, `system_metadata` key `system-config`, JSONB `job.*.concurrency`; re-set on fresh install): `thumbnailGeneration=2`, `metadataExtraction=2`, `library=2` — these jobs read ORIGINALS off the HDD library. Left uncapped (were 8/4/4) a library-wide job (e.g. Duplicate Detection on 2026-06-01) fans the ML/thumbnail backfill out into a read storm that saturates sdc and starves etcd → apiserver down. `sidecar`/`smartSearch`/`faceDetection` stay at Immich defaults (small `.xmp` / SSD previews). Apply via Job Settings UI or the `system-config` API; **direct DB edits need an `immich-server` pod recreate to reload** (config cached at boot). See `docs/post-mortems/2026-05-25-immich-anca-elements-io-storm.md`. |
| CrowdSec | Pin version, disable Metabase when not needed (CPU hog), LAPI scaled to 3, **DB on PostgreSQL** (migrated from MySQL), flush config: max_items=10000/max_age=7d/agents_autodelete=30d, DECISION_DURATION=168h in blocklist CronJob. **Enforcement is out-of-band, NOT a Traefik plugin** (the Yaegi `crowdsec-bouncer-traefik-plugin` was dead on Traefik 3.7.5 and removed): `cs-firewall-bouncer` DaemonSet drops in-kernel via nftables on direct hosts (bouncer key `firewall`, v0.0.34 binary fetched at runtime, hostNetwork+NET_ADMIN, `stacks/crowdsec/modules/crowdsec/firewall_bouncer.tf`); `crowdsec-cf-sync` CronJob blocks at the CF edge for proxied hosts (bouncer key `kvsync`, `stacks/rybbit/crowdsec_edge.tf`). Both fail open. See `docs/architecture/security.md` |
| CrowdSec | Pin version, disable Metabase when not needed (CPU hog), LAPI scaled to 3, **DB on PostgreSQL** (migrated from MySQL), flush config: max_items=10000/max_age=7d/agents_autodelete=30d, DECISION_DURATION=168h in blocklist CronJob |
| Frigate | GPU stall detection in liveness probe (inference speed check), high CPU |
| Authentik | 3 server replicas + 2-replica embedded outpost (PG-backed sessions), PgBouncer in front of PostgreSQL, strip auth headers before forwarding. **`authentik.*` Helm values are INERT** (existingSecret skips chart env rendering) — tune via `server.env`/`worker.env` in `modules/authentik/values.yaml`. Single-screen login (password embedded in identification stage); all first-party OIDC apps use implicit consent (2026-06-10). `/static` ingress carve-out serves assets with immutable Cache-Control. |
| Kyverno | failurePolicy=Ignore to prevent blocking cluster, pin chart version |

View file

@ -4,7 +4,7 @@ Last updated: 2026-04-19 (WS E — Kea DHCP pushes dual DNS per subnet; Kea DDNS
## Overview
The homelab network is built on a dual-VLAN architecture with pfSense providing gateway services, Technitium for internal DNS, and Cloudflare for external DNS. Traefik serves as the Kubernetes ingress controller with a middleware chain of anti-AI bot-blocking, Authentik forward-auth, rate limiting, and retry. CrowdSec IP-reputation enforcement is **out-of-band** (not a Traefik hop): banned IPs are dropped in-kernel via nftables on direct hosts and blocked at the Cloudflare edge on proxied hosts (see `docs/architecture/security.md`). All HTTP traffic flows through Cloudflared tunnels, avoiding the need for port forwarding or exposing public IPs.
The homelab network is built on a dual-VLAN architecture with pfSense providing gateway services, Technitium for internal DNS, and Cloudflare for external DNS. Traefik serves as the Kubernetes ingress controller with a comprehensive middleware chain including CrowdSec bot protection, Authentik forward-auth, and rate limiting. All HTTP traffic flows through Cloudflared tunnels, avoiding the need for port forwarding or exposing public IPs.
## Architecture Diagram
@ -16,14 +16,12 @@ graph TB
Traefik[Traefik Ingress<br/>3 replicas + PDB]
subgraph "Middleware Chain"
AntiAI[Anti-AI bot-block<br/>fail-open]
CS[CrowdSec Bouncer<br/>fail-open]
Auth[Authentik Forward-Auth<br/>3 replicas + PDB]
RL[Rate Limiter<br/>429 response]
Retry[Retry<br/>2 attempts, 100ms]
end
CSdrop[CrowdSec drop<br/>nftables / CF edge<br/>out-of-band, pre-Traefik]
subgraph "Proxmox Host (eno1)"
vmbr0[vmbr0 Bridge<br/>192.168.1.127/24]
vmbr1[vmbr1 Internal<br/>VLAN-aware]
@ -55,9 +53,8 @@ graph TB
Internet -->|DNS query| CF
CF -->|CNAME to tunnel| CFD
CFD --> Traefik
CSdrop -.->|banned IPs dropped before Traefik| Traefik
Traefik --> AntiAI
AntiAI --> Auth
Traefik --> CS
CS --> Auth
Auth --> RL
RL --> Retry
Retry --> Service
@ -85,7 +82,7 @@ graph TB
| Cloudflare DNS | SaaS | External | ~50 public domains under viktorbarzin.me |
| Cloudflared | Container | K8s (3 replicas) | Tunnel ingress, replaces port forwarding |
| Traefik | Helm chart | K8s (3 replicas + PDB) | Ingress controller, HTTP/3 enabled |
| CrowdSec | Helm chart | K8s (LAPI: 3 replicas) | IP reputation. Out-of-band enforcement: `cs-firewall-bouncer` DaemonSet (in-kernel nftables drop, direct hosts) + Cloudflare edge WAF rule (proxied hosts). Fail-open |
| CrowdSec | Helm chart | K8s (LAPI: 3 replicas) | Bot protection, fail-open bouncer |
| Authentik | Helm chart | K8s (3 replicas + PDB) | SSO, forward-auth middleware |
| MetalLB | v0.15.3 Helm chart | K8s | LoadBalancer IPs (10.0.20.200-10.0.20.220), all services on 10.0.20.200 |
| Registry Cache | Container | 10.0.20.10 | Pull-through for docker.io:5000, ghcr.io:5010 |
@ -211,31 +208,24 @@ VMs tag traffic on vmbr1 to isolate workloads. pfSense bridges VLAN 20 to the up
### Ingress Flow
CrowdSec is **not** a step in this chain — banned IPs are dropped before the
request ever reaches Traefik (Cloudflare edge WAF rule on proxied hosts; host
nftables on direct hosts). The flow below is for a request that survives that
out-of-band gate.
```mermaid
sequenceDiagram
participant Client
participant CFedge as Cloudflare (edge WAF: crowdsec_ban block)
participant Cloudflare
participant Cloudflared
participant Traefik
participant AntiAI
participant CrowdSec
participant Authentik
participant RateLimit
participant Retry
participant Service
participant Pod
Client->>CFedge: HTTPS request to blog.viktorbarzin.me
Note over CFedge: banned IP → blocked here (proxied hosts)
CFedge->>Cloudflared: Forward via tunnel (QUIC)
Client->>Cloudflare: HTTPS request to blog.viktorbarzin.me
Cloudflare->>Cloudflared: Forward via tunnel (QUIC)
Cloudflared->>Traefik: HTTP to LoadBalancer IP
Note over Traefik: on direct hosts, banned IPs already dropped in-kernel (nftables forward hook)
Traefik->>AntiAI: anti-AI bot-block (fail-open)
AntiAI->>Authentik: If allowed, check auth (protected=true)
Traefik->>CrowdSec: Apply bouncer middleware
CrowdSec->>Authentik: If allowed, check auth (protected=true)
Authentik->>RateLimit: If authenticated, check rate limit
RateLimit->>Retry: If within limit, continue
Retry->>Service: Forward to Service
@ -244,27 +234,24 @@ sequenceDiagram
Service-->>Retry: Response
Retry-->>RateLimit: Response
RateLimit-->>Authentik: Response (strip auth headers)
Authentik-->>AntiAI: Response
AntiAI-->>Traefik: Response
Authentik-->>CrowdSec: Response
CrowdSec-->>Traefik: Response
Traefik-->>Cloudflared: Response
Cloudflared-->>CFedge: Response via tunnel
CFedge-->>Client: HTTPS response
Cloudflared-->>Cloudflare: Response via tunnel
Cloudflare-->>Client: HTTPS response
```
### Middleware Chain
CrowdSec IP-reputation enforcement is **not** in this chain — it is out-of-band
(host nftables on direct hosts; the Cloudflare edge WAF `crowdsec_ban` rule on
proxied hosts), so banned IPs never reach the chain and there is no per-request
CrowdSec hop. Every ingress created by the `ingress_factory` module follows this
Traefik chain:
Every ingress created by the `ingress_factory` module follows this chain:
1. **Anti-AI bot-block** (`ai-bot-block` ForwardAuth, on by default via `ingress_factory`): blocks/tarpits known AI crawlers. **Fail-open** (currently a no-op `return 200` — poison-fountain scaled to 0; see `docs/architecture/security.md`).
1. **CrowdSec Bouncer**: Checks IP against threat database. **Fail-open** mode — if LAPI is unreachable, traffic passes through to prevent outages.
2. **Authentik Forward-Auth** (if `protected = true`): SSO authentication via OIDC. Non-authenticated users are redirected to login. Auth headers are stripped before forwarding to backend.
3. **Rate Limiting**: Per-IP throttling. Returns **429 Too Many Requests** (not 503) when limit exceeded. Default is `rate-limit` (average 10 req/s, burst 50). Services whose clients legitimately burst harder get a dedicated middleware via `skip_default_rate_limit = true` + `extra_middlewares`: Immich (`immich-rate-limit`, 1000/20000, photo uploads) and ActualBudget (`actualbudget-rate-limit`, 50/300 — the Actual web app boots with ~70 parallel asset/migration revalidations; the default burst 429'd the tail and stalled every page load).
4. **Retry**: 2 attempts with 100ms delay on transient failures (5xx errors, connection errors).
Additional middleware:
- **Anti-AI**: On by default via `ingress_factory`. Blocks common AI crawler user-agents.
- **HTTP/3 (QUIC)**: Enabled globally on Traefik.
### Entrypoint Transport Timeouts
@ -361,7 +348,7 @@ Containerd on all K8s nodes uses `hosts.toml` to redirect pulls to the local cac
| pfSense | `stacks/pfsense/` | VM + cloud-init config |
| Technitium | `stacks/technitium/` | Deployment, Service, PVC |
| Traefik | `stacks/platform/` (sub-module) | Helm release, IngressRoute CRDs |
| CrowdSec | `stacks/crowdsec/` (+ edge in `stacks/rybbit/`) | Helm release, LAPI + agent; `cs-firewall-bouncer` DaemonSet (nftables, direct hosts) + Cloudflare edge sync (proxied hosts) |
| CrowdSec | `stacks/platform/` (sub-module) | Helm release, LAPI + bouncer |
| Authentik | `stacks/authentik/` | Helm release, ingress, OIDC configs |
| MetalLB | `stacks/platform/` (sub-module) | Helm release, IPAddressPool |
| Cloudflared | `stacks/cloudflared/` | Deployment (3 replicas), tunnel config; runs `--no-autoupdate` (in-place self-updates exited the pods and severed all tunnel WebSockets, 2026-06-09/10) |
@ -449,30 +436,13 @@ Containerd on all K8s nodes uses `hosts.toml` to redirect pulls to the local cac
**Decision**: Technitium handles internal `.lan` domains with near-zero latency. Cloudflare handles public domains with global DNS. K8s nodes use Technitium as primary, which forwards non-.lan queries to Cloudflare.
### Why CrowdSec Enforcement Is Out-of-Band (and Fails Open)
### Why Fail-Open on CrowdSec Bouncer?
CrowdSec used to enforce inline as a Traefik middleware (the
`crowdsec-bouncer-traefik-plugin`). On Traefik 3.7.5 the Yaegi plugin handler was
never invoked, so it enforced nothing; the plugin was removed and enforcement
moved off the request path entirely (full history in
`docs/architecture/security.md`). It now runs on two surfaces:
**Alternatives considered**:
1. **Fail-closed**: Maximum security, but LAPI downtime blocks all traffic.
2. **Redundant LAPI**: Already scaled to 3 replicas, but resource pressure can still cause outages.
- **Direct hosts**`cs-firewall-bouncer` DaemonSet drops banned IPs in the host
nftables, in **both the `input` and `forward` hooks**. The `forward` hook is
the load-bearing one: with Traefik on a dedicated LB IP at
`externalTrafficPolicy=Local`, client packets are DNAT'd to the Traefik **pod**
and transit the node's `forward` chain (not `input`) — which is exactly why the
ingress must preserve the **real client IP** end-to-end (ETP=Local + PROXY-v2
for IPv6; see the Traefik LB IP and IPv6 ingress notes above). Without the real
client IP the firewall-bouncer (and the CF edge rule) would have nothing to
match on.
- **Proxied hosts** → a Cloudflare edge WAF rule (`ip.src in $crowdsec_ban`) fed
by the `crowdsec-cf-sync` CronJob.
Both **fail open**: if LAPI is unreachable, the firewall-bouncer simply stops
receiving new decisions (existing drops persist) and the CF sync skips a run —
neither ever blocks legitimate traffic. Availability > strict bot blocking, and
out-of-band enforcement adds **zero per-request latency** (no Traefik hop).
**Decision**: Availability > strict bot blocking. CrowdSec LAPI is scaled to 3 replicas for resilience, but during cluster-wide resource exhaustion (e.g., memory pressure), bouncer falls back to allowing traffic. This prevents a complete service outage due to a security add-on.
### Why HTTP/3 (QUIC)?
@ -503,10 +473,9 @@ out-of-band enforcement adds **zero per-request latency** (no Traefik hop).
**Symptoms**: All ingress routes return 503, Traefik dashboard shows no backends available.
**Diagnosis**: Middleware chain is blocking traffic. (CrowdSec is **not** in the
chain — a CrowdSec/LAPI outage cannot cause 503s; it only stops new bans.) Check:
1. Authentik status: `kubectl get pod -n authentik` (ForwardAuth fails closed if the auth server is unreachable)
2. `bot-block-proxy` status: `kubectl get pod -n traefik -l app=bot-block-proxy` (anti-AI ForwardAuth target — also fails closed if down)
**Diagnosis**: Middleware chain is blocking traffic. Check:
1. Authentik status: `kubectl get pod -n authentik`
2. CrowdSec LAPI status: `kubectl get pod -n crowdsec`
3. Traefik logs: `kubectl logs -n kube-system deploy/traefik`
**Fix**: If Authentik is down and ingress uses forward-auth, pods won't pass health checks. Scale Authentik to 3 replicas or temporarily disable forward-auth middleware.

View file

@ -2,50 +2,40 @@
## Overview
The homelab implements defense-in-depth security using CrowdSec for threat intelligence and IP reputation, Kyverno for policy enforcement and resource governance, and a 3-layer anti-AI scraping defense (reduced from 5 in April 2026 after removing the rewrite-body plugin). CrowdSec enforcement is **out-of-band** (not a per-request Traefik hop — see the CrowdSec section): banned IPs are dropped in-kernel via nftables on direct hosts, and blocked at the Cloudflare edge on proxied hosts, so enforcement adds **zero per-request latency**. All security components fail open (a CrowdSec outage stops new bans but never blocks legitimate traffic). Security policies are deployed in audit mode first, then selectively enforced after validation.
The homelab implements defense-in-depth security at the application layer (L7) using CrowdSec for threat intelligence and IP reputation, Kyverno for policy enforcement and resource governance, and a 3-layer anti-AI scraping defense (reduced from 5 in April 2026 after removing the rewrite-body plugin). All security components operate in graceful degradation mode (fail-open) to prevent cascading failures. Security policies are deployed in audit mode first, then selectively enforced after validation.
## Architecture Diagram
CrowdSec enforcement is out-of-band (NOT an inline Traefik middleware hop). The
Traefik request chain is anti-AI → Authentik ForwardAuth → rate-limit → retry;
CrowdSec drops banned IPs *before* (direct hosts) or *off* (proxied hosts) that
chain entirely.
```mermaid
graph TB
graph LR
Internet[Internet]
subgraph "Proxied hosts (orange-cloud)"
CFedge[Cloudflare edge<br/>WAF rule: ip.src in $crowdsec_ban → block]
end
subgraph "Direct hosts (grey-cloud / internal)"
NFT[Host nftables<br/>table crowdsec/crowdsec6<br/>drop in input + forward]
end
CF[Cloudflare WAF]
Tunnel[Cloudflared Tunnel]
Traefik[Traefik<br/>anti-AI → Authentik → rate-limit → retry]
CrowdSec[CrowdSec Bouncer<br/>Traefik Plugin]
AntiAI[Anti-AI Check<br/>poison-fountain]
ForwardAuth[Authentik ForwardAuth]
RateLimit[Rate Limit Middleware]
Retry[Retry Middleware<br/>2 attempts, 100ms]
Backend[Backend Service]
LAPI[CrowdSec LAPI<br/>3 replicas]
Agent[CrowdSec Agent<br/>parses Traefik logs]
FWB[cs-firewall-bouncer<br/>DaemonSet, every node]
CFsync[crowdsec-cf-sync<br/>CronJob, every 2 min]
Agent[CrowdSec Agent]
Internet -->|proxied| CFedge
Internet -->|direct| NFT
CFedge -->|allowed| Tunnel
Tunnel --> Traefik
NFT -->|allowed| Traefik
Traefik --> Backend
Internet -->|1| CF
CF -->|2| Tunnel
Tunnel -->|3| CrowdSec
CrowdSec -.->|Query| LAPI
Agent -.->|Report| LAPI
CrowdSec -->|4. Pass/Block| AntiAI
AntiAI -->|5. Human/Bot| ForwardAuth
ForwardAuth -->|6. Authenticated| RateLimit
RateLimit -->|7. Under Limit| Retry
Retry -->|8. Success/Retry| Backend
Agent -.->|report| LAPI
LAPI -.->|all decisions incl. CAPI| FWB
FWB -.->|program drop rules| NFT
LAPI -.->|ban/captcha decisions, CAPI excluded| CFsync
CFsync -.->|push IP list| CFedge
style CFedge fill:#f9f,stroke:#333
style NFT fill:#f9f,stroke:#333
style CrowdSec fill:#f9f,stroke:#333
style AntiAI fill:#ff9,stroke:#333
style ForwardAuth fill:#9f9,stroke:#333
style RateLimit fill:#99f,stroke:#333
```
## Components
@ -54,8 +44,7 @@ graph TB
|-----------|---------|----------|---------|
| CrowdSec LAPI | Pinned | `stacks/crowdsec/` | Local API, threat intelligence aggregation (3 replicas) |
| CrowdSec Agent | Pinned | `stacks/crowdsec/` | Log parser, scenario detection |
| cs-firewall-bouncer | v0.0.34 | `stacks/crowdsec/modules/crowdsec/firewall_bouncer.tf` | In-kernel nftables drop on every node (DIRECT hosts). Bouncer key `firewall` |
| crowdsec-cf-sync | — | `stacks/rybbit/crowdsec_edge.tf` | LAPI→Cloudflare-IP-List sync CronJob (PROXIED hosts). Bouncer key `kvsync` |
| CrowdSec Traefik Bouncer | Plugin | Traefik config | Plugin-based IP reputation check |
| Kyverno | Pinned chart | `stacks/kyverno/` | Policy engine for K8s admission control |
| poison-fountain | Latest | `stacks/poison-fountain/` | Anti-AI bot detection and tarpit service |
| cert-manager/certbot | - | `stacks/cert-manager/` | TLS certificate management |
@ -65,15 +54,11 @@ graph TB
### Request Security Layers
CrowdSec IP-reputation enforcement happens **before** a request reaches the
Traefik chain (banned IPs are dropped in-kernel on direct hosts, or blocked at
the Cloudflare edge on proxied hosts — see CrowdSec Threat Intelligence below).
A request that survives that out-of-band gate then passes through the Traefik
middleware chain:
Every incoming request passes through 6 security layers:
1. **Cloudflare WAF / edge** - DDoS protection, bot detection, firewall rules incl. the CrowdSec `crowdsec_ban` block rule (proxied hosts only)
2. **Cloudflared Tunnel** - Zero Trust tunnel, hides origin IP (proxied hosts)
3. **CrowdSec out-of-band drop** - nftables on direct hosts; *not* a Traefik hop (zero per-request latency)
1. **Cloudflare WAF** - DDoS protection, bot detection, firewall rules (external)
2. **Cloudflared Tunnel** - Zero Trust tunnel, hides origin IP
3. **CrowdSec Bouncer** - IP reputation check against LAPI (fail-open on error)
4. **Anti-AI Scraping** - 3-layer bot defense (optional per service, updated 2026-04-17)
5. **Authentik ForwardAuth** - Authentication check (if `protected = true`)
6. **Rate Limiting** - Per-source IP rate limits (returns 429 on breach)
@ -95,71 +80,58 @@ CrowdSec operates in a hub-and-agent model:
- Reports malicious IPs to LAPI
- Shares threat intel with CrowdSec community (anonymized)
Enforcement is split across **two out-of-band surfaces**, neither of which adds
any per-request latency. (See "Why the Traefik bouncer plugin was removed" below
for the supersession history — there is no longer an inline Traefik bouncer.)
**Traefik Bouncer Plugin** (`crowdsec-bouncer-traefik-plugin`, `stacks/traefik/modules/traefik/middleware.tf`):
- Integrated as Traefik middleware (in the default ingress chain)
- Queries LAPI for IP reputation on each request
- **Registered with LAPI** via `BOUNCER_KEY_traefik` env on the LAPI container
(`stacks/crowdsec/modules/crowdsec/values.yaml`), seeded from the same Vault key
the middleware presents (`ingress_crowdsec_api_key`). **Before 2026-06-19 the
bouncer was never registered → LAPI returned 403 → the plugin failed open and
enforced nothing (no bans, no captcha).** The seed re-registers automatically on
every LAPI start, so a DB wipe (e.g. the MySQL→PostgreSQL migration that lost the
original registration) can't silently disable enforcement again.
- **Fail-open mode**: If LAPI unreachable, allows traffic (graceful degradation)
- **Only sees non-proxied (direct) apps' real client IPs** (ETP=Local). Proxied
apps arrive from cloudflared's pod IP (in `clientTrustedIPs`) and are bypassed —
extending enforcement to proxied apps needs `forwardedHeadersTrustedIPs` (future).
- Honours two LAPI remediation types (profiles in `stacks/crowdsec/modules/crowdsec/values.yaml`):
- **`ban`** → HTTP 403 (serious attacks: CVE exploits, scanners, brute force)
- **`captcha`** → **Cloudflare Turnstile challenge** so the flagged user can
self-unblock (lower-severity abuse: `http-429-abuse`, `http-403-abuse`,
`http-crawl-non_statics`, `http-sensitive-files`). The plugin is configured
with `captchaProvider=turnstile` + the widget keys; the `captcha.html`
template is mounted into the Traefik pod at `/captcha`. The widget is
Terraform-managed in `stacks/traefik/main.tf`
(`cloudflare_turnstile_widget.crowdsec_captcha`, scoped to `viktorbarzin.me`
so it covers every subdomain). **Before 2026-06-19 no captcha provider was
configured, so `captcha` decisions silently degraded to a 403 ban** — users
had no way to self-unblock; wiring Turnstile fixed that.
**Surface 1 — DIRECT (non-Cloudflare-proxied) hosts → in-kernel nftables drop**
(`cs-firewall-bouncer` DaemonSet, `stacks/crowdsec/modules/crowdsec/firewall_bouncer.tf`):
- Runs on **every node** (no nodeSelector). Programs the HOST nftables — `table ip
crowdsec` / `table ip6 crowdsec6` — with drop rules in **both the `input` AND
the `forward` hooks**. The `forward` hook is required because Traefik is a
LoadBalancer with `externalTrafficPolicy=Local`: client traffic is DNAT'd to the
Traefik **pod** and transits the node's `forward` hook (not `input`) with the
real client IP preserved. Chains use `policy accept` (only set members drop —
it can never blackhole normal traffic).
- Pulls **all** decisions from LAPI, **including the CAPI community blocklist
(~31k IPs)**. Packets from banned IPs are dropped **in-kernel before reaching
Traefik** → zero per-request hops, no Traefik involvement at all.
- **Packaging**: cs-firewall-bouncer publishes no container image, so the
**v0.0.34** static binary is fetched at runtime by an initContainer onto a
`debian:bookworm-slim` runtime container. Needs `hostNetwork` +
`NET_ADMIN`/`NET_RAW` to talk netlink directly. Registered bouncer key:
**`firewall`**.
- **Fail-open**: if LAPI is unreachable it just stops receiving new decisions
(existing drop rules persist); it never blocks legitimate traffic.
**Surface 2 — PROXIED (Cloudflare orange-cloud) hosts → Cloudflare edge block**
(`stacks/rybbit/crowdsec_edge.tf` + `lapi_kv_sync.py`):
- Proxied hosts terminate at the Cloudflare edge, so a host-level nftables drop
would never see them. Enforcement is instead a single Cloudflare Rules List
**`crowdsec_ban`** + a zone-scoped WAF custom rule `(ip.src in $crowdsec_ban)`
**block** action, which covers every proxied host in the zone.
- Fed by the **`crowdsec-cf-sync` CronJob** (namespace `rybbit`, every 2 min,
pure-stdlib Python in a ConfigMap). It pulls local **ban/captcha ip-scoped**
decisions and pushes them into the CF list, but **EXCLUDES the ~31k CAPI
community blocklist** — that set is far too large for a CF Rules List (the CF
account hard-limits to **one** list), and CAPI is already covered in-kernel on
direct hosts and by Cloudflare's own managed protections on proxied hosts.
Registered bouncer key: **`kvsync`**.
- **Block-only**: the single-list limit precludes a separate
captcha/managed-challenge list, so both ban and captcha decisions are enforced
as a plain block at the edge.
- **Auth carve-out:** the WAF rule excludes `authentik.viktorbarzin.me` +
`public-auth.viktorbarzin.me` (`… and not (http.host in {…})`). A CrowdSec hit
must never wall a user out of the login / WebAuthn flow they authenticate
through; auth keeps `traefik-rate-limit` for brute-force protection.
**Whitelist** (`stacks/crowdsec/whitelist.yaml`): a CrowdSec whitelist covers
RFC1918 + the tailnet + internal CIDRs (plus one specific external IP), so
internal users are never enforced. Internal access uses split-horizon DNS
straight to Traefik, and direct internal clients are RFC1918 — both whitelisted.
#### Why the Traefik bouncer plugin was removed
Enforcement used to run as an inline Traefik middleware — the
`crowdsec-bouncer-traefik-plugin` (Yaegi/Lua), which queried LAPI on every
request and could serve a Cloudflare Turnstile captcha for soft remediations.
On **Traefik 3.7.5 the Yaegi handler was never invoked**, so the bouncer was
registered but enforced **nothing** despite appearing healthy. Rather than chase
the Yaegi runtime, the whole plugin path was **removed** (2026-06): the plugin
static config + initContainer download, the `crowdsec` Middleware CRD, the
`captcha.html` template + its ConfigMap and volume mount, and the Cloudflare
Turnstile widget (`cloudflare_turnstile_widget.crowdsec_captcha`). It was
replaced by the two out-of-band surfaces above, which add zero per-request
latency and fail open. (The earlier `crowdsec-cf-sync` cursor-pagination /
IP-List-capacity issues are also moot now that CAPI is excluded from the edge
list and dropped in-kernel instead.)
**Cloudflare Edge Enforcement for proxied hosts** (`stacks/rybbit/crowdsec_edge.tf` + `lapi_kv_sync.py`):
- Proxied (orange-cloud) hosts terminate at the Cloudflare edge, so the in-cluster
bouncer above never decides on them. Edge enforcement instead syncs LAPI
decisions into **one Cloudflare account IP List (`crowdsec_ban`)** + a single
**zone-scoped WAF custom rule** blocking `(ip.src in $crowdsec_ban)` across every
proxied host. CronJob `crowdsec-cf-sync` (rybbit ns, every 2 min) reconciles it.
- **BAN-ONLY (2026-06-20):** only `type=ban` decisions sync to the edge. `captcha`
decisions are deliberately NOT pushed — the CF account allows only ONE Rules List
with a single block action, so folding captcha in would hard-block a soft
challenge on every proxied host. (Before 2026-06-20 captcha was downgraded to a
hard block at the edge.)
- **Auth carve-out (2026-06-20):** the WAF rule excludes `authentik.viktorbarzin.me`
+ `public-auth.viktorbarzin.me` (`… and not (http.host in {…})`), and the
Authentik UI ingress sets `exclude_crowdsec = true` for the in-cluster bouncer. A
CrowdSec hit must never wall a user out of the login / WebAuthn flow they
authenticate through; auth keeps `traefik-rate-limit` for brute-force protection.
- **⚠️ Currently NON-FUNCTIONAL (known issue, pre-existing since the 2026-06-20
rollout):** `crowdsec-cf-sync` fails every run — `cf_list_items()` pagination
gets CF `HTTP 400 code 10027 "invalid or expired cursor"`, so the list never
populates (`num_items=0`) and the edge rule blocks nothing. LAPI also returns
~31k ban IPs, likely exceeding CF IP-List capacity even once pagination is fixed.
**Edge enforcement for proxied hosts is therefore inert pending a fix** (the
in-cluster bouncer still protects direct apps; the auth carve-out is correct
regardless). Fix needs: (1) correct CF cursor pagination, (2) a capacity strategy
for the ban set.
**Metabase** (disabled by default):
- Dashboard for CrowdSec analytics
@ -405,12 +377,10 @@ Beads: `code-8ywc` W1.6 + W1.7. **Status: planned.**
| Path | Purpose |
|------|---------|
| `stacks/crowdsec/` | CrowdSec LAPI, agent config + `whitelist.yaml` |
| `stacks/crowdsec/modules/crowdsec/firewall_bouncer.tf` | cs-firewall-bouncer DaemonSet (in-kernel nftables drop, direct hosts) |
| `stacks/rybbit/crowdsec_edge.tf` + `lapi_kv_sync.py` | Cloudflare IP-List + WAF block rule + LAPI→CF sync CronJob (proxied hosts) |
| `stacks/crowdsec/` | CrowdSec LAPI, agent, bouncer config |
| `stacks/kyverno/` | Kyverno deployment + policies |
| `stacks/poison-fountain/` | Anti-AI service + CronJob |
| `stacks/traefik/modules/traefik/middleware.tf` | Security middleware definitions (no longer includes a CrowdSec bouncer) |
| `stacks/platform/modules/traefik/middleware.tf` | Security middleware definitions |
| `stacks/platform/modules/ingress_factory/` | Per-service security toggles |
### Vault Paths
@ -520,11 +490,7 @@ spec:
**Fix**:
1. Check LAPI decisions: `kubectl exec -it crowdsec-lapi-0 -- cscli decisions list`
2. Remove ban: `kubectl exec -it crowdsec-lapi-0 -- cscli decisions delete --ip <IP>`
— the in-kernel drop clears as soon as `cs-firewall-bouncer` reconciles (direct
hosts); for proxied hosts the `crowdsec-cf-sync` CronJob removes it from the
`crowdsec_ban` CF list within ~2 min.
3. Whitelist if needed: Add to `stacks/crowdsec/whitelist.yaml` (RFC1918 + tailnet
+ internal CIDRs are already whitelisted, so internal clients are never banned).
3. Whitelist if needed: Add to `stacks/crowdsec/whitelist.yaml`
### Kyverno Policy Blocking Deployment