# Security & L7 Protection
## Overview
The homelab implements defense-in-depth security at the application layer (L7) using CrowdSec for threat intelligence and IP reputation, Kyverno for policy enforcement and resource governance, and a 5-layer anti-AI scraping defense. All security components operate in graceful degradation mode (fail-open) to prevent cascading failures. Security policies are deployed in audit mode first, then selectively enforced after validation.
## Architecture Diagram
```mermaid
graph LR
Internet[Internet]
CF[Cloudflare WAF]
Tunnel[Cloudflared Tunnel]
CrowdSec[CrowdSec Bouncer
Traefik Plugin]
AntiAI[Anti-AI Check
poison-fountain]
ForwardAuth[Authentik ForwardAuth]
RateLimit[Rate Limit Middleware]
Retry[Retry Middleware
2 attempts, 100ms]
Backend[Backend Service]
LAPI[CrowdSec LAPI
3 replicas]
Agent[CrowdSec Agent]
Internet -->|1| CF
CF -->|2| Tunnel
Tunnel -->|3| CrowdSec
CrowdSec -.->|Query| LAPI
Agent -.->|Report| LAPI
CrowdSec -->|4. Pass/Block| AntiAI
AntiAI -->|5. Human/Bot| ForwardAuth
ForwardAuth -->|6. Authenticated| RateLimit
RateLimit -->|7. Under Limit| Retry
Retry -->|8. Success/Retry| Backend
style CrowdSec fill:#f9f,stroke:#333
style AntiAI fill:#ff9,stroke:#333
style ForwardAuth fill:#9f9,stroke:#333
style RateLimit fill:#99f,stroke:#333
```
## Components
| Component | Version | Location | Purpose |
|-----------|---------|----------|---------|
| CrowdSec LAPI | Pinned | `stacks/crowdsec/` | Local API, threat intelligence aggregation (3 replicas) |
| CrowdSec Agent | Pinned | `stacks/crowdsec/` | Log parser, scenario detection |
| CrowdSec Traefik Bouncer | Plugin | Traefik config | Plugin-based IP reputation check |
| Kyverno | Pinned chart | `stacks/kyverno/` | Policy engine for K8s admission control |
| poison-fountain | Latest | `stacks/poison-fountain/` | Anti-AI bot detection and tarpit service |
| cert-manager/certbot | - | `stacks/cert-manager/` | TLS certificate management |
| Traefik | Latest | `stacks/platform/` | Ingress controller with HTTP/3 (QUIC) |
## How It Works
### Request Security Layers
Every incoming request passes through 6 security layers:
1. **Cloudflare WAF** - DDoS protection, bot detection, firewall rules (external)
2. **Cloudflared Tunnel** - Zero Trust tunnel, hides origin IP
3. **CrowdSec Bouncer** - IP reputation check against LAPI (fail-open on error)
4. **Anti-AI Scraping** - 5-layer bot defense (optional per service)
5. **Authentik ForwardAuth** - Authentication check (if `protected = true`)
6. **Rate Limiting** - Per-source IP rate limits (returns 429 on breach)
7. **Retry Middleware** - Auto-retry on transient errors (2 attempts, 100ms delay)
### CrowdSec Threat Intelligence
CrowdSec operates in a hub-and-agent model:
**LAPI (Local API)**:
- 3 replicas for high availability
- Aggregates threat intelligence from agent + community
- Maintains ban list (IP reputation database)
- Version pinned to prevent breaking changes
**Agent**:
- Parses Traefik access logs
- Detects attack scenarios (SQL injection, directory traversal, brute force)
- Reports malicious IPs to LAPI
- Shares threat intel with CrowdSec community (anonymized)
**Traefik Bouncer Plugin**:
- Integrated as Traefik middleware
- Queries LAPI for IP reputation on each request
- **Fail-open mode**: If LAPI unreachable, allows traffic (graceful degradation)
- Blocks IPs on ban list, allows others
**Metabase** (disabled by default):
- Dashboard for CrowdSec analytics
- CPU-intensive, only enable when investigating incidents
### Kyverno Policy Engine
Kyverno enforces cluster-wide policies via admission webhooks. All policies use `failurePolicy=Ignore` to prevent blocking cluster operations.
#### 5-Tier Resource Governance
Namespaces are labeled with a tier (`tier: 0` through `tier: 4`). Kyverno auto-generates:
- **LimitRange** - Per-container CPU/memory limits
- **ResourceQuota** - Namespace-wide resource caps
| Tier | CPU Limit/Container | Memory Limit/Container | Namespace CPU Quota | Namespace Memory Quota |
|------|---------------------|------------------------|---------------------|------------------------|
| 0 | 100m | 128Mi | 500m | 512Mi |
| 1 | 250m | 256Mi | 1000m | 1Gi |
| 2 | 500m | 512Mi | 2000m | 2Gi |
| 3 | 1000m | 1Gi | 4000m | 4Gi |
| 4 | 2000m | 2Gi | 8000m | 8Gi |
This prevents resource exhaustion and enforces governance without manual quota management.
#### Security Policies (ALL in Audit Mode)
**Why audit mode?** Gradual rollout without breaking existing workloads. Policies collect violations, then selectively enforced after cleanup.
| Policy | Purpose | Enforcement |
|--------|---------|-------------|
| `deny-privileged-containers` | Block privileged pods | Audit |
| `deny-host-namespaces` | Block hostNetwork/hostPID/hostIPC | Audit |
| `restrict-sys-admin` | Block CAP_SYS_ADMIN | Audit |
| `require-trusted-registries` | Only allow approved image registries | Audit |
#### Operational Policies
| Policy | Purpose | Mode |
|--------|---------|------|
| `inject-priority-class-from-tier` | Set pod priorityClass based on namespace tier | Enforce (CREATE only) |
| `inject-ndots` | Set DNS `ndots:2` for faster lookups | Enforce |
| `sync-tier-label` | Propagate tier label to child resources | Enforce |
| `goldilocks-vpa-auto-mode` | Disable VPA globally (VPA off) | Enforce |
### Anti-AI Scraping (5-Layer Defense)
Enabled by default via `ingress_factory` module. Disable per-service with `anti_ai_scraping = false`.
#### Layer 1: Bot Blocking (ForwardAuth)
- Middleware calls `poison-fountain` service before backend
- Analyzes User-Agent, request patterns, timing
- Blocks known AI scrapers (GPTBot, CCBot, etc.)
- **Fail-open**: If poison-fountain down, allows traffic
#### Layer 2: X-Robots-Tag Header
- HTTP response header: `X-Robots-Tag: noai, noindex, nofollow`
- Instructs compliant bots to skip content
- Lightweight, no performance impact
#### Layer 3: Trap Links
- JavaScript injects invisible links before `