infra/docs/architecture
Viktor Barzin b3cf75dc61 docs(security): wave 1 plan — Kyverno enforce, NetworkPolicy egress, audit logging, source-IP anomaly
Locked design for wave 1 of cluster security hardening. Plan only — implementation lives in beads
code-8ywc and follow-up commits. Captures:

- security.md: Kyverno policy table updated (Audit → Enforce planned for the four security policies
  with the 31-namespace exclude list). New section "Audit Logging & Anomaly Detection" detailing the
  K8s API audit policy, Vault audit device + X-Forwarded-For trust, source-IP anomaly rules (K9, V7,
  S1), and the rejected-canary-tokens / rejected-K1 rationales. New section "NetworkPolicy
  Default-Deny Egress" describing the observe-then-enforce (γ) approach for tier 3+4.
- monitoring.md: new "Security Alerts (Wave 1)" section listing the 16 rules (K2-K9, V1-V7, S1)
  and the Loki ruler → Alertmanager → #security routing path.
- runbooks/security-incident.md (new): per-alert response playbook with LogQL queries, action
  steps, false-positive triage, and SEV1 escalation.
- .claude/CLAUDE.md: new "Security Posture" section summarising the locked decisions: identity
  allowlist is me@viktorbarzin.me ONLY, source-IP allowlist CIDRs, no public-IP access policy,
  rationale for not adopting canary tokens.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 14:16:57 +00:00
..
agent-task-tracking.md Add agent task tracking documentation 2026-04-15 17:11:26 +00:00
authentication.md docs/auth: sync to current auth enum (required/app/public/none) 2026-05-22 14:16:44 +00:00
automated-upgrades.md kured: drop Mon-Fri restriction, reboot any day 2026-05-22 14:16:48 +00:00
backup-dr.md backup: fix daily-backup silent failures, postiz pg_dump CronJob, doc reconcile 2026-05-10 11:12:39 +00:00
chrome-service.md chrome-service: open NP for Traefik → noVNC sidecar (port 6080) 2026-05-07 23:29:34 +00:00
ci-cd.md [forgejo] Phases 3+4+5: cutover, decommission, docs sweep 2026-05-07 23:29:34 +00:00
compute.md infra/compute: bump k8s-node1 RAM 32 -> 48 GiB 2026-05-22 14:16:41 +00:00
databases.md [redis] stabilise against node-crash flap cascade — RC1-RC5 fixes 2026-04-22 15:59:00 +00:00
dns.md phpipam-pfsense-import: every 5min → hourly 2026-04-26 22:48:43 +00:00
homepage.md add homepage auto-discovery documentation [ci skip] 2026-03-25 13:06:43 +02:00
incident-response.md [claude-agent-service] Migrate all pipelines from DevVM SSH to K8s HTTP 2026-04-18 10:12:02 +00:00
llama-cpp.md infra/llama-cpp: benchmark report + -fa flag fix 2026-05-22 14:16:41 +00:00
mailserver.md monitoring: bring EmailRoundtripStale threshold docs in sync with for:20m 2026-04-21 22:39:46 +00:00
monitoring.md docs(security): wave 1 plan — Kyverno enforce, NetworkPolicy egress, audit logging, source-IP anomaly 2026-05-22 14:16:57 +00:00
multi-tenancy.md add architecture documentation for all infrastructure subsystems [ci skip] 2026-03-24 00:55:25 +02:00
networking.md kms: deploy slack-notifier sidecar with Prometheus metrics + document public exposure 2026-05-10 11:12:39 +00:00
overview.md gpu: schedule off NFD label, not k8s-node1 hostname 2026-04-22 13:43:07 +00:00
secrets.md docs: comprehensive audit and update of all architecture docs and runbooks [ci skip] 2026-04-06 13:21:05 +03:00
security.md docs(security): wave 1 plan — Kyverno enforce, NetworkPolicy egress, audit logging, source-IP anomaly 2026-05-22 14:16:57 +00:00
storage.md mysql: bump to 4Gi limit / 3Gi request; grow /srv/nfs LV to 3 TiB 2026-05-10 11:12:38 +00:00
vpn.md [docs] TrueNAS decommission cleanup — remove references from active docs 2026-04-19 16:55:43 +00:00