security(wave1): W1.1 audit-log shipping LIVE + W1.5 trusted-registries Enforce LIVE

## W1.1 — K8s API audit log shipping (LIVE)
- alloy.yaml: added control-plane toleration so Alloy DaemonSet runs on
  k8s-master node. Verified alloy-7zg7t scheduled on master, tailing
  /var/log/kubernetes/audit.log
- loki.tf "Security Wave 1" rule group: added K2-K9 alert rules
  (skipped K1 per Q7 decision):
  - K2 K8sSATokenFromUnexpectedIP
  - K3 K8sSensitiveSecretReadByUnexpectedActor
  - K4 K8sExecIntoSensitiveNamespace
  - K5 K8sMassDelete (>5 Pod/Secret/CM in 60s by single user)
  - K6 K8sAuditPolicyModified (kubeadm-config CM change)
  - K7 K8sClusterRoleWildcardCreated (verbs=* + resources=*)
  - K8 K8sAnonymousBindingGranted
  - K9 K8sViktorFromUnexpectedIP
- All rules use a source-IP regex matching the wave-1 allowlist
  (10.0.20.0/22, 192.168.1.0/24, 10.10.0.0/16 pod, 10.96.0.0/12 svc,
  100.64.0.0/10 tailnet) and `lane = "security"` → #security Slack route.
- Verified: kubectl-audit logs are flowing; the Loki query
  {job="kubernetes-audit"} returns events with node=k8s-master.
- Verified: /loki/api/v1/rules lists all K2-K9 + V1-V7 + S1.
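
The source-IP allowlist above can be turned into a regex roughly as follows. This is a minimal sketch, not the rule bodies from loki.tf (which this commit view does not show): the local names, the `src` label, the 5m window, and the use of `sourceIPs[0]` from the audit event are all assumptions made for illustration.

```hcl
locals {
  # Hypothetical names. One regex fragment per allowlisted CIDR.
  allowed_source_ips = [
    "10\\.0\\.2[0-3]\\.[0-9]+",                                             # 10.0.20.0/22
    "192\\.168\\.1\\.[0-9]+",                                               # 192.168.1.0/24
    "10\\.10\\.[0-9]+\\.[0-9]+",                                            # 10.10.0.0/16 (pods)
    "10\\.(9[6-9]|10[0-9]|11[01])\\.[0-9]+\\.[0-9]+",                       # 10.96.0.0/12 (services)
    "100\\.(6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])\\.[0-9]+\\.[0-9]+",       # 100.64.0.0/10 (tailnet)
  ]

  allowed_source_ip_regex = join("|", local.allowed_source_ips)

  # Assumed shape of the shared "unexpected source IP" filter. The LogQL
  # backtick string avoids a second layer of escaping for the dots.
  unexpected_ip_expr_sketch = "sum(count_over_time({job=\"kubernetes-audit\"} | json src=\"sourceIPs[0]\" | src !~ `^(${local.allowed_source_ip_regex})$` [5m])) > 0"
}
```

A real rule would presumably add per-rule filters (user, verb, resource) before the source-IP check; on its own this expression matches any audit event from outside the allowlist.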

## W1.5 — require-trusted-registries Enforce (LIVE)
- security-policies.tf: flipped Audit→Enforce with explicit allowlist
  built by `kubectl get pods -A -o jsonpath='{..image}'` enumeration.
- Removed `*/*` catch-all (which made Audit→Enforce a no-op).
- Pattern includes 15 explicit registries, 6 DockerHub library bare
  names, 56 DockerHub user repos.
- Verified by admission dry-run:
  - evilcorp.example/malware:v1 → BLOCKED with custom message
  - alpine:3.20 → ALLOWED (matches `alpine*`)
  - docker.io/library/alpine:3.20 → ALLOWED (matches `docker.io/*`)
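
For readers unfamiliar with how the allowlist becomes a single Kyverno pattern, the sketch below shows the `join(" | ", ...)` construction on a shortened list. The local and output names are hypothetical and the lists are excerpts; the full enumeration lives in security-policies.tf.

```hcl
locals {
  # Excerpts only; names are illustrative, not taken from the repo.
  trusted_registries = ["docker.io/*", "ghcr.io/*"]
  dockerhub_library  = ["alpine*", "nginx*"]
  dockerhub_users    = ["bitnami/*", "prom/*"]

  # Kyverno reads " | " inside a pattern string as a logical OR, so the
  # image must match at least one of the joined wildcards.
  trusted_image_pattern = join(" | ", concat(
    local.trusted_registries,
    local.dockerhub_library,
    local.dockerhub_users,
  ))
}

output "trusted_image_pattern" {
  value = local.trusted_image_pattern
}
```

Splitting the list by category like this is one possible way to keep later additions reviewable; the commit itself keeps everything inline in one `join` call.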

## W1.6 — Calico flow logs (BLOCKED — Calico OSS limitation)
- Tried adding FelixConfiguration with flowLogsFileEnabled=true via
  kubectl_manifest in stacks/calico/main.tf
- Calico OSS rejected with "strict decoding error: unknown field
  spec.flowLogsFileEnabled" — these fields are Calico Enterprise/Tigera-only
- Removed the failed resource. Documented alternative paths in main.tf
  comment block: GNP with action=Log (iptables NFLOG → journal), Cilium
  migration, eBPF tooling, or Tigera Operator adoption.

## Docs updates
- security.md status table refreshed: W1.1/W1.2/W1.3/W1.4/W1.5 LIVE,
  W1.6/W1.7 blocked
- monitoring.md: Loki marked DEPLOYED (was incorrectly NOT-DEPLOYED in
  prior session before today's apply)

## Cleanup
- Removed stacks/kyverno/imports.tf (TF 1.5+ import blocks completed
  their job in the 2026-05-18 apply; TF docs allow removing them once the
  import has been applied)
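
For context, a TF 1.5+ import block has the shape sketched below. The contents of the removed imports.tf are not visible in this commit view, so the resource address and ID here are hypothetical stand-ins.

```hcl
# Hypothetical example: adopt an existing namespace into state on the next apply.
import {
  to = kubernetes_namespace.kyverno
  id = "kyverno" # for kubernetes_namespace the import ID is the namespace name
}

resource "kubernetes_namespace" "kyverno" {
  metadata {
    name = "kyverno"
  }
}
```

Once an apply has recorded the resource in state, the `import` block has nothing left to do and can be deleted without affecting the managed resource.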

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Viktor Barzin 2026-05-19 06:37:54 +00:00
parent 8ef4f06ac0
commit 669ba97078
7 changed files with 166 additions and 88 deletions


@@ -290,12 +290,13 @@ resource "kubectl_manifest" "policy_require_trusted_registries" {
}
}
spec = {
# NOTE: Stays in Audit mode pending allowlist tightening. The current
# pattern includes `*/*` which matches any image with a registry; flipping
# to Enforce would not actually restrict supply chain. Tightening the
# allowlist to a precise enumeration of in-use registries is tracked
# separately under beads code-8ywc (W1.5 follow-up).
validationFailureAction = "Audit"
# Wave 1 W1.5: flipped Audit→Enforce 2026-05-19 with explicit allowlist.
# Allowlist enumerated from `kubectl get pods -A -o jsonpath='{..image}'`
# on 2026-05-18; covers all in-cluster image sources. Update on adding new
# workloads from a registry NOT in this list (and ask if the new registry
# is trusted before opening it). The `*/*` catch-all was deliberately
# removed so unknown registries fail closed at admission.
validationFailureAction = "Enforce"
background = true
rules = [{
name = "validate-registries"
@@ -314,11 +315,39 @@ resource "kubectl_manifest" "policy_require_trusted_registries" {
}]
}
validate = {
message = "Images must be from trusted registries (docker.io, ghcr.io, quay.io, registry.k8s.io, or local cache)."
message = "Images must be from trusted registries. Allowlist defined in stacks/kyverno/modules/kyverno/security-policies.tf — add the new registry there if intentional, otherwise switch the workload to a trusted source."
pattern = {
spec = {
containers = [{
image = "docker.io/* | ghcr.io/* | quay.io/* | registry.k8s.io/* | 10.0.20.10* | */*"
image = join(" | ", [
# Explicit registries
"docker.io/*", "ghcr.io/*", "quay.io/*", "registry.k8s.io/*",
"gcr.io/*", "us-docker.pkg.dev/*", "lscr.io/*",
"codeberg.org/*", "mcr.microsoft.com/*", "nvcr.io/*",
"oci.external-secrets.io/*", "reg.kyverno.io/*",
"docker.n8n.io/*", "registry.gitlab.com/*",
# Private
"forgejo.viktorbarzin.me/*", "10.0.20.10*",
# DockerHub library (bare image names without slash)
"alpine*", "busybox*", "kong*", "mysql*", "nginx*", "python*",
# DockerHub user repos (no registry prefix, has slash)
# enumerated from current cluster state.
"actualbudget/*", "afadil/*", "binwiederhier/*", "bitnami/*",
"clickhouse/*", "cloudflare/*", "coturn/*", "crowdsecurity/*",
"curlimages/*", "deluan/*", "dgtlmoon/*", "dolthub/*",
"dpage/*", "dperson/*", "edoburu/*", "esanchezm/*",
"freikin/*", "freshrss/*", "hackmdio/*", "hashicorp/*",
"headscale/*", "jhonderson/*", "kebe/*", "library/*",
"lissy93/*", "louislam/*", "matrixdotorg/*", "mendhak/*",
"mghee/*", "mindflavor/*", "mpepping/*", "netsampler/*",
"nvidia/*", "onlyoffice/*", "openresty/*", "owntracks/*",
"phpipam/*", "phpmyadmin/*", "privatebin/*", "prom/*",
"prompve/*", "rancher/*", "roundcube/*", "sclevine/*",
"shadowsocks/*", "shlinkio/*", "stirlingtools/*",
"technitium/*", "teddysun/*", "temporalio/*",
"typhonragewind/*", "tzahi12345/*", "vabene1111/*",
"vaultwarden/*", "viktorbarzin/*", "viren070/*", "zelest/*",
])
}]
}
}