infra/scripts/k8s-apiserver-audit-policy.yaml
Viktor Barzin fd0f4a0365 fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:45:33 +00:00

# kube-apiserver audit policy -- k8s-master (10.0.20.100), single control-plane.
#
# Goal: a durable "who/when/what" trail for MUTATIONS (create/update/patch/
# delete) so resource deletions can be attributed even though direct
# kubectl-to-apiserver calls otherwise leave no trace (see the 2026-06-06
# novelapp incident: a dashboard delete was attributable, a direct-kubectl
# recreate was not). Deployed OUTSIDE Terraform (the k8s VMs are not TF-managed,
# see memory id=1575); this file is the source of truth, scp'd to
# /etc/kubernetes/audit-policy.yaml and wired into the apiserver static-pod
# manifest + the kubeadm-config ConfigMap (so "kubeadm upgrade" preserves it).
#
# Tuned for LOW WRITE VOLUME (the cluster's sdc HDD is write-sensitive, see
# memory id=559): reads are dropped entirely, high-churn resources and probe
# endpoints are dropped, and the verbose RequestReceived stage is omitted, so
# only one Metadata-level line is written per mutating request.
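#
# Wiring sketch (illustrative only; the live static-pod manifest is not part of
# this file). The apiserver manifest would typically carry flags like the ones
# below. The policy path matches the one named above; the log path and the
# rotation values are ASSUMPTIONS, not the deployed settings.
#
#   spec:
#     containers:
#       - command:
#           - kube-apiserver
#           - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
#           - --audit-log-path=/var/log/kubernetes/audit/audit.log  # assumed path
#           - --audit-log-maxage=30     # days to keep rotated files (assumed)
#           - --audit-log-maxbackup=5   # rotated files to keep (assumed)
#           - --audit-log-maxsize=50    # MiB per file before rotation (assumed)
#
# The policy file and the log directory also need hostPath volumes mounted into
# the static pod, otherwise the apiserver cannot read the policy or write the log.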
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage -- roughly halves volume vs. logging both the
# request and the response stage of each call.
omitStages:
  - RequestReceived
rules:
  # 1. Never log read-only verbs -- the overwhelming majority of traffic and
  #    irrelevant to "who changed/deleted X".
  - level: None
    verbs: ["get", "list", "watch"]
  # 2. Drop high-churn / low-value resources even on writes.
  - level: None
    resources:
      - group: ""
        resources: ["events", "endpoints", "nodes/status", "pods/status"]
      - group: "coordination.k8s.io"
        resources: ["leases"]
      - group: "discovery.k8s.io"
        resources: ["endpointslices"]
      - group: "metrics.k8s.io"
      - group: "authentication.k8s.io"
        resources: ["tokenreviews"]
      - group: "authorization.k8s.io"
        resources: ["subjectaccessreviews", "selfsubjectaccessreviews"]
  # 3. Drop noisy non-resource probe / discovery URLs.
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/readyz*"
      - "/livez*"
      - "/version"
      - "/metrics"
      - "/openapi/*"
      - "/swagger*"
  # 4. Everything else (every create/update/patch/delete on real resources):
  #    record WHO (user + sourceIP + userAgent), WHAT (resource/namespace/name),
  #    WHEN, and the verb -- at Metadata level (no request/response bodies, so
  #    each entry stays small).
  - level: Metadata
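#
# For orientation, a Metadata-level entry produced by rule 4 carries roughly the
# fields below. The log itself is JSON, one object per line; it is shown here in
# YAML form for readability. Every value is a made-up placeholder, not taken
# from the cluster.
#
#   kind: Event
#   apiVersion: audit.k8s.io/v1
#   level: Metadata
#   stage: ResponseComplete
#   verb: delete
#   user: {username: "example-user"}            # WHO
#   sourceIPs: ["192.0.2.10"]                   # WHO (documentation-range IP)
#   userAgent: "kubectl/..."                    # WHO
#   objectRef:                                  # WHAT
#     apiGroup: "apps"
#     resource: "deployments"
#     namespace: "example-ns"
#     name: "example-app"
#   responseStatus: {code: 200}
#   requestReceivedTimestamp: "..."             # WHEN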