update claude knowledge: vault-native secrets migration decisions [ci skip]

This commit is contained in:
Viktor Barzin 2026-03-15 21:00:07 +00:00
parent 1ffce19611
commit 23dfaa1ac8

View file

@ -23,16 +23,16 @@
- **Vault is the sole source of truth** for secrets. SOPS pipeline has been removed entirely.
- **Auth**: `vault login -method=oidc` (Authentik SSO) → `~/.vault-token` → read by Vault TF provider.
- **Vault stack self-reads**: `data "vault_kv_secret_v2" "vault"` reads its own OIDC creds from `secret/vault`.
- **Consuming stacks** read from `data "vault_kv_secret_v2" "secrets"` at `secret/<stack-name>`.
- **External Secrets Operator (ESO)**: `stacks/external-secrets/` syncs Vault KV → K8s Secrets via `ClusterSecretStore`.
- **Database rotation**: Vault database secrets engine rotates app DB passwords every 24h (MySQL + PostgreSQL static roles).
- **K8s credentials**: Vault K8s secrets engine issues short-lived tokens for dashboard, CI, OpenClaw, local admin.
- **CI/CD (Woodpecker)**: Authenticates via K8s service account JWT → Vault K8s auth method.
- **Platform cannot depend on vault** (circular). Apply order: vault first, then platform.
- **ESO (External Secrets Operator)**: `stacks/external-secrets/` — 26 ExternalSecrets sync Vault KV → K8s Secrets. API version `v1beta1`. Two ClusterSecretStores: `vault-kv` and `vault-database`.
- **Plan-time vs runtime**: Stacks using secrets in TF expressions (jsondecode, locals, module inputs, Helm templatefile) keep `data "vault_kv_secret_v2"`. Only direct `env { value = ... }` can migrate to `value_from { secret_key_ref }`. 15 fully migrated, 11 partial, 17 unchanged.
- **Database rotation**: Vault DB engine rotates passwords every 24h. MySQL: speedtest, wrongmove, codimd, nextcloud, shlink, grafana. PostgreSQL: trading, health, linkwarden, affine, woodpecker, claude_memory. Excluded: authentik (PgBouncer), technitium/crowdsec (Helm-baked), root users.
- **K8s credentials**: Vault K8s secrets engine. Roles: `dashboard-admin`, `ci-deployer`, `openclaw`, `local-admin`. Use `vault write kubernetes/creds/ROLE kubernetes_namespace=NS`. Helper: `scripts/vault-kubeconfig`.
- **CI/CD (Woodpecker)**: Authenticates via K8s SA JWT → Vault K8s auth. Sync CronJob pushes `secret/ci/global` → Woodpecker API every 6h. Shell scripts in HCL heredocs: escape `$``$$`, `%{}``%%{}`.
- **Platform cannot depend on vault** (circular). Apply order: vault first, then platform. Platform has 48 vault refs, all in module inputs — no ESO migration possible.
- **Complex types** (maps/lists like `homepage_credentials`, `k8s_users`) stored as JSON strings in KV, decoded with `jsondecode()` in consuming stack `locals` blocks.
- **New stacks**: Add secret in Vault UI/CLI at `secret/<stack-name>`, then use `data "vault_kv_secret_v2" "secrets"` in the stack.
- **New stacks**: Add secret in Vault UI/CLI at `secret/<stack-name>`, add ExternalSecret for runtime delivery, use `data "vault_kv_secret_v2"` only if needed at plan time.
- **Backup CronJob**: `vault-raft-backup` uses manually-created `vault-root-token` K8s Secret (independent of automation).
- **Bootstrap (fresh cluster)**: See vault/main.tf comments — comment out data source + OIDC, deploy Helm, init+unseal, populate `secret/vault`, uncomment, re-apply.
- **Bootstrap (fresh cluster)**: Comment out data source + OIDC → apply Helm → init+unseal → populate `secret/vault` → uncomment → re-apply.
## Resource Management Patterns
- **CPU**: All CPU limits removed cluster-wide (CFS throttling). Only set CPU requests based on actual usage.