docs: comprehensive audit and update of all architecture docs and runbooks [ci skip]

Audited 14 documentation files against live cluster state and Terraform code.

Architecture docs:
- databases.md: MySQL 8.4.4, proxmox-lvm storage (not iSCSI), anti-affinity
  excludes k8s-node1 (GPU), 2Gi/3Gi resources, 7-day rotation (not 24h),
  CNPG 2 instances, PostGIS 16, postgresql.dbaas has endpoints
- overview.md: 1x CPU, ~160GB RAM, all nodes 32GB, proxmox-lvm storage,
  correct Vault paths (secret/ not kv/)
- compute.md: 272GB physical host RAM, ~160GB allocated to VMs
- secrets.md: 7-day rotation, 7 MySQL + 5 PG roles, correct ESO config
- networking.md: MetalLB pool 10.0.20.200-220
- ci-cd.md: 9 GHA projects, travel_blog 5.7GB

Runbooks:
- restore-mysql/postgresql: backup files are .sql.gz (not .sql)
- restore-vault: weekly backup (not daily), auto-unseal sidecar note
- restore-vaultwarden: PVC is proxmox (not iscsi)
- restore-full-cluster: updated node roles, removed trading

Reference docs:
- CLAUDE.md: 7-day rotation, removed trading from PG list
- AGENTS.md: 100+ stacks, proxmox-lvm, platform empty shell
- service-catalog.md: 6 new stacks, 14 stack column updates
Viktor Barzin 2026-04-06 13:21:05 +03:00
parent 06359aa3fa
commit fc233bd27f
14 changed files with 152 additions and 142 deletions


@@ -10,7 +10,7 @@ Secrets management is centralized in HashiCorp Vault as the single source of tru
graph TB
subgraph "Secret Sources"
VAULT_KV[Vault KV<br/>secret/viktor<br/>135+ keys]
VAULT_DB[Vault DB Engine<br/>24h rotation]
VAULT_DB[Vault DB Engine<br/>7-day rotation]
VAULT_K8S[Vault K8s Engine<br/>Dynamic SA tokens]
USER[User-managed<br/>sealed-*.yaml]
end
@@ -46,7 +46,7 @@ graph TB
graph LR
subgraph "Database Credential Rotation"
VAULT_ROOT[Vault Root Creds] --> VAULT_DB_ENGINE[Vault DB Engine]
VAULT_DB_ENGINE -->|Create role| DB_ROLE[DB Role: 24h TTL]
VAULT_DB_ENGINE -->|Create role| DB_ROLE[DB Role: 7-day TTL]
DB_ROLE -->|ESO syncs| K8S_SECRET[K8s Secret]
K8S_SECRET -->|App reads| APP[Application Pod]
APP -->|Uses rotated creds| DATABASE[(MySQL/PostgreSQL)]
@@ -63,7 +63,7 @@ graph LR
| Sealed Secrets | Latest | `stacks/platform/` | User-managed encrypted secrets |
| SOPS | Latest | `scripts/state-sync`, `scripts/tg` | Terraform state encryption (Vault Transit + age) |
| Vault K8s Auth | Enabled | `stacks/vault/` | CI/CD authentication via service account tokens |
| Vault DB Engine | Enabled | `stacks/vault/` | Dynamic DB credentials for 6 MySQL + 8 PostgreSQL databases |
| Vault DB Engine | Enabled | `stacks/vault/` | Dynamic DB credentials for 7 MySQL + 5 PostgreSQL databases |
## How It Works
@@ -108,37 +108,34 @@ ESO creates/updates K8s Secrets automatically when Vault values change. Applicat
### Database Credential Rotation
Vault DB engine provides automatic 24h credential rotation for:
Vault DB engine provides automatic 7-day credential rotation for:
**MySQL databases** (6):
**MySQL databases** (7):
- speedtest
- wrongmove
- codimd
- nextcloud
- shlink
- grafana
- technitium
**PostgreSQL databases** (8):
- trading
**PostgreSQL databases** (5):
- health
- linkwarden
- affine
- woodpecker
- claude_memory
- (2 others)
**Excluded from rotation**:
- authentik (uses PgBouncer, incompatible with rotation)
- technitium, crowdsec (Helm charts bake credentials at install time)
- crowdsec (Helm chart bakes credentials at install time)
- Root user accounts (used for Vault itself to create rotated users)
Workflow:
1. ESO requests credentials from Vault DB engine
2. Vault creates new DB user with 24h TTL
3. ESO writes credentials to K8s Secret
4. Application reads credentials from secret
5. Vault automatically revokes user after 24h
6. ESO requests new credentials, cycle repeats
1. Vault rotates the database user's password (static role, 7-day period)
2. ExternalSecrets Operator syncs new password to K8s Secret (15-min refresh)
3. Apps read from K8s Secret via `secret_key_ref` env vars
4. Special case: Technitium uses a CronJob to push password to its app config via API
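The timing implied by the updated workflow can be sketched as follows. This is a minimal illustration, not part of the repository: it assumes ESO refreshes on a fixed 15-minute cadence starting from some known sync time, and the helper names `next_sync_after` and `stale_window` are hypothetical. It shows how long a K8s Secret could still hold the previous password after Vault rotates it.

```python
from datetime import datetime, timedelta

# Values taken from the workflow above.
ROTATION_PERIOD = timedelta(days=7)   # Vault static-role rotation period
ESO_REFRESH = timedelta(minutes=15)   # ExternalSecret refresh interval


def next_sync_after(rotation: datetime, first_sync: datetime) -> datetime:
    """Return the first ESO refresh at or after `rotation`.

    Assumption: refreshes repeat exactly every ESO_REFRESH after `first_sync`.
    """
    if rotation <= first_sync:
        return first_sync
    elapsed = rotation - first_sync
    # Ceiling division: whole refresh intervals needed to reach `rotation`.
    intervals = -(-elapsed // ESO_REFRESH)
    return first_sync + intervals * ESO_REFRESH


def stale_window(rotation: datetime, first_sync: datetime) -> timedelta:
    """How long the K8s Secret still holds the pre-rotation password."""
    return next_sync_after(rotation, first_sync) - rotation
```

Under these assumptions the stale window never exceeds the 15-minute refresh interval, which is small relative to the 7-day rotation period. Whether an application notices the stale password depends on when it opens new database connections and re-reads its environment, which this sketch does not model.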
### Kubernetes Credential Management
@@ -225,13 +222,13 @@ metadata:
spec:
provider:
vault:
server: "https://vault.viktorbarzin.me"
server: "http://vault-active.vault.svc.cluster.local:8200"
path: secret
version: v2
auth:
kubernetes:
mountPath: kubernetes
role: external-secrets
role: eso
```
**ExternalSecret example**:
@@ -318,7 +315,7 @@ provider "vault" {
### Why Vault DB engine over static credentials?
**Automatic rotation**: 24h TTL reduces credential exposure window.
**Automatic rotation**: 7-day TTL reduces credential exposure window.
**Audit trail**: Every credential generation logged in Vault.