docs: comprehensive audit and update of all architecture docs and runbooks [ci skip]
Audited 14 documentation files against live cluster state and Terraform code.

Architecture docs:
- databases.md: MySQL 8.4.4, proxmox-lvm storage (not iSCSI), anti-affinity excludes k8s-node1 (GPU), 2Gi/3Gi resources, 7-day rotation (not 24h), CNPG 2 instances, PostGIS 16, postgresql.dbaas has endpoints
- overview.md: 1x CPU, ~160GB RAM, all nodes 32GB, proxmox-lvm storage, correct Vault paths (secret/ not kv/)
- compute.md: 272GB physical host RAM, ~160GB allocated to VMs
- secrets.md: 7-day rotation, 7 MySQL + 5 PG roles, correct ESO config
- networking.md: MetalLB pool 10.0.20.200-220
- ci-cd.md: 9 GHA projects, travel_blog 5.7GB

Runbooks:
- restore-mysql/postgresql: backup files are .sql.gz (not .sql)
- restore-vault: weekly backup (not daily), auto-unseal sidecar note
- restore-vaultwarden: PVC is proxmox (not iscsi)
- restore-full-cluster: updated node roles, removed trading

Reference docs:
- CLAUDE.md: 7-day rotation, removed trading from PG list
- AGENTS.md: 100+ stacks, proxmox-lvm, platform empty shell
- service-catalog.md: 6 new stacks, 14 stack column updates
Commit fc233bd27f (parent 06359aa3fa)
14 changed files with 152 additions and 142 deletions
@@ -54,7 +54,7 @@ Violations cause state drift, which causes future applies to break or silently r
- **ESO (External Secrets Operator)**: `stacks/external-secrets/` — 43 ExternalSecrets + 9 DB-creds ExternalSecrets. API version `v1beta1`. Two ClusterSecretStores: `vault-kv` and `vault-database`.
- **Plan-time pattern**: Former plan-time stacks use `data "kubernetes_secret"` to read ESO-created K8s Secrets at plan time (no Vault dependency). First-apply gotcha: must `terragrunt apply -target=kubernetes_manifest.external_secret` first, then full apply. `count` on resources using secret values fails — remove conditional counts.
- **14 hybrid stacks** still keep `data "vault_kv_secret_v2"` for plan-time needs (job commands, Helm templatefile, module inputs). Platform has 48 plan-time refs — no migration possible without restructuring modules.
- **Database rotation**: Vault DB engine rotates passwords every 24h. MySQL: speedtest, wrongmove, codimd, nextcloud, shlink, grafana, technitium. PostgreSQL: trading, health, linkwarden, affine, woodpecker, claude_memory. Excluded: authentik (PgBouncer), crowdsec (Helm-baked), root users. Technitium uses a password-sync CronJob (every 6h) to push rotated password to the Technitium app config via API.
- **Database rotation**: Vault DB engine rotates passwords every 7 days (604800s). MySQL: speedtest, wrongmove, codimd, nextcloud, shlink, grafana, technitium. PostgreSQL: health, linkwarden, affine, woodpecker, claude_memory. Excluded: authentik (PgBouncer), crowdsec (Helm-baked), root users. Technitium uses a password-sync CronJob (every 6h) to push rotated password to the Technitium app config via API.
- **K8s credentials**: Vault K8s secrets engine. Roles: `dashboard-admin`, `ci-deployer`, `openclaw`, `local-admin`. Use `vault write kubernetes/creds/ROLE kubernetes_namespace=NS`. Helper: `scripts/vault-kubeconfig`.
- **CI/CD (GHA + Woodpecker)**: Docker builds run on **GitHub Actions** (free on public repos). Woodpecker is **deploy-only** — receives image tag via API POST, runs `kubectl set image`. Woodpecker authenticates via K8s SA JWT → Vault K8s auth. Sync CronJob pushes `secret/ci/global` → Woodpecker API every 6h. Shell scripts in HCL heredocs: escape `$` → `$$`, `%{}` → `%%{}`.
- **Platform cannot depend on vault** (circular). Apply order: vault first, then platform. Platform has 48 vault refs, all in module inputs — no ESO migration possible.
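For reference, both Vault flows mentioned above can be exercised from the CLI. A minimal sketch, assuming the DB secrets engine sits at the default `database/` mount; `local-admin` is one of the K8s roles listed above, while the static-role name is a placeholder:

```bash
# Dynamic K8s credentials (command form taken from the docs above)
vault write kubernetes/creds/local-admin kubernetes_namespace=default

# Inspect the current rotated password for a DB static role (role name is a placeholder)
vault read database/static-creds/<role-name>
```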
@@ -79,7 +79,7 @@ Violations cause state drift, which causes future applies to break or silently r
**Flow**: `git push → GHA build+push DockerHub (8-char SHA) → POST Woodpecker API → kubectl set image`
**Migrated to GHA** (7): Website, k8s-portal, f1-stream, claude-memory-mcp, apple-health-data, audiblez-web, plotting-book
**Migrated to GHA** (9): Website, k8s-portal, f1-stream, claude-memory-mcp, apple-health-data, audiblez-web, plotting-book, insta2spotify, audiobook-search
**Woodpecker-only**: travel_blog (1.4GB content too large for GHA), infra pipelines (terragrunt apply, certbot, build-cli — need cluster access)
**Per-project files**:
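For context, a hedged sketch of the hand-off described in the flow line above. The Woodpecker endpoint path, repo ID, and deployment/image names are placeholders, not taken from the repo; `GITHUB_SHA` is the standard GHA variable and `WOODPECKER_TOKEN` is the repo secret listed further down:

```bash
#!/usr/bin/env bash
# GHA has already built and pushed <dockerhub-user>/<app>:<8-char sha> to DockerHub
TAG="${GITHUB_SHA:0:8}"

# Notify the deploy-only Woodpecker instance (URL and payload shape are assumptions)
curl -fsS -X POST "https://woodpecker.example/api/repos/<repo-id>/pipelines" \
  -H "Authorization: Bearer ${WOODPECKER_TOKEN}" \
  -d "{\"variables\":{\"IMAGE_TAG\":\"${TAG}\"}}"

# Woodpecker's deploy step then rolls the image:
kubectl set image deployment/<app> <container>=<dockerhub-user>/<app>:"${TAG}"
```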
@@ -97,7 +97,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
**GitHub repo secrets** (set on all repos): `DOCKERHUB_USERNAME`, `DOCKERHUB_TOKEN`, `WOODPECKER_TOKEN`
**Infra pipelines unchanged**: `default.yml` (terragrunt apply), `renew-tls.yml` (certbot cron), `build-cli.yml` (dual registry push), `k8s-portal.yml` (path-filtered build) — all stay on Woodpecker.
**Infra pipelines unchanged**: `default.yml` (terragrunt apply), `renew-tls.yml` (certbot cron), `build-cli.yml` (dual registry push), `k8s-portal.yml` (path-filtered build), `provision-user.yml` — all stay on Woodpecker.

## Database Host
@@ -121,7 +121,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
| Frigate | GPU stall detection in liveness probe (inference speed check), high CPU |
| Authentik | 3 replicas, PgBouncer in front of PostgreSQL, strip auth headers before forwarding |
| Kyverno | failurePolicy=Ignore to prevent blocking cluster, pin chart version |
| MySQL InnoDB | Enable auto-recovery, anti-affinity excludes node2 (SIGBUS), 4.4Gi req but ~1Gi used |
| MySQL InnoDB | Enable auto-recovery, anti-affinity excludes k8s-node1 (GPU), 2Gi req / 3Gi limit |

## Monitoring & Alerting
- Alert cascade inhibitions: if node is down, suppress pod alerts on that node.
@@ -5,33 +5,33 @@

## Critical - Network & Auth (Tier: core)
| Service | Description | Stack |
|---------|-------------|-------|
| wireguard | VPN server | platform |
| technitium | DNS server (10.0.20.101) | platform |
| headscale | Tailscale control server | platform |
| traefik | Ingress controller (Helm) | platform |
| wireguard | VPN server | wireguard |
| technitium | DNS server (10.0.20.101) | technitium |
| headscale | Tailscale control server | headscale |
| traefik | Ingress controller (Helm) | traefik |
| xray | Proxy/tunnel | platform |
| authentik | Identity provider (SSO) | platform |
| cloudflared | Cloudflare tunnel | platform |
| authelia | Auth middleware | platform |
| monitoring | Prometheus/Grafana/Loki stack | platform |
| authentik | Identity provider (SSO) | authentik |
| cloudflared | Cloudflare tunnel | cloudflared |
| authelia | Auth middleware (may be merged into ebooks or removed) | platform |
| monitoring | Prometheus/Grafana/Loki stack | monitoring |

## Storage & Security (Tier: cluster)
| Service | Description | Stack |
|---------|-------------|-------|
| vaultwarden | Bitwarden-compatible password manager | platform |
| redis | Shared Redis at `redis.redis.svc.cluster.local` | platform |
| redis | Shared Redis at `redis.redis.svc.cluster.local` | redis |
| immich | Photo management (GPU) | immich |
| nvidia | GPU device plugin | platform |
| metrics-server | K8s metrics | platform |
| uptime-kuma | Status monitoring | platform |
| crowdsec | Security/WAF | platform |
| kyverno | Policy engine | platform |
| nvidia | GPU device plugin | nvidia |
| metrics-server | K8s metrics | metrics-server |
| uptime-kuma | Status monitoring | uptime-kuma |
| crowdsec | Security/WAF | crowdsec |
| kyverno | Policy engine | kyverno |

## Admin
| Service | Description | Stack |
|---------|-------------|-------|
| k8s-dashboard | Kubernetes dashboard | platform |
| reverse-proxy | Generic reverse proxy | platform |
| k8s-dashboard | Kubernetes dashboard | k8s-dashboard |
| reverse-proxy | Generic reverse proxy | reverse-proxy |

## Active Use
| Service | Description | Stack |
@@ -43,12 +43,15 @@
| dawarich | Location history | dawarich |
| owntracks | Location tracking | owntracks |
| nextcloud | File sync/share | nextcloud |
| calibre | E-book management | calibre |
| calibre | E-book management (may be merged into ebooks stack) | calibre |
| onlyoffice | Document editing | onlyoffice |
| f1-stream | F1 streaming | f1-stream |
| rybbit | Analytics | rybbit |
| isponsorblocktv | SponsorBlock for TV | isponsorblocktv |
| actualbudget | Budgeting (factory pattern) | actualbudget |
| insta2spotify | Instagram reel song ID to Spotify playlist | insta2spotify |
| trading-bot | Event-driven trading with sentiment analysis | trading-bot |
| claude-memory | Persistent memory MCP server | claude-memory |

## Optional
| Service | Description | Stack |
@@ -69,7 +72,7 @@
| send | Firefox Send | send |
| ytdlp | YouTube downloader | ytdlp |
| wealthfolio | Finance tracking | wealthfolio |
| audiobookshelf | Audiobook server | audiobookshelf |
| audiobookshelf | Audiobook server (may be merged into ebooks stack) | audiobookshelf |
| paperless-ngx | Document management | paperless-ngx |
| jsoncrack | JSON visualizer | jsoncrack |
| servarr | Media automation (Sonarr/Radarr/etc) | servarr |

@@ -103,6 +106,9 @@
| grampsweb | Genealogy web app (Gramps Web) | grampsweb |
| openclaw | AI agent gateway (OpenClaw) | openclaw |
| poison-fountain | Anti-AI scraping (tarpit + poison) | poison-fountain |
| priority-pass | Boarding pass color transformer | priority-pass |
| status-page | Status page | status-page |
| plotting-book | Book plotting/world-building app | plotting-book |

## Cloudflare Domains
AGENTS.md (12 changed lines)
@@ -42,11 +42,11 @@ For secrets that users manage themselves (no SOPS/git-crypt access needed):

## Architecture
Terragrunt-based homelab managing a Kubernetes cluster (5 nodes, v1.34.2) on Proxmox VMs.
- **70+ services**, each in `stacks/<service>/` with its own Terraform state
- **Core platform**: `stacks/platform/modules/` (~22 modules: Traefik, Kyverno, monitoring, dbaas, sealed-secrets, etc.)
- **100+ stacks**, each in `stacks/<service>/` with its own Terraform state
- **Core platform**: `stacks/platform/` is now an empty shell — all modules have been extracted to independent stacks under `stacks/`
- **Public domain**: `viktorbarzin.me` (Cloudflare) | **Internal**: `viktorbarzin.lan` (Technitium DNS)
- **Onboarding portal**: `https://k8s-portal.viktorbarzin.me` — self-service kubectl setup + docs
- **CI/CD**: Woodpecker CI — PRs run plan, merges to master auto-apply platform stack
- **CI/CD**: Woodpecker CI — PRs run plan, merges to master auto-apply all stacks

## Key Paths
- `stacks/<service>/main.tf` — service definition
@@ -60,9 +60,9 @@ Terragrunt-based homelab managing a Kubernetes cluster (5 nodes, v1.34.2) on Pro

## Storage
- **NFS** (`nfs-truenas` StorageClass): For app data. Use the `nfs_volume` module, never inline `nfs {}` blocks.
- **iSCSI** (`iscsi-truenas` StorageClass): For databases (PostgreSQL, MySQL). democratic-csi driver.
- **proxmox-lvm** (`proxmox-lvm` StorageClass): For databases (PostgreSQL, MySQL). TopoLVM driver.
- **TrueNAS**: 10.0.10.15. NFS exports managed via `secrets/nfs_exports.sh`.
- **SQLite on NFS is unreliable** (fsync issues) — always use iSCSI or local disk for databases.
- **SQLite on NFS is unreliable** (fsync issues) — always use proxmox-lvm or local disk for databases.
- **NFS mount options**: Always `soft,timeo=30,retrans=3` to prevent uninterruptible sleep (D state).
- **NFS export directory must exist** on TrueNAS before Terraform can create the PV.
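To sanity-check the NFS mount options called out above from any node, a minimal sketch; 10.0.10.15 is the TrueNAS address from this doc and the export path is a placeholder:

```bash
# Mount an export with the recommended soft/timeout options
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs -o soft,timeo=30,retrans=3 10.0.10.15:/mnt/main/<export> /mnt/nfs-test

# Confirm the effective options, then clean up
mount | grep /mnt/nfs-test
sudo umount /mnt/nfs-test
```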
@@ -81,7 +81,7 @@ Terragrunt-based homelab managing a Kubernetes cluster (5 nodes, v1.34.2) on Pro
- **GPU**: `node_selector = { "gpu": "true" }` + toleration `nvidia.com/gpu`
- **Pull-through cache**: 10.0.20.10 — docker.io (:5000), ghcr.io (:5010) only. Caches stale manifests for :latest tags — use versioned tags or pre-pull with `ctr --hosts-dir ''` to bypass.
- **pfSense**: 10.0.20.1 (gateway, firewall, DNS forwarding)
- **MySQL InnoDB Cluster**: 3 instances on iSCSI, anti-affinity excludes node2 (SIGBUS bug)
- **MySQL InnoDB Cluster**: 3 instances on proxmox-lvm, anti-affinity excludes k8s-node1 (GPU node)
- **SMTP**: `var.mail_host` port 587 STARTTLS (not internal svc address — cert mismatch)

## Contributor Onboarding
@@ -58,7 +58,7 @@ graph LR

### Project Migration Status

**Migrated to GHA (7 projects)**:
**Migrated to GHA (9 projects)**:
- Website
- k8s-portal
- f1-stream

@@ -66,9 +66,11 @@ graph LR
- apple-health-data
- audiblez-web
- plotting-book
- insta2spotify
- book-search (audiobook-search)

**Woodpecker-only (infra + large apps)**:
- `travel_blog`: 1.4GB content directory exceeds GHA limits
- `travel_blog`: 5.7GB content directory exceeds GHA limits
- Infra pipelines: require cluster access (terragrunt apply, certbot, build-cli)

### Woodpecker Pipeline Files

@@ -265,7 +267,7 @@ commands:

### travel_blog Build Times Out on GHA

**Cause**: 1.4GB content directory exceeds GHA disk/time limits
**Cause**: 5.7GB content directory exceeds GHA disk/time limits

**Fix**: Keep on Woodpecker (no migration). Build uses cluster storage and resources.
@@ -17,11 +17,11 @@ graph TB
subgraph Proxmox["Proxmox VE"]
direction TB
MASTER["VM 200: k8s-master<br/>8c / 16GB<br/>10.0.20.100"]
MASTER["VM 200: k8s-master<br/>8c / 32GB<br/>10.0.20.100"]
NODE1["VM 201: k8s-node1<br/>16c / 32GB<br/>GPU Passthrough<br/>nvidia.com/gpu=true:NoSchedule"]
NODE2["VM 202: k8s-node2<br/>8c / 24GB"]
NODE3["VM 203: k8s-node3<br/>8c / 24GB"]
NODE4["VM 204: k8s-node4<br/>8c / 24GB"]
NODE2["VM 202: k8s-node2<br/>8c / 32GB"]
NODE3["VM 203: k8s-node3<br/>8c / 32GB"]
NODE4["VM 204: k8s-node4<br/>8c / 32GB"]
end

subgraph K8s["Kubernetes Cluster v1.34.2"]
@@ -62,7 +62,7 @@
| Model | Dell PowerEdge R730 |
| CPU | 1x Intel Xeon E5-2699 v4 (22 cores / 44 threads, CPU2 unpopulated) |
| Total Cores/Threads | 22 cores / 44 threads |
| RAM | 272GB DDR4-2400 ECC RDIMM (10 DIMMs: 8x32G Samsung + 2x8G Hynix) |
| RAM | 272GB DDR4-2400 ECC RDIMM physical (10 DIMMs: 8x32G Samsung + 2x8G Hynix). VMs use ~160GB total (5 K8s VMs x 32GB) |
| GPU | NVIDIA Tesla T4 (16GB GDDR6, PCIe 0000:06:00.0) |
| Storage | 1.1TB SSD + 931GB SSD + 10.7TB HDD |
| Hypervisor | Proxmox VE |
@@ -71,13 +71,13 @@

| VM | VMID | vCPUs | RAM | Network | Role | Taints |
|----|------|-------|-----|---------|------|--------|
| k8s-master | 200 | 8 | 16GB | vmbr1:vlan20 (10.0.20.100) | Control Plane | `node-role.kubernetes.io/control-plane:NoSchedule` |
| k8s-master | 200 | 8 | 32GB | vmbr1:vlan20 (10.0.20.100) | Control Plane | `node-role.kubernetes.io/control-plane:NoSchedule` |
| k8s-node1 | 201 | 16 | 32GB | vmbr1:vlan20 | GPU Worker | `nvidia.com/gpu=true:NoSchedule` |
| k8s-node2 | 202 | 8 | 24GB | vmbr1:vlan20 | Worker | None |
| k8s-node3 | 203 | 8 | 24GB | vmbr1:vlan20 | Worker | None |
| k8s-node4 | 204 | 8 | 24GB | vmbr1:vlan20 | Worker | None |
| k8s-node2 | 202 | 8 | 32GB | vmbr1:vlan20 | Worker | None |
| k8s-node3 | 203 | 8 | 32GB | vmbr1:vlan20 | Worker | None |
| k8s-node4 | 204 | 8 | 32GB | vmbr1:vlan20 | Worker | None |

**Total Cluster Resources**: 48 vCPUs, 120GB RAM (excluding control plane)
**Total Cluster Resources**: 48 vCPUs, ~160GB RAM (5 nodes x 32GB)

### GPU Passthrough
@@ -443,7 +443,7 @@ spec:
**Rationale**:
- **CFS Throttling**: Linux Completely Fair Scheduler throttles containers to their exact CPU limit, even when CPU is idle. This causes artificial performance degradation.
- **Burstability**: Services can burst to unused CPU during low-load periods, improving response times.
- **Memory-bound**: With 272GB host RAM (180GB allocated to VMs), memory is no longer the primary constraint. 92GB headroom available for new VMs.
- **Memory-bound**: With 272GB physical host RAM (~160GB allocated to K8s VMs), memory is no longer the primary constraint. ~112GB headroom available for new VMs.

**Tradeoff**: A runaway process could monopolize CPU. Mitigated by CPU requests reserving capacity and PriorityClass preemption.
@@ -21,13 +21,12 @@ graph TB
A4 --> PGB
PGB --> CNPG_RW[CNPG Primary<br/>pg-cluster-rw.dbaas]
CNPG_RW --> CNPG_R1[CNPG Replica 1]
CNPG_RW --> CNPG_R2[CNPG Replica 2]
end

subgraph MySQL
A3 --> MYC[MySQL InnoDB Cluster<br/>3 instances]
MYC --> ISCSI1[iSCSI Storage]
MYC -.anti-affinity.-> NODE2[Exclude node2<br/>SIGBUS bug]
MYC --> LVM1[Proxmox-LVM Storage]
MYC -.anti-affinity.-> NODE1[Exclude k8s-node1<br/>GPU node]
end

subgraph Redis
@@ -36,8 +35,8 @@

subgraph Vault
V[Vault DB Engine]
V -.24h rotation.-> PGB
V -.24h rotation.-> MYC
V -.7-day rotation.-> PGB
V -.7-day rotation.-> MYC
end

style CNPG_RW fill:#2088ff
@@ -50,9 +49,9 @@

| Component | Version | Location | Purpose |
|-----------|---------|----------|---------|
| PostgreSQL (CNPG) | CloudNativePG | `dbaas` namespace | Primary/replica cluster, auto-failover |
| PostgreSQL (CNPG) | CloudNativePG (PostGIS 16: `postgis:16`) | `dbaas` namespace | Primary/replica cluster, auto-failover |
| PgBouncer | 3 replicas | `dbaas` namespace | Connection pooling for PostgreSQL |
| MySQL InnoDB Cluster | 8.x | `dbaas` namespace | Multi-master MySQL cluster |
| MySQL InnoDB Cluster | 8.4.4 | `dbaas` namespace | Multi-master MySQL cluster |
| Redis | Latest | `redis` namespace | Shared cache layer |
| Vault DB Engine | - | `vault` namespace | Automated credential rotation |
@@ -64,7 +63,7 @@
| PgBouncer | `pgbouncer.dbaas.svc.cluster.local` | Connection pool (3 replicas) |
| MySQL | `mysql.dbaas.svc.cluster.local` | InnoDB Cluster VIP |
| Redis | `redis.redis.svc.cluster.local` | Shared instance |
| **NEVER USE** | `postgresql.dbaas.svc.cluster.local` | Legacy service, no endpoints |
| PostgreSQL (compat) | `postgresql.dbaas.svc.cluster.local` | Compatibility service, selects CNPG primary |

## How It Works
@@ -80,7 +79,7 @@
- Reduces connection overhead
- Load balances across PgBouncer instances

3. **Credential Rotation**: Vault DB engine rotates credentials every 24h
3. **Credential Rotation**: Vault DB engine rotates credentials every 7 days
- Apps fetch credentials from Vault on startup
- Vault manages rotation lifecycle
@@ -91,7 +90,7 @@
- affine
- woodpecker
- claude-memory-mcp
- ~12 stacks total
- 5 active PG roles

### MySQL InnoDB Cluster
@@ -99,16 +98,16 @@
- Multi-master replication
- Automatic split-brain resolution

2. **Storage**: iSCSI-backed persistent volumes
- Low-latency block storage
- Better performance than NFS
2. **Storage**: Proxmox-LVM persistent volumes
- Thin-provisioned LVM on Proxmox hosts
- Block-level storage with proper write guarantees

3. **Anti-Affinity**: Excludes node2 due to SIGBUS bug
- Pods scheduled to node1, node3, node4, etc.
- Prevents kernel panic crashes
3. **Anti-Affinity**: Excludes k8s-node1 (GPU node)
- Pods scheduled to node2, node3, node4, etc.
- Keeps database workloads off the GPU-dedicated node

4. **Resource Allocation**: 4.4Gi memory request, ~1Gi actual usage
- Over-provisioned for safety
4. **Resource Allocation**: 2Gi request / 3Gi limit
- Right-sized based on VPA recommendations

**Used by**:
- wrongmove (realestate-crawler)
@@ -137,14 +136,13 @@
**Critical**: SQLite on NFS is unreliable
- NFS lacks proper `fsync()` support
- Causes database corruption under load
- **Solution**: Use iSCSI-backed volumes for SQLite apps
- **Solution**: Use Proxmox-LVM volumes for SQLite apps

### Vault Database Engine

**Rotation Schedule**: 24 hours
**Rotation Schedule**: 7 days (604800s)

**PostgreSQL Rotation**:
- trading
- health (apple-health-data)
- linkwarden
- affine
@@ -229,8 +227,8 @@ resource "vault_database_secret_backend_role" "app" {
    "CREATE USER \"{{name}}\" WITH PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
    "GRANT ALL PRIVILEGES ON DATABASE myapp TO \"{{name}}\";"
  ]
  default_ttl = 86400 # 24 hours
  max_ttl = 86400
  default_ttl = 604800 # 7 days
  max_ttl = 604800
}

resource "kubernetes_secret" "db_creds" {
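To confirm the new TTL after applying the change above, a minimal sketch; it assumes the DB secrets engine is mounted at the default `database/` path and uses the `app` role name from the snippet:

```bash
# Role definition should now report default_ttl / max_ttl of 604800 (7 days)
vault read database/roles/app

# Generating one credential set shows the matching lease duration
vault read database/creds/app
```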
@@ -281,17 +279,17 @@
- Automatic split-brain resolution
- Simpler than Galera

### Why iSCSI Storage for Databases?
### Why Block Storage for Databases?

- NFS lacks proper `fsync()` support (causes SQLite corruption)
- iSCSI provides block-level storage with proper write guarantees
- Proxmox-LVM provides block-level storage with proper write guarantees
- Lower latency than NFS for database workloads

### Why 24h Credential Rotation?
### Why 7-Day Credential Rotation?

- Balance between security (shorter is better) and operational overhead
- 24h allows time to debug issues before next rotation
- Aligns with daily ops cycle
- 7 days allows ample time to debug issues before next rotation
- Reduces rotation-related disruptions while maintaining security hygiene

### Why Shared Redis (Not Per-App)?
@@ -331,9 +329,9 @@ kubectl get cluster -n dbaas
kubectl cnpg promote pg-cluster-2 -n dbaas
```

### MySQL: Pod Stuck on node2
### MySQL: Pod Stuck on Excluded Node

**Cause**: Anti-affinity rule not applied
**Cause**: Anti-affinity rule not applied (should exclude k8s-node1)

**Fix**:
```bash

@@ -344,17 +342,17 @@ kubectl get pod <mysql-pod> -n dbaas -o yaml | grep -A 10 affinity
kubectl delete pod <mysql-pod> -n dbaas
```

### MySQL: SIGBUS Crash on node2
### MySQL: Pod Scheduled on GPU Node

**Cause**: Known kernel bug on node2 with iSCSI storage
**Cause**: Anti-affinity rule not preventing scheduling on k8s-node1

**Fix**:
```bash
# Cordon node2 to prevent scheduling
kubectl cordon node2
# Check pod affinity rules
kubectl get pod <mysql-pod> -n dbaas -o yaml | grep -A 10 affinity

# Delete MySQL pods on node2
kubectl delete pod -n dbaas -l app=mysql --field-selector spec.nodeName=node2
# Delete pod to reschedule away from node1
kubectl delete pod <mysql-pod> -n dbaas
```

### SQLite: Database Corruption
@@ -366,10 +364,10 @@
# Check volume type
kubectl get pv | grep <app>

# If NFS, migrate to iSCSI:
# 1. Create iSCSI PVC
# If NFS, migrate to proxmox-lvm:
# 1. Create proxmox-lvm PVC
# 2. Backup SQLite database
# 3. Restore to iSCSI volume
# 3. Restore to proxmox-lvm volume
# 4. Update app to use new volume
```
@@ -410,15 +408,15 @@ CONFIG REWRITE

### App Can't Connect: "Connection refused"

**Cause**: Using legacy `postgresql.dbaas` service (no endpoints)
**Cause**: Service endpoint not reachable or PgBouncer not running

**Fix**:
```bash
# Check service endpoints
kubectl get endpoints pgbouncer -n dbaas
kubectl get endpoints postgresql -n dbaas
# Output: No endpoints (this is the problem)

# Update app to use pg-cluster-rw or pgbouncer
# Update app to use pgbouncer
kubectl set env deployment/<app> DB_HOST=pgbouncer.dbaas.svc.cluster.local
```
@@ -36,7 +36,7 @@ graph TB
subgraph "VLAN 20 - Kubernetes<br/>10.0.20.0/24"
pfSense[pfSense<br/>10.0.20.1<br/>Gateway/NAT/DHCP]
Tech[Technitium DNS<br/>10.0.20.101<br/>viktorbarzin.lan]
MLB[MetalLB Pool<br/>10.0.20.102-200]
MLB[MetalLB Pool<br/>10.0.20.200-10.0.20.220]

subgraph "K8s Nodes"
Master[k8s-master]
@@ -85,7 +85,7 @@
| Traefik | Helm chart | K8s (3 replicas + PDB) | Ingress controller, HTTP/3 enabled |
| CrowdSec | Helm chart | K8s (LAPI: 3 replicas) | Bot protection, fail-open bouncer |
| Authentik | Helm chart | K8s (3 replicas + PDB) | SSO, forward-auth middleware |
| MetalLB | v0.15.3 Helm chart | K8s | LoadBalancer IPs (10.0.20.102-200), all services on 10.0.20.200 |
| MetalLB | v0.15.3 Helm chart | K8s | LoadBalancer IPs (10.0.20.200-10.0.20.220), all services on 10.0.20.200 |
| Registry Cache | Container | 10.0.20.10 | Pull-through for docker.io:5000, ghcr.io:5010 |

## How It Works
@@ -165,7 +165,7 @@ Additional middleware:

### MetalLB & Load Balancing

MetalLB v0.15.3 allocates IPs from the range 10.0.20.102-200 in **Layer 2 mode**. All 11 LoadBalancer services share a single IP (**10.0.20.200**) using the `metallb.io/allow-shared-ip: shared` annotation. Services sharing an IP must use the same `externalTrafficPolicy` (standardized to `Cluster`).
MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer 2 mode**. All 11 LoadBalancer services share a single IP (**10.0.20.200**) using the `metallb.io/allow-shared-ip: shared` annotation. Services sharing an IP must use the same `externalTrafficPolicy` (standardized to `Cluster`).

| Service | Namespace | Ports |
|---------|-----------|-------|
@@ -236,7 +236,7 @@ Containerd on all K8s nodes uses `hosts.toml` to redirect pulls to the local cac

**MetalLB**:
- Helm values: `stacks/platform/metallb-values.yaml`
- IPAddressPool CRD: `10.0.20.102-10.0.20.200`
- IPAddressPool CRD: `10.0.20.200-10.0.20.220`
- All 11 LB services consolidated on `10.0.20.200` with `metallb.io/allow-shared-ip: shared`
- Requires matching `externalTrafficPolicy` (all use `Cluster`) for IP sharing
@@ -327,7 +327,7 @@
**Diagnosis**: Check MetalLB logs: `kubectl logs -n metallb-system deploy/controller`

**Common causes**:
1. **IP pool exhausted**: 98 IPs available (10.0.20.102-200), check `kubectl get svc -A | grep LoadBalancer`
1. **IP pool exhausted**: 21 IPs available (10.0.20.200-10.0.20.220), check `kubectl get svc -A | grep LoadBalancer`
2. **Missing allow-shared-ip annotation**: Services must have `metallb.io/allow-shared-ip: shared` and `metallb.io/loadBalancerIPs: 10.0.20.200`
3. **Mismatched externalTrafficPolicy**: All services sharing an IP must use the same ETP (currently `Cluster`). Error: "can't change sharing key"
4. **MetalLB controller crash-looping**: Resource limits too low
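When a service is stuck without an IP because of the sharing rules above, these checks usually narrow it down. A minimal sketch; the service name and namespace are placeholders, and the annotations are the ones documented in this file:

```bash
# How many LoadBalancer services are competing for the 21-IP pool?
kubectl get svc -A | grep LoadBalancer

# Join the shared IP (annotations as documented above)
kubectl annotate svc <name> -n <namespace> \
  metallb.io/allow-shared-ip=shared \
  metallb.io/loadBalancerIPs=10.0.20.200 --overwrite

# Sharing also requires a matching externalTrafficPolicy (Cluster)
kubectl patch svc <name> -n <namespace> -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'
```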
@@ -9,7 +9,7 @@ This homelab infrastructure runs a production-grade Kubernetes cluster on Proxmo
```mermaid
graph TB
subgraph Physical["Physical Hardware"]
R730["Dell R730<br/>22c/44t Xeon E5-2699 v4<br/>142GB RAM<br/>NVIDIA Tesla T4<br/>1.1TB + 931GB + 10.7TB"]
R730["Dell R730<br/>22c/44t Xeon E5-2699 v4<br/>~160GB RAM<br/>NVIDIA Tesla T4<br/>1.1TB + 931GB + 10.7TB"]
end

subgraph Proxmox["Proxmox VE"]
@@ -67,8 +67,8 @@
| Component | Specification |
|-----------|---------------|
| Server | Dell PowerEdge R730 |
| CPU | 2x Intel Xeon E5-2699 v4 (22 cores / 44 threads each, 44c/88t total) |
| RAM | 142GB DDR4 ECC |
| CPU | 1x Intel Xeon E5-2699 v4 (22 cores / 44 threads, CPU2 unpopulated) |
| RAM | ~160GB DDR4 ECC |
| GPU | NVIDIA Tesla T4 (16GB, PCIe 0000:06:00.0) |
| Storage | 1.1TB SSD + 931GB SSD + 10.7TB HDD |
| Network | eno1 (physical), vmbr0 (physical bridge), vmbr1 (VLAN-aware internal) |
@@ -88,11 +88,11 @@
| 101 | pfsense | 8 | 16GB | vmbr0, vmbr1:vlan10, vmbr1:vlan20 | - | Gateway/firewall routing between VLANs |
| 102 | devvm | 16 | 8GB | vmbr1:vlan10 | - | Development VM |
| 103 | home-assistant | 8 | 8GB | vmbr0 | - | Home Assistant Sofia instance |
| 200 | k8s-master | 8 | 8GB | vmbr1:vlan20 | 10.0.20.100 | Kubernetes control plane |
| 201 | k8s-node1 | 16 | 16GB | vmbr1:vlan20 | - | GPU worker node (Tesla T4 passthrough) |
| 202 | k8s-node2 | 8 | 24GB | vmbr1:vlan20 | - | Worker node |
| 203 | k8s-node3 | 8 | 24GB | vmbr1:vlan20 | - | Worker node |
| 204 | k8s-node4 | 8 | 24GB | vmbr1:vlan20 | - | Worker node |
| 200 | k8s-master | 8 | 32GB | vmbr1:vlan20 | 10.0.20.100 | Kubernetes control plane |
| 201 | k8s-node1 | 16 | 32GB | vmbr1:vlan20 | - | GPU worker node (Tesla T4 passthrough) |
| 202 | k8s-node2 | 8 | 32GB | vmbr1:vlan20 | - | Worker node |
| 203 | k8s-node3 | 8 | 32GB | vmbr1:vlan20 | - | Worker node |
| 204 | k8s-node4 | 8 | 32GB | vmbr1:vlan20 | - | Worker node |
| 220 | docker-registry | 4 | 4GB | vmbr1:vlan20 | 10.0.20.10 | Private Docker registry |
| 9000 | truenas | 16 | 16GB | vmbr1:vlan10 | 10.0.10.15 | NFS storage server |
@@ -103,7 +103,7 @@
| Version | v1.34.2 |
| Nodes | 5 (1 control plane, 4 workers) |
| CNI | Calico |
| Storage | democratic-csi (NFS) + local-path-provisioner |
| Storage | NFS (democratic-csi) + Proxmox-LVM (Proxmox CSI) |
| Ingress | Traefik v3 |
| Total Services | 70+ services across 5 tiers |
@@ -123,7 +123,7 @@ The cluster uses a five-tier namespace system managed by Kyverno, which automati

### Physical Layer

The infrastructure runs on a single Dell R730 server with dual Xeon CPUs and 142GB RAM. Proxmox VE provides hypervisor capabilities with hardware passthrough support for the Tesla T4 GPU. The physical network interface (eno1) bridges to vmbr0 for physical network access, while vmbr1 provides VLAN-aware internal networking.
The infrastructure runs on a single Dell R730 server with a Xeon E5-2699 v4 CPU and ~160GB RAM. Proxmox VE provides hypervisor capabilities with hardware passthrough support for the Tesla T4 GPU. The physical network interface (eno1) bridges to vmbr0 for physical network access, while vmbr1 provides VLAN-aware internal networking.

### Network Layer
@@ -137,9 +137,9 @@ This three-tier network design isolates Kubernetes workloads from management inf
### Compute Layer

The Kubernetes cluster consists of 5 nodes:
- **k8s-master (200)**: 8c/8GB control plane running kube-apiserver, etcd, controller-manager
- **k8s-node1 (201)**: 16c/16GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
- **k8s-node2-4 (202-204)**: 8c/24GB workers running general-purpose workloads
- **k8s-master (200)**: 8c/32GB control plane running kube-apiserver, etcd, controller-manager
- **k8s-node1 (201)**: 16c/32GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
- **k8s-node2-4 (202-204)**: 8c/32GB workers running general-purpose workloads

GPU passthrough on node1 uses PCIe device 0000:06:00.0, with Kubernetes taint `nvidia.com/gpu=true:NoSchedule` and label `gpu=true` to ensure only GPU-requesting pods schedule there.
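A quick way to verify the GPU scheduling setup described above, using only standard kubectl; output formats vary slightly by version:

```bash
# All five nodes, roles and versions
kubectl get nodes -o wide

# The GPU node should carry the taint and label mentioned above
kubectl describe node k8s-node1 | grep -i -A2 taints
kubectl get node k8s-node1 --show-labels | grep gpu
```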
@@ -201,11 +201,11 @@ Each service lives in `stacks/<service>/` with its own Terragrunt configuration.

### Vault Paths

Secrets are stored in HashiCorp Vault under `kv/`:
- `kv/<service>/*` - Service-specific secrets
- `kv/cloudflare` - Cloudflare API tokens
- `kv/authentik` - OIDC client credentials
- `kv/backup` - Backup encryption keys
Secrets are stored in HashiCorp Vault under `secret/`:
- `secret/<service>/*` - Service-specific secrets
- `secret/cloudflare` - Cloudflare API tokens
- `secret/authentik` - OIDC client credentials
- `secret/backup` - Backup encryption keys
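As a sanity check on the corrected mount, a minimal sketch reading one of the paths above with the Vault CLI; it assumes the KV v2 engine is mounted at `secret/` as stated and that VAULT_ADDR and a token are already configured:

```bash
# List service paths under the KV v2 mount
vault kv list secret/

# Read one of the documented entries
vault kv get secret/cloudflare
```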
## Decisions & Rationale
@@ -240,7 +240,7 @@ Secrets are stored in HashiCorp Vault under `kv/`:
**Rationale**:
- **CFS throttling**: Linux CFS throttles containers to exact CPU limit even when CPU is idle, causing artificial slowdowns
- **Burstability**: Services can burst to unused CPU during idle periods
- **Memory is the constraint**: With 142GB RAM, memory exhaustion occurs before CPU saturation
- **Memory is the constraint**: With ~160GB RAM across VMs, memory exhaustion occurs before CPU saturation

**Tradeoff**: A runaway process could monopolize CPU (mitigated by CPU requests reserving capacity).
@@ -10,7 +10,7 @@ Secrets management is centralized in HashiCorp Vault as the single source of tru
graph TB
subgraph "Secret Sources"
VAULT_KV[Vault KV<br/>secret/viktor<br/>135+ keys]
VAULT_DB[Vault DB Engine<br/>24h rotation]
VAULT_DB[Vault DB Engine<br/>7-day rotation]
VAULT_K8S[Vault K8s Engine<br/>Dynamic SA tokens]
USER[User-managed<br/>sealed-*.yaml]
end
@@ -46,7 +46,7 @@
graph LR
subgraph "Database Credential Rotation"
VAULT_ROOT[Vault Root Creds] --> VAULT_DB_ENGINE[Vault DB Engine]
VAULT_DB_ENGINE -->|Create role| DB_ROLE[DB Role: 24h TTL]
VAULT_DB_ENGINE -->|Create role| DB_ROLE[DB Role: 7-day TTL]
DB_ROLE -->|ESO syncs| K8S_SECRET[K8s Secret]
K8S_SECRET -->|App reads| APP[Application Pod]
APP -->|Uses rotated creds| DATABASE[(MySQL/PostgreSQL)]
@@ -63,7 +63,7 @@
| Sealed Secrets | Latest | `stacks/platform/` | User-managed encrypted secrets |
| SOPS | Latest | `scripts/state-sync`, `scripts/tg` | Terraform state encryption (Vault Transit + age) |
| Vault K8s Auth | Enabled | `stacks/vault/` | CI/CD authentication via service account tokens |
| Vault DB Engine | Enabled | `stacks/vault/` | Dynamic DB credentials for 6 MySQL + 8 PostgreSQL databases |
| Vault DB Engine | Enabled | `stacks/vault/` | Dynamic DB credentials for 7 MySQL + 5 PostgreSQL databases |

## How It Works
@@ -108,37 +108,34 @@ ESO creates/updates K8s Secrets automatically when Vault values change. Applicat

### Database Credential Rotation

Vault DB engine provides automatic 24h credential rotation for:
Vault DB engine provides automatic 7 days credential rotation for:

**MySQL databases** (6):
**MySQL databases** (7):
- speedtest
- wrongmove
- codimd
- nextcloud
- shlink
- grafana
- technitium

**PostgreSQL databases** (8):
- trading
**PostgreSQL databases** (5):
- health
- linkwarden
- affine
- woodpecker
- claude_memory
- (2 others)

**Excluded from rotation**:
- authentik (uses PgBouncer, incompatible with rotation)
- technitium, crowdsec (Helm charts bake credentials at install time)
- crowdsec (Helm chart bakes credentials at install time)
- Root user accounts (used for Vault itself to create rotated users)

Workflow:
1. ESO requests credentials from Vault DB engine
2. Vault creates new DB user with 24h TTL
3. ESO writes credentials to K8s Secret
4. Application reads credentials from secret
5. Vault automatically revokes user after 24h
6. ESO requests new credentials, cycle repeats
1. Vault rotates the database user's password (static role, 7-day period)
2. ExternalSecrets Operator syncs new password to K8s Secret (15-min refresh)
3. Apps read from K8s Secret via `secret_key_ref` env vars
4. Special case: Technitium uses a CronJob to push password to its app config via API
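To exercise the rotation workflow above by hand, a minimal sketch; it assumes the DB secrets engine is mounted at the default `database/` path, and the role, secret, and namespace names are placeholders:

```bash
# Force an out-of-band rotation of a static role's password
vault write -f database/rotate-role/<role-name>

# Read the password Vault is now tracking for that role
vault read database/static-creds/<role-name>

# Watch ESO propagate it to the K8s Secret (15-min refresh per the docs)
kubectl get externalsecret -A | grep <role-name>
kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.password}' | base64 -d
```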
### Kubernetes Credential Management

@@ -225,13 +222,13 @@ metadata:
spec:
  provider:
    vault:
      server: "https://vault.viktorbarzin.me"
      server: "http://vault-active.vault.svc.cluster.local:8200"
      path: secret
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets
          role: eso
```
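A quick health check on the store after the URL and role change above; `vault-kv` is one of the ClusterSecretStores named earlier in this commit, and the ExternalSecret name/namespace are placeholders:

```bash
# ClusterSecretStores should report READY=True / STATUS=Valid
kubectl get clustersecretstores

# Drill into the store or a failing ExternalSecret
kubectl describe clustersecretstore vault-kv
kubectl describe externalsecret <name> -n <namespace>
```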
**ExternalSecret example**:
@@ -318,7 +315,7 @@ provider "vault" {

### Why Vault DB engine over static credentials?

**Automatic rotation**: 24h TTL reduces credential exposure window.
**Automatic rotation**: 7-day TTL reduces credential exposure window.

**Audit trail**: Every credential generation logged in Vault.
@@ -23,7 +23,8 @@ cd infra
scripts/tg apply stacks/infra

# 2. Wait for VMs to boot and be reachable
# k8s-master, k8s-node3, k8s-node4, k8s-node5 (node1/2 excluded)
# k8s-master, k8s-node3, k8s-node4, k8s-node5
# (node1 has GPU workloads, node2 excluded from MySQL anti-affinity only — both are active cluster members)
```

### Phase 2: Kubernetes Control Plane

@@ -84,7 +85,7 @@ kubectl wait --for=condition=Ready cluster/pg-cluster -n dbaas --timeout=600s
### Phase 7: Application Services
```bash
# 17. Deploy remaining stacks in any order
for stack in vaultwarden immich nextcloud linkwarden trading health; do
for stack in vaultwarden immich nextcloud linkwarden health; do
scripts/tg apply stacks/$stack
done
@@ -6,7 +6,7 @@
- Backup dump available on NFS at `/mnt/main/mysql-backup/`

## Backup Location
- NFS: `/mnt/main/mysql-backup/dump_YYYY_MM_DD_HH_MM.sql`
- NFS: `/mnt/main/mysql-backup/dump_YYYY_MM_DD_HH_MM.sql.gz`
- Replicated to Synology NAS (192.168.1.13) via TrueNAS ZFS replication
- Retention: 14 days
- Size: ~11MB per dump

@@ -34,8 +34,8 @@ kubectl port-forward svc/mysql -n dbaas 3307:3306 &
# Get root password
ROOT_PWD=$(kubectl get secret cluster-secret -n dbaas -o jsonpath='{.data.ROOT_PASSWORD}' | base64 -d)

# Restore (use --host to avoid unix socket, specify non-default port)
mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307 < /path/to/dump_YYYY_MM_DD_HH_MM.sql
# Restore (decompress and pipe to mysql, use --host to avoid unix socket, specify non-default port)
zcat /path/to/dump_YYYY_MM_DD_HH_MM.sql.gz | mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307
```

### 3. Option B: Restore via in-cluster pod

@@ -43,7 +43,7 @@ mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307 < /path/to/dump_YYYY_MM
ROOT_PWD=$(kubectl get secret cluster-secret -n dbaas -o jsonpath='{.data.ROOT_PASSWORD}' | base64 -d)

kubectl run mysql-restore --rm -it --image=mysql \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"dbaas-mysql-backup"}}],"containers":[{"name":"mysql-restore","image":"mysql","env":[{"name":"MYSQL_PWD","value":"'$ROOT_PWD'"}],"volumeMounts":[{"name":"backup","mountPath":"/backup"}],"command":["mysql","-u","root","--host","mysql.dbaas.svc.cluster.local","<","/backup/dump_YYYY_MM_DD_HH_MM.sql"]}]}}' \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"dbaas-mysql-backup"}}],"containers":[{"name":"mysql-restore","image":"mysql","env":[{"name":"MYSQL_PWD","value":"'$ROOT_PWD'"}],"volumeMounts":[{"name":"backup","mountPath":"/backup"}],"command":["/bin/sh","-c","zcat /backup/dump_YYYY_MM_DD_HH_MM.sql.gz | mysql -u root --host mysql.dbaas.svc.cluster.local"]}]}}' \
-n dbaas
```

@@ -56,7 +56,7 @@ mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307 -e "SHOW DATABASES;"
mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307 -e "SELECT * FROM performance_schema.replication_group_members;"

# Check table counts for key databases
for db in speedtest wrongmove codimd nextcloud shlink grafana; do
for db in speedtest wrongmove codimd nextcloud shlink grafana technitium; do
echo "=== $db ==="
mysql -u root -p"$ROOT_PWD" --host 127.0.0.1 --port 3307 -e "SELECT TABLE_NAME, TABLE_ROWS FROM information_schema.TABLES WHERE TABLE_SCHEMA='$db' ORDER BY TABLE_ROWS DESC LIMIT 5;"
done

@@ -79,6 +79,8 @@ kubectl exec -n dbaas mysql-cluster-0 -c mysql -- mysql -u root -p"$ROOT_PWD" \
#
# For technitium specifically, also run the password sync CronJob:
# kubectl create job --from=cronjob/technitium-password-sync technitium-pw-resync -n technitium
#
# Note: forgejo and uptimekuma may be legacy users not managed by Vault rotation.
```

### 6. InnoDB Cluster Recovery
@@ -7,7 +7,7 @@
- PostgreSQL superuser password (from `pg-cluster-superuser` secret in `dbaas` namespace)

## Backup Location
- NFS: `/mnt/main/postgresql-backup/dump_YYYY_MM_DD_HH_MM.sql`
- NFS: `/mnt/main/postgresql-backup/dump_YYYY_MM_DD_HH_MM.sql.gz`
- Replicated to Synology NAS (192.168.1.13) via TrueNAS ZFS replication
- Retention: 14 days

@@ -34,9 +34,9 @@ kubectl get secret pg-cluster-superuser -n dbaas -o jsonpath='{.data.password}'
# Port-forward to the CNPG primary
kubectl port-forward svc/pg-cluster-rw -n dbaas 5433:5432 &

# Restore (this will overwrite existing data)
# Restore (decompress and pipe to psql — this will overwrite existing data)
PGPASSWORD=$(kubectl get secret pg-cluster-superuser -n dbaas -o jsonpath='{.data.password}' | base64 -d) \
psql -h 127.0.0.1 -p 5433 -U postgres -f /path/to/dump_YYYY_MM_DD_HH_MM.sql
zcat /path/to/dump_YYYY_MM_DD_HH_MM.sql.gz | psql -h 127.0.0.1 -p 5433 -U postgres
```

### 3. Option B: Rebuild CNPG cluster from scratch

@@ -56,7 +56,7 @@ kubectl wait --for=condition=Ready cluster/pg-cluster -n dbaas --timeout=300s
# 5. Restore the dump
PGPASSWORD=$(kubectl get secret pg-cluster-superuser -n dbaas -o jsonpath='{.data.password}' | base64 -d) \
kubectl run pg-restore --rm -it --image=postgres:16.4-bullseye \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"dbaas-postgresql-backup"}}],"containers":[{"name":"pg-restore","image":"postgres:16.4-bullseye","env":[{"name":"PGPASSWORD","value":"'$PGPASSWORD'"}],"volumeMounts":[{"name":"backup","mountPath":"/backup"}],"command":["psql","-h","pg-cluster-rw.dbaas","-U","postgres","-f","/backup/dump_YYYY_MM_DD_HH_MM.sql"]}]}}' \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"dbaas-postgresql-backup"}}],"containers":[{"name":"pg-restore","image":"postgres:16.4-bullseye","env":[{"name":"PGPASSWORD","value":"'$PGPASSWORD'"}],"volumeMounts":[{"name":"backup","mountPath":"/backup"}],"command":["/bin/sh","-c","zcat /backup/dump_YYYY_MM_DD_HH_MM.sql.gz | psql -h pg-cluster-rw.dbaas -U postgres"]}]}}' \
-n dbaas
```

@@ -66,7 +66,7 @@ PGPASSWORD=$(kubectl get secret pg-cluster-superuser -n dbaas -o jsonpath='{.dat
PGPASSWORD=$PGPASSWORD psql -h 127.0.0.1 -p 5433 -U postgres -c "\l"

# Check table counts for critical databases
for db in trading health linkwarden affine woodpecker claude_memory; do
for db in health linkwarden affine woodpecker claude_memory; do
echo "=== $db ==="
PGPASSWORD=$PGPASSWORD psql -h 127.0.0.1 -p 5433 -U postgres -d $db -c \
"SELECT schemaname, tablename, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC LIMIT 5;"

@@ -76,10 +76,9 @@ done
### 5. Restart dependent services
After restore, restart services that connect to PostgreSQL to pick up fresh connections:
```bash
kubectl rollout restart deployment -n trading
kubectl rollout restart deployment -n health
kubectl rollout restart deployment -n linkwarden
# ... repeat for all 12 PG-dependent services
# ... repeat for all PG-dependent services (excluding trading — disabled)
```

## Restore from Synology (if TrueNAS is down)
@@ -10,7 +10,7 @@
- NFS: `/mnt/main/vault-backup/vault-raft-YYYYMMDD-HHMMSS.db`
- Replicated to Synology NAS (192.168.1.13) via TrueNAS ZFS replication
- Retention: 30 days
- Schedule: Daily at 02:00
- Schedule: Weekly on Sundays at 02:00 (`0 2 * * 0`)

## CRITICAL: Vault is a dependency for many services
Vault provides secrets to the entire cluster via ESO (External Secrets Operator). A Vault outage affects:

@@ -44,6 +44,11 @@ vault operator raft snapshot restore -force /path/to/vault-raft-YYYYMMDD-HHMMSS.
```

### 3. Unseal Vault (if sealed after restore)

> **Note:** Vault now has an auto-unseal sidecar that automatically unseals pods
> using the `vault-unseal-key` K8s Secret. The manual procedure below is a
> fallback if auto-unseal fails.

```bash
# Check seal status
vault status
@@ -41,7 +41,7 @@ kubectl scale deployment vaultwarden -n vaultwarden --replicas=0
BACKUP_DIR="YYYY_MM_DD_HH_MM" # Set to desired backup

kubectl run vw-restore --rm -it --image=alpine \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"vaultwarden-backup"}},{"name":"data","persistentVolumeClaim":{"claimName":"vaultwarden-data-iscsi"}}],"containers":[{"name":"vw-restore","image":"alpine","volumeMounts":[{"name":"backup","mountPath":"/backup"},{"name":"data","mountPath":"/data"}],"command":["/bin/sh","-c","cp /backup/'$BACKUP_DIR'/db.sqlite3 /data/db.sqlite3 && cp /backup/'$BACKUP_DIR'/rsa_key.pem /data/ && cp /backup/'$BACKUP_DIR'/rsa_key.pub.pem /data/ && cp -a /backup/'$BACKUP_DIR'/attachments /data/ 2>/dev/null; echo Restore complete"]}]}}' \
--overrides='{"spec":{"volumes":[{"name":"backup","persistentVolumeClaim":{"claimName":"vaultwarden-backup"}},{"name":"data","persistentVolumeClaim":{"claimName":"vaultwarden-data-proxmox"}}],"containers":[{"name":"vw-restore","image":"alpine","volumeMounts":[{"name":"backup","mountPath":"/backup"},{"name":"data","mountPath":"/data"}],"command":["/bin/sh","-c","cp /backup/'$BACKUP_DIR'/db.sqlite3 /data/db.sqlite3 && cp /backup/'$BACKUP_DIR'/rsa_key.pem /data/ && cp /backup/'$BACKUP_DIR'/rsa_key.pub.pem /data/ && cp -a /backup/'$BACKUP_DIR'/attachments /data/ 2>/dev/null; echo Restore complete"]}]}}' \
-n vaultwarden
```