docs: add phpIPAM, Kea DDNS, and DNS sync documentation

- networking.md: Add phpIPAM IPAM section, Kea DDNS config, reverse DNS zones,
  Technitium dynamic update policy
- CLAUDE.md: Add phpipam to DB rotation list, service notes, networking section
- service-catalog.md: Add phpipam, mark netbox as disabled/replaced

[ci skip]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-04-10 16:01:32 +00:00
parent a86394f12b
commit 8cd8743140
3 changed files with 21 additions and 6 deletions

View file

@ -55,7 +55,7 @@ Violations cause state drift, which causes future applies to break or silently r
- **ESO (External Secrets Operator)**: `stacks/external-secrets/` — 43 ExternalSecrets + 9 DB-creds ExternalSecrets. API version `v1beta1`. Two ClusterSecretStores: `vault-kv` and `vault-database`.
- **Plan-time pattern**: Former plan-time stacks use `data "kubernetes_secret"` to read ESO-created K8s Secrets at plan time (no Vault dependency). First-apply gotcha: must `terragrunt apply -target=kubernetes_manifest.external_secret` first, then full apply. `count` on resources using secret values fails — remove conditional counts.
- **14 hybrid stacks** still keep `data "vault_kv_secret_v2"` for plan-time needs (job commands, Helm templatefile, module inputs). Platform has 48 plan-time refs — no migration possible without restructuring modules.
- **Database rotation**: Vault DB engine rotates passwords every 7 days (604800s). MySQL: speedtest, wrongmove, codimd, nextcloud, shlink, grafana, technitium. PostgreSQL: health, linkwarden, affine, woodpecker, claude_memory. Excluded: authentik (PgBouncer), crowdsec (Helm-baked), root users. Technitium uses a password-sync CronJob (every 6h) to push rotated password to the Technitium app config via API.
- **Database rotation**: Vault DB engine rotates passwords every 7 days (604800s). MySQL: speedtest, wrongmove, codimd, nextcloud, shlink, grafana, technitium, phpipam. PostgreSQL: health, linkwarden, affine, woodpecker, claude_memory. Excluded: authentik (PgBouncer), crowdsec (Helm-baked), root users. Technitium uses a password-sync CronJob (every 6h) to push rotated password to the Technitium app config via API.
- **K8s credentials**: Vault K8s secrets engine. Roles: `dashboard-admin`, `ci-deployer`, `openclaw`, `local-admin`. Use `vault write kubernetes/creds/ROLE kubernetes_namespace=NS`. Helper: `scripts/vault-kubeconfig`.
- **CI/CD (GHA + Woodpecker)**: Docker builds run on **GitHub Actions** (free on public repos). Woodpecker is **deploy-only** — receives image tag via API POST, runs `kubectl set image`. Woodpecker authenticates via K8s SA JWT → Vault K8s auth. Sync CronJob pushes `secret/ci/global` → Woodpecker API every 6h. Shell scripts in HCL heredocs: escape `$``$$`, `%{}``%%{}`.
- **Platform cannot depend on vault** (circular). Apply order: vault first, then platform. Platform has 48 vault refs, all in module inputs — no ESO migration possible.
@ -112,6 +112,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
- **Rate limiting**: Return 429 (not 503). Per-service tuning: Immich/Nextcloud need higher limits.
- **Retry middleware**: 2 attempts, 100ms — in default ingress chain.
- **HTTP/3 (QUIC)**: Enabled cluster-wide via Traefik.
- **IPAM & DNS auto-registration**: phpIPAM discovers hosts (fping every 15min). Kea DDNS on pfSense auto-registers VLAN 10/20 hosts in Technitium (RFC 2136). CronJob `phpipam-dns-sync` syncs remaining named hosts (192.168.1.x, VPN) → Technitium A+PTR records every 15min. Technitium zones accept dynamic updates from pfSense IPs.
## Service-Specific Notes
| Service | Key Operational Knowledge |
@ -123,6 +124,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
| Authentik | 3 replicas, PgBouncer in front of PostgreSQL, strip auth headers before forwarding |
| Kyverno | failurePolicy=Ignore to prevent blocking cluster, pin chart version |
| MySQL InnoDB | Enable auto-recovery, anti-affinity excludes k8s-node1 (GPU), 2Gi req / 3Gi limit |
| phpIPAM | IPAM with auto-discovery (fping every 15min). DNS sync CronJob pushes named hosts → Technitium. Kea DDNS handles VLAN 10/20; CronJob handles 192.168.1.x. API app `claude` (ssl_token). Cron container needs NET_RAW + 512Mi. |
## Monitoring & Alerting
- Alert cascade inhibitions: if node is down, suppress pod alerts on that node.

View file

@ -95,7 +95,8 @@
| stirling-pdf | PDF tools | stirling-pdf |
| speedtest | Speed testing | speedtest |
| freedify | Music streaming (factory pattern) | freedify |
| netbox | Network documentation | netbox |
| phpipam | IP Address Management (IPAM) + auto-discovery | phpipam |
| ~~netbox~~ | ~~Network documentation~~ (disabled, replaced by phpipam) | netbox |
| infra-maintenance | Maintenance jobs | infra-maintenance |
| ollama | LLM server (GPU) | ollama |
| frigate | NVR/camera (GPU) | frigate |
@ -117,7 +118,7 @@
blog, hackmd, privatebin, url, echo, f1tv, excalidraw, send,
audiobookshelf, jsoncrack, ntfy, cyberchef, homepage, linkwarden,
changedetection, tandoor, n8n, stirling-pdf, dashy, city-guesser,
travel, netbox
travel, netbox, phpipam
```
### Non-Proxied (Direct DNS)

View file

@ -1,6 +1,6 @@
# Networking Architecture
Last updated: 2026-04-08
Last updated: 2026-04-10
## Overview
@ -248,15 +248,27 @@ Containerd on all K8s nodes uses `hosts.toml` to redirect pulls to the local cac
### Key Configuration Files
**pfSense**:
- Terraform: `stacks/pfsense/main.tf`
- DHCP scope: 10.0.20.50-250 (VLAN 20)
- Config: Not Terraform-managed (pfSense web UI / config.xml)
- DHCP: Kea DHCP4 on VLAN 10 (10.0.10.0/24) and VLAN 20 (10.0.20.0/24)
- DHCP DDNS: Kea DHCP-DDNS sends RFC 2136 updates to Technitium on lease grant
- Firewall rules: Allow K8s egress, block inter-VLAN by default
**Technitium**:
- Config: Stored in PVC `technitium-data`
- Zone file: `viktorbarzin.lan` (A records for all internal hosts)
- Reverse zones: `10.0.10.in-addr.arpa`, `20.0.10.in-addr.arpa`, `1.168.192.in-addr.arpa`, `2.3.10.in-addr.arpa`, `0.168.192.in-addr.arpa`
- Dynamic updates: Enabled (UseSpecifiedNetworkACL) from pfSense IPs (10.0.20.1, 10.0.10.1)
- Forwarders: Cloudflare 1.1.1.1, Google 8.8.8.8
**phpIPAM (IP Address Management)**:
- Stack: `stacks/phpipam/`
- Web UI: `phpipam.viktorbarzin.me` (Authentik-protected)
- Database: MySQL InnoDB cluster (`mysql.dbaas.svc.cluster.local`)
- Auto-discovery: fping scan every 15min via `phpipam-cron` container
- Subnets tracked: 10.0.10.0/24, 10.0.20.0/24, 192.168.1.0/24, 10.3.2.0/24, 192.168.8.0/24, 192.168.0.0/24
- DNS sync: CronJob `phpipam-dns-sync` pushes named hosts → Technitium A+PTR records every 15min
- API: REST API enabled (app `claude`, ssl_token auth), MCP server available for agent access
**Traefik Middleware**:
- Helm values: `stacks/platform/traefik-values.yaml`
- Middleware CRDs: Generated by `ingress_factory` module