2026-04-10 14:19:25 +00:00
|
|
|
variable "tls_secret_name" {
|
|
|
|
|
type = string
|
|
|
|
|
sensitive = true
|
|
|
|
|
}
|
|
|
|
|
variable "mysql_host" { type = string }
|
|
|
|
|
|
2026-04-10 14:47:24 +00:00
|
|
|
data "vault_kv_secret_v2" "secrets" {
|
|
|
|
|
mount = "secret"
|
|
|
|
|
name = "platform"
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
locals {
|
|
|
|
|
technitium_password = data.vault_kv_secret_v2.secrets.data["technitium_password"]
|
|
|
|
|
}
|
|
|
|
|
|
2026-04-10 14:19:25 +00:00
|
|
|
resource "kubernetes_namespace" "phpipam" {
|
|
|
|
|
metadata {
|
|
|
|
|
name = "phpipam"
|
|
|
|
|
labels = {
|
|
|
|
|
tier = local.tiers.aux
|
|
|
|
|
}
|
|
|
|
|
}
|
[infra] Suppress Goldilocks vpa-update-mode label drift on all namespaces [ci skip]
## Context
Wave 3B-continued: the Goldilocks VPA dashboard (stacks/vpa) runs a Kyverno
ClusterPolicy `goldilocks-vpa-auto-mode` that mutates every namespace with
`metadata.labels["goldilocks.fairwinds.com/vpa-update-mode"] = "off"`. This
is intentional — Terraform owns container resource limits, and Goldilocks
should only provide recommendations, never auto-update. The label is how
Goldilocks decides per-namespace whether to run its VPA in `off` mode.
Effect on Terraform: every `kubernetes_namespace` resource shows the label
as pending-removal (`-> null`) on every `scripts/tg plan`. Dawarich survey
2026-04-18 confirmed the drift. Cluster-side count: 88 namespaces carry the
label (`kubectl get ns -o json | jq ... | wc -l`). Every TF-managed namespace
is affected.
This commit brings the intentional admission drift under the same
`# KYVERNO_LIFECYCLE_V1` discoverability marker introduced in c9d221d5 for
the ndots dns_config pattern. The marker now stands generically for any
Kyverno admission-webhook drift suppression; the inline comment records
which specific policy stamps which specific field so future grep audits
show why each suppression exists.
## This change
107 `.tf` files touched — every stack's `resource "kubernetes_namespace"`
block gets:
```hcl
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
```
Injection was done with a brace-depth-tracking Python pass (`/tmp/add_goldilocks_ignore.py`):
match `^resource "kubernetes_namespace" ` → track `{` / `}` until the
outermost closing brace → insert the lifecycle block before the closing
brace. The script is idempotent (skips any file that already mentions
`goldilocks.fairwinds.com/vpa-update-mode`) so re-running is safe.
The vault stack picked up 2 namespace resources in the same file (k8s-users
produces one, plus a second explicit namespace) — confirmed via the file
diff (+8 lines).
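The injection script itself lived in `/tmp` and is not committed; as a rough sketch of the brace-depth pass (the function name and exact matching logic here are assumptions, not the real script):

```python
import re

# Block to inject; mirrors the lifecycle snippet shown above.
LIFECYCLE = (
    "  lifecycle {\n"
    "    # KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace\n"
    '    ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]\n'
    "  }\n"
)

def inject(hcl: str) -> str:
    """Insert LIFECYCLE before the outermost closing brace of every
    kubernetes_namespace resource (assumes a multi-line resource body)."""
    if "goldilocks.fairwinds.com/vpa-update-mode" in hcl:
        return hcl  # idempotency guard: file already handled
    out, depth, inside = [], 0, False
    for line in hcl.splitlines(keepends=True):
        if inside:
            depth += line.count("{") - line.count("}")
            if depth == 0:  # outermost brace of the resource closes here
                out.append(LIFECYCLE)
                inside = False
        elif re.match(r'^resource "kubernetes_namespace" ', line):
            inside = True
            depth = line.count("{") - line.count("}")
        out.append(line)
    return "".join(out)
```

Running it twice is a no-op because the first pass leaves the label string in the file, which trips the guard.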
## What is NOT in this change
- `stacks/trading-bot/main.tf` — entire file is `/* … */` commented out
(paused 2026-04-06 per user decision). Reverted after the script ran.
- `stacks/_template/main.tf.example` — per-stack skeleton, intentionally
minimal. User keeps it that way. Not touched by the script (file
has no real `resource "kubernetes_namespace"` — only a placeholder
comment).
- `.terraform/` copies (e.g. `stacks/metallb/.terraform/modules/...`) —
gitignored, won't commit; the live path was edited.
- `terraform fmt` cleanup of adjacent pre-existing alignment issues in
authentik, freedify, hermes-agent, nvidia, vault, meshcentral. Reverted
to keep the commit scoped to the Goldilocks sweep. Those files will
need a separate fmt-only commit or will be cleaned up on next real
apply to that stack.
## Verification
Dawarich (one of the hundred-plus touched stacks) showed the pattern
before and after:
```
$ cd stacks/dawarich && ../../scripts/tg plan
Before:
Plan: 0 to add, 2 to change, 0 to destroy.
# kubernetes_namespace.dawarich will be updated in-place
(goldilocks.fairwinds.com/vpa-update-mode -> null)
# module.tls_secret.kubernetes_secret.tls_secret will be updated in-place
(Kyverno generate.* labels — fixed in 8d94688d)
After:
No changes. Your infrastructure matches the configuration.
```
Injection count check:
```
$ rg -c 'KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode' stacks/ | awk -F: '{s+=$2} END {print s}'
108
```
## Reproduce locally
1. `git pull`
2. Pick any stack: `cd stacks/<name> && ../../scripts/tg plan`
3. Expect: no drift on the namespace's goldilocks.fairwinds.com/vpa-update-mode label.
Closes: code-dwx
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:15:27 +00:00
|
|
|
lifecycle {
|
|
|
|
|
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label on every namespace
|
|
|
|
|
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
|
|
|
|
|
}
|
2026-04-10 14:19:25 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
resource "kubernetes_manifest" "external_secret" {
|
|
|
|
|
manifest = {
|
|
|
|
|
apiVersion = "external-secrets.io/v1beta1"
|
|
|
|
|
kind = "ExternalSecret"
|
|
|
|
|
metadata = {
|
|
|
|
|
name = "phpipam-secrets"
|
|
|
|
|
namespace = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
spec = {
|
|
|
|
|
refreshInterval = "15m"
|
|
|
|
|
secretStoreRef = {
|
|
|
|
|
name = "vault-database"
|
|
|
|
|
kind = "ClusterSecretStore"
|
|
|
|
|
}
|
|
|
|
|
target = {
|
|
|
|
|
name = "phpipam-secrets"
|
|
|
|
|
}
|
|
|
|
|
data = [{
|
|
|
|
|
secretKey = "db_password"
|
|
|
|
|
remoteRef = {
|
|
|
|
|
key = "static-creds/mysql-phpipam"
|
|
|
|
|
property = "password"
|
|
|
|
|
}
|
|
|
|
|
}]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
depends_on = [kubernetes_namespace.phpipam]
|
|
|
|
|
}
|
|
|
|
|
|
2026-04-10 20:37:28 +00:00
|
|
|
resource "kubernetes_manifest" "external_secret_pfsense_ssh" {
|
|
|
|
|
manifest = {
|
|
|
|
|
apiVersion = "external-secrets.io/v1beta1"
|
|
|
|
|
kind = "ExternalSecret"
|
|
|
|
|
metadata = {
|
|
|
|
|
name = "phpipam-pfsense-ssh"
|
|
|
|
|
namespace = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
spec = {
|
|
|
|
|
refreshInterval = "1h"
|
|
|
|
|
secretStoreRef = {
|
|
|
|
|
name = "vault-kv"
|
|
|
|
|
kind = "ClusterSecretStore"
|
|
|
|
|
}
|
|
|
|
|
target = {
|
|
|
|
|
name = "phpipam-pfsense-ssh"
|
|
|
|
|
}
|
|
|
|
|
data = [{
|
|
|
|
|
secretKey = "ssh_key"
|
|
|
|
|
remoteRef = {
|
|
|
|
|
key = "viktor"
|
|
|
|
|
property = "phpipam_pfsense_ssh_key"
|
|
|
|
|
}
|
|
|
|
|
}]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
depends_on = [kubernetes_namespace.phpipam]
|
|
|
|
|
}
|
|
|
|
|
|
2026-04-10 14:47:24 +00:00
|
|
|
resource "kubernetes_manifest" "external_secret_admin" {
|
|
|
|
|
manifest = {
|
|
|
|
|
apiVersion = "external-secrets.io/v1beta1"
|
|
|
|
|
kind = "ExternalSecret"
|
|
|
|
|
metadata = {
|
|
|
|
|
name = "phpipam-admin-password"
|
|
|
|
|
namespace = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
spec = {
|
|
|
|
|
refreshInterval = "1h"
|
|
|
|
|
secretStoreRef = {
|
|
|
|
|
name = "vault-kv"
|
|
|
|
|
kind = "ClusterSecretStore"
|
|
|
|
|
}
|
|
|
|
|
target = {
|
|
|
|
|
name = "phpipam-admin-password"
|
|
|
|
|
}
|
|
|
|
|
data = [{
|
|
|
|
|
secretKey = "password"
|
|
|
|
|
remoteRef = {
|
|
|
|
|
key = "viktor"
|
|
|
|
|
property = "phpipam_admin_password"
|
|
|
|
|
}
|
|
|
|
|
}]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
depends_on = [kubernetes_namespace.phpipam]
|
|
|
|
|
}
|
|
|
|
|
|
2026-04-10 14:19:25 +00:00
|
|
|
module "tls_secret" {
|
|
|
|
|
source = "../../modules/kubernetes/setup_tls_secret"
|
|
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
resource "kubernetes_deployment" "phpipam_web" {
|
|
|
|
|
metadata {
|
|
|
|
|
name = "phpipam-web"
|
|
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
labels = {
|
|
|
|
|
app = "phpipam"
|
|
|
|
|
tier = local.tiers.aux
|
|
|
|
|
}
|
|
|
|
|
annotations = {
|
|
|
|
|
"reloader.stakater.com/auto" = "true"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
spec {
|
|
|
|
|
replicas = 1
|
|
|
|
|
strategy {
|
|
|
|
|
type = "Recreate"
|
|
|
|
|
}
|
|
|
|
|
selector {
|
|
|
|
|
match_labels = {
|
|
|
|
|
app = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
template {
|
|
|
|
|
metadata {
|
|
|
|
|
labels = {
|
|
|
|
|
app = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
annotations = {
|
|
|
|
|
"diun.enable" = "true"
|
|
|
|
|
"dependency.kyverno.io/wait-for" = "mysql.dbaas:3306"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
spec {
|
|
|
|
|
container {
|
2026-04-16 16:53:21 +00:00
|
|
|
image = "phpipam/phpipam-www:v1.7.4"
|
2026-04-10 14:19:25 +00:00
|
|
|
name = "phpipam-web"
|
|
|
|
|
port {
|
|
|
|
|
container_port = 80
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "TZ"
|
|
|
|
|
value = "Europe/Sofia"
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "IPAM_DATABASE_HOST"
|
|
|
|
|
value = var.mysql_host
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "IPAM_DATABASE_USER"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "IPAM_DATABASE_PASS"
|
|
|
|
|
value_from {
|
|
|
|
|
secret_key_ref {
|
|
|
|
|
name = "phpipam-secrets"
|
|
|
|
|
key = "db_password"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "IPAM_DATABASE_NAME"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "IPAM_TRUST_X_FORWARDED"
|
|
|
|
|
value = "true"
|
|
|
|
|
}
|
|
|
|
|
resources {
|
|
|
|
|
requests = {
|
|
|
|
|
cpu = "10m"
|
|
|
|
|
memory = "64Mi"
|
|
|
|
|
}
|
|
|
|
|
limits = {
|
|
|
|
|
memory = "256Mi"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
lifecycle {
|
[infra] Establish KYVERNO_LIFECYCLE_V1 drift-suppression convention [ci skip]
## Context
Phase 1 of the state-drift consolidation audit (plan Wave 3) identified that
the entire repo leans on a repeated `lifecycle { ignore_changes = [...dns_config] }`
snippet to suppress Kyverno's admission-webhook dns_config mutation (the ndots=2
override that prevents NxDomain search-domain flooding). 27 occurrences across
19 stacks. Without this suppression, every pod-owning resource shows perpetual
TF plan drift.
The original plan proposed a shared `modules/kubernetes/kyverno_lifecycle/`
module emitting the ignore-paths list as an output that stacks would consume in
their `ignore_changes` blocks. That approach is architecturally impossible:
Terraform's `ignore_changes` meta-argument accepts only static attribute paths
— it rejects module outputs, locals, variables, and any expression (Terraform
processes `lifecycle` blocks statically, before building the rest of the
expression graph). So a DRY module cannot exist. The canonical pattern IS the
repeated snippet.
What the snippet was missing was a *discoverability tag* so that (a) new
resources can be validated for compliance, (b) the existing 27 sites can be
grep'd in a single command, and (c) future maintainers understand the
convention rather than each reinventing it.
## This change
- Introduces `# KYVERNO_LIFECYCLE_V1` as the canonical marker comment.
Attached inline on every `spec[0].template[0].spec[0].dns_config` line
(or `spec[0].job_template[0].spec[0]...` for CronJobs) across all 27
existing suppression sites.
- Documents the convention with rationale and copy-paste snippets in
`AGENTS.md` → new "Kyverno Drift Suppression" section.
- Expands the existing `.claude/CLAUDE.md` Kyverno ndots note to reference
the marker and explain why the module approach is blocked.
- Updates `_template/main.tf.example` so every new stack starts compliant.
## What is NOT in this change
- The `kubernetes_manifest` Kyverno annotation drift (beads `code-seq`)
— that is Phase B with a sibling `# KYVERNO_MANIFEST_V1` marker.
- Behavioral changes — every `ignore_changes` list is byte-identical
save for the inline comment.
- The fallback module the original plan anticipated — skipped because
Terraform rejects expressions in `ignore_changes`.
- `terraform fmt` cleanup on adjacent unrelated blocks in three files
(claude-agent-service, freedify/factory, hermes-agent). Reverted to
keep this commit scoped to the convention rollout.
## Before / after
Before (cannot distinguish accidental-forgotten from intentional-convention):
```hcl
lifecycle {
ignore_changes = [spec[0].template[0].spec[0].dns_config]
}
```
After (greppable, self-documenting, discoverable by tooling):
```hcl
lifecycle {
ignore_changes = [spec[0].template[0].spec[0].dns_config] # KYVERNO_LIFECYCLE_V1
}
```
## Test Plan
### Automated
```
$ rg -c 'KYVERNO_LIFECYCLE_V1' stacks/ -g '*.tf' -g '*.tf.example' \
| awk -F: '{s+=$2} END {print s}'
27
$ git diff --stat | grep -E '\.(tf|tf\.example|md)$' | wc -l
21
# All code-file diffs are 1 insertion + 1 deletion per marker site,
# except beads-server (3), ebooks (4), immich (3), uptime-kuma (2).
$ git diff --stat stacks/ | tail -1
20 files changed, 45 insertions(+), 28 deletions(-)
```
### Manual Verification
No apply required — HCL comments only. Zero effect on any stack's plan output.
Future audits: `rg 'KYVERNO_LIFECYCLE_V1' stacks/ | wc -l` must grow as new
pod-owning resources are added.
## Reproduce locally
1. `cd infra && git pull`
2. `rg 'KYVERNO_LIFECYCLE_V1' stacks/` → expect 27 hits in 19 files
3. Grep any new `kubernetes_deployment` for the marker; absence = missing
suppression.
Closes: code-28m
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:15:51 +00:00
|
|
|
ignore_changes = [spec[0].template[0].spec[0].dns_config] # KYVERNO_LIFECYCLE_V1
|
2026-04-10 14:19:25 +00:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2026-04-10 20:38:59 +00:00
|
|
|
# phpipam-cron container removed — discovery now handled by phpipam-pfsense-import CronJob
|
|
|
|
|
# which queries Kea DHCP leases + pfSense ARP table directly (no fping needed)
|
2026-04-10 14:19:25 +00:00
|
|
|
|
|
|
|
|
resource "kubernetes_service" "phpipam" {
|
|
|
|
|
metadata {
|
|
|
|
|
name = "phpipam"
|
|
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
labels = {
|
|
|
|
|
app = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
spec {
|
|
|
|
|
selector = {
|
|
|
|
|
app = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
port {
|
|
|
|
|
name = "http"
|
|
|
|
|
port = 80
|
|
|
|
|
target_port = 80
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
module "ingress" {
|
|
|
|
|
source = "../../modules/kubernetes/ingress_factory"
|
2026-04-16 13:45:04 +00:00
|
|
|
dns_type = "proxied"
|
2026-04-10 14:19:25 +00:00
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
name = "phpipam"
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
protected = true
|
|
|
|
|
extra_annotations = {
|
|
|
|
|
"gethomepage.dev/enabled" = "true"
|
|
|
|
|
"gethomepage.dev/name" = "phpIPAM"
|
|
|
|
|
"gethomepage.dev/description" = "IP Address Management"
|
|
|
|
|
"gethomepage.dev/icon" = "phpipam.png"
|
|
|
|
|
"gethomepage.dev/group" = "Infrastructure"
|
|
|
|
|
"gethomepage.dev/pod-selector" = ""
|
|
|
|
|
}
|
|
|
|
|
}
|
2026-04-10 14:47:24 +00:00
|
|
|
|
2026-04-10 20:29:25 +00:00
|
|
|
# CronJob: Bidirectional sync between phpIPAM and Technitium DNS
|
|
|
|
|
# 1. Push: named phpIPAM hosts → Technitium A + PTR records
|
|
|
|
|
# 2. Pull: Technitium reverse DNS → phpIPAM hostnames for unnamed entries
|
2026-04-10 14:47:24 +00:00
|
|
|
resource "kubernetes_cron_job_v1" "phpipam_dns_sync" {
|
|
|
|
|
metadata {
|
|
|
|
|
name = "phpipam-dns-sync"
|
|
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
}
|
|
|
|
|
spec {
|
|
|
|
|
schedule = "*/15 * * * *"
|
|
|
|
|
successful_jobs_history_limit = 1
|
|
|
|
|
failed_jobs_history_limit = 3
|
|
|
|
|
concurrency_policy = "Forbid"
|
|
|
|
|
job_template {
|
|
|
|
|
metadata {}
|
|
|
|
|
spec {
|
|
|
|
|
backoff_limit = 1
|
|
|
|
|
template {
|
|
|
|
|
metadata {}
|
|
|
|
|
spec {
|
|
|
|
|
restart_policy = "Never"
|
|
|
|
|
container {
|
|
|
|
|
name = "sync"
|
|
|
|
|
image = "mysql:8.0"
|
|
|
|
|
command = ["/bin/bash", "-c", <<-EOT
|
|
|
|
|
set -e
|
|
|
|
|
TECH_URL="http://technitium-web.technitium.svc.cluster.local:5380"
|
|
|
|
|
|
|
|
|
|
# Login to Technitium
|
|
|
|
|
TECH_TOKEN=$$(curl -sf "$$TECH_URL/api/user/login?user=admin&pass=$$TECH_PASS" | sed 's/.*"token":"\([^"]*\)".*/\1/')
|
|
|
|
|
if [ -z "$$TECH_TOKEN" ]; then echo "Technitium login failed"; exit 1; fi
|
|
|
|
|
echo "Technitium auth OK"
|
|
|
|
|
|
|
|
|
|
# Query phpIPAM MySQL directly for hosts with hostnames
|
|
|
|
|
HOSTS=$$(mysql -h $$DB_HOST -u $$DB_USER -p$$DB_PASS $$DB_NAME -N -B -e \
|
|
|
|
|
"SELECT INET_NTOA(ip_addr), hostname FROM ipaddresses WHERE hostname != '' AND hostname IS NOT NULL AND subnetId >= 7")
|
|
|
|
|
|
|
|
|
|
SYNCED=0
|
|
|
|
|
echo "$$HOSTS" | while IFS=$$'\t' read -r IP HOSTNAME; do
|
|
|
|
|
[ -z "$$IP" ] || [ -z "$$HOSTNAME" ] && continue
|
|
|
|
|
SHORT=$$(echo "$$HOSTNAME" | cut -d. -f1)
|
|
|
|
|
FQDN="$$SHORT.viktorbarzin.lan"
|
|
|
|
|
|
|
|
|
|
# A record
|
|
|
|
|
curl -sf -o /dev/null -X POST "$$TECH_URL/api/zones/records/add?token=$$TECH_TOKEN" \
|
|
|
|
|
-d "domain=$$FQDN&zone=viktorbarzin.lan&type=A&ipAddress=$$IP&overwrite=true&ttl=300"
|
|
|
|
|
|
|
|
|
|
# PTR record
|
|
|
|
|
O1=$$(echo $$IP | cut -d. -f1); O2=$$(echo $$IP | cut -d. -f2)
|
|
|
|
|
O3=$$(echo $$IP | cut -d. -f3); O4=$$(echo $$IP | cut -d. -f4)
|
|
|
|
|
curl -sf -o /dev/null -X POST "$$TECH_URL/api/zones/records/add?token=$$TECH_TOKEN" \
|
|
|
|
|
-d "domain=$$O4.$$O3.$$O2.$$O1.in-addr.arpa&zone=$$O3.$$O2.$$O1.in-addr.arpa&type=PTR&ptrName=$$FQDN&overwrite=true&ttl=300" 2>/dev/null || true
|
|
|
|
|
|
|
|
|
|
SYNCED=$$((SYNCED + 1))
|
|
|
|
|
echo " $$IP -> $$FQDN"
|
|
|
|
|
done
|
2026-04-10 20:29:25 +00:00
|
|
|
echo "Push sync complete"
|
|
|
|
|
|
|
|
|
|
# Reverse sync: pull hostnames from DNS into phpIPAM for unnamed entries
|
|
|
|
|
echo ""
|
|
|
|
|
echo "=== Reverse sync: DNS -> phpIPAM ==="
|
|
|
|
|
UNNAMED=$$(mysql -h $$DB_HOST -u $$DB_USER -p$$DB_PASS $$DB_NAME -N -B -e \
|
|
|
|
|
"SELECT id, INET_NTOA(ip_addr) FROM ipaddresses WHERE (hostname IS NULL OR hostname = '') AND subnetId >= 7")
|
|
|
|
|
|
|
|
|
|
echo "$$UNNAMED" | while IFS=$$'\t' read -r ID IP; do
|
|
|
|
|
[ -z "$$ID" ] || [ -z "$$IP" ] && continue
|
|
|
|
|
# Query Technitium for PTR record
|
|
|
|
|
O1=$$(echo $$IP | cut -d. -f1); O2=$$(echo $$IP | cut -d. -f2)
|
|
|
|
|
O3=$$(echo $$IP | cut -d. -f3); O4=$$(echo $$IP | cut -d. -f4)
|
|
|
|
|
PTR_NAME="$$O4.$$O3.$$O2.$$O1.in-addr.arpa"
|
|
|
|
|
REV_ZONE="$$O3.$$O2.$$O1.in-addr.arpa"
|
|
|
|
|
RESULT=$$(curl -sf "$$TECH_URL/api/zones/records/get?token=$$TECH_TOKEN&domain=$$PTR_NAME&zone=$$REV_ZONE&type=PTR" 2>/dev/null)
|
|
|
|
|
HOSTNAME=$$(echo "$$RESULT" | sed -n 's/.*"ptrName":"\([^"]*\)".*/\1/p' | head -1)
|
|
|
|
|
[ -z "$$HOSTNAME" ] && continue
|
|
|
|
|
|
|
|
|
|
# Extract short name
|
|
|
|
|
SHORT=$$(echo "$$HOSTNAME" | cut -d. -f1)
|
|
|
|
|
[ -z "$$SHORT" ] && continue
|
|
|
|
|
|
|
|
|
|
# Update phpIPAM
|
|
|
|
|
mysql -h $$DB_HOST -u $$DB_USER -p$$DB_PASS $$DB_NAME -e \
|
|
|
|
|
"UPDATE ipaddresses SET hostname='$$SHORT' WHERE id=$$ID AND (hostname IS NULL OR hostname = '')"
|
|
|
|
|
echo " $$IP -> $$SHORT (from DNS)"
|
|
|
|
|
done
|
|
|
|
|
echo "Bidirectional sync complete"
|
2026-04-10 14:47:24 +00:00
|
|
|
EOT
|
|
|
|
|
]
|
|
|
|
|
env {
|
|
|
|
|
name = "TECH_PASS"
|
|
|
|
|
value = local.technitium_password
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_HOST"
|
|
|
|
|
value = var.mysql_host
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_USER"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_PASS"
|
|
|
|
|
value_from {
|
|
|
|
|
secret_key_ref {
|
|
|
|
|
name = "phpipam-secrets"
|
|
|
|
|
key = "db_password"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_NAME"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
resources {
|
|
|
|
|
requests = {
|
|
|
|
|
cpu = "10m"
|
|
|
|
|
memory = "32Mi"
|
|
|
|
|
}
|
|
|
|
|
limits = {
|
|
|
|
|
memory = "128Mi"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
[infra] Sweep dns_config ignore_changes across all pod-owning resources [ci skip]
## Context
Wave 3A (commit c9d221d5) added the `# KYVERNO_LIFECYCLE_V1` marker to the
27 pre-existing `ignore_changes = [...dns_config]` sites so they could be
grepped and audited. It did NOT address pod-owning resources that were
simply missing the suppression entirely. Post-Wave-3A sampling (2026-04-18)
found that navidrome, f1-stream, frigate, servarr, monitoring, crowdsec,
and many other stacks showed perpetual `dns_config` drift every plan
because their `kubernetes_deployment` / `kubernetes_stateful_set` /
`kubernetes_cron_job_v1` resources had no `lifecycle {}` block at all.
Root cause (same as Wave 3A): Kyverno's admission webhook stamps
`dns_config { option { name = "ndots"; value = "2" } }` on every pod's
`spec.template.spec.dns_config` to prevent NxDomain search-domain flooding
(see `k8s-ndots-search-domain-nxdomain-flood` skill). Without `ignore_changes`
on every Terraform-managed pod-owner, Terraform repeatedly tries to strip
the injected field.
## This change
Extends the Wave 3A convention by sweeping EVERY `kubernetes_deployment`,
`kubernetes_stateful_set`, `kubernetes_daemon_set`, `kubernetes_cron_job_v1`,
`kubernetes_job_v1` (+ their `_v1` variants) in the repo and ensuring each
carries the right `ignore_changes` path:
- **kubernetes_deployment / stateful_set / daemon_set / job_v1**:
`spec[0].template[0].spec[0].dns_config`
- **kubernetes_cron_job_v1**:
`spec[0].job_template[0].spec[0].template[0].spec[0].dns_config`
(extra `job_template[0]` nesting — the CronJob's PodTemplateSpec is
one level deeper)
Each injection / extension is tagged `# KYVERNO_LIFECYCLE_V1: Kyverno
admission webhook mutates dns_config with ndots=2` inline so the
suppression is discoverable via `rg 'KYVERNO_LIFECYCLE_V1' stacks/`.
Two insertion paths are handled by a Python pass (`/tmp/add_dns_config_ignore.py`):
1. **No existing `lifecycle {}`**: inject a brand-new block just before the
resource's closing `}`. 108 new blocks on 93 files.
2. **Existing `lifecycle {}` (usually for `DRIFT_WORKAROUND: CI owns image tag`
from Wave 4, commit a62b43d1)**: extend its `ignore_changes` list with the
dns_config path. Handles both inline (`= [x]`) and multiline
(`= [\n x,\n]`) forms; ensures the last pre-existing list item carries
a trailing comma so the extended list is valid HCL. 34 extensions.
The script skips anything already mentioning `dns_config` inside an
`ignore_changes`, so re-running is a no-op.
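The list-extension path can be sketched as follows — again a reconstruction with assumed names, since the real `/tmp/add_dns_config_ignore.py` is not in the repo:

```python
import re

DNS_PATH = "spec[0].template[0].spec[0].dns_config"
MARKER = "# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2"

def extend_ignore_changes(block: str) -> str:
    """Append DNS_PATH to an existing ignore_changes list inside `block`.
    Handles the inline form (list on one line) and the multiline form."""
    if "dns_config" in block:
        return block  # already suppressed: re-running is a no-op
    # Inline form: list opens and closes on one line; greedy .* grabs the
    # last ] on that line, so index brackets like [0] inside paths are safe.
    inline = re.search(r"ignore_changes\s*=\s*\[(.*)\]", block)
    if inline:
        items = inline.group(1).strip()
        new = f"ignore_changes = [{items}, {DNS_PATH}] {MARKER}"
        return block[: inline.start()] + new + block[inline.end():]
    # Multiline form: insert before the closing bracket, making sure the
    # last pre-existing item carries a trailing comma so the HCL stays valid.
    m = re.search(r"(ignore_changes\s*=\s*\[)(.*?)(\n\s*\])", block, re.S)
    if not m:
        return block
    body = m.group(2)
    if body.strip() and not body.rstrip().endswith(","):
        body = body.rstrip() + ","
    body += f"\n    {DNS_PATH}, {MARKER}"
    return block[: m.start()] + m.group(1) + body + m.group(3) + block[m.end():]
```

The inline and multiline branches correspond to the two forms named above; the `dns_config` substring check is what makes re-runs safe.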
## Scale
- 142 total lifecycle injections/extensions
- 93 `.tf` files touched
- 108 brand-new `lifecycle {}` blocks + 34 extensions of existing ones
- Every Tier 0 and Tier 1 stack with a pod-owning resource is covered
- Together with Wave 3A's 27 pre-existing markers → **169 greppable
`KYVERNO_LIFECYCLE_V1` dns_config sites across the repo**
## What is NOT in this change
- `stacks/trading-bot/main.tf` — entirely commented-out block (`/* … */`).
Python script touched the file, reverted manually.
- `_template/main.tf.example` skeleton — kept minimal on purpose; any
future stack created from it should either inherit the Wave 3A one-line
form or add its own on first `kubernetes_deployment`.
- `terraform fmt` fixes to pre-existing alignment issues in meshcentral,
nvidia/modules/nvidia, vault — unrelated to this commit. Left for a
separate fmt-only pass.
- Non-pod resources (`kubernetes_service`, `kubernetes_secret`,
`kubernetes_manifest`, etc.) — they don't own pods so they don't get
Kyverno dns_config mutation.
## Verification
Random sample post-commit:
```
$ cd stacks/navidrome && ../../scripts/tg plan → No changes.
$ cd stacks/f1-stream && ../../scripts/tg plan → No changes.
$ cd stacks/frigate && ../../scripts/tg plan → No changes.
$ rg -c 'KYVERNO_LIFECYCLE_V1' stacks/ -g '*.tf' -g '*.tf.example' \
| awk -F: '{s+=$2} END {print s}'
169
```
## Reproduce locally
1. `git pull`
2. `rg 'KYVERNO_LIFECYCLE_V1' stacks/ | wc -l` → 169+
3. `cd stacks/navidrome && ../../scripts/tg plan` → expect 0 drift on
the deployment's dns_config field.
Refs: code-seq (Wave 3B dns_config class closed; kubernetes_manifest
annotation class handled separately in 8d94688d for tls_secret)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:19:48 +00:00
|
|
|
lifecycle {
|
|
|
|
|
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
|
|
|
|
|
ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
|
|
|
|
|
}
|
2026-04-10 14:47:24 +00:00
|
|
|
}
|
2026-04-10 20:37:28 +00:00
|
|
|
|
|
|
|
|
# CronJob: Import devices from pfSense (Kea DHCP leases + ARP table) into phpIPAM
|
|
|
|
|
# Replaces active fping scanning with passive data from pfSense
|
|
|
|
|
resource "kubernetes_cron_job_v1" "phpipam_pfsense_import" {
|
|
|
|
|
metadata {
|
|
|
|
|
name = "phpipam-pfsense-import"
|
|
|
|
|
namespace = kubernetes_namespace.phpipam.metadata[0].name
|
|
|
|
|
}
|
|
|
|
|
spec {
|
|
|
|
|
schedule = "*/5 * * * *"
|
|
|
|
|
successful_jobs_history_limit = 1
|
|
|
|
|
failed_jobs_history_limit = 3
|
|
|
|
|
concurrency_policy = "Forbid"
|
|
|
|
|
job_template {
|
|
|
|
|
metadata {}
|
|
|
|
|
spec {
|
|
|
|
|
backoff_limit = 1
|
|
|
|
|
template {
|
|
|
|
|
metadata {}
|
|
|
|
|
spec {
|
|
|
|
|
restart_policy = "Never"
|
|
|
|
|
container {
|
|
|
|
|
name = "import"
|
|
|
|
|
image = "alpine:3.21"
|
|
|
|
|
command = ["/bin/sh", "-c", <<-EOT
|
|
|
|
|
set -e
|
|
|
|
|
apk add --no-cache -q openssh-client mysql-client python3 > /dev/null 2>&1
|
|
|
|
|
|
|
|
|
|
# Setup SSH key
|
|
|
|
|
mkdir -p /root/.ssh
|
2026-04-10 21:24:40 +00:00
|
|
|
cp /ssh/ssh_key /root/.ssh/id_rsa
|
|
|
|
|
chmod 600 /root/.ssh/id_rsa
|
2026-04-10 20:37:28 +00:00
|
|
|
echo "StrictHostKeyChecking no" > /root/.ssh/config
|
|
|
|
|
|
|
|
|
|
# 1. Get Kea DHCP leases via control socket
|
|
|
|
|
echo "=== Fetching Kea leases ==="
|
|
|
|
|
LEASES=$$(ssh admin@10.0.20.1 'echo "{\"command\": \"lease4-get-all\"}" | /usr/bin/nc -U /tmp/kea4-ctrl-socket 2>/dev/null')
|
|
|
|
|
|
|
|
|
|
# 2. Get ARP table
|
|
|
|
|
echo "=== Fetching ARP table ==="
|
|
|
|
|
ARP=$$(ssh admin@10.0.20.1 'arp -an' 2>/dev/null)
|
|
|
|
|
|
2026-04-10 21:24:40 +00:00
|
|
|
# Remote sites handled by phpipam-remote-import CronJob (hourly)
|
2026-04-10 20:58:49 +00:00
|
|
|
|
2026-04-10 20:37:28 +00:00
|
|
|
# 3. Parse and import into phpIPAM MySQL
|
|
|
|
|
echo "=== Importing into phpIPAM ==="
|
|
|
|
|
export LEASES_DATA="$$LEASES"
|
|
|
|
|
export ARP_DATA="$$ARP"
|
|
|
|
|
python3 << 'PYEOF'
|
|
|
|
|
import json, subprocess, sys, re, os
|
|
|
|
|
|
|
|
|
|
db_host = os.environ["DB_HOST"]
|
|
|
|
|
db_user = os.environ["DB_USER"]
|
|
|
|
|
db_pass = os.environ["DB_PASS"]
|
|
|
|
|
db_name = os.environ["DB_NAME"]
|
|
|
|
|
|
|
|
|
|
def mysql_exec(sql):
|
|
|
|
|
r = subprocess.run(
|
|
|
|
|
["mysql", "-h", db_host, "-u", db_user, f"-p{db_pass}", db_name, "-N", "-B", "-e", sql],
|
|
|
|
|
capture_output=True, text=True
|
|
|
|
|
)
|
|
|
|
|
return r.stdout.strip()
|
|
|
|
|
|
|
|
|
|
# Get existing phpIPAM entries (subnetId >= 7 = our subnets)
|
|
|
|
|
existing = {}
|
|
|
|
|
rows = mysql_exec("SELECT INET_NTOA(ip_addr), hostname, mac, subnetId FROM ipaddresses WHERE subnetId >= 7")
|
|
|
|
|
for line in rows.split("\n"):
|
|
|
|
|
if not line: continue
|
|
|
|
|
parts = line.split("\t")
|
|
|
|
|
existing[parts[0]] = {"hostname": parts[1] if parts[1] != "NULL" else "", "mac": parts[2] if parts[2] != "NULL" else "", "subnetId": parts[3]}
|
|
|
|
|
|
|
|
|
|
# Subnet mapping
|
|
|
|
|
def get_subnet_id(ip):
|
|
|
|
|
if ip.startswith("10.0.10."): return 7
|
|
|
|
|
if ip.startswith("10.0.20."): return 8
|
|
|
|
|
if ip.startswith("192.168.1."): return 9
|
|
|
|
|
if ip.startswith("10.3.2."): return 10
|
|
|
|
|
if ip.startswith("192.168.8."): return 11
|
|
|
|
|
if ip.startswith("192.168.0."): return 12
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
# Parse Kea leases
|
|
|
|
|
leases_raw = os.environ.get("LEASES_DATA", "{}")
|
|
|
|
|
try:
|
|
|
|
|
leases_json = json.loads(leases_raw)
|
|
|
|
|
leases = leases_json.get("arguments", {}).get("leases", []) if isinstance(leases_json, dict) else leases_json[0].get("arguments", {}).get("leases", [])
|
|
|
|
|
except:
|
|
|
|
|
leases = []
|
|
|
|
|
|
|
|
|
|
imported = 0
|
|
|
|
|
updated_mac = 0
|
|
|
|
|
updated_hostname = 0
|
|
|
|
|
|
|
|
|
|
for lease in leases:
|
|
|
|
|
ip = lease["ip-address"]
|
|
|
|
|
mac = lease.get("hw-address", "")
|
|
|
|
|
hostname = lease.get("hostname", "").split(".")[0] # strip .viktorbarzin.lan
|
|
|
|
|
subnet_id = get_subnet_id(ip)
|
|
|
|
|
if not subnet_id: continue
|
|
|
|
|
|
|
|
|
|
if ip not in existing:
|
|
|
|
|
# New host — insert
|
|
|
|
|
mac_sql = f"'{mac}'" if mac else "NULL"
|
|
|
|
|
host_sql = f"'{hostname}'" if hostname else "''"
|
|
|
|
|
mysql_exec(f"INSERT INTO ipaddresses (ip_addr, subnetId, hostname, mac, description, lastSeen) VALUES (INET_ATON('{ip}'), {subnet_id}, {host_sql}, {mac_sql}, '-- kea lease --', NOW())")
|
|
|
|
|
imported += 1
|
|
|
|
|
print(f" NEW {ip} -> {hostname} mac={mac}")
|
|
|
|
|
else:
|
|
|
|
|
# Existing — update MAC if missing, hostname if missing, lastSeen always
|
|
|
|
|
updates = ["lastSeen=NOW()"]
|
|
|
|
|
if mac and not existing[ip]["mac"]:
|
|
|
|
|
updates.append(f"mac='{mac}'")
|
|
|
|
|
updated_mac += 1
|
|
|
|
|
if hostname and not existing[ip]["hostname"]:
|
|
|
|
|
updates.append(f"hostname='{hostname}'")
|
|
|
|
|
updated_hostname += 1
|
|
|
|
|
mysql_exec(f"UPDATE ipaddresses SET {','.join(updates)} WHERE ip_addr=INET_ATON('{ip}')")
|
|
|
|
|
|
|
|
|
|
# Parse ARP table for devices not in Kea (static IPs)
|
|
|
|
|
arp_raw = os.environ.get("ARP_DATA", "")
|
|
|
|
|
lease_ips = {l["ip-address"] for l in leases}
|
|
|
|
|
|
|
|
|
|
for line in arp_raw.split("\n"):
|
|
|
|
|
m = re.match(r'\? \((\d+\.\d+\.\d+\.\d+)\) at ([0-9a-f:]+) on', line)
|
|
|
|
|
if not m: continue
|
|
|
|
|
ip, mac = m.group(1), m.group(2)
|
|
|
|
|
if mac == "(incomplete)": continue
|
|
|
|
|
subnet_id = get_subnet_id(ip)
|
|
|
|
|
if not subnet_id: continue
|
|
|
|
|
if ip in lease_ips: continue # already handled by Kea
|
|
|
|
|
|
|
|
|
|
if ip not in existing:
|
|
|
|
|
mysql_exec(f"INSERT INTO ipaddresses (ip_addr, subnetId, mac, description, lastSeen) VALUES (INET_ATON('{ip}'), {subnet_id}, '{mac}', '-- arp discovered --', NOW())")
|
|
|
|
|
imported += 1
|
|
|
|
|
print(f" NEW (arp) {ip} mac={mac}")
|
|
|
|
|
else:
|
|
|
|
|
updates = ["lastSeen=NOW()"]
|
|
|
|
|
if mac and not existing[ip]["mac"]:
|
|
|
|
|
updates.append(f"mac='{mac}'")
|
|
|
|
|
updated_mac += 1
|
|
|
|
|
mysql_exec(f"UPDATE ipaddresses SET {','.join(updates)} WHERE ip_addr=INET_ATON('{ip}')")
|
|
|
|
|
|
2026-04-10 21:24:40 +00:00
|
|
|
print(f"\nImported: {imported} new, Updated: {updated_mac} MACs, {updated_hostname} hostnames")
|
|
|
|
|
PYEOF
|
|
|
|
|
echo "Import complete"
|
|
|
|
|
EOT
|
|
|
|
|
]
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_HOST"
|
|
|
|
|
value = var.mysql_host
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_USER"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_PASS"
|
|
|
|
|
value_from {
|
|
|
|
|
secret_key_ref {
|
|
|
|
|
name = "phpipam-secrets"
|
|
|
|
|
key = "db_password"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
env {
|
|
|
|
|
name = "DB_NAME"
|
|
|
|
|
value = "phpipam"
|
|
|
|
|
}
|
|
|
|
|
volume_mount {
|
|
|
|
|
name = "ssh-key"
|
|
|
|
|
mount_path = "/ssh"
|
|
|
|
|
read_only = true
|
|
|
|
|
}
|
|
|
|
|
resources {
|
|
|
|
|
requests = {
|
|
|
|
|
cpu = "10m"
|
|
|
|
|
memory = "64Mi"
|
|
|
|
|
}
|
|
|
|
|
limits = {
|
|
|
|
|
memory = "128Mi"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
volume {
|
|
|
|
|
name = "ssh-key"
|
|
|
|
|
secret {
|
|
|
|
|
secret_name = "phpipam-pfsense-ssh"
|
|
|
|
|
default_mode = "0400"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
  lifecycle {
    # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
    ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
  }
}

# CronJob: Import devices from remote sites (London + Valchedrym) via SSH
# Runs hourly — these networks are mostly static
resource "kubernetes_cron_job_v1" "phpipam_remote_import" {
  metadata {
    name      = "phpipam-remote-import"
    namespace = kubernetes_namespace.phpipam.metadata[0].name
  }
  spec {
    schedule                      = "0 * * * *"
    successful_jobs_history_limit = 1
    failed_jobs_history_limit     = 3
    concurrency_policy            = "Forbid"
    job_template {
      metadata {}
      spec {
        backoff_limit = 1
        template {
          metadata {}
          spec {
            restart_policy = "Never"
            container {
              name    = "import"
              image   = "alpine:3.21"
              command = ["/bin/sh", "-c", <<-EOT
set -e
apk add --no-cache -q openssh-client mysql-client python3 > /dev/null 2>&1
mkdir -p /root/.ssh
cp /ssh/ssh_key /root/.ssh/id_rsa
chmod 600 /root/.ssh/id_rsa
echo "StrictHostKeyChecking no" > /root/.ssh/config

# Pull DHCP leases + ARP from Valchedrym via pfSense SSH hop
echo "=== Valchedrym (192.168.0.1 via pfSense) ==="
VALCHEDRYM=$$(ssh -o ConnectTimeout=10 admin@10.0.20.1 'timeout 15 ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@192.168.0.1 "cat /tmp/dhcp.leases 2>/dev/null; echo ---ARP---; cat /proc/net/arp 2>/dev/null" 2>/dev/null' 2>/dev/null || echo "")

echo "=== London (192.168.8.1 via pfSense) ==="
LONDON=$$(ssh -o ConnectTimeout=10 admin@10.0.20.1 'timeout 15 ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@192.168.8.1 "cat /tmp/dhcp.leases 2>/dev/null; echo ---ARP---; cat /proc/net/arp 2>/dev/null" 2>/dev/null' 2>/dev/null || echo "")

echo "=== Importing ==="
export LONDON_DATA="$$LONDON"
export VALCHEDRYM_DATA="$$VALCHEDRYM"
python3 << 'PYEOF'
import os, re, subprocess

db_host = os.environ["DB_HOST"]
db_user = os.environ["DB_USER"]
db_pass = os.environ["DB_PASS"]
db_name = os.environ["DB_NAME"]

# Execute a write statement against the phpIPAM DB (output captured and discarded).
def mysql_exec(sql):
    subprocess.run(["mysql", "-h", db_host, "-u", db_user, f"-p{db_pass}", db_name, "-N", "-B", "-e", sql], capture_output=True, text=True)

# Snapshot of the phpIPAM rows for both remote subnets, keyed by IP.
def get_existing():
    r = subprocess.run(["mysql", "-h", db_host, "-u", db_user, f"-p{db_pass}", db_name, "-N", "-B", "-e",
                        "SELECT INET_NTOA(ip_addr), hostname, mac, subnetId FROM ipaddresses WHERE subnetId IN (11, 12)"],
                       capture_output=True, text=True)
    existing = {}
    for line in r.stdout.strip().split("\n"):
        if not line: continue
        parts = line.split("\t")
        existing[parts[0]] = {"hostname": parts[1] if parts[1] != "NULL" else "", "mac": parts[2] if parts[2] != "NULL" else ""}
    return existing

def import_site(data, subnet_prefix, subnet_id, site_name):
    if not data or "---ARP---" not in data:
        print(f" {site_name}: no data")
        return 0
    existing = get_existing()
    dhcp_part, arp_part = data.split("---ARP---", 1)
    imported = 0

    # DHCP leases: timestamp mac ip hostname client_id
    for line in dhcp_part.strip().split("\n"):
        parts = line.split()
        if len(parts) < 4: continue
        mac, ip, hostname = parts[1], parts[2], parts[3]
        if not ip.startswith(subnet_prefix): continue
        short = hostname.split(".")[0] if hostname != "*" else ""
        if ip not in existing:
            mac_sql = f"'{mac}'" if mac else "NULL"
            host_sql = f"'{short}'" if short else "''"
            mysql_exec(f"INSERT INTO ipaddresses (ip_addr, subnetId, hostname, mac, description, lastSeen) VALUES (INET_ATON('{ip}'), {subnet_id}, {host_sql}, {mac_sql}, '-- {site_name} dhcp --', NOW())")
            imported += 1
            print(f" NEW {ip} -> {short} mac={mac}")
        else:
            updates = ["lastSeen=NOW()"]
            if mac and not existing[ip]["mac"]: updates.append(f"mac='{mac}'")
            if short and not existing[ip]["hostname"]: updates.append(f"hostname='{short}'")
            mysql_exec(f"UPDATE ipaddresses SET {','.join(updates)} WHERE ip_addr=INET_ATON('{ip}')")

    # ARP table: ip, hw type, flags, mac, mask, device
    for line in arp_part.strip().split("\n"):
        m = re.match(r'(\d+\.\d+\.\d+\.\d+)\s+\S+\s+\S+\s+([0-9a-f:]+)\s+', line)
        if not m: continue
        ip, mac = m.group(1), m.group(2)
        if not ip.startswith(subnet_prefix) or mac == "00:00:00:00:00:00": continue
        if ip in existing: continue
        mysql_exec(f"INSERT INTO ipaddresses (ip_addr, subnetId, mac, description, lastSeen) VALUES (INET_ATON('{ip}'), {subnet_id}, '{mac}', '-- {site_name} arp --', NOW())")
        imported += 1
        print(f" NEW (arp) {ip} mac={mac}")
    return imported

london = import_site(os.environ.get("LONDON_DATA", ""), "192.168.8.", 11, "london")
valchedrym = import_site(os.environ.get("VALCHEDRYM_DATA", ""), "192.168.0.", 12, "valchedrym")
print(f"\nLondon: {london} new, Valchedrym: {valchedrym} new")
PYEOF
echo "Remote import complete"
EOT
              ]
              env {
                name  = "DB_HOST"
                value = var.mysql_host
              }
              env {
                name  = "DB_USER"
                value = "phpipam"
              }
              env {
                name = "DB_PASS"
                value_from {
                  secret_key_ref {
                    name = "phpipam-secrets"
                    key  = "db_password"
                  }
                }
              }
              env {
                name  = "DB_NAME"
                value = "phpipam"
              }
              volume_mount {
                name       = "ssh-key"
                mount_path = "/ssh"
                read_only  = true
              }
              resources {
                requests = {
                  cpu    = "10m"
                  memory = "64Mi"
                }
                limits = {
                  memory = "128Mi"
                }
              }
            }
            volume {
              name = "ssh-key"
              secret {
                secret_name  = "phpipam-pfsense-ssh"
                default_mode = "0400"
              }
            }
          }
        }
      }
    }
  }
  lifecycle {
    # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
    ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
  }
}