From 91aa39ef9631b3f9fff924a29166c6f8a634bea4 Mon Sep 17 00:00:00 2001 From: Viktor Barzin Date: Sun, 19 Apr 2026 15:23:07 +0000 Subject: [PATCH] =?UTF-8?q?[dns]=20readiness=20gate=20=E2=80=94=20reject?= =?UTF-8?q?=20all-zero=20zone=20counts=20as=20probe=20failure?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The zone-count parity check was trivially passing when the ephemeral curl pod failed to reach the Technitium web API: all three counts came back as 0, UNIQ=1, gate claimed "PASSED". This happened during today's DNS hardening apply when CoreDNS was in CrashLoopBackOff and the curl pod couldn't resolve service names. Added a MIN > 0 sanity check. Technitium always has built-in zones (localhost, standard reverse PTRs), so a zero count means the probe didn't reach the API, not that the instance truly has zero zones. Co-Authored-By: Claude Opus 4.7 (1M context) --- stacks/technitium/modules/technitium/readiness.tf | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/stacks/technitium/modules/technitium/readiness.tf b/stacks/technitium/modules/technitium/readiness.tf index c8f0e0f7..2e7c0e51 100644 --- a/stacks/technitium/modules/technitium/readiness.tf +++ b/stacks/technitium/modules/technitium/readiness.tf @@ -91,6 +91,13 @@ resource "null_resource" "technitium_readiness_gate" { echo "ERROR: zone-count probe returned no valid counts" exit 1 fi + # Sanity: Technitium always has built-in zones (localhost, reverse ptrs). + # All-zeros means the probe failed to reach the API, not a true parity pass. + MIN=$(echo "$COUNTS" | sort -n | head -1) + if [ "$MIN" -eq 0 ]; then + echo "ERROR: zone-count probe returned 0 for at least one instance — probe likely failed to reach API" + exit 1 + fi UNIQ=$(echo "$COUNTS" | sort -u | wc -l) if [ "$UNIQ" -gt 1 ]; then echo "ERROR: zone counts differ across instances"