infra/scripts/tg: enforce ingress_factory auth-comment convention

Every `tg plan/apply/destroy/refresh` now runs
`scripts/check-ingress-auth-comments.py` against the current stack
before invoking terragrunt. The check fails closed if any
`auth = "app"` or `auth = "none"` line in the stack's .tf files lacks
an immediately-preceding `# auth = "<tier>": ...` comment documenting
what gates the app (for "app") or why the endpoint is intentionally
public (for "none").
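
A minimal, self-contained sketch of the rule described above, run against
an in-memory snippet rather than a stack directory. The three patterns are
copied from the script in this commit; `find_violations` and `EXAMPLE_TF`
are hypothetical names used only for illustration.

```python
import re

# Same patterns as scripts/check-ingress-auth-comments.py in this commit.
AUTH_LINE = re.compile(r'^\s*auth\s*=\s*"(app|none)"\s*$')
COMMENT_LINE = re.compile(r'^\s*#')
COMMENT_TIER = re.compile(r'auth\s*=\s*"(app|none)"')


def find_violations(text):
    """Return (line_number, tier) for each undocumented auth line in `text`."""
    lines = text.splitlines()
    violations = []
    for i, line in enumerate(lines):
        m = AUTH_LINE.match(line)
        if not m:
            continue
        tier = m.group(1)
        documented = False
        j = i - 1
        # Walk up through the contiguous comment block above the auth line.
        while j >= 0 and COMMENT_LINE.match(lines[j]):
            cm = COMMENT_TIER.search(lines[j])
            if cm and cm.group(1) == tier:
                documented = True
                break
            j -= 1
        if not documented:
            violations.append((i + 1, tier))
    return violations


# Hypothetical stack snippet: the first module is documented, the second is not.
EXAMPLE_TF = '''\
module "ingress_a" {
  # auth = "none": webhook receiver, must be reachable without login
  auth = "none"
}

module "ingress_b" {
  auth = "app"
}
'''

if __name__ == "__main__":
    print(find_violations(EXAMPLE_TF))  # [(7, 'app')]
```

A comment that names the wrong tier does not count: `# auth = "app": ...`
above `auth = "none"` is still reported.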

Why tg-level (not git pre-commit): tg is the universal entry point
for all infra changes. CI runs it, headless agents run it, humans
run it. A pre-commit hook only catches the human path. Wiring the
check into tg means the anti-exposure guard fires regardless of who
or what is invoking terragrunt.

Stack-scoped: the check only scans the stack being operated on. The
30+ existing `auth = "none"` stacks that predate this guard keep
running as deployed; each one needs the comment added the next time
someone runs `tg plan` on it, at which point the gate forces a
conscious "yes, this is intentional" moment before any state change
can land.

Skipped on every other verb (init, fmt, validate, output, etc.); only
plan, apply, destroy, and refresh trigger the check.
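
The verb gating can be sketched as follows. This is an illustration in
Python of what the shell loop in `scripts/tg` does, not the wrapper's
actual code; `CHECKED_VERBS` and `needs_auth_check` are hypothetical names.

```python
# Verbs that trigger the auth-comment check, per the commit message.
CHECKED_VERBS = {"plan", "apply", "destroy", "refresh"}


def needs_auth_check(argv):
    """Return True if any argument is one of the checked verbs.

    Like the shell `for arg in "$@"` loop, this looks at every argument,
    not just the first one.
    """
    return any(arg in CHECKED_VERBS for arg in argv)


if __name__ == "__main__":
    for args in (["plan"], ["init"], ["-no-color", "apply"], []):
        print(args, needs_auth_check(args))
```

Because every argument is inspected, a verb that appears after a flag still
triggers the check.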
Viktor Barzin 2026-05-11 19:18:27 +00:00
parent b91268fef4
commit 0712a1b659
3 changed files with 149 additions and 1 deletions


@@ -0,0 +1,124 @@
#!/usr/bin/env python3
"""Enforce the inline-comment convention for ingress_factory auth tiers.

Every `auth = "app"` or `auth = "none"` line under a stack must have an
immediately-preceding comment block containing `# auth = "<tier>":`
that documents what gates the app (for "app") or why the endpoint is
intentionally public (for "none").

This is the static guard for the anti-exposure rule documented in
`infra/.claude/CLAUDE.md` "Auth" section. It's invoked by `scripts/tg`
before every plan/apply/destroy/refresh, so it fires regardless of who
or what is running terragrunt: local laptop, CI, headless agent.

Stack-scoped by design: only checks the .tf files under the stack
being acted on. Other stacks' historical violations don't block work
on the current stack; each stack documents itself the next time it's
edited.

Usage:
    check-ingress-auth-comments.py <stack-path>   # scan one stack
    check-ingress-auth-comments.py --all          # scan every stack
"""

import argparse
import os
import re
import sys

AUTH_LINE = re.compile(r'^\s*auth\s*=\s*"(app|none)"\s*$')
COMMENT_LINE = re.compile(r'^\s*#')
COMMENT_TIER = re.compile(r'auth\s*=\s*"(app|none)"')


def scan_dir(path):
    violations = []
    for root, _, files in os.walk(path):
        for f in files:
            if not f.endswith('.tf'):
                continue
            full = os.path.join(root, f)
            try:
                with open(full) as fh:
                    lines = fh.readlines()
            except OSError:
                continue
            for i, line in enumerate(lines):
                m = AUTH_LINE.match(line)
                if not m:
                    continue
                tier = m.group(1)
                # Walk backwards through contiguous comment lines.
                # Pass if ANY of them documents the matching tier.
                ok = False
                j = i - 1
                while j >= 0 and COMMENT_LINE.match(lines[j]):
                    cm = COMMENT_TIER.search(lines[j])
                    if cm and cm.group(1) == tier:
                        ok = True
                        break
                    j -= 1
                if not ok:
                    violations.append((full, i + 1, tier))
    return violations


def main():
    ap = argparse.ArgumentParser(description=__doc__.splitlines()[0])
    g = ap.add_mutually_exclusive_group(required=True)
    g.add_argument('path', nargs='?', help='Stack directory to scan')
    g.add_argument('--all', action='store_true', help='Scan every stack under stacks/')
    args = ap.parse_args()

    if args.all:
        scan_paths = ['stacks']
    else:
        if not os.path.isdir(args.path):
            print(f"ERROR: {args.path} is not a directory", file=sys.stderr)
            sys.exit(2)
        scan_paths = [args.path]

    violations = []
    for p in scan_paths:
        violations.extend(scan_dir(p))

    if not violations:
        return

    print(
        "\n"
        "==============================================================\n"
        "ingress_factory auth-comment convention violated\n"
        "==============================================================\n"
        "\n"
        "Every `auth = \"app\"` or `auth = \"none\"` line must have a\n"
        "preceding comment line documenting what gates the app (for\n"
        "\"app\") or why the endpoint is intentionally public (for\n"
        "\"none\"). This guard prevents accidentally exposing private\n"
        "services. See infra/.claude/CLAUDE.md Auth section.\n"
        "\n"
        "Add a comment line directly above the auth line:\n"
        "\n"
        "    # auth = \"app\": <what gates the app, e.g. NextAuth + OAuth>\n"
        "    auth = \"app\"\n"
        "\n"
        "or:\n"
        "\n"
        "    # auth = \"none\": <why public, e.g. webhook receiver, CalDAV>\n"
        "    auth = \"none\"\n"
        "\n"
        "Violations:",
        file=sys.stderr,
    )
    for path, line_no, tier in violations:
        print(
            f"  {path}:{line_no}: auth = \"{tier}\" missing preceding "
            f"`# auth = \"{tier}\":` comment",
            file=sys.stderr,
        )
    print(file=sys.stderr)
    sys.exit(1)


if __name__ == '__main__':
    main()


@@ -102,6 +102,30 @@ for arg in "$@"; do
esac
done

# Detect if this is a plan/apply/destroy/refresh — anything that reads or
# writes infra state. Cheap pre-flight check below scans only the current
# stack's .tf files for the ingress_factory auth-comment convention. Other
# tg verbs (init, fmt, validate) skip the check.
is_tf_op=false
for arg in "$@"; do
  case "$arg" in
    plan|apply|destroy|refresh) is_tf_op=true ;;
  esac
done

# Anti-exposure guard: every `auth = "app"` or `auth = "none"` in this stack
# must have a preceding `# auth = "<tier>":` comment documenting what gates
# the app or why the endpoint is intentionally public. See:
#   - infra/modules/kubernetes/ingress_factory/main.tf (variable description)
#   - infra/.claude/CLAUDE.md "Auth" section
# Stack-scoped: only the stack being operated on is scanned, so violations
# in other stacks don't block this run.
if $is_tf_op && [ -n "$STACK_NAME" ]; then
  if ! "$REPO_ROOT/scripts/check-ingress-auth-comments.py" "$REPO_ROOT/stacks/$STACK_NAME"; then
    exit 1
  fi
fi

# Acquire lock for mutating operations (Tier 0 only — Tier 1 uses pg_advisory_lock)
if $is_mutating && [ -n "$STACK_NAME" ] && is_tier0 "$STACK_NAME"; then
if command -v vault &>/dev/null && [ -n "${VAULT_TOKEN:-}" ]; then