From bb0f9f59ef4f44c2c9c3cf9a6460a120c25347a2 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Fri, 12 Jun 2026 20:39:27 +0000
Subject: [PATCH 1/2] =?UTF-8?q?docs:=20CI-compute=20doctrine=20=E2=80=94?=
 =?UTF-8?q?=20leverage=20external=20infra=20for=20builds=20AND=20tests=20[?=
 =?UTF-8?q?ci=20skip]?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Viktor's standing instruction (2026-06-12): lean on external infra as
much as possible for CI — builds, running tests, lint, releases all on
GitHub Actions hosted runners, never on cluster nodes; in-cluster
pipelines only for cluster-touching steps (deploys, terragrunt, certbot).
Also: watch any triggered pipeline chain to completion and fix failures
immediately. Added to AGENTS.md + .claude/CLAUDE.md CI sections
(ADR-0002 companions).

Co-Authored-By: Claude Fable 5
---
 .claude/CLAUDE.md | 13 ++++++++++++-
 AGENTS.md         |  1 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index dd865aa9..d0bc9444 100755
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -26,7 +26,7 @@ Violations cause state drift, which causes future applies to break or silently r
 ## Instructions
 - **"remember X"**: Use `memory-tool store "content" --category facts --tags "tag1,tag2"` (via exec) for persistent cross-session memory. Also update this file + `AGENTS.md` (if shared knowledge), commit with `[ci skip]`. To recall: `memory-tool recall "query"`. To list: `memory-tool list`. To delete: `memory-tool delete `. The native `memory_search` and `memory_get` tools are also available for searching indexed memory files. For **storing** new memories, always use the `memory-tool` CLI via exec.
 - **Apply**: Authenticate via `vault login -method=oidc`, then use `scripts/tg` (preferred — handles state decrypt/encrypt) or `terragrunt` directly. `scripts/tg` adds `-auto-approve` for `--non-interactive` applies.
-- **New services need CI/CD** and **monitoring** (Prometheus/Uptime Kuma)
+- **New services need CI/CD** and **monitoring** (Prometheus/Uptime Kuma). CI = a GHA workflow on the repo's GitHub mirror (build + tests off-infra, ADR-0002); Woodpecker gets a deploy-only pipeline — never an in-cluster build.
 - **New service**: Use `setup-project` skill for full workflow
 - **Ingress**: `ingress_factory` module. **Auth** (`auth` string enum, default `"required"` — fail-closed). Pick by asking "what gates the app?":
   - `auth = "required"` — Authentik forward-auth gates every request. Use when the backend has **no built-in user auth** and Authentik is the only thing standing between strangers and the app (prowlarr, qbittorrent, netbox, phpipam, k8s-dashboard, any admin UI shipped without its own login).
@@ -89,6 +89,17 @@ Violations cause state drift, which causes future applies to break or silently r
 
 ## CI/CD Architecture — GHA Builds + Woodpecker Deploy
 
+**Doctrine (ADR-0002): leverage external infra for ALL CI compute.** Builds,
+tests, lint, and release jobs run on GitHub Actions hosted runners (public
+repos: unlimited free; private: 2000 free min/mo) — never on cluster nodes.
+In-cluster pipelines are reserved for cluster-touching steps only: Woodpecker
+deploys (`kubectl set image`), terragrunt applies, certbot. Do not
+(re)introduce in-cluster image builds or CI test runs — the fallback-build
+pattern was deliberately removed (clean cut). **Watch what you trigger**:
+after any push that fires a build chain, monitor it to completion (GHA run →
+Woodpecker deploy → `rollout status`) and fix failures immediately; verify
+via live state, not the checkmark. Fleet migration: PRD infra#10 (ADR-0002).
+
 **Owned-app deploy model (build triggers the rollout — 2026-06-02):** For
 self-hosted apps **we build** (Forgejo `viktor/` + Dockerfile +
 `.woodpecker.yml`), the build pipeline ALSO drives the rollout — atomic +
diff --git a/AGENTS.md b/AGENTS.md
index 43d4cf3e..797ed5df 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -90,6 +90,7 @@ Terragrunt-based homelab managing a Kubernetes cluster (5 nodes, v1.34.2) on Pro
 - **Public domain**: `viktorbarzin.me` (Cloudflare) | **Internal**: `viktorbarzin.lan` (Technitium DNS)
 - **Onboarding portal**: `https://k8s-portal.viktorbarzin.me` — self-service kubectl setup + docs
 - **CI/CD**: Woodpecker CI — PRs run plan, merges to master auto-apply all stacks
+- **CI compute is external (ADR-0002, 2026-06-12)**: builds, tests, lint, and release jobs run on GitHub Actions hosted runners via each repo's GitHub mirror — never on cluster nodes. In-cluster pipelines exist only for steps that need cluster access (Woodpecker `kubectl set image` deploys, terragrunt applies, certbot). Never add an in-cluster build or test pipeline to any repo; the fallback-build pattern was deliberately removed. After pushing anything that fires a build chain, watch it end-to-end (GHA run → Woodpecker deploy → rollout) before calling the change done — verify live state, not the checkmark.
 
 ## Key Paths
 - `stacks//main.tf` — service definition

From 98f1f7fc24368ab3ab87e7a35088a59509b77cd9 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Fri, 12 Jun 2026 20:41:50 +0000
Subject: [PATCH 2/2] tts: seed extension-less voice copies so tripit's bare stems resolve
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First live drain failed all 27 queued narrations with 404 'Voice file
'Emily' not found': tripit's catalog sends bare stems (Emily) but the
devnen server resolves the voice as a literal filename (Emily.wav) in
predefined_voices_path then reference_audio — no stem fallback exists
upstream (HEAD == our pinned sha), and symlinks can't bridge it because
safe_resolve_within() resolves them out of the containment check.

New initContainer on the chatterbox deployment copies the 28 bundled
voices to /data/reference_audio/ on the PVC (second lookup path). Same
image as the main container so no extra pull; idempotent; ~15 MB.

Verified live before committing: an extension-less copy synthesizes 200
audio/mp3 (5.3s warm) where voice=Emily 404'd.

Co-Authored-By: Claude Fable 5
---
 stacks/tts/main.tf | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/stacks/tts/main.tf b/stacks/tts/main.tf
index b9df17c3..5056fced 100644
--- a/stacks/tts/main.tf
+++ b/stacks/tts/main.tf
@@ -374,6 +374,36 @@ resource "kubernetes_deployment" "chatterbox" {
           name = kubernetes_secret.ghcr_credentials.metadata[0].name
         }
 
+        # tripit's voice catalog sends bare stems ("Emily"); the server resolves
+        # the voice as a LITERAL filename in predefined_voices_path THEN in
+        # reference_audio (404 otherwise, observed 2026-06-12: all 27 queued
+        # narrations failed with "Voice file 'Emily' not found"). Upstream HEAD
+        # (= our pinned sha) has no stem fallback, and symlinks can't bridge it
+        # because safe_resolve_within() .resolve()s them out of the containment
+        # check. So seed REAL extension-less copies of the bundled voices into
+        # reference_audio on the PVC (the second lookup path). Same image as the
+        # main container = no extra pull; idempotent; ~15 MB once. The engine
+        # sniffs audio content, not extensions.
+        init_container {
+          name = "seed-stem-voices"
+          image = local.image
+          command = ["sh", "-c", <<-EOC
+            set -eu
+            mkdir -p /data/reference_audio
+            for f in /app/voices/*.wav /app/voices/*.mp3; do
+              [ -e "$f" ] || continue
+              stem="$(basename "$f")"; stem="$${stem%.*}"
+              [ -e "/data/reference_audio/$stem" ] || cp "$f" "/data/reference_audio/$stem"
+            done
+            echo "reference_audio seeded:"; ls /data/reference_audio
+          EOC
+          ]
+          volume_mount {
+            name = "models"
+            mount_path = "/data"
+          }
+        }
+
         container {
           name = "chatterbox-tts"
           image = local.image