infra

Viktor Barzin 0b1282a13c llama-cpp: ignore_changes for keel/k8s-managed annotations

Every `tg apply` was reverting the annotations that keel patches when it detects an upstream digest change: `keel.sh/match-tag` (Kyverno-stamped), `keel.sh/update-time` (on the pod template; this is what actually triggers the rollout), plus the K8s-managed `kubernetes.io/change-cause` and `deployment.kubernetes.io/revision`. The revert forced a rollout, then the next keel poll re-stamped the annotations, forcing another. With llama-swap's ~10s cold load on each pod recreation, the user noticed.

Upstream `ghcr.io/mostlygeek/llama-swap:cuda` is a moving nightly tag, so keel still drives one legitimate rollout per day at ~07:25 UTC; this patch stops the apply-driven extra rollouts on top of that.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-24 09:01:17 +00:00
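For illustration, the kind of `lifecycle` block this commit describes might look roughly like the fragment below. This is a sketch, not the contents of `main.tf`: the resource type `kubernetes_deployment`, the resource name `llama_swap`, and the placement of each annotation (Deployment metadata versus pod template) are assumptions. Only the four annotation keys, and the fact that `keel.sh/update-time` sits on the pod template, come from the commit message.

```hcl
# Hypothetical fragment; resource type and name are assumed, not taken from main.tf.
resource "kubernetes_deployment" "llama_swap" {
  # metadata and spec blocks omitted for brevity

  lifecycle {
    ignore_changes = [
      # Assumed to be set on the Deployment's own metadata.
      metadata[0].annotations["keel.sh/match-tag"],
      metadata[0].annotations["kubernetes.io/change-cause"],
      metadata[0].annotations["deployment.kubernetes.io/revision"],

      # Set by keel on the pod template; a change here triggers the rollout.
      spec[0].template[0].metadata[0].annotations["keel.sh/update-time"],
    ]
  }
}
```

Listing individual map keys rather than the whole `annotations` map keeps any other annotations under Terraform's control, so only the externally managed keys are excluded from the diff.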
main.tf	llama-cpp: ignore_changes for keel/k8s-managed annotations	2026-05-24 09:01:17 +00:00
terragrunt.hcl	infra/llama-cpp: add stack — llama-swap fronting Qwen3-VL + MiniCPM-V	2026-05-10 14:13:40 +00:00
