infra/stacks/llama-cpp
Viktor Barzin 0b1282a13c llama-cpp: ignore_changes for keel/k8s-managed annotations
Every `tg apply` was reverting the annotations that keel patches when it
detects an upstream digest change — `keel.sh/match-tag` (Kyverno-stamped),
`keel.sh/update-time` (on the pod template; what actually triggers the
rollout), plus the K8s-managed `kubernetes.io/change-cause` and
`deployment.kubernetes.io/revision`. The revert forced a rollout, then
the next keel poll re-stamped the annotations, forcing another. With
llama-swap's ~10s cold-load on each pod recreate, the user noticed.

Upstream `ghcr.io/mostlygeek/llama-swap:cuda` is a moving nightly tag —
keel still drives one legitimate rollout per day at ~07:25 UTC; this
patch stops the apply-driven extra rollouts on top of that.
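
For illustration, the `ignore_changes` entries described above might look
roughly like the fragment below. This is a sketch, not the actual contents of
`main.tf`: the resource name `llama_swap` is hypothetical, the use of a
`kubernetes_deployment` resource is assumed, and placing `keel.sh/match-tag`
on the Deployment metadata (rather than the pod template) is a guess. Only
`keel.sh/update-time` is stated above to live on the pod template.

```hcl
# Sketch only: resource name and annotation placement are assumptions.
resource "kubernetes_deployment" "llama_swap" {
  # ... existing metadata and spec arguments unchanged ...

  lifecycle {
    ignore_changes = [
      # Deployment-level annotations written by Kyverno/keel and by Kubernetes.
      metadata[0].annotations["keel.sh/match-tag"],
      metadata[0].annotations["kubernetes.io/change-cause"],
      metadata[0].annotations["deployment.kubernetes.io/revision"],

      # Pod-template annotation that keel stamps to trigger a rollout.
      spec[0].template[0].metadata[0].annotations["keel.sh/update-time"],
    ]
  }
}
```

Listing individual map keys, rather than the whole `annotations` map, keeps
Terraform in charge of any other annotations declared in the stack.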

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 09:01:17 +00:00
main.tf llama-cpp: ignore_changes for keel/k8s-managed annotations 2026-05-24 09:01:17 +00:00
terragrunt.hcl infra/llama-cpp: add stack — llama-swap fronting Qwen3-VL + MiniCPM-V 2026-05-10 14:13:40 +00:00