infra

Viktor Barzin b10233975b llama-cpp: restore replicas to 1; fire-planner: fix llama-swap URL

llama-cpp was scaled to 0 during 2026-05-25 IO-storm recovery (TEMP-SCALEDOWN). Cluster is now stable; only frigate competes for the GPU on k8s-node1. Restoring to 1 to unblock fire-planner's Reddit examples ingest, which needs qwen3-8b for structured extraction.

fire-planner's llama_cpp_base_url default pointed at a non-existent service:port (llama-cpp:8000); the real service is `llama-swap` on port 8080. First 2026-05-28 bulk Job exited 0 with 0 rows because of this. Correcting.

2026-05-29 06:20:03 +00:00

main.tf	llama-cpp: restore replicas to 1; fire-planner: fix llama-swap URL	2026-05-29 06:20:03 +00:00
terragrunt.hcl	infra/llama-cpp: add stack — llama-swap fronting Qwen3-VL + MiniCPM-V	2026-05-10 14:13:40 +00:00
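
The commit message above describes a Job that exited 0 with 0 rows because its base URL pointed at the wrong service and port. A minimal sketch of how such a failure could be made loud is shown here. It is not fire-planner's actual code: the function names `check_base_url` and `require_rows` are hypothetical, the `http` scheme is an assumption, and only the service name `llama-swap` and port 8080 come from the commit message.

```python
from urllib.parse import urlsplit

# Values from the commit message; the http scheme is an assumption.
EXPECTED_HOST = "llama-swap"
EXPECTED_PORT = 8080


def check_base_url(url, host=EXPECTED_HOST, port=EXPECTED_PORT):
    """Return url unchanged if it targets the expected service and port."""
    parts = urlsplit(url)
    if parts.hostname != host or parts.port != port:
        raise ValueError(
            f"base URL {url!r} does not point at {host}:{port}"
        )
    return url


def require_rows(row_count):
    """Treat an empty ingest as a failure instead of a silent success."""
    if row_count == 0:
        raise RuntimeError("ingest produced 0 rows; failing the run")
    return row_count


if __name__ == "__main__":
    check_base_url("http://llama-swap:8080")   # passes
    require_rows(3)                            # passes
    try:
        check_base_url("http://llama-cpp:8000")  # the old default
    except ValueError as exc:
        print(exc)
```

Raising from `require_rows` would let a wrapper exit non-zero, so a bulk Job that ingests nothing is reported as failed rather than completed.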