infra/stacks/immich
Viktor Barzin 052c776eba immich: set MACHINE_LEARNING_MODEL_TTL 0->600 to stop GPU VRAM hog
immich-ml at TTL=0 never unloaded models; a heavy OCR library job
inflated onnxruntime's CUDA arena to ~10.7GB and held it on the shared
time-sliced T4, starving llama-swap (qwen3-8b) so recruiter-responder
triage 502'd silently for hours (emails preserved unseen, no loss).
TTL=600 lets idle ad-hoc models (OCR, face) free VRAM while preloaded
CLIP/smart-search stays warm.

Docs: correct stale llama-cpp GPU notes (T4 is time-sliced, no VRAM
isolation; add qwen3-8b to model table), immich MODEL_TTL gotcha in
.claude/CLAUDE.md, and a post-mortem.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 20:16:11 +00:00
..
.terraform.lock.hcl kms: revert files accidentally bundled into the docs commit 2026-06-01 10:36:49 +00:00
backend.tf Woodpecker CI deploy [CI SKIP] 2026-05-29 18:07:00 +00:00
chart_values.tpl [redis] Migrate live RW consumers off bare redis.redis hostname 2026-04-19 12:42:36 +00:00
frame.tf immich: GPU-accelerate video transcoding (NVENC + NVDEC) 2026-05-29 18:05:34 +00:00
main.tf immich: set MACHINE_LEARNING_MODEL_TTL 0->600 to stop GPU VRAM hog 2026-06-02 20:16:11 +00:00
providers.tf cluster-health: emergency-stop Keel + roll back image downgrades + quota raises 2026-05-26 18:48:50 +00:00
secrets [ci skip] Move Terraform modules into stack directories 2026-02-22 14:38:14 +00:00
terragrunt.hcl migrate all secrets from SOPS to Vault KV 2026-03-14 17:15:48 +00:00