immich: GPU-accelerate video transcoding (NVENC + NVDEC)

Pin immich-server to the GPU node with a time-sliced nvidia.com/gpu slice so ffmpeg uses hardware NVENC encode + NVDEC decode instead of software. This frees the ~3-4 CPU cores the software transcoder was burning inside the request-serving pod (which was slowing thumbnail/photo browsing), and makes incompatible (HEVC/iPhone) videos playable in seconds.

Activation is ffmpeg.accel=nvenc + accelDecode=true in the DB system-config (Immich app config is DB-managed here, like oauth/smtp, not Terraform).

Also give immich-frame the same Keel ignore_changes immich-server already has, so an untargeted apply no longer churns it (pre-existing drift).

Docs: .claude/CLAUDE.md Immich row + compute.md GPU-workloads list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent b10233975b
commit bc41fe572a
4 changed files with 41 additions and 14 deletions
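
The scheduling and drift changes described in the commit message could look roughly like the Terraform below. This is a minimal sketch, not the repository's actual code: the resource name, namespace, labels, image tag, and the use of the NFD PCI label as a `node_selector` are all assumptions made for illustration, and the `ignore_changes` entry is a guess at what a Keel-managed deployment would typically ignore. The `ffmpeg.accel=nvenc` and `accelDecode=true` settings are not shown because the commit message states they live in the DB system-config, not in Terraform.

```hcl
# Hypothetical sketch: names, namespace, labels and image tag are illustrative
# assumptions, not copied from the repository.
resource "kubernetes_deployment" "immich_server" {
  metadata {
    name      = "immich-server"
    namespace = "immich"
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "immich-server" }
    }

    template {
      metadata {
        labels = { app = "immich-server" }
      }

      spec {
        # Assumption: pin to the GPU node via the NFD label the docs mention
        # for discovery; the real selector may use a different label.
        node_selector = {
          "feature.node.kubernetes.io/pci-10de.present" = "true"
        }

        container {
          name  = "immich-server"
          image = "ghcr.io/immich-app/immich-server:release"

          resources {
            limits = {
              # One time-sliced replica of the shared T4, not the whole card.
              "nvidia.com/gpu" = "1"
            }
          }
        }
      }
    }
  }

  lifecycle {
    # Assumption: Keel bumps the image in-cluster, so Terraform ignores that
    # field to keep an untargeted apply from reverting it.
    ignore_changes = [
      spec[0].template[0].spec[0].container[0].image,
    ]
  }
}
```

With time-slicing the `nvidia.com/gpu` limit is a scheduling token and does not reserve memory or compute, so several pods requesting `1` can land on the same card. Applying an equivalent `lifecycle` block to immich-frame would correspond to the drift fix the commit message describes.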
@@ -330,10 +330,14 @@ label with it, and `null_resource.gpu_node_config` re-applies the
 next apply (discovery keyed on
 `feature.node.kubernetes.io/pci-10de.present=true`).
 
-**GPU Workloads**:
-- Ollama (LLM inference)
-- ComfyUI (Stable Diffusion workflows)
-- Stable Diffusion WebUI
+**GPU Workloads** (time-sliced — node advertises `Tesla-T4-SHARED`,
+`sharing-strategy=time-slicing`, `nvidia.com/gpu.replicas=100`, so many pods
+share the single T4; request `nvidia.com/gpu: 1` for a slice, not the whole card):
+- immich-machine-learning (CLIP smart-search + facial recognition, CUDA)
+- immich-server (NVENC/NVDEC video transcoding — `ffmpeg.accel=nvenc` + `accelDecode=true`)
+- Frigate (object-detection inference)
+- llama-cpp / llama-swap (LLM inference)
+- nvidia-exporter + gpu-pod-exporter (DCGM metrics)
 
 ## Configuration
 