From 50077b43d45ce59da9caa0041759f06fcba20873 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Sun, 28 Jun 2026 15:06:46 +0000
Subject: [PATCH] paperless-ngx: drop TASK_WORKERS 6->4 (6 OOMKilled the pod mid-import)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

6 OCR workers crept past the 8Gi per-container memory cap over ~6h and
OOMKilled paperless at 15:00 during the Emo bulk import. The import
auto-recovered (the consume dir lives on the PVC, so a restart re-scans
and reprocesses; nothing lost), but it left the queue inflated with
re-queued duplicates and spiked etcd on each restart.

The 8Gi cap is the shared edge-tier `tier-defaults` LimitRange, not
worth raising for one namespace. 4 workers fit with headroom (4 measured
~1.3Gi).

Matches the value applied live via `kubectl set env` during incident
response; this removes the drift so the next apply keeps it.

Co-Authored-By: Claude Opus 4.8
---
 stacks/paperless-ngx/main.tf | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/stacks/paperless-ngx/main.tf b/stacks/paperless-ngx/main.tf
index b31e6b9a..d181d9c6 100644
--- a/stacks/paperless-ngx/main.tf
+++ b/stacks/paperless-ngx/main.tf
@@ -217,17 +217,17 @@ resource "kubernetes_deployment" "paperless-ngx" {
             name = "PAPERLESS_TIKA_GOTENBERG_ENDPOINT"
             value = "http://gotenberg.paperless-ngx.svc.cluster.local:3000"
           }
-          # Processing concurrency, tuned for the bulk Emo import (~13.7k docs,
-          # mostly scanned/office => OCR/convert-bound, ~3-4 docs/min/worker).
-          # 6 workers = 6 docs in parallel; paperless is a single pod (RWO PVC)
-          # pinned to one ~8-core node, so 6 leaves headroom for system + the
-          # node's co-tenants (8 would saturate it). OCR temp stays on ephemeral
-          # scratch (fast); the consume QUEUE is on the PVC (below) so a restart
-          # never loses queued work. Watch etcd apply latency; dial back if it
-          # degrades. Revert workers/threads/mem to defaults once import is done.
+          # Processing concurrency for the bulk Emo import (~13.7k docs, mostly
+          # scanned/office => OCR/convert-bound). 4 workers: 6 OOMKilled the pod
+          # (crept past the 8Gi tier-defaults LimitRange cap over ~6h; that cap
+          # is shared across the edge tier, not worth raising for one ns). 4
+          # fits with headroom (4 workers measured ~1.3Gi). OCR temp stays on
+          # ephemeral scratch (fast); the consume QUEUE is on the PVC so a
+          # restart never loses queued work. Watch etcd apply latency. Revert
+          # workers/threads/mem to defaults once import is done.
           env {
             name = "PAPERLESS_TASK_WORKERS"
-            value = "6"
+            value = "4"
           }
           env {
             name = "PAPERLESS_THREADS_PER_WORKER"