paperless-ngx: 2 task workers + 2 threads/worker + 4Gi limit for the Emo bulk import
Some checks failed
ci/woodpecker/push/default Pipeline was canceled
Emo's ~13.7k-doc import is OCR-bound on a single Celery worker (~10s/doc, so a multi-day job). Set PAPERLESS_TASK_WORKERS=2 and PAPERLESS_THREADS_PER_WORKER=2 for roughly 2x throughput, and raise the memory limit from 2Gi to 4Gi to fit two concurrent OCR jobs.

Kept deliberately modest: archive writes hit the shared sdc HDD that etcd also lives on (IO-storm risk, code-oflt). Watch etcd apply latency and revert workers to 1 if it degrades. Revert to defaults once the import is done.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
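The throughput figures in the message can be checked with a rough calculation. The sketch below expresses it as Terraform locals; every name in it is illustrative and not part of this commit, and it assumes throughput scales linearly with the number of task workers, which the shared HDD may not allow in practice.

```hcl
# Back-of-envelope estimate only. None of these names exist in the repo.
locals {
  import_docs         = 13700 # ~13.7k docs, from the commit message
  ocr_seconds_per_doc = 10    # ~10s/doc, from the commit message
  task_workers        = 2     # PAPERLESS_TASK_WORKERS after this change

  # Pure OCR time on one worker, in hours: 13700 * 10 / 3600 is about 38.
  hours_one_worker = local.import_docs * local.ocr_seconds_per_doc / 3600

  # Assumption: linear scaling with worker count (optimistic on a shared disk).
  hours_with_workers = local.hours_one_worker / local.task_workers
}

output "import_hours_estimate" {
  value = {
    one_worker   = floor(local.hours_one_worker)   # 38
    with_workers = floor(local.hours_with_workers) # 19
  }
}
```

These are lower bounds on OCR time alone; queueing, archive writes, and any throttling done to protect etcd would push the real wall-clock time higher.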
parent d4f564e8d5
commit 9599beadc9
1 changed file with 15 additions and 1 deletion
@@ -217,6 +217,20 @@ resource "kubernetes_deployment" "paperless-ngx" {
   name = "PAPERLESS_TIKA_GOTENBERG_ENDPOINT"
   value = "http://gotenberg.paperless-ngx.svc.cluster.local:3000"
 }
+# Processing concurrency, tuned for the bulk Emo import (~13.7k docs).
+# 2 workers = 2 docs in parallel (≈2x throughput); kept modest because
+# archive writes land on the shared sdc HDD that etcd also uses (IO
+# storm risk, code-oflt). 2 threads/worker speeds per-doc OCR using the
+# node's spare CPU. Watch etcd apply latency; dial workers back to 1 if
+# it degrades. Revert both to defaults once the import is done.
+env {
+  name = "PAPERLESS_TASK_WORKERS"
+  value = "2"
+}
+env {
+  name = "PAPERLESS_THREADS_PER_WORKER"
+  value = "2"
+}
 volume_mount {
   name = "data"
   mount_path = "/usr/src/paperless/data"
@@ -228,7 +242,7 @@ resource "kubernetes_deployment" "paperless-ngx" {
     memory = "2Gi"
   }
   limits = {
-    memory = "2Gi"
+    memory = "4Gi"
   }
 }
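Since the change above is meant to be reverted after the import, one option is to gate it behind a single toggle so the rollback is a one-line edit. This is only a sketch: the variable `paperless_bulk_import` and both locals are hypothetical and do not exist in this commit, and the fallback values (1 worker, 2Gi) are taken from the commit message and the removed diff line.

```hcl
# Hypothetical toggle; not part of this commit.
variable "paperless_bulk_import" {
  type        = bool
  default     = false
  description = "Raise paperless-ngx task concurrency and memory for a bulk import."
}

locals {
  # 1 worker is the fallback the commit message names; 2Gi is the old limit.
  paperless_task_workers = var.paperless_bulk_import ? 2 : 1
  paperless_memory_limit = var.paperless_bulk_import ? "4Gi" : "2Gi"
}

# Usage inside the container block of kubernetes_deployment.paperless-ngx:
#
#   env {
#     name  = "PAPERLESS_TASK_WORKERS"
#     value = tostring(local.paperless_task_workers)
#   }
#
#   resources {
#     limits = {
#       memory = local.paperless_memory_limit
#     }
#   }
```

Setting the variable back to `false` restores one worker and the 2Gi limit in a single apply. `PAPERLESS_THREADS_PER_WORKER` is left out of the sketch because reverting it to its default means omitting the `env` block entirely rather than changing its value.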