tripit: switch mail-ingest LLM_MODEL qwen3-8b -> qwen3vl-8b (qwen3-8b segfaults)
All checks were successful
ci/woodpecker/push/default Pipeline was successful

The qwen3-8b GGUF segfaults on load on the current llama-swap :cuda image
("common_init_from_params: failed to create context"; llama-swap returns 502),
which broke ALL tripit mail ingest text extraction — booking emails AND forwarded
reels (status=failed, "no place could be read"). The GGUF isn't corrupt (valid
header, full size, worked for weeks) — it's a llama.cpp/image regression. Rather
than pin the SHARED llama-swap image (cross-user blast radius), repoint the
ingest-plans CronJob at qwen3vl-8b, an already-provisioned 8B model that loads
fine and extracts flight numbers + places reliably. Restores the auto-path
(reels resolve via the Nominatim geocoder; bookings parse again). The broken
qwen3-8b GGUF is a separate, non-urgent llama-cpp cleanup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-06-22 15:52:09 +00:00
parent 7dbbb74163
commit 59f2070e56

View file

@ -571,11 +571,15 @@ locals {
extra_env = {
LLM_MODE = "llamacpp"
LLM_ENDPOINT = "http://llama-swap.llama-cpp.svc.cluster.local:8080"
# Text body extraction uses the 8B text model (reliably emits flight_number);
# Text body extraction uses an 8B model (reliably emits flight_number);
# boarding-pass image attachments use the 4B vision model. llama-swap loads
# each on demand. Was qwen3vl-4b for both, which dropped flight numbers and
# duplicated schedule-change emails (2026-06-16).
LLM_MODEL = "qwen3-8b"
# duplicated schedule-change emails (2026-06-16). Switched qwen3-8b ->
# qwen3vl-8b (2026-06-22): the qwen3-8b GGUF SEGFAULTS on the current
# llama-swap :cuda image ("failed to create context"), which broke ALL mail
# ingest; qwen3vl-8b loads and extracts flight numbers + places reliably.
# (ADR-0033 adds a claude-agent-service fallback for the next llama outage.)
LLM_MODEL = "qwen3vl-8b"
LLM_VISION_MODEL = "qwen3vl-4b"
MAIL_INGEST_ENABLED = "true"
# ReelWishlist ingest (tripit ADR-0031): geocode forwarded-reel venues at