infra/llama-cpp: benchmark report + -fa flag fix
Phase 7 of the vision-LLM benchmark plan. Adds:

- docs/benchmarks/2026-05-10-vision-llm.md: curated report (TL;DR, per-model analysis, top-N agreement, cost vs cloud APIs, sample captions). Verdict: qwen3vl-4b for the request path (3.55 s p50, 100% parse, decisive top-N distro); qwen3vl-8b for caption polish.
- docs/benchmarks/benchmark-2026-05-10-1424.json: raw 300-row dump for diff-checking against future runs.
- main.tf: -fa -> -fa on (b9085 llama.cpp removed the no-value form of the flash-attention flag; without the value llama-server exits before serving any request).
- llama-cpp.md architecture doc links the report so future operators land on the deployed-and-evaluated model from one entry point.

300/300 calls, 0 parse errors, 33m32s wall on a single T4 with the GPU exclusively allocated. immich-ml was scaled to 0 for the run (node1 RAM constraint, not GPU - bumping node1 RAM is tracked as a follow-up).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 3da01e6e1e
commit 6e7fe96a40
23 changed files with 8504 additions and 11 deletions
@@ -539,7 +539,8 @@ resource "vault_database_secret_backend_connection" "postgresql" {
     "pg-health", "pg-linkwarden",
     "pg-affine", "pg-woodpecker", "pg-claude-memory",
     "pg-terraform-state", "pg-payslip-ingest", "pg-job-hunter",
-    "pg-wealthfolio-sync", "pg-fire-planner"
+    "pg-wealthfolio-sync", "pg-fire-planner",
+    "pg-postiz",
   ]
 
   postgresql {
@@ -677,6 +678,17 @@ resource "vault_database_secret_backend_static_role" "pg_terraform_state" {
   rotation_period = 604800
 }
 
+# Postiz uses three databases (postiz, temporal, temporal_visibility) all owned
+# by the `postiz` PG role. One static role covers all three. Migrated from the
+# bundled bitnami PG StatefulSet to CNPG on 2026-05-09.
+resource "vault_database_secret_backend_static_role" "pg_postiz" {
+  backend = vault_mount.database.path
+  db_name = vault_database_secret_backend_connection.postgresql.name
+  name = "pg-postiz"
+  username = "postiz"
+  rotation_period = 604800
+}
+
 resource "vault_database_secret_backend_static_role" "pg_payslip_ingest" {
   backend = vault_mount.database.path
   db_name = vault_database_secret_backend_connection.postgresql.name
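The comment in the hunk above says one static role covers all three Postiz databases. As a rough illustration of what that could look like on the consuming side, here is a minimal Terraform sketch. It is not part of this commit: the data source name `postiz_db` and the output name are hypothetical, and it assumes the credential is read through the `static-creds` endpoint of the same `vault_mount.database` mount that the diff references.

```hcl
# Hypothetical consumer of the pg-postiz static role (illustration only,
# not taken from this repository).
# Assumption: static credentials are read from <mount>/static-creds/<role>
# on the database secrets engine mounted at vault_mount.database.path.
data "vault_generic_secret" "postiz_db" {
  path = "${vault_mount.database.path}/static-creds/pg-postiz"
}

# Because postiz, temporal and temporal_visibility are all owned by the
# `postiz` PG role, this single username/password pair would be used for
# connections to each of the three databases.
output "postiz_db_username" {
  value     = jsondecode(data.vault_generic_secret.postiz_db.data_json)["username"]
  sensitive = true
}
```

With a `rotation_period` of 604800 seconds, Vault rotates the password weekly, so a consumer along these lines would need to re-read the secret after each rotation. How this repository actually delivers the credential to the Postiz workloads is not shown in the diff.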