payslip-ingest

4 commits 1 branch 0 tags 260 KiB

Author	SHA1	Message	Date
Viktor Barzin	693ec4a5d4	extractor: wait up to 15min for claude-agent-service to free lock. Real UK payslip extractions routinely take 5-10min end-to-end (Haiku processing 100-300KB base64'd PDFs). With 10 retries × 5s = 50s we'd abort while another extraction was still in-flight. Bump to 90 retries × 10s = 900s wait, enough to cover the server-side timeout_seconds=600 plus some slack. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:36:05 +00:00
Viktor Barzin	7a32885d26	extractor: bump claude poll timeout to 600s. Real UK payslip PDFs are 100-200KB base64'd, which means ~300-500KB of prompt tokens. Claude (even Haiku) takes 1-5 minutes to process and emit structured JSON. The original 120s ceiling timed out before extraction could finish. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:32:34 +00:00
Viktor Barzin	11a8256e6a	alembic: create schema before initializing version table. The payslip_ingest schema must exist before Alembic creates its alembic_version tracking table inside it. Add CREATE SCHEMA IF NOT EXISTS at the top of do_run_migrations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:23:30 +00:00
Viktor Barzin	57484619c1	Initial commit: event-driven UK payslip ingest service. Extracted from /home/wizard/code monorepo into its own repo so Woodpecker CI can watch it. Identical content to /home/wizard/code commit e426028. See README.md for overview, env vars, and Paperless workflow config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 22:10:23 +00:00
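
The retry arithmetic in commit 693ec4a5d4 (10 × 5s = 50s versus 90 × 10s = 900s against a 600s server-side timeout) can be illustrated with a minimal sketch. This is not the repository's extractor code: `wait_for_lock`, `try_acquire`, and the injectable `sleep` parameter are hypothetical names chosen for illustration, and the sketch assumes the lock is polled at a fixed interval.

```python
import time
from typing import Callable


def wait_budget_seconds(retries: int, interval_s: float) -> float:
    """Total time spent sleeping if every retry is used."""
    return retries * interval_s


def wait_for_lock(
    try_acquire: Callable[[], bool],   # hypothetical: returns True once the lock is free
    retries: int = 90,                 # value taken from the commit message
    interval_s: float = 10.0,          # value taken from the commit message
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll try_acquire once immediately, then up to `retries` more times."""
    if try_acquire():
        return True
    for _ in range(retries):
        sleep(interval_s)              # fixed-interval wait between attempts
        if try_acquire():
            return True
    return False                       # budget exhausted, caller should abort
```

With the old settings the wait budget (50s) is shorter than a single in-flight extraction can take, so the caller gives up early; with the new settings the budget (900s) exceeds the 600s server-side timeout, so a held lock should be released before the retries run out.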