diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index bb1ce653..0f93847e 100755
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -30,7 +30,7 @@ Violations cause state drift, which causes future applies to break or silently r
 - **New service**: Use `setup-project` skill for full workflow
 - **Ingress**: `ingress_factory` module. Auth: `protected = true`. Anti-AI: on by default. **DNS**: `dns_type = "proxied"` (Cloudflare CDN) or `"non-proxied"` (direct A/AAAA). DNS records are auto-created — no need to edit `config.tfvars`.
 - **Docker images**: Always build for `linux/amd64`. Use 8-char git SHA tags — `:latest` causes stale pull-through cache.
-- **Private registry**: `forgejo.viktorbarzin.me/viktor/` (Forgejo packages, OAuth-style PAT auth). Use `image: forgejo.viktorbarzin.me/viktor/:` + `imagePullSecrets: [{name: registry-credentials}]`. Kyverno auto-syncs the Secret to all namespaces. Containerd `hosts.toml` on every node redirects to in-cluster Traefik LB `10.0.20.200` to avoid hairpin NAT. Push-side: viktor PAT in Vault `secret/ci/global/forgejo_push_token` (Forgejo container packages are scoped per-user; only the package owner can push, ci-pusher cannot write to viktor/*). Pull-side: cluster-puller PAT in Vault `secret/viktor/forgejo_pull_token`. Retention CronJob (`forgejo-cleanup` in `forgejo` ns, daily 04:00) keeps newest 10 versions + always `:latest`; integrity probed every 15min by `forgejo-integrity-probe` in `monitoring` ns (catalog walk + manifest HEAD on every blob). See `docs/plans/2026-05-07-forgejo-registry-consolidation-{design,plan}.md` for the migration history. Pull-through caches for upstream registries (DockerHub, GHCR, Quay, k8s.gcr, Kyverno) stay on the registry VM at `10.0.20.10` ports 5000/5010/5020/5030/5040 — the old port-5050 R/W private registry was decommissioned 2026-05-07.
+- **Private registry**: `registry.viktorbarzin.me` (htpasswd auth, credentials in Vault `secret/viktor`). Use `image: registry.viktorbarzin.me/:` + `imagePullSecrets: [{name: registry-credentials}]`. Kyverno auto-syncs the secret to all namespaces. Build & push from registry VM (`10.0.20.10`). Containerd `hosts.toml` redirects pulls to LAN IP directly. Web UI at `docker.viktorbarzin.me` (Authentik-protected).
 - **LinuxServer.io containers**: `DOCKER_MODS` runs apt-get on every start — bake slow mods into a custom image (`RUN /docker-mods || true` then `ENV DOCKER_MODS=`). Set `NO_CHOWN=true` to skip recursive chown that hangs on NFS mounts.
 - **Node memory changes**: When changing VM memory on any k8s node, update kubelet `systemReserved`, `kubeReserved`, and eviction thresholds accordingly. Config: `/var/lib/kubelet/config.yaml`. Template: `stacks/infra/main.tf`. Current values: systemReserved=512Mi, kubeReserved=512Mi, evictionHard=500Mi, evictionSoft=1Gi.
 - **Node OS disk tuning** (in `stacks/infra/main.tf`): kubelet `imageGCHighThresholdPercent=70` (was 85), `imageGCLowThresholdPercent=60` (was 80), ext4 `commit=60` in fstab (was default 5s), journald `SystemMaxUse=200M` + `MaxRetentionSec=3day`.
@@ -48,7 +48,6 @@ Violations cause state drift, which causes future applies to break or silently r
 - **Tier 0 details**: Decrypt priority: Vault Transit (primary) → age key fallback. Encrypt: both Vault Transit + age recipients. Scripts: `scripts/state-sync {encrypt|decrypt|commit} [stack]`.
 - **Adding operator**: Generate age key (`age-keygen`), add pubkey to `.sops.yaml`, run `sops updatekeys` on Tier 0 `.enc` files. For Tier 1, only Vault access is needed.
 - **Migration script**: `scripts/migrate-state-to-pg` (one-shot, idempotent) migrates Tier 1 stacks from local to PG.
-- **Adopting existing resources**: use HCL `import {}` blocks (TF 1.5+), not `terraform import` CLI. Commit stanza → plan-to-zero → apply → delete stanza. Canonical reason: reviewable in PR, plan-safe, idempotent, tier-agnostic. Full rules + per-provider ID formats in `AGENTS.md` → "Adopting Existing Resources".
 ## Secrets Management — Vault KV
 - **Vault is the sole source of truth** for secrets.
@@ -74,7 +73,7 @@ Violations cause state drift, which causes future applies to break or silently r
 - **LimitRange**: Tier-based defaults silently apply to pods with `resources: {}`. Always set explicit resources on containers needing more than defaults. Tier 3-edge and 4-aux now use Burstable QoS (request < limit) to reduce scheduler pressure.
 - **Democratic-CSI sidecars**: Must set explicit resources (32-80Mi) in Helm values — 17 sidecars default to 256Mi each via LimitRange. `csiProxy` is a TOP-LEVEL chart key, not nested under controller/node.
 - **ResourceQuota blocks rolling updates**: When quota is tight, scale to 0 then back to 1 instead of RollingUpdate. Or use Recreate strategy.
-- **Kyverno ndots drift**: Kyverno injects dns_config on all pods. Every `kubernetes_deployment`, `kubernetes_stateful_set`, and `kubernetes_cron_job_v1` MUST include `lifecycle { ignore_changes = [spec[0].template[0].spec[0].dns_config] # KYVERNO_LIFECYCLE_V1 }` (use `spec[0].job_template[0].spec[0].template[0].spec[0].dns_config` for CronJobs). The `# KYVERNO_LIFECYCLE_V1` marker is the canonical discoverability tag — grep for it to locate every site. A shared Terraform module was considered but `ignore_changes` only accepts static attribute paths (not module outputs, locals, or expressions), so the snippet convention is the only viable path. Full rationale and copy-paste snippets in `AGENTS.md` → "Kyverno Drift Suppression".
+- **Kyverno ndots drift**: Kyverno injects dns_config on all pods. Add `lifecycle { ignore_changes = [spec[0].template[0].spec[0].dns_config] }` to kubernetes_deployment resources to prevent perpetual TF plan drift.
 - **NVIDIA GPU operator resources**: dcgm-exporter and cuda-validator resources configurable via `dcgmExporter.resources` and `validator.resources` in nvidia values.yaml.
 - **Pin database versions**: Disable Diun (image update monitoring) for MySQL, PostgreSQL, Redis.
 - **Quarterly right-sizing**: Check Goldilocks dashboard. Compare VPA upperBound to current request. Also check for under-provisioned (VPA upper > request x 0.8).
@@ -117,7 +116,7 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
 - **Rate limiting**: Return 429 (not 503). Per-service tuning: Immich/Nextcloud need higher limits.
 - **Retry middleware**: 2 attempts, 100ms — in default ingress chain.
 - **HTTP/3 (QUIC)**: Enabled cluster-wide via Traefik.
-- **IPAM & DNS auto-registration**: pfSense Kea DHCP serves all 3 subnets (VLAN 10, VLAN 20, 192.168.1.x). Kea DDNS auto-registers every DHCP client in Technitium (RFC 2136, A+PTR). CronJob `phpipam-pfsense-import` (hourly) pulls Kea leases + ARP into phpIPAM via SSH (passive, no scanning). CronJob `phpipam-dns-sync` (15min) bidirectional sync phpIPAM ↔ Technitium. 42 MAC reservations for 192.168.1.x.
+- **IPAM & DNS auto-registration**: pfSense Kea DHCP serves all 3 subnets (VLAN 10, VLAN 20, 192.168.1.x). Kea DDNS auto-registers every DHCP client in Technitium (RFC 2136, A+PTR). CronJob `phpipam-pfsense-import` (5min) pulls Kea leases + ARP into phpIPAM via SSH (passive, no scanning). CronJob `phpipam-dns-sync` (15min) bidirectional sync phpIPAM ↔ Technitium. 42 MAC reservations for 192.168.1.x.
 ## Service-Specific Notes
 | Service | Key Operational Knowledge |
@@ -129,15 +128,15 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
 | Authentik | 3 replicas, PgBouncer in front of PostgreSQL, strip auth headers before forwarding |
 | Kyverno | failurePolicy=Ignore to prevent blocking cluster, pin chart version |
 | MySQL Standalone | Raw `kubernetes_stateful_set_v1` with `mysql:8.4` (migrated from InnoDB Cluster 2026-04-16). `skip-log-bin`, `innodb_flush_log_at_trx_commit=2`, `innodb_doublewrite=ON`. ConfigMap `mysql-standalone-cnf`. PVC `data-mysql-standalone-0` (15Gi, `proxmox-lvm-encrypted`). Service `mysql.dbaas` unchanged. Anti-affinity excludes k8s-node1. Old InnoDB Cluster + operator still in TF (Phase 4 cleanup pending). Bitnami charts deprecated (Broadcom Aug 2025) — use official images. |
-| phpIPAM | IPAM — no active scanning. `pfsense-import` CronJob (hourly) pulls Kea leases + ARP via SSH. `dns-sync` CronJob (15min) bidirectional sync with Technitium. Kea DDNS on pfSense handles all 3 subnets. API app `claude` (ssl_token). |
+| phpIPAM | IPAM — no active scanning. `pfsense-import` CronJob (5min) pulls Kea leases + ARP via SSH. `dns-sync` CronJob (15min) bidirectional sync with Technitium. Kea DDNS on pfSense handles all 3 subnets. API app `claude` (ssl_token). |
 ## Monitoring & Alerting
 - Alert cascade inhibitions: if node is down, suppress pod alerts on that node.
 - Exclude completed CronJob pods from "pod not ready" alerts.
-- Every new service gets Prometheus scrape config + Uptime Kuma monitor. External monitors auto-created for Cloudflare-proxied services by `external-monitor-sync` CronJob (10min, uptime-kuma ns). Mechanism: `ingress_factory` auto-adds `uptime.viktorbarzin.me/external-monitor=true` whenever `dns_type != "none"` (see `modules/kubernetes/ingress_factory/main.tf`) — no manual action needed on new services. The `cloudflare_proxied_names` list in `config.tfvars` is a legacy fallback for the 17 hostnames not yet migrated to `ingress_factory` `dns_type`; don't check that list when debugging "is this monitored?" questions.
+- Every new service gets Prometheus scrape config + Uptime Kuma monitor. External monitors auto-created for Cloudflare-proxied services by `external-monitor-sync` CronJob (10min, uptime-kuma ns).
 - **External monitoring**: `[External] ` monitors in Uptime Kuma test full external path (DNS → Cloudflare → Tunnel → Traefik). Divergence metric `external_internal_divergence_count` → alert `ExternalAccessDivergence` (15min). Config: `stacks/uptime-kuma/`, targets from `cloudflare_proxied_names` in `config.tfvars` (17 remaining centrally-managed hostnames; most DNS records now auto-created by `ingress_factory` `dns_type` param).
 - Key alerts: OOMKill, pod replica mismatch, 4xx/5xx error rates, UPS battery, CPU temp, SSD writes, NFS responsiveness, ClusterMemoryRequestsHigh (>85%), ContainerNearOOM (>85% limit), PodUnschedulable, ExternalAccessDivergence.
-- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Brevo HTTP API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Inbound external traffic enters via pfSense HAProxy on `10.0.20.1:{25,465,587,993}`, which forwards to k8s `mailserver-proxy` NodePort (30125-30128) with `send-proxy-v2`. Mailserver pod runs alt PROXY-speaking listeners (2525/4465/5587/10993) alongside stock PROXY-free ones (25/465/587/993) for intra-cluster clients. Real client IPs recovered from PROXY v2 header despite kube-proxy SNAT (replaces pre-2026-04-19 MetalLB `10.0.20.202` ETP:Local scheme; see bd code-yiu + `docs/runbooks/mailserver-pfsense-haproxy.md`). Vault: `brevo_api_key` in `secret/viktor` (probe + relay).
+- **E2E email monitoring**: CronJob `email-roundtrip-monitor` (every 20 min) sends test email via Mailgun API to `smoke-test@viktorbarzin.me` (catch-all → `spam@`), verifies IMAP delivery, deletes test email, pushes metrics to Pushgateway + Uptime Kuma. Alerts: `EmailRoundtripFailing` (60m), `EmailRoundtripStale` (60m), `EmailRoundtripNeverRun` (60m). Outbound relay: Brevo EU (`smtp-relay.brevo.com:587`, 300/day free — migrated from Mailgun). Mailserver on dedicated MetalLB IP `10.0.20.202` with `externalTrafficPolicy: Local` for CrowdSec real-IP detection. Vault: `mailgun_api_key` in `secret/viktor` (probe), `brevo_api_key` in `secret/viktor` (relay).
 ## Storage & Backup Architecture
@@ -155,9 +154,9 @@ Choose storage class based on workload type:
 **Default for sensitive data is proxmox-lvm-encrypted.** Use plain `proxmox-lvm` only for non-sensitive workloads. Use NFS when you need RWX, backup pipeline integration, or it's a large shared media library.
-**NFS server:**
-- **Proxmox host** (192.168.1.127): Sole NFS for all workloads. HDD at `/srv/nfs` (ext4 thin LV `pve/nfs-data`, 1TB). SSD at `/srv/nfs-ssd` (ext4 LV `ssd/nfs-ssd-data`, 100GB). Exports use `async,insecure` options (`async` — safe with UPS + Vault Raft replication + databases on block storage; `insecure` — pfSense NATs source ports >1024 between VLANs).
-- **`nfs-truenas` StorageClass**: Historical name retained only because SC names are immutable on PVs (48 bound PVs reference it — renaming would require mass PV churn, not worth it). Now points to the Proxmox host, identical to `nfs-proxmox`. TrueNAS (VM 9000, 10.0.10.15) operationally decommissioned 2026-04-13; VM still exists in stopped state on PVE pending user decision on deletion.
+**NFS servers:**
+- **Proxmox host** (192.168.1.127): Primary NFS for all workloads. HDD at `/srv/nfs` (ext4 thin LV `pve/nfs-data`, 1TB). SSD at `/srv/nfs-ssd` (ext4 LV `ssd/nfs-ssd-data`, 100GB). Exports use `async,insecure` options (`async` — safe with UPS + Vault Raft replication + databases on block storage; `insecure` — pfSense NATs source ports >1024 between VLANs).
+- **TrueNAS** (10.0.10.15): **Immich only** (8 PVCs). `nfs-truenas` StorageClass retained exclusively for Immich. **Migration note**: CSI PV `volumeAttributes` are immutable — cannot update NFS server in place. New PV/PVC pairs required (convention: append `-host` to PV name).
@@ -237,7 +236,7 @@ resource "kubernetes_persistent_volume_claim" "data_encrypted" {
 **Synology layout** (`192.168.1.13:/volume1/Backup/Viki/`):
 - `pve-backup/` — PVC file backups (`pvc-data/`), SQLite backups (`sqlite-backup/`), pfSense, PVE config (synced from sda)
-- `nfs/` — mirrors `/srv/nfs` on Proxmox (inotify change-tracked rsync)
+- `nfs/` — mirrors `/srv/nfs` on Proxmox (inotify change-tracked rsync, renamed from `truenas/`)
 - `nfs-ssd/` — mirrors `/srv/nfs-ssd` on Proxmox (inotify change-tracked rsync)
 **App-level CronJobs** (write to Proxmox host NFS, synced to Synology via inotify):
diff --git a/.claude/agents/payslip-extractor.md b/.claude/agents/payslip-extractor.md
deleted file mode 100644
index 4471421c..00000000
--- a/.claude/agents/payslip-extractor.md
+++ /dev/null
@@ -1,194 +0,0 @@
----
-name: payslip-extractor
-description: "Extract structured UK payslip fields from already-extracted text (preferred) or a base64 PDF (fallback) into strict JSON."
-model: haiku
-allowedTools:
-  - Bash
-  - Read
----
-
-You are a headless payslip-field extractor. You receive a prompt containing a UK payslip (either as pre-extracted text or as a base64-encoded PDF) plus a target JSON schema, and you produce exactly one JSON object that matches the schema.
-
-## Your single job
-
-Given a prompt that contains EITHER:
-- A line `PAYSLIP_TEXT:` followed by already-extracted text (preferred path — use it directly, skip to Step 3).
-- OR a line `PDF_BASE64:` followed by a base64 blob (fallback path — decode then extract text first).
-
-Produce EXACTLY ONE JSON object on stdout matching the schema. No prose. No markdown fences. No preamble. No trailing commentary. The final message content must be a single valid JSON object and nothing else.
-
-## RSU handling (important — Meta UK payslips)
-
-UK payslips for equity-compensated employees (e.g. Meta) report RSU vests as NOTIONAL pay for HMRC reporting only — the broker (Schwab) sells shares to cover US-side withholding but the UK payslip ALSO runs the vest through PAYE via a grossed-up Taxable Pay line. Meta UK template:
-
-- EARNINGS lines: `RSU Tax Offset` (grossed-up vest value) and optionally `RSU Excs Refund` (over-withheld amount returned). SUM BOTH into `rsu_vest`. Other labels seen on non-Meta templates: `RSU Vest`, `Restricted Stock Units`, `Notional Pay`, `GSU Vest`.
-- Meta's template does NOT use a matching offset deduction — `rsu_offset` should be 0. Taxable Pay is grossed up to (Total Payment + rsu_vest) so PAYE already includes the RSU share.
-- For non-Meta templates that DO use an offset (`Shares Retained`, `Notional Pay Offset`), populate `rsu_offset` with the magnitude.
-
-If you see ANY of these lines, do NOT add them to `other_deductions` and do NOT let them count as regular income_tax/NI.
-
-If the payslip has no stock component, leave both as 0.
-
-## Earnings decomposition (v2)
-
-- `salary`: the basic salary/pay line (usually the first "Salary" or "Basic Pay" entry in the Earnings/Payments block).
-- `bonus`: the bonus line (`Perform Bonus`, `Bonus`, `Performance Bonus`). If absent or 0, leave as 0 — that's meaningful signal (bonus-sacrifice months). Don't invent.
-- `pension_sacrifice`: **ABSOLUTE VALUE** of any NEGATIVE pension line in the Payments block (e.g. `AE Pension EE -600.20` → `600.20`). This is salary-sacrifice and is ALREADY subtracted from Total Payment/gross. Do not also put it in `pension_employee`.
-- `pension_employee`: use this ONLY when pension appears as a POSITIVE deduction on the Deductions side (legacy Meta variant A, or non-Meta templates). Never double-count.
-- `taxable_pay`: the "Taxable Pay" line in the summary block, THIS PERIOD column. For Meta this is the post-sacrifice + RSU-grossed-up base that PAYE is computed on. If the payslip doesn't surface a summary block, null.
-- `ytd_tax_paid`, `ytd_taxable_pay`, `ytd_gross`: YTD column values from the same summary block. Null if not present.
-
-## Fast path: PAYSLIP_TEXT is present
-
-If the prompt contains `PAYSLIP_TEXT:`, the caller has already run `pdftotext -layout`. Skip Steps 1-2 entirely — the text is already in your context. Go straight to Step 3.
-
-## Processing steps
-
-### Step 1. Extract and decode the base64 PDF
-
-The prompt will include a line that starts with `PDF_BASE64:` followed by the base64 blob. Decode it to `/tmp/payslip.pdf`.
-
-Preferred method (handles whitespace and very long blobs robustly):
-
-```bash
-python3 - <<'PY'
-import base64, re, pathlib, sys, os
-prompt = os.environ.get("PAYSLIP_PROMPT", "")
-# If the orchestrator didn't set an env var, fall back to reading the transcript via CWD stdin mechanism.
-# In practice the agent receives the prompt in its conversation — you extract the PDF_BASE64 value
-# from the prompt text you were given, strip whitespace, and base64-decode.
-PY
-```
-
-In practice: read the `PDF_BASE64:` value out of the prompt you have been given (you can see the full prompt), then run:
-
-```bash
-python3 -c "
-import base64, sys
-data = sys.stdin.read().strip()
-open('/tmp/payslip.pdf','wb').write(base64.b64decode(data))
-print('decoded bytes:', len(base64.b64decode(data)))
-" <<'B64'
-
-B64
-```
-
-Or pipe via shell `base64 -d`:
-
-```bash
-printf '%s' '' | base64 -d > /tmp/payslip.pdf
-```
-
-Verify the file looks like a PDF:
-
-```bash
-head -c 8 /tmp/payslip.pdf | xxd
-# Expected: 25 50 44 46 2d (i.e. "%PDF-")
-```
-
-### Step 2. Extract text from the PDF
-
-Try tools in this order. Use the first one that works; do not chain all of them.
-
-1. `pdftotext` from `poppler-utils` (preferred — fastest, most reliable on layout-preserving payslips):
-   ```bash
-   pdftotext -layout /tmp/payslip.pdf - 2>/dev/null
-   ```
-
-2. Python `pypdf` fallback:
-   ```bash
-   python3 -c "
-   from pypdf import PdfReader
-   r = PdfReader('/tmp/payslip.pdf')
-   for p in r.pages:
-       print(p.extract_text() or '')
-   "
-   ```
-
-3. Python `pdfplumber` fallback:
-   ```bash
-   python3 -c "
-   import pdfplumber
-   with pdfplumber.open('/tmp/payslip.pdf') as pdf:
-       for page in pdf.pages:
-           print(page.extract_text() or '')
-   "
-   ```
-
-4. If none of those are installed, check what IS available:
-   ```bash
-   which pdftotext pdf2txt.py mutool
-   python3 -c "import pypdf, pdfplumber, pdfminer" 2>&1
-   ```
-   and use whatever you find (e.g. `mutool draw -F txt`).
-
-If every text-extraction tool fails, emit the failure JSON (see "Failure mode" below).
-
-### Step 3. Parse the extracted text
-
-UK payslips are laid out in a few common templates (Sage, Iris, QuickBooks, Xero, in-house ADP/Workday layouts). Common landmarks:
-
-- "Pay Date" / "Payment Date" / "Date Paid" — the date wages hit the account. Usually at the top or in a header box.
-- "Tax Period" / "Period" / "Month" — e.g. "Month 1", "Week 12".
-- Two numeric columns per line: "This Period" (or "Amount", "Current") and "Year to Date" (or "YTD"). **Always take the This Period column**, never YTD.
-- Payments / Earnings block: "Basic Pay", "Salary", "Bonus", "Overtime", "Commission", "Holiday Pay".
-- Deductions block: "Income Tax" / "PAYE", "National Insurance" / "NI" / "NIC", "Pension" / "Pension Contribution" / "Salary Sacrifice Pension", "Student Loan" / "SL", optional: "Union Dues", "Charity", "Season Ticket Loan", "Private Medical", etc.
-- "Gross Pay" / "Total Gross" — sum of payments.
-- "Net Pay" / "Take Home" / "Amount Payable" — the money actually paid.
-- "Tax Code" — e.g. "1257L", "BR", "D0", "NT".
-- "NI Number" / "National Insurance Number" — `AA123456A` format. Never invent one.
-- "Employer" / "Company" — usually in the letterhead. "Employee" / "Name".
-- Currency: almost always GBP / "£" for UK payslips. If the PDF is not in GBP or not a UK payslip, still return the numbers as-is but include a best-effort `currency` field.
-
-### Step 4. Map to the schema and emit JSON
-
-Rules that apply regardless of the caller's exact schema:
-
-- **Dates**: `pay_date` MUST be `YYYY-MM-DD`. If the PDF prints `12/03/2026`, interpret as `DD/MM/YYYY` (UK format) → `2026-03-12`. If ambiguous (`01/02/2026`), prefer UK ordering. If impossible to determine a year, use the pay_period year.
-- **Money fields**: emit as JSON numbers, not strings. Two decimal places are acceptable (`2450.17`). Strip `£`, commas, and trailing spaces. Negative values stay negative.
-- **Missing numeric fields**: emit `0` (zero), not `null`, not an empty string, not `"N/A"`.
-- **`other_deductions`**: an object mapping `{ "