Compare commits

...

3 commits

Author SHA1 Message Date
cc56ba2939 [payslip-ingest] Move Payslips datasource 'database' into jsonData
Grafana 11.2+ Postgres plugin reads the DB name from jsonData.database
(see grafana/grafana#112418). The top-level 'database' field is silently
ignored by the frontend — datasource health checks and POST /api/ds/query
still work because the backend honors it, but every dashboard panel fails
with 'you do not have default database'.

Rolling back to the supported shape fixes rendering for all 4 uk-payslip
panels.
2026-04-18 23:23:07 +00:00
Viktor Barzin
f6cff262f0 broker-sync: chown fidelity_storage_state to broker uid in init container
## Context

First end-to-end test of the broker-sync-fidelity CronJob failed with
`PermissionError: [Errno 13] Permission denied:
'/data/fidelity_storage_state.json'`. Init container runs as root (uid
0) but the broker-sync container runs as uid 10001; chmod 600 without
chown made the file unreadable from the main container.

## This change

Added `chown 10001:10001` before the existing `chmod 600` in the
`stage-storage-state` init container command. Init container has
CAP_CHOWN by default as root, so this succeeds.

## Verification

$ kubectl apply -f test-pod.yaml   # same init + main pattern
$ kubectl logs fidelity-debug -c broker-sync
...
broker_sync.providers.fidelity_planviewer.FidelitySessionError:
    PlanViewer session stale — run `broker-sync fidelity-seed`

Init container succeeded + main container read the file + Playwright
launched Chromium + navigated to PlanViewer + hit the 15-min idle page
→ exactly the intended behaviour for a stale session. Next step
(out-of-band): Viktor paste a fresh SMS OTP and re-seed via
fidelity-seed on Viktor's laptop or the existing chat-driven flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:22:43 +00:00
Viktor Barzin
43254ccd3f [infra] Add Woodpecker pipeline to deploy PVE /etc/exports (Wave 6b)
## Context

Wave 6b of the state-drift consolidation plan. `scripts/pve-nfs-exports` is
the git-managed source of truth for the Proxmox host's NFS export table
(file header documents this since the 2026-04-14 NFS outage post-mortem).
Deploying it was runbook-only — `scp` then `ssh ... exportfs -ra` — which
means a change could sit unpushed-to-PVE indefinitely, and nothing alerted
on divergence between git and host.

Wave 6b closes that loop: a Woodpecker pipeline watches
`scripts/pve-nfs-exports` on the `master` branch, diffs against the
current host file, and scp's the new content followed by `exportfs -ra`.
The same 2-shell-command runbook, now a CI step that runs on every push
and is manually triggerable.

## This change

- New pipeline `.woodpecker/pve-nfs-exports-sync.yml` — path-filtered push
  trigger + manual.
- SSH credentials provisioned 2026-04-18:
  - ed25519 keypair `woodpecker-pve-nfs-exports-sync`
  - Public key in `root@192.168.1.127:~/.ssh/authorized_keys`
  - Private key in Vault `secret/woodpecker/pve_ssh_key` (plus known_hosts
    entry for deterministic host-key pinning from Vault)
  - Woodpecker repo-level secret `pve_ssh_key` (id 139) bound to the infra
    repo's `push`/`manual`/`cron` events
- Pipeline steps: install openssh + curl (alpine image) → stage private
  key from secret → ssh-keyscan the PVE host into known_hosts → diff
  current vs. proposed exports (shown in pipeline log) → scp → exportfs
  -ra → Slack notify status.

## What is NOT in this change

- Drift detection (git-truth vs. host-truth) via cron: this pipeline only
  fires on *push*, so a host-side edit wouldn't be caught. Could add a
  daily cron that just runs the diff step and alerts if non-empty. Left
  as a refinement if drift becomes an issue.
- Pulling known_hosts from Vault rather than ssh-keyscan on each run: the
  keyscan is simpler and works against key rotation without needing a
  Vault round-trip. Pulling from Vault is the right answer the moment we
  add MITM risk, which the internal network doesn't have today.

## Reproduce locally
Edit `scripts/pve-nfs-exports`, push to master. Watch the pipeline in
Woodpecker. Verify on PVE: `ssh root@192.168.1.127 "md5sum /etc/exports"`
matches `md5sum scripts/pve-nfs-exports` in the repo.

Closes: code-dne

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:21:36 +00:00
3 changed files with 78 additions and 8 deletions

View file

@ -0,0 +1,63 @@
# Sync infra/scripts/pve-nfs-exports → PVE host /etc/exports on change.
#
# Wave 6b of the state-drift consolidation plan: move the "scp + exportfs -ra"
# deploy step out of runbook-human-hands and into CI so the Proxmox NFS export
# table tracks git.
#
# Trigger: push to master that touches `scripts/pve-nfs-exports`. The file
# header documents the deploy invocation; this pipeline codifies it.
#
# Credentials:
# - pve_ssh_key: Woodpecker repo-secret (ed25519 keypair provisioned
# 2026-04-18 as `woodpecker-pve-nfs-exports-sync`). Public key lives in
# /root/.ssh/authorized_keys on the PVE host. Private key mirrored in
# Vault `secret/woodpecker/pve_ssh_key` for recovery.
when:
- event: push
branch: master
path: scripts/pve-nfs-exports
- event: manual
clone:
git:
image: woodpeckerci/plugin-git
settings:
depth: 1
attempts: 3
steps:
- name: deploy
image: alpine:3.20
environment:
PVE_SSH_KEY:
from_secret: pve_ssh_key
SLACK_WEBHOOK:
from_secret: slack_webhook
commands:
- apk add --no-cache openssh-client curl
- mkdir -p ~/.ssh && chmod 700 ~/.ssh
- printf '%s\n' "$PVE_SSH_KEY" > ~/.ssh/id_ed25519
- chmod 600 ~/.ssh/id_ed25519
# Pin host key — CI's ~/.ssh/known_hosts is ephemeral, so accept-new on first pull.
- ssh-keyscan -t ed25519 192.168.1.127 >> ~/.ssh/known_hosts 2>/dev/null
# Diff what we'd ship, so pipeline logs show the intended change.
- echo '---diff---' && ssh -o BatchMode=yes root@192.168.1.127 "cat /etc/exports" > /tmp/remote.exports || true
- diff -u /tmp/remote.exports scripts/pve-nfs-exports || true
- echo '---applying---'
- scp -o BatchMode=yes scripts/pve-nfs-exports root@192.168.1.127:/etc/exports
- ssh -o BatchMode=yes root@192.168.1.127 "exportfs -ra && exportfs -s | head -5"
- echo '---done---'
- name: slack
image: curlimages/curl:8.11.0
environment:
SLACK_WEBHOOK:
from_secret: slack_webhook
commands:
- |
curl -s -X POST -H 'Content-type: application/json' \
--data "{\"channel\":\"general\",\"text\":\"PVE /etc/exports sync: ${CI_PIPELINE_STATUS}\"}" \
"$SLACK_WEBHOOK" || true
when:
status: [success, failure]

View file

@ -669,7 +669,9 @@ resource "kubernetes_cron_job_v1" "fidelity" {
spec {
restart_policy = "OnFailure"
# Materialise the JSON storage_state from the projected Secret
# onto the PVC where Playwright expects to read it.
# onto the PVC where Playwright expects to read it. Init container
# runs as root; the main broker-sync container runs as uid 10001,
# so we chown+chmod 600 to grant read access to the broker user.
init_container {
name = "stage-storage-state"
image = "busybox:1.36"
@ -677,6 +679,7 @@ resource "kubernetes_cron_job_v1" "fidelity" {
set -eu
mkdir -p /data
cp /secrets/fidelity_storage_state /data/fidelity_storage_state.json
chown 10001:10001 /data/fidelity_storage_state.json
chmod 600 /data/fidelity_storage_state.json
EOT
]

View file

@ -312,14 +312,18 @@ resource "kubernetes_config_map" "grafana_payslips_datasource" {
"payslips-datasource.yaml" = yamlencode({
apiVersion = 1
datasources = [{
name = "Payslips"
type = "postgres"
access = "proxy"
url = "${var.postgresql_host}:5432"
database = "payslip_ingest"
user = "payslip_ingest"
uid = "payslips-pg"
name = "Payslips"
type = "postgres"
access = "proxy"
url = "${var.postgresql_host}:5432"
user = "payslip_ingest"
uid = "payslips-pg"
# Grafana 11.2+ Postgres plugin reads the DB name from jsonData.database;
# the top-level `database` field is silently ignored by the frontend and
# triggers "you do not have default database" on every panel.
# See github.com/grafana/grafana#112418.
jsonData = {
database = "payslip_ingest"
sslmode = "disable"
postgresVersion = 1600
timescaledb = false