[payslip-ingest] Update extractor agent + dashboard for v2 regex parser

## Context

Companion change to payslip-ingest v2 (regex parser + accurate RSU tax
attribution). The Grafana dashboard now has 4 more panels powered by the
new earnings-decomposition and YTD-snapshot columns, and the Claude
fallback agent's prompt is aligned with the new schema so non-Meta
payslips still land with the full field set.

## This change

### `.claude/agents/payslip-extractor.md`

Rewrites the RSU handling section to match Meta UK's actual template
(rsu_vest = "RSU Tax Offset" + "RSU Excs Refund", no matching
rsu_offset deduction — PAYE uses grossed-up Taxable Pay instead).
Adds a new "Earnings decomposition (v2)" section telling the fallback
agent how to populate salary/bonus/pension_sacrifice/taxable_pay/ytd_*
and when to use pension_employee vs pension_sacrifice without
double-counting.

### `stacks/monitoring/modules/monitoring/dashboards/uk-payslip.json`

- **Panel 4 (Effective rate)** — SQL switched from the naive
  `(income_tax + NIC) / cash_gross` to the YTD-effective-rate
  method: `cash_tax = income_tax - rsu_vest × (ytd_tax_paid /
  ytd_taxable_pay)`. Title updated to "YTD-corrected" so the
  change is discoverable.
- **Panel 5 (Table)** — adds salary, bonus, pension_sacrifice,
  taxable_pay columns so row-level debugging against the parser
  output is trivial.
- **+Panel 8 (Earnings breakdown)** — monthly stacked bars of
  salary / bonus / rsu_vest / -pension_sacrifice. Bonus-sacrifice
  months show up as a massive negative pension_sacrifice spike
  paired with a near-zero bonus bar.
- **+Panel 9 (Accurate cash tax rate)** — timeseries of
  cash_tax_rate_ytd vs naive_tax_rate. Divergence is the RSU
  contribution the payslip hides in the single `Tax paid` line.
- **+Panel 10 (All-in compensation)** — stacked bars of cash_gross
  + rsu_vest per payslip.
- **+Panel 11 (YTD cumulative cash gross vs total comp)** — two
  lines partitioned by tax_year; the gap between them is the RSU
  contribution YTD.

Total panels go from 7 → 11.

## Test Plan

### Automated

Dashboard JSON validity:
```
$ python3 -m json.tool uk-payslip.json > /dev/null && echo ok
ok
```

### Manual Verification

After applying `stacks/monitoring/`:
1. `https://grafana.viktorbarzin.me/d/uk-payslip` loads with 11 panels
2. Bonus-sacrifice months (e.g. March 2024 if present in data) show the
   negative pension_sacrifice bar in panel 8
3. Panel 9 "Accurate cash effective tax rate" shows the
   cash_tax_rate_ytd line sitting ~10-15pp below naive_tax_rate in
   RSU-vest months

## Reproduce locally

1. `cd infra/stacks/monitoring && terragrunt plan`
2. Expected: ConfigMap diff on the payslip dashboard with the new panel
   JSON
3. `terragrunt apply` — Grafana reloads the dashboard automatically
   (configmap-reload sidecar)

Relates to: payslip-ingest commit 9741816

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-04-19 10:54:33 +00:00
parent c6784f87b5
commit 973f549810
2 changed files with 261 additions and 7 deletions

View file

@ -19,15 +19,25 @@ Produce EXACTLY ONE JSON object on stdout matching the schema. No prose. No mark
## RSU handling (important — Meta UK payslips)
UK payslips for equity-compensated employees (e.g. Meta) report RSU vests as NOTIONAL pay for HMRC reporting only — the actual share grant + tax is handled by the broker (Schwab), which sells shares to cover withholding. On the payslip:
UK payslips for equity-compensated employees (e.g. Meta) report RSU vests as NOTIONAL pay for HMRC reporting only — the broker (Schwab) sells shares to cover US-side withholding but the UK payslip ALSO runs the vest through PAYE via a grossed-up Taxable Pay line. Meta UK template:
- An EARNINGS line appears with labels like `RSU Vest`, `Restricted Stock Units`, `Stock Value`, `Notional Pay`, `Share Award`, `GSU Vest`, `Equity Vest` → populate `rsu_vest`.
- A DEDUCTION line of equal-or-similar magnitude nets it back out. Labels: `Shares Retained`, `Stock Tax Withholding`, `RSU Offset`, `Notional Pay Offset`, `Shares Withheld` → populate `rsu_offset`.
- EARNINGS lines: `RSU Tax Offset` (grossed-up vest value) and optionally `RSU Excs Refund` (over-withheld amount returned). SUM BOTH into `rsu_vest`. Other labels seen on non-Meta templates: `RSU Vest`, `Restricted Stock Units`, `Notional Pay`, `GSU Vest`.
- Meta's template does NOT use a matching offset deduction — `rsu_offset` should be 0. Taxable Pay is grossed up to (Total Payment + rsu_vest) so PAYE already includes the RSU share.
- For non-Meta templates that DO use an offset (`Shares Retained`, `Notional Pay Offset`), populate `rsu_offset` with the magnitude.
If you see either line, populate BOTH fields. Do NOT add them to `other_deductions` and do NOT let them count as regular income_tax/NI even though some templates put them near the tax block. They exist for reporting.
If you see ANY of these lines, do NOT add them to `other_deductions` and do NOT let them count as regular income_tax/NI.
If the payslip has no stock component, leave both as 0.
## Earnings decomposition (v2)
- `salary`: the basic salary/pay line (usually the first "Salary" or "Basic Pay" entry in the Earnings/Payments block).
- `bonus`: the bonus line (`Perform Bonus`, `Bonus`, `Performance Bonus`). If absent or 0, leave as 0 — that's meaningful signal (bonus-sacrifice months). Don't invent.
- `pension_sacrifice`: **ABSOLUTE VALUE** of any NEGATIVE pension line in the Payments block (e.g. `AE Pension EE -600.20``600.20`). This is salary-sacrifice and is ALREADY subtracted from Total Payment/gross. Do not also put it in `pension_employee`.
- `pension_employee`: use this ONLY when pension appears as a POSITIVE deduction on the Deductions side (legacy Meta variant A, or non-Meta templates). Never double-count.
- `taxable_pay`: the "Taxable Pay" line in the summary block, THIS PERIOD column. For Meta this is the post-sacrifice + RSU-grossed-up base that PAYE is computed on. If the payslip doesn't surface a summary block, null.
- `ytd_tax_paid`, `ytd_taxable_pay`, `ytd_gross`: YTD column values from the same summary block. Null if not present.
## Fast path: PAYSLIP_TEXT is present
If the prompt contains `PAYSLIP_TEXT:`, the caller has already run `pdftotext -layout`. Skip Steps 1-2 entirely — the text is already in your context. Go straight to Step 3.

View file

@ -288,7 +288,7 @@
},
{
"id": 4,
"title": "Effective rate & take-home % (cash-basis)",
"title": "Effective rate & take-home % (cash-basis, YTD-corrected)",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
@ -361,7 +361,7 @@
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawSql": "SELECT pay_date AS \"time\", ROUND(((income_tax + national_insurance)::numeric / NULLIF(gross_pay - rsu_vest, 0)) * 100, 2) AS \"effective_rate_pct\", ROUND((net_pay::numeric / NULLIF(gross_pay - rsu_vest, 0)) * 100, 2) AS \"take_home_pct\" FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date",
"rawSql": "SELECT pay_date AS \"time\", ROUND((((income_tax - (rsu_vest * COALESCE(ytd_tax_paid / NULLIF(ytd_taxable_pay, 0), 0))) + national_insurance)::numeric / NULLIF(gross_pay - rsu_vest, 0)) * 100, 2) AS \"effective_rate_pct\", ROUND((net_pay::numeric / NULLIF(gross_pay - rsu_vest, 0)) * 100, 2) AS \"take_home_pct\" FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date",
"format": "time_series",
"refId": "A",
"rawQuery": true,
@ -540,7 +540,7 @@
"rawQuery": true,
"editorMode": "code",
"format": "table",
"rawSql": "SELECT pay_date, employer, tax_year, gross_pay, (gross_pay - rsu_vest) AS cash_gross, rsu_vest, rsu_offset, income_tax, national_insurance, pension_employee, pension_employer, student_loan, COALESCE(other_deductions, '{}'::jsonb) AS other_deductions, net_pay, validated, paperless_doc_id FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date DESC"
"rawSql": "SELECT pay_date, employer, tax_year, gross_pay, (gross_pay - rsu_vest) AS cash_gross, salary, bonus, rsu_vest, rsu_offset, pension_sacrifice, taxable_pay, income_tax, national_insurance, pension_employee, pension_employer, student_loan, COALESCE(other_deductions, '{}'::jsonb) AS other_deductions, net_pay, validated, paperless_doc_id FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date DESC"
}
]
},
@ -662,6 +662,250 @@
"rawSql": "SELECT pay_date AS \"time\", rsu_vest FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) AND rsu_vest > 0 ORDER BY pay_date"
}
]
},
{
"id": 8,
"title": "Earnings breakdown (salary / bonus / RSU / pension sacrifice)",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 49
},
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "bars",
"stacking": {
"mode": "normal"
},
"fillOpacity": 80,
"lineWidth": 0,
"axisCenteredZero": true
},
"unit": "currencyGBP",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"options": {
"legend": {
"showLegend": true,
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT pay_date AS \"time\", salary, bonus, rsu_vest, -pension_sacrifice AS pension_sacrifice FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date"
}
]
},
{
"id": 9,
"title": "Accurate cash effective tax rate (YTD-method vs naive)",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 49
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"unit": "percent",
"min": 0,
"max": 80,
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 10
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"options": {
"legend": {
"showLegend": true,
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT pay_date AS \"time\", ROUND(((income_tax - (rsu_vest * COALESCE(ytd_tax_paid / NULLIF(ytd_taxable_pay, 0), 0)))::numeric / NULLIF(gross_pay - rsu_vest, 0)) * 100, 2) AS \"cash_tax_rate_ytd\", ROUND((income_tax::numeric / NULLIF(gross_pay, 0)) * 100, 2) AS \"naive_tax_rate\" FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date"
}
]
},
{
"id": 10,
"title": "All-in compensation (cash + RSU market value)",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 58
},
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "bars",
"stacking": {
"mode": "normal"
},
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "currencyGBP",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
}
]
}
},
"overrides": []
},
"options": {
"legend": {
"showLegend": true,
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT pay_date AS \"time\", (gross_pay - rsu_vest) AS cash_gross, rsu_vest FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date"
}
]
},
{
"id": 11,
"title": "YTD cumulative cash gross vs total comp",
"type": "timeseries",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 58
},
"fieldConfig": {
"defaults": {
"custom": {
"drawStyle": "line",
"lineWidth": 2,
"fillOpacity": 10
},
"unit": "currencyGBP",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"options": {
"legend": {
"showLegend": true,
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "payslips-pg"
},
"rawQuery": true,
"editorMode": "code",
"format": "time_series",
"rawSql": "SELECT pay_date AS \"time\", SUM(gross_pay - rsu_vest) OVER (PARTITION BY tax_year ORDER BY pay_date) AS ytd_cash_gross, SUM(gross_pay) OVER (PARTITION BY tax_year ORDER BY pay_date) AS ytd_total_comp FROM payslip_ingest.payslip WHERE $__timeFilter(pay_date) ORDER BY pay_date"
}
]
}
],
"refresh": "5m",