parser + P60 ingest: split income_tax cash/RSU, add P60 ground-truth

Meta variant-B payslips gross up Taxable Pay for RSU and compute PAYE on
the grossed-up figure, so `income_tax` on the slip is the total PAYE
(cash + RSU-attributed). Dashboards that stacked the raw figure made
vest-month tax look ~2x higher than "cash tax paid". Introduce
`cash_income_tax = income_tax * (gross_pay - pension_sacrifice) /
taxable_pay` as a derived column alongside the raw figure. Dashboards
can now stack cash vs RSU-attributed tax as separate segments.
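
The pro-rata split above can be sketched as follows. This is a minimal illustration, not the parser's actual `_cash_income_tax`: the function name, the rounding to pence with `ROUND_HALF_UP`, and the type hints are assumptions. The argument order and the fallback to the raw figure when `taxable_pay` is missing or zero mirror what the tests in this commit exercise.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

PENNY = Decimal("0.01")


def cash_income_tax_sketch(
    income_tax: Decimal,
    gross_pay: Decimal,
    pension_sacrifice: Decimal,
    taxable_pay: Optional[Decimal],
) -> Decimal:
    """Attribute a share of total PAYE to cash pay (hypothetical helper)."""
    # Without a usable taxable_pay there is nothing to pro-rate against,
    # so fall back to the raw figure (covers both None and Decimal("0")).
    if not taxable_pay:
        return income_tax
    cash_share = (gross_pay - pension_sacrifice) / taxable_pay
    # Rounding to pence is an assumption made for this sketch.
    return (income_tax * cash_share).quantize(PENNY, rounding=ROUND_HALF_UP)


# Variant A figures quoted in the test docstring of this commit.
cash = cash_income_tax_sketch(
    Decimal("1410.07"), Decimal("5095.86"), Decimal("152.90"), Decimal("5095.86")
)
rsu_attributed = Decimal("1410.07") - cash  # the other stacked segment
```

The RSU-attributed segment is simply the remainder, so the two segments always sum back to the raw `income_tax`.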

Also capture YTD column values of `RSU Tax Offset` and `RSU Excs Refund`
from the Payments grid — needed for reconciliation against HMRC annual
figures.
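
As a rough sketch of picking the YTD column for a labelled row: the layout assumed here (label, this-period amount, YTD amount on one line) and the sample figures are hypothetical, and the real Payments grid extraction may look different. `ytd_amount` is an illustrative name, not a function from this repository.

```python
import re
from decimal import Decimal
from typing import Optional

AMOUNT = r"-?[\d,]+\.\d{2}"


def ytd_amount(text: str, label: str) -> Optional[Decimal]:
    """Return the second (YTD) amount on the row starting with `label`."""
    pattern = re.compile(
        rf"^[ \t]*{re.escape(label)}[ \t]+({AMOUNT})[ \t]+({AMOUNT})[ \t]*$",
        re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None  # row absent: leave the nullable column empty
    return Decimal(match.group(2).replace(",", ""))


# Hypothetical grid text, invented for illustration only.
SAMPLE = (
    "Salary            5,000.00   15,000.00\n"
    "RSU Tax Offset    1,200.00    3,400.50\n"
    "RSU Excs Refund      10.00       25.75\n"
)
offset_ytd = ytd_amount(SAMPLE, "RSU Tax Offset")
refund_ytd = ytd_amount(SAMPLE, "RSU Excs Refund")
```

Returning `None` for a missing row matches the nullable columns added in migration 0004.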

P60 ingest: new parser under `parsers/p60.py` anchoring on statutory
HMRC line labels (`Tax year to 5 April YYYY`, `Employer PAYE reference`,
`In this employment` pay/tax row, NI letter bands). Processor routes
documents carrying the `p60` Paperless tag to `_handle_p60` which
writes to the new `payslip_ingest.p60_reference` table (one row per
tax_year+employer). The app lifespan resolves the tag id at startup; a
missing tag disables P60 dispatch without breaking payslip ingest.
Paperless tag creation and webhook config are manual follow-ups.
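
The routing rule described above might look roughly like this. It is a sketch under assumptions: `resolve_tag_id`, `choose_handler`, and the name-to-id mapping are hypothetical stand-ins, and the real processor and Paperless client are not shown in this commit view.

```python
from typing import Iterable, Mapping, Optional


def resolve_tag_id(tags_by_name: Mapping[str, int], name: str = "p60") -> Optional[int]:
    """Look up the tag id once at startup; None means the tag does not exist."""
    return tags_by_name.get(name)


def choose_handler(doc_tag_ids: Iterable[int], p60_tag_id: Optional[int]) -> str:
    """Pick a handler name for a document based on its tag ids."""
    # With no resolved tag id, P60 dispatch is disabled and every
    # document falls through to the payslip path.
    if p60_tag_id is not None and p60_tag_id in set(doc_tag_ids):
        return "p60"
    return "payslip"


p60_id = resolve_tag_id({"payslip": 3, "p60": 7})
missing_id = resolve_tag_id({"payslip": 3})
```

Resolving the id once and passing `None` through keeps the "tag not created yet" case out of the per-document path.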

Migrations:
- 0004 — cash_income_tax + ytd_rsu_tax_offset + ytd_rsu_excs_refund on
  payslip, all nullable.
- 0005 — p60_reference table with (tax_year, employer) unique +
  paperless_doc_id unique for idempotent re-uploads.
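
To illustrate why the two unique constraints make re-uploads idempotent, here is a small sketch using an in-memory SQLite table as a stand-in. The real migration targets `payslip_ingest.p60_reference`; the column types, the integer `tax_year`, and the `INSERT OR IGNORE` strategy are assumptions made for this example only.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a stand-in table with the same two uniqueness rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE p60_reference (
            id INTEGER PRIMARY KEY,
            tax_year INTEGER NOT NULL,
            employer TEXT NOT NULL,
            paperless_doc_id INTEGER NOT NULL UNIQUE,
            UNIQUE (tax_year, employer)
        )
        """
    )
    return conn


def record_p60(conn: sqlite3.Connection, tax_year: int, employer: str, doc_id: int) -> bool:
    """Insert a row; return False when either unique constraint already holds it."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO p60_reference (tax_year, employer, paperless_doc_id) "
        "VALUES (?, ?, ?)",
        (tax_year, employer, doc_id),
    )
    conn.commit()
    return cur.rowcount == 1


conn = make_db()
first = record_p60(conn, 2025, "Meta", 101)    # new row
replay = record_p60(conn, 2025, "Meta", 101)   # same document delivered again
dup_year = record_p60(conn, 2025, "Meta", 102) # second document, same year+employer
row_count = conn.execute("SELECT COUNT(*) FROM p60_reference").fetchone()[0]
```

Whether a second document for the same year should be ignored or should update the existing row is a design choice this sketch does not settle.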

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-04-19 15:23:05 +00:00
parent d91f34ddb4
commit 26e43b1055
14 changed files with 644 additions and 15 deletions


@@ -38,6 +38,15 @@ def test_parses_variant_b_modern() -> None:
    assert result.ytd_taxable_pay == Decimal("373601.64")
    assert result.ytd_gross == Decimal("232630.34")

    # Derived cash-only PAYE: income_tax * (gross - pension_sacrifice) / taxable_pay
    # = 31311.90 * 39282.69 / 72096.92 = 17060.59 (vs 31311.90 total PAYE)
    assert result.cash_income_tax is not None
    assert abs(result.cash_income_tax - Decimal("17060.59")) <= Decimal("0.02")

    # YTD column of RSU lines in the Payments grid
    assert result.ytd_rsu_tax_offset == Decimal("124674.27")
    assert result.ytd_rsu_excs_refund == Decimal("3221.32")


def test_parses_variant_b_with_bonus() -> None:
    """March 2025 — variant B, bonus + RSU + multiple other deductions."""
@@ -145,6 +154,28 @@ def test_parses_variant_a_2021_08() -> None:
    assert result.taxable_pay == Decimal("15323.16")


def test_cash_income_tax_falls_back_when_taxable_pay_missing() -> None:
    """When taxable_pay is None or zero, cash_income_tax == income_tax (no RSU grossing)."""
    from payslip_ingest.parsers.meta_uk import _cash_income_tax

    assert _cash_income_tax(Decimal("1000"), Decimal("5000"), Decimal("100"),
                            None) == Decimal("1000")
    assert _cash_income_tax(Decimal("1000"), Decimal("5000"), Decimal("100"),
                            Decimal("0")) == Decimal("1000")


def test_variant_a_cash_income_tax_pro_rata() -> None:
    """Variant A fixture with taxable_pay → cash_income_tax is pro-rata.

    2021-06 has taxable_pay=5095.86 (= gross_pay), pension_sacrifice=152.90,
    income_tax=1410.07, so cash_income_tax = 1410.07 * 4942.96 / 5095.86 = 1367.76.
    """
    result = parse_meta_uk(_load("meta_uk_2021_06_variant_a_bik.txt"))
    assert result.taxable_pay == Decimal("5095.86")
    assert result.cash_income_tax is not None
    assert abs(result.cash_income_tax - Decimal("1367.76")) <= Decimal("0.02")


def test_raises_on_non_meta_payslip() -> None:
    with pytest.raises(ParserError):
        parse_meta_uk("This is not a Meta payslip\nRandom text\n")