infra/docs/post-mortems
Viktor Barzin 42961a5f58 [registry] fix-broken-blobs.sh — check revision-link, not blob data
The original index-child scan checked whether the child's blob data
file existed under /blobs/sha256/<child>/data. That is subtly wrong:
registry:2 serves a per-repo manifest via the link file at
<repo>/_manifests/revisions/sha256/<child-digest>/link, NOT by blob
presence. When cleanup-tags.sh rmtrees a tag, the per-repo revision
links for its index's children also disappear, but the blob data
survives (GC owns that, and runs weekly). Result: blob present,
link absent, API 404 on HEAD. This is the exact 2026-04-19 failure mode.
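
A minimal sketch of the corrected check, for illustration only. It is
not the code in fix-broken-blobs.sh. It assumes the usual registry:2
filesystem layout under docker/registry/v2, with blobs sharded by the
first two hex characters of the digest. The function name, the state
labels and the root argument are hypothetical.

```python
from pathlib import Path

# Assumed storage prefix of the registry:2 filesystem driver (not taken
# from fix-broken-blobs.sh).
V2 = Path("docker/registry/v2")


def child_state(root, repo, digest_hex):
    """Classify one index child by revision link AND blob data presence."""
    base = Path(root) / V2
    # Per-repo revision link: this is what the registry consults to serve
    # the manifest for this repository.
    link = (base / "repositories" / repo / "_manifests" / "revisions"
            / "sha256" / digest_hex / "link")
    # Global blob data: assumed to be sharded by the first two hex chars.
    data = base / "blobs" / "sha256" / digest_hex[:2] / digest_hex / "data"

    has_link, has_data = link.is_file(), data.is_file()
    if has_link and has_data:
        return "ok"
    if has_data:
        return "orphan"       # blob present, link absent: the case described above
    if has_link:
        return "broken-link"  # link present, blob data gone
    return "missing"
```

A scan that only tests the blob data path would treat the "orphan"
state as healthy, which matches the false negative described here.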

Live proof: the registry-integrity-probe CronJob just found 38 real
orphan children (including 98f718c8 from the original incident) while
the previous fix-broken-blobs.sh scan reported 0. After the fix, both
tools agree. The probe had been authoritative all along; the scan
returned a false negative because it was asking the wrong question.

Post-mortem updated to reflect the true mechanism (link-file absence,
not blob deletion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:43:35 +00:00
2026-03-16-kured-containerd-cascade-outage.html docs: consolidate all post-mortems under docs/post-mortems/ 2026-04-14 08:24:36 +00:00
2026-03-16-nfs-csi-cascade-failure.md docs: move post-mortems to docs/post-mortems/ 2026-04-14 08:20:09 +00:00
2026-04-14-nfs-fsid0-dns-vault-outage.md docs: update post-mortem follow-up implementation [PM-2026-04-14] [ci skip] 2026-04-14 18:09:11 +00:00
2026-04-14-postmortem-pipeline-test.md fix: use full path to claude CLI for non-interactive SSH 2026-04-14 17:44:50 +00:00
2026-04-18-authentik-outpost-shm-full.md [docs] post-mortem: clarify the sizeLimit vs container memory limit gotcha 2026-04-18 13:23:14 +00:00
2026-04-19-registry-orphan-index.md [registry] fix-broken-blobs.sh — check revision-link, not blob data 2026-04-19 17:43:35 +00:00
index.html docs: consolidate all post-mortems under docs/post-mortems/ 2026-04-14 08:24:36 +00:00