infra/modules/docker-registry
Viktor Barzin 42961a5f58 [registry] fix-broken-blobs.sh — check revision-link, not blob data
The original index-child scan checked if the child's blob data file
existed under /blobs/sha256/<child>/data. That's wrong in a subtle
way: registry:2 serves a per-repo manifest via the link file at
<repo>/_manifests/revisions/sha256/<child-digest>/link, NOT by blob
presence. When cleanup-tags.sh rmtrees a tag, the per-repo revision
links for its index's children also disappear — but the blob data
survives (GC owns that, and runs weekly). Result: blob present,
link absent, API 404 on HEAD — the exact 2026-04-19 failure mode.
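
As a minimal sketch of the corrected check (not the actual code in
fix-broken-blobs.sh), the test below asks whether the per-repo revision
link exists rather than whether the blob data exists. The helper names
child_is_linked and list_orphan_children, and the convention of passing
the repository directory as the first argument, are assumptions made
for illustration only.

```shell
#!/bin/sh
# Sketch only: helper names and arguments are illustrative, not taken from
# fix-broken-blobs.sh.

# Succeeds if the repo holds a revision link for the given manifest digest.
# $1 = repository directory (the <repo> in the path above)
# $2 = digest, with or without the "sha256:" prefix
child_is_linked() {
    repo_dir=$1
    hex=${2#sha256:}
    [ -f "$repo_dir/_manifests/revisions/sha256/$hex/link" ]
}

# Prints every digest among the remaining arguments that has no revision
# link in the repo given as $1. Blob presence is deliberately not consulted.
list_orphan_children() {
    repo_dir=$1
    shift
    for digest in "$@"; do
        child_is_linked "$repo_dir" "$digest" || printf '%s\n' "$digest"
    done
}
```

A caller would pass the child digests read from an index manifest, and
any digest printed is one the registry would answer with 404 even if
its blob data is still on disk.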

Live proof: the registry-integrity-probe CronJob just found 38 real
orphan children (including 98f718c8 from the original incident) while
the previous fix-broken-blobs.sh scan reported 0. After the fix, both
tools agree. The probe had been authoritative all along; the scan
produced false negatives because it asked the wrong question.

Post-mortem updated to reflect the true mechanism (link-file absence,
not blob deletion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:43:35 +00:00
cleanup-tags.sh [ci skip] Rebuild docker-registry with nginx serialization on all ports 2026-02-22 21:45:53 +00:00
config-private.yml add htpasswd auth to private docker registry + expose at registry.viktorbarzin.me 2026-03-22 22:10:10 +02:00
config-proxy.yaml.tpl registry: set proxy TTL to 0 to prevent stale :latest images 2026-03-30 00:02:48 +03:00
config.yaml registry: set proxy TTL to 0 to prevent stale :latest images 2026-03-30 00:02:48 +03:00
docker-compose.yml [registry] Stop recurring orphan OCI-index incidents — detection + prevention + recovery 2026-04-19 17:08:28 +00:00
fix-broken-blobs.sh [registry] fix-broken-blobs.sh — check revision-link, not blob data 2026-04-19 17:43:35 +00:00
nginx_registry.conf harden pull-through cache: intercept errors, reduce lock timeout, add healthz 2026-03-23 11:33:06 +02:00