infra/stacks/forgejo
Viktor Barzin ff0cb9a0d0 forgejo: survive CI-build registry-push storms (mem 3Gi + working retention)
Heavy in-cluster builds (e.g. tripit buildkit) were taking Forgejo down via
two vectors. This commit fixes both without moving Forgejo off the sdc HDD
(code-oflt deferred):

- Memory 1Gi -> 3Gi (requests=limits). Forgejo was OOMKilled (exit 137) under
  registry-push load; the VPA upperBound (~1.5Gi) was an underestimate, because
  observed usage could never exceed the 1Gi limit it kept OOMing against.
  Sized for the push spike.

- Activate registry retention (DRY_RUN false). Verified the delete list
  against all running viktor/* images first: 0 running images affected.
  Pruned 478 -> 161 package versions; PVC was at its 50Gi autoresize ceiling.

- FIX broken retention auth: the cleanup PAT was ci-pusher's, but Forgejo
  scopes container packages per-user, so DELETE on viktor/* returned 403 (the
  dry-run only did GETs, hiding it). Repointed forgejo_cleanup_token to
  viktor's write:package PAT. Retention had never actually worked.

- Protect buildkit *cache* tags from retention (cleanup.sh keep-set) so the
  gentler-builds layer cache survives daily pruning.
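
A minimal sketch of the keep-set idea from the retention bullets above, written
in Python rather than the shell of cleanup.sh (which is not shown here). The
names `Version`, `plan_deletions`, `keep_latest` and `cache_marker`, and the
keep-newest-N policy itself, are assumptions for illustration and not the
actual script:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    package: str  # hypothetical, e.g. "viktor/tripit"
    tag: str
    created: int  # unix timestamp; larger means newer


def plan_deletions(versions, running, keep_latest=5, cache_marker="cache"):
    """Return the versions a retention pass could delete.

    A version is kept if it is among the newest `keep_latest` of its package,
    if its tag contains `cache_marker` (assumed convention for buildkit cache
    tags), or if "package:tag" is in `running` (images in use by the cluster).
    """
    by_package = {}
    for version in versions:
        by_package.setdefault(version.package, []).append(version)

    doomed = []
    for package, items in by_package.items():
        # Newest first, so the first `keep_latest` entries are always kept.
        ordered = sorted(items, key=lambda v: v.created, reverse=True)
        for index, version in enumerate(ordered):
            if index < keep_latest:
                continue
            if cache_marker in version.tag:
                continue
            if f"{package}:{version.tag}" in running:
                continue
            doomed.append(version)
    return doomed
```

Computing the delete list as a pure step makes it possible to compare it with
the set of running images before any DELETE is sent. It does not exercise the
token's delete permission, though: a dry run that only issues GETs cannot
surface a 403 on DELETE.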

[ci skip] — already applied via scripts/tg.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 14:23:33 +00:00
files forgejo: survive CI-build registry-push storms (mem 3Gi + working retention) 2026-06-09 14:23:33 +00:00
.terraform.lock.hcl Woodpecker CI deploy [CI SKIP] 2026-05-24 22:07:58 +00:00
cleanup.tf forgejo: survive CI-build registry-push storms (mem 3Gi + working retention) 2026-06-09 14:23:33 +00:00
main.tf forgejo: survive CI-build registry-push storms (mem 3Gi + working retention) 2026-06-09 14:23:33 +00:00
providers.tf nfs-mirror: append transferred files to offsite-sync manifest 2026-05-24 15:32:22 +00:00
secrets [ci skip] Move Terraform modules into stack directories 2026-02-22 14:38:14 +00:00
terragrunt.hcl migrate all secrets from SOPS to Vault KV 2026-03-14 17:15:48 +00:00