From f793a5f50b8f26fde82f38fb7822e2aae57baeaf Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Thu, 7 May 2026 15:51:34 +0000
Subject: [PATCH] [forgejo] Phase 0 of registry consolidation: prepare Forgejo
 OCI registry
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 0 of moving private images off the registry:2 container at
registry.viktorbarzin.me:5050 (which has hit distribution#3324
corruption 3x in 3 weeks) onto Forgejo's built-in OCI registry. No
cutover risk — pods still pull from the existing registry until
Phase 3.

What changes:

* Forgejo deployment: memory 384Mi→1Gi, PVC 5Gi→15Gi (cap 50Gi).
  Explicit FORGEJO__packages__ENABLED + CHUNKED_UPLOAD_PATH
  (defensive, v11 default-on).
* ingress_factory: the max_body_size variable was declared but never
  wired in after the nginx→Traefik migration. Now creates a
  per-ingress Buffering middleware when set; default null = no limit
  (preserves existing behavior). The Forgejo ingress sets
  max_body_size=5g to allow multi-GB layer pushes.
* Cluster-wide registry-credentials Secret: 4th auths entry for
  forgejo.viktorbarzin.me, populated from Vault secret/viktor/
  forgejo_pull_token (cluster-puller PAT, read:package). The existing
  Kyverno ClusterPolicy syncs it cluster-wide — no policy edits.
* Containerd hosts.toml redirect: forgejo.viktorbarzin.me →
  in-cluster Traefik LB 10.0.20.200 (avoids hairpin NAT for
  in-cluster pulls). Cloud-init for new VMs +
  scripts/setup-forgejo-containerd-mirror.sh for existing nodes.
* Forgejo retention CronJob (0 4 * * *): keeps the newest 10 versions
  per package + always :latest. First 7 days dry-run (DRY_RUN=true);
  flip the local in cleanup.tf after log review.
* Forgejo integrity probe CronJob (*/15): same algorithm as the
  existing registry-integrity-probe. Existing Prometheus alerts
  (RegistryManifestIntegrityFailure et al) made instance-aware so
  they cover both registries during the bake.
* Docs: design+plan in docs/plans/, setup runbook in docs/runbooks/.

Operational note — the apply order is non-trivial because the new
Vault keys (forgejo_pull_token, forgejo_cleanup_token,
secret/ci/global/forgejo_*) must exist BEFORE terragrunt apply in
the kyverno + monitoring + forgejo stacks. The setup runbook
documents the bootstrap sequence.

Phase 1 (per-project dual-push pipelines) follows in subsequent
commits. Bake clock starts when the last project goes dual-push.
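In short, the bootstrap order looks like this (the full sequence,
including the Forgejo-side user/PAT steps, lives in the runbook):

    # 1. apply the Forgejo resource bumps (no new Vault keys needed)
    # 2. create the Forgejo service accounts + PATs, vault kv patch the keys
    # 3. only then apply the stacks that read them:
    cd infra/stacks/kyverno     && scripts/tg apply
    cd infra/stacks/monitoring  && scripts/tg apply
    cd infra/stacks/forgejo     && scripts/tg apply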
Co-Authored-By: Claude Opus 4.7
---
 ...7-forgejo-registry-consolidation-design.md | 195 +++++++++++++++++
 ...-07-forgejo-registry-consolidation-plan.md | 152 +++++++++++++
 docs/runbooks/forgejo-registry-setup.md       | 163 ++++++++++++++
 modules/kubernetes/ingress_factory/main.tf    |  38 +++-
 scripts/setup-forgejo-containerd-mirror.sh    |  59 +++++
 stacks/forgejo/cleanup.tf                     | 120 +++++++++++
 stacks/forgejo/files/cleanup.sh               | 109 ++++++++++
 stacks/forgejo/main.tf                        |  23 +-
 stacks/infra/main.tf                          |   6 +
 .../modules/kyverno/registry-credentials.tf   |   6 +
 stacks/monitoring/main.tf                     |   1 +
 stacks/monitoring/modules/monitoring/main.tf  | 202 ++++++++++++++++++
 .../monitoring/prometheus_chart_values.tpl    |   8 +-
 13 files changed, 1072 insertions(+), 10 deletions(-)
 create mode 100644 docs/plans/2026-05-07-forgejo-registry-consolidation-design.md
 create mode 100644 docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md
 create mode 100644 docs/runbooks/forgejo-registry-setup.md
 create mode 100755 scripts/setup-forgejo-containerd-mirror.sh
 create mode 100644 stacks/forgejo/cleanup.tf
 create mode 100644 stacks/forgejo/files/cleanup.sh

diff --git a/docs/plans/2026-05-07-forgejo-registry-consolidation-design.md b/docs/plans/2026-05-07-forgejo-registry-consolidation-design.md
new file mode 100644
index 00000000..5e88bd36
--- /dev/null
+++ b/docs/plans/2026-05-07-forgejo-registry-consolidation-design.md
@@ -0,0 +1,195 @@
+# Forgejo Registry Consolidation — Design
+
+**Date**: 2026-05-07
+**Status**: Approved
+
+## Problem
+
+`registry-private` (the `registry:2` container on the docker-registry
+VM at `10.0.20.10`) has hit `distribution#3324` corruption three
+times in three weeks (2026-04-13, 2026-04-19, 2026-05-04). Each
+incident required manual blob recovery and another round of
+hardening to `cleanup-tags.sh` and the GC procedure. The integrity
+probe catches it within 15 minutes now, but every hit still costs
+~1h of cleanup, and we keep tightening the same loose screw.
+
+Root cause is a known race in `distribution`: tag deletes that race
+with concurrent garbage collection produce orphan OCI-index children.
+Upstream has not patched it; our mitigations (probe, blob
+fix-up script, idempotent cleanup) reduce the blast radius but don't
+remove the failure mode.
+
+Forgejo (deployed for OAuth and personal repos at
+`forgejo.viktorbarzin.me`) ships a built-in OCI registry as part of
+the Packages feature, default-on in v11. Using it removes
+`distribution`-the-engine from the path entirely, replaces it with
+Forgejo's own implementation backed by Forgejo's DB+blob store, and
+gets us source hosting + image hosting in one resource.
+
+The PVE host RAM upgrade from 142GB to 272GB (memory id=569) means
+the cluster can absorb the resource bump Forgejo needs for the
+registry workload (384Mi → 1Gi).
+
+## Decision
+
+Move every image currently on `registry.viktorbarzin.me:5050` to
+Forgejo's OCI registry at `forgejo.viktorbarzin.me`. Decommission
+`registry-private` after a 14-day dual-push bake.
+
+Pull-through caches for upstream registries (DockerHub, GHCR, Quay,
+k8s.gcr, Kyverno) stay on the registry VM permanently — Forgejo
+won't serve as a pull-through, so the chicken-and-egg of "Forgejo
+pulling its own image through itself" never arises.
+
+## Design
+
+### Registry hostname
+
+Image references become `forgejo.viktorbarzin.me/viktor/<image>:<tag>`.
+The `viktor/` prefix is the Forgejo owner namespace; all current
+private images ship under that single owner.
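+
+For illustration, retagging one existing private image into the new
+namespace looks like this (image name and tag are hypothetical — the
+Phase 0 smoke test does the same with `smoketest:1`):
+
+```bash
+# before: registry.viktorbarzin.me/fire-planner:abc123
+# after:  forgejo.viktorbarzin.me/viktor/fire-planner:abc123
+docker pull registry.viktorbarzin.me/fire-planner:abc123
+docker tag registry.viktorbarzin.me/fire-planner:abc123 \
+  forgejo.viktorbarzin.me/viktor/fire-planner:abc123
+docker push forgejo.viktorbarzin.me/viktor/fire-planner:abc123
+```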
+
+### Auth
+
+Two service-account users:
+
+| User | Scope | Vault key | Used by |
+|---|---|---|---|
+| `cluster-puller` | `read:package` | `secret/viktor/forgejo_pull_token` | cluster-wide `registry-credentials` Secret, monitoring probe |
+| `ci-pusher` | `write:package` | `secret/ci/global/forgejo_push_token` | Woodpecker pipelines (synced via `vault-woodpecker-sync` CronJob) |
+
+A third PAT (`secret/viktor/forgejo_cleanup_token`, also belonging to
+`ci-pusher`) drives the retention CronJob — kept separate from the
+push PAT so a leaked CI token doesn't immediately enable mass deletes.
+
+PATs have no expiry. Rotation policy: regenerate via the Forgejo Web UI
+and `vault kv patch` if a leak is suspected; ESO/sync downstream is
+automatic.
+
+### Cluster pull path
+
+`registry-credentials` is a single Secret in the `kyverno` ns, cloned
+into every namespace by the existing
+`sync-registry-credentials` ClusterPolicy. We extend its
+`dockerconfigjson` `auths` map with a fourth entry for
+`forgejo.viktorbarzin.me`. **No new Secret, no new ClusterPolicy,
+no `imagePullSecrets =` line edits across stacks.**
+
+Containerd `hosts.toml` redirects `forgejo.viktorbarzin.me` → in-cluster
+Traefik LB at `10.0.20.200`, the same pattern used for
+`registry.viktorbarzin.me` → `10.0.20.10:5050`. Avoids hairpin NAT
+through the WAN gateway for in-cluster pulls.
+
+### Push path
+
+Woodpecker pipelines push to BOTH targets during the bake:
+
+```yaml
+- name: build-and-push
+  image: woodpeckerci/plugin-docker-buildx
+  settings:
+    repo:
+      - registry.viktorbarzin.me/<image>
+      - forgejo.viktorbarzin.me/viktor/<image>
+    logins:
+      - registry: registry.viktorbarzin.me
+        username:
+          from_secret: registry_user
+        password:
+          from_secret: registry_password
+      - registry: forgejo.viktorbarzin.me
+        username:
+          from_secret: forgejo_user
+        password:
+          from_secret: forgejo_push_token
+```
+
+The `vault-woodpecker-sync` CronJob (every 6h) propagates
+`secret/ci/global` keys to every Woodpecker repo as global secrets.
+
+### Retention
+
+Forgejo's per-package "Cleanup Rules" UI is per-user runtime DB
+state, not Terraform-driven. Retention runs as a CronJob in the
+`forgejo` namespace, schedule `0 4 * * *`, that:
+
+1. Lists all container packages under the `viktor` owner.
+2. Groups by package name.
+3. Keeps the newest 10 versions + always keeps `latest`.
+4. DELETEs the rest via `/api/v1/packages/{owner}/{type}/{name}/{version}`.
+
+The first 7 days run with `DRY_RUN=true` — the script logs what it would
+delete but issues no DELETE calls. After log review, flip the
+`forgejo_cleanup_dry_run` local in `cleanup.tf` to false.
+
+### Integrity monitoring
+
+Mirror the existing `registry-integrity-probe` CronJob: walk
+`/v2/_catalog`, walk every tag, HEAD every manifest + index child,
+push `registry_manifest_integrity_*` metrics. Existing
+Prometheus alerts fire on the `instance` label, so they cover both
+probes automatically once the alert annotations are made
+instance-aware (done in this change).
+
+### Source migration
+
+Projects currently living as plain dirs in the local-only monorepo
+become standalone Forgejo repos (a sketch of one extraction recipe
+follows below). Two GitHub-hosted private repos
+(`beadboard`, `claude-memory-mcp`) move to Forgejo and are archived
+on GitHub.
+
+CI standardises on Woodpecker for everything in scope. The two
+projects that used GHA (build + Woodpecker-deploy via GHA-hosted
+DockerHub push) keep DockerHub for legacy compatibility, but their
+canonical image source becomes Forgejo.
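+
+One way to do the extraction while keeping each directory's history —
+a hedged sketch only: the Phase 1 recipe in the plan doc uses a plain
+`git init` + push, the paths here are illustrative, and `git subtree`
+must be present in the local git install:
+
+```bash
+# From the monorepo root: split one project dir into its own branch...
+git subtree split --prefix=fire-planner -b split/fire-planner
+
+# ...and push that branch as the new standalone repo's main branch.
+git push https://forgejo.viktorbarzin.me/viktor/fire-planner.git \
+  split/fire-planner:main
+```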
+
+### Break-glass for infra-ci
+
+`infra-ci` is the Docker image used by all infra Woodpecker
+pipelines, including `default.yml` (terragrunt apply). If Forgejo is
+unreachable at the moment we need to apply, `infra-ci` is
+unreachable, and we can't apply our way out.
+
+Mitigation: the dual-push step also pipes the built infra-ci image
+through `docker save | gzip` to:
+
+- `/opt/registry/data/private/_breakglass/infra-ci-<sha>.tar.gz` on
+  the registry VM disk (Copy 1)
+- `/srv/nfs/forgejo-breakglass/` on the NAS (Copy 2)
+
+A `latest` symlink in each location points at the most recent.
+Recovery procedure (`docs/runbooks/forgejo-registry-breakglass.md`):
+scp the tarball → `docker load` → `ctr -n k8s.io images import` → fix
+Forgejo via that node.
+
+### Cutover style
+
+**Dual-push bake**: pipelines push to both registries for ≥14 days.
+Pods continue pulling from `registry.viktorbarzin.me`. After the bake:
+
+1. Per-project PR: flip `image=` lines in Terraform stacks. Pods
+   re-pull naturally on the next rollout.
+2. Phase 4: stop the `registry-private` container, remove its
+   `auths` entry from the cluster Secret, drop the containerd
+   hosts.toml entry.
+
+## Why not alternatives
+
+| Option | Rejected because |
+|---|---|
+| Stay on `registry-private` | Three corruption incidents in three weeks; mitigation cost rising |
+| Run a fresh registry container alongside (no Forgejo) | Same upstream, same `distribution#3324` failure mode |
+| GHCR / DockerHub for all private images | Public-by-default model + push rate limits; loses owner-owned blob storage |
+| Harbor | Heavier than the Forgejo registry, would need its own DB + ingress, no source-hosting integration |
+
+## Risks
+
+See the plan doc § "Risk register" for the full table. Top three:
+
+1. **Forgejo registry hits the same corruption pattern.** Mitigated
+   by the 14-day bake + integrity probe within 15 min.
+2. **Forgejo down → infra-ci unreachable → can't apply.** Mitigated
+   by the tarball break-glass on VM + NAS.
+3. **Pod re-pulls fail after the `image=` flip due to containerd cache
+   poisoning.** Mitigated by hosts.toml deployment + per-project
+   `kubectl rollout restart` in Phase 3.
diff --git a/docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md b/docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md
new file mode 100644
index 00000000..1634d48e
--- /dev/null
+++ b/docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md
@@ -0,0 +1,152 @@
+# Forgejo Registry Consolidation — Plan
+
+**Date**: 2026-05-07
+**Status**: Approved — execution in progress (Phase 0)
+**Design**: `2026-05-07-forgejo-registry-consolidation-design.md`
+
+This is the implementation roadmap for migrating off `registry-private`
+onto Forgejo's OCI registry. See the design doc for the problem
+statement and rationale. Execution spans 5 phases over ≥3 weeks.
+
+## Phase 0 — Prepare Forgejo (1 PR, no cutover risk)
+
+| Task | File / artifact |
+|---|---|
+| Bump Forgejo memory request+limit 384Mi → 1Gi | `infra/stacks/forgejo/main.tf` |
+| Add `FORGEJO__packages__ENABLED=true` and `FORGEJO__packages__CHUNKED_UPLOAD_PATH=/data/tmp/package-upload` env vars (defensive — already default in v11) | `infra/stacks/forgejo/main.tf` |
+| Bump Forgejo PVC 5Gi → 15Gi, auto-resize cap 20Gi → 50Gi | `infra/stacks/forgejo/main.tf` |
+| Set ingress `max_body_size = "5g"` (wired into ingress_factory as a Buffering middleware) | `infra/stacks/forgejo/main.tf`, `infra/modules/kubernetes/ingress_factory/main.tf` |
+| Create `cluster-puller` (read:package), `ci-pusher` (write:package), and a third `cleanup` PAT on `ci-pusher`; store the PATs in Vault | runbook: `docs/runbooks/forgejo-registry-setup.md` |
+| Extend the `registry-credentials` Secret with a 4th `auths` entry for `forgejo.viktorbarzin.me` | `infra/stacks/kyverno/modules/kyverno/registry-credentials.tf` |
+| Add a containerd `hosts.toml` entry redirecting `forgejo.viktorbarzin.me` → in-cluster Traefik LB `10.0.20.200` | `infra/stacks/infra/main.tf` cloud-init + new `infra/scripts/setup-forgejo-containerd-mirror.sh` for existing nodes |
+| Forgejo retention CronJob (`0 4 * * *`, dry-run for the first 7 days) | new `infra/stacks/forgejo/cleanup.tf` + `infra/stacks/forgejo/files/cleanup.sh` |
+| Forgejo integrity probe CronJob (`*/15 * * * *`) | `infra/stacks/monitoring/modules/monitoring/main.tf` |
+| Make existing alerts instance-aware so they cover both registries | `infra/stacks/monitoring/modules/monitoring/prometheus_chart_values.tpl` |
+
+**Smoke test (must pass before declaring Phase 0 done):**
+
+- `docker login forgejo.viktorbarzin.me` succeeds.
+- Pushing a hello-world image to `forgejo.viktorbarzin.me/viktor/smoketest:1` succeeds.
+- `crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1` from a k8s
+  node succeeds, using the auto-synced `registry-credentials` Secret.
+- A fresh namespace gets the cloned Secret with 4 `auths` entries.
+- Delete the smoketest package via the API.
+- The Forgejo integrity probe completes once and pushes metrics.
+
+## Phase 1 — Source migration (parallel-safe, no production impact)
+
+For each project the recipe is identical:
+
+1. `git init` + push to `forgejo.viktorbarzin.me/viktor/<project>` —
+   register in Woodpecker via OAuth.
+2. Add a `.woodpecker.yml` based on `payslip-ingest/.woodpecker.yml`.
+   The push step uses `woodpeckerci/plugin-docker-buildx` with TWO
+   `repo:` entries (dual-push).
+3. Confirm the first build pushes to BOTH registries.
+
+Projects (the bake clock starts at "all dual-push"):
+
+| Project | Action |
+|---|---|
+| `claude-agent-service` | Extract from monorepo to Forgejo. New `.woodpecker.yml`. |
+| `fire-planner` | Extract from monorepo to Forgejo. New `.woodpecker.yml`. |
+| `wealthfolio-sync` | Extract from monorepo to Forgejo. New `.woodpecker.yml`. |
+| `hmrc-sync` | Extract from monorepo to Forgejo. New `.woodpecker.yml`. |
+| `freedify` | Push from monorepo to Forgejo. New `.woodpecker.yml`. (Upstream is gone.) |
+| `payslip-ingest` | Already on Forgejo. Add a second `repo:` entry to `.woodpecker.yml`. |
+| `job-hunter` | Already on Forgejo. Add a second `repo:` entry. |
+| `beadboard` | Push to Forgejo. New `.woodpecker.yml`. Disable the GHA workflow. **Don't archive GitHub yet** (deferred to Phase 3). |
+| `claude-memory-mcp` | Push to Forgejo. New `.woodpecker.yml`. |
+| `infra-ci` | Edit `.woodpecker/build-ci-image.yml` to dual-push. ALSO `docker save \| gzip` the built image to `/opt/registry/data/private/_breakglass/` on the VM AND `/srv/nfs/forgejo-breakglass/` on the NAS. Pin a `latest` symlink. |
+
+The break-glass runbook (`docs/runbooks/forgejo-registry-breakglass.md`)
+documents the recovery path.
+
+## Phase 2 — Bake (≥14 days)
+
+- No `image=` lines change. Pods still pull from
+  `registry.viktorbarzin.me`.
+- **Daily smoke check**: pull a recent image from Forgejo as
+  `cluster-puller`, verify integrity (HEAD on the manifest + each blob).
+- **Day 7**: switch the retention CronJob to `DRY_RUN=false`; observe
+  it through day 14.
+- **Bake exit criteria**:
+  - Zero `RegistryManifestIntegrityFailure` alerts on Forgejo.
+  - Zero `ContainerNearOOM` for the forgejo pod.
+  - Retention CronJob has run ≥14 times successfully.
+  - At least one full Sunday GC cycle has elapsed.
+
+## Phase 3 — Cutover (one PR per project, single session)
+
+Order = lowest blast radius first. Each step:
+`image=` flip → `kubectl rollout restart` → verify the pull comes from Forgejo.
+
+1. `payslip-ingest` (`infra/stacks/payslip-ingest/main.tf`)
+2. `job-hunter` (`infra/stacks/job-hunter/main.tf`)
+3. `claude-agent-service` (`infra/stacks/claude-agent-service/main.tf`)
+4. `fire-planner` (`infra/stacks/fire-planner/main.tf`)
+5. `wealthfolio-sync` (`infra/stacks/wealthfolio/main.tf`)
+6. `freedify` (`infra/stacks/freedify/factory/main.tf`)
+7. `chrome-service` (`infra/stacks/chrome-service/main.tf`)
+8. `beads-server` / `beadboard` (`infra/stacks/beads-server/main.tf`).
+   Then `gh repo archive ViktorBarzin/beadboard`.
+9. `infra-ci` — flip `image:` references in 4 `.woodpecker/*.yml`
+   files in the infra repo. Verify the next push to master applies cleanly.
+10. `claude-memory-mcp` — update the `CLAUDE.md` install instruction from
+    `claude plugins install github:ViktorBarzin/claude-memory-mcp` to
+    `claude plugins install https://forgejo.viktorbarzin.me/viktor/claude-memory-mcp.git`.
+    `gh repo archive ViktorBarzin/claude-memory-mcp`.
+
+## Phase 4 — Decommission
+
+| Step | File / location |
+|---|---|
+| Stop the `registry-private` container on the VM (10.0.20.10): edit `/opt/registry/docker-compose.yml`, comment out the service, `docker compose up -d --remove-orphans`. (Manual SSH — cloud-init won't redeploy on TF apply per memory id=1078.) | live VM |
+| Update the cloud-init template to match the new compose file | `infra/stacks/infra/main.tf:288` |
+| Delete the `auths` entries for `registry.viktorbarzin.me` / `:5050` / `10.0.20.10:5050` from the dockerconfigjson | `infra/stacks/kyverno/modules/kyverno/registry-credentials.tf` |
+| Drop the `registry.viktorbarzin.me` and `10.0.20.10:5050` `hosts.toml` entries on each node + cloud-init template | `infra/stacks/infra/main.tf` cloud-init + ad-hoc script |
+| After 1 week of no incidents, delete the `/opt/registry/data/private/` blob storage on the VM (~2.6GB freed) | manual SSH |
+
+## Phase 5 — Docs
+
+In the same commit as the Phase 4 closing:
+
+| Doc | Update |
+|---|---|
+| `docs/runbooks/registry-vm.md` | Note `registry-private` is gone; pull-through caches and break-glass tarballs only |
+| `docs/runbooks/registry-rebuild-image.md` | Replaced by the NEW `forgejo-registry-rebuild-image.md` |
+| `docs/runbooks/forgejo-registry-rebuild-image.md` (NEW) | Forgejo PVC restore procedure |
+| `docs/runbooks/forgejo-registry-breakglass.md` (NEW) | infra-ci tarball recovery |
+| `docs/architecture/ci-cd.md` | Image registry section flips to Forgejo |
+| `docs/architecture/monitoring.md` | Integrity probe target updated |
+| `infra/.claude/CLAUDE.md` | Registry references updated |
+| `CLAUDE.md` (monorepo root) | claude-memory-mcp install URL updated |
+| `infra/.claude/reference/service-catalog.md` | Cross-reference checked |
+
+## Critical files modified
+
+| File | Phase | What |
+|---|---|---|
+| `infra/stacks/forgejo/main.tf` | 0 | Memory bump, packages env vars, PVC bump, ingress max_body_size |
+| `infra/stacks/forgejo/cleanup.tf` (NEW) | 0 | Retention CronJob |
+| `infra/stacks/forgejo/files/cleanup.sh` (NEW) | 0 | Retention script (mounted via ConfigMap) |
+| `infra/modules/kubernetes/ingress_factory/main.tf` | 0 | Wire `max_body_size` into a Traefik Buffering middleware |
+| `infra/stacks/kyverno/modules/kyverno/registry-credentials.tf` | 0 | Add 4th `auths` entry |
+| `infra/stacks/infra/main.tf` | 0 + 4 | Containerd hosts.toml block (add Forgejo, later remove registry-private); compose template update |
+| `infra/scripts/setup-forgejo-containerd-mirror.sh` (NEW) | 0 | One-shot rollout for existing nodes |
+| `infra/stacks/monitoring/modules/monitoring/main.tf` | 0 | Forgejo integrity probe CronJob |
+| `infra/stacks/monitoring/modules/monitoring/prometheus_chart_values.tpl` | 0 | Make alerts instance-aware |
+| `infra/stacks/monitoring/main.tf` | 0 | Plumb `forgejo_pull_token` into the module |
+| `infra/.woodpecker/build-ci-image.yml` | 1 | Dual-push to add the Forgejo target + tarball break-glass |
+| `<project>/.woodpecker.yml` | 1 | Dual-push (NEW for claude-agent-service, fire-planner, wealthfolio-sync, hmrc-sync, freedify, beadboard, claude-memory-mcp; EDIT for payslip-ingest, job-hunter) |
+| `infra/.woodpecker/{default,drift-detection,build-cli}.yml` | 3 | Flip `image:` to Forgejo for infra-ci |
+| `infra/stacks/{beads-server,chrome-service,claude-agent-service,fire-planner,freedify/factory,job-hunter,payslip-ingest,wealthfolio}/main.tf` | 3 | Flip `image =` to Forgejo |
+
+## Verification
+
+- **Push** (Phase 0/1): `docker push forgejo.viktorbarzin.me/viktor/<image>:<tag>` visible in the Forgejo Web UI under viktor/.
+- **Pull** (Phase 0): `crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1` succeeds with the auto-synced Secret.
+- **Dual-push** (Phase 1): every Woodpecker pipeline run pushes to BOTH endpoints — confirmed via HEAD checks on `<image>:<tag>` at both.
+- **Bake** (Phase 2): the existing daily Forgejo `/api/healthz` external monitor stays green; the integrity probe stays green; no `ContainerNearOOM` for the forgejo pod.
+- **Cutover** (Phase 3): `kubectl rollout status deploy/<name> -n <namespace>` succeeds. `kubectl describe pod` shows the image was pulled from `forgejo.viktorbarzin.me`.
+- **Decommission** (Phase 4): `docker ps` on the registry VM no longer shows `registry-private`. A brand-new namespace gets the Secret with only the Forgejo `auths` entry. Pulls still work.
diff --git a/docs/runbooks/forgejo-registry-setup.md b/docs/runbooks/forgejo-registry-setup.md
new file mode 100644
index 00000000..16637c6d
--- /dev/null
+++ b/docs/runbooks/forgejo-registry-setup.md
@@ -0,0 +1,163 @@
+# Runbook: Forgejo OCI registry — initial setup
+
+Last updated: 2026-05-07
+
+This runbook covers the **one-time** bootstrap of Forgejo's container
+registry, executed during Phase 0 of the registry consolidation plan
+(`docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md`).
+
+After this runbook is complete, the Forgejo OCI registry at
+`forgejo.viktorbarzin.me` accepts pushes from CI and pulls from the
+cluster, with retention and integrity monitoring in place.
+
+## Order of operations
+
+The Terraform stacks reference Vault keys that don't exist on a fresh
+cluster. Create the keys **before** running `scripts/tg apply`.
+
+1. Apply the resource bumps (memory, PVC, ingress body size,
+   packages env vars) — these don't depend on the new Vault keys.
+2. Create the service-account users + PATs in Forgejo.
+3. Push the PATs to Vault.
+4. Apply the rest of Phase 0 (registry-credentials extension,
+   monitoring probe, retention CronJob).
+
+### Step 1 — apply Forgejo deployment bumps
+
+```bash
+cd infra/stacks/forgejo
+scripts/tg apply
+```
+
+Wait for the new pod to come up with the bumped 1Gi memory request and
+the resized 15Gi PVC. Verify packages are enabled:
+
+```bash
+kubectl exec -n forgejo deploy/forgejo -- forgejo manager flush-queues
+kubectl exec -n forgejo deploy/forgejo -- env | grep PACKAGES
+```
+
+### Step 2 — create service-account users
+
+`forgejo admin user create` is not idempotent: re-running it against
+an existing user errors out — that's fine; skip it on rerun. Pass
+`--must-change-password=false` so the throwaway password stays usable
+for the Web UI login in step 3, and keep each generated password in a
+variable — step 3 needs it once.
+
+```bash
+# cluster-puller — read:package PAT for in-cluster pulls.
+PULLER_PW="$(openssl rand -base64 24)"
+kubectl exec -n forgejo deploy/forgejo -- \
+  forgejo admin user create \
+  --username cluster-puller \
+  --email cluster-puller@viktorbarzin.me \
+  --password "$PULLER_PW" \
+  --must-change-password=false
+
+# ci-pusher — write:package PAT for CI dual-push, also reused as the
+# cleanup CronJob credential (write:package includes delete).
+PUSHER_PW="$(openssl rand -base64 24)"
+kubectl exec -n forgejo deploy/forgejo -- \
+  forgejo admin user create \
+  --username ci-pusher \
+  --email ci-pusher@viktorbarzin.me \
+  --password "$PUSHER_PW" \
+  --must-change-password=false
+```
+
+The user passwords are throwaway — we only ever auth via PAT. The
+Forgejo admin can reset them at any time from the Web UI.
+
+### Step 3 — generate the PATs
+
+PATs are generated through the Web UI logged in as the respective
+user. (Newer Forgejo CLIs ship `forgejo admin user
+generate-access-token`; if the installed version supports the scopes
+we need, that is a non-interactive alternative — verify before relying
+on it.) To log in without OAuth (registration is disabled for everyone
+except `viktor`, the admin), use the per-user throwaway password from
+step 2.
+
+For each of `cluster-puller` and `ci-pusher`:
+
+1. Sign out of `viktor`.
+2. Go to `https://forgejo.viktorbarzin.me/user/login` and sign in
+   with the throwaway password.
+3. Settings → Applications → Generate new token.
+4. Name: `cluster-pull` / `ci-push`. **Expiration: never.**
+5. Scopes:
+   - `cluster-puller`: `read:package`
+   - `ci-pusher`: `write:package` (covers read+write+delete)
+6. Save the token shown on the next page — it is **not** displayed again.
+
+For the cleanup CronJob, generate a third PAT on `ci-pusher`:
+
+7. Repeat steps 4–6 with name `cleanup`, scope `write:package`.
+
+### Step 4 — push PATs to Vault
+
+```bash
+vault login -method=oidc
+
+# Read-only, used by the cluster-wide registry-credentials Secret and
+# by the Forgejo integrity probe.
+vault kv patch secret/viktor \
+  forgejo_pull_token=<cluster-puller PAT>
+
+# Write+delete, used by the retention CronJob inside Forgejo's
+# namespace.
+vault kv patch secret/viktor \
+  forgejo_cleanup_token=<cleanup PAT>
+
+# Write, propagated by vault-woodpecker-sync to all Woodpecker repos.
+vault kv patch secret/ci/global \
+  forgejo_user=ci-pusher \
+  forgejo_push_token=<ci-pusher PAT>
+```
+
+### Step 5 — apply the rest of Phase 0
+
+```bash
+# Registry credential Secret (now reads forgejo_pull_token).
+cd infra/stacks/kyverno && scripts/tg apply
+
+# Monitoring probe + retention CronJob.
+cd infra/stacks/monitoring && scripts/tg apply
+cd infra/stacks/forgejo && scripts/tg apply
+
+# Containerd hosts.toml on each existing k8s node — VM cloud-init
+# only fires on first boot.
+infra/scripts/setup-forgejo-containerd-mirror.sh
+```
+
+## Verification
+
+```bash
+# Login from a workstation with docker.
+echo "<ci-pusher PAT>" | docker login forgejo.viktorbarzin.me -u ci-pusher --password-stdin
+
+# Push a smoketest image.
+docker pull alpine:3.20
+docker tag alpine:3.20 forgejo.viktorbarzin.me/viktor/smoketest:1
+docker push forgejo.viktorbarzin.me/viktor/smoketest:1
+
+# Pull from a k8s node.
+ssh wizard@<node> sudo crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1
+
+# Confirm the cluster-wide Secret was synced into a fresh namespace.
+kubectl create namespace forgejo-smoketest
+kubectl get secret -n forgejo-smoketest registry-credentials \
+  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq '.auths | keys'
+# Expect: ["10.0.20.10:5050", "forgejo.viktorbarzin.me",
+#          "registry.viktorbarzin.me", "registry.viktorbarzin.me:5050"]
+kubectl delete namespace forgejo-smoketest
+
+# Delete the smoketest package via the API.
+curl -X DELETE -H "Authorization: token <ci-pusher PAT>" \
+  https://forgejo.viktorbarzin.me/api/v1/packages/viktor/container/smoketest/1
+```
+
+## When to revisit
+
+- **PAT rotation**: PATs created here have no expiry by design. If a
+  PAT leaks, regenerate via the Web UI and `vault kv patch` the new
+  value into the same key — the next `terragrunt apply` syncs it to
+  all consumers (the Kyverno ClusterPolicy clones the Secret;
+  vault-woodpecker-sync runs every 6h).
+- **New service account**: if a future workload needs different
+  scopes, add a parallel user/PAT here rather than expanding an
+  existing PAT's scope. Principle of least privilege.
diff --git a/modules/kubernetes/ingress_factory/main.tf b/modules/kubernetes/ingress_factory/main.tf
index 8e893dca..1975658e 100644
--- a/modules/kubernetes/ingress_factory/main.tf
+++ b/modules/kubernetes/ingress_factory/main.tf
@@ -40,8 +40,9 @@ variable "ingress_path" {
   default = ["/"]
 }
 variable "max_body_size" {
-  type    = string
-  default = "50m"
+  type        = string
+  default     = null
+  description = "Maximum request body size, e.g. '5g'. null = no limit (Traefik default). When set, a per-ingress Buffering middleware is created and attached."
} variable "extra_annotations" { default = {} @@ -203,6 +204,17 @@ locals { "gethomepage.dev/href" = "https://${local.effective_host}" "gethomepage.dev/icon" = "${replace(var.name, "-", "")}.png" } : {} + + # Parse "5g"/"50m"/"1024k"/"42" into bytes. Traefik's Buffering middleware + # takes maxRequestBodyBytes as an integer. Empty unit = bytes. + body_size_match = var.max_body_size == null ? null : regex("^([0-9]+)([kmgKMG]?)$", var.max_body_size) + body_size_unit_multiplier = var.max_body_size == null ? 0 : ( + lower(local.body_size_match[1]) == "g" ? 1073741824 : + lower(local.body_size_match[1]) == "m" ? 1048576 : + lower(local.body_size_match[1]) == "k" ? 1024 : + 1 + ) + max_body_size_bytes = var.max_body_size == null ? 0 : tonumber(local.body_size_match[0]) * local.body_size_unit_multiplier } @@ -245,6 +257,7 @@ resource "kubernetes_ingress_v1" "proxied-ingress" { var.protected ? "traefik-authentik-forward-auth@kubernetescrd" : null, var.allow_local_access_only ? "traefik-local-only@kubernetescrd" : null, var.custom_content_security_policy != null ? "${var.namespace}-custom-csp-${var.name}@kubernetescrd" : null, + var.max_body_size != null ? "${var.namespace}-buffering-${var.name}@kubernetescrd" : null, ], var.extra_middlewares))) "traefik.ingress.kubernetes.io/router.entrypoints" = "websecure" }, local.homepage_defaults, var.extra_annotations, @@ -302,6 +315,27 @@ resource "kubernetes_manifest" "custom_csp" { } } +# Buffering middleware - created per service when max_body_size is set. +# Traefik default is unlimited; setting maxRequestBodyBytes enforces a limit +# (e.g. Forgejo container pushes can ship multi-GB layer blobs). +resource "kubernetes_manifest" "buffering" { + count = var.max_body_size != null ? 1 : 0 + + manifest = { + apiVersion = "traefik.io/v1alpha1" + kind = "Middleware" + metadata = { + name = "buffering-${var.name}" + namespace = var.namespace + } + spec = { + buffering = { + maxRequestBodyBytes = local.max_body_size_bytes + } + } + } +} + # Cloudflare DNS records — created automatically when dns_type is set. # Proxied: CNAME to Cloudflare tunnel. Non-proxied: A + AAAA to public IP. resource "cloudflare_record" "proxied" { diff --git a/scripts/setup-forgejo-containerd-mirror.sh b/scripts/setup-forgejo-containerd-mirror.sh new file mode 100755 index 00000000..1e4625fd --- /dev/null +++ b/scripts/setup-forgejo-containerd-mirror.sh @@ -0,0 +1,59 @@ +#!/usr/bin/env bash +# One-shot deployment of the forgejo.viktorbarzin.me containerd hosts.toml +# entry across every k8s node. Cloud-init only fires on VM provision, so +# existing nodes need this manual rollout. +# +# What it does, per node: +# 1. drain (ignore-daemonsets, delete-emptydir-data) +# 2. ssh in: mkdir + write /etc/containerd/certs.d/forgejo.viktorbarzin.me/hosts.toml +# 3. systemctl restart containerd +# 4. uncordon +# +# hosts.toml is documented as hot-reloaded but the post-2026-04-19 +# containerd corruption playbook calls for an explicit restart so the +# config is unambiguously in effect. Running drain/uncordon around it +# avoids pulling against an in-flight containerd restart. +# +# Re-run is safe: writes are idempotent. 
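+#
+# Usage sketch (assumptions: the kubectl context already points at the
+# cluster, and each node is reachable over SSH as wizard@<node-name> —
+# the loop below sshes to the bare node name from `kubectl get nodes`,
+# so those names must resolve):
+#   ./scripts/setup-forgejo-containerd-mirror.sh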
+
+set -euo pipefail
+
+CERTS_DIR=/etc/containerd/certs.d/forgejo.viktorbarzin.me
+HOSTS_TOML='server = "https://forgejo.viktorbarzin.me"
+
+[host."https://10.0.20.200"]
+  capabilities = ["pull", "resolve"]
+'
+
+NODES=$(kubectl get nodes -o name | sed 's|^node/||')
+if [[ -z "$NODES" ]]; then
+  echo "ERROR: no nodes returned from kubectl get nodes" >&2
+  exit 1
+fi
+
+for n in $NODES; do
+  echo "=== $n ==="
+  kubectl drain "$n" --ignore-daemonsets --delete-emptydir-data --force --grace-period=60
+
+  ssh -o StrictHostKeyChecking=accept-new "wizard@$n" sudo bash <<EOF
+mkdir -p "$CERTS_DIR"
+cat > "$CERTS_DIR/hosts.toml" <<'TOML'
+$HOSTS_TOML
+TOML
+systemctl restart containerd
+EOF
+
+  kubectl uncordon "$n"
+
+  # Wait for the node to report Ready before moving to the next one.
+  for i in {1..30}; do
+    if kubectl get node "$n" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' | grep -q True; then
+      echo "  node Ready"
+      break
+    fi
+    sleep 2
+  done
+done
+
+echo "All nodes updated."
diff --git a/stacks/forgejo/cleanup.tf b/stacks/forgejo/cleanup.tf
new file mode 100644
index 00000000..6b180089
--- /dev/null
+++ b/stacks/forgejo/cleanup.tf
@@ -0,0 +1,120 @@
+# Forgejo container-package retention CronJob.
+#
+# Forgejo's per-package "Cleanup Rules" UI is not exposed via Terraform —
+# it's per-user runtime state inside the Forgejo DB. Driving retention from
+# a CronJob hitting the public API keeps the policy versioned in this repo.
+#
+# Auth: a write:package PAT belonging to ci-pusher (same user that pushes
+# from CI). DELETE on packages requires write:package scope. The PAT lives
+# in Vault at secret/viktor/forgejo_cleanup_token.
+
+data "vault_kv_secret_v2" "forgejo_viktor" {
+  mount = "secret"
+  name  = "viktor"
+}
+
+locals {
+  # Flip to false once the first 7 days of dry-run logs look correct.
+  forgejo_cleanup_dry_run = true
+}
+
+resource "kubernetes_config_map" "forgejo_cleanup_script" {
+  metadata {
+    name      = "forgejo-cleanup-script"
+    namespace = kubernetes_namespace.forgejo.metadata[0].name
+  }
+  data = {
+    "cleanup.sh" = file("${path.module}/files/cleanup.sh")
+  }
+}
+
+resource "kubernetes_secret" "forgejo_cleanup_token" {
+  metadata {
+    name      = "forgejo-cleanup-token"
+    namespace = kubernetes_namespace.forgejo.metadata[0].name
+  }
+  type = "Opaque"
+  data = {
+    FORGEJO_TOKEN = data.vault_kv_secret_v2.forgejo_viktor.data["forgejo_cleanup_token"]
+  }
+}
+
+resource "kubernetes_cron_job_v1" "forgejo_cleanup" {
+  metadata {
+    name      = "forgejo-cleanup"
+    namespace = kubernetes_namespace.forgejo.metadata[0].name
+  }
+  spec {
+    concurrency_policy            = "Forbid"
+    schedule                      = "0 4 * * *"
+    failed_jobs_history_limit     = 3
+    successful_jobs_history_limit = 3
+    job_template {
+      metadata {}
+      spec {
+        backoff_limit              = 1
+        ttl_seconds_after_finished = 3600
+        template {
+          metadata {}
+          spec {
+            container {
+              name    = "cleanup"
+              image   = "docker.io/library/alpine:3.20"
+              command = ["/bin/sh", "/scripts/cleanup.sh"]
+              env {
+                name = "FORGEJO_TOKEN"
+                value_from {
+                  secret_key_ref {
+                    name = kubernetes_secret.forgejo_cleanup_token.metadata[0].name
+                    key  = "FORGEJO_TOKEN"
+                  }
+                }
+              }
+              env {
+                name  = "FORGEJO_HOST"
+                value = "http://forgejo.forgejo.svc.cluster.local"
+              }
+              env {
+                name  = "FORGEJO_OWNER"
+                value = "viktor"
+              }
+              env {
+                name  = "KEEP_LAST_N"
+                value = "10"
+              }
+              env {
+                name  = "DRY_RUN"
+                value = local.forgejo_cleanup_dry_run ? "true" : "false"
+              }
"true" : "false" + } + volume_mount { + name = "scripts" + mount_path = "/scripts" + } + resources { + requests = { + cpu = "10m" + memory = "32Mi" + } + limits = { + memory = "96Mi" + } + } + } + volume { + name = "scripts" + config_map { + name = kubernetes_config_map.forgejo_cleanup_script.metadata[0].name + default_mode = "0755" + } + } + restart_policy = "OnFailure" + } + } + } + } + } + lifecycle { + # KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2 + ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config] + } +} diff --git a/stacks/forgejo/files/cleanup.sh b/stacks/forgejo/files/cleanup.sh new file mode 100644 index 00000000..61bb7a96 --- /dev/null +++ b/stacks/forgejo/files/cleanup.sh @@ -0,0 +1,109 @@ +#!/bin/sh +# Forgejo container-package retention. +# +# For each container package owned by ${FORGEJO_OWNER}, keep newest +# ${KEEP_LAST_N} versions + always keep tag "latest". Deletes the rest via +# DELETE /api/v1/packages/{owner}/container/{name}/{version}. +# +# DRY_RUN=true logs what would be deleted but issues no DELETE calls. +# +# Required env: +# FORGEJO_HOST e.g. http://forgejo.forgejo.svc.cluster.local +# FORGEJO_OWNER e.g. viktor +# FORGEJO_USER PAT owner (write:package scope) +# FORGEJO_TOKEN PAT +# KEEP_LAST_N integer (default 10) +# DRY_RUN true|false (default true) + +set -eu + +apk add --no-cache curl jq >/dev/null + +OWNER="${FORGEJO_OWNER}" +KEEP="${KEEP_LAST_N:-10}" +DRY="${DRY_RUN:-true}" +BASE="${FORGEJO_HOST%/}/api/v1" + +AUTH_HEADER="Authorization: token $FORGEJO_TOKEN" + +echo "Forgejo cleanup: owner=$OWNER keep_last=$KEEP dry_run=$DRY" +echo "API base: $BASE" + +# Page through ALL container packages. +TMPDIR=$(mktemp -d) +trap 'rm -rf "$TMPDIR"' EXIT +ALL="$TMPDIR/all.json" +echo "[]" > "$ALL" + +PAGE=1 +while :; do + RESP=$(curl -sf -H "$AUTH_HEADER" \ + "$BASE/packages/$OWNER?type=container&limit=50&page=$PAGE") + COUNT=$(echo "$RESP" | jq 'length') + if [ "$COUNT" = "0" ]; then break; fi + jq -s '.[0] + .[1]' "$ALL" <(echo "$RESP") > "$TMPDIR/merged.json" + mv "$TMPDIR/merged.json" "$ALL" + PAGE=$((PAGE + 1)) + # Safety: never run away. + if [ "$PAGE" -gt 100 ]; then break; fi +done + +TOTAL=$(jq 'length' "$ALL") +echo "Found $TOTAL package version(s)." + +if [ "$TOTAL" = "0" ]; then + echo "Nothing to do." + exit 0 +fi + +# Group by name and process each group. +NAMES=$(jq -r '.[].name' "$ALL" | sort -u) + +DEL=0 +KEPT=0 + +for NAME in $NAMES; do + # All versions of this name, sorted by created_at descending. + jq --arg n "$NAME" ' + [.[] | select(.name == $n)] + | sort_by(.created_at) | reverse + ' "$ALL" > "$TMPDIR/$NAME.json" + + N_VERSIONS=$(jq 'length' "$TMPDIR/$NAME.json") + echo "[$NAME] $N_VERSIONS version(s)" + + # Build the keep set: top $KEEP + anything tagged 'latest'. + jq -r --argjson keep "$KEEP" ' + [.[0:$keep][].version] + [.[] | select(.version == "latest") | .version] + | unique + | .[] + ' "$TMPDIR/$NAME.json" > "$TMPDIR/$NAME.keep" + + # Build the delete set. 
+  jq -r '.[].version' "$TMPDIR/$NAME.json" \
+    | grep -vxFf "$TMPDIR/$NAME.keep" > "$TMPDIR/$NAME.delete" || true
+
+  D_COUNT=$(wc -l < "$TMPDIR/$NAME.delete" | tr -d ' ')
+  K_COUNT=$(wc -l < "$TMPDIR/$NAME.keep" | tr -d ' ')
+  echo "  keep=$K_COUNT delete=$D_COUNT"
+  KEPT=$((KEPT + K_COUNT))
+
+  while IFS= read -r VER; do
+    [ -z "$VER" ] && continue
+    URL="$BASE/packages/$OWNER/container/$NAME/$VER"
+    if [ "$DRY" = "true" ]; then
+      echo "  DRY_RUN would DELETE $URL"
+    else
+      HTTP=$(curl -s -o /dev/null -w '%{http_code}' \
+        -X DELETE -H "$AUTH_HEADER" "$URL" || echo "000")
+      if [ "$HTTP" = "204" ] || [ "$HTTP" = "200" ]; then
+        echo "  deleted $NAME:$VER"
+      else
+        echo "  FAIL $NAME:$VER HTTP $HTTP"
+      fi
+    fi
+    DEL=$((DEL + 1))
+  done < "$TMPDIR/$NAME.delete"
+done
+
+echo "Summary: kept=$KEPT to_delete=$DEL dry_run=$DRY"
diff --git a/stacks/forgejo/main.tf b/stacks/forgejo/main.tf
index bb774230..5b52c419 100644
--- a/stacks/forgejo/main.tf
+++ b/stacks/forgejo/main.tf
@@ -32,7 +32,7 @@ resource "kubernetes_persistent_volume_claim" "data_encrypted" {
     annotations = {
       "resize.topolvm.io/threshold"     = "80%"
       "resize.topolvm.io/increase"      = "50%"
-      "resize.topolvm.io/storage_limit" = "20Gi"
+      "resize.topolvm.io/storage_limit" = "50Gi"
     }
   }
   spec {
@@ -40,7 +40,7 @@ resource "kubernetes_persistent_volume_claim" "data_encrypted" {
     storage_class_name = "proxmox-lvm-encrypted"
     resources {
       requests = {
-        storage = "5Gi"
+        storage = "15Gi"
       }
     }
   }
@@ -106,6 +106,18 @@ resource "kubernetes_deployment" "forgejo" {
            name  = "FORGEJO__webhook__ALLOWED_HOST_LIST"
            value = "*.svc.cluster.local"
          }
+          # OCI registry (container packages). Default-on in Forgejo v11 but
+          # explicit so it can't be silently disabled by an upstream config
+          # change. The chunked-upload path needs a directory inside /data so
+          # it survives pod restarts and shares the PVC with the registry blobs.
+          env {
+            name  = "FORGEJO__packages__ENABLED"
+            value = "true"
+          }
+          env {
+            name  = "FORGEJO__packages__CHUNKED_UPLOAD_PATH"
+            value = "/data/tmp/package-upload"
+          }
          volume_mount {
            name       = "data"
            mount_path = "/data"
@@ -113,10 +125,10 @@ resource "kubernetes_deployment" "forgejo" {
          resources {
            requests = {
              cpu    = "15m"
-              memory = "384Mi"
+              memory = "1Gi"
            }
            limits = {
-              memory = "384Mi"
+              memory = "1Gi"
            }
          }
          port {
@@ -165,6 +177,9 @@ module "ingress" {
  namespace       = kubernetes_namespace.forgejo.metadata[0].name
  name            = "forgejo"
  tls_secret_name = var.tls_secret_name
+  # OCI registry pushes ship full image layer blobs in one request; cap them
+  # at 5g — generous enough for multi-GB layers, finite enough to stop runaways.
+  max_body_size = "5g"
  extra_annotations = {
    "gethomepage.dev/enabled" = "true"
    "gethomepage.dev/name"    = "Forgejo"
diff --git a/stacks/infra/main.tf b/stacks/infra/main.tf
index ef404685..c08af833 100644
--- a/stacks/infra/main.tf
+++ b/stacks/infra/main.tf
@@ -83,6 +83,12 @@ module "k8s-node-template" {
    mkdir -p /etc/containerd/certs.d/registry.viktorbarzin.me
    printf 'server = "https://registry.viktorbarzin.me"\n\n[host."https://10.0.20.10:5050"]\n  capabilities = ["pull", "resolve", "push"]\n  skip_verify = true\n' > /etc/containerd/certs.d/registry.viktorbarzin.me/hosts.toml
+    # Forgejo OCI registry: redirect to in-cluster Traefik LB (10.0.20.200) so
+    # pulls don't hairpin out through the WAN gateway. Traefik serves the
+    # *.viktorbarzin.me wildcard so SNI verification still passes.
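+    # Keep this block in sync with scripts/setup-forgejo-containerd-mirror.sh,
+    # which writes the same hosts.toml onto already-provisioned nodes.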
+ mkdir -p /etc/containerd/certs.d/forgejo.viktorbarzin.me + printf 'server = "https://forgejo.viktorbarzin.me"\n\n[host."https://10.0.20.200"]\n capabilities = ["pull", "resolve"]\n' > /etc/containerd/certs.d/forgejo.viktorbarzin.me/hosts.toml + # Low-traffic registries (registry.k8s.io, quay.io, reg.kyverno.io) pull directly. # Pull-through cache removed: caused corrupted images (truncated downloads) # breaking VPA certgen and Kyverno image pulls. diff --git a/stacks/kyverno/modules/kyverno/registry-credentials.tf b/stacks/kyverno/modules/kyverno/registry-credentials.tf index 2a9fc7c5..6d55f005 100644 --- a/stacks/kyverno/modules/kyverno/registry-credentials.tf +++ b/stacks/kyverno/modules/kyverno/registry-credentials.tf @@ -29,6 +29,12 @@ resource "kubernetes_secret" "registry_credentials" { "10.0.20.10:5050" = { auth = base64encode("${data.vault_kv_secret_v2.viktor.data["registry_user"]}:${data.vault_kv_secret_v2.viktor.data["registry_password"]}") } + # Forgejo OCI registry — read-only PAT for the cluster-puller service + # account user. Pushes go through ci-pusher (separate PAT in Vault + # secret/ci/global, surfaced to Woodpecker). + "forgejo.viktorbarzin.me" = { + auth = base64encode("cluster-puller:${data.vault_kv_secret_v2.viktor.data["forgejo_pull_token"]}") + } } }) } diff --git a/stacks/monitoring/main.tf b/stacks/monitoring/main.tf index 0c207aa0..b222e226 100644 --- a/stacks/monitoring/main.tf +++ b/stacks/monitoring/main.tf @@ -33,5 +33,6 @@ module "monitoring" { kube_config_path = var.kube_config_path registry_user = data.vault_kv_secret_v2.viktor.data["registry_user"] registry_password = data.vault_kv_secret_v2.viktor.data["registry_password"] + forgejo_pull_token = data.vault_kv_secret_v2.viktor.data["forgejo_pull_token"] tier = local.tiers.cluster } diff --git a/stacks/monitoring/modules/monitoring/main.tf b/stacks/monitoring/modules/monitoring/main.tf index d55ac703..ceadb1eb 100644 --- a/stacks/monitoring/modules/monitoring/main.tf +++ b/stacks/monitoring/modules/monitoring/main.tf @@ -41,6 +41,11 @@ variable "registry_password" { type = string sensitive = true } +variable "forgejo_pull_token" { + type = string + sensitive = true + description = "PAT for the cluster-puller user, used by the Forgejo registry integrity probe." +} resource "kubernetes_namespace" "monitoring" { metadata { @@ -426,6 +431,203 @@ resource "kubernetes_cron_job_v1" "registry_integrity_probe" { } } +# ----------------------------------------------------------------------------- +# Forgejo registry integrity probe — same algorithm as registry-integrity-probe +# above, but targets the Forgejo OCI registry instead of registry-private. Runs +# in parallel with the existing probe during the dual-push bake; once Phase 4 +# decommissions registry-private, the registry-integrity-probe CronJob is +# deleted and only this one remains. +# +# Auth: HTTP Basic with cluster-puller PAT (read:package scope is enough to +# walk catalog + manifests). Reaches Forgejo via the in-cluster service so we +# don't hairpin out through Traefik for every probe run. 
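+# The REGISTRY_INSTANCE env below pins the pushed metrics' instance label to
+# the public hostname, so the instance-aware alerts can tell the two probes
+# apart even though this probe dials the in-cluster service.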
+# -----------------------------------------------------------------------------
+resource "kubernetes_secret" "forgejo_probe_credentials" {
+  metadata {
+    name      = "forgejo-probe-credentials"
+    namespace = kubernetes_namespace.monitoring.metadata[0].name
+  }
+  type = "Opaque"
+  data = {
+    REG_USER = "cluster-puller"
+    REG_PASS = var.forgejo_pull_token
+  }
+}
+
+resource "kubernetes_cron_job_v1" "forgejo_integrity_probe" {
+  metadata {
+    name      = "forgejo-integrity-probe"
+    namespace = kubernetes_namespace.monitoring.metadata[0].name
+  }
+  spec {
+    concurrency_policy            = "Forbid"
+    failed_jobs_history_limit     = 3
+    successful_jobs_history_limit = 3
+    schedule                      = "*/15 * * * *"
+    job_template {
+      metadata {}
+      spec {
+        backoff_limit              = 1
+        ttl_seconds_after_finished = 600
+        template {
+          metadata {}
+          spec {
+            container {
+              name  = "forgejo-integrity-probe"
+              image = "docker.io/library/alpine:3.20"
+              env {
+                name = "REG_USER"
+                value_from {
+                  secret_key_ref {
+                    name = kubernetes_secret.forgejo_probe_credentials.metadata[0].name
+                    key  = "REG_USER"
+                  }
+                }
+              }
+              env {
+                name = "REG_PASS"
+                value_from {
+                  secret_key_ref {
+                    name = kubernetes_secret.forgejo_probe_credentials.metadata[0].name
+                    key  = "REG_PASS"
+                  }
+                }
+              }
+              env {
+                name  = "REGISTRY_HOST"
+                value = "forgejo.forgejo.svc.cluster.local"
+              }
+              env {
+                name  = "REGISTRY_SCHEME"
+                value = "http"
+              }
+              env {
+                name  = "REGISTRY_INSTANCE"
+                value = "forgejo.viktorbarzin.me"
+              }
+              env {
+                name  = "PUSHGATEWAY"
+                value = "http://prometheus-prometheus-pushgateway.monitoring:9091/metrics/job/forgejo-integrity-probe"
+              }
+              env {
+                name  = "TAGS_PER_REPO"
+                value = "5"
+              }
+              command = ["/bin/sh", "-c", <<-EOT
+                set -eu
+                apk add --no-cache curl jq >/dev/null
+
+                REG="$REGISTRY_HOST"
+                SCHEME="$${REGISTRY_SCHEME:-https}"
+                INSTANCE="$REGISTRY_INSTANCE"
+                AUTH="$REG_USER:$REG_PASS"
+                ACCEPT='application/vnd.oci.image.index.v1+json,application/vnd.oci.image.manifest.v1+json,application/vnd.docker.distribution.manifest.list.v2+json,application/vnd.docker.distribution.manifest.v2+json'
+
+                push() {
+                  curl -sf --max-time 10 --data-binary @- "$PUSHGATEWAY" >/dev/null 2>&1 || true
+                }
+
+                CATALOG=$(curl -sk -u "$AUTH" --max-time 30 "$SCHEME://$REG/v2/_catalog?n=1000" || echo "")
+                REPOS=$(echo "$CATALOG" | jq -r '.repositories[]?' 2>/dev/null || echo "")
+
+                if [ -z "$REPOS" ]; then
+                  echo "ERROR: empty catalog or auth failure — cannot probe"
+                  NOW=$(date +%s)
+                  push <<METRICS
+registry_manifest_integrity_catalog_accessible{instance="$INSTANCE"} 0
+METRICS
+                  exit 1
+                fi
+
+                REPOS_N=0
+                TAGS_N=0
+                INDEXES_N=0
+                FAIL=0
+
+                echo "$REPOS" > /tmp/repos.txt
+                while IFS= read -r repo; do
+                  [ -z "$repo" ] && continue
+                  REPOS_N=$((REPOS_N + 1))
+
+                  TAGS_JSON=$(curl -sk -u "$AUTH" --max-time 15 "$SCHEME://$REG/v2/$repo/tags/list" || echo "")
+                  echo "$TAGS_JSON" | jq -r '.tags[]?' 2>/dev/null | tail -n "$TAGS_PER_REPO" > /tmp/tags.txt || true
+
+                  while IFS= read -r tag; do
+                    [ -z "$tag" ] && continue
+                    TAGS_N=$((TAGS_N + 1))
+
+                    HTTP=$(curl -sk -u "$AUTH" -o /tmp/m.json -w '%%{http_code}' \
+                      -H "Accept: $ACCEPT" --max-time 15 \
+                      "$SCHEME://$REG/v2/$repo/manifests/$tag")
+                    if [ "$HTTP" != "200" ]; then
+                      echo "FAIL: $repo:$tag manifest HTTP $HTTP"
+                      FAIL=$((FAIL + 1))
+                      continue
+                    fi
+
+                    MT=$(jq -r '.mediaType // empty' /tmp/m.json 2>/dev/null || echo "")
+                    if echo "$MT" | grep -Eq 'manifest\.list|image\.index'; then
+                      INDEXES_N=$((INDEXES_N + 1))
+                      jq -r '.manifests[].digest' /tmp/m.json > /tmp/children.txt 2>/dev/null || true
+                      while IFS= read -r d; do
+                        [ -z "$d" ] && continue
+                        CH=$(curl -sk -u "$AUTH" -o /dev/null -w '%%{http_code}' \
+                          -H "Accept: $ACCEPT" --max-time 10 -I \
+                          "$SCHEME://$REG/v2/$repo/manifests/$d")
+                        if [ "$CH" != "200" ]; then
+                          echo "FAIL: $repo:$tag index child $d HTTP $CH"
+                          FAIL=$((FAIL + 1))
+                        fi
+                      done < /tmp/children.txt
+                    fi
+                  done < /tmp/tags.txt
+                done < /tmp/repos.txt
+
+                NOW=$(date +%s)
+                push < 3600
         for: 15m
         labels:
           severity: warning
         annotations:
-          summary: "Registry integrity probe has not reported in >1h — CronJob may be broken"
+          summary: "{{ $labels.instance }} integrity probe has not reported in >1h — CronJob may be broken"
       - alert: RegistryCatalogInaccessible
         expr: registry_manifest_integrity_catalog_accessible == 0
         for: 15m
         labels:
           severity: critical
         annotations:
-          summary: "Registry probe cannot fetch /v2/_catalog — auth failure or registry down"
+          summary: "{{ $labels.instance }} probe cannot fetch /v2/_catalog — auth failure or registry down"
       - alert: NodeHighCPUUsage
         expr: pve_cpu_usage_ratio * 100 > 60
         for: 6h