infra/docs/runbooks/forgejo-registry-setup.md

Runbook: Forgejo OCI registry — initial setup

Last updated: 2026-05-07

This runbook covers the one-time bootstrap of Forgejo's container registry, executed during Phase 0 of the registry consolidation plan (docs/plans/2026-05-07-forgejo-registry-consolidation-plan.md).

After this runbook is complete, the Forgejo OCI registry at forgejo.viktorbarzin.me accepts pushes from CI and pulls from the cluster, with retention and integrity monitoring in place.

Order of operations

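The Terraform stacks reference Vault keys that don't exist on a fresh cluster. Create the keys before running the scripts/tg apply commands that read them (step 5); only the step 1 apply, which does not depend on the new keys, is safe to run first.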

  1. Apply the resource bumps (memory, PVC, ingress body size, packages env vars) — these don't depend on the new Vault keys.
  2. Create the service-account users + PATs in Forgejo.
  3. Push the PATs to Vault.
  4. Apply the rest of Phase 0 (registry-credentials extension, monitoring probe, retention CronJob).
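The ordering constraint above can be checked mechanically before step 5. The following is a minimal sketch, not part of the infra repo: the helper name missing_keys is hypothetical, and the commented usage assumes a KV v2 mount at secret/ (which vault kv patch implies) plus jq on the workstation.

```shell
#!/usr/bin/env bash
# missing_keys REQUIRED PRESENT
# Prints every whitespace-separated name in REQUIRED that is absent from PRESENT.
# Hypothetical helper for illustration only.
missing_keys() {
  local required="$1" present="$2" normalized key
  # Collapse newlines/tabs in PRESENT into single spaces, padded on both sides.
  normalized=" $(printf '%s ' $present)"
  for key in $required; do
    case "$normalized" in
      *" $key "*) ;;          # key exists, nothing to report
      *) echo "$key" ;;       # key is missing
    esac
  done
}

# Usage against the live Vault (after `vault login`):
#   present="$(vault kv get -format=json secret/viktor | jq -r '.data.data | keys[]')"
#   missing_keys "forgejo_pull_token forgejo_cleanup_token" "$present"
```

An empty result means both keys are in place and the remaining Phase 0 applies should be able to resolve them.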

Step 1 — apply Forgejo deployment bumps

cd infra/stacks/forgejo
scripts/tg apply

Wait for the new pod to come up at the bumped 1Gi memory request and the resized 15Gi PVC. Verify packages are enabled:

kubectl exec -n forgejo deploy/forgejo -- forgejo manager flush-queues
kubectl exec -n forgejo deploy/forgejo -- env | grep PACKAGES
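The grep above only prints matching variables; it does not fail when packages are disabled. A small sketch of a stricter check follows. The variable-name pattern (anything ending in __PACKAGES__ENABLED, in any letter case) is an assumption about how the stack sets the option, so adjust it to the actual env var.

```shell
#!/usr/bin/env bash
# packages_enabled: reads `env` output on stdin and succeeds only if a variable
# whose name ends in __PACKAGES__ENABLED (case-insensitive) is set to true.
# The naming pattern is an assumption, not taken from the stack definition.
packages_enabled() {
  grep -Eiq '^[A-Za-z_]+__PACKAGES__ENABLED=true$'
}

# Usage against the live deployment:
#   kubectl rollout status -n forgejo deploy/forgejo --timeout=5m
#   kubectl exec -n forgejo deploy/forgejo -- env | packages_enabled \
#     || echo "packages are NOT enabled" >&2
```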

Step 2 — create service-account users

forgejo admin user create is not idempotent: re-running it for an existing user errors out. That's fine; skip the command on a rerun. Pass --must-change-password=false so the account can sign in without being forced through a password change first.

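# cluster-puller: read:package PAT for in-cluster pulls.
# Keep the generated password in a variable; step 3 needs it for the Web UI login.
CLUSTER_PULLER_PASSWORD="$(openssl rand -base64 24)"
kubectl exec -n forgejo deploy/forgejo -- \
  forgejo admin user create \
  --username cluster-puller \
  --email cluster-puller@viktorbarzin.me \
  --password "$CLUSTER_PULLER_PASSWORD" \
  --must-change-password=false

# ci-pusher: write:package PAT for CI dual-push. The same user also owns the
# cleanup CronJob PAT (write:package includes delete).
CI_PUSHER_PASSWORD="$(openssl rand -base64 24)"
kubectl exec -n forgejo deploy/forgejo -- \
  forgejo admin user create \
  --username ci-pusher \
  --email ci-pusher@viktorbarzin.me \
  --password "$CI_PUSHER_PASSWORD" \
  --must-change-password=false

# Print both passwords once for the step 3 logins.
echo "cluster-puller: $CLUSTER_PULLER_PASSWORD"
echo "ci-pusher: $CI_PUSHER_PASSWORD"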

The user passwords are throwaway — we only ever auth via PAT. Forgejo admin can reset them at any time from the Web UI.
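Because the create command fails on an existing user, a rerun can check for the account first. The sketch below assumes that forgejo admin user list prints a whitespace-separated table with a header row and the username in the second column; the helper name user_in_list is hypothetical.

```shell
#!/usr/bin/env bash
# user_in_list USERNAME: reads `forgejo admin user list` output on stdin and
# succeeds if USERNAME appears in column 2 of any non-header row.
# Assumes a header row followed by whitespace-separated columns (ID Username ...).
user_in_list() {
  awk -v want="$1" 'NR > 1 && $2 == want { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage against the live deployment:
#   if kubectl exec -n forgejo deploy/forgejo -- forgejo admin user list \
#        | user_in_list ci-pusher; then
#     echo "ci-pusher already exists, skipping create"
#   fi
```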

Step 3 — generate the PATs

PATs are generated through the Web UI while logged in as the respective user. Recent Forgejo releases also ship a forgejo admin user generate-access-token subcommand, but this runbook uses the Web UI flow. To log in without OAuth (registration is disabled for everyone except viktor, the admin), use the per-user temporary password from step 2.

For each of cluster-puller and ci-pusher:

  1. Sign out of viktor.
  2. Go to https://forgejo.viktorbarzin.me/user/login and sign in with the throwaway password.
  3. Settings → Applications → Generate new token.
  4. Name: cluster-pull / ci-push. Expiration: never.
  5. Scopes:
    • cluster-puller: read:package
    • ci-pusher: write:package (covers read+write+delete)
  6. Save the token shown on the next page — it is not displayed again.

For the cleanup CronJob, generate a second PAT on ci-pusher (the third token overall):

  1. While still signed in as ci-pusher, repeat steps 3-6 with name cleanup, scope write:package.
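If forgejo admin user generate-access-token --help works on the deployed Forgejo version, the same three tokens could in principle be minted from the CLI instead. The sketch below only prints the commands and does not run them. It assumes the subcommand accepts --username, --token-name, --scopes and --raw, as it does in Gitea, which Forgejo inherits from; verify against the installed build before relying on it. The helper name pat_command is hypothetical.

```shell
#!/usr/bin/env bash
# pat_command USER TOKEN_NAME SCOPE
# Prints (does not execute) a kubectl command that would create a PAT via the
# Forgejo CLI. Flag names are assumed from the Gitea-derived CLI; check --help.
pat_command() {
  printf 'kubectl exec -n forgejo deploy/forgejo -- forgejo admin user generate-access-token --username %s --token-name %s --scopes %s --raw\n' \
    "$1" "$2" "$3"
}

# The three tokens from this step, with the same names and scopes:
pat_command cluster-puller cluster-pull read:package
pat_command ci-pusher ci-push write:package
pat_command ci-pusher cleanup write:package
```

Printing rather than executing keeps the token value out of any wrapper script; run each printed line by hand and copy the token straight into step 4.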

Step 4 — push PATs to Vault

vault login -method=oidc

# Read-only, used by the cluster-wide registry-credentials Secret and
# by the Forgejo integrity probe.
vault kv patch secret/viktor \
  forgejo_pull_token=<paste cluster-puller PAT>

# Write+delete, used by the retention CronJob inside Forgejo's
# namespace.
vault kv patch secret/viktor \
  forgejo_cleanup_token=<paste ci-pusher cleanup PAT>

# Write, propagated by vault-woodpecker-sync to all Woodpecker repos.
vault kv patch secret/ci/global \
  forgejo_user=ci-pusher \
  forgejo_push_token=<paste ci-pusher push PAT>
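Pasting a token directly on the command line leaves it in shell history. A minimal sketch of an alternative follows: it prompts without echo and pipes the value to Vault, using the Vault CLI convention that key=- reads the value from stdin. The 40-character lowercase hex check is an assumption about the PAT format, and both helper names are hypothetical.

```shell
#!/usr/bin/env bash
# looks_like_pat TOKEN: succeeds if TOKEN is exactly 40 lowercase hex characters.
# Assumption: Forgejo PATs use this format; relax the pattern if yours differ.
looks_like_pat() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# store_pat VAULT_PATH KEY: prompts for the token without echoing it, validates
# the format, then writes it with `vault kv patch`, reading the value from stdin.
store_pat() {
  local path="$1" key="$2" token
  read -rs -p "Paste PAT for ${key}: " token; echo
  looks_like_pat "$token" || { echo "not a 40-char hex token" >&2; return 1; }
  printf '%s' "$token" | vault kv patch "$path" "${key}=-"
}

# Usage (after `vault login -method=oidc`):
#   store_pat secret/viktor forgejo_pull_token
#   store_pat secret/viktor forgejo_cleanup_token
#   store_pat secret/ci/global forgejo_push_token
```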

Step 5 — apply the rest of Phase 0

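# Subshells keep each cd from leaking into the next command, so every path
# below stays relative to the same starting directory.

# Registry credential Secret (now reads forgejo_pull_token).
(cd infra/stacks/kyverno && scripts/tg apply)

# Monitoring probe + retention CronJob.
(cd infra/stacks/monitoring && scripts/tg apply)
(cd infra/stacks/forgejo && scripts/tg apply)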

# Containerd hosts.toml on each existing k8s node — VM cloud-init
# only fires on first boot.
infra/scripts/setup-forgejo-containerd-mirror.sh

Verification

# Login from a workstation with docker.
echo "<ci-pusher PAT>" | docker login forgejo.viktorbarzin.me -u ci-pusher --password-stdin

# Push a smoketest image.
docker pull alpine:3.20
docker tag alpine:3.20 forgejo.viktorbarzin.me/viktor/smoketest:1
docker push forgejo.viktorbarzin.me/viktor/smoketest:1

# Pull from a k8s node.
ssh wizard@<node> sudo crictl pull forgejo.viktorbarzin.me/viktor/smoketest:1

# Confirm the cluster-wide Secret was synced into a fresh namespace.
kubectl create namespace forgejo-smoketest
kubectl get secret -n forgejo-smoketest registry-credentials \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq '.auths | keys'
# Expect: ["10.0.20.10:5050", "forgejo.viktorbarzin.me",
#         "registry.viktorbarzin.me", "registry.viktorbarzin.me:5050"]
kubectl delete namespace forgejo-smoketest

# Delete the smoketest package via API.
curl -X DELETE -H "Authorization: token <ci-pusher cleanup PAT>" \
  https://forgejo.viktorbarzin.me/api/v1/packages/viktor/container/smoketest/1
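Two of the checks above benefit from small wrappers: the Secret is cloned into the new namespace asynchronously, so the kubectl get may need a few attempts, and the curl DELETE prints nothing on success. The sketch below shows hypothetical helpers for both. The status mapping assumes the packages API answers 204 on a successful delete and 404 when the version is already gone; CLEANUP_PAT is a placeholder variable.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY CMD...: reruns CMD until it succeeds or ATTEMPTS run out.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# describe_delete_status CODE: interprets the HTTP status of the package DELETE.
describe_delete_status() {
  case "$1" in
    204) echo "deleted" ;;
    404) echo "not found (already deleted?)" ;;
    401|403) echo "auth failed: check the PAT and its scope"; return 1 ;;
    *) echo "unexpected status $1"; return 1 ;;
  esac
}

# Usage:
#   retry 10 3 kubectl get secret -n forgejo-smoketest registry-credentials
#   code="$(curl -s -o /dev/null -w '%{http_code}' -X DELETE \
#     -H "Authorization: token $CLEANUP_PAT" \
#     https://forgejo.viktorbarzin.me/api/v1/packages/viktor/container/smoketest/1)"
#   describe_delete_status "$code"
```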

When to revisit

  • PAT rotation: PATs created here have no expiry by design. If a PAT leaks, regenerate it via the Web UI and vault kv patch the new value into the same key. The next terragrunt apply propagates it to cluster consumers within minutes (a Kyverno ClusterPolicy clones the Secret); Woodpecker repos pick it up on the next vault-woodpecker-sync run, which happens every 6h.
  • New service account: if a future workload needs different scopes, add a parallel user/PAT here rather than expanding existing PAT scope. Principle of least privilege.
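For rotation, the part that is easy to get wrong is which Vault path and key each token lives under. The sketch below encodes the mapping from step 4 and prints the matching vault kv patch command, with the value read from stdin via key=-. The role names pull, cleanup and push and both helper names are hypothetical labels for this example.

```shell
#!/usr/bin/env bash
# vault_target ROLE: prints "<vault path> <key>" for a PAT role from step 4.
vault_target() {
  case "$1" in
    pull)    echo "secret/viktor forgejo_pull_token" ;;
    cleanup) echo "secret/viktor forgejo_cleanup_token" ;;
    push)    echo "secret/ci/global forgejo_push_token" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# rotate_command ROLE: prints the vault command to run once the PAT has been
# regenerated in the Web UI. Nothing is executed here.
rotate_command() {
  local target
  target="$(vault_target "$1")" || return 1
  set -- $target              # $1 = vault path, $2 = key
  echo "vault kv patch $1 $2=-"
}

rotate_command pull
```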