4 commits

Author SHA1 Message Date
Viktor Barzin
90c944a265 woodpecker: disable partial clone (partial: false) — fix intermittent git exit-128
All checks were successful
ci/woodpecker/push/default Pipeline was successful
Infra pipelines were failing intermittently across all authors (e.g. #241-244,
#247) with the git clone step exiting 128:

  git fetch --depth=1 --filter=tree:0 ...   (partial/treeless clone)
  git reset --hard <sha>
  fatal: could not fetch <tree-sha> from promisor remote
  remote: 404 page not found

The plugin-git clone step defaulted to a partial (treeless) clone. The initial
ref fetch carries credentials, but the lazy *promisor* object fetch triggered by
`git reset --hard` hits the private Forgejo repo without credentials, gets a
404, and git exits 128. Whether a given run hit the failure was effectively
random, hence the ~50% intermittent failure rate fleet-wide (not specific to
any commit).

Fix: set `partial: false` on every clone block so all objects for the (still
shallow) commit are fetched upfront with creds — no fragile lazy promisor fetch.
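
For illustration, a clone block with the setting applied might look roughly
like this in a Woodpecker workflow file. Only `partial: false` is taken from
this commit; the image reference is an assumption, and the real clone blocks
may carry other settings.

```yaml
# Sketch of a Woodpecker clone block; only `partial: false` comes from this commit.
clone:
  git:
    image: woodpeckerci/plugin-git   # assumed image reference; keep whatever the repo already pins
    settings:
      partial: false                 # fetch all objects for the commit upfront, with credentials
```

With the filter disabled, `git reset --hard` should find every tree and blob
locally and have no reason to contact the promisor remote.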

Diagnosed against the woodpecker Postgres DB (steps/log_entries) since the
Woodpecker HTTP API was itself flapping. Earlier "permission for ViktorBarzin"
log lines were an unrelated cross-forge red herring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 09:06:44 +00:00
Viktor Barzin
fd0f4a0365 fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:45:33 +00:00
Viktor Barzin
6d224861c4 stem95su: scheduled Drive->site sync CronJob (every 10m)
CronJob stem95su-gdrive-sync (*/10) mounts the content PVC RW and
rclone-syncs the read-only Drive folder "claude" (stem claude/files) onto
it (rclone/rclone:1.74.3, scope=drive.readonly, empty-source guard +
--max-delete 25). ESO ExternalSecret stem95su-rclone <- Vault
secret/stem95su. Requires the GCP OAuth app to be published to Production;
otherwise the refresh token expires ~weekly.
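
The shape of such a job could be sketched as a plain Kubernetes manifest.
This is a hypothetical rendering: the real definition lives in Terraform
(stacks/stem95su/gdrive-sync.tf), and the remote name `gdrive`, the mount
paths, the PVC name, the config file key, and the guard logic are all
assumptions. Only the job name, schedule, image tag, and `--max-delete 25`
come from this commit.

```yaml
# Hypothetical manifest; not the Terraform that this commit actually adds.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: stem95su-gdrive-sync
spec:
  schedule: "*/10 * * * *"
  concurrencyPolicy: Forbid              # assumption: never overlap two syncs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: rclone
              image: rclone/rclone:1.74.3
              command: ["/bin/sh", "-c"]
              args:
                - |
                  set -eu
                  # Empty-source guard (assumed form): refuse to sync when the
                  # Drive folder lists no files, so a failed or empty listing
                  # cannot wipe the site content.
                  count=$(rclone lsf --files-only -R "gdrive:claude" | wc -l)
                  if [ "$count" -eq 0 ]; then
                    echo "source is empty, skipping sync" >&2
                    exit 1
                  fi
                  # --max-delete caps how many files one run may remove.
                  rclone sync "gdrive:claude" /content --max-delete 25
              env:
                - name: RCLONE_CONFIG    # rclone reads its config path from this variable
                  value: /config/rclone.conf
              volumeMounts:
                - name: content
                  mountPath: /content    # hypothetical mount path for the content PVC
                - name: rclone-config
                  mountPath: /config
          volumes:
            - name: content
              persistentVolumeClaim:
                claimName: stem95su-content   # hypothetical PVC name
            - name: rclone-config
              secret:
                secretName: stem95su-rclone   # assumed: Secret written by the ExternalSecret, with an rclone.conf key
```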

Lands the gdrive-sync stack on master (it had landed on a feature branch
by accident on the shared devvm checkout).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:42:26 +00:00
Viktor Barzin
34ee282d88 [ci] Auto-sync modules/docker-registry/* to registry VM + runbook docs
Replaces the manual scp+bounce sequence that landed registry:2.8.3 on
10.0.20.10 today (see commit 7cb44d72 + nginx-DNS-trap in runbook).
Addresses the "no repeat manual fixes" preference — future changes to
docker-compose.yml / fix-broken-blobs.sh / nginx_registry.conf /
config-private.yml / cleanup-tags.sh now deploy through CI.

Pipeline (.woodpecker/registry-config-sync.yml) mirrors
pve-nfs-exports-sync.yml: ssh-keyscan pin, scp the whole managed set,
bounce compose only when compose-visible files changed, always restart
nginx after a compose bounce (critical — nginx caches upstream DNS), end
with a dry-run fix-broken-blobs.sh to catch regressions.
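
The flow described above could be sketched as a Woodpecker workflow along
these lines. This is not the actual .woodpecker/registry-config-sync.yml:
the step name, base image, key file name, remote directory `/opt/registry`,
the list of compose-visible files, and the `--dry-run` flag spelling are all
assumptions made for illustration.

```yaml
# Hypothetical sketch; details not quoted in the commit message are assumed.
when:
  - event: [push, manual]
    path: "modules/docker-registry/*"

steps:
  - name: sync-registry-config
    image: alpine:3
    environment:
      SSH_KEY:
        from_secret: registry_ssh_key
    commands:
      - apk add --no-cache openssh-client git
      - mkdir -p ~/.ssh && chmod 700 ~/.ssh
      - printf '%s\n' "$SSH_KEY" > ~/.ssh/id_ed25519 && chmod 600 ~/.ssh/id_ed25519
      # Host key pin via ssh-keyscan; how the real pipeline checks the scanned
      # key against a known entry is not shown in the commit message.
      - ssh-keyscan 10.0.20.10 >> ~/.ssh/known_hosts
      # Copy the whole managed set on every run.
      - scp modules/docker-registry/* root@10.0.20.10:/opt/registry/
      - |
        # Bounce compose only when a compose-visible file changed, then always
        # restart nginx so it re-resolves the upstream DNS name. Assumes the
        # clone has enough history for the diff to resolve both commits.
        if git diff --name-only "$CI_PREV_COMMIT_SHA" "$CI_COMMIT_SHA" \
            | grep -Eq 'docker-compose\.yml|config-private\.yml'; then
          ssh root@10.0.20.10 'cd /opt/registry && docker compose up -d --force-recreate && docker compose restart nginx'
        fi
      # Finish with a dry run of the blob checker to catch regressions.
      - ssh root@10.0.20.10 '/opt/registry/fix-broken-blobs.sh --dry-run'
```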

Credentials:
 - Woodpecker repo-secret `registry_ssh_key` (events: push, manual)
 - Mirror at Vault `secret/woodpecker/registry_ssh_key`
   (private_key / public_key / known_hosts_entry)
 - Public key on /root/.ssh/authorized_keys on 10.0.20.10
 - Key label: woodpecker-registry-config-sync

Runbook updated with "Auto-sync pipeline" section pointing at the new
flow + manual override command.

Closes: code-3vl

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:32:12 +00:00