fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 6d224861c4
commit fd0f4a0365
1166 changed files with 358546 additions and 0 deletions
507
docs/plans/2026-02-28-ci-build-caching-plan.md
Normal file
@@ -0,0 +1,507 @@
# CI Build Caching Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Speed up Woodpecker CI Docker image builds by adding BuildKit layer caching via a local private registry, with dual-push to Docker Hub and the local registry.

**Architecture:** Extend the existing Docker Compose registry stack on `10.0.20.10` with a new R/W `registry-private` service (port 5050). Configure Woodpecker `plugin-docker-buildx` pipelines with `cache_from`/`cache_to` pointing to `registry.viktorbarzin.lan:5050`. Push final images to both Docker Hub and the local registry. All changes are persisted in Terraform via `stacks/infra/main.tf` cloud-init provisioning.

**Tech Stack:** Docker Registry v2, nginx, Docker Compose, Woodpecker CI, BuildKit, Technitium DNS, Terraform

**Design doc:** `docs/plans/2026-02-28-ci-build-caching-design.md`

**Key context:** The registry VM at `10.0.20.10` is fully managed via Terraform in `stacks/infra/main.tf`. Config files live in `modules/docker-registry/` and are read by Terraform via `file()` and `templatefile()`, then base64-encoded into cloud-init `provision_cmds`. Changes to config files require updating both the files AND the cloud-init provisioning in `stacks/infra/main.tf`. Since the VM is already running, we also SCP the updated files to the live VM for immediate effect.

---

### Task 1: Create private registry config file

**Files:**
- Create: `modules/docker-registry/config-private.yml`

**Step 1: Create the config file**

This is a standard `registry:2` config WITHOUT the `proxy` section (the missing `proxy` section is what makes it R/W instead of a read-only pull-through cache). It is based on the existing `config.yaml`, but with 100GiB of storage and no proxy or auth sections.

```yaml
version: 0.1
log:
  fields:
    service: registry-private
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
    maxsize: 100GiB
  delete:
    enabled: true
  maintenance:
    uploadpurging:
      enabled: true
      age: 168h
      interval: 4h
      dryrun: false
http:
  addr: :5000
  headers:
    X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
```

Key differences from the proxy configs:
- No `proxy` section → allows pushes
- `maxsize: 100GiB` (user requested 100GB)
- `uploadpurging.age: 168h` (7 days, since build cache layers are re-pushed frequently)
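
Since the absence of a `proxy` section is what makes the registry writable, a small pre-deploy check can guard against the section being reintroduced by accident. The following is a minimal sketch rather than anything that exists in the repo: `is_read_write_config` is a hypothetical helper name, and it assumes top-level keys in the config file are unindented.

```bash
# Hypothetical helper (not in the repo): succeeds when a registry config
# file has no top-level `proxy:` key, i.e. the registry would accept pushes.
is_read_write_config() {
  [ -r "$1" ] || return 2          # unreadable file: report an error, not "R/W"
  ! grep -q '^proxy:' "$1"
}

# Local demo against two throwaway files (contents are illustrative only).
rw_cfg=$(mktemp)
ro_cfg=$(mktemp)
printf 'version: 0.1\nstorage:\n  delete:\n    enabled: true\n' > "$rw_cfg"
printf 'version: 0.1\nproxy:\n  remoteurl: https://registry-1.docker.io\n' > "$ro_cfg"

is_read_write_config "$rw_cfg" && echo "rw config: accepts pushes"
is_read_write_config "$ro_cfg" || echo "ro config: pull-through proxy"
rm -f "$rw_cfg" "$ro_cfg"
```

Against the real file, the call would be `is_read_write_config modules/docker-registry/config-private.yml`, run before the commit in Step 2.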

**Step 2: Commit**

```bash
git add modules/docker-registry/config-private.yml
git commit -m "[ci skip] add private R/W registry config for CI build caching"
```

---

### Task 2: Add registry-private service to Docker Compose

**Files:**
- Modify: `modules/docker-registry/docker-compose.yml`

**Step 1: Add the registry-private service**

Add this service block after `registry-kyverno` (before `nginx`):

```yaml
  registry-private:
    image: registry:2
    container_name: registry-private
    restart: always
    volumes:
      - /opt/registry/data/private:/var/lib/registry
      - /opt/registry/config-private.yml:/etc/docker/registry/config.yml:ro
    networks:
      - registry
    healthcheck:
      test: ["CMD", "sh", "-c", "wget -qO- http://localhost:5000/v2/ >/dev/null 2>&1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
```

**Step 2: Add port 5050 to the nginx service**

In the `nginx` service `ports` list, add:

```yaml
      - "5050:5050"
```

**Step 3: Add registry-private to nginx depends_on**

```yaml
      registry-private:
        condition: service_healthy
```

**Step 4: Commit**

```bash
git add modules/docker-registry/docker-compose.yml
git commit -m "[ci skip] add registry-private service to Docker Compose stack"
```

---

### Task 3: Add nginx upstream and server block for private registry

**Files:**
- Modify: `modules/docker-registry/nginx_registry.conf`

**Step 1: Add upstream block**

After the existing `upstream kyverno` block, add:

```nginx
upstream private {
    server registry-private:5000;
    keepalive 32;
}
```

**Step 2: Add server block**

After the last server block (kyverno on port 5040), add:

```nginx
# --- Private R/W Registry (port 5050) ---

server {
    listen 5050;
    server_name _;

    client_max_body_size 0;
    proxy_request_buffering off;
    proxy_buffering off;
    chunked_transfer_encoding on;

    location /v2/ {
        proxy_pass http://private;
        proxy_http_version 1.1;
        # $http_host keeps the :5050 port; $host drops it, and the registry
        # builds the upload Location URLs it returns from this header.
        proxy_set_header Host $http_host;
        proxy_set_header Connection "";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        proxy_read_timeout 900;
        proxy_send_timeout 900;
    }

    location / {
        return 200 'ok';
        add_header Content-Type text/plain;
    }
}
```

Key differences from the read-only proxy server blocks:
- **No `proxy_cache`** directives: caching uploads would corrupt pushes
- **`proxy_buffering off`**: important for large layer uploads
- **`chunked_transfer_encoding on`**: Docker push uses chunked uploads
- **`X-Real-IP` / `X-Forwarded-For`** headers: useful for debugging

**Step 3: Commit**

```bash
git add modules/docker-registry/nginx_registry.conf
git commit -m "[ci skip] add nginx upstream and server block for private registry on port 5050"
```

---

### Task 4: Update Terraform provisioning for the private registry

**Files:**
- Modify: `stacks/infra/main.tf` (lines 119-274, the docker-registry-template and docker-registry-vm modules)

**Step 1: Add private registry data directory to `provision_cmds`**

In the `mkdir` command at line 152, append the private registry directory. Change:

```hcl
"mkdir -p /opt/registry/data/dockerhub /opt/registry/data/ghcr /opt/registry/data/quay /opt/registry/data/k8s /opt/registry/data/kyverno",
```

to:

```hcl
"mkdir -p /opt/registry/data/dockerhub /opt/registry/data/ghcr /opt/registry/data/quay /opt/registry/data/k8s /opt/registry/data/kyverno /opt/registry/data/private",
```

**Step 2: Add config-private.yml deployment command**

After the kyverno config block (line 203), add:

```hcl
# Write private R/W registry config (no proxy = accepts pushes)
format("echo %s | base64 -d > /opt/registry/config-private.yml",
  base64encode(file("${path.root}/../../modules/docker-registry/config-private.yml"))
),
```
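
The encode-then-decode pattern used by this `provision_cmds` entry can be tried locally before touching Terraform. The sketch below is only an approximation in plain shell: `encode_file` and `provision_cmd` are hypothetical helper names, and the temp files stand in for the real config and its destination on the VM.

```bash
# Hypothetical helpers (not in the repo) mimicking what the Terraform
# expression produces: a single-line base64 payload embedded in a command.

# Encode a file as single-line base64, like base64encode(file(...)).
encode_file() {
  base64 < "$1" | tr -d '\n'
}

# Print the shell command cloud-init would run to recreate the file.
provision_cmd() {
  printf 'echo %s | base64 -d > %s\n' "$(encode_file "$1")" "$2"
}

# Local demo: write a small stand-in config, run the generated command,
# and confirm the decoded copy is byte-identical to the source.
src=$(mktemp)
dst=$(mktemp)
printf 'version: 0.1\nlog:\n  fields:\n    service: registry-private\n' > "$src"
eval "$(provision_cmd "$src" "$dst")"
cmp -s "$src" "$dst" && echo "round trip ok"
rm -f "$src" "$dst"
```

Base64 is used here because the payload then contains no quotes, newlines, or `$` characters that the surrounding shell command could misinterpret.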

**Step 3: Add garbage collection cron for private registry**

After the kyverno garbage collection cron (line 239), add:

```hcl
"( crontab -l 2>/dev/null; echo '25 3 * * 0 /usr/bin/docker exec registry-private registry garbage-collect -m /etc/docker/registry/config.yml >> /var/log/registry-gc.log 2>&1' ) | crontab -",
```

This follows the existing staggered pattern (each registry offset by 5 minutes).
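
The staggering can be expressed as a small generator so that the minute field is derived rather than typed by hand. This is a sketch under an assumption: the plan only confirms that `registry-private` runs at minute 25, so numbering the slots from 0 (making `registry-private` slot 5) is a guess at the existing layout, and `gc_cron_line` is a hypothetical helper name.

```bash
# Hypothetical helper: build the weekly GC cron line for one registry.
# Assumption: slot N runs at minute N*5 past 03:00 on Sundays. Only slot 5
# (registry-private at minute 25) is confirmed by the plan itself.
gc_cron_line() {
  container=$1
  slot=$2
  [ "$slot" -lt 12 ] || return 1   # slot 12 would overflow the minute field
  minute=$((slot * 5))
  printf '%s 3 * * 0 /usr/bin/docker exec %s registry garbage-collect -m /etc/docker/registry/config.yml >> /var/log/registry-gc.log 2>&1\n' \
    "$minute" "$container"
}

# Reproduces the cron entry added in this step.
gc_cron_line registry-private 5
```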

**Step 4: Update the VM module comment block**

At lines 266-273, update the port documentation comment to include port 5050:

```hcl
# All ports go through nginx for request serialization (proxy_cache_lock):
# 5000 -> nginx -> registry-dockerhub (docker.io proxy)
# 5001 -> registry-dockerhub direct (Prometheus metrics)
# 5010 -> nginx -> registry-ghcr (ghcr.io proxy)
# 5020 -> nginx -> registry-quay (quay.io proxy)
# 5030 -> nginx -> registry-k8s (registry.k8s.io proxy)
# 5040 -> nginx -> registry-kyverno (reg.kyverno.io proxy)
# 5050 -> nginx -> registry-private (R/W registry for CI build cache)
# 8080 -> registry-ui (joxit/docker-registry-ui)
```

**Step 5: Commit**

```bash
git add stacks/infra/main.tf
git commit -m "[ci skip] add private registry to Terraform cloud-init provisioning"
```

**Note:** This updates the cloud-init template. The running VM does not pick up these changes automatically, because cloud-init only applies them when a VM is freshly created from the template. For the running VM, Task 5 deploys the files via SCP, which keeps the live VM consistent with what Terraform would provision.

---

### Task 5: Deploy to the running registry VM

Since the registry VM is already running (cloud-init only runs on first boot), we deploy the updated files directly via SSH/SCP for immediate effect.

**Step 1: SSH to the registry VM and create the storage directory**

```bash
ssh root@10.0.20.10 "mkdir -p /opt/registry/data/private"
```

**Step 2: Copy updated files to the VM**

```bash
scp modules/docker-registry/docker-compose.yml root@10.0.20.10:/opt/registry/docker-compose.yml
scp modules/docker-registry/config-private.yml root@10.0.20.10:/opt/registry/config-private.yml
scp modules/docker-registry/nginx_registry.conf root@10.0.20.10:/opt/registry/nginx.conf
```

Note: The nginx config is stored as `/opt/registry/nginx.conf` on the VM (the docker-compose mounts it as `nginx.conf`).

**Step 3: Restart the Docker Compose stack**

```bash
ssh root@10.0.20.10 "cd /opt/registry && docker compose up -d"
```

This creates the new `registry-private` container and recreates the `nginx` container, since its service definition changed, so that it publishes the new port and loads the updated config.

**Step 4: Add garbage collection cron on the running VM**

```bash
ssh root@10.0.20.10 '( crontab -l 2>/dev/null; echo "25 3 * * 0 /usr/bin/docker exec registry-private registry garbage-collect -m /etc/docker/registry/config.yml >> /var/log/registry-gc.log 2>&1" ) | crontab -'
```
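
One caveat with the append pattern above is that running it twice leaves two identical cron entries. A possible way to make the step safe to repeat is sketched below; `add_cron_line_once` is a hypothetical helper, not something the repo provides, and the demo works on a made-up crontab passed via stdin rather than the real one.

```bash
# Hypothetical helper: reads a crontab on stdin and prints it with exactly
# one copy of the given line at the end (existing identical lines are dropped).
add_cron_line_once() {
  line=$1
  grep -vxF -- "$line" || true    # grep exits 1 when no other lines remain
  printf '%s\n' "$line"
}

GC_LINE='25 3 * * 0 /usr/bin/docker exec registry-private registry garbage-collect -m /etc/docker/registry/config.yml >> /var/log/registry-gc.log 2>&1'

# Local demo: the line is already present once, and stays present once.
printf '0 4 * * * /usr/local/bin/other-job\n%s\n' "$GC_LINE" \
  | add_cron_line_once "$GC_LINE"

# On the VM this would be wired up as (not run here):
#   ( crontab -l 2>/dev/null | add_cron_line_once "$GC_LINE" ) | crontab -
```

Because the function is defined locally, using it over SSH would mean either copying it to the VM or inlining the `grep -vxF` filter into the remote command.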

**Step 5: Verify the private registry is accessible**

```bash
curl -s http://10.0.20.10:5050/v2/
# Expected: {} (empty JSON object = registry is up)

curl -s http://10.0.20.10:5050/v2/_catalog
# Expected: {"repositories":[]} (empty, no images pushed yet)
```
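
Because the compose healthcheck uses a `start_period` of 10s and nginx waits for `registry-private` to become healthy, the first `curl` right after `docker compose up -d` may fail even when nothing is wrong. A polling wrapper along these lines could avoid a false alarm; `wait_for_registry` is a hypothetical helper, and the attempt count and delay are arbitrary defaults rather than values from the plan.

```bash
# Hypothetical helper: poll a registry URL until it answers successfully.
#   $1 = URL, $2 = max attempts (default 10), $3 = seconds between tries (default 3)
wait_for_registry() {
  url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f turns HTTP errors into a non-zero exit; the body is discarded.
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Intended usage for this step (not run here, it needs the live VM):
#   wait_for_registry http://10.0.20.10:5050/v2/ && echo "registry is up"
```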

---

### Task 6: Add DNS record for registry.viktorbarzin.lan

**Step 1: Add A record via Technitium API**

```bash
# Technitium DNS API endpoint (web UI is on port 5380)
# Get API token from tfvars (technitium_password)
curl -s "http://10.0.20.204:5380/api/zones/records/add?token=<TECHNITIUM_TOKEN>&domain=registry.viktorbarzin.lan&zone=viktorbarzin.lan&type=A&ipAddress=10.0.20.10&overwrite=true"
```

Alternatively, add via Technitium web UI at `https://technitium.viktorbarzin.me`:
- Zone: `viktorbarzin.lan`
- Record: `registry` → A → `10.0.20.10`

**Step 2: Verify DNS resolution from a K8s pod**

```bash
kubectl run -it --rm dns-test --image=alpine --restart=Never -- nslookup registry.viktorbarzin.lan
# Expected: Address: 10.0.20.10
```

**Step 3: Verify registry is accessible via DNS name**

```bash
curl -s http://registry.viktorbarzin.lan:5050/v2/
# Expected: {}
```

---

### Task 7: Update build-cli.yml pipeline with BuildKit caching

**Files:**
- Modify: `.woodpecker/build-cli.yml`

**Step 1: Update the pipeline**

Replace the entire file content with:

```yaml
when:
  event: push

clone:
  git:
    image: woodpeckerci/plugin-git
    settings:
      attempts: 5
      backoff: 10s

steps:
  - name: build-image
    image: woodpeckerci/plugin-docker-buildx
    settings:
      username: "viktorbarzin"
      password:
        from_secret: dockerhub-pat
      repo:
        - viktorbarzin/infra
        - registry.viktorbarzin.lan:5050/infra
      logins:
        - registry: https://index.docker.io/v1/
          username: viktorbarzin
          password:
            from_secret: dockerhub-pat
      dockerfile: cli/Dockerfile
      context: cli
      auto_tag: true
      cache_from: type=registry,ref=registry.viktorbarzin.lan:5050/infra:buildcache
      cache_to: type=registry,ref=registry.viktorbarzin.lan:5050/infra:buildcache,mode=max
      buildkit_config: |
        [registry."registry.viktorbarzin.lan:5050"]
          http = true
          insecure = true
```

Key changes:
- `repo` is now a list, so the image is pushed to both Docker Hub and the local registry
- `logins` provides Docker Hub credentials explicitly (needed when `repo` is a list)
- `cache_from`/`cache_to` use registry-based BuildKit cache on the local registry
- `buildkit_config` allows HTTP access to the insecure local registry
- `mode=max` caches ALL layers (including intermediate build stages)
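
To make the mapping from plugin settings to BuildKit options more concrete, the sketch below prints a roughly equivalent `docker buildx build` invocation. It is an approximation for illustration: what `plugin-docker-buildx` runs internally may differ, the `latest` tag stands in for whatever `auto_tag` would produce, and `buildx_command` is a hypothetical helper name. The function only prints the command, so it is safe to run anywhere.

```bash
# Hypothetical helper: print a buildx command that mirrors the pipeline's
# dual-push and registry-cache settings for one image.
#   $1 = image name (e.g. infra), $2 = build context directory (e.g. cli)
buildx_command() {
  image=$1
  context=$2
  registry="registry.viktorbarzin.lan:5050"
  cache_ref="$registry/$image:buildcache"
  # Note: pushing to the plain-HTTP registry also requires a builder whose
  # BuildKit config marks it as http/insecure, as buildkit_config does above.
  echo "docker buildx build" \
    "--file $context/Dockerfile" \
    "--tag viktorbarzin/$image:latest" \
    "--tag $registry/$image:latest" \
    "--cache-from type=registry,ref=$cache_ref" \
    "--cache-to type=registry,ref=$cache_ref,mode=max" \
    "--push $context"
}

buildx_command infra cli
```

Keeping the cache under its own `buildcache` tag, separate from the image tags, means a cache export never overwrites a tag that something pulls as an image.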

**Step 2: Commit**

```bash
git add .woodpecker/build-cli.yml
git commit -m "[ci skip] add BuildKit layer caching and dual-push to build-cli pipeline"
```

---

### Task 8: Update f1-stream.yml pipeline with BuildKit caching

**Files:**
- Modify: `.woodpecker/f1-stream.yml`

**Step 1: Update the pipeline**

Replace the entire file content with:

```yaml
when:
  event: push
  path: "stacks/f1-stream/files/**"

clone:
  git:
    image: woodpeckerci/plugin-git
    settings:
      attempts: 5
      backoff: 10s

steps:
  - name: build-image
    image: woodpeckerci/plugin-docker-buildx
    settings:
      username: "viktorbarzin"
      password:
        from_secret: dockerhub-pat
      repo:
        - viktorbarzin/f1-stream
        - registry.viktorbarzin.lan:5050/f1-stream
      logins:
        - registry: https://index.docker.io/v1/
          username: viktorbarzin
          password:
            from_secret: dockerhub-pat
      dockerfile: stacks/f1-stream/files/Dockerfile
      context: stacks/f1-stream/files
      platforms: linux/amd64
      provenance: false
      tags: latest
      cache_from: type=registry,ref=registry.viktorbarzin.lan:5050/f1-stream:buildcache
      cache_to: type=registry,ref=registry.viktorbarzin.lan:5050/f1-stream:buildcache,mode=max
      buildkit_config: |
        [registry."registry.viktorbarzin.lan:5050"]
          http = true
          insecure = true

  - name: deploy
    image: bitnami/kubectl
    commands:
      - kubectl -n f1-stream rollout restart deployment f1-stream
      - kubectl -n f1-stream rollout status deployment f1-stream --timeout=120s
```

Same pattern as build-cli: dual-push + BuildKit cache. The `deploy` step is unchanged.

**Step 2: Commit**

```bash
git add .woodpecker/f1-stream.yml
git commit -m "[ci skip] add BuildKit layer caching and dual-push to f1-stream pipeline"
```

---

### Task 9: Test end-to-end with a manual build trigger

**Step 1: Push changes to trigger the build-cli pipeline**

```bash
git push origin master
```

The `build-cli.yml` pipeline triggers on every push. Monitor it at `https://ci.viktorbarzin.me`.

**Step 2: Verify cache was populated**

After the first build completes, check that the local registry has the cache:

```bash
curl -s http://registry.viktorbarzin.lan:5050/v2/_catalog
# Expected: {"repositories":["infra"]}

curl -s http://registry.viktorbarzin.lan:5050/v2/infra/tags/list
# Expected: tags include "buildcache" and the auto-tagged version
```

**Step 3: Trigger a second build to verify cache hit**

Make a trivial change (e.g., update a comment in `cli/`) and push again. The build logs should show "importing cache manifest from registry.viktorbarzin.lan:5050/infra:buildcache" and skip unchanged layers.
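
If the build log is saved to a file or piped from the CI, the check for that message can be scripted rather than eyeballed. The sketch below assumes the log text is available on stdin; `cache_was_imported` is a hypothetical helper, and the two-line log in the demo is made up for illustration rather than copied from a real build.

```bash
# Hypothetical helper: reads a build log on stdin and succeeds if BuildKit
# reported importing the cache manifest for the given cache ref.
cache_was_imported() {
  grep -qF "importing cache manifest from $1"
}

CACHE_REF="registry.viktorbarzin.lan:5050/infra:buildcache"

# Local demo with a made-up log excerpt.
printf '#5 importing cache manifest from %s\n#5 DONE 0.4s\n' "$CACHE_REF" \
  | cache_was_imported "$CACHE_REF" && echo "cache import found in log"
```

The message alone only shows that the cache was consulted; whether layers were actually reused still shows up as `CACHED` steps and a shorter build time.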

**Step 4: Verify Docker Hub also has the image**

```bash
curl -s https://hub.docker.com/v2/repositories/viktorbarzin/infra/tags/ | python3 -m json.tool | head -20
```

---

### Task 10: Verify cleanup script covers private registry

**Files:**
- Review: `modules/docker-registry/cleanup-tags.sh`

**Step 1: Verify the script already handles multiple registries**

The existing script walks ALL subdirectories under `BASE` (`/opt/registry/data`). Since the private registry stores data at `/opt/registry/data/private/docker/registry/v2/repositories/`, it will automatically be picked up by the existing script without changes.

Verify by reading the script logic: `os.listdir(BASE)` iterates `dockerhub`, `ghcr`, `quay`, `k8s`, `kyverno`, and now `private`.

**Step 2: Consider whether to adjust the keep count**

The default `KEEP=10` does not need adjusting for the private registry. The script only deletes tags when a repo has MORE than `KEEP` of them, and each repo here typically has only 2-3 tags (e.g., `latest`, `buildcache`, maybe a version tag), so no cleanup will happen. This is fine.

No code changes needed — the script already works with the new registry.
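
The threshold behaviour described in Step 2 can be illustrated with a few lines of shell. This is not the logic of `cleanup-tags.sh` itself, which is not shown in this plan: `tags_to_delete` is a hypothetical helper, and the assumption that tags arrive oldest first is made purely for the example.

```bash
# Sketch of the "only delete when there are MORE than KEEP tags" rule.
# Assumption: tags arrive on stdin one per line, oldest first.
tags_to_delete() {
  keep=$1
  tags=$(cat)
  [ -n "$tags" ] || return 0                      # no tags: nothing to delete
  total=$(printf '%s\n' "$tags" | wc -l | tr -d ' ')
  [ "$total" -gt "$keep" ] || return 0            # at or under KEEP: keep all
  printf '%s\n' "$tags" | head -n "$((total - keep))"
}

# A typical private-registry repo: 3 tags with KEEP=10, so nothing is printed.
printf 'v1\nlatest\nbuildcache\n' | tags_to_delete 10

# With KEEP=2 the single oldest tag (v1) is selected.
printf 'v1\nlatest\nbuildcache\n' | tags_to_delete 2
```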