infra

.gitkeep	chore: add untracked stacks, scripts, and agent configs	2026-04-15 09:33:06 +00:00
diun-upgrade.json	[n8n] Fix broken DIUN auto-upgrade pipeline — missing auth token to claude-agent-service	2026-04-18 10:41:09 +00:00

Viktor Barzin 99180bec42 [n8n] Fix broken DIUN auto-upgrade pipeline — missing auth token to claude-agent-service

## Context

DIUN has been detecting image updates and firing Slack + webhook
notifications for weeks, but zero automated upgrades ran because the
handoff from n8n to claude-agent-service was silently 401-ing.

The pipeline (DIUN → n8n webhook → claude-agent-service /execute →
service-upgrade agent) was migrated from DevVM SSH to K8s HTTP in
`42f1c3cf`. That migration wired up `claude-agent-service` (with the
`API_BEARER_TOKEN` env var set) and updated the n8n workflow JSON to POST
with `Authorization: Bearer $env.CLAUDE_AGENT_API_TOKEN`, but missed two
things on the n8n side:

1. The deployment didn't expose `CLAUDE_AGENT_API_TOKEN` to the n8n
   container — workflow sent `Authorization: Bearer ` (empty).
2. The workflow header expression used JS concat (`='Bearer ' + $env.X`)
   which n8n 1.x does NOT evaluate in HTTP Request node header params.
   It needs template-literal form: `=Bearer {{ $env.X }}`.
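A check along the following lines could catch the second problem before a workflow is imported. It is a minimal sketch, not part of this change: it assumes the backup is a standard n8n export with a top-level `nodes` list whose entries carry `name` and `parameters`, and the helper names (`iter_strings`, `find_bare_env_refs`) are hypothetical.

```python
import re

# Segments in the {{ ... }} form, which is the form this commit switches to.
TEMPLATE = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def iter_strings(value, path=""):
    """Yield (path, string) for every string nested inside a JSON value."""
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from iter_strings(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from iter_strings(child, f"{path}[{index}]")


def find_bare_env_refs(workflow):
    """Return paths of expression values that mention $env outside {{ }}."""
    hits = []
    for node in workflow.get("nodes", []):
        params = node.get("parameters", {})
        for path, text in iter_strings(params, node.get("name", "?")):
            if not text.startswith("="):
                continue  # plain literal, not an expression value
            outside = TEMPLATE.sub("", text)  # drop the templated parts
            if "$env." in outside:
                hits.append(path)
    return hits
```

Loading `workflows/diun-upgrade.json` with `json.load` and passing the result to `find_bare_env_refs` should return an empty list once the header uses the template form.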

Evidence: `claude-agent-service` logs showed only `/health` probes —
zero `/execute` calls over 12h despite DIUN firing webhooks. n8n PG
execution 2250 returned `401 Missing bearer token`.
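The log evidence can be summarised mechanically. The sketch below tallies requests per method, path, and status; it assumes access-log lines shaped like the `"POST /execute HTTP/1.1" 202 Accepted` excerpt quoted in the verification section, and `tally_requests` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed access-log shape, taken from the verification excerpt:
#   "POST /execute HTTP/1.1" 202 Accepted
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def tally_requests(lines):
    """Count (method, path, status) triples found in access-log lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match:  # lines without a request entry are ignored
            key = (match["method"], match["path"], int(match["status"]))
            counts[key] += 1
    return counts
```

Fed the service's log output (for example via `tally_requests(sys.stdin)`), a tally containing only `/health` entries and no `/execute` entries would match the symptom described above.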

## This change

- Adds ExternalSecret `claude-agent-token` in the `n8n` namespace that
  pulls `api_bearer_token` from Vault `secret/claude-agent-service`
  (same source as the receiving service's token).
- Wires the token into the n8n container as env var
  `CLAUDE_AGENT_API_TOKEN` via `secret_key_ref`.
- Sets `N8N_BLOCK_ENV_ACCESS_IN_NODE=false` so expressions can read
  `$env.*` at all (the 1.x default is already false, but setting it
  explicitly guards against an upstream default change).
- Fixes the workflow JSON backup (`workflows/diun-upgrade.json`) header
  expression to use `{{ $env.X }}` template syntax.

The live workflow in n8n's PG DB was also patched in place (one-time
`UPDATE workflow_entity SET nodes = REPLACE(...)` — workflows are not
TF-managed; they were imported once).
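For illustration only, the string transformation behind the header fix can be expressed as below. This is not the SQL that was run (its `REPLACE` arguments are not shown here); it is a sketch that assumes the old value had exactly the concat shape quoted in the Context section, and `to_template_form` is a hypothetical name.

```python
import re

# Matches the concat form described above: ='Bearer ' + $env.NAME
CONCAT = re.compile(
    r"^='Bearer '\s*\+\s*\$env\.(?P<name>[A-Za-z_][A-Za-z0-9_]*)$"
)


def to_template_form(value):
    """Rewrite ='Bearer ' + $env.NAME as =Bearer {{ $env.NAME }}.

    Any value that does not match the concat form is returned unchanged.
    """
    match = CONCAT.match(value)
    if not match:
        return value
    return "=Bearer {{ $env." + match["name"] + " }}"
```

Because non-matching values pass through untouched, applying the function twice gives the same result as applying it once.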

## What is NOT in this change

- No retroactive re-run of skipped DIUN events. They'll be rediscovered
  in future scans.
- No change to the `claude-agent-service` side — its token and endpoint
  were already correct.
- No Slack alert on n8n HTTP-node failures — future work; right now a
  broken workflow fails silently unless you check Execution History.

## End-to-end verification

```
$ curl -X POST n8n.viktorbarzin.me/webhook/30805ab6-... \
    -d '{"diun_entry_status":"update","diun_entry_image":"docker.io/library/httpd","diun_entry_imagetag":"2.4.66",...}'
{"message":"Workflow was started"}  HTTP 200

# n8n PG: execution_entity latest row  → status=success
# claude-agent-service logs           → "POST /execute HTTP/1.1" 202 Accepted
```

## Reproduce locally

```
# 1. Authenticate to Vault
vault login -method=oidc
# 2. Apply the n8n stack
cd stacks/n8n && ../../scripts/tg apply
# 3. Confirm the token reached the container (should print 64-char hex)
kubectl -n n8n exec deploy/n8n -- printenv CLAUDE_AGENT_API_TOKEN
# 4. Fire synthetic webhook with non-critical image (httpd / alpine)
# 5. Check n8n execution is success, claude-agent-service shows 202
```
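Steps 3 and 4 can also be scripted. The following is a rough sketch under stated assumptions: the payload carries only the three DIUN fields visible in the verification excerpt (the real payload has more, elided there), the JSON `Content-Type` header is an assumption not taken from the original `curl` call, and both function names are hypothetical. The webhook URL is left as a parameter because its full path is not given here.

```python
import json
import re
import urllib.request

HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def looks_like_token(value):
    """True if value is exactly 64 hex characters (step 3's expectation)."""
    return HEX_64.fullmatch(value.strip()) is not None


def build_synthetic_webhook(url, image, tag):
    """Build, but do not send, a POST carrying three DIUN fields."""
    payload = {
        "diun_entry_status": "update",
        "diun_entry_image": image,
        "diun_entry_imagetag": tag,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the webhook accepts a JSON body with this header.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` sends it; keeping construction and sending separate lets the payload be inspected first.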

Closes: code-ekz
Related: code-bck

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-18 10:41:09 +00:00