infra/stacks/n8n/workflows/diun-upgrade.json
Viktor Barzin 99180bec42 [n8n] Fix broken DIUN auto-upgrade pipeline — missing auth token to claude-agent-service
## Context

DIUN has been detecting image updates and firing Slack + webhook
notifications for weeks, but zero automated upgrades ran because the
handoff from n8n to claude-agent-service was silently 401-ing.

The pipeline (DIUN → n8n webhook → claude-agent-service /execute →
service-upgrade agent) was migrated from DevVM SSH to K8s HTTP in
42f1c3cf. That migration wired up `claude-agent-service` (API_BEARER_TOKEN
env set) and updated the n8n workflow JSON to POST with `Authorization:
Bearer $env.CLAUDE_AGENT_API_TOKEN`, but missed two things on the n8n
side:

1. The deployment didn't expose `CLAUDE_AGENT_API_TOKEN` to the n8n
   container, so the workflow sent `Authorization: Bearer ` (empty).
2. The workflow header expression used JS concat (`='Bearer ' + $env.X`),
   which n8n 1.x does NOT evaluate in HTTP Request node header params.
   It needs the `{{ }}` template form: `=Bearer {{ $env.X }}`.
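
To make point 2 concrete, here is a toy model of the behaviour described
above. It is not n8n's implementation: `resolveParameter` and the token
value are hypothetical, and the only assumption is that text outside
`{{ }}` is passed through as a literal string.

```javascript
// Toy model of how an expression-style parameter value is resolved.
// NOT n8n's code; it only mirrors the behaviour described in this commit.
function resolveParameter(value, env) {
  // A value without a leading "=" is treated as a plain string.
  if (!value.startsWith('=')) return value;
  const template = value.slice(1);
  // Only text inside {{ ... }} is evaluated; everything else stays literal.
  return template.replace(
    /\{\{\s*\$env\.([A-Za-z0-9_]+)\s*\}\}/g,
    (_, name) => env[name] ?? ''
  );
}

// Hypothetical token value, for illustration only.
const env = { CLAUDE_AGENT_API_TOKEN: 'example-token' };

const broken = resolveParameter("='Bearer ' + $env.CLAUDE_AGENT_API_TOKEN", env);
const fixed = resolveParameter('=Bearer {{ $env.CLAUDE_AGENT_API_TOKEN }}', env);

console.log(broken); // 'Bearer ' + $env.CLAUDE_AGENT_API_TOKEN
console.log(fixed);  // Bearer example-token
```

In this model the concat form never touches `env`, which matches the
symptom: the header carried no usable token even once the variable existed.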

Evidence: `claude-agent-service` logs showed only `/health` probes —
zero `/execute` calls over 12h despite DIUN firing webhooks. n8n PG
execution 2250 returned `401 Missing bearer token`.

## This change

- Adds ExternalSecret `claude-agent-token` in the `n8n` namespace that
  pulls `api_bearer_token` from Vault `secret/claude-agent-service`
  (same source as the receiving service's token).
- Wires the token into the n8n container as env var
  `CLAUDE_AGENT_API_TOKEN` via `secret_key_ref`.
- Sets `N8N_BLOCK_ENV_ACCESS_IN_NODE=false` so expressions can read
  `$env.*` at all. The default in 1.x is already false, but setting it
  explicitly guards against an upstream default flip.
- Fixes the workflow JSON backup (`workflows/diun-upgrade.json`) header
  expression to use `{{ $env.X }}` template syntax.

The live workflow in n8n's PG DB was also patched in place (one-time
`UPDATE workflow_entity SET nodes = REPLACE(...)` — workflows are not
TF-managed; they were imported once).
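
Because the live workflow and the JSON backup are patched separately, a
small check over the backup could catch this class of mistake before
import. The sketch below is an assumption, not part of this change:
`findUnevaluatedHeaderExpressions` is a hypothetical helper that flags
header values starting with `=` but containing no `{{ }}` segment.

```javascript
// Hypothetical lint: report header parameters that look like expressions
// (leading "=") but contain no {{ ... }} segment to evaluate.
function findUnevaluatedHeaderExpressions(workflow) {
  const findings = [];
  for (const node of workflow.nodes ?? []) {
    const headers = node.parameters?.headerParameters?.parameters ?? [];
    for (const header of headers) {
      const value = String(header.value ?? '');
      if (value.startsWith('=') && !/\{\{[\s\S]*\}\}/.test(value)) {
        findings.push({ node: node.name, header: header.name, value });
      }
    }
  }
  return findings;
}

// Reduced copy of the HTTP Request node with the pre-fix header value.
const workflow = {
  nodes: [
    {
      name: 'Run Upgrade Agent',
      parameters: {
        headerParameters: {
          parameters: [
            { name: 'Authorization', value: "='Bearer ' + $env.CLAUDE_AGENT_API_TOKEN" },
            { name: 'Content-Type', value: 'application/json' },
          ],
        },
      },
    },
  ],
};

console.log(findUnevaluatedHeaderExpressions(workflow));
```

Run against the parsed contents of `workflows/diun-upgrade.json`, the
same function should return an empty array once the header uses the
`{{ $env.X }}` form.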

## What is NOT in this change

- No retroactive re-run of skipped DIUN events. They'll be rediscovered
  in future scans.
- No change to the `claude-agent-service` side — its token and endpoint
  were already correct.
- No Slack alert on n8n HTTP-node failures — future work; right now a
  broken workflow fails silently unless you check Execution History.

## End-to-end verification

```
$ curl -X POST n8n.viktorbarzin.me/webhook/30805ab6-... \
    -d '{"diun_entry_status":"update","diun_entry_image":"docker.io/library/httpd","diun_entry_imagetag":"2.4.66",...}'
{"message":"Workflow was started"}  HTTP 200

# n8n PG: execution_entity latest row  → status=success
# claude-agent-service logs           → "POST /execute HTTP/1.1" 202 Accepted
```

## Reproduce locally

```
1. vault login -method=oidc
2. cd stacks/n8n && ../../scripts/tg apply
3. kubectl -n n8n exec deploy/n8n -- printenv CLAUDE_AGENT_API_TOKEN
   (should print 64-char hex)
4. Fire synthetic webhook with non-critical image (httpd / alpine)
5. Check n8n execution is success, claude-agent-service shows 202
```
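
Step 4 can also be scripted. This is a minimal sketch under stated
assumptions: `buildSyntheticDiunRequest` and `fireSyntheticWebhook` are
hypothetical helpers, the payload carries only the three fields shown in
the verification above (the fields elided there are left out), and the
webhook URL is supplied by the caller.

```javascript
// Build the request options for a synthetic DIUN "update" event.
// Only the fields visible in the verification snippet are included.
function buildSyntheticDiunRequest(image, tag) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      diun_entry_status: 'update',
      diun_entry_image: image,
      diun_entry_imagetag: tag,
    }),
  };
}

// Send the event. fetchImpl defaults to the global fetch (Node 18+) and
// can be swapped for a stub when testing.
async function fireSyntheticWebhook(webhookUrl, image, tag, fetchImpl = fetch) {
  const response = await fetchImpl(webhookUrl, buildSyntheticDiunRequest(image, tag));
  return { status: response.status, body: await response.text() };
}

const request = buildSyntheticDiunRequest('docker.io/library/httpd', '2.4.66');
console.log(request.body);
```

Sending the body as JSON keeps the fields under `$json.body.*`, which is
where the workflow's filter nodes read them.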

Closes: code-ekz
Related: code-bck

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:41:09 +00:00


{
  "name": "DIUN Upgrade Agent",
  "active": true,
  "nodes": [
    {
      "parameters": {"httpMethod": "POST", "path": "30805ab6-7281-4d42-8aa1-fbfe5a9694fa", "options": {}},
      "id": "webhook-trigger",
      "name": "DIUN Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2,
      "position": [250, 300],
      "webhookId": "30805ab6-7281-4d42-8aa1-fbfe5a9694fa"
    },
    {
      "parameters": {
        "conditions": {
          "options": {"caseSensitive": true, "leftValue": "", "typeValidation": "strict"},
          "conditions": [{"id": "cond-status", "leftValue": "={{ $json.body.diun_entry_status }}", "rightValue": "update", "operator": {"type": "string", "operation": "equals"}}],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "filter-status",
      "name": "Filter Updates Only",
      "type": "n8n-nodes-base.filter",
      "typeVersion": 2,
      "position": [470, 300]
    },
    {
      "parameters": {
        "jsCode": "const MAX_UPGRADES_PER_WINDOW = 5;\nconst WINDOW_HOURS = 6;\n\nconst staticData = $getWorkflowStaticData('global');\nconst now = Date.now();\nconst windowMs = WINDOW_HOURS * 60 * 60 * 1000;\n\nif (!staticData.windowStart || (now - staticData.windowStart) > windowMs) {\n staticData.windowStart = now;\n staticData.count = 0;\n}\n\nif (staticData.count >= MAX_UPGRADES_PER_WINDOW) {\n console.log('Rate limit reached: ' + staticData.count + '/' + MAX_UPGRADES_PER_WINDOW);\n return [];\n}\n\nconst image = $input.first().json.body.diun_entry_image || '';\nconst tag = $input.first().json.body.diun_entry_imagetag || '';\n\nconst dbPatterns = ['postgres', 'mysql', 'redis', 'clickhouse', 'etcd'];\nif (dbPatterns.some(p => image.toLowerCase().includes(p))) return [];\n\nconst skipPrefixes = ['viktorbarzin/', 'registry.viktorbarzin.me/', 'ancamilea/', 'mghee/'];\nif (skipPrefixes.some(p => image.startsWith(p))) return [];\n\nconst infraPrefixes = ['registry.k8s.io/', 'quay.io/tigera/', 'quay.io/metallb/', 'nvcr.io/', 'reg.kyverno.io/'];\nif (infraPrefixes.some(p => image.startsWith(p))) return [];\n\nif (tag === 'latest' || tag === '') return [];\n\nstaticData.count += 1;\nconsole.log('Upgrade ' + staticData.count + '/' + MAX_UPGRADES_PER_WINDOW + ': ' + image + ':' + tag);\n\nreturn [$input.first()];"
      },
      "id": "filter-images",
      "name": "Filter and Rate Limit",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [690, 300]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "http://claude-agent-service.claude-agent.svc.cluster.local:8080/execute",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {"name": "Authorization", "value": "=Bearer {{ $env.CLAUDE_AGENT_API_TOKEN }}"},
            {"name": "Content-Type", "value": "application/json"}
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify({ prompt: 'You are the service-upgrade agent. Read .claude/agents/service-upgrade.md for full instructions.\\n\\nUpgrade task:\\n- Image: ' + $json.body.diun_entry_image + '\\n- New tag: ' + $json.body.diun_entry_imagetag + '\\n- Hub link: ' + ($json.body.diun_entry_hublink || 'none') + '\\n\\nExecute the upgrade workflow now.', agent: '.claude/agents/service-upgrade', max_budget_usd: 10, timeout_seconds: 1800 }) }}",
        "options": {}
      },
      "id": "http-execute",
      "name": "Run Upgrade Agent",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [910, 300]
    }
  ],
  "connections": {
    "DIUN Webhook": {"main": [[{"node": "Filter Updates Only", "type": "main", "index": 0}]]},
    "Filter Updates Only": {"main": [[{"node": "Filter and Rate Limit", "type": "main", "index": 0}]]},
    "Filter and Rate Limit": {"main": [[{"node": "Run Upgrade Agent", "type": "main", "index": 0}]]}
  },
  "settings": {"executionOrder": "v1"}
}
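
The `Filter and Rate Limit` node stores its logic as an escaped one-line
string, which is hard to read and test. As a reading aid, the same
checks can be sketched as a plain function. This is an assumption-laden
rewrite, not the deployed code: `shouldUpgrade` is a hypothetical name,
the state object stands in for `$getWorkflowStaticData('global')`, the
clock is passed in, and the two prefix lists are merged because both
lead to the same result.

```javascript
const MAX_UPGRADES_PER_WINDOW = 5;
const WINDOW_MS = 6 * 60 * 60 * 1000; // 6-hour window

const DB_PATTERNS = ['postgres', 'mysql', 'redis', 'clickhouse', 'etcd'];
const SKIP_PREFIXES = [
  'viktorbarzin/', 'registry.viktorbarzin.me/', 'ancamilea/', 'mghee/',
  'registry.k8s.io/', 'quay.io/tigera/', 'quay.io/metallb/', 'nvcr.io/', 'reg.kyverno.io/',
];

// Returns true when the image should be handed to the upgrade agent.
// `state` is mutated in place, like the workflow static data in the node.
function shouldUpgrade(state, image, tag, now) {
  // Start a fresh window when none exists or the current one has expired.
  if (!state.windowStart || now - state.windowStart > WINDOW_MS) {
    state.windowStart = now;
    state.count = 0;
  }
  if (state.count >= MAX_UPGRADES_PER_WINDOW) return false;                  // rate limit
  if (DB_PATTERNS.some((p) => image.toLowerCase().includes(p))) return false; // databases
  if (SKIP_PREFIXES.some((p) => image.startsWith(p))) return false;           // own and infra images
  if (tag === 'latest' || tag === '') return false;                           // floating or empty tag
  state.count += 1; // only accepted upgrades consume the budget
  return true;
}

const state = {};
const t0 = Date.UTC(2026, 3, 18);
console.log(shouldUpgrade(state, 'docker.io/library/httpd', '2.4.66', t0));   // true
console.log(shouldUpgrade(state, 'docker.io/library/postgres', '17.0', t0));  // false
console.log(shouldUpgrade(state, 'docker.io/library/alpine', 'latest', t0));  // false
```

The prefix checks use `startsWith`, so in this sketch a registry-qualified
name such as `docker.io/viktorbarzin/app` would not match the
`viktorbarzin/` prefix. Whether DIUN sends such names for those images is
not stated in the source.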