Commit graph

15 commits

Author SHA1 Message Date
Viktor Barzin
731de63150 fix(beads-server): disable Authentik + CrowdSec on Workbench
Authentik forward-auth returns 400 for dolt-workbench (no Authentik
application configured for this domain). CrowdSec bouncer also
intermittently returns 400. Both disabled — Workbench is accessible
via Cloudflare tunnel only.

TODO: Create Authentik application for dolt-workbench.viktorbarzin.me

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 20:00:21 +00:00
Viktor Barzin
a19581e32b fix(beads-server): fix Workbench timeout — use internal GraphQL URL
GRAPHQLAPI_URL must point to localhost:9002 (internal), not the external
URL which goes through Authentik. SSR can't authenticate to Authentik.
Also removed Authentik from /graphql ingress — browser fetch() can't
follow 302 redirects on POST requests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 19:05:47 +00:00
Viktor Barzin
da6b82ed5c fix(beads-server): persist GRAPHQLAPI_URL in Terraform
The env var was only set via kubectl and got overwritten on next apply.
Now permanently in the deployment spec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 18:58:59 +00:00
Viktor Barzin
eef4242408 fix(beads-server): auto-connect Workbench to Dolt on startup
The Workbench's database connection is in-memory and lost on pod restart.
Added startup script that waits for GraphQL server readiness, then calls
addDatabaseConnection mutation automatically. No more manual reconnection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 18:12:31 +00:00
Viktor Barzin
b24545ffdb fix(beads-server): fix BeadBoard project ID + install bd binary
- Fixed project_id mismatch (was "beadboard", should be actual DB project ID)
- Rebuilt Docker image with bd v1.0.2 binary (node:20-slim for glibc compat)
- Ran bd migrate to update schema from 1.0.0 → 1.0.2 (adds started_at, etc.)
- Task creation and bd CLI now work inside the container

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 11:57:45 +00:00
Viktor Barzin
f2037545b3 fix(beads-server): make BeadBoard .beads dir writable
BeadBoard needs to create templates/ and archetypes/ subdirectories
inside .beads/. ConfigMap mounts are read-only, causing ENOENT errors
and 503 responses. Fix: init container copies ConfigMap to emptyDir.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 11:37:26 +00:00
Viktor Barzin
00e2f15a5d feat(beads-server): deploy BeadBoard task visualization dashboard
Add BeadBoard (zenchantlive/beadboard) alongside Dolt server and Workbench
for task dependency graph, kanban, and agent coordination views.

- Built custom Docker image (registry.viktorbarzin.me:5050/beadboard)
- ConfigMap provides .beads/metadata.json pointing to Dolt server
- Behind Authentik auth at beadboard.viktorbarzin.me
- Also fixed: GraphQL ingress now has Authentik middleware
- Also fixed: Workbench store.json type enum (mysql → Mysql)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 11:30:43 +00:00
Viktor Barzin
f8facf44dd [infra] Fix rewrite-body plugin + cleanup TrueNAS + version bumps
## Context

The rewrite-body Traefik plugin (packruler/rewrite-body v1.2.0) silently
broke on Traefik v3.6.12 — every service using rybbit analytics or anti-AI
injection returned HTTP 200 with "Error 404: Not Found" body. Root cause:
middleware specs referenced plugin name `rewrite-body` but Traefik registered
it as `traefik-plugin-rewritebody`.

Migrated to maintained fork `the-ccsn/traefik-plugin-rewritebody` v0.1.3
which uses the correct plugin name. Also added `lastModified = true` and
`methods = ["GET"]` to anti-AI middleware to avoid rewriting non-HTML
responses.

## This change

- Replace packruler/rewrite-body v1.2.0 with the-ccsn/traefik-plugin-rewritebody v0.1.3
- Fix plugin name in all 3 middleware locations (ingress_factory, reverse-proxy factory, traefik anti-AI)
- Remove deprecated TrueNAS cloud sync monitor (VM decommissioned 2026-04-13)
- Remove CloudSyncStale/CloudSyncFailing/CloudSyncNeverRun alerts
- Fix PrometheusBackupNeverRun alert (for: 48h → 32d to match monthly sidecar schedule)
- Bump versions: rybbit v1.0.21→v1.1.0, wealthfolio v1.1.0→v3.2,
  networking-toolbox 1.1.1→1.6.0, cyberchef v10.24.0→v9.55.0
- MySQL standalone storage_limit 30Gi → 50Gi
- beads-server: fix Dolt workbench type casing, remove Authentik on GraphQL endpoint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 05:51:52 +00:00
Viktor Barzin
e80b2f026f [infra] Migrate Terraform state from local SOPS to PostgreSQL backend
Two-tier state architecture:
- Tier 0 (infra, platform, cnpg, vault, dbaas, external-secrets): local
  state with SOPS encryption in git — unchanged, required for bootstrap.
- Tier 1 (105 app stacks): PostgreSQL backend on CNPG cluster at
  10.0.20.200:5432/terraform_state with native pg_advisory_lock.

Motivation: multi-operator friction (every workstation needed SOPS + age +
git-crypt), bootstrap complexity for new operators, and headless agents/CI
needing the full encryption toolchain just to read state.

Changes:
- terragrunt.hcl: conditional backend (local vs pg) based on tier0 list
- scripts/tg: tier detection, auto-fetch PG creds from Vault for Tier 1,
  skip SOPS and Vault KV locking for Tier 1 stacks
- scripts/state-sync: tier-aware encrypt/decrypt (skips Tier 1)
- scripts/migrate-state-to-pg: one-shot migration script (idempotent)
- stacks/vault/main.tf: pg-terraform-state static role + K8s auth role
  for claude-agent namespace
- stacks/dbaas: terraform_state DB creation + MetalLB LoadBalancer
  service on shared IP 10.0.20.200
- Deleted 107 .tfstate.enc files for migrated Tier 1 stacks
- Cleaned up per-stack tiers.tf (now generated by root terragrunt.hcl)

[ci skip]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:33:12 +00:00
root
af090c818b Woodpecker CI deploy [CI SKIP] 2026-04-16 13:46:08 +00:00
Viktor Barzin
b1d152be1f [infra] Auto-create Cloudflare DNS records from ingress_factory
## Context

Deploying new services required manually adding hostnames to
cloudflare_proxied_names/cloudflare_non_proxied_names in config.tfvars —
a separate file from the service stack. This was frequently forgotten,
leaving services unreachable externally.

## This change:

- Add `dns_type` parameter to `ingress_factory` and `reverse_proxy/factory`
  modules. Setting `dns_type = "proxied"` or `"non-proxied"` auto-creates
  the Cloudflare DNS record (CNAME to tunnel or A/AAAA to public IP).
- Simplify cloudflared tunnel from 100 per-hostname rules to wildcard
  `*.viktorbarzin.me → Traefik`. Traefik still handles host-based routing.
- Add global Cloudflare provider via terragrunt.hcl (separate
  cloudflare_provider.tf with Vault-sourced API key).
- Migrate 118 hostnames from centralized config.tfvars to per-service
  dns_type. 17 hostnames remain centrally managed (Helm ingresses,
  special cases).
- Update docs, AGENTS.md, CLAUDE.md, dns.md runbook.

```
BEFORE                          AFTER
config.tfvars (manual list)     stacks/<svc>/main.tf
        |                         module "ingress" {
        v                           dns_type = "proxied"
stacks/cloudflared/               }
  for_each = list                     |
  cloudflare_record               auto-creates
  tunnel per-hostname             cloudflare_record + annotation
```

## What is NOT in this change:

- Uptime Kuma monitor migration (still reads from config.tfvars)
- 17 remaining centrally-managed hostnames (Helm, special cases)
- Removal of allow_overwrite (keep until migration confirmed stable)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:45:04 +00:00
Viktor Barzin
b98890d799 fix(beads-server): fix Workbench GraphQL URL for remote hosting
Dolt Workbench hardcodes http://localhost:9002/graphql in the built JS.
For k8s hosting, init container patches this to relative /graphql path.
Second ingress routes /graphql to port 9002 behind Authentik auth.

- Init container copies static JS to writable emptyDir, patches URL
- Pre-seeds store.json with Dolt connection config
- Added /graphql ingress with Authentik forward-auth

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:53:57 +00:00
Viktor Barzin
d9ed166640 fix(beads-server): add Authentik auth to Dolt Workbench
- Set protected=true on ingress (Authentik forward-auth)
- Remove unused DATABASE_URL env var (Workbench uses browser-based connection config)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:43:22 +00:00
Viktor Barzin
27d7c91608 feat(beads-server): add Dolt Workbench web UI
Deploy dolthub/dolt-workbench alongside the Dolt server in beads-server
namespace. Provides SQL console, spreadsheet editor, and commit graph
visualization for the centralized beads task database.

- Workbench at dolt-workbench.viktorbarzin.me (Cloudflare-proxied)
- Connects to Dolt server via in-cluster service DNS
- Added to cloudflare_proxied_names for external access

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:32:45 +00:00
Viktor Barzin
bcad200a23 chore: add untracked stacks, scripts, and agent configs
- New stacks: beads-server, hermes-agent
- Terragrunt tiers.tf for infra, phpipam, status-page
- Secrets symlinks for vault, phpipam, hermes-agent
- Scripts: cluster_manager, image_pull, containerd pullthrough setup
- Frigate config, audiblez-web app source, n8n workflows dir
- Claude agent: service-upgrade, reference: upgrade-config.json
- Removed: claudeception skill, excalidraw empty submodule, temp listings

[ci skip]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 09:33:06 +00:00