infra

Author	SHA1	Message	Date
Viktor Barzin	b6d619e5df	fix: increase terragrunt-apply step memory to 2Gi. LimitRange defaults containers to 192Mi, which is insufficient for terragrunt apply on the platform stack (48 vault refs, many modules). Set explicit 1Gi request / 2Gi limit via backend_options.	2026-03-15 22:59:34 +00:00
Viktor Barzin	0c1239030d	fix: CI pipeline - disable corrupted cache, add pull before push. build-cli.yml: comment out cache_from/cache_to to avoid BuildKit "short read" errors from corrupted registry cache. default.yml: add git pull --rebase before push in cleanup-and-push to handle remote having newer commits.	2026-03-15 22:51:08 +00:00
Viktor Barzin	50620e6047	add generic multi-user cluster onboarding system. Data-driven user onboarding: add a JSON entry to Vault KV k8s_users, apply vault + platform + woodpecker stacks, and everything is auto-generated. Vault stack: namespace creation, per-user Vault policies with secret isolation via identity entities/aliases, K8s deployer roles, CI policy update. Platform stack: domains field in k8s_users type, TLS secrets per user namespace, user domains merged into Cloudflare DNS, user-roles ConfigMap mounted in portal. Woodpecker stack: admin list auto-generated from k8s_users, WOODPECKER_OPEN=true. K8s-portal: dual-track onboarding (general/namespace-owner), namespace-owner dashboard with Vault/kubectl commands, setup script adds Vault+Terraform+Terragrunt, contributing page with CI pipeline template, versioned image tags in CI pipeline. New: stacks/_template/ with copyable stack template for namespace-owners.	2026-03-15 22:23:36 +00:00
Viktor Barzin	3aba29e7a3	remove SOPS pipeline, deploy ESO + Vault DB/K8s engines. Vault is now the sole source of truth for secrets. SOPS pipeline removed entirely; auth via `vault login -method=oidc`. Part A: SOPS removal. vault/main.tf: delete 990 lines (93 vars + 43 KV write resources), add self-read data source for OIDC creds from secret/vault. terragrunt.hcl: remove SOPS var loading, vault_root_token, check_secrets hook. scripts/tg: remove SOPS decryption, keep -auto-approve logic. .woodpecker/default.yml: replace SOPS with Vault K8s auth via curl. Delete secrets.sops.json, .sops.yaml. Part B: External Secrets Operator. New stack stacks/external-secrets/ with Helm chart + 2 ClusterSecretStores (vault-kv for KV v2, vault-database for DB engine). Part C: Database secrets engine (in vault/main.tf). MySQL + PostgreSQL connections with static role rotation (24h). 6 MySQL roles (speedtest, wrongmove, codimd, nextcloud, shlink, grafana). 6 PostgreSQL roles (trading, health, linkwarden, affine, woodpecker, claude_memory). Part D: Kubernetes secrets engine (in vault/main.tf). RBAC for Vault SA to manage K8s tokens. Roles: dashboard-admin, ci-deployer, openclaw, local-admin. New scripts/vault-kubeconfig helper for dynamic kubeconfig. K8s auth method with scoped policies for CI, ESO, OpenClaw, Woodpecker sync.	2026-03-15 16:37:38 +00:00
Viktor Barzin	9f2ac0fd1a	[ci skip] update AGENTS.md + CLAUDE.md with SOPS workflow, add k8s-portal CI pipeline. AGENTS.md: added SOPS secrets management section, scripts/tg usage, contributor onboarding steps, pull-through cache bypass notes. CLAUDE.md: added SOPS workflow note, linux/amd64 build reminder, versioned tag guidance for pull-through cache. CI: new .woodpecker/k8s-portal.yml pipeline that auto-builds and deploys the k8s portal when files under stacks/platform/modules/k8s-portal/files/ change on master push. Uses buildx for linux/amd64.	2026-03-07 15:37:19 +00:00
Viktor Barzin	1f2c1ca361	[ci skip] phase 5+6: update CI pipelines for SOPS, add sensitive=true to secret vars. Phase 5 (CI pipelines): default.yml: add SOPS decrypt in prepare step, change git add . to specific paths (stacks/ state/ .woodpecker/), cleanup on success+failure; renew-tls.yml: change git add . to git add secrets/ state/. Phase 6 (sensitive=true): Add sensitive = true to 256 variable declarations across 149 stack files; Prevents secret values from appearing in terraform plan output; Does NOT modify shared modules (ingress_factory, nfs_volume) to avoid breaking module interface contracts. Note: CI pipeline SOPS decryption requires sops_age_key Woodpecker secret to be created before the pipeline will work with SOPS. Until then, the old terraform.tfvars path continues to function.	2026-03-07 14:30:36 +00:00
Viktor Barzin	b1a3685f50	remove f1-stream CI pipeline (moved to separate repo). The f1-stream project now lives at github.com/ViktorBarzin/f1-stream with its own Woodpecker CI pipeline.	2026-03-01 14:43:06 +00:00
Viktor Barzin	691c74aa30	fix: use inline cache to avoid cache_from comma splitting bug	2026-02-28 19:56:00 +00:00
Viktor Barzin	96c0353c13	[ci skip] add TLS to private registry, switch to registry.viktorbarzin.me	2026-02-28 19:40:38 +00:00
Viktor Barzin	140f48c6ee	fix: remove private registry from docker builds, push only to DockerHub. The BuildKit builder cannot push to the insecure HTTP registry at registry.viktorbarzin.lan:5050 because buildkit_config is not being applied by the plugin. Simplified to DockerHub-only push for now. Private registry caching and push can be re-added once buildkit_config issue is resolved.	2026-02-28 19:15:54 +00:00
Viktor Barzin	a1ba218cd2	[ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk. Major milestone - shared PostgreSQL moved from NFS to CloudNativePG: CNPG cluster (pg-cluster) running in dbaas namespace on local-path storage; PostGIS image (ghcr.io/cloudnative-pg/postgis:16) for dawarich compatibility; All 20 databases and 19 roles restored from pg_dumpall backup; postgresql.dbaas Service patched to point at CNPG primary; Old PG deployment scaled to 0 (NFS data intact for rollback); All 12+ dependent services verified running: authentik, n8n, dawarich, tandoor, linkwarden, netbox, woodpecker, rybbit, affine, health, resume, trading-bot, atuin; Authentik PgBouncer working through the switched endpoint. TODO: codify CNPG cluster in Terraform, add 2nd replica, update backup CronJob.	2026-02-28 19:08:06 +00:00
Viktor Barzin	b3eaf76684	fix: use cache_images instead of cache_from to avoid comma splitting. The plugin-docker-buildx (Codeberg version) changed CacheFrom from string to StringSlice, which causes urfave/cli to split on commas. The cache_images setting properly handles registry refs by generating both --cache-from and --cache-to flags automatically.	2026-02-28 18:53:08 +00:00
Viktor Barzin	87fc11121d	fix: use plain string for cache_from/cache_to and fix caretta helm_release. cache_from/cache_to must be plain strings, not YAML lists: the plugin-docker-buildx treats them as single string values and the Woodpecker settings layer was splitting comma-separated list items into separate --cache-from flags (type=registry and ref=... separately). caretta.tf: replace deprecated set{} blocks with values=[yamlencode()] to fix Terraform plan error with newer Helm provider.	2026-02-28 18:47:20 +00:00
Viktor Barzin	0ebf850893	fix: use YAML list for cache_from/cache_to to prevent comma splitting	2026-02-28 18:40:55 +00:00
Viktor Barzin	be47592e08	fix: remove deprecated secrets field from slack step	2026-02-28 18:32:10 +00:00
Viktor Barzin	4b6ade7b08	fix: replace removed woodpeckerci/plugin-slack with curl-based webhook	2026-02-28 18:25:23 +00:00
Viktor Barzin	c7bb324f64	[ci skip] add BuildKit layer caching and dual-push to f1-stream pipeline	2026-02-28 17:56:56 +00:00
Viktor Barzin	4a0fc18c60	[ci skip] add BuildKit layer caching and dual-push to build-cli pipeline	2026-02-28 17:56:52 +00:00
Viktor Barzin	9fd788b158	[ci skip] f1-stream: add CDN token refresh, SvelteKit frontend, multi-stream layout (Phases 6-8). Phase 6: CDN token lifecycle with 3-strategy URL matching and periodic refresh. Phase 7: SvelteKit 2/Svelte 5 frontend with schedule calendar and hls.js player. Phase 8: Multi-stream layout supporting up to 4 simultaneous HLS streams. Update Dockerfile to multi-stage build (Node.js frontend + Python backend). Switch deployment to :latest tag with Always pull policy for CI-driven deploys. Update Woodpecker CI to use explicit latest tag.	2026-02-23 23:59:35 +00:00
Viktor Barzin	f3bcd95242	[ci skip] f1-stream: replace Go service with Python/FastAPI skeleton. Replaces the existing Go-based f1-stream service with a new Python/FastAPI backend as the foundation for the rebuilt F1 streaming aggregation service. New FastAPI backend with health and root endpoints. Python 3.13 slim Dockerfile (replaces Go multi-stage build). Updated Terraform deployment (port 8000, reduced resources). Buildx-based redeploy.sh with --platform linux/amd64. Added Woodpecker CI pipeline for automated builds. Removed all old Go source, node_modules, static assets.	2026-02-23 22:47:06 +00:00
Viktor Barzin	ebecaaee5c	Woodpecker CI: use built-in clone, fix CoreDNS DNS resolution [CI SKIP]. Switch from custom clone override to woodpeckerci/plugin-git built-in clone (handles auth automatically via netrc from GitHub OAuth token). Add 8.8.8.8 and 1.1.1.1 as CoreDNS upstream resolvers alongside pfSense (fixes intermittent DNS timeouts causing clone failures). Fix missing comma after heredoc in audit-policy.tf (syntax error).	2026-02-23 00:08:42 +00:00
Viktor Barzin	cbf041bcc9	[ci skip] Add Woodpecker CI stack (WIP) and claude agents. Add stacks/woodpecker/ with Helm-based deployment config. Add .woodpecker/ CI pipeline configs (default, build-cli, renew-tls). Add NFS export entry for woodpecker. Add .claude/agents/ definitions.	2026-02-22 21:30:25 +00:00

22 commits