infra

Author	SHA1	Message	Date
Viktor Barzin	cbf041bcc9	[ci skip] Add Woodpecker CI stack (WIP) and claude agents - Add stacks/woodpecker/ with Helm-based deployment config - Add .woodpecker/ CI pipeline configs (default, build-cli, renew-tls) - Add NFS export entry for woodpecker - Add .claude/agents/ definitions	2026-02-22 21:30:25 +00:00
Viktor Barzin	c4db6a9fdb	[ci skip] Fix poison fetcher: use HTTP/1.1 for upstream (HTTP/2 hangs) The Poison Fountain upstream (rnsaffn.com/poison2/) doesn't respond properly over HTTP/2. Force HTTP/1.1 for reliable content fetching. Also fixed NFS directory permissions for non-root curl container.	2026-02-22 20:42:53 +00:00
Viktor Barzin	36cec7c83f	[ci skip] Add poison-fountain Terraform stack (deployment, service, ingress, CronJob)	2026-02-22 19:50:57 +00:00
Viktor Barzin	006f95337e	[ci skip] Add anti_ai_scraping option to ingress_factory (default: true)	2026-02-22 19:50:07 +00:00
Viktor Barzin	fd9b06266d	[ci skip] Add anti-AI scraping Traefik middlewares (ForwardAuth, headers, trap links)	2026-02-22 19:49:32 +00:00
Viktor Barzin	c277d28bd8	[ci skip] Add NFS export and DNS record for poison-fountain	2026-02-22 19:47:46 +00:00
Viktor Barzin	8a1636b931	[ci skip] Add poison fountain Python service and fetcher script	2026-02-22 19:46:43 +00:00
Viktor Barzin	5bc1a47cb8	[ci skip] Add anti-AI scraping implementation plan	2026-02-22 19:41:39 +00:00
Viktor Barzin	4a9fe474c6	[ci skip] Add anti-AI scraping system design doc	2026-02-22 19:37:29 +00:00
Viktor Barzin	5cfe6595cd	Apply only platform stack in CI (matches old pipeline scope)	2026-02-22 18:59:02 +00:00
Viktor Barzin	9415823ab8	Use --queue-ignore-errors for CI (infra stack needs Proxmox SSH)	2026-02-22 18:29:27 +00:00
Viktor Barzin	2547a155ed	Skip infra stack in CI, remove DRONE_IMAGE_CLONE setting	2026-02-22 18:21:10 +00:00
Viktor Barzin	45b4528f0f	Add clone retry logic for intermittent DNS failures	2026-02-22 18:10:31 +00:00
Viktor Barzin	9dd284204e	Retry CI - test DNS resolution	2026-02-22 18:07:28 +00:00
Viktor Barzin	45f4459dbc	Use manual clone with alpine instead of drone/git (pull-through cache issue)	2026-02-22 18:05:53 +00:00
Viktor Barzin	8df72b545c	Test CI with drone/git:linux-amd64 clone image	2026-02-22 18:02:28 +00:00
Viktor Barzin	d2446fb4a3	Test CI pipeline with fixed clone image	2026-02-22 17:54:52 +00:00
Viktor Barzin	97f73eca0d	Trigger CI build to test updated Drone pipeline	2026-02-22 17:50:47 +00:00
Viktor Barzin	9ee3140b34	Update Drone CI pipeline for Terragrunt stack architecture Default pipeline now uses terragrunt run --all to apply all stacks instead of the broken terraform apply -target=module.kubernetes_cluster. TLS renewal pipeline stripped of unnecessary Terraform download/init since renew2.sh is pure shell (certbot + Cloudflare DNS).	2026-02-22 17:47:06 +00:00
Viktor Barzin	35488f4ef6	[ci skip] Fix Drone clone image: use alpine/git via DRONE_IMAGE_CLONE The drone/git:latest image was failing to pull through the registry cache (corrupted blobs, unexpected EOF). Set DRONE_IMAGE_CLONE on the Kubernetes runner to use alpine/git:latest globally for all pipelines.	2026-02-22 17:35:04 +00:00
Viktor Barzin	5501b5cfbf	[ci skip] Increase authentik ResourceQuota limits Authentik is a critical auth service that was at 83% CPU/memory quota utilization. Double all limits to prevent throttling.	2026-02-22 17:28:41 +00:00
Viktor Barzin	116c4d9c30	[ci skip] Remove legacy files and orphaned modules Delete 20 orphaned module directories and 3 stray files from modules/kubernetes/ that are no longer referenced by any stack. Remove 7 root-level legacy files including the empty tfstate, 27MB terraform zip, commented-out main.tf, and migration notes. Clean up commented-out dockerhub_secret and oauth-proxy references in blog, travel_blog, and city-guesser stacks. Remove stale frigate config.yaml entry from .gitignore. Remove ephemeral docs/plans/ directory.	2026-02-22 15:23:27 +00:00
Viktor Barzin	c7c7047f1c	[ci skip] Flatten module wrappers into stack roots Remove the module "xxx" { source = "./module" } indirection layer from all 66 service stacks. Resources are now defined directly in each stack's main.tf instead of through a wrapper module. - Merge module/main.tf contents into stack main.tf - Apply variable replacements (var.tier -> local.tiers.X, renamed vars) - Fix shared module paths (one fewer ../ at each level) - Move extra files/dirs (factory/, chart_values, subdirs) to stack root - Update state files to strip module.<name>. prefix - Update CLAUDE.md to reflect flat structure Verified: terragrunt plan shows 0 add, 0 destroy across all stacks.	2026-02-22 15:13:55 +00:00
Viktor Barzin	b0499a7f31	[ci skip] Update CLAUDE.md for module colocation Reflect new directory structure where service modules live inside their stack directories (stacks/<service>/module/) instead of modules/kubernetes/<service>/. Update file paths, adding service instructions, and stack structure documentation.	2026-02-22 14:39:22 +00:00
Viktor Barzin	e6420c7b36	[ci skip] Move Terraform modules into stack directories Move all 88 service modules (66 individual + 22 platform) from modules/kubernetes/<service>/ into their corresponding stack directories: - Service stacks: stacks/<service>/module/ - Platform stack: stacks/platform/modules/<service>/ This collocates module source code with its Terragrunt definition. Only shared utility modules remain in modules/kubernetes/: ingress_factory, setup_tls_secret, dockerhub_secret, oauth-proxy. All cross-references to shared modules updated to use correct relative paths. Verified with terragrunt run --all -- plan: 0 adds, 0 destroys across all 68 stacks.	2026-02-22 14:38:14 +00:00
Viktor Barzin	7ef1a0a8bb	[ci skip] Update CLAUDE.md for Terragrunt migration	2026-02-22 14:12:37 +00:00
Viktor Barzin	e2522ad9f1	[ci skip] Fix variable type mismatches in owntracks, ollama, tandoor stacks - owntracks_credentials: string -> map(string) - ollama_api_credentials: string -> map(string) - tandoor_email_password: add default="" (not in tfvars)	2026-02-22 14:07:33 +00:00
Viktor Barzin	945a5f35b0	[ci skip] Fix path.root references for git-crypt key in openclaw and drone Modules used filebase64("${path.root}/.git/git-crypt/keys/default") which breaks with Terragrunt since path.root is now stacks/<service>/ instead of repo root. Changed to accept git_crypt_key_base64 variable and resolve the path in the stack wrapper.	2026-02-22 14:01:02 +00:00
Viktor Barzin	71bfdc8e89	[ci skip] Phase 3: Remove migrated service modules from monolith All 66 service modules removed from modules/kubernetes/main.tf (now just a migration notice). The kubernetes_cluster module block removed from root main.tf. All services now managed via stacks/<service>/.	2026-02-22 13:58:07 +00:00
Viktor Barzin	a9ba8899be	[ci skip] Phase 3: Create 66 service stacks and migrate state Generated individual stack directories for all 66 services under stacks/. Each stack has terragrunt.hcl (depends on platform) and main.tf (thin wrapper calling existing module). Migrated all 64 active service states from root terraform.tfstate to individual state files. Root state is now empty. Verified with terragrunt plan on multiple stacks (no changes).	2026-02-22 13:56:34 +00:00
Viktor Barzin	39ce2000cf	[ci skip] Remove 22 platform services from modules/kubernetes/main.tf Migrated to stacks/platform/: metallb, dbaas, redis, traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec, monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno, uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance. Also removed null_resource.core_services and all depends_on references to it from the remaining ~66 service modules.	2026-02-22 13:40:45 +00:00
Viktor Barzin	7c4d32922a	[ci skip] Migrate 22 platform service states to stacks/platform State migration for all platform services from root state to state/stacks/platform/terraform.tfstate. Key changes: - module.kubernetes_cluster.module.X["key"] -> module.X - Removed null_resource.core_services from root state - Imported traefik helm_release (was missing from state) - Fixed helm provider syntax (kubernetes = {} not kubernetes {}) - Added secrets symlink for TLS cert file() resolution - Platform terragrunt plan: 0 add, 24 change (cosmetic drift), 0 destroy	2026-02-22 13:35:10 +00:00
Viktor Barzin	e2fcf4df45	[ci skip] Add platform stack (core services) for Terragrunt migration stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis, traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec, monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno, uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance. Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port, smtp_host/port — consumed by downstream service stacks via dependency blocks.	2026-02-22 13:21:09 +00:00
Viktor Barzin	8715477d39	[ci skip] Migrate infra modules (VM templates, docker-registry) to stacks/infra/ Remove k8s-node-template, non-k8s-node-template, docker-registry-template, and docker-registry-vm modules from root main.tf. These are now managed by the Terragrunt infra stack at stacks/infra/. State migration verified: both root and infra stack show "No changes".	2026-02-22 13:11:16 +00:00
Viktor Barzin	f096a889d6	[ci skip] Add infra stack (Proxmox VMs)	2026-02-22 13:04:49 +00:00
Viktor Barzin	f962349465	[ci skip] Add Terragrunt directory skeleton and root config	2026-02-22 13:01:37 +00:00
Viktor Barzin	db659b1f7a	[ci skip] Fix dashy OOMKilled and healthcheck DNS false-failure - Add explicit resource limits to dashy (2Gi memory) to prevent OOMKilled during webpack build on startup - Rewrite DNS healthcheck to test from inside the Technitium pod via kubectl exec, since MetalLB virtual IPs aren't reachable from outside the L2 network - Deleted orphaned kured/tls-secret (expired Oct 2025, module disabled, not mounted by kured DaemonSet)	2026-02-22 12:46:12 +00:00
Viktor Barzin	f05bf109c5	[ci skip] Increase Drone CI resource quota to handle concurrent builds Each build pod has 8-10 containers inheriting 1 CPU / 2Gi limits from LimitRange defaults. With 4+ concurrent builds the old quota (48 CPU / 96Gi / 30 pods) was exhausted, blocking new builds. Increase to 64 CPU / 128Gi / 60 pods to safely support 5-6 concurrent builds.	2026-02-22 12:28:42 +00:00
Viktor Barzin	00dc78e0d2	[ci skip] Fix Uptime Kuma false-down reports: use bulk heartbeat API instead of per-monitor calls	2026-02-22 01:37:28 +00:00
Viktor Barzin	0ff2aaec60	[ci skip] Add native HLS playback for VIPLeague/DaddyLive streams (v1.3.1) - Add HLS proxy (hlsproxy) for rewriting m3u8 playlists and proxying segments with correct Referer/Origin headers (uses ?domain= param) - Add playerconfig service for detecting stream types (VIPLeague, DaddyLive, HLS) and extracting auth params from ksohls pages - Add VIPLeague URL resolution: extract slug from URL path, match against DaddyLive 24/7 channel index with token-based scoring - Replace Clappr with direct HLS.js player for better compatibility - Add CryptoJS CDN for DaddyLive auth module support - Disable CrowdSec on f1-stream ingress to prevent false positives - Bump image to v1.3.1	2026-02-22 01:30:06 +00:00
Viktor Barzin	a5f9c1595f	[ci skip] Fix Python f-string quoting in Slack formatter	2026-02-22 01:23:56 +00:00
Viktor Barzin	b8029351fd	[ci skip] Improve Slack message formatting with human-readable names and structured blocks	2026-02-22 01:16:47 +00:00
Viktor Barzin	a90710950b	[ci skip] Upgrade cluster-health.sh to full 24-check version with Slack	2026-02-22 01:09:02 +00:00
Viktor Barzin	284fee15ca	[ci skip] Fix Slack message formatting to use Block Kit mrkdwn	2026-02-22 01:00:48 +00:00
Viktor Barzin	e59928187b	[ci skip] Set CronJob backoffLimit=0 to prevent duplicate Slack alerts	2026-02-22 00:59:34 +00:00
Viktor Barzin	c1ee757c6b	[ci skip] Add Terragrunt migration implementation plan	2026-02-22 00:51:00 +00:00
Viktor Barzin	209355d1af	[ci skip] Add Terragrunt migration design document	2026-02-22 00:46:57 +00:00
Viktor Barzin	cd0c030a55	[ci skip] Fix CronJob kubectl image tag to :latest	2026-02-22 00:38:33 +00:00
Viktor Barzin	f79e84c693	[ci skip] Add cluster health check CronJob to OpenClaw module	2026-02-22 00:08:51 +00:00
Viktor Barzin	8b5b389f31	[ci skip] Add cluster-health skill for OpenClaw agent	2026-02-22 00:04:15 +00:00

1342 commits