variable "tls_secret_name" {
  type      = string
  sensitive = true
}
variable "nfs_server" { type = string }
variable "postgresql_host" { type = string }
variable "woodpecker_forgejo_url" { type = string }

data "vault_kv_secret_v2" "secrets" {
  mount = "secret"
  name  = "woodpecker"
}

resource "kubernetes_namespace" "woodpecker" {
  metadata {
    name = "woodpecker"
    labels = {
      "resource-governance/custom-quota" = "true"
      tier                               = local.tiers.edge
    }
  }
}

resource "kubernetes_resource_quota" "woodpecker" {
  metadata {
    name      = "tier-quota"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  spec {
    hard = {
      "requests.cpu"    = "16"
      "requests.memory" = "16Gi"
      "limits.memory"   = "32Gi"
      pods              = "60"
    }
  }
}

module "tls_secret" {
  source          = "../../modules/kubernetes/setup_tls_secret"
  namespace       = kubernetes_namespace.woodpecker.metadata[0].name
  tls_secret_name = var.tls_secret_name
}

resource "kubernetes_manifest" "external_secret" {
  manifest = {
    apiVersion = "external-secrets.io/v1beta1"
    kind       = "ExternalSecret"
    metadata = {
      name      = "woodpecker-secrets"
      namespace = "woodpecker"
    }
    spec = {
      refreshInterval = "15m"
      secretStoreRef = {
        name = "vault-kv"
        kind = "ClusterSecretStore"
      }
      target = {
        name = "woodpecker-secrets"
      }
      dataFrom = [{
        extract = {
          key = "woodpecker"
        }
      }]
    }
  }
  depends_on = [kubernetes_namespace.woodpecker]
}
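
# Illustrative sketch only, not part of this stack: the ExternalSecret above
# asks ESO to materialize a Kubernetes Secret named "woodpecker-secrets" from
# the Vault path "woodpecker". Assuming a hypothetical workload managed with
# the kubernetes provider, its container block could load every key of that
# Secret as environment variables, without a plan-time Vault read:
#
#   container {
#     name  = "example"         # hypothetical container name
#     image = "example:latest"  # hypothetical image
#     env_from {
#       secret_ref {
#         name = "woodpecker-secrets"
#       }
#     }
#   }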

resource "kubernetes_config_map" "git_crypt_key" {
  metadata {
    name      = "git-crypt-key"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  data = {
    "key" = filebase64("${path.root}/../../.git/git-crypt/keys/default")
  }
}

# Database init job - creates the woodpecker database and user in PostgreSQL
resource "kubernetes_job" "db_init" {
  metadata {
    name      = "woodpecker-db-init"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  spec {
    template {
      metadata {}
      spec {
        container {
          name  = "db-init"
          image = "postgres:16-alpine"
          command = [
            "sh", "-c",
            <<-EOT
              set -e
              # Create user if not exists
              PGPASSWORD='${data.vault_kv_secret_v2.secrets.data["dbaas_root_password"]}' psql -h ${var.postgresql_host} -U root -tc "SELECT 1 FROM pg_roles WHERE rolname='woodpecker'" | grep -q 1 || \
              PGPASSWORD='${data.vault_kv_secret_v2.secrets.data["dbaas_root_password"]}' psql -h ${var.postgresql_host} -U root -c "CREATE ROLE woodpecker WITH LOGIN PASSWORD '${data.vault_kv_secret_v2.secrets.data["db_password"]}'"
              # Create database if not exists
              PGPASSWORD='${data.vault_kv_secret_v2.secrets.data["dbaas_root_password"]}' psql -h ${var.postgresql_host} -U root -tc "SELECT 1 FROM pg_database WHERE datname='woodpecker'" | grep -q 1 || \
              PGPASSWORD='${data.vault_kv_secret_v2.secrets.data["dbaas_root_password"]}' psql -h ${var.postgresql_host} -U root -c "CREATE DATABASE woodpecker OWNER woodpecker"
              echo "Database init complete"
            EOT
          ]
        }
        restart_policy = "Never"
      }
    }
    backoff_limit = 3
  }
  wait_for_completion = true
  timeouts {
    create = "2m"
  }
}

# NFS PV for Woodpecker server data (Helm chart creates PVC via StatefulSet VCT)
resource "kubernetes_persistent_volume" "woodpecker_server_data" {
  metadata {
    name = "woodpecker-server-data"
  }
  spec {
    capacity = {
      storage = "10Gi"
    }
    access_modes                     = ["ReadWriteOnce"]
    persistent_volume_reclaim_policy = "Retain"
    storage_class_name               = "nfs-truenas"
    volume_mode                      = "Filesystem"
    persistent_volume_source {
      csi {
        driver        = "nfs.csi.k8s.io"
        volume_handle = "woodpecker-server-data"
        volume_attributes = {
          server = var.nfs_server
          share  = "/mnt/main/woodpecker"
        }
      }
    }
    claim_ref {
      name      = "data-woodpecker-server-0"
      namespace = kubernetes_namespace.woodpecker.metadata[0].name
    }
  }
}

# Helm release for Woodpecker CI
resource "helm_release" "woodpecker" {
  name       = "woodpecker"
  namespace  = kubernetes_namespace.woodpecker.metadata[0].name
  repository = "oci://ghcr.io/woodpecker-ci/helm"
  chart      = "woodpecker"
  version    = "3.5.1"
  values = [
    templatefile("${path.module}/values.yaml", {
      github_client_id      = data.vault_kv_secret_v2.secrets.data["github_client_id"]
      github_client_secret  = data.vault_kv_secret_v2.secrets.data["github_client_secret"]
      agent_secret          = data.vault_kv_secret_v2.secrets.data["agent_secret"]
      db_password           = data.vault_kv_secret_v2.secrets.data["db_password"]
      postgresql_host       = var.postgresql_host
      forgejo_client_id     = data.vault_kv_secret_v2.secrets.data["forgejo_client_id"]
      forgejo_client_secret = data.vault_kv_secret_v2.secrets.data["forgejo_client_secret"]
      forgejo_url           = var.woodpecker_forgejo_url
    })
  ]
  timeout    = 600
  depends_on = [kubernetes_job.db_init, kubernetes_persistent_volume.woodpecker_server_data]
}

# ClusterRoleBinding - build pods need cluster-admin to PATCH deployments across namespaces
resource "kubernetes_cluster_role_binding" "woodpecker" {
  metadata {
    name = "woodpecker"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "woodpecker-agent"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  role_ref {
    kind      = "ClusterRole"
    name      = "cluster-admin"
    api_group = "rbac.authorization.k8s.io"
  }
}

# Also bind the default SA (pipeline pods run as default)
resource "kubernetes_cluster_role_binding" "woodpecker_default" {
  metadata {
    name = "woodpecker-default"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "default"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  role_ref {
    kind      = "ClusterRole"
    name      = "cluster-admin"
    api_group = "rbac.authorization.k8s.io"
  }
}

# --- Vault → Woodpecker Secret Sync ---
# Syncs secrets from Vault KV (secret/ci/global) to Woodpecker global secrets via API.
# Runs every 6 hours. Secrets are created/updated via Woodpecker REST API.
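#
# Illustrative sketch, under assumptions not confirmed by this file: the script
# expects a Vault Kubernetes auth role named "woodpecker-sync" and a
# "woodpecker_api_token" key in secret/ci/global; every other key at that path
# is pushed to Woodpecker as a global secret. Assuming a KV v2 mount at
# "secret/", the path could be seeded with the Vault CLI roughly like this
# (the key name docker_password and both values are placeholders):
#
#   vault kv put secret/ci/global \
#     woodpecker_api_token=<token> \
#     docker_password=<value>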
resource "kubernetes_config_map" "vault_woodpecker_sync" {
  metadata {
    name      = "vault-woodpecker-sync"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  data = {
    "sync.sh" = <<-SCRIPT
      #!/bin/sh
      set -e
      VAULT_ADDR="http://vault-active.vault.svc.cluster.local:8200"
      WP_API="http://woodpecker-server.woodpecker.svc.cluster.local:8000/api"

      # Authenticate to Vault via K8s SA
      SA_TOKEN=$$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
      VAULT_TOKEN=$$(curl -sf -X POST "$$VAULT_ADDR/v1/auth/kubernetes/login" \
        -d "{\"role\":\"woodpecker-sync\",\"jwt\":\"$$SA_TOKEN\"}" | jq -r .auth.client_token)

      if [ -z "$$VAULT_TOKEN" ] || [ "$$VAULT_TOKEN" = "null" ]; then
        echo "ERROR: Failed to authenticate to Vault"
        exit 1
      fi

      # Get Woodpecker API token from Vault
      WP_TOKEN=$$(curl -sf -H "X-Vault-Token: $$VAULT_TOKEN" \
        "$$VAULT_ADDR/v1/secret/data/ci/global" | jq -r '.data.data.woodpecker_api_token // empty')

      if [ -z "$$WP_TOKEN" ]; then
        echo "ERROR: No woodpecker_api_token in secret/ci/global"
        exit 1
      fi

      # Sync global secrets
      SECRETS=$$(curl -sf -H "X-Vault-Token: $$VAULT_TOKEN" \
        "$$VAULT_ADDR/v1/secret/data/ci/global" | jq -r '.data.data | to_entries[] | select(.key != "woodpecker_api_token") | @base64')

      synced=0
      for entry in $$SECRETS; do
        NAME=$$(echo "$$entry" | base64 -d | jq -r .key)
        VALUE=$$(echo "$$entry" | base64 -d | jq -r .value)

        # Try PATCH first (update), fall back to POST (create)
        STATUS=$$(curl -sf -o /dev/null -w "%%{http_code}" -X PATCH "$$WP_API/secrets/$$NAME" \
          -H "Authorization: Bearer $$WP_TOKEN" \
          -H "Content-Type: application/json" \
          -d "{\"name\":\"$$NAME\",\"value\":\"$$VALUE\",\"events\":[\"push\",\"tag\",\"deployment\"]}" 2>/dev/null || echo "000")

        if [ "$$STATUS" != "200" ]; then
          curl -sf -X POST "$$WP_API/secrets" \
            -H "Authorization: Bearer $$WP_TOKEN" \
            -H "Content-Type: application/json" \
            -d "{\"name\":\"$$NAME\",\"value\":\"$$VALUE\",\"events\":[\"push\",\"tag\",\"deployment\"]}" >/dev/null
        fi
        synced=$$((synced + 1))
      done
      echo "Synced $$synced global secrets from Vault to Woodpecker"
    SCRIPT
  }
}

resource "kubernetes_cron_job_v1" "vault_secret_sync" {
  metadata {
    name      = "vault-woodpecker-sync"
    namespace = kubernetes_namespace.woodpecker.metadata[0].name
  }
  spec {
    schedule                      = "0 */6 * * *"
    successful_jobs_history_limit = 3
    failed_jobs_history_limit     = 3
    concurrency_policy            = "Forbid"
    job_template {
      metadata {}
      spec {
        template {
          metadata {}
          spec {
            container {
              name    = "sync"
              image   = "curlimages/curl"
              command = ["/bin/sh", "/scripts/sync.sh"]
              volume_mount {
                name       = "sync-script"
                mount_path = "/scripts"
              }
              resources {
                requests = {
                  cpu    = "10m"
                  memory = "32Mi"
                }
                limits = {
                  memory = "64Mi"
                }
              }
            }
            volume {
              name = "sync-script"
              config_map {
                name = kubernetes_config_map.vault_woodpecker_sync.metadata[0].name
              }
            }
            restart_policy = "OnFailure"
          }
        }
      }
    }
  }
}

module "ingress" {
  source          = "../../modules/kubernetes/ingress_factory"
  namespace       = kubernetes_namespace.woodpecker.metadata[0].name
  name            = "ci"
  service_name    = "woodpecker-server"
  tls_secret_name = var.tls_secret_name
  extra_annotations = {
    "gethomepage.dev/enabled"      = "true"
    "gethomepage.dev/name"         = "Woodpecker CI"
    "gethomepage.dev/description"  = "CI/CD pipelines"
    "gethomepage.dev/icon"         = "woodpecker-ci.png"
    "gethomepage.dev/group"        = "Development & CI"
    "gethomepage.dev/pod-selector" = ""
  }
}