[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
# =============================================================================
|
|
|
|
|
# Platform Stack — Core & Cluster Services
|
|
|
|
|
# =============================================================================
|
|
|
|
|
#
|
|
|
|
|
# This stack groups ~22 core/cluster services that form the platform layer.
|
|
|
|
|
# These services are always present (no DEFCON gating) and provide the
|
|
|
|
|
# foundational infrastructure that application stacks depend on.
|
|
|
|
|
#
|
|
|
|
|
# Services included:
|
|
|
|
|
# metallb, dbaas, cloudflared, infra-maintenance,
|
|
|
|
|
# redis, traefik, technitium, headscale, authentik, rbac, k8s-portal,
|
2026-02-25 21:54:01 +00:00
|
|
|
# crowdsec, monitoring, vaultwarden, reverse-proxy, metrics-server, vpa,
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
# nvidia, kyverno, uptime-kuma, wireguard, xray, mailserver
|
|
|
|
|
# =============================================================================
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Tier Definitions
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
|
|
|
|
|
# =============================================================================
|
|
|
|
|
# Variable Declarations
|
|
|
|
|
# =============================================================================
|
|
|
|
|
|
2026-03-14 17:15:48 +00:00
|
|
|
# --- Core (non-secret, from config.tfvars) ---
|
2026-03-07 14:30:36 +00:00
|
|
|
variable "tls_secret_name" {
|
|
|
|
|
type = string
|
|
|
|
|
}
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
variable "nfs_server" { type = string }
|
|
|
|
|
variable "redis_host" { type = string }
|
|
|
|
|
variable "postgresql_host" { type = string }
|
|
|
|
|
variable "mysql_host" { type = string }
|
|
|
|
|
variable "ollama_host" { type = string }
|
|
|
|
|
variable "mail_host" { type = string }
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
variable "prod" {
|
|
|
|
|
type = bool
|
|
|
|
|
default = false
|
|
|
|
|
}
|
[ci skip] k8s portal: fix setup script + add onboarding hub (5 new pages)
Bug fixes:
- CA cert now populated in ConfigMap (was empty → TLS failures)
- Remove useless heredoc quote escaping in setup script
- Fix homepage: VPN callout, correct verification command (get namespaces)
- Fix false-positive sensitive=true on ingress_path, tls_secret_name,
truenas_host, ollama_host, client_certificate_secret_name
New pages (direct Svelte, no mdsvex dependency):
- /onboarding: step-by-step guide (VPN, kubectl, git, first PR)
- /architecture: cluster topology, storage, networking, tiers
- /services: catalog of 70+ services with URLs
- /contributing: PR workflow, what you can/can't change, NEVER list
- /troubleshooting: common issues and fixes
Navigation bar added to layout. All pages use consistent docs styling.
Requires Docker image rebuild: cd stacks/platform/modules/k8s-portal/files
&& docker build -t viktorbarzin/k8s-portal:latest . && docker push
2026-03-07 15:06:26 +00:00
|
|
|
variable "k8s_ca_cert" {
|
|
|
|
|
type = string
|
|
|
|
|
default = ""
|
|
|
|
|
}
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
variable "ssh_private_key" {
|
|
|
|
|
type = string
|
|
|
|
|
default = ""
|
|
|
|
|
sensitive = true
|
|
|
|
|
}
|
|
|
|
|
variable "cloudflare_email" { type = string }
|
|
|
|
|
variable "cloudflare_account_id" { type = string }
|
|
|
|
|
variable "cloudflare_zone_id" { type = string }
|
|
|
|
|
variable "cloudflare_tunnel_id" { type = string }
|
|
|
|
|
variable "public_ip" { type = string }
|
|
|
|
|
variable "cloudflare_proxied_names" {}
|
|
|
|
|
variable "cloudflare_non_proxied_names" {}
|
|
|
|
|
variable "monitoring_idrac_username" { type = string }
|
|
|
|
|
|
2026-03-14 17:15:48 +00:00
|
|
|
# --- Vault KV secrets ---
|
|
|
|
|
data "vault_kv_secret_v2" "secrets" {
|
|
|
|
|
mount = "secret"
|
|
|
|
|
name = "platform"
|
2026-03-07 14:30:36 +00:00
|
|
|
}
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
|
2026-03-14 17:15:48 +00:00
|
|
|
locals {
|
2026-03-15 00:03:59 +00:00
|
|
|
homepage_credentials = jsondecode(data.vault_kv_secret_v2.secrets.data["homepage_credentials"])
|
|
|
|
|
k8s_users = jsondecode(data.vault_kv_secret_v2.secrets.data["k8s_users"])
|
|
|
|
|
xray_reality_clients = jsondecode(data.vault_kv_secret_v2.secrets.data["xray_reality_clients"])
|
|
|
|
|
xray_reality_short_ids = jsondecode(data.vault_kv_secret_v2.secrets.data["xray_reality_short_ids"])
|
|
|
|
|
mailserver_accounts = jsondecode(data.vault_kv_secret_v2.secrets.data["mailserver_accounts"])
|
|
|
|
|
mailserver_aliases = jsondecode(data.vault_kv_secret_v2.secrets.data["mailserver_aliases"])
|
2026-03-14 17:15:48 +00:00
|
|
|
mailserver_opendkim_key = jsondecode(data.vault_kv_secret_v2.secrets.data["mailserver_opendkim_key"])
|
|
|
|
|
mailserver_sasl_passwd = jsondecode(data.vault_kv_secret_v2.secrets.data["mailserver_sasl_passwd"])
|
add generic multi-user cluster onboarding system
Data-driven user onboarding: add a JSON entry to Vault KV k8s_users,
apply vault + platform + woodpecker stacks, and everything is auto-generated.
Vault stack: namespace creation, per-user Vault policies with secret isolation
via identity entities/aliases, K8s deployer roles, CI policy update.
Platform stack: domains field in k8s_users type, TLS secrets per user namespace,
user domains merged into Cloudflare DNS, user-roles ConfigMap mounted in portal.
Woodpecker stack: admin list auto-generated from k8s_users, WOODPECKER_OPEN=true.
K8s-portal: dual-track onboarding (general/namespace-owner), namespace-owner
dashboard with Vault/kubectl commands, setup script adds Vault+Terraform+Terragrunt,
contributing page with CI pipeline template, versioned image tags in CI pipeline.
New: stacks/_template/ with copyable stack template for namespace-owners.
2026-03-15 22:23:36 +00:00
|
|
|
|
|
|
|
|
# User domains from namespace-owners for DNS/Cloudflare
|
|
|
|
|
user_domains = flatten([
|
2026-03-15 23:36:46 +00:00
|
|
|
for name, user in local.k8s_users : lookup(user, "domains", [])
|
add generic multi-user cluster onboarding system
Data-driven user onboarding: add a JSON entry to Vault KV k8s_users,
apply vault + platform + woodpecker stacks, and everything is auto-generated.
Vault stack: namespace creation, per-user Vault policies with secret isolation
via identity entities/aliases, K8s deployer roles, CI policy update.
Platform stack: domains field in k8s_users type, TLS secrets per user namespace,
user domains merged into Cloudflare DNS, user-roles ConfigMap mounted in portal.
Woodpecker stack: admin list auto-generated from k8s_users, WOODPECKER_OPEN=true.
K8s-portal: dual-track onboarding (general/namespace-owner), namespace-owner
dashboard with Vault/kubectl commands, setup script adds Vault+Terraform+Terragrunt,
contributing page with CI pipeline template, versioned image tags in CI pipeline.
New: stacks/_template/ with copyable stack template for namespace-owners.
2026-03-15 22:23:36 +00:00
|
|
|
if user.role == "namespace-owner"
|
|
|
|
|
])
|
[ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup
- Migrate MySQL/PostgreSQL storage from local-path to iscsi-truenas
- Add democratic-csi iSCSI driver module for TrueNAS
- Add open-iscsi to cloud-init VM template
- Fix Shlink health probe path (/api/v3 -> /rest/v3 for Shlink 5.0)
- Fix etcd backup: use etcd 3.5.21-0 (3.6.x is distroless, no /bin/sh)
- Fix cluster healthcheck CronJob: always exit 0 to prevent circular
JobFailed alerts (reporting via Slack, not exit codes)
- Fix Uptime Kuma nested list handling in cluster-health.sh
- Add health probes to: audiobookshelf, immich ML, ntfy, headscale,
uptime-kuma, vaultwarden, rybbit (clickhouse + server + client),
shlink, shlink-web
- Add iSCSI storage documentation to CLAUDE.md
2026-03-06 19:54:21 +00:00
|
|
|
}
|
|
|
|
|
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
# =============================================================================
|
|
|
|
|
# Module Calls
|
|
|
|
|
# =============================================================================
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# MetalLB — L2 load balancer
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "metallb" {
|
2026-02-22 14:38:14 +00:00
|
|
|
source = "./modules/metallb"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.core
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# DBaaS — MySQL + PostgreSQL + pgAdmin
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "dbaas" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/dbaas"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
prod = var.prod
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
2026-03-14 17:15:48 +00:00
|
|
|
dbaas_root_password = data.vault_kv_secret_v2.secrets.data["dbaas_root_password"]
|
|
|
|
|
postgresql_root_password = data.vault_kv_secret_v2.secrets.data["dbaas_postgresql_root_password"]
|
|
|
|
|
pgadmin_password = data.vault_kv_secret_v2.secrets.data["dbaas_pgadmin_password"]
|
2026-02-28 19:23:36 +00:00
|
|
|
kube_config_path = var.kube_config_path
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Redis — Shared Redis instance
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "redis" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/redis"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Traefik — Ingress controller (Helm)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "traefik" {
|
2026-03-01 14:13:05 +00:00
|
|
|
source = "./modules/traefik"
|
|
|
|
|
tier = local.tiers.core
|
2026-03-14 17:15:48 +00:00
|
|
|
crowdsec_api_key = data.vault_kv_secret_v2.secrets.data["ingress_crowdsec_api_key"]
|
2026-03-01 14:13:05 +00:00
|
|
|
redis_host = var.redis_host
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
2026-03-14 17:15:48 +00:00
|
|
|
auth_fallback_htpasswd = data.vault_kv_secret_v2.secrets.data["auth_fallback_htpasswd"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Technitium — DNS server
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "technitium" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/technitium"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
|
|
|
|
mysql_host = var.mysql_host
|
2026-03-14 17:15:48 +00:00
|
|
|
homepage_token = local.homepage_credentials["technitium"]["token"]
|
|
|
|
|
technitium_db_password = data.vault_kv_secret_v2.secrets.data["technitium_db_password"]
|
|
|
|
|
technitium_username = data.vault_kv_secret_v2.secrets.data["technitium_username"]
|
|
|
|
|
technitium_password = data.vault_kv_secret_v2.secrets.data["technitium_password"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.core
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Headscale — Tailscale control server
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "headscale" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/headscale"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
2026-03-14 17:15:48 +00:00
|
|
|
headscale_config = data.vault_kv_secret_v2.secrets.data["headscale_config"]
|
|
|
|
|
headscale_acl = data.vault_kv_secret_v2.secrets.data["headscale_acl"]
|
|
|
|
|
homepage_token = try(local.homepage_credentials["headscale"]["api_key"], "")
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.core
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Authentik — Identity provider (SSO)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "authentik" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/authentik"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
2026-03-14 17:15:48 +00:00
|
|
|
secret_key = data.vault_kv_secret_v2.secrets.data["authentik_secret_key"]
|
|
|
|
|
postgres_password = data.vault_kv_secret_v2.secrets.data["authentik_postgres_password"]
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
redis_host = var.redis_host
|
2026-03-14 17:15:48 +00:00
|
|
|
homepage_token = try(local.homepage_credentials["authentik"]["token"], "")
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# RBAC — Kubernetes OIDC RBAC (depends on Authentik)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "rbac" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/rbac"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
2026-03-14 17:15:48 +00:00
|
|
|
k8s_users = local.k8s_users
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
ssh_private_key = var.ssh_private_key
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# K8s Portal — Self-service Kubernetes portal (depends on Authentik)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "k8s-portal" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/k8s-portal"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.edge
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] k8s portal: fix setup script + add onboarding hub (5 new pages)
Bug fixes:
- CA cert now populated in ConfigMap (was empty → TLS failures)
- Remove useless heredoc quote escaping in setup script
- Fix homepage: VPN callout, correct verification command (get namespaces)
- Fix false-positive sensitive=true on ingress_path, tls_secret_name,
truenas_host, ollama_host, client_certificate_secret_name
New pages (direct Svelte, no mdsvex dependency):
- /onboarding: step-by-step guide (VPN, kubectl, git, first PR)
- /architecture: cluster topology, storage, networking, tiers
- /services: catalog of 70+ services with URLs
- /contributing: PR workflow, what you can/can't change, NEVER list
- /troubleshooting: common issues and fixes
Navigation bar added to layout. All pages use consistent docs styling.
Requires Docker image rebuild: cd stacks/platform/modules/k8s-portal/files
&& docker build -t viktorbarzin/k8s-portal:latest . && docker push
2026-03-07 15:06:26 +00:00
|
|
|
k8s_ca_cert = var.k8s_ca_cert
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# CrowdSec — Security/WAF
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "crowdsec" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/crowdsec"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
mysql_host = var.mysql_host
|
2026-03-14 17:15:48 +00:00
|
|
|
homepage_username = local.homepage_credentials["crowdsec"]["username"]
|
|
|
|
|
homepage_password = local.homepage_credentials["crowdsec"]["password"]
|
|
|
|
|
enroll_key = data.vault_kv_secret_v2.secrets.data["crowdsec_enroll_key"]
|
|
|
|
|
db_password = data.vault_kv_secret_v2.secrets.data["crowdsec_db_password"]
|
|
|
|
|
crowdsec_dash_api_key = data.vault_kv_secret_v2.secrets.data["crowdsec_dash_api_key"]
|
|
|
|
|
crowdsec_dash_machine_id = data.vault_kv_secret_v2.secrets.data["crowdsec_dash_machine_id"]
|
|
|
|
|
crowdsec_dash_machine_password = data.vault_kv_secret_v2.secrets.data["crowdsec_dash_machine_password"]
|
|
|
|
|
slack_webhook_url = data.vault_kv_secret_v2.secrets.data["alertmanager_slack_api_url"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Monitoring — Prometheus / Grafana / Loki stack
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "monitoring" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/monitoring"
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
|
|
|
|
mysql_host = var.mysql_host
|
2026-03-14 17:15:48 +00:00
|
|
|
alertmanager_account_password = data.vault_kv_secret_v2.secrets.data["alertmanager_account_password"]
|
2026-02-22 15:13:55 +00:00
|
|
|
idrac_username = var.monitoring_idrac_username
|
2026-03-14 17:15:48 +00:00
|
|
|
idrac_password = data.vault_kv_secret_v2.secrets.data["monitoring_idrac_password"]
|
|
|
|
|
alertmanager_slack_api_url = data.vault_kv_secret_v2.secrets.data["alertmanager_slack_api_url"]
|
|
|
|
|
tiny_tuya_service_secret = data.vault_kv_secret_v2.secrets.data["tiny_tuya_service_secret"]
|
|
|
|
|
haos_api_token = data.vault_kv_secret_v2.secrets.data["haos_api_token"]
|
|
|
|
|
pve_password = data.vault_kv_secret_v2.secrets.data["pve_password"]
|
|
|
|
|
grafana_db_password = data.vault_kv_secret_v2.secrets.data["grafana_db_password"]
|
|
|
|
|
grafana_admin_password = data.vault_kv_secret_v2.secrets.data["grafana_admin_password"]
|
2026-02-22 15:13:55 +00:00
|
|
|
tier = local.tiers.cluster
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Vaultwarden — Password manager
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "vaultwarden" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/vaultwarden"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
mail_host = var.mail_host
|
2026-03-14 17:15:48 +00:00
|
|
|
smtp_password = data.vault_kv_secret_v2.secrets.data["vaultwarden_smtp_password"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.edge
|
2026-03-15 00:03:59 +00:00
|
|
|
nfs_server = var.nfs_server
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Reverse Proxy — Generic reverse proxy
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "reverse-proxy" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/reverse_proxy"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
2026-03-14 17:15:48 +00:00
|
|
|
truenas_homepage_token = local.homepage_credentials["reverse_proxy"]["truenas_token"]
|
|
|
|
|
pfsense_homepage_token = local.homepage_credentials["reverse_proxy"]["pfsense_token"]
|
|
|
|
|
haos_homepage_token = try(local.homepage_credentials["home_assistant"]["token"], "")
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Metrics Server — Kubernetes metrics
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "metrics-server" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/metrics-server"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
}
|
|
|
|
|
|
2026-02-25 21:54:01 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# VPA + Goldilocks — Vertical Pod Autoscaler & resource dashboard
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "vpa" {
|
|
|
|
|
source = "./modules/vpa"
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
2026-03-01 23:38:58 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# NFS CSI — CSI driver for NFS with soft mount options (no stale mount hangs)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "nfs-csi" {
|
|
|
|
|
source = "./modules/nfs-csi"
|
|
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
nfs_server = var.nfs_server
|
|
|
|
|
}
|
|
|
|
|
|
[ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup
- Migrate MySQL/PostgreSQL storage from local-path to iscsi-truenas
- Add democratic-csi iSCSI driver module for TrueNAS
- Add open-iscsi to cloud-init VM template
- Fix Shlink health probe path (/api/v3 -> /rest/v3 for Shlink 5.0)
- Fix etcd backup: use etcd 3.5.21-0 (3.6.x is distroless, no /bin/sh)
- Fix cluster healthcheck CronJob: always exit 0 to prevent circular
JobFailed alerts (reporting via Slack, not exit codes)
- Fix Uptime Kuma nested list handling in cluster-health.sh
- Add health probes to: audiobookshelf, immich ML, ntfy, headscale,
uptime-kuma, vaultwarden, rybbit (clickhouse + server + client),
shlink, shlink-web
- Add iSCSI storage documentation to CLAUDE.md
2026-03-06 19:54:21 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# iSCSI CSI — democratic-csi for TrueNAS iSCSI (database storage)
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "iscsi-csi" {
|
2026-03-14 08:51:45 +00:00
|
|
|
source = "./modules/iscsi-csi"
|
|
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
truenas_host = var.nfs_server # Same TrueNAS host
|
2026-03-14 17:15:48 +00:00
|
|
|
truenas_api_key = data.vault_kv_secret_v2.secrets.data["truenas_api_key"]
|
|
|
|
|
truenas_ssh_private_key = data.vault_kv_secret_v2.secrets.data["truenas_ssh_private_key"]
|
[ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup
- Migrate MySQL/PostgreSQL storage from local-path to iscsi-truenas
- Add democratic-csi iSCSI driver module for TrueNAS
- Add open-iscsi to cloud-init VM template
- Fix Shlink health probe path (/api/v3 -> /rest/v3 for Shlink 5.0)
- Fix etcd backup: use etcd 3.5.21-0 (3.6.x is distroless, no /bin/sh)
- Fix cluster healthcheck CronJob: always exit 0 to prevent circular
JobFailed alerts (reporting via Slack, not exit codes)
- Fix Uptime Kuma nested list handling in cluster-health.sh
- Add health probes to: audiobookshelf, immich ML, ntfy, headscale,
uptime-kuma, vaultwarden, rybbit (clickhouse + server + client),
shlink, shlink-web
- Add iSCSI storage documentation to CLAUDE.md
2026-03-06 19:54:21 +00:00
|
|
|
}
|
|
|
|
|
|
2026-02-28 17:22:53 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# CNPG — CloudNativePG Operator + local-path-provisioner for database storage
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "cnpg" {
|
|
|
|
|
source = "./modules/cnpg"
|
|
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
2026-03-08 19:49:48 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Sealed Secrets — encrypts secrets for safe git storage
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "sealed-secrets" {
|
|
|
|
|
source = "./modules/sealed-secrets"
|
|
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# NVIDIA — GPU device plugin
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "nvidia" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/nvidia"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
tier = local.tiers.gpu
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Kyverno — Policy engine
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "kyverno" {
|
2026-02-22 14:38:14 +00:00
|
|
|
source = "./modules/kyverno"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Uptime Kuma — Status monitoring
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "uptime-kuma" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/uptime-kuma"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.cluster
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# WireGuard — VPN server
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "wireguard" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/wireguard"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
2026-03-14 17:15:48 +00:00
|
|
|
wg_0_conf = data.vault_kv_secret_v2.secrets.data["wireguard_wg_0_conf"]
|
|
|
|
|
wg_0_key = data.vault_kv_secret_v2.secrets.data["wireguard_wg_0_key"]
|
|
|
|
|
firewall_sh = data.vault_kv_secret_v2.secrets.data["wireguard_firewall_sh"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.core
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Xray — Proxy/tunnel
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "xray" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/xray"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
tier = local.tiers.core
|
|
|
|
|
|
2026-03-14 17:15:48 +00:00
|
|
|
xray_reality_clients = local.xray_reality_clients
|
|
|
|
|
xray_reality_private_key = data.vault_kv_secret_v2.secrets.data["xray_reality_private_key"]
|
|
|
|
|
xray_reality_short_ids = local.xray_reality_short_ids
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Mailserver — docker-mailserver
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "mailserver" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/mailserver"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tls_secret_name = var.tls_secret_name
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
|
|
|
|
mysql_host = var.mysql_host
|
2026-03-14 17:15:48 +00:00
|
|
|
mailserver_accounts = local.mailserver_accounts
|
|
|
|
|
postfix_account_aliases = local.mailserver_aliases
|
|
|
|
|
opendkim_key = local.mailserver_opendkim_key
|
|
|
|
|
sasl_passwd = local.mailserver_sasl_passwd
|
|
|
|
|
roundcube_db_password = data.vault_kv_secret_v2.secrets.data["mailserver_roundcubemail_db_password"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.edge
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Cloudflared — Cloudflare tunnel + DNS records
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "cloudflared" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/cloudflared"
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
tier = local.tiers.core
|
|
|
|
|
tls_secret_name = var.tls_secret_name
|
|
|
|
|
|
2026-03-14 17:15:48 +00:00
|
|
|
cloudflare_api_key = data.vault_kv_secret_v2.secrets.data["cloudflare_api_key"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
cloudflare_email = var.cloudflare_email
|
|
|
|
|
cloudflare_account_id = var.cloudflare_account_id
|
|
|
|
|
cloudflare_zone_id = var.cloudflare_zone_id
|
|
|
|
|
cloudflare_tunnel_id = var.cloudflare_tunnel_id
|
|
|
|
|
public_ip = var.public_ip
|
2026-03-15 23:36:46 +00:00
|
|
|
cloudflare_proxied_names = concat(var.cloudflare_proxied_names, nonsensitive(local.user_domains))
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
cloudflare_non_proxied_names = var.cloudflare_non_proxied_names
|
2026-03-14 17:15:48 +00:00
|
|
|
cloudflare_tunnel_token = data.vault_kv_secret_v2.secrets.data["cloudflare_tunnel_token"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
# Infra Maintenance — Automated maintenance jobs
|
|
|
|
|
# -----------------------------------------------------------------------------
|
|
|
|
|
module "infra-maintenance" {
|
2026-02-22 15:13:55 +00:00
|
|
|
source = "./modules/infra-maintenance"
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
nfs_server = var.nfs_server
|
2026-03-14 17:15:48 +00:00
|
|
|
git_user = data.vault_kv_secret_v2.secrets.data["webhook_handler_git_user"]
|
|
|
|
|
git_token = data.vault_kv_secret_v2.secrets.data["webhook_handler_git_token"]
|
|
|
|
|
technitium_username = data.vault_kv_secret_v2.secrets.data["technitium_username"]
|
|
|
|
|
technitium_password = data.vault_kv_secret_v2.secrets.data["technitium_password"]
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# =============================================================================
|
|
|
|
|
# Outputs (consumed by service stacks via Terragrunt dependency)
|
|
|
|
|
# =============================================================================
|
|
|
|
|
|
|
|
|
|
output "tls_secret_name" {
|
|
|
|
|
value = var.tls_secret_name
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "redis_host" {
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
value = var.redis_host
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "postgresql_host" {
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
value = var.postgresql_host
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "postgresql_port" {
|
|
|
|
|
value = 5432
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "mysql_host" {
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
value = var.mysql_host
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "mysql_port" {
|
|
|
|
|
value = 3306
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "smtp_host" {
|
[ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability
Phase 1 - Critical Security:
- Netbox: move hardcoded DB/superuser passwords to variables
- MeshCentral: disable public registration, add Authentik auth
- Traefik: disable insecure API dashboard (api.insecure=false)
- Traefik: configure forwarded headers with Cloudflare trusted IPs
Phase 2 - Security Hardening:
- Add security headers middleware (HSTS, X-Frame-Options, nosniff, etc.)
- Add Kyverno pod security policies in audit mode (privileged, host
namespaces, SYS_ADMIN, trusted registries)
- Tighten rate limiting (avg=10, burst=50)
- Add Authentik protection to grampsweb
Phase 3 - Monitoring & Alerting:
- Add critical service alerts (PostgreSQL, MySQL, Redis, Headscale,
Authentik, Loki)
- Increase Loki retention from 7 to 30 days (720h)
- Add predictive PV filling alert (predict_linear)
- Re-enable Hackmd and Privatebin down alerts
Phase 4 - Reliability:
- Add resource requests/limits to Redis, DBaaS, Technitium, Headscale,
Vaultwarden, Uptime Kuma
- Increase Alloy DaemonSet memory to 512Mi/1Gi
Phase 6 - Maintainability:
- Extract duplicated tiers locals to terragrunt.hcl generate block
(removed from 67 stacks)
- Replace hardcoded NFS IP 10.0.10.15 with var.nfs_server (114
instances across 63 files)
- Replace hardcoded Redis/PostgreSQL/MySQL/Ollama/mail host references
with variables across ~35 stacks
- Migrate xray raw ingress resources to ingress_factory modules
2026-02-23 22:05:28 +00:00
|
|
|
value = var.mail_host
|
[ci skip] Add platform stack (core services) for Terragrunt migration
stacks/platform/ contains 22 core/cluster services: metallb, dbaas, redis,
traefik, technitium, headscale, authentik, rbac, k8s-portal, crowdsec,
monitoring, vaultwarden, reverse-proxy, metrics-server, nvidia, kyverno,
uptime-kuma, wireguard, xray, mailserver, cloudflared, infra-maintenance.
Outputs: tls_secret_name, redis_host, postgresql_host/port, mysql_host/port,
smtp_host/port — consumed by downstream service stacks via dependency blocks.
2026-02-22 13:21:09 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
output "smtp_port" {
|
|
|
|
|
value = 587
|
|
|
|
|
}
|