Stage 1 of moving private images off the registry:2 container at
registry.viktorbarzin.me:5050 (which has hit distribution#3324
corruption 3x in 3 weeks) onto Forgejo's built-in OCI registry. No
cutover risk: pods still pull from the existing registry until Phase 3.

What changes:

* Forgejo deployment: memory 384Mi→1Gi, PVC 5Gi→15Gi (cap 50Gi).
  Explicit FORGEJO__packages__ENABLED + CHUNKED_UPLOAD_PATH (defensive,
  v11 default-on).
* ingress_factory: max_body_size variable was declared but never wired
  in after the nginx→Traefik migration. Now creates a per-ingress
  Buffering middleware when set; default null = no limit (preserves
  existing behavior). Forgejo ingress sets max_body_size=5g to allow
  multi-GB layer pushes.
* Cluster-wide registry-credentials Secret: 4th auths entry for
  forgejo.viktorbarzin.me, populated from Vault
  secret/viktor/forgejo_pull_token (cluster-puller PAT, read:package).
  Existing Kyverno ClusterPolicy syncs cluster-wide, so no policy edits.
* Containerd hosts.toml redirect: forgejo.viktorbarzin.me → in-cluster
  Traefik LB 10.0.20.200 (avoids hairpin NAT for in-cluster pulls).
  Cloud-init for new VMs + scripts/setup-forgejo-containerd-mirror.sh
  for existing nodes.
* Forgejo retention CronJob (0 4 * * *): keeps newest 10 versions per
  package + always :latest. First 7 days dry-run (DRY_RUN=true); flip
  the local in cleanup.tf after log review.
* Forgejo integrity probe CronJob (*/15): same algorithm as the
  existing registry-integrity-probe. Existing Prometheus alerts
  (RegistryManifestIntegrityFailure et al) made instance-aware so they
  cover both registries during the bake.
* Docs: design+plan in docs/plans/, setup runbook in docs/runbooks/.

Operational note: the apply order is non-trivial because the new Vault
keys (forgejo_pull_token, forgejo_cleanup_token,
secret/ci/global/forgejo_*) must exist BEFORE terragrunt apply in the
kyverno + monitoring + forgejo stacks. The setup runbook documents the
bootstrap sequence.

Phase 1 (per-project dual-push pipelines) follows in subsequent
commits. Bake clock starts when the last project goes dual-push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
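
The ingress_factory change described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the module's actual code: the `name` and `namespace` variables, the local names, the size-suffix parsing, the `kubernetes_manifest` resource, and the `traefik.io/v1alpha1` API version are all assumptions for illustration. Only the `max_body_size` variable, its `null` default, and the per-ingress Buffering middleware come from the commit message.

```hcl
# Hypothetical excerpt of an ingress_factory-style module; the real module is not shown here.

variable "name" { type = string }      # assumed: name of the ingress
variable "namespace" { type = string } # assumed: namespace of the ingress

variable "max_body_size" {
  description = "Request body limit such as \"5g\"; null means no limit."
  type        = string
  default     = null
}

locals {
  # Assumed size syntax: an integer followed by k, m or g (binary multiples).
  unit_bytes = { k = 1024, m = 1048576, g = 1073741824 }

  # coalesce keeps the expression valid when no limit is set;
  # the result is unused in that case because the resource count is 0.
  size_parts     = regex("^(\\d+)([kmg])$", lower(coalesce(var.max_body_size, "0k")))
  max_body_bytes = tonumber(local.size_parts[0]) * local.unit_bytes[local.size_parts[1]]
}

# Created only when a limit is requested, so existing ingresses are unaffected.
resource "kubernetes_manifest" "buffering" {
  count = var.max_body_size == null ? 0 : 1

  manifest = {
    apiVersion = "traefik.io/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "${var.name}-buffering"
      namespace = var.namespace
    }
    spec = {
      buffering = {
        maxRequestBodyBytes = local.max_body_bytes
      }
    }
  }
}
```

With this shape, `max_body_size = "5g"` would yield a middleware capped at 5368709120 bytes. The ingress would still need to reference the middleware, for example through the `traefik.ingress.kubernetes.io/router.middlewares` annotation with a value of the form `<namespace>-<name>@kubernetescrd`.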
38 lines
1.9 KiB
HCL
# =============================================================================
# Monitoring Stack - Prometheus / Grafana / Loki
# =============================================================================

variable "tls_secret_name" { type = string }
variable "nfs_server" { type = string }
variable "mysql_host" { type = string }
variable "monitoring_idrac_username" { type = string }

data "vault_kv_secret_v2" "secrets" {
  mount = "secret"
  name  = "platform"
}

data "vault_kv_secret_v2" "viktor" {
  mount = "secret"
  name  = "viktor"
}

module "monitoring" {
  source                        = "./modules/monitoring"
  tls_secret_name               = var.tls_secret_name
  nfs_server                    = var.nfs_server
  mysql_host                    = var.mysql_host
  alertmanager_account_password = data.vault_kv_secret_v2.secrets.data["alertmanager_account_password"]
  idrac_username                = var.monitoring_idrac_username
  idrac_password                = data.vault_kv_secret_v2.secrets.data["monitoring_idrac_password"]
  alertmanager_slack_api_url    = data.vault_kv_secret_v2.secrets.data["alertmanager_slack_api_url"]
  tiny_tuya_service_secret      = data.vault_kv_secret_v2.secrets.data["tiny_tuya_service_secret"]
  haos_api_token                = data.vault_kv_secret_v2.secrets.data["haos_api_token"]
  pve_password                  = data.vault_kv_secret_v2.secrets.data["pve_password"]
  grafana_admin_password        = data.vault_kv_secret_v2.secrets.data["grafana_admin_password"]
  kube_config_path              = var.kube_config_path
  registry_user                 = data.vault_kv_secret_v2.viktor.data["registry_user"]
  registry_password             = data.vault_kv_secret_v2.viktor.data["registry_password"]
  forgejo_pull_token            = data.vault_kv_secret_v2.viktor.data["forgejo_pull_token"]
  tier                          = local.tiers.cluster
}
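
The module call passes `forgejo_pull_token` alongside the existing registry credentials, so `modules/monitoring` needs a matching input. The module's own variable file is not shown here; the following is only a sketch of what such a declaration could look like. The description text and the decision to mark the value sensitive are assumptions.

```hcl
# Hypothetical declaration inside modules/monitoring (not part of this file).
variable "forgejo_pull_token" {
  description = "Forgejo PAT with read:package scope, read from Vault secret/viktor."
  type        = string
  sensitive   = true # keeps the token value out of plan and apply output
}
```

Because the value is read by indexing the Vault data map, a missing `forgejo_pull_token` key would surface as an error at plan time, which matches the commit message's note that the Vault keys must exist before `terragrunt apply`.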