infra/stacks/platform
OpenClaw 28cc7aea1f fix(monitoring): Expand Loki PVC from 15GB to 50GB to resolve storage exhaustion
ISSUE RESOLVED:
- Root cause: Loki's 15Gi iSCSI PVC was completely full
- Symptom: 'no space left on device' errors during TSDB operations
- Impact: Loki service completely down, logging unavailable
- Side effects: Contributed to node2 containerd corruption incident

SOLUTION APPLIED:
- Expanded PVC storage: 15Gi → 50Gi via direct kubectl patch
- Triggered pod restart to complete filesystem resize
- Verified successful expansion and service recovery
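
The steps above can be sketched as a short script. This is a minimal sketch and not the exact commands that were run: the namespace `monitoring`, the PVC name `storage-loki-0`, and the StatefulSet name `loki` are assumptions, since the commit does not state them. It also assumes the storage class permits volume expansion (`allowVolumeExpansion: true`). The script only prints the commands so they can be reviewed before anything touches the cluster.

```shell
#!/bin/sh
# Hypothetical names: the commit does not state the namespace or PVC name.
NAMESPACE="${NAMESPACE:-monitoring}"
PVC_NAME="${PVC_NAME:-storage-loki-0}"
NEW_SIZE="${NEW_SIZE:-50Gi}"

# Build the merge patch that raises spec.resources.requests.storage.
build_pvc_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Print the commands instead of running them, so they can be reviewed first.
print_expand_commands() {
  patch="$(build_pvc_patch "$NEW_SIZE")"
  # 1. Request the larger size on the PVC.
  printf "kubectl -n %s patch pvc %s --type merge -p '%s'\n" "$NAMESPACE" "$PVC_NAME" "$patch"
  # 2. Restart the pod so the filesystem resize can complete (StatefulSet name assumed).
  printf 'kubectl -n %s rollout restart statefulset/loki\n' "$NAMESPACE"
  # 3. Confirm the new capacity is reported.
  printf 'kubectl -n %s get pvc %s\n' "$NAMESPACE" "$PVC_NAME"
}

print_expand_commands
```

Printing rather than executing is a deliberate choice for an emergency change: the output can be pasted into the incident log and run line by line.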

CURRENT STATUS:
- PVC: 50Gi capacity (iscsi-truenas storage class)
- Loki StatefulSet: 1/1 ready
- Loki Pod: 2/2 containers running
- Service: Successfully processing log streams
- No storage errors in recent logs

TERRAFORM ALIGNED:
- Updated loki.yaml persistence.size to 50Gi to match the actual PVC
- Infrastructure code now reflects deployed state
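
A small drift check can confirm that the declared size and the live PVC agree after a manual fix like this one. This is a hedged sketch: the `kubectl` query in the comment uses hypothetical namespace and PVC names, and `LIVE_SIZE` defaults to a placeholder so the script runs without cluster access.

```shell
#!/bin/sh
# Return success when the size declared in code equals the live PVC capacity.
sizes_match() {
  [ "$1" = "$2" ]
}

# Value of persistence.size in loki.yaml, per the commit message.
DECLARED_SIZE="50Gi"

# Live capacity would come from the cluster (names below are hypothetical):
#   kubectl -n monitoring get pvc storage-loki-0 -o jsonpath='{.status.capacity.storage}'
LIVE_SIZE="${LIVE_SIZE:-50Gi}"

if sizes_match "$DECLARED_SIZE" "$LIVE_SIZE"; then
  echo "in sync: $LIVE_SIZE"
else
  echo "drift: declared $DECLARED_SIZE, live $LIVE_SIZE"
fi
```

The comparison is a plain string match, so `50Gi` and `50G` would count as drift; that is intentional here, because the two units denote different sizes.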

[ci skip] - Emergency fix applied locally first due to service outage
2026-03-17 16:51:02 +00:00
modules fix(monitoring): Expand Loki PVC from 15GB to 50GB to resolve storage exhaustion 2026-03-17 16:51:02 +00:00
.gitkeep [ci skip] Add Terragrunt directory skeleton and root config 2026-02-22 13:01:37 +00:00
.terraform.lock.hcl Woodpecker CI deploy commit [CI SKIP] 2026-03-07 20:47:22 +00:00
backend.tf Woodpecker CI deploy commit [CI SKIP] 2026-03-07 20:47:22 +00:00
main.tf deploy Sealed Secrets controller for encrypted secret management 2026-03-08 19:49:48 +00:00
providers.tf [ci skip] fix false-positive sensitive=true on kube_config_path 2026-03-07 15:48:19 +00:00
redis-25.3.2.tgz [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache 2026-03-06 23:55:57 +00:00
secrets [ci skip] Migrate 22 platform service states to stacks/platform 2026-02-22 13:35:10 +00:00
terragrunt.hcl [ci skip] Add platform stack (core services) for Terragrunt migration 2026-02-22 13:21:09 +00:00
tiers.tf [ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk 2026-02-28 19:08:06 +00:00