mitigate cluster instability during terraform applies

- Recreate strategy for heavy single-replica deployments (onlyoffice, stirling-pdf)
- Reduce maxSurge on multi-replica deployments (traefik, authentik, grafana, kyverno)
  to prevent a surge in memory requests from overwhelming the scheduler
- Weekly etcd defrag CronJob (Sunday 3 AM) to prevent fragmentation buildup
- Disable Kyverno policy reports (ephemeral report cleanup)
- Cloud-init: journald persistence + 4Gi swap for worker nodes
- Kubelet: LimitedSwap behavior for memory pressure relief
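
Note on the cloud-init swap step: the runcmd entries append to /etc/fstab and /etc/sysctl.d/99-swap.conf with a plain `>>`, which is fine for a first boot but would duplicate the lines if the commands were ever re-run. A minimal sketch of an idempotent alternative, assuming a hypothetical helper named `append_once` (not part of this commit):

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not in the commit): append a line
# to a file only if that exact line is not already present.
append_once() {
  line=$1
  file=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  # A missing file makes grep fail, so the line is then written and the file created.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# How it might be called for the swap setup (paths and values taken from the diff):
#   append_once '/swapfile none swap sw 0 0' /etc/fstab
#   append_once 'vm.swappiness=10' /etc/sysctl.d/99-swap.conf
```

Calling the helper twice with the same arguments leaves a single copy of the line, so a repeated run should not produce duplicate fstab entries.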
Viktor Barzin 2026-03-15 17:23:39 +00:00
parent 1fe7798609
commit c034adab5f
9 changed files with 100 additions and 2 deletions

@@ -56,6 +56,10 @@ apt:
filename: docker.list
runcmd:
# Enable persistent journald logging for crash forensics
- mkdir -p /var/log/journal
- sed -i 's/#Storage=auto/Storage=persistent/' /etc/systemd/journald.conf
- systemctl restart systemd-journald
%{if is_k8s_template}
- apt-mark hold kubelet kubeadm kubectl
- systemctl stop kubelet
@@ -63,6 +67,14 @@ runcmd:
- ${containerd_config_update_command}
- systemctl restart containerd
- systemctl enable --now iscsid
# Create 4Gi swap file for worker node memory pressure relief (NOT for master — etcd is latency-critical)
- fallocate -l 4G /swapfile
- chmod 600 /swapfile
- mkswap /swapfile
- swapon /swapfile
- echo '/swapfile none swap sw 0 0' >> /etc/fstab
- sysctl -w vm.swappiness=10
- echo 'vm.swappiness=10' >> /etc/sysctl.d/99-swap.conf
- ${k8s_join_command}
- systemctl enable kubelet
- systemctl start kubelet