LimitRange defaults had a 4-8x limit/request ratio causing the scheduler to overpack nodes. When pods burst, nodes OOM-thrashed and became unresponsive (k8s-node3 and k8s-node4 both went down today).

Changes:
- Increase default memory requests across all tiers (ratio now 2x):
  - core/cluster: 64Mi → 256Mi request (512Mi limit)
  - gpu: 256Mi → 1Gi request (2Gi limit)
  - edge/aux/fallback: 64Mi → 128Mi request (256Mi limit)
- Add kubelet memory reservation and eviction thresholds:
  - systemReserved: 512Mi, kubeReserved: 512Mi
  - evictionHard: 500Mi (was 100Mi), evictionSoft: 1Gi (was unset)
  - Applied to all nodes and future node template

| Name |
|---|
| .claude |
| .git-crypt |
| .planning |
| .woodpecker |
| cli |
| diagram |
| docs/plans |
| modules |
| playbooks |
| scripts |
| secrets |
| stacks |
| .gitattributes |
| .gitignore |
| .sops.yaml |
| AGENTS.md |
| config.tfvars |
| LICENSE.txt |
| MEMORY.md |
| README.md |
| secrets.sops.json |
| terragrunt.hcl |
| tiers.tf |

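
The 2x limit/request ratio described in the commit message can be illustrated with a small shell helper that prints a LimitRange manifest. This is only a sketch: the helper `render_limitrange`, the object name `tier-defaults`, the namespace argument, and the choice of container-level defaults are all assumptions made for illustration, and the repository may define these defaults differently (for example in Terraform).

```shell
#!/bin/sh
# Print a LimitRange manifest with the given default memory request and limit.
# Hypothetical helper: the name "tier-defaults" and the namespace are
# illustrative and are not taken from this repository.
render_limitrange() {
  namespace="$1"
  request="$2"
  limit="$3"
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: tier-defaults
  namespace: ${namespace}
spec:
  limits:
    - type: Container
      defaultRequest:
        memory: ${request}
      default:
        memory: ${limit}
EOF
}

# core/cluster tier values from the commit message: 256Mi request, 512Mi limit.
render_limitrange core 256Mi 512Mi
```

If the output looks right, it could be piped into `kubectl apply -f -`. Because the scheduler places pods by their requests and not their limits, keeping the limit at twice the request bounds how far a node can be overcommitted when pods burst.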
This repo contains my infra-as-code sources.
My infrastructure is built with Terraform and Kubernetes, and CI/CD runs on Woodpecker CI.
Read more by visiting my website: https://viktorbarzin.me
git-crypt setup
To decrypt the secrets, you need to set up git-crypt:
- Install git-crypt.
- Set up GPG keys on the machine.
- Run `git-crypt unlock` from the repository root.

This decrypts the secrets in your working tree. They are encrypted again automatically when you commit.
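
The setup steps above can be sketched as a small script that checks the prerequisites before unlocking. The function names `have_tool` and `unlock_secrets` are hypothetical. The sketch also assumes that your GPG key has already been granted access to the repository, since `git-crypt unlock` without a key file can only decrypt with a GPG key that was added as a collaborator.

```shell
#!/bin/sh
# Return 0 if the named command is available on PATH, non-zero otherwise.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Verify that the required tools are installed, then unlock the secrets.
# Hypothetical wrapper; run it from the repository root.
unlock_secrets() {
  for tool in git gpg git-crypt; do
    if ! have_tool "$tool"; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done
  git-crypt unlock
}
```

Calling `unlock_secrets` from the repository root either reports the first missing tool or runs the unlock. To encrypt the working tree again without committing, `git-crypt lock` reverses the operation.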