No description
Find a file
Viktor Barzin 9e4fb23b10 [ci skip] right-size all pod resources based on VPA + live metrics audit
Full cluster resource audit: cross-referenced Goldilocks VPA recommendations,
live kubectl top metrics, and Terraform definitions for 100+ containers.

Critical fixes:
- dashy: CPU throttled at 98% (490m/500m) → 2 CPU limit
- stirling-pdf: CPU throttled at 99.7% (299m/300m) → 2 CPU limit
- traefik auth-proxy/bot-block-proxy: mem limit 32Mi → 128Mi

Added explicit resources to ~40 containers that had none:
- audiobookshelf, changedetection, cyberchef, dawarich, diun, echo,
  excalidraw, freshrss, hackmd, isponsorblocktv, linkwarden, n8n,
  navidrome, ntfy, owntracks, privatebin, send, shadowsocks, tandoor,
  tor-proxy, wealthfolio, networking-toolbox, rybbit, mailserver,
  cloudflared, pgadmin, phpmyadmin, crowdsec-web, xray, wireguard,
  k8s-portal, tuya-bridge, ollama-ui, whisper, piper, immich-server,
  immich-postgresql, osrm-foot

GPU containers: added CPU/mem alongside GPU limits:
- ollama: removed CPU/mem limits (models vary in size), keep GPU only
- frigate: req 500m/2Gi, lim 4/8Gi + GPU
- immich-ml: req 100m/1Gi, lim 2/4Gi + GPU

Right-sized ~25 over-provisioned containers:
- kms-web-page: 500m/512Mi → 50m/64Mi (was using 0m/10Mi)
- onlyoffice: CPU 8 → 2 (VPA upper 45m)
- realestate-crawler-api: CPU 2000m → 250m
- blog/travel-blog/webhook-handler: 500m → 100m
- coturn/health/plotting-book: reduced to match actual usage

Conservative methodology: limits = max(VPA upper * 2, live usage * 2)
2026-03-01 19:18:50 +00:00
.claude [ci skip] switch VPA to off mode globally, fix Ollama/MySQL resources 2026-03-01 19:03:49 +00:00
.git-crypt Add 1 git-crypt collaborator [ci skip] 2025-10-24 18:00:00 +00:00
.planning [ci skip] f1-stream: update project state - all 8 phases complete 2026-02-24 00:28:54 +00:00
.woodpecker remove f1-stream CI pipeline (moved to separate repo) 2026-03-01 14:43:06 +00:00
cli update @ record as well 2024-12-02 21:51:05 +00:00
diagram [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
docs/plans [ci skip] add Traefik resilience hardening implementation plan 2026-03-01 13:53:50 +00:00
modules [ci skip] add retry middleware (2 attempts, 100ms) to default ingress chain 2026-03-01 14:35:53 +00:00
playbooks [ci skip] Reduce node config drift: GPU label, OIDC idempotency, node-exporter, rebuild docs 2026-02-22 22:59:38 +00:00
scripts [ci skip] expand k8s worker nodes to 256G, update inventory and extend script 2026-02-28 16:00:16 +00:00
secrets [ci skip] onlyoffice: cache fonts/themes on NFS for fast restarts 2026-03-01 18:02:38 +00:00
stacks [ci skip] right-size all pod resources based on VPA + live metrics audit 2026-03-01 19:18:50 +00:00
.gitattributes add git-crypt terraform 2021-02-14 18:17:40 +00:00
.gitignore [ci skip] Update .gitignore: exclude terragrunt-generated files 2026-02-22 21:30:45 +00:00
LICENSE.txt Drone CI Update TLS Certificates Commit 2025-10-12 00:13:18 +00:00
README.md [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
terragrunt.hcl [ci skip] Infrastructure hardening: security, monitoring, reliability, maintainability 2026-02-23 22:05:28 +00:00
tiers.tf [ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk 2026-02-28 19:08:06 +00:00

This repo contains my infra-as-code sources.

My infrastructure is built using Terraform, Kubernetes and CI/CD is done using Woodpecker CI.

Read more by visiting my website: https://viktorbarzin.me

git-crypt setup

To decrypt the secrets, you need to setup git-crypt.

  1. Install git-crypt.
  2. Setup gpg keys on the machine
  3. git-crypt unlock

This will unlock the secrets and will lock them on commit