No description
Find a file
Viktor Barzin 23019da8e5 equalize memory req=lim across 70+ containers using Prometheus 7d max data
After node2 OOM incident, right-size memory across the cluster by setting
requests=limits based on max_over_time(container_memory_working_set_bytes[7d])
with 1.3x headroom. Eliminates ~37Gi overcommit gap.

Categories:
- Safe equalization (50 containers): set req=lim where max7d well within target
- Limit increases (8 containers): raise limits for services spiking above current
- No Prometheus data (12 containers): conservatively set lim=req
- Exception: nextcloud keeps req=256Mi/lim=8Gi due to Apache memory spikes

Also increased dbaas namespace quota from 12Gi to 16Gi to accommodate mysql
4Gi limits across 3 replicas.
2026-03-14 21:46:49 +00:00
.claude remember: CrowdSec Helm upgrade timeout [ci skip] 2026-03-14 12:04:07 +00:00
.git-crypt Add 1 git-crypt collaborator [ci skip] 2025-10-24 18:00:00 +00:00
.planning [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache 2026-03-06 23:55:57 +00:00
.woodpecker [ci skip] update AGENTS.md + CLAUDE.md with SOPS workflow, add k8s-portal CI pipeline 2026-03-07 15:37:19 +00:00
cli update @ record as well 2024-12-02 21:51:05 +00:00
diagram [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
docs/plans [ci skip] k8s portal: fix setup script + add onboarding hub (5 new pages) 2026-03-07 15:06:26 +00:00
modules [ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup 2026-03-06 19:54:21 +00:00
playbooks [ci skip] Reduce node config drift: GPU label, OIDC idempotency, node-exporter, rebuild docs 2026-02-22 22:59:38 +00:00
scripts [ci skip] add Forgejo task pipeline for OpenClaw AI agent 2026-03-07 21:11:07 +00:00
secrets [ci skip] remove atuin: destroy stack, DNS, NFS export, PostgreSQL credentials 2026-03-06 20:11:14 +00:00
stacks equalize memory req=lim across 70+ containers using Prometheus 7d max data 2026-03-14 21:46:49 +00:00
.gitattributes add git-crypt terraform 2021-02-14 18:17:40 +00:00
.gitignore Archive terraform.tfvars — secrets now in SOPS 2026-03-11 21:16:11 +00:00
.sops.yaml [ci skip] phase 1: SOPS tooling setup (.sops.yaml, scripts/tg, .gitignore) 2026-03-07 13:57:42 +00:00
AGENTS.md [ci skip] add sealed secrets convention: fileset + kubernetes_manifest pattern 2026-03-08 20:03:50 +00:00
config.tfvars add novelapp deployment [ci skip] 2026-03-14 18:51:14 +00:00
LICENSE.txt Drone CI Update TLS Certificates Commit 2025-10-12 00:13:18 +00:00
MEMORY.md Update MEMORY.md timestamp 2026-03-07 16:43:15 +00:00
README.md [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
secrets.sops.json Add Vault OIDC authentication via Authentik 2026-03-14 13:53:05 +00:00
setup-monitoring.sh fix(monitoring): Add setup script for automated health check environment 2026-03-13 13:57:11 +00:00
terragrunt.hcl migrate all secrets from SOPS to Vault KV 2026-03-14 17:15:48 +00:00
tiers.tf [ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk 2026-02-28 19:08:06 +00:00

This repo contains my infra-as-code sources.

My infrastructure is built using Terraform, Kubernetes and CI/CD is done using Woodpecker CI.

Read more by visiting my website: https://viktorbarzin.me

git-crypt setup

To decrypt the secrets, you need to setup git-crypt.

  1. Install git-crypt.
  2. Setup gpg keys on the machine
  3. git-crypt unlock

This will unlock the secrets and will lock them on commit