No description
Find a file
Viktor Barzin 4ea3ffe9d3 Reduce downtime during platform stack applies
CrowdSec Helm fix:
- Increase ResourceQuota requests.cpu from 1 to 4 — pods were at 302%
  of quota, preventing scheduling during rolling upgrades
- Reduce Helm timeout from 3600s to 600s — 1 hour hang is excessive
- Add wait=true and wait_for_jobs=true for proper readiness checking

Prometheus startup guard:
- Add startup guard to 8 rate/increase-based alerts that false-fire
  after Prometheus restarts (needs 2 scrapes for rate() to work):
  PodCrashLooping, ContainerOOMKilled, CoreDNSErrors,
  HighServiceErrorRate, HighService4xxRate, HighServiceLatency,
  SSDHighWriteRate, HDDHighWriteRate
- Guard: and on() (time() - process_start_time_seconds) > 900
  suppresses alerts for 15m after Prometheus startup
2026-03-14 12:09:09 +00:00
.claude remember: CrowdSec Helm upgrade timeout [ci skip] 2026-03-14 12:04:07 +00:00
.git-crypt Add 1 git-crypt collaborator [ci skip] 2025-10-24 18:00:00 +00:00
.planning [ci skip] add auto-generated tiers.tf, planning docs, and helm chart cache 2026-03-06 23:55:57 +00:00
.woodpecker [ci skip] update AGENTS.md + CLAUDE.md with SOPS workflow, add k8s-portal CI pipeline 2026-03-07 15:37:19 +00:00
cli update @ record as well 2024-12-02 21:51:05 +00:00
diagram [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
docs/plans [ci skip] k8s portal: fix setup script + add onboarding hub (5 new pages) 2026-03-07 15:06:26 +00:00
modules [ci skip] iSCSI migration, healthcheck fixes, health probes, etcd backup 2026-03-06 19:54:21 +00:00
playbooks [ci skip] Reduce node config drift: GPU label, OIDC idempotency, node-exporter, rebuild docs 2026-02-22 22:59:38 +00:00
scripts [ci skip] add Forgejo task pipeline for OpenClaw AI agent 2026-03-07 21:11:07 +00:00
secrets [ci skip] remove atuin: destroy stack, DNS, NFS export, PostgreSQL credentials 2026-03-06 20:11:14 +00:00
stacks Reduce downtime during platform stack applies 2026-03-14 12:09:09 +00:00
.gitattributes add git-crypt terraform 2021-02-14 18:17:40 +00:00
.gitignore Archive terraform.tfvars — secrets now in SOPS 2026-03-11 21:16:11 +00:00
.sops.yaml [ci skip] phase 1: SOPS tooling setup (.sops.yaml, scripts/tg, .gitignore) 2026-03-07 13:57:42 +00:00
AGENTS.md [ci skip] add sealed secrets convention: fileset + kubernetes_manifest pattern 2026-03-08 20:03:50 +00:00
config.tfvars Add terminal stack - reverse proxy to ttyd behind authentik 2026-03-10 23:46:01 +00:00
LICENSE.txt Drone CI Update TLS Certificates Commit 2025-10-12 00:13:18 +00:00
MEMORY.md Update MEMORY.md timestamp 2026-03-07 16:43:15 +00:00
README.md [ci skip] Sunset Drone CI: remove all artifacts, DNS, configs, and references 2026-02-23 19:38:55 +00:00
secrets.sops.json [ci skip] fix Navidrome credentials: admin user is wizard not admin 2026-03-07 20:39:56 +00:00
setup-monitoring.sh fix(monitoring): Add setup script for automated health check environment 2026-03-13 13:57:11 +00:00
terragrunt.hcl [ci skip] phase 3: switch terragrunt to load config.tfvars + SOPS secrets 2026-03-07 14:16:28 +00:00
tiers.tf [ci skip] Phase 1: PostgreSQL migrated to CNPG on local disk 2026-02-28 19:08:06 +00:00

This repo contains my infra-as-code sources.

My infrastructure is built using Terraform, Kubernetes and CI/CD is done using Woodpecker CI.

Read more by visiting my website: https://viktorbarzin.me

git-crypt setup

To decrypt the secrets, you need to setup git-crypt.

  1. Install git-crypt.
  2. Setup gpg keys on the machine
  3. git-crypt unlock

This will unlock the secrets and will lock them on commit