infra/stacks
Viktor Barzin 304f0de43a add Metric Staleness alerts for UPS, iDRAC, ATS, and HA metrics
Replace fragile NoiDRACData alert with proper absent() checks. Add
UPSMetricsMissing (critical), iDRACRedfishMetricsMissing,
iDRACSNMPMetricsMissing, ATSMetricsMissing, and
HomeAssistantMetricsMissing alerts. Update PowerOutage and NodeDown
inhibit rules to suppress staleness alerts during outages.
2026-03-23 22:24:17 +02:00
_template multi-user access: fix template memory default, add storage quota, add CONTRIBUTING.md [ci skip] 2026-03-19 23:49:15 +00:00
actualbudget ingress latency: add histogram buckets, fix restarts, right-size memory 2026-03-23 10:52:43 +02:00
affine fix DB password rotation desync in 5 stacks 2026-03-17 07:39:29 +00:00
audiobookshelf state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
authentik scale down non-critical services to free cluster memory 2026-03-22 03:10:12 +02:00
blog sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
calibre ingress latency: add histogram buckets, fix restarts, right-size memory 2026-03-23 10:52:43 +02:00
changedetection fix OOMKilled containers: bump immich/actualbudget memory, disable changedetection, cap clickhouse 2026-03-22 15:22:29 +02:00
city-guesser sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
claude-memory switch claude-memory server to multi-user API_KEYS auth 2026-03-22 20:08:07 +02:00
cloudflared add IPv6 connectivity via Hurricane Electric 6in4 tunnel 2026-03-23 02:22:00 +02:00
cnpg extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
coturn state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
crowdsec fix CrowdSec collection names and increase Helm timeout 2026-03-23 03:41:13 +02:00
cyberchef sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
dashy sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
dawarich bump memory limits for OOM-prone services 2026-03-21 11:12:12 +00:00
dbaas fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
descheduler sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
diun regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
ebook2audiobook sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
echo state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
excalidraw sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
external-secrets regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
f1-stream scale up f1-stream and changedetection [ci skip] 2026-03-16 07:06:09 +00:00
forgejo ingress latency: add histogram buckets, fix restarts, right-size memory 2026-03-23 10:52:43 +02:00
freedify state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
freshrss bump memory limits for OOM-prone services 2026-03-21 11:12:12 +00:00
frigate move Frigate cache to tmpfs to eliminate disk writes on node1 2026-03-23 11:52:49 +02:00
grampsweb migrate 16 plan-time stacks: vault data source → ESO + kubernetes_secret 2026-03-15 22:06:39 +00:00
hackmd fix DB password desync + migrate remaining tfvars to Vault 2026-03-15 21:39:45 +00:00
headscale extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
health right-size 14 services and scale down GPU-heavy workloads [ci skip] 2026-03-15 23:00:49 +00:00
homepage sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
immich fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
infra update infra stack terraform lock file (helm/kubernetes/vault providers) 2026-03-23 02:24:47 +02:00
infra-maintenance fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
iscsi-csi extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
isponsorblocktv sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
jsoncrack sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
k8s-dashboard fix nextcloud db-username + k8s-dashboard chart repo 2026-03-22 02:50:48 +02:00
k8s-portal feat(k8s-portal): update onboarding + architecture with SOPS state docs 2026-03-17 23:17:47 +00:00
kms sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
kyverno add Kyverno TLS secret sync + enhance renewal pipeline 2026-03-23 22:19:34 +02:00
linkwarden right-size 14 services and scale down GPU-heavy workloads [ci skip] 2026-03-15 23:00:49 +00:00
mailserver add IPv6 connectivity via Hurricane Electric 6in4 tunnel 2026-03-23 02:22:00 +02:00
matrix regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
meshcentral sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
metallb extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
metrics-server extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
monitoring add Metric Staleness alerts for UPS, iDRAC, ATS, and HA metrics 2026-03-23 22:24:17 +02:00
n8n bump memory limits for OOM-prone services 2026-03-21 11:12:12 +00:00
navidrome state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
netbox regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
networking-toolbox sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
nextcloud fix nextcloud db-username + k8s-dashboard chart repo 2026-03-22 02:50:48 +02:00
nfs-csi extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
novelapp ingress latency: add histogram buckets, fix restarts, right-size memory 2026-03-23 10:52:43 +02:00
ntfy state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
nvidia right-size memory requests to unblock GPU workloads and fix dbaas quota [ci skip] 2026-03-17 22:35:54 +00:00
ollama fix ollama: remove conditional count on basicAuth (incompatible with ESO data source) 2026-03-15 22:24:36 +00:00
onlyoffice right-size memory requests to unblock GPU workloads and fix dbaas quota [ci skip] 2026-03-17 22:35:54 +00:00
openclaw sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
osm_routing sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
owntracks state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
paperless-ngx regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
platform add agent route to k8s-portal 2026-03-23 02:25:08 +02:00
plotting-book add weekly SQLite backup for plotting-book to NFS 2026-03-23 02:24:43 +02:00
poison-fountain sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
priority-pass use registry.viktorbarzin.me hostname for private images + protect ingress 2026-03-23 01:02:27 +02:00
privatebin state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
rbac extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
real-estate-crawler scale down non-critical services to free cluster memory 2026-03-22 03:10:12 +02:00
redis fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
reloader sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
resume regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
reverse-proxy add htpasswd auth to private docker registry + expose at registry.viktorbarzin.me 2026-03-22 22:10:10 +02:00
rybbit fix OOMKilled containers: bump immich/actualbudget memory, disable changedetection, cap clickhouse 2026-03-22 15:22:29 +02:00
sealed-secrets extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
send sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
servarr state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
shadowsocks regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
speedtest fix DB password desync + migrate remaining tfvars to Vault 2026-03-15 21:39:45 +00:00
stirling-pdf right-size 14 services and scale down GPU-heavy workloads [ci skip] 2026-03-15 23:00:49 +00:00
tandoor fix DB password desync + migrate remaining tfvars to Vault 2026-03-15 21:39:45 +00:00
technitium extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
terminal sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
tor-proxy sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
trading-bot fix DB password desync + migrate remaining tfvars to Vault 2026-03-15 21:39:45 +00:00
traefik ingress latency: add histogram buckets, fix restarts, right-size memory 2026-03-23 10:52:43 +02:00
travel_blog sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
tuya-bridge scale down non-critical services to free cluster memory 2026-03-22 03:10:12 +02:00
uptime-kuma extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
url state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00
vault fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
vaultwarden fix backup IO stats: use /proc/$$/io instead of /proc/self/io 2026-03-23 12:33:52 +02:00
vpa extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
wealthfolio regenerate providers.tf: remove vault_root_token variable [ci skip] 2026-03-15 21:21:01 +00:00
webhook_handler fix(provision): security hardening from code review 2026-03-18 21:25:03 +00:00
whisper sync regenerated providers.tf + upstream changes 2026-03-22 02:56:04 +02:00
wireguard extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
woodpecker fix DB password rotation desync in 5 stacks 2026-03-17 07:39:29 +00:00
xray extract remaining 19 modules from platform, complete stack split [ci skip] 2026-03-17 21:42:16 +00:00
ytdlp state(dbaas): update encrypted state 2026-03-19 20:23:59 +00:00