infra

Viktor Barzin 0216e993dc	2026-06-12 19:41:22 +00:00
Some checks failed: ci/woodpecker/push/build-cli (pipeline was successful), ci/woodpecker/push/default (pipeline failed)

etcd-load-reduction: remove VPA/Goldilocks, disable kyverno reporting, descheduler hourly

The control-plane flap (etcd lease-renewal timeouts) recurred. Rather than move etcd to SSD (code-oflt, deferred again), the chosen direction is to REDUCE etcd load enough that the leader-election-timeout band-aid (renew 10s->30s) becomes removable. These are the big, clean cuts:

1. Remove VPA/Goldilocks (stacks/vpa emptied). All 349 VPAs ran updateMode=Off (no auto-right-sizing) yet cost ~800 etcd objects + continuous recommender writes + a pod-creation admission webhook, purely to feed a dashboard. krr (Dockerized, on-demand) replaces it. Reverses the re-add after memory 2431.
2. Disable kyverno reporting (admission/aggregate/background). policyReports were already off, so the pipeline generated ephemeralreports + an hourly all-resource etcd re-scan for NO user-facing output. Admission enforcement (deny-* policies) and Keel mutation are unaffected; violations surface via Loki->Slack.
3. descheduler */5 -> hourly (fewer list/evict cycles; rebalancing isn't urgent).

Deferred (poor ROI / unsafe as planned): ESO refreshInterval 15m->1h is a ~20-stack sprawl for ~0.1 writes/s; keel background=false is invalid for a mutate-existing policy and its churn is apply-time not steady-state. Both filed as follow-up beads.

Post-apply: delete the chart-orphaned VPA CRDs to cascade-clean leftover CRs. Then measure etcd apply-latency and revert the timeouts.

Docs updated (VPA/Goldilocks -> krr). See memory 5402-5407.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
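
To put the descheduler change in item 3 in rough numbers, the sketch below counts how often a schedule fires per day. It assumes that `*/5` is the minute field of a cron expression and that "hourly" corresponds to a fixed minute such as `0`; the commit does not show the actual schedule strings, and `runs_per_day` is a hypothetical helper, not part of this repository.

```python
def runs_per_day(minute_field: str) -> int:
    """Count daily firings for a cron minute field, with all other fields '*'.

    Hypothetical helper for illustration only. Supports '*', '*/N',
    or a single fixed minute such as '0'.
    """
    if minute_field == "*":
        per_hour = 60
    elif minute_field.startswith("*/"):
        step = int(minute_field[2:])
        # Cron step values fire at 0, step, 2*step, ... within the hour.
        per_hour = len(range(0, 60, step))
    else:
        int(minute_field)  # raises ValueError if not a plain minute number
        per_hour = 1
    return per_hour * 24


before = runs_per_day("*/5")  # assumed old schedule: every 5 minutes
after = runs_per_day("0")     # assumed new schedule: once per hour
print(before, after, before // after)  # prints "288 24 12"
```

Under those assumptions the descheduler goes from 288 list/evict cycles per day to 24, a 12x reduction in how often it runs. The actual etcd load saved per cycle is not stated in the commit and is not estimated here.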
agents	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
commands	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
reference	Merge forgejo/master (tts stack) into wizard/android-emulator	2026-06-11 19:53:07 +00:00
scripts	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
skills	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
calendar-query.py	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
CLAUDE.md	etcd-load-reduction: remove VPA/Goldilocks, disable kyverno reporting, descheduler hourly	2026-06-12 19:41:22 +00:00
home-assistant-sofia.py	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
home-assistant.py	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
internet-mode-used_DO_NOT_REMOVE_MANUALLY_SECURITY_RISK	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
pfsense.py	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00
settings.json	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip]	2026-06-09 08:45:33 +00:00