Commit graph

  • 307b7f6819 update claude knowledge: infra operational learnings from commit history [ci skip] Viktor Barzin 2026-03-15 10:46:45 +00:00
  • 56ddee457a fix: openclaw policy violation + reduce memory requests for capacity Viktor Barzin 2026-03-15 10:37:58 +00:00
  • 683f255c2e fix: increase memory for OOMKilled services Viktor Barzin 2026-03-15 10:26:11 +00:00
  • b219961bd8 fix: MySQL memory overcommit + shlink OOMKill Viktor Barzin 2026-03-15 03:21:55 +00:00
  • 0a69af618d update claude knowledge: vault KV secrets migration [ci skip] Viktor Barzin 2026-03-15 02:48:21 +00:00
  • 4a27345057 enable memory-core plugin for OpenClaw [ci skip] Viktor Barzin 2026-03-15 02:39:14 +00:00
  • 516685c504 Woodpecker CI deploy commit [CI SKIP] root 2026-03-15 02:38:30 +00:00
  • 52bce849d7 fix cluster health: resolve 21/23 failures from healthcheck Viktor Barzin 2026-03-15 02:33:46 +00:00
  • 5f71a53b08 add memory-tool instructions to project CLAUDE.md [ci skip] Viktor Barzin 2026-03-15 02:16:03 +00:00
  • 456e2777f5 update claude knowledge: LinuxServer.io container optimization learnings [ci skip] Viktor Barzin 2026-03-15 02:04:04 +00:00
  • 7c97902052 fix calibre slow startup: bake calibre binaries into image, skip chown on NFS Viktor Barzin 2026-03-15 02:03:19 +00:00
  • ff83ec3325 add infrastructure agent team: 8 specialized agents + 14 diagnostic scripts Viktor Barzin 2026-03-15 02:01:07 +00:00
  • fca4e02c54 prometheus: increase memory to 4Gi and probe delays for TSDB compaction Viktor Barzin 2026-03-15 01:44:52 +00:00
  • 43b49f7f6c cluster recovery: fix resource limits and node1 memory Viktor Barzin 2026-03-15 01:44:28 +00:00
  • a3c198e10e add AUTH_SECRET and ALLOWED_ORIGIN env vars to novelapp deployment Viktor Barzin 2026-03-15 00:33:38 +00:00
  • 29032e0b6b fix vaultwarden backup image: use docker.io/library/alpine for Kyverno Viktor Barzin 2026-03-15 00:16:31 +00:00
  • 6f562b5da6 add vaultwarden daily backup CronJob to NFS Viktor Barzin 2026-03-15 00:03:59 +00:00
  • 46afa85b01 fix openclaw config mount and OOM: use init container, increase memory to 2Gi Viktor Barzin 2026-03-14 23:42:08 +00:00
  • 916aa6c6cb update claude knowledge: OpenClaw deployment and tg wrapper learnings [ci skip] Viktor Barzin 2026-03-14 23:41:00 +00:00
  • 92cc3f01c1 migrate vaultwarden storage from NFS to iSCSI Viktor Barzin 2026-03-14 22:45:56 +00:00
  • 7e72a10848 exclude manifest requests from nginx registry cache Viktor Barzin 2026-03-14 22:43:20 +00:00
  • 0d01b3d1f3 Woodpecker CI deploy commit [CI SKIP] root 2026-03-14 22:11:47 +00:00
  • 23019da8e5 equalize memory req=lim across 70+ containers using Prometheus 7d max data Viktor Barzin 2026-03-14 21:46:49 +00:00
  • eb0301b02b lower memory limits closer to actual usage Viktor Barzin 2026-03-14 21:15:26 +00:00
  • 843ba0ae8f scale down unused/over-replicated services Viktor Barzin 2026-03-14 21:12:44 +00:00
  • f7c2c06009 right-size memory: set requests=limits based on actual usage Viktor Barzin 2026-03-14 21:01:24 +00:00
  • 2c296d4d7c add novelapp deployment [ci skip] Viktor Barzin 2026-03-14 18:51:14 +00:00
  • 98d7c2a4a5 fix: resolve HCL semicolons and vault-platform dependency cycle Viktor Barzin 2026-03-14 17:37:25 +00:00
  • a8d944eb9b migrate all secrets from SOPS to Vault KV Viktor Barzin 2026-03-14 17:15:48 +00:00
  • 39b7dac1a9 fix: bump openclaw memory limit to 1536Mi Viktor Barzin 2026-03-14 16:45:57 +00:00
  • 683361b55e fix: bump calibre memory limit to 512Mi Viktor Barzin 2026-03-14 16:38:09 +00:00
  • 8612eb3fc7 fix: bump affine migration init container memory to 512Mi Viktor Barzin 2026-03-14 16:37:06 +00:00
  • ef4cfc146a scale down ollama-ui, netbox, tandoor to free cluster memory Viktor Barzin 2026-03-14 16:30:08 +00:00
  • 2be858f616 fix: eliminate memory overcommit to prevent node OOM crashes Viktor Barzin 2026-03-14 16:01:41 +00:00
  • 27fa8ea18f Hide Vault OIDC from main login dropdown Viktor Barzin 2026-03-14 14:12:16 +00:00
  • 1dec7e6bea Add Vault OIDC authentication via Authentik Viktor Barzin 2026-03-14 13:53:05 +00:00
  • 44aa6d61c2 Reduce downtime during platform stack applies Viktor Barzin 2026-03-14 12:47:56 +00:00
  • 4ea3ffe9d3 Reduce downtime during platform stack applies Viktor Barzin 2026-03-14 12:09:09 +00:00
  • 4635d3b826 remember: CrowdSec Helm upgrade timeout [ci skip] Viktor Barzin 2026-03-14 12:04:07 +00:00
  • 17065304dc Fix NFSServerUnresponsive false positives Viktor Barzin 2026-03-14 11:28:17 +00:00
  • 6377a8b85b Monitoring overhaul: reduce noise, add coverage gaps, auto-load dashboards Viktor Barzin 2026-03-14 10:22:22 +00:00
  • a6f71fc6f0 feat(claude-memory): add stack and update image to standalone repo Viktor Barzin 2026-03-14 09:49:38 +00:00
  • 2102cb2d73 Right-size CPU requests cluster-wide and remove missed CPU limits Viktor Barzin 2026-03-14 09:22:24 +00:00
  • b00f810d3d Remove all CPU limits cluster-wide to eliminate CFS throttling Viktor Barzin 2026-03-14 08:51:45 +00:00
  • 120f83ce93 Nextcloud performance tuning and fix backup cron job Viktor Barzin 2026-03-14 08:20:51 +00:00
  • 8aaa75b57b Migrate Matrix Synapse from SQLite to PostgreSQL Viktor Barzin 2026-03-13 23:21:59 +00:00
  • 1fefffeebc Add OTP resource limits and scale up Viktor Barzin 2026-03-13 22:33:27 +00:00
  • c8d15adc16 Remove LokiDown alert rule and inhibit reference Viktor Barzin 2026-03-13 22:17:36 +00:00
  • b323e567e4 Add HAProxy for Redis HA master-only routing Viktor Barzin 2026-03-13 20:54:32 +00:00
  • d05ff57b11 authentik: auto-assign invitation group via expression policy [ci skip] Viktor Barzin 2026-03-13 20:17:57 +00:00
  • 160fda882f authentik: cleanup unused resources + add invitation enrollment flow [ci skip] Viktor Barzin 2026-03-13 20:06:17 +00:00
  • af5f6a659b right-size Nextcloud resources after MySQL migration Viktor Barzin 2026-03-13 19:16:06 +00:00
  • 3e03fbec63 increase MaxRequestWorkers to 150 now that Nextcloud is on MySQL Viktor Barzin 2026-03-13 18:59:21 +00:00
  • aa3d3d0e66 migrate Nextcloud from SQLite to MySQL InnoDB Cluster Viktor Barzin 2026-03-12 23:27:12 +00:00
  • ce79bd5c04 Add node hang instrumentation and scale down chromium services Viktor Barzin 2026-03-11 22:46:33 +00:00
  • 8029823f79 fix(monitoring): Add setup script for automated health check environment OpenClaw 2026-03-13 13:57:11 +00:00
  • dfcef89c35 fix Frigate GPU stall: add inference speed check to liveness probe Viktor Barzin 2026-03-13 10:23:21 +00:00
  • 50d539908c feat(monitoring): Disable Loki centralized logging while preserving configuration OpenClaw 2026-03-13 08:41:23 +00:00
  • 523f0ba7eb fix(monitoring): Expand Loki PVC from 15GB to 50GB to resolve storage exhaustion OpenClaw 2026-03-13 08:13:05 +00:00
  • 4a9bd89b11 feat(health-check): Add Prometheus-based CPU and power monitoring OpenClaw 2026-03-13 07:32:36 +00:00
  • a09967e098 feat(monitoring): Enhance disk monitoring and containerd GC after node2 incident OpenClaw 2026-03-13 07:16:56 +00:00
  • fd6c1cca93 fix(nextcloud): Database corruption recovery and conservative Apache tuning OpenClaw 2026-03-12 13:38:37 +00:00
  • db1e301eea fix(nextcloud): Increase Apache MaxRequestWorkers to resolve health check timeouts OpenClaw 2026-03-12 13:14:20 +00:00
  • cedb90be33 Clean up: Remove test push file OpenClaw 2026-03-12 12:38:46 +00:00
  • 84b616de41 Test: Verify git push functionality from OpenClaw OpenClaw 2026-03-12 12:38:36 +00:00
  • 3f0cf4ff4d stabilize Nextcloud: relax probes, reduce resources for 2-client SQLite workload Viktor Barzin 2026-03-12 10:01:20 +00:00
  • bef0c073d5 fix DNS health check: use system resolver instead of hardcoded 10.0.20.101 Viktor Barzin 2026-03-12 09:08:34 +00:00
  • 81bfccaefc fix OOM kills: tune MySQL memory, reduce Nextcloud workers, increase Uptime Kuma limit Viktor Barzin 2026-03-12 07:26:08 +00:00
  • f2c7444159 fix nvidia quota: use custom quota (32 CPU) instead of Kyverno-generated (16 CPU) Viktor Barzin 2026-03-12 07:04:34 +00:00
  • f07f05f9bb migrate Nextcloud data volume from NFS to iSCSI for fsync support Viktor Barzin 2026-03-11 23:23:37 +00:00
  • 4427530e65 Archive terraform.tfvars — secrets now in SOPS Viktor Barzin 2026-03-11 21:16:11 +00:00
  • d7953322dd fix cluster health: pin actualbudget, spread MySQL, scale grampsweb, fix GPU toleration Viktor Barzin 2026-03-11 11:43:00 +00:00
  • 6bdcd88d25 set Recreate strategy for plotting-book deployment Viktor Barzin 2026-03-10 23:47:30 +00:00
  • 5a9881337d Add terminal stack - reverse proxy to ttyd behind authentik Viktor Barzin 2026-03-10 23:46:01 +00:00
  • d8bcdfef2e revert MaxRequestWorkers to 50, exclude nextcloud from 5xx alert Viktor Barzin 2026-03-09 22:05:20 +00:00
  • eed991a27b exclude nextcloud from HighServiceErrorRate alert Viktor Barzin 2026-03-09 20:26:30 +00:00
  • 0ca81a6112 fix: mount Apache MPM config under nextcloud.extraVolumes (not top-level) Viktor Barzin 2026-03-08 21:37:39 +00:00
  • ff03f2b99f tune Nextcloud Apache/PHP to fix constant crash-looping (50 restarts/6d) Viktor Barzin 2026-03-08 21:33:27 +00:00
  • ad8b90575e fix noisy JobFailed and duplicate mail server alerts Viktor Barzin 2026-03-08 21:22:43 +00:00
  • 33c7976630 reduce alert noise: add cascade inhibitions, increase for durations, drop Loki alerts Viktor Barzin 2026-03-08 21:13:16 +00:00
  • 2fa8ba2038 [ci skip] add sealed secrets convention: fileset + kubernetes_manifest pattern Viktor Barzin 2026-03-08 20:03:50 +00:00
  • 6b3e84f465 deploy Sealed Secrets controller for encrypted secret management Viktor Barzin 2026-03-08 19:49:48 +00:00
  • d352d6e7f8 resource quota review: fix OOM risks, close quota gaps, add HA protections Viktor Barzin 2026-03-08 18:17:46 +00:00
  • ead33b23dd enable MySQL InnoDB Cluster auto-recovery after crashes Viktor Barzin 2026-03-08 17:13:03 +00:00
  • 98f4920af1 [ci skip] remember: update kubelet thresholds when changing node memory Viktor Barzin 2026-03-08 10:34:17 +00:00
  • fffc2ed0ab fix node OOM: reduce memory overcommit ratio and add kubelet eviction thresholds Viktor Barzin 2026-03-08 10:33:38 +00:00
  • 193f2e2dc5 add iSCSI persistent volume for plotting-book SQLite database Viktor Barzin 2026-03-07 21:57:22 +00:00
  • 4374e78869 [ci skip] fix Wealthfolio Homepage icon: wealthfolio.png → mdi-finance Viktor Barzin 2026-03-07 21:32:58 +00:00
  • 9d031290cc [ci skip] fix Homepage icons for Tandoor, Listenarr, Networking Toolbox, Goldilocks Viktor Barzin 2026-03-07 21:29:51 +00:00
  • 32bd30f56e [ci skip] fix invalid Homepage dashboard icons for 9 services Viktor Barzin 2026-03-07 21:14:17 +00:00
  • 76a4987eef [ci skip] add Forgejo task pipeline for OpenClaw AI agent Viktor Barzin 2026-03-07 21:09:31 +00:00
  • c2765e890b add nginx caching proxy for Homepage widget API requests Viktor Barzin 2026-03-07 21:06:51 +00:00
  • b4f9777ecd Woodpecker CI deploy commit [CI SKIP] root 2026-03-07 20:47:22 +00:00
  • 33b20ce111 add Google OAuth env vars to plotting-book deployment Viktor Barzin 2026-03-07 20:41:08 +00:00
  • 650c2a6ed4 [ci skip] add liveness probe to Send deployment Viktor Barzin 2026-03-07 20:39:44 +00:00
  • 144fd151cb [ci skip] fix Navidrome credentials: admin user is wizard not admin Viktor Barzin 2026-03-07 20:20:18 +00:00
  • 7af0024473 [ci skip] fix pfSense widget: wan interface is vtnet0 not vmx0 Viktor Barzin 2026-03-07 20:02:36 +00:00
  • 3ec643a897 [ci skip] fix pfSense widget: remove wanStatus (API v2 missing gateway endpoint) Viktor Barzin 2026-03-07 19:54:23 +00:00
  • f3042f318e [ci skip] fix widget issues: ports, Immich v2 API, Nextcloud trusted domains Viktor Barzin 2026-03-07 19:36:25 +00:00
  • 17256c8f76 [ci skip] fix widget URLs: use correct k8s service ports Viktor Barzin 2026-03-07 19:23:57 +00:00