infra/stacks/monitoring/modules/monitoring
Viktor Barzin ea18116da9 fix: NFS outage recovery — migrate to NFSv4, add alerting
NFS server restart broke NFSv3 (lockd kernel bug on PVE 6.14).
All 52 NFS PVs patched to nfsvers=4, NFSv3 disabled on PVE.

Changes:
- nfs_volume module: add nfsvers=4 mount option
- nfs-csi StorageClass: add nfsvers=4 mount option
- dbaas: MySQL serverInstances 3→1, mysql-native-password=ON
- monitoring: add NFSCSINodeDown and NFSMountFailures alerts

[ci skip]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:28:27 +00:00
dashboards monitoring + proxmox-csi: LVM snapshot RBAC, pushgateway NodePort, backup dashboard 2026-04-06 11:57:41 +03:00
server-power-cycle extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
alloy.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
Dockerfile extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
goflow2.tf extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
grafana.tf truenas deprecation: migrate all non-immich storage to proxmox NFS 2026-04-12 14:35:39 +01:00
grafana_chart_values.yaml feat: organize Grafana dashboards into folders 2026-03-28 16:23:49 +02:00
idrac.tf fix(monitoring): use patched idrac exporter with PSU input voltage metric 2026-03-23 22:07:36 +02:00
k8s-monitoring-values.yaml cleanup: remove calibre and audiobookshelf stacks after ebooks migration [ci skip] 2026-03-25 23:56:07 +02:00
loki.tf extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
loki.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
main.tf deprecate TrueNAS: migrate Immich NFS to Proxmox, remove all 10.0.10.15 references [ci skip] 2026-04-13 14:42:07 +00:00
prometheus.tf truenas deprecation: migrate all non-immich storage to proxmox NFS 2026-04-12 14:35:39 +01:00
prometheus_chart_values.tpl fix: NFS outage recovery — migrate to NFSv4, add alerting 2026-04-14 10:28:27 +00:00
prometheus_snmp_chart_values.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
pve_exporter.tf extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00
snmp_exporter.tf security(monitoring): remove public SNMP exporter ingress 2026-04-06 15:23:56 +03:00
ups_snmp_values.yaml extract monitoring, nvidia, mailserver, cloudflared, kyverno from platform [ci skip] 2026-03-17 21:34:11 +00:00