fix: NFS outage recovery — migrate to NFSv4, add alerting

NFS server restart broke NFSv3 (lockd kernel bug on PVE 6.14).
All 52 NFS PVs patched to nfsvers=4, NFSv3 disabled on PVE.

Changes:
- nfs_volume module: add nfsvers=4 mount option
- nfs-csi StorageClass: add nfsvers=4 mount option
- dbaas: MySQL serverInstances 3→1, mysql-native-password=ON
- monitoring: add NFSCSINodeDown and NFSMountFailures alerts
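
For illustration, the StorageClass change above might look roughly like the
sketch below. It assumes the Terraform kubernetes provider and the upstream
csi-driver-nfs provisioner name; the resource name, server address, and export
path are placeholders, not values taken from this repository.

```hcl
# Hypothetical example: all names and addresses here are placeholders.
resource "kubernetes_storage_class" "nfs_csi_example" {
  metadata {
    name = "nfs-csi-example"
  }

  # Provisioner name used by the upstream csi-driver-nfs project (assumed here).
  storage_provisioner = "nfs.csi.k8s.io"
  reclaim_policy      = "Retain"

  parameters = {
    server = "nfs.example.internal"
    share  = "/export/example"
  }

  # Pin the protocol version: NFSv4 handles locking in-protocol,
  # so mounts no longer depend on the NFSv3 lockd (NLM) service.
  mount_options = ["nfsvers=4"]
}
```

Mount options on a StorageClass apply only to volumes provisioned after the
change, which is consistent with the existing PVs being patched separately.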

[ci skip]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-04-14 10:28:27 +00:00
parent 92900b5e08
commit ea18116da9
4 changed files with 21 additions and 1 deletions

@@ -180,7 +180,7 @@ resource "helm_release" "mysql_cluster" {
 version = "2.2.7"
 values = [yamlencode({
-serverInstances = 3
+serverInstances = 1
 routerInstances = 1
 serverVersion = "8.4.4"
@@ -216,6 +216,7 @@ resource "helm_release" "mysql_cluster" {
 mycnf = <<-EOT
 [mysqld]
 skip-name-resolve
+mysql-native-password=ON
 # Auto-recovery after crashes: rejoin group without manual intervention
 group_replication_autorejoin_tries=2016
 group_replication_exit_state_action=OFFLINE_MODE