diff --git a/.claude/skills/k8s-nfs-mount-troubleshooting/SKILL.md b/.claude/skills/k8s-nfs-mount-troubleshooting/SKILL.md index 5f3fa3b8..0aa94e14 100644 --- a/.claude/skills/k8s-nfs-mount-troubleshooting/SKILL.md +++ b/.claude/skills/k8s-nfs-mount-troubleshooting/SKILL.md @@ -6,11 +6,13 @@ description: | (3) Error message shows "Protocol not supported" or "mount.nfs: access denied", (4) NFS volume defined in pod spec but container won't start, (5) Container starts but gets "Permission denied" writing to NFS volume (non-root container UID mismatch), - (6) CronJob or init container fails silently when writing to NFS. Common root causes - are missing NFS export on the server and UID mismatch for non-root containers. + (6) CronJob or init container fails silently when writing to NFS, (7) Pod shows Running + 1/1 but service is unresponsive after a node reboot — stale NFS mount causes frozen + processes with zero listening sockets. Common root causes are missing NFS export on the + server, UID mismatch for non-root containers, and stale mounts after node reboots. author: Claude Code -version: 1.1.0 -date: 2026-02-22 +version: 1.2.0 +date: 2026-02-28 --- # Kubernetes NFS Mount Troubleshooting @@ -142,6 +144,91 @@ kubectl exec -n -- sh -c 'echo test > /path/to/nfs/testfile' ssh root@10.0.10.15 "ls -la /mnt/main//" ``` +## Variant: Stale NFS Mounts After Node Reboot (Ghost Running Pods) + +### Problem +After a node reboot (e.g., from kured rolling kernel updates), pods are rescheduled and +show `Running 1/1` status, but the application process is frozen/hung. The service is +completely unresponsive despite appearing healthy to Kubernetes. + +### Trigger Conditions +- Node was recently rebooted (check `kubectl get nodes` for age, or kured logs) +- Pod shows `Running 1/1` with 0 restarts (looks perfectly healthy) +- Service is unresponsive — Uptime Kuma or curl shows timeout/connection refused +- `kubectl exec -- ss -tlnp` shows **zero listening sockets** (the process started but is hung) +- Pod uses NFS volumes (inline `nfs {}` or PVC backed by NFS) +- Multiple pods across different namespaces all exhibit the same symptom simultaneously +- `kubectl describe pod` shows no warnings or errors — everything looks normal + +### Root Cause +When a node reboots, the NFS client mounts go stale. If the pod is rescheduled to the +same or different node before NFS fully recovers, the application process starts but +immediately hangs when it tries to access the NFS-mounted filesystem. The process is +stuck in an uninterruptible I/O wait (D state) but Kubernetes sees the container as +running because the PID exists and liveness probes (if any) may not exercise the NFS path. + +### Solution +Force-delete the affected pods to trigger a clean reschedule with fresh NFS mounts: + +```bash +# Identify hung pods — Running but no listening sockets +kubectl exec -n -- ss -tlnp 2>/dev/null +# If output is empty or shows no expected ports, the pod is hung + +# Force-delete to skip graceful shutdown (hung process won't respond to SIGTERM) +kubectl delete pod -n --force --grace-period=0 + +# The deployment controller creates a new pod with fresh NFS mounts +kubectl get pods -n -w +``` + +For bulk remediation after a cluster-wide event: +```bash +# Find all pods with NFS volumes that might be hung +# Check each service's expected port — if ss -tlnp shows nothing, force-delete +for ns in calibre stirling-pdf send speedtest n8n paperless-ngx; do + pod=$(kubectl get pod -n $ns -o name | head -1) + sockets=$(kubectl exec -n $ns ${pod} -- ss -tlnp 2>/dev/null | wc -l) + if [ "$sockets" -le 1 ]; then + echo "HUNG: $ns/$pod (no listening sockets)" + kubectl delete ${pod} -n $ns --force --grace-period=0 + fi +done +``` + +### Verification +```bash +# New pod should have listening sockets +kubectl exec -n -- ss -tlnp +# Should show the application's expected port (e.g., *:8080) + +# Service should respond +kubectl exec -n -- curl -sI http://localhost:/ +# Should return HTTP response +``` + +### Key Diagnostic Insight +The critical signal is **Running 1/1 but zero listening sockets**. Normal healthy pods +always have at least one listening socket for their application port. If `ss -tlnp` +returns nothing, the process is hung on a stale NFS mount, not crashed — that's why +Kubernetes thinks it's fine. + +### Prevention +- Add **liveness probes** that hit the application's HTTP endpoint (not just TCP connect): + ```hcl + liveness_probe { + http_get { + path = "/" + port = 8080 + } + initial_delay_seconds = 60 + period_seconds = 30 + timeout_seconds = 5 + } + ``` +- This ensures Kubernetes detects hung pods and restarts them automatically. + ## See Also - TrueNAS NFS configuration documentation - Kubernetes NFS volume documentation +- k8s-limitrange-oom-silent-kill (for OOM issues often confused with NFS hangs)