infra/.claude/skills/extend-vm-storage/SKILL.md
Viktor Barzin fd0f4a0365 fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:45:33 +00:00

2.4 KiB

name description author version date
extend-vm-storage Extend disk storage on a Kubernetes node VM (Proxmox-hosted). Use when: (1) User wants to increase disk space on a k8s node VM, (2) A node is running low on disk, (3) User says "extend storage" or "add disk space". Automates: drain → shutdown → resize → boot → expand filesystem → uncordon. Claude Code 1.0.0 2025-01-01

Extend VM Storage Skill

Purpose: Extend disk storage on a Kubernetes node VM (Proxmox-hosted).

When to use: User wants to increase disk space on a k8s node VM, or a node is running low on disk.

Workflow

1. Identify the Node

Ask the user which node needs more storage and how much to add.

Valid nodes: k8s-master, k8s-node1, k8s-node2, k8s-node3, k8s-node4

2. Run the Script

./scripts/extend_vm_storage.sh <node-name> <size-increment>

Example:

./scripts/extend_vm_storage.sh k8s-node2 +64G

3. What the Script Does

  1. Validates inputs (node name and size format)
  2. Resolves node IP via kubectl
  3. Prompts for confirmation
  4. Drains the node (evicts pods)
  5. Shuts down the VM in Proxmox
  6. Resizes the disk (scsi0) by the given increment
  7. Starts the VM and waits for SSH
  8. Expands the filesystem inside the guest (auto-detects LVM vs direct partition)
  9. Uncordons the node
  10. Shows verification output (df -h and node status)

4. Update Terraform (if needed)

If you want Terraform to reflect the new disk size, update the VM definition in main.tf or modules/create-vm/ so that a future terraform apply doesn't revert the change. Check if the VM disk size is managed by Terraform:

grep -A5 "disk" main.tf | grep -i size

If managed, update the size value to match the new total.

5. Verification

After the script completes, verify:

kubectl --kubeconfig $(pwd)/config get nodes
ssh wizard@<node-ip> "df -h /"

Recovery

If the script fails mid-way:

  1. Check VM status: ssh root@192.168.1.127 "qm status <vmid>"
  2. Start VM if stopped: ssh root@192.168.1.127 "qm start <vmid>"
  3. Uncordon node: kubectl --kubeconfig $(pwd)/config uncordon <node-name>

Constants

Setting Value
Proxmox host root@192.168.1.127
VM SSH user wizard
Disk name scsi0
Shutdown timeout 300s
SSH wait timeout 300s

Questions to Ask User

  1. Which node needs more storage?
  2. How much storage to add? (e.g., +64G)