infra/.claude/skills/extend-vm-storage/SKILL.md

---
name: extend-vm-storage
description: |
  Extend disk storage on a Kubernetes node VM (Proxmox-hosted).
  Use when: (1) User wants to increase disk space on a k8s node VM,
  (2) A node is running low on disk, (3) User says "extend storage"
  or "add disk space". Automates: drain → shutdown → resize → boot →
  expand filesystem → uncordon.
author: Claude Code
version: 1.0.0
date: 2025-01-01
---
# Extend VM Storage Skill
**Purpose**: Extend disk storage on a Kubernetes node VM (Proxmox-hosted).
**When to use**: User wants to increase disk space on a k8s node VM, or a node is running low on disk.
## Workflow
### 1. Identify the Node
Ask the user which node needs more storage and how much to add.
Valid nodes: `k8s-master`, `k8s-node1`, `k8s-node2`, `k8s-node3`, `k8s-node4`
### 2. Run the Script
```bash
./scripts/extend_vm_storage.sh <node-name> <size-increment>
```
**Example**:
```bash
./scripts/extend_vm_storage.sh k8s-node2 +64G
```
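The script validates both arguments before doing anything else. A minimal sketch of what such checks could look like follows; the function names are hypothetical and the real script may validate differently. The size check here is deliberately stricter than what `qm resize` accepts: it requires a leading `+`, a positive integer, and a `K`/`M`/`G`/`T` suffix.

```bash
# Hypothetical helpers; not taken from extend_vm_storage.sh.

# Node names listed under "Identify the Node".
VALID_NODES="k8s-master k8s-node1 k8s-node2 k8s-node3 k8s-node4"

# Succeeds only if $1 is one of the known node names.
is_valid_node() {
  local candidate="$1" node
  for node in $VALID_NODES; do
    if [ "$node" = "$candidate" ]; then
      return 0
    fi
  done
  return 1
}

# Succeeds only for a relative increment such as +64G.
is_valid_increment() {
  printf '%s\n' "$1" | grep -Eq '^\+[1-9][0-9]*[KMGT]$'
}
```

Requiring the `+` guards against passing an absolute size such as `64G`, which `qm resize` would interpret as the target size rather than an increment.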
### 3. What the Script Does
1. Validates inputs (node name and size format)
2. Resolves node IP via kubectl
3. Prompts for confirmation
4. Drains the node (evicts pods)
5. Shuts down the VM in Proxmox
6. Resizes the disk (`scsi0`) by the given increment
7. Starts the VM and waits for SSH
8. Expands the filesystem inside the guest (auto-detects LVM vs direct partition)
9. Uncordons the node
10. Shows verification output (`df -h` and node status)
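For review before a run, the host-side steps above can be sketched as a dry-run helper that only prints commands. This is an illustration, not the script's actual contents: `print_resize_plan` is a hypothetical name, the VMID is assumed to be known already, and the drain flags are one common choice. The guest-side filesystem expansion is omitted because it depends on whether the guest uses LVM or a direct partition.

```bash
# Prints (does not execute) one plausible command per host-side step.
# Arguments: <node-name> <vmid> <size-increment>
print_resize_plan() {
  local node="$1" vmid="$2" increment="$3"
  local pve="root@192.168.1.127"   # Proxmox host from the Constants table

  echo "kubectl drain $node --ignore-daemonsets --delete-emptydir-data"
  echo "ssh $pve qm shutdown $vmid --timeout 300"
  echo "ssh $pve qm resize $vmid scsi0 $increment"
  echo "ssh $pve qm start $vmid"
  echo "kubectl uncordon $node"
}
```

Calling `print_resize_plan k8s-node2 <vmid> +64G` lists the five commands in order, which makes it easier to see where a failed run stopped.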
### 4. Update Terraform (if needed)
If Terraform manages the VM's disk size, update the VM definition in `main.tf` or `modules/create-vm/` to the new total so that a future `terraform apply` doesn't revert the change. To check whether the disk size is managed by Terraform:
```bash
grep -A5 "disk" main.tf | grep -i size
```
If managed, update the size value to match the new total.
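The new total is the previous size plus the increment passed to the script. A small sketch of that arithmetic, assuming both values are in whole gigabytes (the helper name is hypothetical, and the unit Terraform expects depends on how the VM module defines the size):

```bash
# Prints the new total in GB given the current size and a +<n>G increment.
# Only handles increments with a G suffix.
new_total_gb() {
  local current="$1"      # current size in GB, e.g. 64
  local increment="$2"    # increment as passed to the script, e.g. +64G
  local add="${increment#+}"   # strip the leading "+"
  add="${add%G}"               # strip the trailing "G"
  echo $(( current + add ))
}

# Example: a 64 GB disk extended by +64G
# new_total_gb 64 +64G   # prints 128
```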
### 5. Verification
After the script completes, verify:
```bash
kubectl --kubeconfig "$(pwd)/config" get nodes
ssh wizard@<node-ip> "df -h /"
```
## Recovery
If the script fails mid-way:
1. Check VM status: `ssh root@192.168.1.127 "qm status <vmid>"`
2. Start VM if stopped: `ssh root@192.168.1.127 "qm start <vmid>"`
3. Uncordon node: `kubectl --kubeconfig "$(pwd)/config" uncordon <node-name>`
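The recovery commands need the VM's `<vmid>`. One way to look it up is to filter the output of `qm list` by VM name. The sketch below assumes the Proxmox VM name matches the Kubernetes node name, which may not hold in this setup; `vmid_for_name` is a hypothetical helper.

```bash
# Reads `qm list` output on stdin and prints the VMID whose NAME column
# equals $1. Prints nothing if no VM matches.
vmid_for_name() {
  awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Usage (assumes VM names match node names):
#   ssh root@192.168.1.127 "qm list" | vmid_for_name k8s-node2
```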
## Constants
| Setting | Value |
|---------|-------|
| Proxmox host | `root@192.168.1.127` |
| VM SSH user | `wizard` |
| Disk name | `scsi0` |
| Shutdown timeout | 300s |
| SSH wait timeout | 300s |
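The two timeouts imply a poll-until-ready loop. A minimal sketch of such a loop, assuming a fixed retry interval (the function name and the interval are illustrative, not taken from the script):

```bash
# Retries a command until it succeeds or roughly <timeout> seconds of
# waiting have accumulated. Time spent inside the command is not counted.
# Arguments: <timeout-seconds> <interval-seconds> <command> [args...]
wait_for() {
  local timeout_s="$1" interval_s="$2" elapsed=0
  shift 2
  until "$@"; do
    elapsed=$(( elapsed + interval_s ))
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1   # gave up
    fi
    sleep "$interval_s"
  done
  return 0
}

# Usage: wait up to 300s for SSH on the node to accept connections
#   wait_for 300 5 ssh -o BatchMode=yes -o ConnectTimeout=5 wizard@<node-ip> true
```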
## Questions to Ask User
1. Which node needs more storage?
2. How much storage to add? (e.g., +64G)