add claude [ci skip]

This commit is contained in:
Viktor Barzin 2026-02-06 20:10:02 +00:00
parent 9ef4d38d51
commit ffa80f0df6
18 changed files with 3026 additions and 95 deletions


@ -3,7 +3,7 @@
## Instructions for Claude
- **When the user says "remember" something**: Always update this file (`.claude/CLAUDE.md`) with the information so it persists across sessions
- **When discovering new patterns or versions**: Add them to the appropriate section below
- **Use `/update-knowledge` command**: Or edit this file directly to add learnings
- **Skills available**: Check `.claude/skills/` directory for specialized workflows (e.g., `setup-project.md` for deploying new services)
## Execution Environment (CRITICAL)
- **Prefer running commands directly first** - only use remote executor as fallback if local execution fails
@ -19,33 +19,98 @@
- **helm**: chart operations - needs cluster access
- **docker**: container operations on remote hosts
- **ssh**: connections to infrastructure nodes
- **python/pip**: ALL Python and pip commands must run via remote executor
- **Any command interacting with**: Proxmox, Kubernetes cluster, NFS server, other infrastructure
### Remote Command Execution (FALLBACK)
For commands that fail locally, use the file-based relay, which runs through a **shared executor** at `~/.claude/` on the remote VM.
**IMPORTANT: Always use multi-session mode** - create a session at the start of each conversation.
#### Shared Executor Architecture
The executor lives at `~/.claude/` on the remote VM (wizard@10.0.10.10) and serves all projects:
- `~/.claude/remote-executor.sh` - The shared command executor
- `~/.claude/session-exec.sh` - Shared session management
- `~/.claude/sessions/` → symlink to project sessions (or shared sessions directory)
Each session includes a `workdir.txt` specifying which project directory to use.
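The per-session layout, as a sketch (file names are the ones used by the protocol below):
```
.claude/sessions/<SESSION_ID>/
├── workdir.txt     # project directory to cd into on the remote VM
├── cmd_input.txt   # command to execute
├── cmd_status.txt  # ready | running | done:N
└── cmd_output.txt  # captured stdout and stderr
```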
#### Multi-Session Mode (REQUIRED)
Each Claude session gets isolated command execution:
**To execute a remote command:**
```bash
# 1. Create a session (once per Claude session)
SESSION_ID=$(.claude/session-exec.sh)
# 2. Write command to your session
echo "your-command-here" > .claude/sessions/$SESSION_ID/cmd_input.txt
# 3. Wait and check status
sleep 1 && cat .claude/sessions/$SESSION_ID/cmd_status.txt
# 4. Read output (when status is "done:*")
cat .claude/sessions/$SESSION_ID/cmd_output.txt
# 5. Cleanup when done (optional - auto-cleaned after 24h)
.claude/session-exec.sh $SESSION_ID cleanup
```
**Status values:** `ready` | `running` | `done:N` (N = exit code)
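A minimal wrapper over this protocol (a sketch; assumes `SESSION_ID` was created as above and polls once per second):
```bash
# Hypothetical helper: run one command through the session and print its output
run_remote() {
  echo "$1" > ".claude/sessions/$SESSION_ID/cmd_input.txt"
  while [[ "$(cat ".claude/sessions/$SESSION_ID/cmd_status.txt")" != done:* ]]; do
    sleep 1
  done
  cat ".claude/sessions/$SESSION_ID/cmd_output.txt"
}
```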
**Requires user to start shared executor in another terminal:**
```bash
# On wizard@10.0.10.10:
~/.claude/remote-executor.sh
```
**Session helper commands:**
- `.claude/session-exec.sh` - Create new session (returns session ID)
- `.claude/session-exec.sh <id> status` - Check session status
- `.claude/session-exec.sh <id> cleanup` - Remove a session
- `.claude/session-exec.sh _ list` - List all active sessions
---
## Overview
Terraform-based infrastructure repository managing a home Kubernetes cluster on Proxmox VMs. Uses git-crypt for secrets encryption.
## Static File Paths (NEVER CHANGE)
- **Main config**: `terraform.tfvars` - All secrets, DNS, Cloudflare config, WireGuard peers
- **Root terraform**: `main.tf` - Proxmox provider, VM templates, kubernetes_cluster module
- **K8s services**: `modules/kubernetes/main.tf` - All service module definitions
- **Secrets**: `secrets/` - git-crypt encrypted TLS certs and keys
## Network Topology (Static IPs)
```
┌─────────────────────────────────────────────────────────────────┐
│ 10.0.10.0/24 - Management Network │
├─────────────────────────────────────────────────────────────────┤
│ 10.0.10.10 - Wizard (main server / remote executor host) │
│ 10.0.10.15 - NFS Server (TrueNAS) - /mnt/main/* │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ 10.0.20.0/24 - Kubernetes Network │
├─────────────────────────────────────────────────────────────────┤
│ 10.0.20.1 - pfSense Gateway │
│ 10.0.20.10 - Docker Registry VM (MAC: DE:AD:BE:EF:22:22) │
│ 10.0.20.100 - k8s-master │
│ 10.0.20.101 - Technitium DNS │
│ 10.0.20.102 - MetalLB IP Pool Start │
│ 10.0.20.200 - MetalLB IP Pool End │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ 192.168.1.0/24 - Physical Network │
├─────────────────────────────────────────────────────────────────┤
│ 192.168.1.127 - Proxmox Hypervisor │
└─────────────────────────────────────────────────────────────────┘
```
## Domains
- **Public**: `viktorbarzin.me` (Cloudflare-managed)
- **Internal**: `viktorbarzin.lan` (Technitium DNS)
## Directory Structure
- `main.tf` - Main Terraform entry point, imports all modules
- `modules/kubernetes/` - Kubernetes service deployments (one folder per service)
@ -77,6 +142,11 @@ volume {
```
Only use PV/PVC when the Helm chart requires `existingClaim` (like the Nextcloud Helm chart).
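For reference, the direct NFS volume shape (a sketch; the export path is a placeholder):
```hcl
volume {
  name = "data"
  nfs {
    server = "10.0.10.15"          # TrueNAS NFS server
    path   = "/mnt/main/<service>" # placeholder export
  }
}
```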
### Adding NFS Exports
To add a new NFS exported directory (on the remote VM via executor):
1. Edit `nfs_directories.txt` - add the new directory path, keep the list sorted
2. Run `nfs_exports.sh` to create the NFS export
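For example (a sketch run on the remote VM; the export path is a placeholder):
```bash
echo "/mnt/main/myservice" >> nfs_directories.txt
sort -o nfs_directories.txt nfs_directories.txt  # keep the list sorted
./nfs_exports.sh                                 # create the NFS export
```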
### Factory Pattern (for multi-user services)
Used when a service needs one instance per user. Structure:
```
@ -92,6 +162,33 @@ To add a new user:
2. Add Cloudflare route in tfvars
3. Add module block in main.tf calling factory
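Step 3 as a sketch (module and variable names are assumptions, not the factory's actual interface):
```hcl
module "freedify_alice" {
  source          = "./modules/kubernetes/freedify"
  username        = "alice" # hypothetical user
  tls_secret_name = var.tls_secret_name
}
```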
### Init Container Pattern (for database migrations)
Use when a service needs to run database migrations before starting:
```hcl
init_container {
name = "migration"
image = "service-image:tag"
command = ["sh", "-c", "migration-command"]
dynamic "env" {
for_each = local.common_env
content {
name = env.value.name
value = env.value.value
}
}
}
```
Example: AFFiNE runs `node ./scripts/self-host-predeploy.js` in an init container.
### SMTP/Email Configuration
When configuring services to use the mailserver:
- **Use public hostname**: `mail.viktorbarzin.me` (for TLS cert validation)
- **Do NOT use**: `mailserver.mailserver.svc.cluster.local` (TLS cert mismatch)
- **Port**: 587 (STARTTLS)
- **Credentials**: Use existing accounts from `mailserver_accounts` in tfvars
- **Common email**: `info@viktorbarzin.me` for service notifications
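A sketch of the resulting container env (the variable names are assumptions; each service has its own):
```hcl
env {
  name  = "SMTP_HOST"
  value = "mail.viktorbarzin.me" # public hostname so the TLS cert validates
}
env {
  name  = "SMTP_PORT"
  value = "587" # STARTTLS
}
env {
  name  = "SMTP_FROM"
  value = "info@viktorbarzin.me"
}
```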
## Common Variables
- `tls_secret_name` - TLS certificate secret name
- `tier` - Deployment tier label
@ -100,6 +197,7 @@ To add a new user:
## Service Versions (as of 2025-01)
- Immich: v2.4.1
- Freedify: latest (music streaming, factory pattern)
- AFFiNE: stable (visual canvas, uses PostgreSQL + Redis)
## Useful Commands
```bash
@ -124,26 +222,137 @@ Top-level modules in `main.tf`:
- `module.docker-registry-vm` - Docker registry VM
- `module.kubernetes_cluster` - Main K8s cluster (contains all services)
### Kubernetes Services (under module.kubernetes_cluster.module.*)
---
## Complete Service Catalog
### DEFCON Level 1 (Critical - Network & Auth)
| Service | Description | Tier |
|---------|-------------|------|
| wireguard | VPN server | core |
| technitium | DNS server (10.0.20.101) | core |
| headscale | Tailscale control server | core |
| nginx-ingress | Ingress controller | core |
| xray | Proxy/tunnel | core |
| authentik | Identity provider (SSO) | core |
| cloudflared | Cloudflare tunnel | core |
| authelia | Auth middleware | core |
| monitoring | Prometheus/Grafana stack | core |
### DEFCON Level 2 (Storage & Security)
| Service | Description | Tier |
|---------|-------------|------|
| vaultwarden | Bitwarden-compatible password manager | cluster |
| redis | Shared Redis at `redis.redis.svc.cluster.local` | cluster |
| immich | Photo management (GPU) | gpu |
| nvidia | GPU device plugin | gpu |
| metrics-server | K8s metrics | cluster |
| uptime-kuma | Status monitoring | cluster |
| crowdsec | Security/WAF | cluster |
| kyverno | Policy engine | cluster |
### DEFCON Level 3 (Admin)
| Service | Description | Tier |
|---------|-------------|------|
| k8s-dashboard | Kubernetes dashboard | edge |
| reverse-proxy | Generic reverse proxy | edge |
### DEFCON Level 4 (Active Use)
| Service | Description | Tier |
|---------|-------------|------|
| mailserver | Email (docker-mailserver) | edge |
| shadowsocks | Proxy | edge |
| webhook_handler | Webhook processing | edge |
| tuya-bridge | Smart home bridge | edge |
| dawarich | Location history | edge |
| owntracks | Location tracking | edge |
| nextcloud | File sync/share | edge |
| calibre | E-book management | edge |
| onlyoffice | Document editing | edge |
| f1-stream | F1 streaming | edge |
| rybbit | Analytics | edge |
| isponsorblocktv | SponsorBlock for TV | edge |
| actualbudget | Budgeting (factory pattern) | aux |
### DEFCON Level 5 (Optional)
| Service | Description | Tier |
|---------|-------------|------|
| blog | Personal blog | aux |
| descheduler | Pod descheduler | aux |
| drone | CI/CD | aux |
| hackmd | Collaborative markdown | aux |
| kms | Key management | aux |
| privatebin | Encrypted pastebin | aux |
| vault | HashiCorp Vault | aux |
| reloader | ConfigMap/Secret reloader | aux |
| city-guesser | Game | aux |
| echo | Echo server | aux |
| url | URL shortener | aux |
| excalidraw | Whiteboard | aux |
| travel_blog | Travel blog | aux |
| dashy | Dashboard | aux |
| send | Firefox Send | aux |
| ytdlp | YouTube downloader | aux |
| wealthfolio | Finance tracking | aux |
| audiobookshelf | Audiobook server | aux |
| paperless-ngx | Document management | aux |
| jsoncrack | JSON visualizer | aux |
| servarr | Media automation (Sonarr/Radarr/etc) | aux |
| ntfy | Push notifications | aux |
| cyberchef | Data transformation | aux |
| diun | Docker image update notifier | aux |
| meshcentral | Remote management | aux |
| homepage | Dashboard/startpage | aux |
| matrix | Matrix chat server | aux |
| linkwarden | Bookmark manager | aux |
| changedetection | Web change detection | aux |
| tandoor | Recipe manager | aux |
| n8n | Workflow automation | aux |
| real-estate-crawler | Property crawler | aux |
| tor-proxy | Tor proxy | aux |
| forgejo | Git forge | aux |
| freshrss | RSS reader | aux |
| navidrome | Music streaming | aux |
| networking-toolbox | Network tools | aux |
| stirling-pdf | PDF tools | aux |
| speedtest | Speed testing | aux |
| freedify | Music streaming (factory pattern) | aux |
| netbox | Network documentation | aux |
| infra-maintenance | Maintenance jobs | aux |
| ollama | LLM server (GPU) | gpu |
| frigate | NVR/camera (GPU) | gpu |
| ebook2audiobook | E-book to audio (GPU) | gpu |
| affine | Visual canvas/whiteboard (PostgreSQL + Redis) | aux |
---
## Cloudflare Domains
### Proxied (CDN + WAF enabled)
```
blog, hackmd, privatebin, url, echo, f1tv, excalidraw, send,
audiobookshelf, jsoncrack, ntfy, cyberchef, homepage, linkwarden,
changedetection, tandoor, n8n, stirling-pdf, dashy, city-guesser,
travel, netbox
```
### Non-Proxied (Direct DNS)
```
mail, wg, headscale, immich, calibre, vaultwarden, drone,
mailserver-antispam, mailserver-admin, webhook, uptime,
owntracks, dawarich, tuya, meshcentral, nextcloud, actualbudget,
onlyoffice, forgejo, freshrss, navidrome, ollama, openwebui,
isponsorblocktv, speedtest, freedify, rybbit, paperless,
servarr, prowlarr, bazarr, radarr, sonarr, flaresolverr,
jellyfin, jellyseerr, tdarr, affine
```
### Special Subdomains
- `*.viktor.actualbudget` - Actualbudget factory instances
- `*.freedify` - Freedify factory instances
- `mailserver.*` - Mail server components (antispam, admin)
---
## CI/CD
- Drone CI (`.drone.yml`) for automated deployments
@ -152,10 +361,19 @@ Edge/Aux (tier 3-4):
- **After committing, run `git push origin master`** to sync changes
## Infrastructure
- Proxmox hypervisor for VMs (192.168.1.127)
- Kubernetes cluster with GPU node (5 nodes: k8s-master + k8s-node1-4, running v1.34.2)
- NFS server at 10.0.10.15 for storage
- Redis shared service at `redis.redis.svc.cluster.local`
- Docker registry at 10.0.20.10
### GPU Node (k8s-node1)
- **Taint**: `nvidia.com/gpu=true:NoSchedule` - Only GPU workloads can run here
- **Label**: `gpu=true`
- GPU workloads must have both:
- `node_selector = { "gpu": "true" }`
- `toleration { key = "nvidia.com/gpu", operator = "Equal", value = "true", effect = "NoSchedule" }`
- Taint is applied via `null_resource.gpu_node_taint` in `modules/kubernetes/nvidia/main.tf`
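Together in a pod spec (matching the Terraform pattern used by the GPU services in this repo):
```hcl
node_selector = { "gpu" : "true" }
toleration {
  key      = "nvidia.com/gpu"
  operator = "Equal"
  value    = "true"
  effect   = "NoSchedule"
}
```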
## Git Operations (IMPORTANT)
- **Git is slow** on this repo due to many files - commands can take 30+ seconds
@ -170,10 +388,72 @@ Edge/Aux (tier 3-4):
- Groups: "R730 Host", "Nvidia Tesla T4 GPU", "Power", "Cluster"
- kube-state-metrics provides: `kube_deployment_*`, `kube_statefulset_*`, `kube_daemonset_*`
## Tier System
- **0-core**: Critical infrastructure (ingress, DNS, VPN, auth)
- **1-cluster**: Cluster services (Redis, metrics, security)
- **2-gpu**: GPU workloads (Immich, Ollama, Frigate)
- **3-edge**: User-facing services
- **4-aux**: Optional/auxiliary services
---
## User Preferences
### Calendar
- **Default calendar**: Nextcloud (always use unless otherwise specified)
- **Nextcloud URL**: `https://nextcloud.viktorbarzin.me`
- **CalDAV endpoint**: `https://nextcloud.viktorbarzin.me/remote.php/dav/calendars/<username>/<calendar-name>/`
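A quick CalDAV sanity check (a sketch; placeholders as in the endpoint above, credentials as used by `.claude/calendar-query.py`):
```bash
curl -u "$NEXTCLOUD_USER:$NEXTCLOUD_APP_PASSWORD" \
  -X PROPFIND -H "Depth: 0" \
  "https://nextcloud.viktorbarzin.me/remote.php/dav/calendars/<username>/<calendar-name>/"
```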
### Home Assistant
- **Default smart home**: Home Assistant (always use for smart home control)
- **HA URL**: `https://ha-london.viktorbarzin.me`
- **Script**: `.claude/home-assistant.py`
- **Aliases**: "ha" or "HA" = Home Assistant
### Remote Executor
- **Always use multi-session mode** - never use legacy single-file mode
- Create a session at the start of each conversation with `.claude/session-exec.sh`
- Use session files at `.claude/sessions/$SESSION_ID/` for all remote commands
### Development
- **Frontend framework**: Svelte (user is learning it, so use Svelte for all new web apps)
---
## Skills & Workflows
Skills are specialized workflows for common tasks. Located in `.claude/skills/`.
### Available Skills
**setup-project** (`.claude/skills/setup-project.md`)
- Deploy new self-hosted services from GitHub repos
- Automated workflow: Docker image → Terraform module → Deploy
- Handles database setup, ingress, DNS configuration
- **When to use**: User provides GitHub URL or wants to deploy a new service
- **Example**: "Deploy [GitHub repo] to the cluster"
**setup-remote-executor** (`.claude/skills/setup-remote-executor.md`)
- Set up shared remote executor in new projects
- Creates session-exec.sh wrapper for the shared executor
- **When to use**: Adding Claude Code support to a new project
- **Example**: "Set up remote executor for this project"
---
## Service-Specific Notes
### AFFiNE (Visual Canvas)
- **Image**: `ghcr.io/toeverything/affine:stable`
- **Port**: 3010
- **Requires**: PostgreSQL + Redis
- **Migration**: Init container runs `node ./scripts/self-host-predeploy.js`
- **Storage**: NFS at `/mnt/main/affine` mounted to `/root/.affine/storage` and `/root/.affine/config`
- **Key env vars**:
- `AFFINE_SERVER_EXTERNAL_URL` - Public URL (e.g., `https://affine.viktorbarzin.me`)
- `AFFINE_SERVER_HTTPS` - Set to `true` behind TLS ingress
- `DATABASE_URL` - PostgreSQL connection string
- `REDIS_SERVER_HOST` - Redis hostname
- `MAILER_*` - SMTP configuration for email invites
- **Local-first**: Data stored in browser by default; syncs to server when user creates account
- **Docs**: https://docs.affine.pro/self-host-affine
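A sketch of the env wiring from the notes above (the database URL must come from a secret, not a literal value):
```hcl
env {
  name  = "AFFINE_SERVER_EXTERNAL_URL"
  value = "https://affine.viktorbarzin.me"
}
env {
  name  = "AFFINE_SERVER_HTTPS"
  value = "true" # behind TLS ingress
}
env {
  name  = "REDIS_SERVER_HOST"
  value = "redis.redis.svc.cluster.local"
}
```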

.claude/calendar-query.py Normal file

@ -0,0 +1,391 @@
#!/usr/bin/env python3
"""
Nextcloud CalDAV Calendar Script
Queries and creates calendar events.
"""
import argparse
import json
import os
import sys
import uuid
from datetime import datetime, timedelta
try:
import caldav
from icalendar import Calendar, Event, vText
except ImportError:
print("ERROR: Required packages not installed. Run:")
print(" pip install caldav icalendar")
sys.exit(1)
# Configuration from environment variables
NEXTCLOUD_URL = os.environ.get("NEXTCLOUD_URL", "https://nextcloud.viktorbarzin.me")
CALDAV_URL = f"{NEXTCLOUD_URL}/remote.php/dav"
USERNAME = os.environ.get("NEXTCLOUD_USER")
APP_PASSWORD = os.environ.get("NEXTCLOUD_APP_PASSWORD")
if not USERNAME or not APP_PASSWORD:
print("ERROR: NEXTCLOUD_USER and NEXTCLOUD_APP_PASSWORD environment variables must be set.")
print("These should be set when activating the Claude venv (~/.venvs/claude)")
sys.exit(1)
def get_client():
"""Create CalDAV client connection."""
return caldav.DAVClient(
url=CALDAV_URL,
username=USERNAME,
password=APP_PASSWORD
)
def list_calendars():
"""List all available calendars."""
client = get_client()
principal = client.principal()
calendars = principal.calendars()
result = []
for cal in calendars:
result.append({
"name": cal.name,
"url": str(cal.url)
})
return result
def get_events(calendar_name=None, start_date=None, end_date=None, days=7):
"""Get events from calendar(s) within a date range."""
client = get_client()
principal = client.principal()
calendars = principal.calendars()
if start_date is None:
start_date = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
if end_date is None:
end_date = start_date + timedelta(days=days)
all_events = []
for cal in calendars:
if calendar_name and cal.name.lower() != calendar_name.lower():
continue
try:
events = cal.search(start=start_date, end=end_date, expand=True)
for event in events:
try:
ical = Calendar.from_ical(event.data)
for component in ical.walk():
if component.name == "VEVENT":
event_data = {
"calendar": cal.name,
"summary": str(component.get("summary", "No title")),
"start": None,
"end": None,
"location": str(component.get("location", "")) or None,
"description": str(component.get("description", "")) or None,
"all_day": False
}
dtstart = component.get("dtstart")
dtend = component.get("dtend")
if dtstart:
dt = dtstart.dt
if hasattr(dt, 'hour'):
event_data["start"] = dt.strftime("%Y-%m-%d %H:%M")
else:
event_data["start"] = dt.strftime("%Y-%m-%d")
event_data["all_day"] = True
if dtend:
dt = dtend.dt
if hasattr(dt, 'hour'):
event_data["end"] = dt.strftime("%Y-%m-%d %H:%M")
else:
event_data["end"] = dt.strftime("%Y-%m-%d")
all_events.append(event_data)
                except Exception:
                    pass  # Skip malformed events
except Exception as e:
print(f"Warning: Could not fetch from {cal.name}: {e}", file=sys.stderr)
# Sort by start date
all_events.sort(key=lambda x: x["start"] or "")
return all_events
def create_event(summary, start_time, end_time=None, calendar_name="Personal",
location=None, description=None, all_day=False):
"""Create a new calendar event."""
client = get_client()
principal = client.principal()
calendars = principal.calendars()
# Find the target calendar
target_cal = None
for cal in calendars:
if cal.name.lower() == calendar_name.lower():
target_cal = cal
break
if not target_cal:
# Try partial match
for cal in calendars:
if calendar_name.lower() in cal.name.lower():
target_cal = cal
break
if not target_cal:
raise ValueError(f"Calendar '{calendar_name}' not found. Available: {[c.name for c in calendars]}")
# Create the event
cal = Calendar()
cal.add('prodid', '-//Claude Calendar Script//viktorbarzin.me//')
cal.add('version', '2.0')
event = Event()
event.add('summary', summary)
event.add('uid', str(uuid.uuid4()))
event.add('dtstamp', datetime.now())
if all_day:
event.add('dtstart', start_time.date())
if end_time:
event.add('dtend', end_time.date())
else:
event.add('dtend', (start_time + timedelta(days=1)).date())
else:
event.add('dtstart', start_time)
if end_time:
event.add('dtend', end_time)
else:
# Default to 1 hour duration
event.add('dtend', start_time + timedelta(hours=1))
if location:
event.add('location', location)
if description:
event.add('description', description)
cal.add_component(event)
# Save to calendar
target_cal.save_event(cal.to_ical().decode('utf-8'))
return {
"status": "created",
"summary": summary,
"calendar": target_cal.name,
"start": start_time.strftime("%Y-%m-%d %H:%M") if not all_day else start_time.strftime("%Y-%m-%d"),
"end": end_time.strftime("%Y-%m-%d %H:%M") if end_time and not all_day else None
}
def format_events(events, output_format="text"):
"""Format events for display."""
if output_format == "json":
return json.dumps(events, indent=2)
if not events:
return "No events found."
lines = []
current_date = None
for event in events:
event_date = event["start"][:10] if event["start"] else "Unknown"
if event_date != current_date:
current_date = event_date
try:
dt = datetime.strptime(event_date, "%Y-%m-%d")
lines.append(f"\n## {dt.strftime('%A, %B %d, %Y')}")
            except ValueError:
lines.append(f"\n## {event_date}")
time_str = ""
if not event["all_day"] and event["start"]:
time_str = event["start"][11:16]
if event["end"]:
time_str += f" - {event['end'][11:16]}"
else:
time_str = "All day"
line = f"- **{event['summary']}** ({time_str})"
if event["location"]:
line += f" @ {event['location']}"
if event["calendar"] != "personal":
line += f" [{event['calendar']}]"
lines.append(line)
if event["description"]:
# Truncate long descriptions
desc = event["description"][:200]
if len(event["description"]) > 200:
desc += "..."
lines.append(f" {desc}")
return "\n".join(lines)
def parse_date_arg(date_str):
"""Parse flexible date arguments."""
today = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
if date_str == "today":
return today, today + timedelta(days=1)
elif date_str == "tomorrow":
return today + timedelta(days=1), today + timedelta(days=2)
elif date_str == "week" or date_str == "this week":
# Start from today, go to end of week (Sunday)
days_until_sunday = 6 - today.weekday()
return today, today + timedelta(days=days_until_sunday + 1)
elif date_str == "next week":
days_until_next_monday = 7 - today.weekday()
start = today + timedelta(days=days_until_next_monday)
return start, start + timedelta(days=7)
elif date_str == "month" or date_str == "this month":
return today, today + timedelta(days=30)
else:
# Try to parse as a date
try:
dt = datetime.strptime(date_str, "%Y-%m-%d")
return dt, dt + timedelta(days=1)
        except ValueError:
return today, today + timedelta(days=7)
def parse_datetime(dt_str):
"""Parse flexible datetime strings."""
today = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
# Handle relative dates with time
if dt_str.startswith("today "):
time_part = dt_str.replace("today ", "")
try:
t = datetime.strptime(time_part, "%H:%M")
return today.replace(hour=t.hour, minute=t.minute)
        except ValueError:
pass
if dt_str.startswith("tomorrow "):
time_part = dt_str.replace("tomorrow ", "")
try:
t = datetime.strptime(time_part, "%H:%M")
return (today + timedelta(days=1)).replace(hour=t.hour, minute=t.minute)
        except ValueError:
pass
# Try full datetime format
for fmt in ["%Y-%m-%d %H:%M", "%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M", "%Y-%m-%dT%H:%M:%S"]:
try:
return datetime.strptime(dt_str, fmt)
        except ValueError:
continue
# Try date only
try:
return datetime.strptime(dt_str, "%Y-%m-%d")
    except ValueError:
pass
raise ValueError(f"Could not parse datetime: {dt_str}. Use 'YYYY-MM-DD HH:MM' or 'tomorrow HH:MM'")
def main():
parser = argparse.ArgumentParser(description="Query and manage Nextcloud Calendar")
parser.add_argument("command", choices=["list", "events", "today", "tomorrow", "week", "month", "create"],
help="Command to run")
parser.add_argument("--calendar", "-c", default="Personal", help="Calendar name (default: Personal)")
parser.add_argument("--days", "-d", type=int, default=7, help="Number of days to fetch")
parser.add_argument("--json", action="store_true", help="Output as JSON")
parser.add_argument("--date", help="Specific date (YYYY-MM-DD) or relative (today, tomorrow, week, month)")
# Create event options
parser.add_argument("--title", "-t", help="Event title (for create)")
parser.add_argument("--start", "-s", help="Start time: 'YYYY-MM-DD HH:MM' or 'tomorrow 10:00'")
parser.add_argument("--end", "-e", help="End time: 'YYYY-MM-DD HH:MM' (optional, defaults to +1 hour)")
parser.add_argument("--location", "-l", help="Event location")
parser.add_argument("--description", help="Event description")
parser.add_argument("--all-day", action="store_true", help="Create all-day event")
args = parser.parse_args()
output_format = "json" if args.json else "text"
try:
if args.command == "list":
calendars = list_calendars()
if output_format == "json":
print(json.dumps(calendars, indent=2))
else:
print("Available calendars:")
for cal in calendars:
print(f" - {cal['name']}")
elif args.command == "events":
if args.date:
start, end = parse_date_arg(args.date)
else:
start = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
end = start + timedelta(days=args.days)
events = get_events(
calendar_name=args.calendar,
start_date=start,
end_date=end
)
print(format_events(events, output_format))
elif args.command in ["today", "tomorrow", "week", "month"]:
start, end = parse_date_arg(args.command)
events = get_events(
calendar_name=args.calendar,
start_date=start,
end_date=end
)
print(format_events(events, output_format))
elif args.command == "create":
if not args.title:
print("ERROR: --title is required for create command", file=sys.stderr)
sys.exit(1)
if not args.start:
print("ERROR: --start is required for create command", file=sys.stderr)
sys.exit(1)
# Parse start time
start_time = parse_datetime(args.start)
end_time = parse_datetime(args.end) if args.end else None
result = create_event(
summary=args.title,
start_time=start_time,
end_time=end_time,
calendar_name=args.calendar,
location=args.location,
description=args.description,
all_day=args.all_day
)
if output_format == "json":
print(json.dumps(result, indent=2))
else:
print(f"Event created: {result['summary']}")
print(f" Calendar: {result['calendar']}")
print(f" Start: {result['start']}")
if result['end']:
print(f" End: {result['end']}")
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()

.claude/home-assistant.py Normal file

@ -0,0 +1,373 @@
#!/usr/bin/env python3
"""
Home Assistant API Script
Control and query Home Assistant entities.
"""
import argparse
import json
import os
import sys
try:
import requests
except ImportError:
print("ERROR: Required package not installed. Run:")
print(" pip install requests")
sys.exit(1)
# Configuration from environment variables
HA_URL = os.environ.get("HOME_ASSISTANT_URL", "").rstrip("/")
HA_TOKEN = os.environ.get("HOME_ASSISTANT_TOKEN")
if not HA_URL or not HA_TOKEN:
print("ERROR: HOME_ASSISTANT_URL and HOME_ASSISTANT_TOKEN environment variables must be set.")
print("These should be set when activating the Claude venv (~/.venvs/claude)")
sys.exit(1)
HEADERS = {
"Authorization": f"Bearer {HA_TOKEN}",
"Content-Type": "application/json",
}
def api_get(endpoint):
"""Make GET request to HA API."""
url = f"{HA_URL}/api/{endpoint}"
response = requests.get(url, headers=HEADERS, timeout=30)
response.raise_for_status()
return response.json()
def api_post(endpoint, data=None):
"""Make POST request to HA API."""
url = f"{HA_URL}/api/{endpoint}"
response = requests.post(url, headers=HEADERS, json=data or {}, timeout=30)
response.raise_for_status()
return response.json() if response.text else {}
def get_states():
"""Get all entity states."""
return api_get("states")
def get_state(entity_id):
"""Get state of a specific entity."""
return api_get(f"states/{entity_id}")
def get_services():
"""Get all available services."""
return api_get("services")
def call_service(domain, service, entity_id=None, data=None):
"""Call a Home Assistant service."""
payload = data or {}
if entity_id:
payload["entity_id"] = entity_id
return api_post(f"services/{domain}/{service}", payload)
def list_entities(domain_filter=None, area_filter=None):
"""List all entities, optionally filtered by domain or area."""
states = get_states()
entities = []
for state in states:
entity_id = state["entity_id"]
domain = entity_id.split(".")[0]
if domain_filter and domain != domain_filter:
continue
entities.append({
"entity_id": entity_id,
"state": state["state"],
"friendly_name": state["attributes"].get("friendly_name", entity_id),
"domain": domain,
})
# Sort by domain, then entity_id
entities.sort(key=lambda x: (x["domain"], x["entity_id"]))
return entities
def turn_on(entity_id):
"""Turn on an entity."""
domain = entity_id.split(".")[0]
return call_service(domain, "turn_on", entity_id)
def turn_off(entity_id):
"""Turn off an entity."""
domain = entity_id.split(".")[0]
return call_service(domain, "turn_off", entity_id)
def toggle(entity_id):
"""Toggle an entity."""
domain = entity_id.split(".")[0]
return call_service(domain, "toggle", entity_id)
def set_value(entity_id, value):
"""Set value for input entities (input_number, input_text, etc.)."""
domain = entity_id.split(".")[0]
if domain == "input_number":
return call_service(domain, "set_value", entity_id, {"value": float(value)})
elif domain == "input_text":
return call_service(domain, "set_value", entity_id, {"value": str(value)})
elif domain == "input_boolean":
if value.lower() in ("true", "on", "1", "yes"):
return turn_on(entity_id)
else:
return turn_off(entity_id)
elif domain == "input_select":
return call_service(domain, "select_option", entity_id, {"option": str(value)})
elif domain == "light":
# Assume value is brightness percentage
return call_service(domain, "turn_on", entity_id, {"brightness_pct": int(value)})
elif domain == "climate":
return call_service(domain, "set_temperature", entity_id, {"temperature": float(value)})
elif domain == "cover":
return call_service(domain, "set_cover_position", entity_id, {"position": int(value)})
else:
print(f"Warning: set_value not implemented for domain '{domain}'", file=sys.stderr)
return {}
def run_script(script_id):
"""Run a script."""
if not script_id.startswith("script."):
script_id = f"script.{script_id}"
return call_service("script", "turn_on", script_id)
def run_scene(scene_id):
"""Activate a scene."""
if not scene_id.startswith("scene."):
scene_id = f"scene.{scene_id}"
return call_service("scene", "turn_on", scene_id)
def send_notification(message, title=None, target="notify"):
"""Send a notification."""
data = {"message": message}
if title:
data["title"] = title
return call_service("notify", target, data=data)
def format_entities(entities, output_format="text"):
"""Format entities for display."""
if output_format == "json":
return json.dumps(entities, indent=2)
if not entities:
return "No entities found."
lines = []
current_domain = None
for entity in entities:
if entity["domain"] != current_domain:
current_domain = entity["domain"]
lines.append(f"\n## {current_domain}")
state = entity["state"]
name = entity["friendly_name"]
eid = entity["entity_id"]
# Color-code common states
if state in ("on", "home", "open", "playing"):
state_display = f"[ON] {state}"
elif state in ("off", "away", "closed", "idle", "paused"):
state_display = f"[--] {state}"
elif state == "unavailable":
state_display = "[??] unavailable"
else:
state_display = state
lines.append(f"- {name}: {state_display}")
lines.append(f" `{eid}`")
return "\n".join(lines)
def search_entities(query):
"""Search entities by name or ID."""
query = query.lower()
states = get_states()
matches = []
for state in states:
entity_id = state["entity_id"]
friendly_name = state["attributes"].get("friendly_name", "").lower()
if query in entity_id.lower() or query in friendly_name:
matches.append({
"entity_id": entity_id,
"state": state["state"],
"friendly_name": state["attributes"].get("friendly_name", entity_id),
"domain": entity_id.split(".")[0],
})
matches.sort(key=lambda x: (x["domain"], x["entity_id"]))
return matches
def main():
parser = argparse.ArgumentParser(description="Control Home Assistant")
subparsers = parser.add_subparsers(dest="command", help="Command to run")
# List command
list_parser = subparsers.add_parser("list", help="List entities")
list_parser.add_argument("--domain", "-d", help="Filter by domain (light, switch, sensor, etc.)")
list_parser.add_argument("--json", action="store_true", help="Output as JSON")
# Search command
search_parser = subparsers.add_parser("search", help="Search entities")
search_parser.add_argument("query", help="Search query")
search_parser.add_argument("--json", action="store_true", help="Output as JSON")
# State command
state_parser = subparsers.add_parser("state", help="Get entity state")
state_parser.add_argument("entity_id", help="Entity ID")
state_parser.add_argument("--json", action="store_true", help="Output as JSON")
# On command
on_parser = subparsers.add_parser("on", help="Turn on entity")
on_parser.add_argument("entity_id", help="Entity ID")
# Off command
off_parser = subparsers.add_parser("off", help="Turn off entity")
off_parser.add_argument("entity_id", help="Entity ID")
# Toggle command
toggle_parser = subparsers.add_parser("toggle", help="Toggle entity")
toggle_parser.add_argument("entity_id", help="Entity ID")
# Set command
set_parser = subparsers.add_parser("set", help="Set entity value")
set_parser.add_argument("entity_id", help="Entity ID")
set_parser.add_argument("value", help="Value to set")
# Script command
script_parser = subparsers.add_parser("script", help="Run a script")
script_parser.add_argument("script_id", help="Script ID (with or without 'script.' prefix)")
# Scene command
scene_parser = subparsers.add_parser("scene", help="Activate a scene")
scene_parser.add_argument("scene_id", help="Scene ID (with or without 'scene.' prefix)")
# Service command
service_parser = subparsers.add_parser("service", help="Call a service")
service_parser.add_argument("domain", help="Service domain")
service_parser.add_argument("service", help="Service name")
service_parser.add_argument("--entity", "-e", help="Entity ID")
service_parser.add_argument("--data", "-d", help="JSON data")
# Services list command
services_parser = subparsers.add_parser("services", help="List available services")
services_parser.add_argument("--domain", "-d", help="Filter by domain")
services_parser.add_argument("--json", action="store_true", help="Output as JSON")
# Notify command
notify_parser = subparsers.add_parser("notify", help="Send notification")
notify_parser.add_argument("message", help="Notification message")
notify_parser.add_argument("--title", "-t", help="Notification title")
notify_parser.add_argument("--target", default="notify", help="Notification target (default: notify)")
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
try:
if args.command == "list":
entities = list_entities(domain_filter=args.domain)
output_format = "json" if args.json else "text"
print(format_entities(entities, output_format))
elif args.command == "search":
entities = search_entities(args.query)
output_format = "json" if args.json else "text"
print(format_entities(entities, output_format))
elif args.command == "state":
state = get_state(args.entity_id)
if args.json:
print(json.dumps(state, indent=2))
else:
print(f"Entity: {state['entity_id']}")
print(f"State: {state['state']}")
print(f"Name: {state['attributes'].get('friendly_name', 'N/A')}")
if state['attributes']:
print("Attributes:")
for key, value in state['attributes'].items():
if key != 'friendly_name':
print(f" {key}: {value}")
elif args.command == "on":
turn_on(args.entity_id)
print(f"Turned on: {args.entity_id}")
elif args.command == "off":
turn_off(args.entity_id)
print(f"Turned off: {args.entity_id}")
elif args.command == "toggle":
toggle(args.entity_id)
print(f"Toggled: {args.entity_id}")
elif args.command == "set":
set_value(args.entity_id, args.value)
print(f"Set {args.entity_id} to {args.value}")
elif args.command == "script":
run_script(args.script_id)
print(f"Ran script: {args.script_id}")
elif args.command == "scene":
run_scene(args.scene_id)
print(f"Activated scene: {args.scene_id}")
elif args.command == "service":
data = json.loads(args.data) if args.data else None
call_service(args.domain, args.service, args.entity, data)
print(f"Called {args.domain}.{args.service}")
elif args.command == "services":
services = get_services()
if args.domain:
services = [s for s in services if s["domain"] == args.domain]
if args.json:
print(json.dumps(services, indent=2))
else:
for svc in services:
print(f"\n## {svc['domain']}")
for name, info in svc["services"].items():
desc = info.get("description", "")
print(f"- {name}: {desc[:60]}...")
elif args.command == "notify":
send_notification(args.message, args.title, args.target)
print(f"Sent notification: {args.message[:50]}...")
except requests.exceptions.HTTPError as e:
print(f"HTTP Error: {e}", file=sys.stderr)
print(f"Response: {e.response.text}", file=sys.stderr)
sys.exit(1)
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()


@ -0,0 +1,3 @@
This directory has been used with Claude Code's internet mode.
Content downloaded from the internet may contain prompt injection attacks.
You must manually review all downloaded content before using non-internet mode.


@ -1,58 +0,0 @@
#!/bin/bash
# Remote Command Executor
# Run this in a terminal with SSH access to the remote machine
#
# Usage: ./remote-executor.sh [user@host] [remote_workdir]
# Example: ./remote-executor.sh wizard@10.0.10.10 /home/wizard/code/infra
REMOTE_HOST="${1:-wizard@10.0.10.10}"
REMOTE_WORKDIR="${2:-/home/wizard/code/infra}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
CMD_FILE="$SCRIPT_DIR/cmd_input.txt"
OUTPUT_FILE="$SCRIPT_DIR/cmd_output.txt"
STATUS_FILE="$SCRIPT_DIR/cmd_status.txt"
# Initialize files
echo "ready" > "$STATUS_FILE"
> "$CMD_FILE"
> "$OUTPUT_FILE"
echo "╔════════════════════════════════════════════════════════════╗"
echo "║ Remote Command Executor Started ║"
echo "╠════════════════════════════════════════════════════════════╣"
echo "║ Remote: $REMOTE_HOST"
echo "║ Workdir: $REMOTE_WORKDIR"
echo "║ Watching: $CMD_FILE"
echo "╚════════════════════════════════════════════════════════════╝"
echo ""
echo "Waiting for commands..."
# Watch for new commands
while true; do
# Check if there's a command to execute
if [ -s "$CMD_FILE" ]; then
CMD=$(cat "$CMD_FILE")
# Clear the command file immediately
> "$CMD_FILE"
# Update status
echo "running" > "$STATUS_FILE"
echo "[$(date '+%H:%M:%S')] Executing: $CMD"
# Execute on remote and capture output
ssh "$REMOTE_HOST" "cd $REMOTE_WORKDIR && $CMD" > "$OUTPUT_FILE" 2>&1
EXIT_CODE=$?
# Append exit code to output
echo "" >> "$OUTPUT_FILE"
echo "---EXIT_CODE:$EXIT_CODE---" >> "$OUTPUT_FILE"
# Update status
echo "done:$EXIT_CODE" > "$STATUS_FILE"
echo "[$(date '+%H:%M:%S')] Done (exit: $EXIT_CODE)"
fi
sleep 0.2
done

.claude/session_id.txt Normal file

@ -0,0 +1 @@
1769818605-3230-4984


@ -0,0 +1,175 @@
---
name: bluestacks-burp-interception
description: |
Intercept Android app HTTPS traffic using BlueStacks and Burp Suite on macOS.
Use when: (1) Need to analyze Android app API calls, (2) App ignores HTTP proxy,
(3) App uses SSL pinning that blocks interception, (4) Need to install Burp CA
as system certificate. Covers ADB setup, proxy configuration, Zygisk SSL unpinning,
and Magisk trustusercerts module for system CA installation.
author: Claude Code
version: 1.0.0
date: 2026-01-24
---
# BlueStacks + Burp Suite HTTPS Traffic Interception
## Problem
You want to intercept HTTPS traffic from an Android app running in BlueStacks to analyze
API calls, but the app either ignores the proxy or uses SSL certificate pinning.
## Context / Trigger Conditions
- Running BlueStacks on macOS with Burp Suite
- App traffic not appearing in Burp Suite
- App crashes or refuses to connect when proxy is set
- Need to bypass SSL pinning for security testing/research
## Prerequisites
- BlueStacks with Magisk (kitsune variant) and root enabled
- Zygisk-SSL-Unpinning module installed
- trustusercerts Magisk module installed
- Android SDK installed (for ADB)
- Burp Suite running on port 8080
## Solution
### Step 1: Connect ADB to BlueStacks
```bash
# ADB location on macOS (Android SDK)
ADB=~/Library/Android/sdk/platform-tools/adb
# Connect to BlueStacks
$ADB connect localhost:5555
# Verify connection
$ADB devices
# Should show: emulator-5554 or localhost:5555
```
Note: BlueStacks runs **arm64-v8a** (not x86 as you might expect).
### Step 2: Set HTTP Proxy
Use your Mac's WiFi IP address (not 10.0.2.2 or localhost):
```bash
# Get Mac WiFi IP
IP=$(ipconfig getifaddr en0)
# Set proxy (Burp default port 8080)
$ADB shell settings put global http_proxy ${IP}:8080
# Verify
$ADB shell settings get global http_proxy
# Disable proxy when done
$ADB shell settings put global http_proxy :0
```
### Step 3: Configure SSL Unpinning for Target App
```bash
# Find app package name
$ADB shell pm list packages | grep <keyword>
# Edit config
$ADB shell "su -c 'cat > /data/local/tmp/zyg.ssl/config.json << EOF
{
\"targets\": [
{
\"pkg_name\" : \"com.example.app\",
\"enable\": true,
\"start_safe\": true,
\"start_delay\": 1000
}
]
}
EOF'"
# Restart the app
$ADB shell am force-stop com.example.app
$ADB shell monkey -p com.example.app -c android.intent.category.LAUNCHER 1
# Verify SSL unpinning is active
$ADB shell "logcat -d | grep -i ZygiskSSL | tail -10"
# Should show: "App detected: com.example.app" and "[*] SSL UNPINNING [#]"
```
### Step 4: Install Burp CA as System Certificate
```bash
# Download Burp CA cert
curl -x http://127.0.0.1:8080 http://burp/cert -o /tmp/burp-cert.der
# Convert to PEM
openssl x509 -inform DER -in /tmp/burp-cert.der -out /tmp/burp-cert.pem
# Get hash for Android cert store naming
HASH=$(openssl x509 -inform PEM -subject_hash_old -in /tmp/burp-cert.pem | head -1)
cp /tmp/burp-cert.pem /tmp/${HASH}.0
# Push to device
$ADB push /tmp/${HASH}.0 /sdcard/
# Install via trustusercerts Magisk module
$ADB shell "su -c 'cp /sdcard/${HASH}.0 /data/adb/modules/trustusercerts/system/etc/security/cacerts/'"
$ADB shell "su -c 'chmod 644 /data/adb/modules/trustusercerts/system/etc/security/cacerts/${HASH}.0'"
# Reboot required for Magisk overlay
$ADB shell "su -c 'reboot'"
# After reboot, verify cert is in system store
$ADB shell "su -c 'ls /system/etc/security/cacerts/${HASH}.0'"
```
### Step 5: Test Interception
1. Re-enable proxy after reboot: `$ADB shell settings put global http_proxy ${IP}:8080`
2. Launch target app
3. Check Burp Suite → Proxy → HTTP history for requests
## Verification
- Proxy set: `adb shell settings get global http_proxy` returns `<ip>:8080`
- SSL unpinning active: `logcat | grep ZygiskSSL` shows "SSL UNPINNING"
- Burp CA installed: `ls /system/etc/security/cacerts/<hash>.0` exists
- Traffic visible in Burp Suite HTTP history
## Troubleshooting
| Symptom | Cause | Fix |
|---------|-------|-----|
| No traffic in Burp | Proxy not set | Check `settings get global http_proxy` |
| App shows SSL error | Cert not installed | Verify cert in system store, reboot |
| SSL unpinning not working | Config not loaded | Force-stop app, check config.json syntax |
| ADB connection refused | BlueStacks ADB disabled | Enable in BlueStacks Settings → Advanced |
| Wrong cert hash | Using wrong openssl flag | Use `subject_hash_old` not `subject_hash` |
## Notes
- BlueStacks runs arm64-v8a, so Zygisk modules need arm64 support
- The trustusercerts module copies certs at boot via Magisk overlay
- System partition is read-only; use Magisk modules instead of direct mounting
- Burp cert hash is typically `9a5ba575` but verify for your instance
- Some apps may use additional protections (root detection, Frida detection)
## Quick Reference
```bash
# Set proxy
adb shell settings put global http_proxy <ip>:8080
# Disable proxy
adb shell settings put global http_proxy :0
# Check SSL unpinning logs
adb shell "logcat -d | grep -i ZygiskSSL"
# Force restart app
adb shell am force-stop <package> && adb shell monkey -p <package> -c android.intent.category.LAUNCHER 1
```
## References
- [Zygisk-SSL-Unpinning](https://github.com/m0szy/Zygisk-SSL-Unpinning)
- [MagiskTrustUserCerts](https://github.com/NVISOsecurity/MagiskTrustUserCerts)
- [Burp Suite Documentation](https://portswigger.net/burp/documentation)

@ -0,0 +1 @@
Subproject commit 7d7f5915f90db26e1a3fc52db6ac2e68d6d705a2


@ -0,0 +1,310 @@
---
name: fastapi-svelte-gpu-webui
description: |
Pattern for building web UIs for GPU-based CLI tools. Use when:
(1) Wrapping a command-line tool with a web interface, (2) Building job queue
systems for long-running GPU tasks, (3) Creating file upload/download workflows,
(4) Need real-time progress updates via WebSocket, (5) Deploying to Kubernetes
with GPU scheduling. Covers FastAPI backend, Svelte 5 frontend, NFS storage,
and Terraform deployment.
author: Claude Code
version: 1.0.0
date: 2025-01-31
---
# FastAPI + Svelte GPU WebUI Pattern
## Problem
Many powerful tools are command-line only, making them inaccessible to non-technical
users. Building a web UI requires handling file uploads, job queuing, progress tracking,
and GPU resource scheduling.
## Context / Trigger Conditions
- You have a CLI tool that does heavy processing (ML inference, media conversion, etc.)
- Want to add a web interface for easier access
- Need to track long-running job progress
- Deploying to Kubernetes with GPU nodes
- Files need to persist across pod restarts (NFS storage)
## Solution Overview
### Directory Structure
```
project-web/
├── backend/
│ ├── main.py # FastAPI app
│ ├── api/
│ │ ├── __init__.py
│ │ └── routes.py # REST endpoints
│ ├── services/
│ │ ├── __init__.py
│ │ └── converter.py # CLI wrapper + job manager
│ ├── models/
│ │ ├── __init__.py
│ │ └── schemas.py # Pydantic models
│ └── requirements.txt
├── frontend/
│ ├── src/
│ │ ├── App.svelte
│ │ ├── lib/
│ │ │ ├── FileUpload.svelte
│ │ │ ├── JobsList.svelte
│ │ │ └── ProgressBar.svelte
│ │ └── stores/
│ │ └── jobs.js
│ ├── package.json
│ └── vite.config.js
├── Dockerfile
└── README.md
```
### Backend: Job Manager Pattern
```python
# services/converter.py
import asyncio
import re
import uuid
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Optional, Callable

@dataclass
class Job:
    id: str
    filename: str
    status: str  # pending, processing, completed, failed
    progress: float
    created_at: datetime
    output_file: Optional[str] = None
    error: Optional[str] = None
class JobManager:
def __init__(self, storage_path: str = "/mnt"):
self.storage_path = Path(storage_path)
self.jobs: dict[str, Job] = {}
self.progress_callbacks: dict[str, list[Callable]] = {}
def create_job(self, filename: str, **options) -> Job:
job_id = str(uuid.uuid4())
job = Job(
id=job_id,
filename=filename,
status="pending",
progress=0.0,
created_at=datetime.now(),
**options
)
self.jobs[job_id] = job
return job
async def run_conversion(self, job_id: str):
job = self.jobs[job_id]
job.status = "processing"
input_path = self.storage_path / "uploads" / job.filename
output_dir = self.storage_path / "outputs" / job_id
output_dir.mkdir(parents=True, exist_ok=True)
# Build command for CLI tool
cmd = [
"/path/to/cli-tool",
str(input_path),
"-o", str(output_dir),
# Add other options...
]
# Run with output capture for progress parsing
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
# Parse output for progress updates
async def read_output(stream):
while True:
line = await stream.readline()
if not line:
break
line_str = line.decode().strip()
                # Parse a percentage like "42%" from the CLI output
                if "%" in line_str:
                    match = re.search(r"(\d+(?:\.\d+)?)\s*%", line_str)
                    if match:
                        job.progress = float(match.group(1))
await asyncio.gather(
read_output(process.stdout),
read_output(process.stderr)
)
returncode = await process.wait()
if returncode == 0:
output_files = list(output_dir.glob("*.m4b"))
if output_files:
job.output_file = output_files[0].name
job.status = "completed"
else:
job.status = "failed"
job.error = f"Exit code {returncode}"
job_manager = JobManager()
```
### Backend: API Routes
```python
# api/routes.py
import asyncio
import shutil
from pathlib import Path

from fastapi import APIRouter, UploadFile, File, HTTPException
from fastapi.responses import FileResponse

# Local imports assumed from the directory structure above
from models.schemas import JobCreate
from services.converter import job_manager
router = APIRouter(prefix="/api")
@router.post("/upload")
async def upload_file(file: UploadFile = File(...)):
upload_dir = Path("/mnt/uploads")
upload_dir.mkdir(parents=True, exist_ok=True)
file_path = upload_dir / file.filename
with file_path.open("wb") as buffer:
shutil.copyfileobj(file.file, buffer)
return {"filename": file.filename, "size": file_path.stat().st_size}
@router.post("/jobs")
async def create_job(request: JobCreate):
job = job_manager.create_job(filename=request.filename, ...)
asyncio.create_task(job_manager.run_conversion(job.id))
return job
@router.get("/jobs")
async def list_jobs():
return job_manager.get_all_jobs()
@router.get("/jobs/{job_id}/download")
async def download_job(job_id: str):
job = job_manager.get_job(job_id)
if not job or job.status != "completed":
raise HTTPException(404)
output_path = Path("/mnt/outputs") / job_id / job.output_file
return FileResponse(output_path, filename=job.output_file)
```
### Frontend: Svelte 5 Components
```svelte
<!-- FileUpload.svelte -->
<script>
let { onUpload } = $props();
let dragOver = $state(false);
let uploading = $state(false);
async function handleUpload(file) {
uploading = true;
const formData = new FormData();
formData.append('file', file);
const response = await fetch('/api/upload', {
method: 'POST',
body: formData
});
if (response.ok) {
const data = await response.json();
onUpload(data.filename);
}
uploading = false;
}
</script>
<div class="dropzone"
class:dragover={dragOver}
ondragover={(e) => { e.preventDefault(); dragOver = true; }}
ondragleave={() => dragOver = false}
ondrop={(e) => { e.preventDefault(); handleUpload(e.dataTransfer.files[0]); }}>
Drop file here
</div>
```
### Dockerfile
```dockerfile
FROM python:3.12-slim
# Install Node for frontend build
RUN apt-get update && apt-get install -y nodejs npm
# Build frontend
COPY frontend/ /app/frontend/
WORKDIR /app/frontend
RUN npm install && npm run build
# Install backend
COPY backend/ /app/backend/
WORKDIR /app/backend
RUN pip install -r requirements.txt
# Serve static files from FastAPI
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
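The `main.py` that the CMD above expects could look like this (a sketch; the `dist` path assumes Vite's default build output):
```python
# main.py
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from api.routes import router

app = FastAPI()
app.include_router(router)
# Serve the built Svelte frontend; html=True falls back to index.html
app.mount("/", StaticFiles(directory="../frontend/dist", html=True), name="static")
```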
### Terraform Deployment (GPU)
```hcl
resource "kubernetes_deployment" "myapp" {
spec {
template {
spec {
node_selector = { "gpu" : "true" }
toleration {
key = "nvidia.com/gpu"
operator = "Equal"
value = "true"
effect = "NoSchedule"
}
container {
image = "myregistry/myapp@sha256:..."
name = "myapp"
resources {
limits = { "nvidia.com/gpu" = "1" }
}
volume_mount {
name = "data"
mount_path = "/mnt"
}
}
volume {
name = "data"
nfs {
server = "10.0.10.15"
path = "/mnt/main/myapp"
}
}
}
}
}
}
```
## Verification
1. Upload a file via the UI
2. Start a conversion job
3. Watch progress update in real-time
4. Download the completed file
5. Verify files persist across pod restarts
## Notes
- Use image digest for reliable deployments (see `k8s-docker-registry-cache-bypass` skill)
- NFS storage persists across pod restarts
- GPU node taints require matching tolerations
- Consider adding job persistence (database) for production use
- WebSocket can provide smoother progress updates than polling
## See Also
- `k8s-docker-registry-cache-bypass` - Fixing image cache issues
- `k8s-gpu-no-nvidia-devices` - GPU device troubleshooting
- `python-filename-sanitization` - Secure file handling


@ -0,0 +1,191 @@
---
name: home-assistant
description: |
Control Home Assistant smart home devices and automations. Use when:
(1) User asks to turn on/off lights, switches, or devices,
(2) User asks about the state of sensors, devices, or entities,
(3) User says "turn on the lights", "set temperature", "lock the door",
(4) User asks to run a scene or script,
(5) User asks "what devices are on?" or "is the door locked?",
(6) User mentions smart home, IoT, or home automation.
Always use Home Assistant for smart home control.
author: Claude Code
version: 1.0.0
date: 2025-01-25
---
# Home Assistant Control
## Problem
Need to control smart home devices, check sensor states, or run automations via Home Assistant.
## Context / Trigger Conditions
- User asks to control lights, switches, covers, climate, etc.
- User asks about device states ("is the light on?", "what's the temperature?")
- User wants to run a scene or script
- User mentions turning things on/off
- User asks about smart home devices
## Prerequisites
- Remote executor must be running (Python commands require remote execution)
- The `~/.venvs/claude` virtualenv must have `requests` package installed
- Environment variables `HOME_ASSISTANT_URL` and `HOME_ASSISTANT_TOKEN` must be set in the venv activation script
## Solution
### Script Location
```
/home/wizard/code/infra/.claude/home-assistant.py
```
### Execution Pattern (CRITICAL)
Always use the remote executor with venv activation to get environment variables:
```bash
source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra && python .claude/home-assistant.py [command] [options]
```
### Available Commands
#### List Entities
```bash
# List all entities
python .claude/home-assistant.py list
# List by domain
python .claude/home-assistant.py list --domain light
python .claude/home-assistant.py list --domain switch
python .claude/home-assistant.py list --domain sensor
python .claude/home-assistant.py list --domain climate
python .claude/home-assistant.py list --domain cover
# JSON output
python .claude/home-assistant.py list --json
```
#### Search Entities
```bash
# Search by name or ID
python .claude/home-assistant.py search "living room"
python .claude/home-assistant.py search "temperature"
python .claude/home-assistant.py search "door"
```
#### Get Entity State
```bash
python .claude/home-assistant.py state light.living_room
python .claude/home-assistant.py state sensor.temperature
python .claude/home-assistant.py state --json light.living_room
```
#### Control Entities
```bash
# Turn on/off
python .claude/home-assistant.py on light.living_room
python .claude/home-assistant.py off switch.tv
python .claude/home-assistant.py toggle light.bedroom
# Set values
python .claude/home-assistant.py set light.living_room 75 # brightness %
python .claude/home-assistant.py set climate.thermostat 22 # temperature
python .claude/home-assistant.py set cover.blinds 50 # position %
python .claude/home-assistant.py set input_number.volume 80 # numeric value
python .claude/home-assistant.py set input_boolean.away_mode on # boolean
python .claude/home-assistant.py set input_select.mode "Night" # select option
```
#### Run Scenes and Scripts
```bash
# Activate a scene
python .claude/home-assistant.py scene movie_night
python .claude/home-assistant.py scene scene.good_morning
# Run a script
python .claude/home-assistant.py script bedtime_routine
python .claude/home-assistant.py script script.welcome_home
```
#### Call Any Service
```bash
# Generic service call
python .claude/home-assistant.py service light turn_on --entity light.kitchen --data '{"brightness": 255}'
python .claude/home-assistant.py service climate set_hvac_mode --entity climate.living_room --data '{"hvac_mode": "heat"}'
python .claude/home-assistant.py service media_player play_media --entity media_player.tv --data '{"media_content_id": "...", "media_content_type": "video"}'
```
#### List Services
```bash
# List all available services
python .claude/home-assistant.py services
# Filter by domain
python .claude/home-assistant.py services --domain light
python .claude/home-assistant.py services --domain climate
```
#### Send Notifications
```bash
python .claude/home-assistant.py notify "Door left open!"
python .claude/home-assistant.py notify "Motion detected" --title "Security Alert"
python .claude/home-assistant.py notify "Hello" --target notify.mobile_app
```
## Complete Example via Remote Executor
To turn on the living room light:
1. Write command to remote executor:
```bash
# Write the command to cmd_input.txt (e.g. via the Write tool or echo):
echo 'source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra && python .claude/home-assistant.py on light.living_room' > /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_input.txt
```
2. Wait and check status:
```bash
sleep 2 && cat /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_status.txt
```
3. Read output:
```bash
cat /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_output.txt
```
## Common Entity Domains
| Domain | Description | Common Actions |
|--------|-------------|----------------|
| `light` | Lights | on, off, toggle, set brightness |
| `switch` | Switches | on, off, toggle |
| `sensor` | Sensors | state (read-only) |
| `binary_sensor` | Binary sensors | state (read-only) |
| `climate` | Thermostats | set temperature, set mode |
| `cover` | Blinds/covers | open, close, set position |
| `lock` | Locks | lock, unlock |
| `media_player` | Media devices | play, pause, volume |
| `input_boolean` | Helper toggles | on, off |
| `input_number` | Helper numbers | set value |
| `input_select` | Helper dropdowns | select option |
| `script` | Scripts | run |
| `scene` | Scenes | activate |
| `automation` | Automations | trigger, on, off |
## Verification
- Commands print confirmation message on success
- Use `state` command to verify entity changed
- Exit code 0 = success, 1 = error
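Putting these together, a minimal end-to-end check (a sketch assuming the venv pattern above; `light.living_room` is an illustrative entity ID):
```bash
# Turn the light on, then read back its state to confirm the change
source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra
python .claude/home-assistant.py on light.living_room
python .claude/home-assistant.py state light.living_room  # expect "on"
echo "exit code: $?"  # 0 = success, 1 = error
```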
## Common Errors
| Error | Cause | Fix |
|-------|-------|-----|
| `HOME_ASSISTANT_URL and HOME_ASSISTANT_TOKEN must be set` | Didn't source venv activation | Use `source ~/.venvs/claude/bin/activate && python ...` |
| `404 Not Found` | Entity doesn't exist | Use `search` command to find correct entity ID |
| `401 Unauthorized` | Token invalid/expired | Generate new long-lived token in HA |
| `Connection refused` | HA not reachable | Check URL and network connectivity |
## Notes
1. **Entity IDs are case-sensitive** - use `search` to find exact IDs
2. **Token must have sufficient permissions** - ensure token has access to all entities
3. **Some entities require specific data** - use `services` command to see required fields
4. **HA URL**: `https://ha-london.viktorbarzin.me`
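For a quick connectivity and token check independent of any entity, the HA REST API root can be queried directly (a sketch assuming `curl` is available and the env vars are exported by the venv activation script):
```bash
source ~/.venvs/claude/bin/activate
curl -s -H "Authorization: Bearer $HOME_ASSISTANT_TOKEN" "$HOME_ASSISTANT_URL/api/"
# A healthy instance responds with: {"message": "API running."}
```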

View file

@ -0,0 +1,110 @@
---
name: k8s-docker-registry-cache-bypass
description: |
Fix for Kubernetes pods running old Docker images despite pushing new versions.
Use when: (1) kubectl shows correct image tag but container runs old code,
(2) Local registry mirror caches stale images, (3) imagePullPolicy: Always
doesn't force fresh pulls, (4) containerd config has mirror that intercepts pulls.
Solution: Use image digest instead of tag to bypass cache entirely.
author: Claude Code
version: 1.0.0
date: 2025-01-31
---
# Kubernetes Docker Registry Cache Bypass
## Problem
Kubernetes pods continue running old Docker images even after pushing new versions with
the same tag (e.g., `:latest`). This happens when a local registry mirror caches images
and serves stale versions, ignoring `imagePullPolicy: Always`.
## Context / Trigger Conditions
- Pod is running but application code is outdated
- `docker push` succeeded with new layers
- `kubectl describe pod` shows correct image tag
- Cluster has a local registry mirror configured (e.g., in containerd config)
- `imagePullPolicy: Always` doesn't fix the issue
- Nodes configured with registry mirrors at `/etc/containerd/certs.d/` or similar
## Solution
### 1. Get the image digest after pushing
```bash
docker push viktorbarzin/myimage:latest
# Output includes: latest: digest: sha256:abc123... size: 856
```
### 2. Use digest instead of tag in deployment
```hcl
# Terraform
container {
# Use digest to bypass local registry cache
image = "docker.io/viktorbarzin/myimage@sha256:abc123..."
image_pull_policy = "Always"
name = "myimage"
}
```
```yaml
# Kubernetes YAML
containers:
  - name: myimage
    image: docker.io/viktorbarzin/myimage@sha256:abc123...
    imagePullPolicy: Always
```
### 3. Apply and restart
```bash
terraform apply -target=module.kubernetes_cluster.module.myservice
kubectl rollout restart deployment/myservice -n mynamespace
```
## Why This Works
- Registry mirrors match by tag, not digest
- When you specify a digest, the node must fetch that exact manifest
- The mirror may not have the digest cached, forcing a pull from upstream
- Even if cached, the digest guarantees the exact image version
## Verification
```bash
# Check the pod is using the new image
kubectl get pod -n mynamespace -o jsonpath='{.items[*].spec.containers[*].image}'
# Verify application behavior reflects new code
kubectl exec -n mynamespace deploy/myservice -- <verification-command>
```
## Example
Before (problematic):
```hcl
image = "docker.io/viktorbarzin/audiblez-web:latest"
```
After (fixed):
```hcl
image = "docker.io/viktorbarzin/audiblez-web@sha256:4d0e2c839555e2229bc91a0b1273569bac88529e8b3c3cadad3c3cf9d865fa29"
```
## Notes
- You must update the digest each time you push a new image
- Consider automating digest extraction in CI/CD pipelines (see the sketch below)
- This is a workaround; ideally fix the registry mirror configuration
- To find your registry mirror config: `cat /etc/containerd/config.toml` on nodes
- Common mirror locations: `/etc/containerd/certs.d/docker.io/hosts.toml`
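A minimal sketch of that automation using the standard docker CLI (image and file names are illustrative):
```bash
docker push viktorbarzin/myimage:latest
# RepoDigests is populated after a successful push/pull; index 0 is the first known digest
DIGEST=$(docker inspect --format '{{index .RepoDigests 0}}' viktorbarzin/myimage:latest)
echo "image = \"docker.io/${DIGEST}\""  # paste into the Terraform container block
```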
## Diagnosing Registry Mirror Issues
```bash
# On a k8s node, check containerd config
grep -A5 mirrors /etc/containerd/config.toml
# Check if the mirror is intercepting pulls (crictl's global --debug flag goes before the subcommand)
crictl --debug pull docker.io/library/alpine:latest 2>&1 | grep -i mirror
# List cached images on node
crictl images | grep myimage
```
## References
- [Kubernetes imagePullPolicy documentation](https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy)
- [containerd registry configuration](https://github.com/containerd/containerd/blob/main/docs/hosts.md)

View file

@ -0,0 +1,147 @@
---
name: k8s-gpu-no-nvidia-devices
description: |
Fix for Kubernetes GPU pods showing "CUDA not supported" or no /dev/nvidia* devices
despite nvidia.com/gpu resource allocation. Use when: (1) container runs but torch.cuda.is_available()
returns False, (2) ls /dev/nvidia* shows "no matches found", (3) nvidia-smi fails inside pod
but works on host, (4) PyTorch/TensorFlow falls back to CPU despite GPU allocation.
Covers NVIDIA device plugin, time-slicing, and container runtime issues.
author: Claude Code
version: 1.0.0
date: 2026-01-27
---
# Kubernetes GPU Pod - No NVIDIA Devices Found
## Problem
A Kubernetes pod requests GPU resources (`nvidia.com/gpu: 1`) and schedules on a GPU node,
but inside the container there are no NVIDIA devices visible. The application falls back
to CPU with messages like "CUDA not supported by the Torch installed!" despite running
in a CUDA-enabled container image.
## Context / Trigger Conditions
- Pod shows `Running` status and is on a node with `gpu=true` label
- `kubectl describe pod` shows GPU limit/request is satisfied
- Inside container: `ls /dev/nvidia*` returns "no matches found"
- Inside container: `nvidia-smi` fails or command not found
- Application logs show: "CUDA not supported", "Switching to CPU", "torch.cuda.is_available() = False"
- On the host node: `nvidia-smi` works fine
## Solution
### Step 1: Verify GPU Availability
Check if other pods are consuming the GPU:
```bash
# List all pods using GPU resources
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.containers[].resources.limits."nvidia.com/gpu" != null) | "\(.metadata.namespace)/\(.metadata.name)"'
# Check NVIDIA device plugin pods
kubectl get pods -n nvidia -l app=nvidia-device-plugin
kubectl logs -n nvidia -l app=nvidia-device-plugin --tail=50
```
### Step 2: Free GPU Resources
If another workload is using the GPU, unload it:
```bash
# For Ollama specifically
kubectl exec -n ollama deployment/ollama -- ollama stop <model_name>
# Or scale down the conflicting deployment
kubectl scale deployment/<name> -n <namespace> --replicas=0
```
### Step 3: Restart the Affected Pod
After freeing GPU resources, restart the pod to get fresh device allocation:
```bash
kubectl rollout restart deployment/<name> -n <namespace>
# Or delete the pod directly
kubectl delete pod <pod-name> -n <namespace>
```
### Step 4: Verify GPU Access
```bash
# Check devices are now visible
kubectl exec -n <namespace> deployment/<name> -- ls -la /dev/nvidia*
# Test nvidia-smi
kubectl exec -n <namespace> deployment/<name> -- nvidia-smi
# Test PyTorch CUDA
kubectl exec -n <namespace> deployment/<name> -- python3 -c "import torch; print('CUDA:', torch.cuda.is_available())"
```
## Verification
After restart, you should see:
```
/dev/nvidia0
/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
```
And `nvidia-smi` should show the GPU with your container process.
## Example
```bash
# Problem: ebook2audiobook shows "CUDA not supported"
$ kubectl exec -n ebook2audiobook deployment/ebook2audiobook -- ls /dev/nvidia*
zsh:1: no matches found: /dev/nvidia*
# Solution: Unload Ollama model holding the GPU
$ kubectl exec -n ollama deployment/ollama -- ollama ps
NAME SIZE PROCESSOR
qwen2.5:14b 10 GB 33%/67% CPU/GPU
$ kubectl exec -n ollama deployment/ollama -- ollama stop qwen2.5:14b
# Restart the affected pod
$ kubectl rollout restart deployment/ebook2audiobook -n ebook2audiobook
# Verify
$ kubectl exec -n ebook2audiobook deployment/ebook2audiobook -- nvidia-smi
# Should now show the Tesla T4 GPU
```
## Notes
- **GPU Time-Slicing**: If using NVIDIA GPU time-slicing (configured in GPU Operator),
multiple pods can share a GPU. However, device injection still requires proper timing.
- **Pod Scheduling Order**: Pods that start while GPU is fully allocated may not get
devices injected even after GPU becomes available - a restart is required.
- **Container Runtime**: The NVIDIA Container Toolkit must be properly configured.
Issues can arise from:
- cgroup driver mismatch (systemd vs cgroupfs)
- Container updates causing device loss
- SELinux blocking device access
- **Image Compatibility**: The container image must have CUDA libraries matching the
driver version. Check with `nvidia-smi` on host for driver version.
- **This Cluster**: Uses NVIDIA GPU Operator with time-slicing (20 replicas per GPU).
GPU node is `k8s-node1` with Tesla T4.
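To compare the host driver against the CUDA runtime the container expects, a quick sketch (namespace and deployment names are placeholders):
```bash
# On the GPU node: driver version and GPU model
nvidia-smi --query-gpu=driver_version,name --format=csv,noheader
# Inside the pod: CUDA version PyTorch was built with, and whether it sees the GPU
kubectl exec -n <namespace> deployment/<name> -- \
  python3 -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"
```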
## See Also
- Check GPU Operator status: `kubectl get pods -n nvidia`
- View time-slicing config: `kubectl get configmap -n nvidia time-slicing-config -o yaml`
## References
- [NVIDIA Container Toolkit Troubleshooting](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/troubleshooting.html)
- [Kubernetes GPU Device Plugin](https://github.com/NVIDIA/k8s-device-plugin)
- [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/overview.html)

View file

@ -0,0 +1,96 @@
---
name: k8s-nfs-mount-troubleshooting
description: |
Debug Kubernetes NFS volume mount failures. Use when: (1) Pod stuck in ContainerCreating
for extended time, (2) kubectl describe shows "MountVolume.SetUp failed" with NFS errors,
(3) Error message shows "Protocol not supported" or "mount.nfs: access denied",
(4) NFS volume defined in pod spec but container won't start. Common root cause is
missing NFS export on the server, not a protocol issue.
author: Claude Code
version: 1.0.0
date: 2026-01-28
---
# Kubernetes NFS Mount Troubleshooting
## Problem
Pods with NFS volumes get stuck in `ContainerCreating` state indefinitely. The error
messages from `kubectl describe pod` can be misleading, showing protocol or permission
errors when the actual issue is the NFS export doesn't exist.
## Context / Trigger Conditions
- Pod status shows `ContainerCreating` for more than 1-2 minutes
- `kubectl describe pod` shows events like:
- `MountVolume.SetUp failed for volume "data" : mount failed: exit status 32`
- `mount.nfs: Protocol not supported`
- `mount.nfs: access denied by server`
- Pod spec includes an NFS volume mount
- Other pods on the same node work fine
## Solution
### Step 1: Identify the NFS path
```bash
kubectl describe pod -n <namespace> <pod-name> | grep -A5 "Volumes:"
```
Look for the NFS server and path (e.g., `10.0.10.15:/mnt/main/myservice`)
### Step 2: Verify the export exists on NFS server
SSH to the NFS server and check:
```bash
ssh root@<nfs-server> "ls -la /mnt/main/myservice"
```
### Step 3: If directory doesn't exist, create it
```bash
ssh root@<nfs-server> "mkdir -p /mnt/main/myservice && chmod 777 /mnt/main/myservice"
```
### Step 4: Add to NFS exports (TrueNAS specific)
For TrueNAS, add the path to the NFS share configuration:
1. Add directory to `scripts/nfs_directories.txt`
2. Run `scripts/nfs_exports.sh` to update the share via API
### Step 5: Restart the pod
```bash
kubectl delete pod -n <namespace> -l app=<app-label>
```
The deployment will create a new pod that should now mount successfully.
## Verification
```bash
kubectl get pods -n <namespace>
# Should show 1/1 Running instead of 0/1 ContainerCreating
kubectl exec -n <namespace> <pod-name> -- ls -la /app/data
# Should show the mounted directory contents
```
## Example
**Symptom:**
```
Events:
Warning FailedMount 55s (x13 over 11m) kubelet MountVolume.SetUp failed for volume "data" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs 10.0.10.15:/mnt/main/resume /var/lib/kubelet/pods/.../data
Output: mount.nfs: Protocol not supported
```
**Root Cause:** The directory `/mnt/main/resume` didn't exist on the TrueNAS server.
**Fix:**
```bash
ssh root@10.0.10.15 'mkdir -p /mnt/main/resume && chmod 777 /mnt/main/resume'
# Then add to NFS exports and restart pod
```
## Notes
- The "Protocol not supported" error is misleading - it often means the export path doesn't exist
- Always check the NFS server first before investigating protocol/firewall issues
- For TrueNAS, the NFS share must be updated via API/UI after creating new directories
- NFSv3 vs NFSv4 issues are rare in modern setups; missing paths are more common
- Check that the NFS client packages are installed on Kubernetes nodes if this is a new cluster
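To verify exports from the client side before touching pods, a sketch assuming `showmount` (from nfs-common/nfs-utils) is available on a node:
```bash
# List exports the server actually advertises
showmount -e 10.0.10.15
# Optionally test-mount the path manually (run as root on a k8s node)
mkdir -p /tmp/nfs-test
mount -t nfs 10.0.10.15:/mnt/main/myservice /tmp/nfs-test && ls /tmp/nfs-test
umount /tmp/nfs-test
```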
## See Also
- TrueNAS NFS configuration documentation
- Kubernetes NFS volume documentation

View file

@ -0,0 +1,142 @@
---
name: nextcloud-calendar
description: |
Create, list, and query calendar events in Nextcloud via CalDAV. Use when:
(1) User asks to create a calendar event, (2) User asks what's on their calendar,
(3) User says "add to calendar" or "schedule", (4) User asks about upcoming events.
Always use Nextcloud calendar unless user specifies otherwise.
author: Claude Code
version: 1.0.0
date: 2025-01-25
---
# Nextcloud Calendar Management
## Problem
Need to create, query, or manage calendar events in the user's Nextcloud calendar.
## Context / Trigger Conditions
- User asks to create/add a calendar event
- User asks "what's on my calendar?" or similar
- User mentions scheduling something
- User says "remind me" with a date (create calendar event)
- Default calendar is always Nextcloud unless otherwise specified
## Prerequisites
- Remote executor must be running (Python commands require remote execution)
- The `~/.venvs/claude` virtualenv must have `caldav` and `icalendar` packages installed
- Environment variables `NEXTCLOUD_USER` and `NEXTCLOUD_APP_PASSWORD` must be set in the venv activation script
## Solution
### Script Location
```
/home/wizard/code/infra/.claude/calendar-query.py
```
### Execution Pattern (CRITICAL)
Always use the remote executor with venv activation to get environment variables:
```bash
source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra && python .claude/calendar-query.py [command] [options]
```
### Available Commands
#### List Calendars
```bash
python .claude/calendar-query.py list
```
#### Query Events
```bash
# Today's events
python .claude/calendar-query.py today
# Tomorrow's events
python .claude/calendar-query.py tomorrow
# This week
python .claude/calendar-query.py week
# This month
python .claude/calendar-query.py month
# Custom date range
python .claude/calendar-query.py events --days 14
python .claude/calendar-query.py events --date 2026-04-10
# From specific calendar
python .claude/calendar-query.py today --calendar "Work"
```
#### Create Events
```bash
# All-day event (single day)
python .claude/calendar-query.py create --title "Doctor appointment" --start "2026-03-15" --all-day
# All-day event (multi-day) - end date is EXCLUSIVE
# For April 10-13, use end date April 14
python .claude/calendar-query.py create --title "Vacation" --start "2026-04-10" --end "2026-04-14" --all-day
# Timed event
python .claude/calendar-query.py create --title "Meeting" --start "2026-03-15 14:00" --end "2026-03-15 15:00"
# With location and description
python .claude/calendar-query.py create --title "Lunch" --start "tomorrow 12:00" --location "Cafe" --description "Team lunch"
# Relative dates work
python .claude/calendar-query.py create --title "Call" --start "today 16:00"
python .claude/calendar-query.py create --title "Review" --start "tomorrow 10:00"
```
### Output Formats
```bash
# JSON output (for parsing)
python .claude/calendar-query.py today --json
# Text output (default, human-readable)
python .claude/calendar-query.py week
```
## Complete Example via Remote Executor
To create an event "Team offsite" from March 20-22, 2026:
1. Write command to remote executor:
```bash
echo 'source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra && python .claude/calendar-query.py create --title "Team offsite" --start "2026-03-20" --end "2026-03-23" --all-day' > /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_input.txt
```
2. Wait and check status:
```bash
sleep 3 && cat /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_status.txt
```
3. Read output:
```bash
cat /System/Volumes/Data/mnt/wizard/code/infra/.claude/cmd_output.txt
```
## Important Notes
1. **End dates are exclusive** for all-day events (CalDAV standard). To create an event spanning April 10-13, set end to April 14.
2. **Must source venv activation** - Using `~/.venvs/claude/bin/python` directly won't work because environment variables (`NEXTCLOUD_USER`, `NEXTCLOUD_APP_PASSWORD`) are set in the activation script.
3. **No delete/update commands** - The script currently only supports create and query. To modify events, user must do it manually in Nextcloud.
4. **Default calendar** is "Personal" - use `--calendar` flag for others.
## Verification
- For queries: Output shows formatted event list
- For creates: Output shows "Event created: [title]" with calendar name and start date
- Exit code 0 = success, 1 = error (check output for details)
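A minimal create-then-query smoke test (a sketch assuming the venv pattern above; the event title is illustrative):
```bash
source ~/.venvs/claude/bin/activate && cd /home/wizard/code/infra
python .claude/calendar-query.py create --title "Smoke test" --start "tomorrow 09:00"
python .claude/calendar-query.py tomorrow  # the new event should appear in the list
```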
## Common Errors
| Error | Cause | Fix |
|-------|-------|-----|
| `NEXTCLOUD_USER and NEXTCLOUD_APP_PASSWORD must be set` | Didn't source venv activation | Use `source ~/.venvs/claude/bin/activate && python ...` |
| `Required packages not installed` | caldav/icalendar missing | Run `~/.venvs/claude/bin/pip install caldav icalendar` |
| `Calendar 'X' not found` | Wrong calendar name | Run `list` command to see available calendars |

View file

@ -0,0 +1,182 @@
---
name: python-filename-sanitization
description: |
Secure filename sanitization pattern for Python web applications. Use when:
(1) Accepting user-provided filenames for file operations, (2) Building file
rename/upload functionality, (3) Preventing path traversal attacks (../../../etc/passwd),
(4) Preventing shell injection through filenames, (5) FastAPI/Flask file handling.
Provides regex-based whitelist approach with pathlib for safe file operations.
author: Claude Code
version: 1.0.0
date: 2025-01-31
---
# Python Filename Sanitization
## Problem
User-provided filenames can contain malicious characters that enable path traversal
attacks, shell injection, or filesystem corruption. Direct use of user input in
file paths is a security vulnerability.
## Context / Trigger Conditions
- Building file upload, rename, or download functionality
- User can specify filenames via API or form input
- Files are stored on server filesystem
- Need to prevent: `../`, shell metacharacters, null bytes, etc.
## Solution
### Complete Sanitization Function
```python
import re
from pathlib import Path

def sanitize_filename(filename: str, max_length: int = 200) -> str:
    """
    Sanitize a filename to prevent path traversal and shell injection.
    Only allows alphanumeric characters, spaces, hyphens, underscores,
    parentheses, and dots.
    """
    if not filename:
        raise ValueError("Filename cannot be empty")
    # Remove any path components (prevent path traversal)
    filename = Path(filename).name
    # Only allow safe characters: alphanumeric, space, hyphen, underscore, parentheses, dot
    # This regex removes anything that isn't in the allowed set
    safe_filename = re.sub(r'[^a-zA-Z0-9\s\-_().]', '', filename)
    # Collapse multiple spaces/dots
    safe_filename = re.sub(r'\s+', ' ', safe_filename)
    safe_filename = re.sub(r'\.+', '.', safe_filename)
    # Strip leading/trailing whitespace and dots
    safe_filename = safe_filename.strip(' .')
    # Limit length
    if len(safe_filename) > max_length:
        safe_filename = safe_filename[:max_length]
    if not safe_filename:
        raise ValueError("Filename contains no valid characters")
    return safe_filename
```
### FastAPI Integration Example
```python
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from pathlib import Path

router = APIRouter()  # module-level router, included by the main app

class RenameRequest(BaseModel):
    new_name: str

@router.patch("/files/{file_id}/rename")
async def rename_file(file_id: str, request: RenameRequest):
    """Rename a file with sanitized input."""
    file_dir = Path("/data/files") / file_id
    if not file_dir.exists():
        raise HTTPException(status_code=404, detail="File not found")
    # Find existing file
    files = list(file_dir.glob("*"))
    if not files:
        raise HTTPException(status_code=404, detail="No file found")
    current_file = files[0]
    current_extension = current_file.suffix
    # Sanitize the new name
    try:
        safe_name = sanitize_filename(request.new_name)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    # Preserve original extension
    if not safe_name.lower().endswith(current_extension.lower()):
        safe_name = safe_name + current_extension
    # Create new path (same directory, new filename)
    new_file = file_dir / safe_name
    # Check for conflicts
    if new_file.exists() and new_file != current_file:
        raise HTTPException(status_code=400, detail="A file with that name already exists")
    # Rename using pathlib (no shell commands!)
    current_file.rename(new_file)
    return {"status": "renamed", "new_filename": safe_name}
```
## Key Security Principles
### 1. Whitelist, Don't Blacklist
```python
# BAD: Trying to block dangerous characters
filename = filename.replace('../', '').replace('\x00', '')
# GOOD: Only allow known-safe characters
safe_filename = re.sub(r'[^a-zA-Z0-9\s\-_().]', '', filename)
```
### 2. Use pathlib, Not Shell Commands
```python
# BAD: Shell command (vulnerable to injection)
os.system(f'mv "{old_path}" "{new_path}"')
# GOOD: Pure Python (no shell)
old_path.rename(new_path)
```
### 3. Extract Basename First
```python
# BAD: User could submit "../../../etc/passwd"
filename = user_input
# GOOD: Extract just the filename part
filename = Path(user_input).name
```
### 4. Validate After Sanitization
```python
# Ensure something remains after sanitization
if not safe_filename:
raise ValueError("Filename contains no valid characters")
```
## Verification
```python
# Test cases that should be handled safely
assert sanitize_filename("normal.txt") == "normal.txt"
assert sanitize_filename("../../../etc/passwd") == "passwd"      # basename is extracted first
assert sanitize_filename("file; rm -rf /") == "file rm -rf"
assert sanitize_filename(" spaces .txt") == "spaces .txt"        # only leading/trailing space is stripped
assert sanitize_filename("$(whoami).txt") == "(whoami).txt"      # parentheses are in the allowed set
# Test cases that should raise errors
try:
    sanitize_filename("")  # Should raise ValueError
except ValueError:
    pass
try:
    sanitize_filename("$#@!")  # Should raise ValueError (no valid chars)
except ValueError:
    pass
```
## Notes
- This is intentionally restrictive; expand the regex if you need Unicode support
- For Unicode filenames, consider `unicodedata.normalize('NFKD', ...)` first
- Max length of 200 is conservative; filesystem limits vary (255 bytes typical)
- Always preserve file extensions when renaming to avoid breaking file associations
- Consider adding a UUID prefix for guaranteed uniqueness in upload scenarios
## References
- [OWASP Path Traversal](https://owasp.org/www-community/attacks/Path_Traversal)
- [CWE-22: Path Traversal](https://cwe.mitre.org/data/definitions/22.html)
- [Python pathlib documentation](https://docs.python.org/3/library/pathlib.html)

View file

@ -0,0 +1,388 @@
# Setup Project Skill
**Purpose**: Deploy a new self-hosted service to the Kubernetes cluster from a GitHub repository.
**When to use**: User provides a GitHub URL or project name and wants to deploy it to the cluster.
## Workflow
### 1. Research Phase
**Input**: GitHub repository URL or project name
**Actions**:
- Visit the GitHub repository
- Check the README for:
- Official Docker image (Docker Hub, ghcr.io, etc.)
- docker-compose.yml file
- Self-hosting documentation
- Required dependencies (PostgreSQL, MySQL, Redis, etc.)
- Environment variables needed
- Default ports
- Storage requirements
**Find Docker Image Priority**:
1. Check official documentation for recommended image
2. Look in docker-compose.yml for `image:` directive
3. Check GitHub Container Registry: `ghcr.io/<org>/<repo>`
4. Check Docker Hub: `<org>/<repo>`
5. Check releases page for container images
6. Last resort: Build from Dockerfile (avoid if possible)
**Extract Configuration**:
- Container port (default port the app listens on)
- Environment variables (DATABASE_URL, REDIS_HOST, SMTP, etc.)
- Volume mounts (what data needs persistence)
- Dependencies (database type, cache, etc.)
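When the documentation is thin, the image itself often answers these questions; a sketch assuming the docker CLI and `jq` (the image name is a placeholder):
```bash
docker pull <org>/<repo>:latest
# Declared ports, volumes, and default environment variables
docker inspect <org>/<repo>:latest | jq '.[0].Config | {ExposedPorts, Volumes, Env}'
```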
### 2. Database Setup (if needed)
**If project requires PostgreSQL**:
- User provides database credentials or use pattern: `<service>` user with secure password
- Database will be created in shared `postgresql.dbaas.svc.cluster.local`
- Connection string format: `postgresql://<user>:<password>@postgresql.dbaas.svc.cluster.local:5432/<dbname>`
**If project requires MySQL**:
- User provides database credentials
- Database in shared `mysql.dbaas.svc.cluster.local`
- Connection string format: `mysql://<user>:<password>@mysql.dbaas.svc.cluster.local:3306/<dbname>`
**If project requires Redis**:
- Use shared Redis: `redis.redis.svc.cluster.local:6379`
- No password required
**IMPORTANT**: Never create databases yourself - always ask user for credentials to use.
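Once the user supplies credentials, connectivity can be sanity-checked from inside the cluster; a sketch with placeholder credentials (the `postgres:16` image tag is an assumption):
```bash
kubectl run psql-check --rm -it --restart=Never --image=postgres:16 -- \
  psql "postgresql://<user>:<password>@postgresql.dbaas.svc.cluster.local:5432/<dbname>" -c '\conninfo'
```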
### 3. Terraform Module Creation
**Create module directory**:
```bash
mkdir -p modules/kubernetes/<service-name>/
```
**Create `modules/kubernetes/<service-name>/main.tf`**:
```hcl
variable "tls_secret_name" {}
variable "tier" { type = string }
variable "postgresql_password" {} # Only if needed
# Add other variables as needed (smtp_password, api_keys, etc.)
resource "kubernetes_namespace" "<service>" {
metadata {
name = "<service>"
}
}
module "tls_secret" {
source = "../setup_tls_secret"
namespace = kubernetes_namespace.<service>.metadata[0].name
tls_secret_name = var.tls_secret_name
}
# If database migrations needed, add init_container
resource "kubernetes_deployment" "<service>" {
metadata {
name = "<service>"
namespace = kubernetes_namespace.<service>.metadata[0].name
labels = {
app = "<service>"
tier = var.tier
}
}
spec {
replicas = 1
selector {
match_labels = {
app = "<service>"
}
}
template {
metadata {
labels = {
app = "<service>"
}
}
spec {
# Init container for migrations (if needed)
# init_container { ... }
container {
name = "<service>"
image = "<docker-image>:<tag>"
port {
container_port = <port>
}
# Environment variables
env {
name = "DATABASE_URL"
value = "postgresql://<service>:${var.postgresql_password}@postgresql.dbaas.svc.cluster.local:5432/<service>"
}
# Add other env vars as needed
# Volume mounts for persistent data
volume_mount {
name = "data"
mount_path = "<mount-path>"
sub_path = "<optional-subpath>"
}
resources {
requests = {
memory = "256Mi"
cpu = "100m"
}
limits = {
memory = "2Gi"
cpu = "1"
}
}
# Health checks (if endpoints exist)
liveness_probe {
http_get {
path = "/health" # or /healthz, /, etc.
port = <port>
}
initial_delay_seconds = 60
period_seconds = 30
}
}
# NFS volume for persistence
volume {
name = "data"
nfs {
server = "10.0.10.15"
path = "/mnt/main/<service>"
}
}
}
}
}
}
resource "kubernetes_service" "<service>" {
metadata {
name = "<service>"
namespace = kubernetes_namespace.<service>.metadata[0].name
labels = {
app = "<service>"
}
}
spec {
selector = {
app = "<service>"
}
port {
name = "http"
port = 80
target_port = <container-port>
}
}
}
module "ingress" {
source = "../ingress_factory"
namespace = kubernetes_namespace.<service>.metadata[0].name
name = "<service>"
tls_secret_name = var.tls_secret_name
# Add extra_annotations if needed (proxy-body-size, timeouts, etc.)
}
```
### 4. Update Main Terraform Files
**Add to `modules/kubernetes/main.tf`**:
1. Add variable declarations at top:
```hcl
variable "<service>_postgresql_password" { type = string }
```
2. Add to appropriate DEFCON level (ask user which level, default to 5):
```hcl
5 : [
...,
"<service>"
]
```
3. Add module block at bottom:
```hcl
module "<service>" {
source = "./<service>"
for_each = contains(local.active_modules, "<service>") ? { <service> = true } : {}
tls_secret_name = var.tls_secret_name
postgresql_password = var.<service>_postgresql_password
tier = local.tiers.aux # or appropriate tier
depends_on = [null_resource.core_services]
}
```
**Add to `main.tf`**:
1. Add variable:
```hcl
variable "<service>_postgresql_password" { type = string }
```
2. Pass to kubernetes_cluster module:
```hcl
module "kubernetes_cluster" {
...
<service>_postgresql_password = var.<service>_postgresql_password
}
```
**Update `terraform.tfvars`**:
1. Add password/credentials:
```hcl
<service>_postgresql_password = "<secure-password>"
```
2. Add to Cloudflare DNS (ask user if proxied or non-proxied):
```hcl
cloudflare_non_proxied_names = [
...,
"<service>"
]
```
### 5. Email/SMTP Configuration (if needed)
If service needs to send emails:
```hcl
env {
name = "MAILER_HOST"
value = "mailserver.viktorbarzin.me" # Public hostname for TLS
}
env {
name = "MAILER_PORT"
value = "587"
}
env {
name = "MAILER_USER"
value = "info@viktorbarzin.me"
}
env {
name = "MAILER_PASSWORD"
value = var.mailserver_accounts["info@viktorbarzin.me"] # Pass from module
}
```
Add to module call:
```hcl
smtp_password = var.mailserver_accounts["info@viktorbarzin.me"]
```
### 6. Apply Terraform
```bash
# Via remote executor
terraform init
terraform apply -target=module.kubernetes_cluster.module.<service> -auto-approve
```
### 7. Verification
```bash
kubectl get pods -n <service>
kubectl logs -n <service> -l app=<service> --tail=50
```
Test URL: `https://<service>.viktorbarzin.me`
### 8. Commit Changes
```bash
git add modules/kubernetes/<service>/ main.tf modules/kubernetes/main.tf terraform.tfvars
git commit -m "Add <service> deployment
- Deploy <service> as <description>
- Uses <dependencies>
- Ingress at <service>.viktorbarzin.me
[ci skip]"
```
## Common Patterns
### Init Container for Migrations
```hcl
init_container {
name = "migration"
image = "<same-image>"
command = ["sh", "-c", "<migration-command>"]
# Same env vars and volumes as main container
}
```
### Dynamic Environment Variables
```hcl
locals {
common_env = [
{ name = "VAR1", value = "value1" },
{ name = "VAR2", value = "value2" },
]
}
dynamic "env" {
for_each = local.common_env
content {
name = env.value.name
value = env.value.value
}
}
```
### External URL Configuration
Many apps need their public URL configured:
```hcl
env {
name = "APP_URL" # or PUBLIC_URL, EXTERNAL_URL, etc.
value = "https://<service>.viktorbarzin.me"
}
env {
name = "HTTPS" # or ENABLE_HTTPS, etc.
value = "true"
}
```
## Checklist
- [ ] Find official Docker image or docker-compose
- [ ] Identify dependencies (DB, Redis, etc.)
- [ ] Ask user for database credentials (never create yourself)
- [ ] Create `modules/kubernetes/<service>/main.tf`
- [ ] Update `modules/kubernetes/main.tf` (variables, DEFCON level, module block)
- [ ] Update `main.tf` (variable, pass to module)
- [ ] Update `terraform.tfvars` (password, Cloudflare DNS)
- [ ] Run `terraform init` and `terraform apply`
- [ ] Verify pods are running
- [ ] Test the URL
- [ ] Commit changes with `[ci skip]`
## Questions to Ask User
1. What DEFCON level should this service be in? (Default: 5)
2. Should Cloudflare proxy this domain? (Default: no, add to non_proxied_names)
3. Does this need email/SMTP? (Configure if yes)
4. What database credentials should I use? (Never create yourself)
5. What tier? (core/cluster/gpu/edge/aux - default: aux)
## Notes
- **Always use official documentation** as the source of truth
- **Prefer stable/latest tags** over specific versions for self-hosted
- **Use shared infrastructure**: PostgreSQL at `postgresql.dbaas.svc.cluster.local`, Redis at `redis.redis.svc.cluster.local`
- **NFS storage**: Always at `10.0.10.15:/mnt/main/<service>`
- **Email**: Use `mailserver.viktorbarzin.me` (public hostname) not internal service name
- **Resource limits**: Start conservative, can increase if needed
- **Health checks**: Only add if the app has health endpoints

View file

@ -0,0 +1,102 @@
# Setup Shared Remote Executor
Skill for setting up Claude Code's shared remote executor in new projects.
## When to Use
- When adding Claude Code support to a new project
- When the user says "set up remote executor for this project"
- When working on a new project that needs remote command execution
## Prerequisites
- Shared executor already deployed at `~/.claude/` on wizard@10.0.10.10
- Project accessible via NFS from both macOS and the remote VM
## Setup Steps
### 1. Create .claude Directory
```bash
mkdir -p .claude/sessions
```
### 2. Create session-exec.sh Wrapper
Create `.claude/session-exec.sh` with the following content (adjust PROJECT_ROOT):
```bash
#!/bin/bash
# Project-Local Session Helper - Wrapper for shared executor
set -euo pipefail
SHARED_SESSION_EXEC="/home/wizard/.claude/session-exec.sh"
PROJECT_ROOT="/home/wizard/path/to/project" # UPDATE THIS
if [ -f "$SHARED_SESSION_EXEC" ]; then
if [ "${1:-}" = "create" ] || [ -z "${1:-}" ]; then
"$SHARED_SESSION_EXEC" create "$PROJECT_ROOT"
else
"$SHARED_SESSION_EXEC" "$@"
fi
else
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SESSIONS_DIR="$SCRIPT_DIR/sessions"
SESSION_ID="${1:-$(date +%s)-$$-$RANDOM}"
ACTION="${2:-create}"
SESSION_DIR="$SESSIONS_DIR/$SESSION_ID"
case "$ACTION" in
create|init|"")
mkdir -p "$SESSION_DIR"
echo "ready" > "$SESSION_DIR/cmd_status.txt"
echo "$PROJECT_ROOT" > "$SESSION_DIR/workdir.txt"
> "$SESSION_DIR/cmd_input.txt"
> "$SESSION_DIR/cmd_output.txt"
echo "$SESSION_ID"
;;
cleanup|remove|delete)
[ -d "$SESSION_DIR" ] && rm -rf "$SESSION_DIR"
;;
status)
[ -d "$SESSION_DIR" ] && cat "$SESSION_DIR/cmd_status.txt"
;;
list)
[ -d "$SESSIONS_DIR" ] && ls -1 "$SESSIONS_DIR" 2>/dev/null
;;
esac
fi
```
Make executable: `chmod +x .claude/session-exec.sh`
### 3. Link Sessions Directory (on remote VM)
Run on the remote VM to add project sessions to the shared executor:
```bash
# Option A: Symlink project sessions (if using project-local sessions)
ln -sfn /path/to/project/.claude/sessions ~/.claude/sessions
# Option B: Use shared sessions (all projects share one directory)
# Just ensure ~/.claude/sessions exists
```
### 4. Create CLAUDE.md
Add execution instructions to `.claude/CLAUDE.md`:
```markdown
## Remote Command Execution
Uses shared executor at `~/.claude/` on wizard@10.0.10.10.
### Usage
\```bash
SESSION_ID=$(.claude/session-exec.sh)
echo "command" > .claude/sessions/$SESSION_ID/cmd_input.txt
sleep 1 && cat .claude/sessions/$SESSION_ID/cmd_status.txt
cat .claude/sessions/$SESSION_ID/cmd_output.txt
\```
Start executor: `~/.claude/remote-executor.sh` (on remote VM)
```
## Shared Executor Location
- Scripts: `~/.claude/remote-executor.sh`, `~/.claude/session-exec.sh`
- Sessions: `~/.claude/sessions/`
- Remote VM: wizard@10.0.10.10
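To confirm the relay works end to end once the executor is running, a smoke-test sketch following the usage flow above:
```bash
SESSION_ID=$(.claude/session-exec.sh)
echo "hostname && pwd" > .claude/sessions/$SESSION_ID/cmd_input.txt
sleep 1 && cat .claude/sessions/$SESSION_ID/cmd_status.txt  # expect done:0
cat .claude/sessions/$SESSION_ID/cmd_output.txt
.claude/session-exec.sh $SESSION_ID cleanup
```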

View file

@ -0,0 +1,97 @@
---
name: terraform-state-identity-mismatch
description: |
Fix Terraform "Unexpected Identity Change" errors during plan/apply. Use when:
(1) Terraform fails with "the Terraform Provider unexpectedly returned a different
identity", (2) State refresh shows identity mismatch between stored and current values,
(3) Resource was created but terraform apply timed out, leaving state inconsistent.
Solution involves removing and reimporting the affected resource.
author: Claude Code
version: 1.0.0
date: 2026-01-28
---
# Terraform State Identity Mismatch Fix
## Problem
Terraform fails during plan or apply with an "Unexpected Identity Change" error,
indicating the stored state identity doesn't match what the provider returns when
reading the resource.
## Context / Trigger Conditions
- Error message contains: "Unexpected Identity Change: During the read operation,
the Terraform Provider unexpectedly returned a different identity"
- Often occurs after a terraform apply times out mid-creation
- Resource exists in the cluster/cloud but state is corrupted
- Common with Kubernetes provider after deployment rollout timeouts
## Solution
### Step 1: Identify the affected resource
The error message includes the resource address:
```
with module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume
```
### Step 2: Remove from state
```bash
terraform state rm 'module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume'
```
Note: Use single quotes around the address to handle brackets properly.
### Step 3: Import the resource back
```bash
terraform import 'module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume' <namespace>/<name>
```
For Kubernetes deployments, the import ID is `namespace/deployment-name`.
### Step 4: Verify with plan
```bash
terraform plan -target=<module-path>
```
Should show minimal or no changes if import was successful.
### Step 5: Apply to sync any drift
```bash
terraform apply -target=<module-path>
```
## Verification
- `terraform plan` runs without identity errors
- `terraform apply` completes successfully
- Resource still exists and functions correctly
## Example
**Error:**
```
Error: Unexpected Identity Change
Current Identity: cty.ObjectVal(map[string]cty.Value{"api_version":cty.NullVal...})
New Identity: cty.ObjectVal(map[string]cty.Value{"api_version":cty.StringVal("apps/v1")...})
with module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume
```
**Fix:**
```bash
terraform state rm 'module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume'
# Output: Removed ... Successfully removed 1 resource instance(s).
terraform import 'module.kubernetes_cluster.module.resume["resume"].kubernetes_deployment.resume' resume/resume
# Output: Import successful!
terraform apply -target=module.kubernetes_cluster.module.resume -auto-approve
# Output: Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
```
## Notes
- This is a provider bug, not user error - consider reporting to provider maintainers
- The resource continues to work fine; only the terraform state is affected
- Always verify the resource exists before importing (don't import non-existent resources)
- For Kubernetes resources, import IDs are typically `namespace/name`
- For AWS resources, import IDs vary by resource type (check provider docs)
- Consider adding `-lock=false` if state locking causes issues during recovery
## See Also
- Terraform state management documentation
- Kubernetes provider import documentation