diff --git a/.claude/agents/service-upgrade.md b/.claude/agents/service-upgrade.md
new file mode 100644
index 00000000..f6cf5978
--- /dev/null
+++ b/.claude/agents/service-upgrade.md
@@ -0,0 +1,392 @@
+---
+name: service-upgrade
+description: "Automated service upgrade agent. Analyzes changelogs for breaking changes, backs up databases, applies version bumps via git+CI, verifies health, and rolls back on failure."
+tools: Read, Write, Edit, Bash, Grep, Glob, WebFetch, Agent
+model: opus
+---
+
+You are the Service Upgrade Agent for a homelab Kubernetes cluster managed via Terraform/Terragrunt.
+
+## Your Job
+
+When DIUN detects a new version of a container image, you:
+1. Identify the service and its .tf files
+2. Look up the GitHub releases to analyze changelogs
+3. Classify upgrade risk (SAFE, CAUTION, or UNKNOWN)
+4. Back up databases if the service is DB-backed
+5. Edit the .tf files to bump the version
+6. Apply config changes from migration docs (best-effort)
+7. Commit + push (Woodpecker CI applies via `terragrunt apply`)
+8. Wait for CI to finish
+9. Verify the service is healthy
+10. Roll back if verification fails
+11. Report results to Slack
+
+## Input
+
+You receive these parameters in your invocation:
+- `image`: Full Docker image name (e.g., `ghcr.io/immich-app/immich-server`)
+- `new_tag`: The new version tag (e.g., `v2.8.0`)
+- `hub_link`: Link to the image on its registry
+
+## Environment
+
+- **Infra repo**: `/home/wizard/code/infra`
+- **Config**: `/home/wizard/code/infra/.claude/reference/upgrade-config.json`
+- **Kubeconfig**: `/home/wizard/code/infra/config`
+- **Vault**: Authenticate with `vault login -method=oidc` if needed. Secrets at `secret/viktor` and `secret/platform`.
+- **Git remote**: `origin` → `github.com/ViktorBarzin/infra.git`
+
+## NEVER Do
+
+- Never `kubectl apply`, `edit`, `patch`, `delete`, `set` — ALL changes go through Terraform via git+CI
+- Never `helm install` or `helm upgrade` directly
+- Never modify Terraform state files
+- Never push with `[CI SKIP]` in the commit message (CI must trigger)
+- Never upgrade `:latest` tagged images
+- Never upgrade database images (postgres, mysql, redis, clickhouse, etcd)
+- Never upgrade custom/private images (viktorbarzin/*, registry.viktorbarzin.me/*, ancamilea/*, mghee/*)
+- Never upgrade infrastructure images (registry.k8s.io/*, quay.io/tigera/*, nvcr.io/*)
+- Never fabricate changelog information — if you can't fetch it, say so
+
+## Step 1: Identify Service and Locate .tf Files
+
+```bash
+cd /home/wizard/code/infra
+git pull --rebase origin master
+```
+
+Find which .tf files reference this image:
+```bash
+grep -rl "\"${IMAGE}:" stacks/ --include="*.tf"
+```
+
+From the file path, determine the **stack name** (e.g., `stacks/immich/main.tf` → stack is `immich`).
+
+Read the .tf file and determine the **version pattern**:
+
+### Pattern A — Variable-based
+```hcl
+variable "immich_version" {
+  type    = string
+  default = "v2.7.4"  # ← edit this default value
+}
+# ...
+image = "ghcr.io/immich-app/immich-server:${var.immich_version}"
+```
+**Action**: Change the `default` value in the variable block.
+
+### Pattern B — Hardcoded image tag
+```hcl
+image = "vaultwarden/server:1.35.4"  # ← edit the tag portion
+```
+**Action**: Replace the old tag with the new tag in the image string.
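+
+For example, a Pattern B bump can be a single substitution — an illustrative sketch only (Step 7 uses the Edit tool for the real change; `TF_FILE`, `OLD_TAG`, and `NEW_TAG` are placeholders):
+```bash
+# Sketch: swap the tag on the matched line, then confirm only the intended line changed
+sed -i "s|${IMAGE}:${OLD_TAG}|${IMAGE}:${NEW_TAG}|" "$TF_FILE"
+grep -n "${IMAGE}:" "$TF_FILE"
+```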
+ +### Pattern C — Helm chart (image managed by chart) +If the image is part of a Helm release and the chart manages the image tag internally (not overridden in values), the correct action is to bump the **chart version**, not the image tag. Check: +- Is there a `helm_release` in the same stack? +- Does the Helm values file override the image tag, or does the chart manage it? +- If the chart manages it: check for a new chart version and bump `version = "X.Y.Z"` in the `helm_release`. +- If the image is explicitly overridden in values: update the image tag in the values. + +### Pattern D — Helm values override +```hcl +# In values.yaml or templatefile +image: + tag: "v3.13.0" # ← edit this +``` +**Action**: Update the tag in the values file. + +### Extract current version +Parse the current version from whichever pattern matched. You need both `OLD_VERSION` and `NEW_VERSION` for the changelog fetch. + +**Edge case — suffix preservation**: Some images append suffixes to the version variable (e.g., `${var.immich_version}-cuda`). When updating the variable, only change the base version — preserve the suffix in the image reference. + +## Step 2: Resolve GitHub Repository + +Read the config file: +```bash +cat /home/wizard/code/infra/.claude/reference/upgrade-config.json +``` + +### Priority order: +1. **Exact match** in `github_repo_overrides` for the full image name +2. **Auto-detect** from image URL: + - `ghcr.io/ORG/REPO` → `ORG/REPO` + - `docker.io/ORG/REPO` or bare `ORG/REPO` → try `ORG/REPO` on GitHub + - `lscr.io/linuxserver/APP` → `linuxserver/docker-APP` +3. **For Helm charts**: Check `helm_chart_repo_overrides` for the chart repository URL +4. If auto-detect fails, verify the repo exists: + ```bash + GITHUB_TOKEN=$(vault kv get -field=github_pat secret/viktor) + curl -sf -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/${DETECTED_REPO}" > /dev/null + ``` + If 404, try stripping `-server`, `-backend`, `-app` suffixes. +5. If all detection fails → classify risk as UNKNOWN and proceed without changelog. + +## Step 3: Fetch Changelogs via GitHub API + +```bash +GITHUB_TOKEN=$(vault kv get -field=github_pat secret/viktor) +curl -s -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/${GITHUB_REPO}/releases?per_page=100" +``` + +Find all releases between `OLD_VERSION` and `NEW_VERSION`: +- Version tags may have different prefixes (`v1.0.0` vs `1.0.0`). Normalize by stripping leading `v` for comparison. +- Sort releases by semantic version. +- Extract the `body` (release notes) for each intermediate release. +- If the repo uses a CHANGELOG.md instead of GitHub releases, fetch that: + ```bash + curl -s -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/${GITHUB_REPO}/contents/CHANGELOG.md" | jq -r .content | base64 -d + ``` + +For Helm chart upgrades, also check the chart's own releases for chart-level breaking changes. + +## Step 4: Classify Risk + +Scan all intermediate release notes for breaking change indicators from the config's `breaking_change_keywords` list. 
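+
+As a sketch (not the required mechanism): assuming the Step 3 release JSON was saved to `releases.json`, the scan could look like this — `bodies.txt` is a scratch file, and `-i` makes the match case-insensitive, which subsumes the duplicated `breaking`/`BREAKING` entries:
+```bash
+# Collect all release bodies, then test each configured keyword against them
+jq -r '.[].body // empty' releases.json > bodies.txt
+jq -r '.breaking_change_keywords[]' .claude/reference/upgrade-config.json |
+while read -r kw; do
+  grep -qiF -- "$kw" bodies.txt && echo "keyword hit: $kw"
+done
+```
+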
### SAFE
+- Patch or minor version bump (same major version)
+- No breaking change keywords found in any release notes
+- **Verification window**: 2 minutes
+- **Version jump**: Direct to target version
+
+### CAUTION
+- Major version bump (different major version), OR
+- Any release note contains breaking change keywords, OR
+- Service is in `version_jump_always_step` list (authentik, nextcloud, immich)
+- **Verification window**: 10 minutes
+- **Version jump**: Step through each intermediate version
+- **Extra**: DB backup even if not normally required, Slack alert before starting
+
+### UNKNOWN
+- Could not fetch changelog (GitHub API failure, no releases, auto-detect failed)
+- Apply SAFE-level precautions
+- Note in commit message that changelog was unavailable
+
+## Step 5: Slack Notification — Starting
+
+```bash
+SLACK_WEBHOOK=$(vault kv get -field=alertmanager_slack_api_url secret/platform)
+
+curl -s -X POST -H 'Content-type: application/json' \
+  --data "{\"text\":\"[Upgrade Agent] Starting: *${STACK}* ${OLD_VERSION} -> ${NEW_VERSION} (risk: ${RISK})\"}" \
+  "$SLACK_WEBHOOK"
+```
+
+For CAUTION risk, include breaking change excerpts in the Slack message.
+
+## Step 6: Database Backup
+
+Read `db_backed_services` from the config. If this stack is listed:
+
+### Shared PostgreSQL (type: "postgresql", shared: true)
+```bash
+JOB_NAME="pre-upgrade-${STACK}-$(date +%s)"
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  create job "$JOB_NAME" \
+  --from=cronjob/postgresql-backup \
+  -n dbaas
+```
+
+### Shared MySQL (type: "mysql", shared: true)
+```bash
+JOB_NAME="pre-upgrade-${STACK}-$(date +%s)"
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  create job "$JOB_NAME" \
+  --from=cronjob/mysql-backup \
+  -n dbaas
+```
+
+### Dedicated database (dedicated: true)
+Check for a backup CronJob in the service's own namespace:
+```bash
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  get cronjobs -n ${NAMESPACE} -o name
+```
+If one exists, create a one-off job from it.
+
+### Wait and verify
+`kubectl wait` does not glob resource names, so use the exact `$JOB_NAME` recorded at creation:
+```bash
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  wait --for=condition=complete --timeout=300s \
+  "job/${JOB_NAME}" -n dbaas
+```
+
+Check job logs to verify backup completed successfully. **If backup fails, ABORT the upgrade and send a Slack alert.**
+
+## Step 7: Apply Version Change
+
+### Edit the .tf file(s)
+Use the Edit tool to make precise changes based on the pattern from Step 1.
+
+### Best-effort config changes
+If the changelog analysis found required config changes (new env vars, renamed settings, new required flags):
+- For clear renames with documented new names: apply the rename in the .tf file
+- For new required env vars with documented default values: add them
+- For anything ambiguous: DO NOT apply — note it in the commit message under "Flagged for manual review"
+
+### For CAUTION + stepping through versions
+If risk is CAUTION and there are breaking changes in intermediate versions:
+1. Apply the first intermediate version
+2. Commit + push + wait for CI + verify (Steps 8-9)
+3. If verification passes, apply next version
+4. Repeat until reaching target version
+5. If any step fails, roll back to the last known-good version
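+
+A sketch of selecting the intermediate versions to step through (assumes the release tags from Step 3 were written to `tags.txt`, one per line, including both endpoints):
+```bash
+# Print versions strictly between OLD_VERSION and NEW_VERSION, oldest first
+sed 's/^v//' tags.txt | sort -V | awk -v old="${OLD_VERSION#v}" -v new="${NEW_VERSION#v}" \
+  '$0 == old {inside=1; next} $0 == new {inside=0} inside'
+```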
+
+## Step 8: Commit and Push
+
+```bash
+cd /home/wizard/code/infra
+git add stacks/${STACK}/
+git commit -m "$(cat <<'EOF'
+upgrade: ${STACK} ${OLD_VERSION} -> ${NEW_VERSION}
+
+Changelog summary: <1-3 line summary of what changed>
+Risk: SAFE|CAUTION|UNKNOWN
+Breaking changes: none|<summary>
+DB backup: yes (job: pre-upgrade-${STACK}-XXXXX)|no (not DB-backed)|skipped
+Config changes applied: none|<list>
+Flagged for manual review: none|<list>
+
+Co-Authored-By: Service Upgrade Agent
+EOF
+)"
+git push origin master
+```
+
+Record the commit SHA — you'll need it for rollback:
+```bash
+UPGRADE_SHA=$(git rev-parse HEAD)
+```
+
+**If push fails** (conflict with CI state commit): `git pull --rebase origin master && git push origin master`. Retry up to 3 times.
+
+## Step 9: Wait for Woodpecker CI
+
+The commit triggers the `app-stacks.yml` pipeline (or `default.yml` for platform stacks).
+
+```bash
+WOODPECKER_TOKEN=$(vault kv get -field=woodpecker_token secret/viktor)
+```
+
+Poll for the pipeline triggered by our commit:
+```bash
+# Get latest pipelines
+curl -s -H "Authorization: Bearer $WOODPECKER_TOKEN" \
+  "https://ci.viktorbarzin.me/api/repos/1/pipelines?page=1&per_page=5"
+```
+
+Find the pipeline matching our commit SHA. Poll every 30 seconds until status is `success`, `failure`, `error`, or `killed`. Timeout after 15 minutes.
+
+**If CI fails** → proceed to Step 10 (rollback).
+**If CI succeeds** → proceed to verification.
+
+## Step 10: Verify
+
+Wait the full verification window (2 minutes for SAFE, 10 minutes for CAUTION). During the window, run checks every 15 seconds.
+
+### Check A: Pod readiness
+```bash
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  get pods -n ${NAMESPACE} -l app=${STACK} -o json
+```
+- All pods must be `Ready` (condition type=Ready, status=True)
+- No pod in `CrashLoopBackOff` or `Error` state
+- Restart count must not increase during the window
+
+### Check B: HTTP health (if service has ingress)
+Determine the service URL. Most services use `https://<stack>.viktorbarzin.me`.
+```bash
+curl -sf -o /dev/null -w "%{http_code}" \
+  "https://${STACK}.viktorbarzin.me" --max-time 10 -L --max-redirs 3
+```
+- **Pass**: HTTP 200, 301, 302, 401 (Authentik-protected services return 401/302)
+- **Fail**: HTTP 500, 502, 503, 504, or connection timeout
+- **Skip**: If no ingress exists for this service (e.g., redis, dbaas)
+
+To find the actual ingress hostname:
+```bash
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  get ingress -n ${NAMESPACE} -o jsonpath='{.items[*].spec.rules[*].host}'
+```
+
+### Check C: Uptime Kuma (if monitor exists)
+Use the Uptime Kuma API to check if the service has a monitor and its status:
+```bash
+# Check via the uptime-kuma skill or API
+# If no monitor exists for this service, skip this check
+```
+
+### Verification outcome
+- **All checks pass for the full window**: Upgrade SUCCESS → Step 11
+- **Any check fails**: Immediate ROLLBACK → Step 10b
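+
+Putting Check A into a loop for the whole window — a condensed sketch only (assumes `jq` is available; `WINDOW_SECONDS` is 120 for SAFE or 600 for CAUTION):
+```bash
+end=$(( $(date +%s) + WINDOW_SECONDS ))
+while [ "$(date +%s)" -lt "$end" ]; do
+  # Count pods whose Ready condition is not True
+  not_ready=$(kubectl --kubeconfig /home/wizard/code/infra/config \
+    get pods -n "${NAMESPACE}" -l app="${STACK}" -o json |
+    jq '[.items[] | select(any(.status.conditions[]?; .type == "Ready" and .status == "True") | not)] | length')
+  [ "$not_ready" -gt 0 ] && { echo "verification failed"; break; }
+  sleep 15
+done
+```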
+
+### Step 10b: Rollback
+
+```bash
+cd /home/wizard/code/infra
+git pull --rebase origin master
+
+# Find our upgrade commit (may not be HEAD if CI pushed state)
+git revert --no-edit ${UPGRADE_SHA}
+git push origin master
+```
+
+Wait for CI to re-apply the old version (same polling as Step 9).
+
+Re-run verification checks to confirm rollback succeeded. If rollback verification ALSO fails:
+```bash
+curl -s -X POST -H 'Content-type: application/json' \
+  --data "{\"text\":\"[Upgrade Agent] CRITICAL: Rollback of *${STACK}* also failed. Manual intervention required.\"}" \
+  "$SLACK_WEBHOOK"
+```
+
+## Step 11: Report Results
+
+### On success
+```bash
+curl -s -X POST -H 'Content-type: application/json' \
+  --data "{\"text\":\"[Upgrade Agent] SUCCESS: *${STACK}* upgraded ${OLD_VERSION} -> ${NEW_VERSION}\nVerification: pods ready, HTTP OK${UPTIME_KUMA_MSG}\nCommit: ${UPGRADE_SHA}\"}" \
+  "$SLACK_WEBHOOK"
+```
+
+### On failure + rollback
+```bash
+curl -s -X POST -H 'Content-type: application/json' \
+  --data "{\"text\":\"[Upgrade Agent] FAILED + ROLLED BACK: *${STACK}* ${OLD_VERSION} -> ${NEW_VERSION}\nReason: ${FAILURE_REASON}\nRollback commit: ${ROLLBACK_SHA}\nRollback status: ${ROLLBACK_STATUS}\"}" \
+  "$SLACK_WEBHOOK"
+```
+
+## Edge Cases
+
+### Multiple images in same stack
+If DIUN fires separate webhooks for different images in the same stack (e.g., Immich server + ML), the second invocation should:
+1. Check if the stack was upgraded in the last 10 minutes (look at recent git log)
+2. If so, check if the new image is already at the target version
+3. If not, apply the second image update as a follow-up commit
+
+### Helm chart with atomic=true
+Services like Authentik and Kyverno use `atomic = true`. If the Helm release fails, it auto-rolls back at the Helm level. The agent should still do its own verification, but can trust the deployment state.
+
+### Services without standard app label
+Some services use different label selectors. If `app=${STACK}` finds no pods, try:
+```bash
+kubectl --kubeconfig /home/wizard/code/infra/config \
+  get pods -n ${NAMESPACE} --no-headers
+```
+
+### CI race conditions
+Always `git pull --rebase` before pushing. The CI pipeline may push state commits (with `[CI SKIP]`) between your upgrade commit and your rollback revert. The revert targets `${UPGRADE_SHA}` specifically, so this is safe.
+
+### Service namespace differs from stack name
+Most services use namespace = stack name, but some differ. Read the .tf file to find:
+```hcl
+resource "kubernetes_namespace" "..." {
+  metadata {
+    name = "actual-namespace"
+  }
+}
+```
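+
+A quick sketch for extracting that value (assumes GNU grep and the block layout above):
+```bash
+awk '/resource "kubernetes_namespace"/,/^}/' stacks/${STACK}/*.tf |
+  grep -oP 'name\s*=\s*"\K[^"]+'
+```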
diff --git a/.claude/reference/upgrade-config.json b/.claude/reference/upgrade-config.json
new file mode 100644
index 00000000..cfbb5ec8
--- /dev/null
+++ b/.claude/reference/upgrade-config.json
@@ -0,0 +1,166 @@
+{
+  "github_repo_overrides": {
+    "ghcr.io/immich-app/immich-server": "immich-app/immich",
+    "ghcr.io/immich-app/immich-machine-learning": "immich-app/immich",
+    "docker.io/vaultwarden/server": "dani-garcia/vaultwarden",
+    "vaultwarden/server": "dani-garcia/vaultwarden",
+    "docker.io/mailserver/docker-mailserver": "docker-mailserver/docker-mailserver",
+    "mailserver/docker-mailserver": "docker-mailserver/docker-mailserver",
+    "docker.n8n.io/n8nio/n8n": "n8n-io/n8n",
+    "matrixdotorg/synapse": "element-hq/synapse",
+    "headscale/headscale": "juanfont/headscale",
+    "technitium/dns-server": "TechnitiumSoftware/DnsServer",
+    "ghcr.io/paperless-ngx/paperless-ngx": "paperless-ngx/paperless-ngx",
+    "ghcr.io/blakeblackshear/frigate": "blakeblackshear/frigate",
+    "ghcr.io/dgtlmoon/changedetection.io": "dgtlmoon/changedetection.io",
+    "ghcr.io/linkwarden/linkwarden": "linkwarden/linkwarden",
+    "ghcr.io/open-webui/open-webui": "open-webui/open-webui",
+    "ghcr.io/advplyr/audiobookshelf": "advplyr/audiobookshelf",
+    "ghcr.io/browserless/chromium": "browserless/chromium",
+    "ghcr.io/rybbit-io/rybbit-backend": "rybbit-io/rybbit",
+    "ghcr.io/rybbit-io/rybbit-client": "rybbit-io/rybbit",
+    "ghcr.io/gurucomputing/headscale-ui": "gurucomputing/headscale-ui",
+    "ghcr.io/dmunozv04/isponsorblocktv": "dmunozv04/iSponsorBlockTV",
+    "ghcr.io/gramps-project/grampsweb": "gramps-project/gramps-web",
+    "ghcr.io/project-osrm/osrm-backend": "Project-OSRM/osrm-backend",
+    "ghcr.io/flaresolverr/flaresolverr": "FlareSolverr/FlareSolverr",
+    "ghcr.io/therobbiedavis/listenarr": "therobbiedavis/listenarr",
+    "ghcr.io/immichframe/immichframe": "immichframe/ImmichFrame",
+    "lscr.io/linuxserver/qbittorrent": "linuxserver/docker-qbittorrent",
+    "lscr.io/linuxserver/lidarr": "linuxserver/docker-lidarr",
+    "lscr.io/linuxserver/prowlarr": "linuxserver/docker-prowlarr",
+    "lscr.io/linuxserver/readarr": "linuxserver/docker-readarr",
+    "lscr.io/linuxserver/speedtest-tracker": "linuxserver/docker-speedtest-tracker",
+    "privatebin/nginx-fpm-alpine": "PrivateBin/PrivateBin",
+    "freshrss/freshrss": "FreshRSS/FreshRSS",
+    "hackmdio/hackmd": "hackmdio/codimd",
+    "onlyoffice/documentserver": "ONLYOFFICE/DocumentServer",
+    "netboxcommunity/netbox": "netbox-community/netbox",
+    "stirlingtools/stirling-pdf": "Stirling-Tools/Stirling-PDF",
+    "phpipam/phpipam-www": "phpipam/phpipam",
+    "rhasspy/wyoming-whisper": "rhasspy/wyoming-addons",
+    "rhasspy/wyoming-piper": "rhasspy/wyoming-addons",
+    "clickhouse/clickhouse-server": "ClickHouse/ClickHouse",
+    "docker.io/athomasson2/ebook2audiobook": "athomasson2/ebook2audiobook",
+    "amruthpillai/reactive-resume": "AmruthPillai/Reactive-Resume",
+    "dpage/pgadmin4": "pgadmin-org/pgadmin4",
+    "ghcr.io/yourok/torrserver": "YouROK/TorrServer",
+    "opentripplanner/opentripplanner": "opentripplanner/OpenTripPlanner",
+    "codeberg.org/forgejo/forgejo": "forgejo/forgejo",
+    "shlinkio/shlink": "shlinkio/shlink",
+    "shlinkio/shlink-web-client": "shlinkio/shlink-web-client",
+    "dgtlmoon/sockpuppetbrowser": "dgtlmoon/sockpuppetbrowser"
+  },
+  "helm_chart_repo_overrides": {
+    "https://charts.goauthentik.io/": "goauthentik/authentik",
+    "https://traefik.github.io/charts": "traefik/traefik-helm-chart",
+    "https://kyverno.github.io/kyverno/":
"kyverno/kyverno", + "https://mysql.github.io/mysql-operator/": "mysql/mysql-operator", + "https://cloudnative-pg.github.io/charts": "cloudnative-pg/cloudnative-pg", + "https://charts.external-secrets.io": "external-secrets/external-secrets", + "https://metallb.github.io/metallb": "metallb/metallb", + "https://nextcloud.github.io/helm/": "nextcloud/helm", + "https://crowdsecurity.github.io/helm-charts": "crowdsecurity/helm-charts", + "https://helm.releases.hashicorp.com": "hashicorp/vault-helm", + "https://bitnami-labs.github.io/sealed-secrets": "bitnami-labs/sealed-secrets", + "https://grafana.github.io/helm-charts": "grafana/helm-charts", + "https://prometheus-community.github.io/helm-charts": "prometheus-community/helm-charts", + "https://democratic-csi.github.io/charts/": "democratic-csi/democratic-csi", + "https://stakater.github.io/stakater-charts": "stakater/Reloader", + "https://topolvm.github.io/pvc-autoresizer": "topolvm/pvc-autoresizer", + "https://kubernetes-sigs.github.io/descheduler/": "kubernetes-sigs/descheduler", + "https://kubernetes-sigs.github.io/metrics-server/": "kubernetes-sigs/metrics-server", + "https://charts.fairwinds.com/stable": "FairwindsOps/goldilocks", + "https://helm.ngc.nvidia.com/nvidia": "NVIDIA/gpu-operator", + "oci://ghcr.io/woodpecker-ci/helm": "woodpecker-ci/helm", + "oci://10.0.20.10:5000/bitnamicharts": "bitnami/charts" + }, + "db_backed_services": { + "affine": { "type": "postgresql", "db_name": "affine", "shared": true }, + "claude-memory": { "type": "postgresql", "db_name": "claude_memory", "shared": true }, + "crowdsec": { "type": "postgresql", "db_name": "crowdsec", "shared": true }, + "dawarich": { "type": "postgresql", "db_name": "dawarich", "shared": true }, + "health": { "type": "postgresql", "db_name": "health", "shared": true }, + "linkwarden": { "type": "postgresql", "db_name": "linkwarden", "shared": true }, + "matrix": { "type": "postgresql", "db_name": "matrix", "shared": true }, + "n8n": { "type": "postgresql", "db_name": "n8n", "shared": true }, + "netbox": { "type": "postgresql", "db_name": "netbox", "shared": true }, + "rybbit": { "type": "postgresql", "db_name": "rybbit", "shared": true }, + "tandoor": { "type": "postgresql", "db_name": "tandoor", "shared": true }, + "technitium": { "type": "postgresql", "db_name": "technitium", "shared": true }, + "trading-bot": { "type": "postgresql", "db_name": "trading_bot", "shared": true }, + "woodpecker": { "type": "postgresql", "db_name": "woodpecker", "shared": true }, + "immich": { "type": "postgresql", "db_name": "immich", "dedicated": true, "backup_cronjob": "postgresql-backup", "backup_namespace": "immich" }, + "authentik": { "type": "postgresql", "dedicated": true, "notes": "Uses PgBouncer, managed by Helm chart" }, + "hackmd": { "type": "mysql", "db_name": "codimd", "shared": true }, + "mailserver": { "type": "mysql", "db_name": "mailserver", "shared": true }, + "monitoring": { "type": "mysql", "db_name": "monitoring", "shared": true, "notes": "Grafana backend" }, + "nextcloud": { "type": "mysql", "db_name": "nextcloud", "shared": true }, + "onlyoffice": { "type": "mysql", "db_name": "onlyoffice", "shared": true }, + "paperless-ngx": { "type": "mysql", "db_name": "paperless_ngx", "shared": true }, + "phpipam": { "type": "mysql", "db_name": "phpipam", "shared": true }, + "real-estate-crawler": { "type": "mysql", "db_name": "wrongmove", "shared": true }, + "speedtest": { "type": "mysql", "db_name": "speedtest", "shared": true }, + "url": { "type": "mysql", "db_name": "shlink", 
"shared": true }, + "vault": { "type": "mysql", "db_name": "vault", "shared": true } + }, + "backup_infrastructure": { + "postgresql": { + "cronjob_name": "postgresql-backup", + "namespace": "dbaas", + "credential_secret": "pg-cluster-superuser", + "credential_key": "password", + "host": "pg-cluster-rw.dbaas", + "backup_pvc": "dbaas-postgresql-backup-host" + }, + "mysql": { + "cronjob_name": "mysql-backup", + "namespace": "dbaas", + "credential_secret": "cluster-secret", + "credential_key": "ROOT_PASSWORD", + "host": "mysql.dbaas", + "backup_pvc": "dbaas-mysql-backup-host" + } + }, + "version_jump_always_step": [ + "authentik", + "nextcloud", + "immich" + ], + "auto_detect_rules": { + "ghcr.io/{org}/{repo}": "Use org/repo directly, strip -server/-backend suffixes if repo 404s", + "docker.io/{org}/{repo}": "Try org/repo on GitHub", + "lscr.io/linuxserver/{app}": "Map to linuxserver/docker-{app}", + "quay.io/{org}/{repo}": "Try org/repo on GitHub", + "registry.gitlab.com/{org}/{repo}": "Try org/repo on GitHub (may be GitLab-only)" + }, + "skip_image_patterns": [ + "viktorbarzin/*", + "registry.viktorbarzin.me/*", + "ancamilea/*", + "mghee/*", + "*postgres*", + "*mysql*", + "*redis*", + "*clickhouse*", + "*etcd*", + "registry.k8s.io/*", + "quay.io/tigera/*", + "quay.io/metallb/*", + "nvcr.io/*", + "reg.kyverno.io/*" + ], + "breaking_change_keywords": [ + "breaking", + "BREAKING", + "migration required", + "schema change", + "database migration", + "manual intervention", + "action required", + "removed", + "deprecated", + "renamed", + "incompatible" + ] +} diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/.gitignore b/modules/kubernetes/ebook2audiobook/audiblez-web/.gitignore new file mode 100644 index 00000000..bfa32681 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/.gitignore @@ -0,0 +1,53 @@ +# Python +__pycache__/ +*.py[cod] +*$py.class +*.so +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +*.egg-info/ +.installed.cfg +*.egg +venv/ +.venv/ +ENV/ + +# Node +node_modules/ +frontend/dist/ +npm-debug.log* +yarn-debug.log* +yarn-error.log* + +# IDE +.idea/ +.vscode/ +*.swp +*.swo +*~ + +# OS +.DS_Store +._* +Thumbs.db + +# Uploads and outputs (runtime data) +uploads/ +outputs/ + +# Environment +.env +.env.local +.env.*.local + diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/Dockerfile b/modules/kubernetes/ebook2audiobook/audiblez-web/Dockerfile new file mode 100644 index 00000000..55b11b30 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/Dockerfile @@ -0,0 +1,35 @@ +FROM viktorbarzin/audiblez:latest + +# Install Node.js for building frontend +RUN apt-get update && \ + apt-get install -y curl && \ + curl -fsSL https://deb.nodesource.com/setup_22.x | bash - && \ + apt-get install -y nodejs && \ + apt-get clean && \ + rm -rf /var/lib/apt/lists/* + +WORKDIR /app + +# Build frontend +COPY frontend/package.json frontend/package-lock.json* ./frontend/ +WORKDIR /app/frontend +RUN npm install + +COPY frontend/ ./ +RUN npm run build + +# Install backend dependencies +WORKDIR /app/backend +COPY backend/requirements.txt ./ +RUN pip install --no-cache-dir --break-system-packages -r requirements.txt + +COPY backend/ ./ + +# Copy voice samples +COPY samples/ /app/samples/ + +WORKDIR /app/backend + +EXPOSE 8000 + +CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"] diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/README.md 
b/modules/kubernetes/ebook2audiobook/audiblez-web/README.md
new file mode 100644
index 00000000..45cf9a84
--- /dev/null
+++ b/modules/kubernetes/ebook2audiobook/audiblez-web/README.md
@@ -0,0 +1,52 @@
+# Audiblez Web UI
+
+Web interface for converting EPUB files to audiobooks using [audiblez](https://github.com/santinic/audiblez).
+
+## Features
+
+- Upload EPUB files via drag & drop
+- Select from 50+ voices across multiple languages
+- Preview voice samples before converting
+- Real-time progress updates via WebSocket
+- Download completed audiobooks
+
+## Development
+
+### Backend
+
+```bash
+cd backend
+pip install -r requirements.txt
+uvicorn main:app --reload
+```
+
+### Frontend
+
+```bash
+cd frontend
+npm install
+npm run dev
+```
+
+### Voice Samples
+
+Generate voice samples (requires audiblez environment):
+
+```bash
+python generate_samples.py samples/
+```
+
+## Docker Build
+
+```bash
+docker build -t audiblez-web .
+docker run -p 8000:8000 -v /path/to/data:/mnt audiblez-web
+```
+
+## Deployment
+
+Deployed to Kubernetes via Terraform. The service mounts NFS storage at `/mnt` for:
+- `/mnt/uploads` - Uploaded EPUB files
+- `/mnt/outputs` - Generated audiobooks
diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/__init__.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/auth.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/auth.py
new file mode 100644
index 00000000..6ce68d8c
--- /dev/null
+++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/auth.py
@@ -0,0 +1,104 @@
+"""
+Authentication module for extracting user identity from Authentik headers.
+
+When nginx ingress is protected with Authentik, these headers are forwarded:
+- X-Authentik-Username: The user's username
+- X-Authentik-Uid: Unique user ID (used for directory separation)
+- X-Authentik-Email: User's email
+- X-Authentik-Name: User's display name
+- X-Authentik-Groups: Comma-separated group list
+"""
+
+from dataclasses import dataclass
+from fastapi import Request, HTTPException
+from typing import Optional
+import re
+
+
+@dataclass
+class User:
+    """Represents an authenticated user from Authentik."""
+    uid: str
+    username: str
+    email: Optional[str] = None
+    name: Optional[str] = None
+    groups: Optional[list[str]] = None
+
+    def __post_init__(self):
+        if self.groups is None:
+            self.groups = []
+
+
+def sanitize_user_id(uid: str) -> str:
+    """
+    Sanitize user ID for use as a directory name.
+    Only allows alphanumeric, hyphens, and underscores.
+    """
+    if not uid:
+        raise ValueError("User ID cannot be empty")
+
+    # Only allow safe characters for filesystem
+    safe_uid = re.sub(r'[^a-zA-Z0-9\-_]', '', uid)
+
+    if not safe_uid:
+        raise ValueError("User ID contains no valid characters")
+
+    # Limit length to prevent path issues
+    if len(safe_uid) > 64:
+        safe_uid = safe_uid[:64]
+
+    return safe_uid
+
+
+async def get_current_user(request: Request) -> User:
+    """
+    Extract user information from Authentik headers.
+
+    This is a FastAPI dependency that should be used on protected endpoints.
+    Raises 401 if user headers are not present (not authenticated).
+ """ + # Header names are case-insensitive, but commonly forwarded as: + uid = request.headers.get("X-Authentik-Uid") + username = request.headers.get("X-Authentik-Username") + email = request.headers.get("X-Authentik-Email") + name = request.headers.get("X-Authentik-Name") + groups_str = request.headers.get("X-Authentik-Groups", "") + + # For development/testing, check for alternative header names + if not uid: + uid = request.headers.get("X-Authentik-Userid") + if not uid: + uid = request.headers.get("Remote-User") + + if not uid or not username: + raise HTTPException( + status_code=401, + detail="Authentication required. Authentik headers not found." + ) + + try: + safe_uid = sanitize_user_id(uid) + except ValueError as e: + raise HTTPException(status_code=400, detail=str(e)) + + # Parse groups (comma-separated) + groups = [g.strip() for g in groups_str.split(",") if g.strip()] + + return User( + uid=safe_uid, + username=username, + email=email, + name=name, + groups=groups + ) + + +async def get_optional_user(request: Request) -> Optional[User]: + """ + Extract user information if available, or return None. + Use this for endpoints that work with or without authentication. + """ + try: + return await get_current_user(request) + except HTTPException: + return None diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/routes.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/routes.py new file mode 100644 index 00000000..c474c8fa --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/routes.py @@ -0,0 +1,307 @@ +from fastapi import APIRouter, UploadFile, File, HTTPException, Depends +from fastapi.responses import FileResponse +from pydantic import BaseModel +from pathlib import Path +import shutil +import asyncio +import re + +from models.schemas import Voice, JobCreate, Job, JobProgress, ChapterInfo +from services.voices import get_all_voices, get_voices_by_language, get_voice +from services.converter import job_manager +from api.auth import User, get_current_user + +router = APIRouter(prefix="/api") + + +def sanitize_filename(filename: str, max_length: int = 200) -> str: + """ + Sanitize a filename to prevent path traversal and shell injection. + Only allows alphanumeric characters, spaces, hyphens, underscores, parentheses, and dots. 
+ """ + if not filename: + raise ValueError("Filename cannot be empty") + + # Remove any path components (prevent path traversal) + filename = Path(filename).name + + # Only allow safe characters: alphanumeric, space, hyphen, underscore, parentheses, dot + # This regex removes anything that isn't in the allowed set + safe_filename = re.sub(r'[^a-zA-Z0-9\s\-_().]', '', filename) + + # Collapse multiple spaces/dots + safe_filename = re.sub(r'\s+', ' ', safe_filename) + safe_filename = re.sub(r'\.+', '.', safe_filename) + + # Strip leading/trailing whitespace and dots + safe_filename = safe_filename.strip(' .') + + # Limit length + if len(safe_filename) > max_length: + safe_filename = safe_filename[:max_length] + + if not safe_filename: + raise ValueError("Filename contains no valid characters") + + return safe_filename + + +class RenameRequest(BaseModel): + new_name: str + + +# ============================================================================ +# Voice endpoints (no auth required - public info) +# ============================================================================ + +@router.get("/voices", response_model=list[Voice]) +async def list_voices(): + """Get all available voices.""" + return get_all_voices() + + +@router.get("/voices/grouped") +async def list_voices_grouped(): + """Get voices grouped by language.""" + return get_voices_by_language() + + +@router.get("/voices/{voice_id}/sample") +async def get_voice_sample(voice_id: str): + """Get voice sample audio file.""" + voice = get_voice(voice_id) + if not voice: + raise HTTPException(status_code=404, detail="Voice not found") + + # Try NFS storage first (persistent), then bundled samples + sample_path = Path("/mnt/samples") / f"{voice_id}.mp3" + if not sample_path.exists(): + sample_path = Path("/app/samples") / f"{voice_id}.mp3" + if not sample_path.exists(): + raise HTTPException(status_code=404, detail="Sample not available") + + return FileResponse(sample_path, media_type="audio/mpeg") + + +# ============================================================================ +# User info endpoint +# ============================================================================ + +@router.get("/me") +async def get_current_user_info(user: User = Depends(get_current_user)): + """Get current authenticated user info.""" + return { + "uid": user.uid, + "username": user.username, + "email": user.email, + "name": user.name, + "groups": user.groups + } + + +# ============================================================================ +# Upload endpoints (user-scoped) +# ============================================================================ + +@router.post("/upload") +async def upload_file(file: UploadFile = File(...), user: User = Depends(get_current_user)): + """Upload an EPUB file to user's directory.""" + if not file.filename.endswith(".epub"): + raise HTTPException(status_code=400, detail="Only EPUB files are supported") + + # Save file to user's uploads directory + upload_dir = job_manager.get_user_uploads_dir(user.uid) + + # Sanitize the filename + try: + safe_filename = sanitize_filename(file.filename) + except ValueError as e: + raise HTTPException(status_code=400, detail=str(e)) + + file_path = upload_dir / safe_filename + + with file_path.open("wb") as buffer: + shutil.copyfileobj(file.file, buffer) + + return {"filename": safe_filename, "size": file_path.stat().st_size} + + +# ============================================================================ +# Job endpoints (user-scoped) +# 
============================================================================ + +@router.post("/jobs", response_model=Job) +async def create_job(job_create: JobCreate, user: User = Depends(get_current_user)): + """Create a new conversion job.""" + # Verify file exists in user's uploads + file_path = job_manager.get_user_uploads_dir(user.uid) / job_create.filename + if not file_path.exists(): + raise HTTPException(status_code=404, detail="File not found") + + # Verify voice exists + voice = get_voice(job_create.voice) + if not voice: + raise HTTPException(status_code=404, detail="Voice not found") + + # Create job with user ownership + job = job_manager.create_job( + user_id=user.uid, + filename=job_create.filename, + voice=job_create.voice, + speed=job_create.speed, + use_gpu=job_create.use_gpu + ) + + # Start conversion in background + asyncio.create_task(job_manager.run_conversion(job.id)) + + return job + + +@router.get("/jobs", response_model=list[Job]) +async def list_jobs(user: User = Depends(get_current_user)): + """Get all jobs for current user.""" + return job_manager.get_user_jobs(user.uid) + + +@router.get("/jobs/{job_id}", response_model=Job) +async def get_job(job_id: str, user: User = Depends(get_current_user)): + """Get a specific job (must be owned by user).""" + job = job_manager.get_job(job_id, user.uid) + if not job: + raise HTTPException(status_code=404, detail="Job not found") + return job + + +@router.get("/jobs/{job_id}/download") +async def download_job(job_id: str, user: User = Depends(get_current_user)): + """Download the completed audiobook.""" + job = job_manager.get_job(job_id, user.uid) + if not job: + raise HTTPException(status_code=404, detail="Job not found") + + if job.status != "completed": + raise HTTPException(status_code=400, detail="Job not completed") + + if not job.output_file: + raise HTTPException(status_code=404, detail="Output file not found") + + output_path = job_manager.get_user_outputs_dir(user.uid) / job_id / job.output_file + if not output_path.exists(): + raise HTTPException(status_code=404, detail="Output file not found") + + return FileResponse( + output_path, + media_type="audio/mp4", + filename=job.output_file + ) + + +@router.get("/jobs/{job_id}/chapters", response_model=list[ChapterInfo]) +async def get_job_chapters(job_id: str, user: User = Depends(get_current_user)): + """Get chapter metadata for a job's audiobook.""" + job = job_manager.get_job(job_id, user.uid) + if not job: + raise HTTPException(status_code=404, detail="Job not found") + + if job.status != "completed": + raise HTTPException(status_code=400, detail="Job not completed") + + return job.chapters + + +@router.delete("/jobs/{job_id}") +async def delete_job(job_id: str, user: User = Depends(get_current_user)): + """Delete a job (must be owned by user).""" + if not job_manager.delete_job(job_id, user.uid): + raise HTTPException(status_code=404, detail="Job not found") + + return {"status": "deleted"} + + +# ============================================================================ +# Audiobook endpoints (user-scoped) +# ============================================================================ + +@router.get("/audiobooks") +async def list_audiobooks(user: User = Depends(get_current_user)): + """List all completed audiobooks for current user.""" + return job_manager.get_user_audiobooks(user.uid) + + +@router.get("/audiobooks/{audiobook_id}/download") +async def download_audiobook(audiobook_id: str, user: User = Depends(get_current_user)): + """Download an audiobook 
by its ID (job folder name).""" + output_dir = job_manager.get_user_outputs_dir(user.uid) / audiobook_id + + if not output_dir.exists(): + raise HTTPException(status_code=404, detail="Audiobook not found") + + # Find the audio file + audio_files = list(output_dir.glob("*.m4b")) + list(output_dir.glob("*.mp3")) + if not audio_files: + raise HTTPException(status_code=404, detail="Audio file not found") + + audio_file = audio_files[0] + media_type = "audio/mp4" if audio_file.suffix == ".m4b" else "audio/mpeg" + + return FileResponse( + audio_file, + media_type=media_type, + filename=audio_file.name + ) + + +@router.delete("/audiobooks/{audiobook_id}") +async def delete_audiobook(audiobook_id: str, user: User = Depends(get_current_user)): + """Delete an audiobook and its folder.""" + output_dir = job_manager.get_user_outputs_dir(user.uid) / audiobook_id + + if not output_dir.exists(): + raise HTTPException(status_code=404, detail="Audiobook not found") + + # Delete all files in the directory and the directory itself + for file in output_dir.iterdir(): + file.unlink() + output_dir.rmdir() + + return {"status": "deleted"} + + +@router.patch("/audiobooks/{audiobook_id}/rename") +async def rename_audiobook(audiobook_id: str, rename_request: RenameRequest, user: User = Depends(get_current_user)): + """Rename an audiobook file. Input is sanitized to prevent path traversal and injection.""" + output_dir = job_manager.get_user_outputs_dir(user.uid) / audiobook_id + + if not output_dir.exists(): + raise HTTPException(status_code=404, detail="Audiobook not found") + + # Find the audio file + audio_files = list(output_dir.glob("*.m4b")) + list(output_dir.glob("*.mp3")) + if not audio_files: + raise HTTPException(status_code=404, detail="Audio file not found") + + current_file = audio_files[0] + current_extension = current_file.suffix # .m4b or .mp3 + + # Sanitize the new name + try: + safe_name = sanitize_filename(rename_request.new_name) + except ValueError as e: + raise HTTPException(status_code=400, detail=str(e)) + + # Ensure the new name has the correct extension + if not safe_name.lower().endswith(current_extension.lower()): + safe_name = safe_name + current_extension + + # Create the new path (same directory, new filename) + new_file = output_dir / safe_name + + # Check if target already exists + if new_file.exists() and new_file != current_file: + raise HTTPException(status_code=400, detail="A file with that name already exists") + + # Rename the file using pathlib (no shell commands) + current_file.rename(new_file) + + return {"status": "renamed", "new_filename": safe_name} diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/websocket.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/websocket.py new file mode 100644 index 00000000..c8239afb --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/api/websocket.py @@ -0,0 +1,101 @@ +from fastapi import WebSocket, WebSocketDisconnect, HTTPException +from services.converter import job_manager +from models.schemas import JobProgress +from api.auth import sanitize_user_id + + +class ConnectionManager: + """Manages WebSocket connections for job progress updates.""" + + def __init__(self): + self.active_connections: dict[str, list[WebSocket]] = {} + + async def connect(self, job_id: str, websocket: WebSocket): + """Connect a websocket for a specific job.""" + await websocket.accept() + + if job_id not in self.active_connections: + self.active_connections[job_id] = [] + 
self.active_connections[job_id].append(websocket)
+
+    def disconnect(self, job_id: str, websocket: WebSocket):
+        """Disconnect a websocket."""
+        if job_id in self.active_connections:
+            if websocket in self.active_connections[job_id]:
+                self.active_connections[job_id].remove(websocket)
+            if not self.active_connections[job_id]:
+                del self.active_connections[job_id]
+
+    async def send_progress(self, job_id: str, progress: JobProgress):
+        """Send progress update to all connected clients for a job."""
+        if job_id in self.active_connections:
+            disconnected = []
+            for connection in self.active_connections[job_id]:
+                try:
+                    await connection.send_json(progress.model_dump())
+                except Exception:
+                    disconnected.append(connection)
+
+            # Remove disconnected clients
+            for conn in disconnected:
+                self.disconnect(job_id, conn)
+
+
+manager = ConnectionManager()
+
+
+def get_user_from_websocket(websocket: WebSocket) -> str | None:
+    """
+    Extract user ID from websocket headers.
+    WebSocket connections receive HTTP headers during the upgrade handshake.
+    """
+    # Try various header name formats
+    uid = websocket.headers.get("x-authentik-uid")
+    if not uid:
+        uid = websocket.headers.get("X-Authentik-Uid")
+    if not uid:
+        uid = websocket.headers.get("x-authentik-userid")
+    if not uid:
+        uid = websocket.headers.get("remote-user")
+
+    if uid:
+        try:
+            return sanitize_user_id(uid)
+        except ValueError:
+            return None
+    return None
+
+
+async def websocket_endpoint(websocket: WebSocket, job_id: str):
+    """WebSocket endpoint for job progress updates."""
+    # Extract user from headers
+    user_id = get_user_from_websocket(websocket)
+
+    # Verify job exists and user has access
+    job = job_manager.get_job(job_id, user_id)
+    if not job:
+        # Close connection if job not found or not owned by user
+        await websocket.close(code=4004, reason="Job not found or access denied")
+        return
+
+    await manager.connect(job_id, websocket)
+
+    # Register progress callback
+    async def progress_callback(progress: JobProgress):
+        await manager.send_progress(job_id, progress)
+
+    job_manager.register_progress_callback(job_id, progress_callback)
+
+    try:
+        # Send initial status
+        await websocket.send_json({
+            "progress": job.progress,
+            "status": job.status,
+        })
+
+        # Wait for messages (keep-alive)
+        while True:
+            await websocket.receive_text()
+    except WebSocketDisconnect:
+        manager.disconnect(job_id, websocket)
+        job_manager.unregister_progress_callback(job_id, progress_callback)
diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/main.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/main.py
new file mode 100644
index 00000000..9ac643b4
--- /dev/null
+++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/main.py
@@ -0,0 +1,41 @@
+from fastapi import FastAPI, WebSocket
+from fastapi.staticfiles import StaticFiles
+from fastapi.middleware.cors import CORSMiddleware
+from pathlib import Path
+
+from api.routes import router
+from api.websocket import websocket_endpoint
+
+app = FastAPI(title="Audiblez Web API", version="1.0.0")
+
+# CORS middleware for development
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Health check - must be before static mount
+@app.get("/health")
+async def health_check():
+    return {"status": "healthy"}
+
+# Include API routes
+app.include_router(router)
+
+# WebSocket endpoint - the parameter must be annotated as WebSocket so FastAPI injects the connection
+@app.websocket("/ws/jobs/{job_id}")
+async def websocket_route(websocket: WebSocket, job_id: str):
+    await websocket_endpoint(websocket, job_id)
+
+# Serve static frontend files - MUST BE LAST as it catches all routes +static_dir = Path("/app/frontend/dist") +if static_dir.exists(): + app.mount("/", StaticFiles(directory=str(static_dir), html=True), name="static") + + +if __name__ == "__main__": + import uvicorn + uvicorn.run(app, host="0.0.0.0", port=8000) diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/models/__init__.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/models/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/models/schemas.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/models/schemas.py new file mode 100644 index 00000000..8fd53740 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/models/schemas.py @@ -0,0 +1,59 @@ +from pydantic import BaseModel, Field +from typing import Optional, Literal +from datetime import datetime +from enum import Enum + + +class JobStatus(str, Enum): + PENDING = "pending" + PROCESSING = "processing" + COMPLETED = "completed" + FAILED = "failed" + CANCELLED = "cancelled" + + +class Voice(BaseModel): + id: str + name: str + language: str + gender: Literal["M", "F"] + quality: str = "medium" + + +class JobCreate(BaseModel): + filename: str + voice: str + speed: float = Field(default=1.0, ge=0.5, le=2.0) + use_gpu: bool = True + + +class ChapterInfo(BaseModel): + """Chapter metadata extracted from EPUB and embedded in M4B.""" + title: str + start_ms: int + end_ms: int + + +class JobProgress(BaseModel): + progress: float = Field(ge=0, le=100) + eta: Optional[str] = None + current_chapter: Optional[str] = None + total_chapters: Optional[int] = None + status: JobStatus + + +class Job(BaseModel): + id: str + user_id: str # User who owns this job + filename: str + voice: str + speed: float + use_gpu: bool + status: JobStatus + progress: float = 0 + created_at: datetime + updated_at: datetime + error: Optional[str] = None + output_file: Optional[str] = None + total_chapters: int = 0 + chapters: list[ChapterInfo] = [] diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/requirements.txt b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/requirements.txt new file mode 100644 index 00000000..484ec2c6 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/requirements.txt @@ -0,0 +1,11 @@ +fastapi==0.115.0 +uvicorn[standard]==0.32.0 +python-multipart==0.0.12 +websockets==13.1 +aiofiles==24.1.0 +pydantic==2.9.2 +pydantic-settings==2.6.0 +ebooklib>=0.18 +pydub>=0.25.1 +beautifulsoup4>=4.12.0 +lxml>=5.0.0 diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/__init__.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/chapter_embedder.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/chapter_embedder.py new file mode 100644 index 00000000..89b4f652 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/chapter_embedder.py @@ -0,0 +1,156 @@ +"""M4B chapter metadata embedding service.""" + +import re +import subprocess +import tempfile +from pathlib import Path + +from pydub import AudioSegment + +from .epub_parser import Chapter + + +def get_chapter_audio_durations(output_dir: Path) -> list[int]: + """Calculate duration of each chapter WAV file in milliseconds. 
+ + audiblez produces files like: {bookname}_chapter_{N}.wav + e.g., mybook_chapter_1.wav, mybook_chapter_2.wav + + Args: + output_dir: Directory containing the WAV files + + Returns: + List of durations in milliseconds, ordered by chapter number + """ + durations = [] + + # Find all chapter WAV files - audiblez uses {name}_chapter_{N}.wav + wav_files = list(output_dir.glob("*_chapter_*.wav")) + + if not wav_files: + # Fallback: try any WAV files + wav_files = list(output_dir.glob("*.wav")) + + if not wav_files: + print(f"No WAV files found in {output_dir}") + return durations + + # Sort by extracting chapter number from filename using regex + # Pattern: look for _chapter_N or chapter_N in filename + def extract_chapter_num(path: Path) -> int: + name = path.stem + # Try to find chapter number with regex - handles various patterns + # e.g., "book_chapter_1", "mybook_chapter_12", "chapter_3_voice" + match = re.search(r'chapter[_-]?(\d+)', name, re.IGNORECASE) + if match: + return int(match.group(1)) + # Fallback: find any number in the filename + match = re.search(r'(\d+)', name) + if match: + return int(match.group(1)) + return 0 + + wav_files.sort(key=extract_chapter_num) + + print(f"Found {len(wav_files)} WAV files to process for durations") + for wav_file in wav_files: + try: + audio = AudioSegment.from_file(str(wav_file)) + durations.append(len(audio)) # duration in ms + print(f" Chapter WAV: {wav_file.name} - {len(audio)}ms ({len(audio)/1000:.1f}s)") + except Exception as e: + print(f" Error reading {wav_file}: {e}") + continue + + return durations + + +def generate_ffmpeg_metadata(chapters: list[Chapter], durations: list[int]) -> str: + """Generate FFmpeg FFMETADATA1 format string with chapter markers. + + Args: + chapters: List of Chapter objects with titles + durations: List of durations in milliseconds for each chapter + + Returns: + FFMETADATA1 formatted string + """ + metadata = ";FFMETADATA1\n" + + current_time_ms = 0 + + # Match chapters with durations + num_chapters = min(len(chapters), len(durations)) + + for i in range(num_chapters): + chapter = chapters[i] + duration = durations[i] + + chapter.start_ms = current_time_ms + chapter.end_ms = current_time_ms + duration + chapter.duration_ms = duration + + metadata += f"\n[CHAPTER]\n" + metadata += f"TIMEBASE=1/1000\n" + metadata += f"START={chapter.start_ms}\n" + metadata += f"END={chapter.end_ms}\n" + metadata += f"title={chapter.title}\n" + + current_time_ms = chapter.end_ms + + return metadata + + +def embed_chapters_in_m4b(input_m4b: Path, metadata_content: str) -> Path: + """Re-mux M4B with chapter metadata using FFmpeg. 
+ + Args: + input_m4b: Path to the input M4B file + metadata_content: FFMETADATA1 formatted string + + Returns: + Path to the output M4B with chapters (same as input, replaced) + """ + output_m4b = input_m4b.with_suffix('.chaptered.m4b') + + # Write metadata to temporary file + with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f: + f.write(metadata_content) + metadata_file = Path(f.name) + + try: + cmd = [ + 'ffmpeg', '-y', + '-i', str(input_m4b), + '-f', 'ffmetadata', '-i', str(metadata_file), + '-map', '0:a', + '-map_metadata', '1', + '-c:a', 'copy', # Copy audio without re-encoding + '-movflags', '+faststart+use_metadata_tags', + str(output_m4b) + ] + + print(f"Running FFmpeg: {' '.join(cmd)}") + result = subprocess.run(cmd, check=True, capture_output=True, text=True) + + if result.returncode != 0: + print(f"FFmpeg stderr: {result.stderr}") + raise RuntimeError(f"FFmpeg failed: {result.stderr}") + + # Replace original with chaptered version + input_m4b.unlink() + output_m4b.rename(input_m4b) + + print(f"Successfully embedded chapters in {input_m4b}") + return input_m4b + + except subprocess.CalledProcessError as e: + print(f"FFmpeg error: {e.stderr}") + # Clean up temp file + if output_m4b.exists(): + output_m4b.unlink() + raise + finally: + # Clean up metadata file + if metadata_file.exists(): + metadata_file.unlink() diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/converter.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/converter.py new file mode 100644 index 00000000..ed7d54c1 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/converter.py @@ -0,0 +1,291 @@ +import asyncio +import uuid +import os +from datetime import datetime +from pathlib import Path +from typing import Callable, Optional +import subprocess +import json + +from models.schemas import Job, JobStatus, JobProgress, ChapterInfo +from services.epub_parser import extract_chapters, Chapter +from services.chapter_embedder import ( + get_chapter_audio_durations, + generate_ffmpeg_metadata, + embed_chapters_in_m4b +) + + +class JobManager: + """Manages conversion jobs and their state with user isolation.""" + + def __init__(self, storage_path: str = "/mnt"): + self.storage_path = Path(storage_path) + self.jobs: dict[str, Job] = {} + self.progress_callbacks: dict[str, list[Callable]] = {} + + def get_user_uploads_dir(self, user_id: str) -> Path: + """Get the uploads directory for a specific user.""" + user_dir = self.storage_path / "users" / user_id / "uploads" + user_dir.mkdir(parents=True, exist_ok=True) + return user_dir + + def get_user_outputs_dir(self, user_id: str) -> Path: + """Get the outputs directory for a specific user.""" + user_dir = self.storage_path / "users" / user_id / "outputs" + user_dir.mkdir(parents=True, exist_ok=True) + return user_dir + + def create_job(self, user_id: str, filename: str, voice: str, speed: float, use_gpu: bool) -> Job: + """Create a new conversion job for a user.""" + job_id = str(uuid.uuid4()) + now = datetime.now() + + job = Job( + id=job_id, + user_id=user_id, + filename=filename, + voice=voice, + speed=speed, + use_gpu=use_gpu, + status=JobStatus.PENDING, + created_at=now, + updated_at=now, + ) + + self.jobs[job_id] = job + return job + + def get_job(self, job_id: str, user_id: Optional[str] = None) -> Optional[Job]: + """Get a job by ID. 
If user_id is provided, verify ownership.""" + job = self.jobs.get(job_id) + if job and user_id and job.user_id != user_id: + return None # User doesn't own this job + return job + + def get_user_jobs(self, user_id: str) -> list[Job]: + """Get all jobs for a specific user.""" + return [job for job in self.jobs.values() if job.user_id == user_id] + + def get_all_jobs(self) -> list[Job]: + """Get all jobs (admin use only).""" + return list(self.jobs.values()) + + def update_job_status(self, job_id: str, status: JobStatus, error: Optional[str] = None): + """Update job status.""" + if job_id in self.jobs: + self.jobs[job_id].status = status + self.jobs[job_id].updated_at = datetime.now() + if error: + self.jobs[job_id].error = error + + def update_job_progress(self, job_id: str, progress: float, current_chapter: Optional[str] = None, eta: Optional[str] = None): + """Update job progress.""" + if job_id in self.jobs: + self.jobs[job_id].progress = progress + self.jobs[job_id].updated_at = datetime.now() + + # Notify callbacks + if job_id in self.progress_callbacks: + progress_data = JobProgress( + progress=progress, + eta=eta, + current_chapter=current_chapter, + status=self.jobs[job_id].status + ) + for callback in self.progress_callbacks[job_id]: + try: + asyncio.create_task(callback(progress_data)) + except Exception as e: + print(f"Error in progress callback: {e}") + + def register_progress_callback(self, job_id: str, callback: Callable): + """Register a callback for progress updates.""" + if job_id not in self.progress_callbacks: + self.progress_callbacks[job_id] = [] + self.progress_callbacks[job_id].append(callback) + + def unregister_progress_callback(self, job_id: str, callback: Callable): + """Unregister a progress callback.""" + if job_id in self.progress_callbacks: + self.progress_callbacks[job_id].remove(callback) + if not self.progress_callbacks[job_id]: + del self.progress_callbacks[job_id] + + async def run_conversion(self, job_id: str): + """Run the audiblez conversion in the background.""" + job = self.jobs.get(job_id) + if not job: + return + + try: + self.update_job_status(job_id, JobStatus.PROCESSING) + + # Prepare user-specific paths + input_path = self.get_user_uploads_dir(job.user_id) / job.filename + output_dir = self.get_user_outputs_dir(job.user_id) / job_id + output_dir.mkdir(parents=True, exist_ok=True) + + # Extract chapters from EPUB before conversion + chapters: list[Chapter] = [] + if input_path.suffix.lower() == '.epub': + chapters = extract_chapters(input_path) + self.jobs[job_id].total_chapters = len(chapters) + print(f"Extracted {len(chapters)} chapters from EPUB") + + # Build audiblez command - use the venv python/audiblez + cmd = [ + "/app/audiblez/bin/audiblez", + str(input_path), + "-o", str(output_dir), + "-v", job.voice, + "-s", str(job.speed), + ] + + if job.use_gpu: + cmd.append("-c") # --cuda flag for GPU + + # Run conversion + process = await asyncio.create_subprocess_exec( + *cmd, + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.PIPE, + cwd=str(self.storage_path) + ) + + # Monitor progress from stderr/stdout + async def read_stream(stream, is_stderr=False): + while True: + line = await stream.readline() + if not line: + break + + line_str = line.decode().strip() + print(f"{'[STDERR]' if is_stderr else '[STDOUT]'} {line_str}") + + # Parse progress from output + # Audiblez outputs progress in various formats + # We'll do a simple pattern match + if "%" in line_str: + try: + # Try to extract percentage + parts = line_str.split("%") + if 
len(parts) > 1: + # Find the number before % + num_str = parts[0].split()[-1] + progress = float(num_str) + self.update_job_progress(job_id, progress) + except: + pass + + # Check for chapter info + if "Chapter" in line_str or "chapter" in line_str: + self.update_job_progress( + job_id, + job.progress, + current_chapter=line_str + ) + + # Read both streams concurrently + await asyncio.gather( + read_stream(process.stdout), + read_stream(process.stderr, is_stderr=True) + ) + + # Wait for completion + returncode = await process.wait() + + if returncode == 0: + # Find output file + output_files = list(output_dir.glob("*.m4b")) + if not output_files: + output_files = list(output_dir.glob("*.mp3")) + + if output_files: + output_file = output_files[0] + + # Embed chapter metadata if we have chapters and WAV files + if chapters: + try: + durations = get_chapter_audio_durations(output_dir) + print(f"Found {len(durations)} chapter audio durations") + + if durations: + # Match chapter count with duration count + num_chapters = min(len(chapters), len(durations)) + if num_chapters != len(chapters): + print(f"Warning: chapter count ({len(chapters)}) != duration count ({len(durations)})") + + metadata = generate_ffmpeg_metadata(chapters[:num_chapters], durations[:num_chapters]) + embed_chapters_in_m4b(output_file, metadata) + + # Store chapter info in job for API access + self.jobs[job_id].chapters = [ + ChapterInfo( + title=c.title, + start_ms=c.start_ms, + end_ms=c.end_ms + ) + for c in chapters[:num_chapters] + ] + print(f"Embedded {num_chapters} chapters in M4B") + else: + print("No WAV files found for chapter duration calculation") + except Exception as e: + print(f"Failed to embed chapters: {e}") + # Continue without chapter embedding - non-fatal error + + self.jobs[job_id].output_file = str(output_file.name) + self.update_job_status(job_id, JobStatus.COMPLETED) + self.update_job_progress(job_id, 100.0) + else: + self.update_job_status(job_id, JobStatus.FAILED, "No output file generated") + else: + self.update_job_status(job_id, JobStatus.FAILED, f"Conversion failed with code {returncode}") + + except Exception as e: + print(f"Conversion error: {e}") + self.update_job_status(job_id, JobStatus.FAILED, str(e)) + + def delete_job(self, job_id: str, user_id: str) -> bool: + """Delete a job and its output files.""" + job = self.get_job(job_id, user_id) + if not job: + return False + + # Delete output directory if exists + output_dir = self.get_user_outputs_dir(user_id) / job_id + if output_dir.exists(): + import shutil + shutil.rmtree(output_dir) + + # Remove from jobs dict + del self.jobs[job_id] + return True + + def get_user_audiobooks(self, user_id: str) -> list[dict]: + """List all completed audiobooks for a user.""" + outputs_dir = self.get_user_outputs_dir(user_id) + audiobooks = [] + + if outputs_dir.exists(): + for job_dir in outputs_dir.iterdir(): + if job_dir.is_dir(): + # Look for m4b or mp3 files + audio_files = list(job_dir.glob("*.m4b")) + list(job_dir.glob("*.mp3")) + for audio_file in audio_files: + stat = audio_file.stat() + audiobooks.append({ + "id": job_dir.name, + "filename": audio_file.name, + "size": stat.st_size, + "created_at": stat.st_mtime, + }) + + # Sort by creation time, newest first + audiobooks.sort(key=lambda x: x["created_at"], reverse=True) + return audiobooks + + +# Global job manager instance +job_manager = JobManager() diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/epub_parser.py 
b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/epub_parser.py new file mode 100644 index 00000000..5bd8f37c --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/epub_parser.py @@ -0,0 +1,166 @@ +"""EPUB chapter extraction service. + +This parser attempts to match audiblez's chapter detection logic to ensure +the extracted chapters align with the WAV files audiblez produces. + +audiblez iterates through EPUB ITEM_DOCUMENTs and uses is_chapter() to determine +if a document is a chapter based on content length (100+ chars) and filename patterns. +""" + +import re +from dataclasses import dataclass +from pathlib import Path + +from bs4 import BeautifulSoup +from ebooklib import epub, ITEM_DOCUMENT + + +@dataclass +class Chapter: + """Represents a chapter extracted from an EPUB.""" + title: str + index: int + duration_ms: int = 0 + start_ms: int = 0 + end_ms: int = 0 + + +def sanitize_title(title: str) -> str: + """Remove characters that break FFmpeg metadata format.""" + if not title: + return "Untitled" + # Escape special chars for FFmpeg FFMETADATA format + return (title + .replace('=', '-') + .replace(';', '-') + .replace('#', '') + .replace('\\', '') + .replace('\n', ' ') + .replace('\r', '') + .strip()) + + +def is_chapter(text: str, filename: str) -> bool: + """Determine if a document is a chapter. + + Matches audiblez's is_chapter() logic: + - Content must be over 100 characters + - Filename should match common chapter patterns + """ + if len(text) < 100: + return False + + # Check filename patterns that indicate a chapter + filename_lower = filename.lower() + chapter_patterns = [ + r'chapter', + r'part[_-]?\d+', + r'split[_-]?\d+', + r'ch[_-]?\d+', + r'chap[_-]?\d+', + r'sect', # section + r'content', + r'text', + ] + + for pattern in chapter_patterns: + if re.search(pattern, filename_lower): + return True + + # If content is substantial (1000+ chars), likely a chapter even without pattern match + if len(text) > 1000: + return True + + return False + + +def extract_title_from_content(soup: BeautifulSoup, filename: str, index: int) -> str: + """Extract a chapter title from the document content.""" + # Try to find title in common heading tags + for tag in ['title', 'h1', 'h2', 'h3']: + element = soup.find(tag) + if element and element.get_text(strip=True): + title = element.get_text(strip=True) + # Truncate long titles + if len(title) > 100: + title = title[:97] + "..." + return title + + # Fallback: use filename without extension + stem = Path(filename).stem + # Clean up common patterns + stem = re.sub(r'^(chapter|chap|ch)[_-]?', 'Chapter ', stem, flags=re.IGNORECASE) + stem = re.sub(r'[_-]', ' ', stem) + + if stem and len(stem) < 50: + return stem.title() + + return f"Chapter {index + 1}" + + +def extract_chapters(epub_path: Path) -> list[Chapter]: + """Extract chapter titles matching audiblez's chapter detection logic. + + audiblez determines chapters by: + 1. Iterating through ITEM_DOCUMENT items + 2. Checking is_chapter() based on content length and filename patterns + + This ensures our chapter count matches the WAV files audiblez produces. 
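+
+    Illustrative example (hypothetical EPUB): a 40-character ``cover.xhtml``
+    and a 300-character ``toc.xhtml`` are skipped (the first is under the
+    100-char minimum; the second has no chapter-like name and is under the
+    1000-char fallback), while ``chapter_01.xhtml`` with a few hundred
+    characters of text and a 2,500-character ``split_003.xhtml`` each yield
+    a Chapter.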
+ + Args: + epub_path: Path to the EPUB file + + Returns: + List of Chapter objects with title and index + """ + try: + book = epub.read_epub(str(epub_path)) + except Exception as e: + print(f"Failed to read EPUB: {e}") + return [] + + chapters: list[Chapter] = [] + chapter_index = 0 + + # Iterate through documents like audiblez does + for item in book.get_items(): + if item.get_type() != ITEM_DOCUMENT: + continue + + try: + # Get content and parse with BeautifulSoup + content = item.get_content() + soup = BeautifulSoup(content, features='lxml') + + # Extract text from relevant tags (matching audiblez) + text_parts = [] + for tag in soup.find_all(['title', 'p', 'h1', 'h2', 'h3', 'h4', 'li']): + text = tag.get_text(strip=True) + if text: + text_parts.append(text) + + full_text = ' '.join(text_parts) + filename = item.get_name() or "" + + # Check if this document is a chapter + if is_chapter(full_text, filename): + title = extract_title_from_content(soup, filename, chapter_index) + chapters.append(Chapter( + title=sanitize_title(title), + index=chapter_index + )) + chapter_index += 1 + + except Exception as e: + print(f"Error processing document {item.get_name()}: {e}") + continue + + print(f"Extracted {len(chapters)} chapters from EPUB (audiblez-style detection)") + + # Debug: print first few chapters + for i, ch in enumerate(chapters[:5]): + print(f" {i+1}. {ch.title}") + if len(chapters) > 5: + print(f" ... and {len(chapters) - 5} more") + + return chapters diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/voices.py b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/voices.py new file mode 100644 index 00000000..5406888f --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/backend/services/voices.py @@ -0,0 +1,97 @@ +from models.schemas import Voice + +# Voice catalog from Kokoro-82M (used by audiblez) +# https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md +VOICE_CATALOG = { + # American English + "af_heart": Voice(id="af_heart", name="Heart", language="American English", gender="F"), + "af_alloy": Voice(id="af_alloy", name="Alloy", language="American English", gender="F"), + "af_aoede": Voice(id="af_aoede", name="Aoede", language="American English", gender="F"), + "af_bella": Voice(id="af_bella", name="Bella", language="American English", gender="F"), + "af_jessica": Voice(id="af_jessica", name="Jessica", language="American English", gender="F"), + "af_kore": Voice(id="af_kore", name="Kore", language="American English", gender="F"), + "af_nicole": Voice(id="af_nicole", name="Nicole", language="American English", gender="F"), + "af_nova": Voice(id="af_nova", name="Nova", language="American English", gender="F"), + "af_river": Voice(id="af_river", name="River", language="American English", gender="F"), + "af_sarah": Voice(id="af_sarah", name="Sarah", language="American English", gender="F"), + "af_sky": Voice(id="af_sky", name="Sky", language="American English", gender="F"), + "am_adam": Voice(id="am_adam", name="Adam", language="American English", gender="M"), + "am_echo": Voice(id="am_echo", name="Echo", language="American English", gender="M"), + "am_eric": Voice(id="am_eric", name="Eric", language="American English", gender="M"), + "am_fenrir": Voice(id="am_fenrir", name="Fenrir", language="American English", gender="M"), + "am_liam": Voice(id="am_liam", name="Liam", language="American English", gender="M"), + "am_michael": Voice(id="am_michael", name="Michael", language="American English", gender="M"), + "am_onyx": 
Voice(id="am_onyx", name="Onyx", language="American English", gender="M"), + "am_puck": Voice(id="am_puck", name="Puck", language="American English", gender="M"), + "am_santa": Voice(id="am_santa", name="Santa", language="American English", gender="M"), + + # British English + "bf_alice": Voice(id="bf_alice", name="Alice", language="British English", gender="F"), + "bf_emma": Voice(id="bf_emma", name="Emma", language="British English", gender="F"), + "bf_isabella": Voice(id="bf_isabella", name="Isabella", language="British English", gender="F"), + "bf_lily": Voice(id="bf_lily", name="Lily", language="British English", gender="F"), + "bm_daniel": Voice(id="bm_daniel", name="Daniel", language="British English", gender="M"), + "bm_fable": Voice(id="bm_fable", name="Fable", language="British English", gender="M"), + "bm_george": Voice(id="bm_george", name="George", language="British English", gender="M"), + "bm_lewis": Voice(id="bm_lewis", name="Lewis", language="British English", gender="M"), + + # Japanese + "jf_alpha": Voice(id="jf_alpha", name="Alpha", language="Japanese", gender="F"), + "jf_gongitsune": Voice(id="jf_gongitsune", name="Gongitsune", language="Japanese", gender="F"), + "jf_nezumi": Voice(id="jf_nezumi", name="Nezumi", language="Japanese", gender="F"), + "jf_tebukuro": Voice(id="jf_tebukuro", name="Tebukuro", language="Japanese", gender="F"), + "jm_kumo": Voice(id="jm_kumo", name="Kumo", language="Japanese", gender="M"), + + # Mandarin Chinese + "zf_xiaobei": Voice(id="zf_xiaobei", name="Xiaobei", language="Mandarin Chinese", gender="F"), + "zf_xiaoni": Voice(id="zf_xiaoni", name="Xiaoni", language="Mandarin Chinese", gender="F"), + "zf_xiaoxiao": Voice(id="zf_xiaoxiao", name="Xiaoxiao", language="Mandarin Chinese", gender="F"), + "zf_xiaoyi": Voice(id="zf_xiaoyi", name="Xiaoyi", language="Mandarin Chinese", gender="F"), + "zm_yunjian": Voice(id="zm_yunjian", name="Yunjian", language="Mandarin Chinese", gender="M"), + "zm_yunxi": Voice(id="zm_yunxi", name="Yunxi", language="Mandarin Chinese", gender="M"), + "zm_yunxia": Voice(id="zm_yunxia", name="Yunxia", language="Mandarin Chinese", gender="M"), + "zm_yunyang": Voice(id="zm_yunyang", name="Yunyang", language="Mandarin Chinese", gender="M"), + + # Spanish + "ef_dora": Voice(id="ef_dora", name="Dora", language="Spanish", gender="F"), + "em_alex": Voice(id="em_alex", name="Alex", language="Spanish", gender="M"), + "em_santa": Voice(id="em_santa", name="Santa", language="Spanish", gender="M"), + + # French + "ff_siwis": Voice(id="ff_siwis", name="Siwis", language="French", gender="F"), + + # Hindi + "hf_alpha": Voice(id="hf_alpha", name="Alpha", language="Hindi", gender="F"), + "hf_beta": Voice(id="hf_beta", name="Beta", language="Hindi", gender="F"), + "hm_omega": Voice(id="hm_omega", name="Omega", language="Hindi", gender="M"), + "hm_psi": Voice(id="hm_psi", name="Psi", language="Hindi", gender="M"), + + # Italian + "if_sara": Voice(id="if_sara", name="Sara", language="Italian", gender="F"), + "im_nicola": Voice(id="im_nicola", name="Nicola", language="Italian", gender="M"), + + # Brazilian Portuguese + "pf_dora": Voice(id="pf_dora", name="Dora", language="Brazilian Portuguese", gender="F"), + "pm_alex": Voice(id="pm_alex", name="Alex", language="Brazilian Portuguese", gender="M"), + "pm_santa": Voice(id="pm_santa", name="Santa", language="Brazilian Portuguese", gender="M"), +} + + +def get_all_voices() -> list[Voice]: + """Get all available voices.""" + return list(VOICE_CATALOG.values()) + + +def get_voice(voice_id: str) -> 
Voice | None:
+    """Get a specific voice by ID."""
+    return VOICE_CATALOG.get(voice_id)
+
+
+def get_voices_by_language() -> dict[str, list[Voice]]:
+    """Get voices grouped by language."""
+    grouped = {}
+    for voice in VOICE_CATALOG.values():
+        if voice.language not in grouped:
+            grouped[voice.language] = []
+        grouped[voice.language].append(voice)
+    return grouped
diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/index.html b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/index.html new file mode 100644 index 00000000..3cab72f1 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/index.html @@ -0,0 +1,13 @@
[index.html markup lost in extraction — of its 13 lines only the page title survives: "Audiblez Web - EPUB to Audiobook".]
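[A minimal sketch of the likely markup, assuming the stock Vite + Svelte template — only the title above and the #app mount point used by src/main.js are grounded in this diff:]
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Audiblez Web - EPUB to Audiobook</title>
  </head>
  <body>
    <div id="app"></div>
    <script type="module" src="/src/main.js"></script>
  </body>
</html>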
+ + + diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/App.svelte b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/App.svelte new file mode 100644 index 00000000..f23584da --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/App.svelte @@ -0,0 +1,279 @@ + + +
[App.svelte content lost in extraction. Recoverable template fragments: a header ("Audiblez Web" / "Convert EPUB to Audiobook"), a logout control rendered {#if currentUser}, the upload/conversion form controls, and an error banner rendered {#if error} … {error} … {/if}. The component's script and style sections — the bulk of its 279 lines — did not survive.]
+
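[A minimal sketch, assuming conventional wiring of the surviving fragments — only `currentUser`, `error`, and the `jobs` store from src/stores/jobs.js are grounded in this diff; everything else is illustrative:]
<script>
  import { onMount } from 'svelte';
  import { jobs } from './stores/jobs.js';

  let currentUser = null; // presumably populated from an auth endpoint
  let error = null;       // surfaced in the banner below

  onMount(() => jobs.refresh()); // load this user's jobs on mount
</script>

<header>
  <h1>Audiblez Web</h1>
  <p>Convert EPUB to Audiobook</p>
  {#if currentUser}
    <button>Logout</button>
  {/if}
</header>

{#if error}
  <div class="error">{error}</div>
{/if}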
+ + diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/main.js b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/main.js new file mode 100644 index 00000000..e283a988 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/main.js @@ -0,0 +1,6 @@ +import App from './App.svelte' +import { mount } from 'svelte' + +const app = mount(App, { target: document.getElementById('app') }) + +export default app diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/stores/jobs.js b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/stores/jobs.js new file mode 100644 index 00000000..6b1623d8 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/src/stores/jobs.js @@ -0,0 +1,28 @@ +import { writable } from 'svelte/store'; + +function createJobsStore() { + const { subscribe, set, update } = writable([]); + + return { + subscribe, + set, + add: (job) => update(jobs => [...jobs, job]), + updateJob: (jobId, updates) => update(jobs => + jobs.map(j => j.id === jobId ? { ...j, ...updates } : j) + ), + remove: (jobId) => update(jobs => jobs.filter(j => j.id !== jobId)), + refresh: async () => { + try { + const response = await fetch('/api/jobs'); + if (response.ok) { + const jobs = await response.json(); + set(jobs); + } + } catch (e) { + console.error('Failed to fetch jobs:', e); + } + } + }; +} + +export const jobs = createJobsStore(); diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/svelte.config.js b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/svelte.config.js new file mode 100644 index 00000000..09a1bf7a --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/svelte.config.js @@ -0,0 +1,5 @@ +import { vitePreprocess } from '@sveltejs/vite-plugin-svelte' + +export default { + preprocess: vitePreprocess() +} diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/vite.config.js b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/vite.config.js new file mode 100644 index 00000000..6bc60d2c --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/frontend/vite.config.js @@ -0,0 +1,15 @@ +import { defineConfig } from 'vite' +import { svelte } from '@sveltejs/vite-plugin-svelte' + +export default defineConfig({ + plugins: [svelte()], + server: { + proxy: { + '/api': 'http://localhost:8000', + '/ws': { + target: 'ws://localhost:8000', + ws: true + } + } + } +}) diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/generate_samples.py b/modules/kubernetes/ebook2audiobook/audiblez-web/generate_samples.py new file mode 100644 index 00000000..1b0befd1 --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/generate_samples.py @@ -0,0 +1,136 @@ +#!/usr/bin/env python3 +""" +Generate voice samples for all available voices. +Run this script in an environment with audiblez installed. + +Usage: + python generate_samples.py [output_dir] +""" + +import os +import sys +from pathlib import Path + +# Sample text for voice preview +SAMPLE_TEXT = "The quick brown fox jumps over the lazy dog. This is a sample of my voice for audiobook narration." 
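+
+# Each voice id below is written to <output_dir>/<voice_id>.mp3 (e.g.
+# samples/af_heart.mp3 with the default output dir), matching the ids
+# served by backend/services/voices.py.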
+ +# All voices from Kokoro-82M (audiblez) +VOICES = [ + # American English (20 voices) + "af_alloy", "af_aoede", "af_bella", "af_heart", "af_jessica", "af_kore", + "af_nicole", "af_nova", "af_river", "af_sarah", "af_sky", + "am_adam", "am_echo", "am_eric", "am_fenrir", "am_liam", "am_michael", + "am_onyx", "am_puck", "am_santa", + # British English (8 voices) + "bf_alice", "bf_emma", "bf_isabella", "bf_lily", + "bm_daniel", "bm_fable", "bm_george", "bm_lewis", + # Spanish (3 voices) + "ef_dora", "em_alex", "em_santa", + # French (1 voice) + "ff_siwis", + # Hindi (4 voices) + "hf_alpha", "hf_beta", "hm_omega", "hm_psi", + # Italian (2 voices) + "if_sara", "im_nicola", + # Japanese (5 voices) + "jf_alpha", "jf_gongitsune", "jf_nezumi", "jf_tebukuro", "jm_kumo", + # Brazilian Portuguese (3 voices) + "pf_dora", "pm_alex", "pm_santa", + # Mandarin Chinese (8 voices) + "zf_xiaobei", "zf_xiaoni", "zf_xiaoxiao", "zf_xiaoyi", + "zm_yunjian", "zm_yunxi", "zm_yunxia", "zm_yunyang", +] + + +def generate_sample(voice: str, output_dir: Path): + """Generate a voice sample using kokoro TTS.""" + try: + from kokoro import KPipeline + + output_file = output_dir / f"{voice}.mp3" + if output_file.exists(): + print(f"Skipping {voice} - already exists") + return True + + print(f"Generating sample for {voice}...") + + # Map voice prefix to language code + lang_map = { + 'a': 'a', # American English + 'b': 'b', # British English + 'e': 'e', # Spanish + 'f': 'f', # French + 'h': 'h', # Hindi + 'i': 'i', # Italian + 'j': 'j', # Japanese + 'p': 'p', # Portuguese + 'z': 'z', # Chinese + } + + # Extract language code from voice (first letter) + lang_code = lang_map.get(voice[0], 'a') + + # Initialize the Kokoro pipeline + pipeline = KPipeline(lang_code=lang_code) + + # Generate audio + generator = pipeline(SAMPLE_TEXT, voice=voice, speed=1.0) + + # Collect all audio chunks + audio_chunks = [] + for _, _, audio in generator: + audio_chunks.append(audio) + + if audio_chunks: + import soundfile as sf + import numpy as np + + # Concatenate audio + audio = np.concatenate(audio_chunks) + + # Save as WAV first, then convert to MP3 + wav_file = output_dir / f"{voice}.wav" + sf.write(str(wav_file), audio, 24000) + + # Convert to MP3 using ffmpeg + import subprocess + result = subprocess.run([ + "ffmpeg", "-y", "-i", str(wav_file), + "-codec:a", "libmp3lame", "-qscale:a", "5", + str(output_file) + ], capture_output=True) + + # Remove WAV file + if wav_file.exists(): + wav_file.unlink() + + if result.returncode == 0: + print(f"Generated {output_file}") + return True + else: + print(f"FFmpeg failed for {voice}: {result.stderr.decode()}") + return False + else: + print(f"Failed to generate audio for {voice}") + return False + + except Exception as e: + print(f"Error generating sample for {voice}: {e}") + return False + + +def main(): + output_dir = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("samples") + output_dir.mkdir(parents=True, exist_ok=True) + + print(f"Generating voice samples to {output_dir}") + print(f"Total voices: {len(VOICES)}") + + for voice in VOICES: + generate_sample(voice, output_dir) + + print("Done!") + + +if __name__ == "__main__": + main() diff --git a/modules/kubernetes/ebook2audiobook/audiblez-web/samples/.gitkeep b/modules/kubernetes/ebook2audiobook/audiblez-web/samples/.gitkeep new file mode 100644 index 00000000..b046185d --- /dev/null +++ b/modules/kubernetes/ebook2audiobook/audiblez-web/samples/.gitkeep @@ -0,0 +1,2 @@ +# Voice samples go here +# Run generate_samples.py to generate voice samples 
for all voices diff --git a/modules/kubernetes/frigate/config.yaml b/modules/kubernetes/frigate/config.yaml new file mode 100644 index 00000000..277d6cbd --- /dev/null +++ b/modules/kubernetes/frigate/config.yaml @@ -0,0 +1,229 @@ +mqtt: + enabled: false +birdseye: + quality: 25 +detect: + fps: 1 + enabled: true +go2rtc: + streams: + vermont-1: + - rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/101/3 +cameras: + # # Temp disabled until valchedrym is back up + valchedrym-cam-1: + enabled: true + ffmpeg: + inputs: + #- path: rtsp://admin:REDACTED_RTSP_PW@192.168.0.11:554/Streaming/Channels/101 # <----- The stream you want to use for detection + - path: rtsp://admin:REDACTED_RTSP_PW@valchedrym.ddns.net:554/Streaming/Channels/101 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + objects: + # Optional: list of objects to track from labelmap.txt (full list - https://docs.frigate.video/configuration/objects) + track: + - person + - bicycle + - car + - bird + - cat + - dog + - horse + valchedrym-cam-2: + enabled: true + ffmpeg: + inputs: + #- path: rtsp://admin:REDACTED_RTSP_PW@192.168.0.11:554/Streaming/Channels/201 # <----- The stream you want to use for detection + - path: rtsp://admin:REDACTED_RTSP_PW@valchedrym.ddns.net:554/Streaming/Channels/201 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + objects: + # Optional: list of objects to track from labelmap.txt (full list - https://docs.frigate.video/configuration/objects) + track: + - person + - bicycle + - car + - bird + - cat + - dog + - horse + vermont-1: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/101/3 # <----- The stream you want to use for detection + roles: + - record + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + detect: + enabled: false + vermont-2: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/201/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-3: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/301/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-4: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/401/1 # <----- The stream you want to use for detection + detect: + 
enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-5: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/501/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-6: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/601/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-7: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/701/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-8: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/801/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + vermont-9: + enabled: true + ffmpeg: + inputs: + - path: rtsp://admin:REDACTED_RTSP_PW@192.168.1.10:554/Streaming/Channels/901/1 # <----- The stream you want to use for detection + detect: + enabled: false # <---- disable detection until you have a working camera feed + width: 704 # <---- update for your camera's resolution + height: 576 # <---- update for your camera's resolution + rtmp: + enabled: false + record: + enabled: false + snapshots: + enabled: false + # london-ipcam: + # enabled: false + # ffmpeg: + # inputs: + # - path: rtsp://192.168.2.2:8554/london_cam # <----- The stream you want to use for detection + # roles: + # - rtmp + # - record + # - detect + # detect: + # enabled: False + # width: 1280 + # height: 720 + # record: + # enabled: False # Not needed for this camera but keeping for reference + # events: + # retain: + # default: 10 + # objects: + # # Optional: list of objects to track from labelmap.txt (full list - https://docs.frigate.video/configuration/objects) + # track: + # - person + # - shoe + # - handbag + # - wine glass + # - knife + # - pizza + # - laptop + # - book diff --git a/scripts/cluster_manager.py b/scripts/cluster_manager.py new file mode 100644 index 00000000..84f38810 --- /dev/null +++ b/scripts/cluster_manager.py @@ -0,0 +1,277 @@ +import asyncio +import click +import logging +import time +from typing import List, Union, Optional +from 
kubernetes_asyncio import client, config +from kubernetes_asyncio.client.api_client import ApiClient + +logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") +logger = logging.getLogger(__name__) + + +async def wait_for_healthy( + api_instance: client.AppsV1Api, + resource_type: str, + namespace: str, + name: str, + target_replicas: int, + timeout: int = 300, +) -> None: + start_time = time.time() + logger.info( + f"Waiting for {resource_type} {name} to reach {target_replicas} replicas..." + ) + + while True: + if time.time() - start_time > timeout: + logger.error(f"❌ Timeout reached for {resource_type} {name}") + return + + try: + if resource_type.lower() == "deployment": + res = await api_instance.read_namespaced_deployment_status( + name, namespace + ) + ready = res.status.ready_replicas or 0 + updated = res.status.updated_replicas or 0 + if ready == target_replicas and updated == target_replicas: + break + else: # StatefulSet + res = await api_instance.read_namespaced_stateful_set_status( + name, namespace + ) + ready = res.status.ready_replicas or 0 + if ready == target_replicas: + break + + except Exception as e: + logger.debug(f"Retrying status check for {name}: {e}") + + await asyncio.sleep(5) + + logger.info(f"✅ {resource_type} {name} is now healthy.") + + +async def wait_for_zero( + api: client.AppsV1Api, kind: str, ns: str, name: str, timeout: int +) -> tuple[str, str]: + start_time = asyncio.get_event_loop().time() + while (asyncio.get_event_loop().time() - start_time) < timeout: + try: + res = await ( + api.read_namespaced_deployment_status(name, ns) + if kind.lower() == "deployment" + else api.read_namespaced_stateful_set_status(name, ns) + ) + if (res.status.ready_replicas or 0) == 0: + return ns, name + except Exception: + return ns, name # Assume gone if error + await asyncio.sleep(3) + logger.error(f"Timeout: {kind} {ns}/{name} still has running pods.") + return ns, name + + +async def scale_resource( + api_instance: client.AppsV1Api, + resource_type: str, + namespace: str, + name: str, + replicas: int, +) -> None: + body = {"spec": {"replicas": replicas}} + try: + if resource_type.lower() == "deployment": + await api_instance.patch_namespaced_deployment_scale(name, namespace, body) + else: + await api_instance.patch_namespaced_stateful_set_scale( + name, namespace, body + ) + except Exception as e: + logger.error(f"Failed to scale {resource_type} {name}: {e}") + + +async def run_stop_tier( + api_v1: client.AppsV1Api, label: str, output_file: str, timeout: int +) -> None: + """Processes a single label tier: saves, scales to 0, and waits.""" + excluded_ns = ["kube-system", "kube-public", "kube-node-lease"] + + # 1. Discover + targets = [ + ("Deployment", api_v1.list_deployment_for_all_namespaces), + ("StatefulSet", api_v1.list_stateful_set_for_all_namespaces), + ] + + tier_resources = [] + for kind, list_func in targets: + resp = await list_func(label_selector=label) + tier_resources.extend( + [ + (kind, item) + for item in resp.items + if item.metadata.namespace not in excluded_ns + ] + ) + + if not tier_resources: + logger.warning(f"No resources found for label: {label}") + return + + # 2. 
Save & Scale + active_jobs: set[tuple[str, str]] = set() + wait_tasks = [] + + # Append to file so we don't overwrite previous tiers + with open(output_file, "a") as f: + for kind, item in tier_resources: + ns, name = item.metadata.namespace, item.metadata.name + reps = item.spec.replicas or 0 + f.write(f"{kind} {ns} {name} {reps}\n") + active_jobs.add((ns, name)) + + await scale_resource(api_v1, kind, ns, name, 0) + wait_tasks.append(wait_for_zero(api_v1, kind, ns, name, timeout)) + + # 3. Wait for this tier to finish before moving to next + logger.info(f"Tier [{label}]: Waiting for {len(active_jobs)} resources to stop...") + for coro in asyncio.as_completed(wait_tasks): + finished_ns, finished_name = await coro + active_jobs.discard((finished_ns, finished_name)) + if active_jobs: + remaining_ns = sorted({ns for ns, name in active_jobs}) + logger.info( + f"[{label}] Pending: {len(active_jobs)} | Namespaces: {', '.join(remaining_ns)}" + ) + + logger.info(f"✅ Tier [{label}] successfully shut down.") + + +@click.group() +def cli(): + pass + + +@cli.command() +@click.argument("labels", nargs=-1, required=True) +@click.option("--output", "-o", default="resources.txt", help="Output state file") +@click.option("--timeout", "-t", default=3600) +def stop(labels: List[str], output: str, timeout: int): + """Stop tiers sequentially. Usage: stop 'app=web' 'app=db'""" + + async def main(): + await config.load_kube_config() + # Clear/Create file at start + open(output, "w").close() + + async with ApiClient() as api_client: + api_v1 = client.AppsV1Api(api_client) + for label in labels: + logger.info(f"🚀 Processing Shutdown Tier: {label}") + await run_stop_tier(api_v1, label, output, timeout) + logger.info("🏁 Sequence complete. Cluster is gracefully stopped.") + + asyncio.run(main()) + + +@cli.command() +@click.argument("labels", nargs=-1, required=True) +@click.option("--file", "-f", default="resources.txt") +@click.option("--timeout", "-t", default=3600, help="Seconds to wait per resource") +def start(labels: List[str], file: str, timeout: int): + asyncio.run(run_start_sequence(labels, file, timeout)) + + +async def run_start_sequence(labels: List[str], file_path: str, timeout: int) -> None: + await config.load_kube_config() + + async with ApiClient() as api_client: + apps_v1 = client.AppsV1Api(api_client) + + # 1. Load the entire snapshot into memory for filtering + try: + with open(file_path, "r") as f: + # Format: Kind Namespace Name Replicas + snapshot_lines = [line.strip().split() for line in f if line.strip()] + except FileNotFoundError: + logger.error(f"Snapshot file {file_path} not found.") + return + + # 2. Iterate through labels in the order provided + for label in labels: + logger.info(f"🚀 Starting Tier: {label}") + + # Find resources in this tier by querying K8s for the label + # then matching against our snapshot file data + tier_resources = await get_resources_by_label(apps_v1, label) + + # Cross-reference: Only start things that are in BOTH the K8s label query AND our file + # This ensures we restore them to the CORRECT previous replica count + to_restore = [] + tier_keys = {(r["ns"], r["name"]) for r in tier_resources} + + for kind, ns, name, reps in snapshot_lines: + if (ns, name) in tier_keys: + to_restore.append((kind, ns, name, int(reps))) + + if not to_restore: + logger.warning(f"No resources found in snapshot for tier: {label}") + continue + + # 3. 
Scale and Wait for this specific tier + await process_start_tier(apps_v1, to_restore, timeout, label) + + logger.info("🏁 All tiers started successfully.") + + +async def get_resources_by_label(api: client.AppsV1Api, label: str) -> List[dict]: + """Helper to find what currently exists in the cluster with this label.""" + targets = [ + api.list_deployment_for_all_namespaces, + api.list_stateful_set_for_all_namespaces, + ] + found = [] + for list_func in targets: + resp = await list_func(label_selector=label) + for item in resp.items: + found.append({"ns": item.metadata.namespace, "name": item.metadata.name}) + return found + + +async def process_start_tier( + api: client.AppsV1Api, resources: list, timeout: int, label: str +): + active_jobs = set() + scale_tasks = [] + wait_tasks = [] + + # Wrapper to track which job finishes + async def tracked_wait(kind, ns, name, target, t_out): + await wait_for_healthy(api, kind, ns, name, target, t_out) + return (ns, name) + + for kind, ns, name, reps in resources: + active_jobs.add((ns, name)) + scale_tasks.append(scale_resource(api, kind, ns, name, reps)) + wait_tasks.append(tracked_wait(kind, ns, name, reps, timeout)) + + # Trigger all scales for this tier + await asyncio.gather(*scale_tasks) + + # Monitor health + for coro in asyncio.as_completed(wait_tasks): + finished_ns, finished_name = await coro + active_jobs.discard((finished_ns, finished_name)) + + if active_jobs: + remaining_ns = sorted({ns for ns, name in active_jobs}) + logger.info( + f"[{label}] Pending Health: {len(active_jobs)} | Namespaces: {', '.join(remaining_ns)}" + ) + + logger.info(f"✅ Tier [{label}] is healthy.") + + +if __name__ == "__main__": + cli() diff --git a/scripts/image_pull.sh b/scripts/image_pull.sh new file mode 100755 index 00000000..84494846 --- /dev/null +++ b/scripts/image_pull.sh @@ -0,0 +1,10 @@ +#!/usr/bin/env bash + +for n in $(kubectl get nodes -o wide | grep node | awk '{print $1}'); do + echo $n; + kubectl drain $n --ignore-daemonsets --delete-emptydir-data && \ + ssh wizard@$n < image_pull_remote.sh + # Check result + kubectl get --raw "/api/v1/nodes/$n/proxy/configz" | jq '.kubeletconfig | {serializeImagePulls, maxParallelImagePulls}' + kubectl uncordon $n +done diff --git a/scripts/image_pull_remote.sh b/scripts/image_pull_remote.sh new file mode 100755 index 00000000..537cc080 --- /dev/null +++ b/scripts/image_pull_remote.sh @@ -0,0 +1,14 @@ +#!/usr/bin/env bash + +# Containerd +sudo sed -i 's/.*max_concurrent_downloads.*/max_concurrent_downloads = 5/g' /etc/containerd/config.toml +sudo systemctl restart containerd + +# Kubelet +#sed serializeImagePulls: false # Allow container images to be downloaded in parallel +#maxParallelImagePulls: 20 # To limit the number of parallel image pulls. 
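+
+# After the edits below, /var/lib/kubelet/config.yaml should end with:
+#   serializeImagePulls: false
+#   maxParallelImagePulls: 5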
+
+sudo sed -i '/serializeImagePulls:/d' /var/lib/kubelet/config.yaml && \
+sudo sed -i '/maxParallelImagePulls:/d' /var/lib/kubelet/config.yaml && \
+echo -e 'serializeImagePulls: false\nmaxParallelImagePulls: 5' | sudo tee -a /var/lib/kubelet/config.yaml
+sudo systemctl restart kubelet
diff --git a/scripts/setup-containerd-pullthrough.sh b/scripts/setup-containerd-pullthrough.sh new file mode 100755 index 00000000..332cb953 --- /dev/null +++ b/scripts/setup-containerd-pullthrough.sh @@ -0,0 +1,115 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+############################################
+# CONFIGURATION
+############################################
+
+# Internal pull-through registry endpoint
+# Examples:
+#   http://registry.internal:5000
+#   https://registry.internal
+INTERNAL_REGISTRY="http://10.0.20.10:5002"
+
+# Path where containerd reads registry configs
+CERTS_DIR="/etc/containerd/certs.d"
+
+# Optional: path to CA file if INTERNAL_REGISTRY uses HTTPS with custom CA
+# Leave empty if not needed
+INTERNAL_CA_PATH=""
+
+# Restart containerd at the end
+RESTART_CONTAINERD=true
+
+############################################
+# REGISTRIES TO MIRROR
+############################################
+
+REGISTRIES=(
+  "docker.io"
+  "registry-1.docker.io"
+  "registry.k8s.io"
+  "quay.io"
+  "ghcr.io"
+  "gcr.io"
+  "us-docker.pkg.dev"
+  "public.ecr.aws"
+  "mcr.microsoft.com"
+)
+
+############################################
+# FUNCTIONS
+############################################
+
+require_root() {
+  if [[ "$(id -u)" -ne 0 ]]; then
+    echo "ERROR: must be run as root" >&2
+    exit 1
+  fi
+}
+
+ensure_containerd_config_path() {
+  local cfg="/etc/containerd/config.toml"
+
+  if [[ ! -f "$cfg" ]]; then
+    echo "Generating default containerd config"
+    containerd config default > "$cfg"
+  fi
+
+  if ! grep -q 'config_path *= *"/etc/containerd/certs.d"' "$cfg"; then
+    echo "Enabling config_path in containerd config"
+
+    # Minimal and safe append if section exists
+    if grep -q '\[plugins\."io.containerd.grpc.v1.cri".registry\]' "$cfg"; then
+      sed -i '/\[plugins\."io.containerd.grpc.v1.cri".registry\]/a \  config_path = "/etc/containerd/certs.d"' "$cfg"
+    else
+      cat >> "$cfg" <<'EOF'
+
+[plugins."io.containerd.grpc.v1.cri".registry]
+  config_path = "/etc/containerd/certs.d"
+EOF
+    fi
+  fi
+}
+
+write_hosts_toml() {
+  local registry="$1"
+  local dir="$CERTS_DIR/$registry"
+  local file="$dir/hosts.toml"
+
+  mkdir -p "$dir"
+
+  # NOTE: the heredoc bodies were garbled in the source diff; what follows is
+  # an assumed reconstruction using the standard containerd mirror layout
+  cat > "$file" <<EOF
+[host."$INTERNAL_REGISTRY"]
+  capabilities = ["pull", "resolve"]
+EOF
+
+  if [[ -n "$INTERNAL_CA_PATH" ]]; then
+    cat >> "$file" <<EOF
+  ca = "$INTERNAL_CA_PATH"
+EOF
+  fi
+}
[The diff is truncated here in the source: the rest of setup-containerd-pullthrough.sh (the main loop over REGISTRIES and the containerd restart) and the opening of stacks/hermes-agent/main.tf — including most of the SOUL.md ConfigMap — are missing. The surviving SOUL.md fragment resumes:]
+  `… >> /opt/data/apt-packages.txt && sort -u -o /opt/data/apt-packages.txt /opt/data/apt-packages.txt`
+  They will be auto-reinstalled on next container restart.
+ +## Communication Preferences +- Viktor prefers terse, technical responses +- No emojis unless asked +- Lead with the answer, not the reasoning +- When unsure, say so rather than guessing + EOT + } +} + +# --- Deployment --- + +resource "kubernetes_deployment" "hermes_agent" { + metadata { + name = "hermes-agent" + namespace = kubernetes_namespace.hermes_agent.metadata[0].name + labels = { + app = "hermes-agent" + tier = local.tiers.aux + } + } + spec { + strategy { + type = "Recreate" + } + replicas = 1 + selector { + match_labels = { + app = "hermes-agent" + } + } + template { + metadata { + labels = { + app = "hermes-agent" + } + annotations = { + "reloader.stakater.com/search" = "true" + } + } + spec { + # Init container 1: bootstrap config into writable PVC + init_container { + name = "bootstrap-config" + image = "docker.io/library/busybox:1.37" + command = ["sh", "-c", <<-EOF + # Create subdirectories + mkdir -p /opt/data/memories /opt/data/skills /opt/data/sessions /opt/data/cron /opt/data/logs + mkdir -p /opt/data/pip-packages/bin /opt/data/npm-global/bin + + # Copy config from ConfigMap and inject secrets + cp /config/config.yaml /opt/data/config.yaml + cp /soul/SOUL.md /opt/data/SOUL.md + + # Replace API key placeholder with actual value from secret + NVIDIA_KEY=$(cat /secrets/NVIDIA_API_KEY) + sed -i "s|__NVIDIA_API_KEY_PLACEHOLDER__|$${NVIDIA_KEY}|g" /opt/data/config.yaml + + # Generate .env from mounted secret files + echo "# Auto-generated from Vault secret/hermes-agent" > /opt/data/.env + for f in /secrets/*; do + key=$(basename "$f") + val=$(cat "$f") + echo "$${key}=$${val}" >> /opt/data/.env + done + + # Fix ownership (hermes container user) + chown -R 1000:1000 /opt/data + EOF + ] + volume_mount { + name = "data" + mount_path = "/opt/data" + } + volume_mount { + name = "config" + mount_path = "/config" + } + volume_mount { + name = "soul" + mount_path = "/soul" + } + volume_mount { + name = "secrets" + mount_path = "/secrets" + } + } + + container { + name = "hermes-agent" + image = "nousresearch/hermes-agent:latest" + # Wrap entrypoint: restore apt packages (as root) then hand off to original entrypoint + command = ["bash", "-c", <<-EOF + if [ -f /opt/data/apt-packages.txt ] && [ -s /opt/data/apt-packages.txt ]; then + echo "Restoring apt packages: $(cat /opt/data/apt-packages.txt | tr '\n' ' ')" + apt-get update -qq && \ + xargs -a /opt/data/apt-packages.txt apt-get install -y -qq --no-install-recommends 2>&1 | tail -5 + echo "Done restoring apt packages" + fi + exec /opt/hermes/docker/entrypoint.sh gateway run + EOF + ] + + env { + name = "HERMES_HOME" + value = "/opt/data" + } + + # Persist pip packages across restarts + env { + name = "PIP_TARGET" + value = "/opt/data/pip-packages" + } + env { + name = "PYTHONPATH" + value = "/opt/data/pip-packages" + } + + # Persist npm global packages across restarts + env { + name = "NPM_CONFIG_PREFIX" + value = "/opt/data/npm-global" + } + + # Add persistent bin dirs to PATH + env { + name = "PATH" + value = "/opt/data/pip-packages/bin:/opt/data/npm-global/bin:/opt/data/apt-local/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" + } + + volume_mount { + name = "data" + mount_path = "/opt/data" + } + + resources { + requests = { + cpu = "100m" + memory = "512Mi" + } + limits = { + memory = "1Gi" + } + } + + liveness_probe { + exec { + command = ["pgrep", "-f", "hermes"] + } + initial_delay_seconds = 30 + period_seconds = 30 + failure_threshold = 3 + } + } + + volume { + name = "data" + 
persistent_volume_claim { + claim_name = kubernetes_persistent_volume_claim.data_proxmox.metadata[0].name + } + } + volume { + name = "config" + config_map { + name = kubernetes_config_map.hermes_config.metadata[0].name + } + } + volume { + name = "soul" + config_map { + name = kubernetes_config_map.hermes_soul.metadata[0].name + } + } + volume { + name = "secrets" + secret { + secret_name = "hermes-agent-secrets" + } + } + } + } + } + lifecycle { + ignore_changes = [spec[0].template[0].spec[0].dns_config] + } +} + +# --- Service --- + +resource "kubernetes_service" "hermes_agent" { + metadata { + name = "hermes-agent" + namespace = kubernetes_namespace.hermes_agent.metadata[0].name + labels = { + app = "hermes-agent" + } + } + spec { + selector = { + app = "hermes-agent" + } + port { + port = 80 + target_port = 8642 + } + } +} + +# --- Ingress --- + +module "ingress" { + source = "../../modules/kubernetes/ingress_factory" + namespace = kubernetes_namespace.hermes_agent.metadata[0].name + name = "hermes-agent" + tls_secret_name = var.tls_secret_name + protected = true + extra_annotations = { + "gethomepage.dev/enabled" = "true" + "gethomepage.dev/name" = "Hermes Agent" + "gethomepage.dev/description" = "Self-improving AI agent" + "gethomepage.dev/icon" = "mdi-robot" + "gethomepage.dev/group" = "AI & Data" + "gethomepage.dev/pod-selector" = "" + } +} diff --git a/stacks/hermes-agent/secrets b/stacks/hermes-agent/secrets new file mode 120000 index 00000000..ca54a7cf --- /dev/null +++ b/stacks/hermes-agent/secrets @@ -0,0 +1 @@ +../../secrets \ No newline at end of file diff --git a/stacks/hermes-agent/terragrunt.hcl b/stacks/hermes-agent/terragrunt.hcl new file mode 100644 index 00000000..f4c920ab --- /dev/null +++ b/stacks/hermes-agent/terragrunt.hcl @@ -0,0 +1,13 @@ +include "root" { + path = find_in_parent_folders() +} + +dependency "platform" { + config_path = "../platform" + skip_outputs = true +} + +dependency "vault" { + config_path = "../vault" + skip_outputs = true +} diff --git a/stacks/hermes-agent/tiers.tf b/stacks/hermes-agent/tiers.tf new file mode 100644 index 00000000..eb0f8083 --- /dev/null +++ b/stacks/hermes-agent/tiers.tf @@ -0,0 +1,10 @@ +# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa +locals { + tiers = { + core = "0-core" + cluster = "1-cluster" + gpu = "2-gpu" + edge = "3-edge" + aux = "4-aux" + } +} diff --git a/stacks/infra/tiers.tf b/stacks/infra/tiers.tf new file mode 100644 index 00000000..eb0f8083 --- /dev/null +++ b/stacks/infra/tiers.tf @@ -0,0 +1,10 @@ +# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa +locals { + tiers = { + core = "0-core" + cluster = "1-cluster" + gpu = "2-gpu" + edge = "3-edge" + aux = "4-aux" + } +} diff --git a/stacks/n8n/workflows/.gitkeep b/stacks/n8n/workflows/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/stacks/phpipam/secrets b/stacks/phpipam/secrets new file mode 120000 index 00000000..ca54a7cf --- /dev/null +++ b/stacks/phpipam/secrets @@ -0,0 +1 @@ +../../secrets \ No newline at end of file diff --git a/stacks/phpipam/tiers.tf b/stacks/phpipam/tiers.tf new file mode 100644 index 00000000..eb0f8083 --- /dev/null +++ b/stacks/phpipam/tiers.tf @@ -0,0 +1,10 @@ +# Generated by Terragrunt. 
Sig: nIlQXj57tbuaRZEa +locals { + tiers = { + core = "0-core" + cluster = "1-cluster" + gpu = "2-gpu" + edge = "3-edge" + aux = "4-aux" + } +} diff --git a/stacks/status-page/tiers.tf b/stacks/status-page/tiers.tf new file mode 100644 index 00000000..eb0f8083 --- /dev/null +++ b/stacks/status-page/tiers.tf @@ -0,0 +1,10 @@ +# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa +locals { + tiers = { + core = "0-core" + cluster = "1-cluster" + gpu = "2-gpu" + edge = "3-edge" + aux = "4-aux" + } +} diff --git a/stacks/vault/secrets b/stacks/vault/secrets new file mode 120000 index 00000000..ca54a7cf --- /dev/null +++ b/stacks/vault/secrets @@ -0,0 +1 @@ +../../secrets \ No newline at end of file