Draft (NOT applied) of a new infra stack deploying Piper as an in-cluster text-to-speech service for the portal-assistant Gateway (portal-assistant issue #3, ADR-0003). Bulgarian (bg_BG-dimitar-medium) + English (en_US-lessac-medium), voice chosen per request.

Why this shape:
- CPU-only, always-on (replicas=1, no GPU): Piper runs in real time on CPU, so this keeps TTS off the OOM-prone shared T4 that the two GPU siblings (tts/chatterbox, portal-stt) already contend for. Bulgarian isn't on chatterbox anyway (its langs exclude bg).
- OpenAI-compatible image (openedai-speech-min, /v1/audio/speech) so the Gateway gets raw audio bytes per its tts.synthesize(text, lang) -> bytes contract and treats Piper + the future edge-tts fallback identically; this is the same shape chatterbox already uses.
- Voices on an NFS-SSD PVC, downloaded from rhasspy/piper-voices by an init container on first boot; a ConfigMap maps request voice bg/en -> .onnx model.
- ClusterIP only (audio stays on the LAN; the Gateway is the only externally exposed component, ADR-0001).

Mirrors the just-written portal-stt sibling stack's conventions. terraform fmt clean; terraform validate passes (only the codebase-wide kubernetes_namespace deprecation warnings).

HITL: operator reviews + applies via GitOps; do not apply from a worktree. Open items flagged in main.tf (image choice on a frozen upstream; resource sizing to confirm with krr).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
33 lines
1.6 KiB
HCL
include "root" {
  path = find_in_parent_folders()
}

dependency "platform" {
  config_path  = "../platform"
  skip_outputs = true
}

# portal-tts: in-cluster text-to-speech for the portal-assistant Gateway
# (portal-assistant issue #3, ADR-0003). One ALWAYS-ON Deployment of Piper
# (ghcr.io/matatonic/openedai-speech-min, OpenAI-compatible /v1/audio/speech)
# serving Bulgarian `bg_BG-dimitar-medium` + English `en_US-lessac-medium`, voice
# chosen PER REQUEST, behind a single ClusterIP Service
# `portal-tts.portal-tts.svc:8000`. Speech path: /v1/audio/speech.
#
# CPU-ONLY: Piper is a fast CPU neural TTS, so there is NO GPU node selector /
# toleration / nvidia.com/gpu request. This deliberately keeps TTS off the
# OOM-prone shared T4 (the two GPU siblings tts/chatterbox + portal-stt already
# contend for it); Bulgarian isn't available on chatterbox anyway (ADR-0003).
# replicas=1, never scaled to zero; no off-peak gate is needed when there's no
# GPU to free.
#
# Voices live on an NFS-SSD PVC, downloaded from rhasspy/piper-voices by an init
# container on first boot (both .onnx + .onnx.json), then persist. A ConfigMap
# supplies voice_to_speaker.yaml mapping request voice "bg"/"en" -> .onnx model.
#
# PLUGGABLE: ADR-0003 keeps TTS a swappable backend with edge-tts as an online
# Bulgarian fallback; that switch is Gateway-side, so nothing here changes for it.
#
# nfs_server comes from config.tfvars (192.168.1.127) via the root inputs.
#
# HITL: agent drafts; operator applies via GitOps, then verifies the rollout +
# a bg/en /v1/audio/speech smoke test (curl returns audio bytes).
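#
# SKETCH (illustrative only, NOT part of this stack's applied config): the
# voice ConfigMap described above might look roughly like the block below in
# main.tf. Assumptions to verify before use: the resource label and ConfigMap
# name are hypothetical; the voice_to_speaker.yaml layout (top-level `tts-1`
# key, one entry per request voice with a `model` path under `voices/`) is
# assumed from openedai-speech's sample Piper config and may differ in the
# pinned image. It stays commented out because terragrunt.hcl cannot hold
# resource blocks.
#
#   resource "kubernetes_config_map" "voices" {        # hypothetical label
#     metadata {
#       name      = "portal-tts-voices"                # hypothetical name
#       namespace = "portal-tts"
#     }
#     data = {
#       "voice_to_speaker.yaml" = yamlencode({
#         "tts-1" = {                                  # assumed model key
#           bg = { model = "voices/bg_BG-dimitar-medium.onnx" }
#           en = { model = "voices/en_US-lessac-medium.onnx" }
#         }
#       })
#     }
#   }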