infra/stacks/portal-tts/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

dependency "platform" {
  config_path  = "../platform"
  skip_outputs = true
}

# portal-tts: in-cluster text-to-speech for the portal-assistant Gateway
# (portal-assistant issue #3, ADR-0003). One ALWAYS-ON Deployment of Piper
# (ghcr.io/matatonic/openedai-speech-min, OpenAI-compatible /v1/audio/speech)
# serving Bulgarian `bg_BG-dimitar-medium` + English `en_US-lessac-medium`, voice
# chosen PER REQUEST, behind a single ClusterIP Service
# `portal-tts.portal-tts.svc:8000`. Speech path: /v1/audio/speech.
#
# CPU-ONLY: Piper is a fast CPU neural TTS — NO GPU node selector / toleration /
# nvidia.com/gpu request. This deliberately keeps TTS off the OOM-prone shared
# T4 (the two GPU siblings tts/chatterbox + portal-stt already contend for it);
# Bulgarian isn't available on chatterbox anyway (ADR-0003). replicas=1, never
# scaled to zero — no off-peak gate needed when there's no GPU to free.
#
# Voices live on an NFS-SSD PVC, downloaded from rhasspy/piper-voices by an init
# container on first boot (both .onnx + .onnx.json), then persist. A ConfigMap
# supplies voice_to_speaker.yaml mapping request voice "bg"/"en" -> .onnx model.
#
# PLUGGABLE: ADR-0003 keeps TTS a swappable backend with edge-tts as an online
# Bulgarian fallback — that switch is Gateway-side; nothing here changes for it.
#
# nfs_server comes from config.tfvars (192.168.1.127) via the root inputs.
#
# HITL: agent drafts; operator applies via GitOps, then verifies the rollout +
# a bg/en /v1/audio/speech smoke test (curl returns audio bytes).
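
The bg/en smoke test mentioned in the HITL note could be scripted roughly as follows. This is a minimal sketch, not the operator's actual procedure. It assumes the service accepts the usual OpenAI-style JSON body (`model`, `input`, `voice`, `response_format`), that `"tts-1"` is an accepted model name, and that the request voices are the `"bg"`/`"en"` keys described in the header comment. The sample phrases and the function names are hypothetical.

```python
import json
import urllib.request

# Service address from the header comment; only resolvable inside the cluster.
BASE_URL = "http://portal-tts.portal-tts.svc:8000"

# Hypothetical sample phrases, keyed by the request voice names ("bg"/"en")
# that the header comment says the ConfigMap maps to models.
SAMPLES = {"bg": "Добър ден", "en": "Good afternoon"}


def build_speech_request(base_url: str, voice: str, text: str) -> urllib.request.Request:
    """Build a POST to /v1/audio/speech using an OpenAI-style JSON body."""
    payload = {
        "model": "tts-1",  # assumption: model name accepted by the server
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(base_url: str = BASE_URL, timeout: float = 30.0) -> dict[str, int]:
    """Request one clip per voice and return the byte count of each response.

    urlopen raises HTTPError on non-2xx responses, so only the empty-body
    case needs an explicit check here.
    """
    sizes: dict[str, int] = {}
    for voice, text in SAMPLES.items():
        request = build_speech_request(base_url, voice, text)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            audio = response.read()
        if not audio:
            raise RuntimeError(f"voice {voice!r} returned an empty body")
        sizes[voice] = len(audio)
    return sizes
```

Calling `smoke_test()` from a pod inside the cluster should return a non-zero byte count per voice if both voices synthesize. It only confirms that audio bytes come back, not that the speech is intelligible.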
portal-assistant: land voice stacks + switch TTS to edge-tts (intelligible Bulgarian)

The portal-assistant voice-assistant stacks (portal-tts, portal-stt, portal-assistant) were applied to the live cluster from feature branches but never landed on master, the GitOps source of truth. This lands all three and, in portal-tts, fixes Bulgarian speech.

Bulgarian was unintelligible: the local Piper voice (bg_BG-dimitar-medium via espeak-ng) mangles Bulgarian consonants. A synth->Whisper round-trip turned "Добър ден" into "Обърден", and a user heard pure gibberish. English was fine.

portal-tts now runs openai-edge-tts (Microsoft edge-tts neural voices) for BOTH languages instead of Piper; ADR-0003 always named edge-tts as the online Bulgarian-quality fallback. Validated before landing: edge bg round-trips through Whisper verbatim ("Добър ден! Как сте днес? ..."). The gateway maps detected language bg/en to the edge voice names via new TTS_VOICE_BG / TTS_VOICE_EN env (bg-BG-KalinaNeural / en-US-AvaNeural).

No GPU, no NFS model store, no secrets: edge fetches voices from Microsoft per request (egress verified). The assistant already needs the internet for the Claude brain, so an online TTS adds no new failure mode.

The brain stays Sonnet with no extended thinking (already the default; a live turn answers directly in ~3.4s), per the latency-over-smartness ask.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 20:25:29 +00:00