- Add modal provider (GLM-5-FP8) as primary model with non-streaming mode (GLM-5 uses a non-standard reasoning_content field that is incompatible with streaming)
- Add curl --retry flags to init container downloads for reliability
- Fallback chain: GLM-5 → Gemini 2.5 Flash → Llama 3.3 70B
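
For reviewers, the fallback order above can be pictured with a rough Python sketch. This is not the code in this change: the model identifier strings, the `complete_with_fallback` helper, and the `call_model` callback are all hypothetical names chosen for illustration, and a real client would likely catch narrower error types.

```python
from typing import Callable, Sequence

# Hypothetical identifiers; the actual provider/model strings are not given here.
FALLBACK_CHAIN = ["glm-5-fp8", "gemini-2.5-flash", "llama-3.3-70b"]


def complete_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    chain: Sequence[str] = FALLBACK_CHAIN,
) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first success."""
    errors = []
    for model in chain:
        try:
            # call_model(model, prompt) is assumed to raise on any failure.
            return model, call_model(model, prompt)
        except Exception as exc:
            # Record the failure and move on to the next model in the chain.
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

If the primary model raises, the next entry is tried, and an error is raised only when every model in the chain has failed.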