- Primary: Mistral Large 3 (675B) on NIM - always warm, excellent tool calling
- Fallback 1: Nemotron Ultra 253B on NIM
- Fallback 2: Llama 4 Maverick on Llama API (different provider for resilience)
- 10 models total across 3 providers, all free
- Removed: Modal (GLM-5), Gemini, Ollama providers
- Added: NVIDIA NIM provider with DeepSeek V3.2, Qwen 3.5, Qwen 3 Coder, GLM-5
- Bumped maxTokens from 8192 to 16384 for agentic output room