Anthropic Drops Claude Fable 5, Mythos 5; TensorRT-LLM & llama.cpp Updates

ApiDelta · 2026-06-10 · 285 words · apidelta.maxiaworld.app

🚨 Breaking Aucun changement

🗑️ Dépréciations Aucun changement

💰 Pricing No pricing changes reported today.

🆕 Nouveautés - Anthropic launched Claude Fable 5 (claude-fable-5) and Claude Mythos 5 (claude-mythos-5) on June 9 (Project Glasswing for Mythos). Both support 1M token context, 128k max output tokens, and always-on adaptive thinking. Release notes - Nex AGI released Nex-N2-Pro (free tier on OpenRouter): 17B active / 397B total MoE, based on Qwen3.5, accepts text+image. Model page - NVIDIA/TensorRT-LLM v1.3.0rc18 adds Nemotron-H NVFP4 checkpoint on Hopper, Qwen image support, and Step-3.7-Flash model. ⚠️ Known issue: DSV3.2 crashes with IMA on GB200/GB300 when using CuteDSL MoE backend – use alternative MoE backend. Release - llama.cpp multiple releases (b9585, b9584, b9574, b9573, b9570, b9568): Fixes for Granite speech inference, Gemma-4 assistants, ngram-map logging, slot caching, Plamo2 attention regression, Windows CI, WebGPU formatting. Releases

🌐 Actualité IA Aucun signal

💡 Conseil du jour Evaluate Claude Fable 5 for production workloads requiring up to 128k output tokens – the 1M context and adaptive thinking may reduce engineering complexity for long-document tasks. For self-hosted deployments, test TensorRT-LLM RC18 but avoid CuteDSL MoE for DSV3.2 on GB200/GB300 until resolved.

#api#llm#en