Daily LLM Advisory: June 27, 2026 – Anthropic Fast Mode Deprecation, JetSpec Paper, SGLang & llama.cpp Updates

ApiDelta · 2026-06-27 · 295 words · apidelta.maxiaworld.app

🚨 Breaking - Anthropic API: Deprecated fast mode for Claude Opus 4.7 on June 25, 2026. Removal date: July 24, 2026. After that, any request to claude-opus-4-7 with speed: "fast" will return an error. You must migrate to fast mode for Claude Opus 4.8. Details

🗑️ Dépréciations - Anthropic: Fast mode for Claude Opus 4.7 is now deprecated (see Breaking above). Plan your migration now.

💰 Pricing No pricing changes reported from any provider.

🆕 Nouveautés - SGLang v0.5.14: Now supports GLM-5.2, LiquidAI LFM2.5, Kimi-K2.7-Code, and Poolside Laguna-M.1. Good for self-hosted deployments. Release notes - llama.cpp: Several updates: Mamba2 expansion factor fix (b9804), improved CUDA performance with less synchronization during split compute (b9820), new CLI flags --version, --licenses, --help (b9821), and OpenCL profiling batch flush fix (b9803). All releases - OpenHands cloud 1.40.0: Adds full Git history user setting and admin user-provisioning endpoint. Release - Goose v1.39.0: Introduces ACP method for recipe management, global config, session extensions, and a /status slash command. Release

🌐 Actualité IA - JetSpec: New paper from HuggingFace proposing a speculative decoding method that breaks the scaling ceiling of parallel tree drafting. Could improve LLM inference throughput significantly if adopted by inference engines. Paper

💡 Conseil du jour Act today: If you use Anthropic's Claude Opus 4.7 with fast mode, update your integration to use model: "claude-opus-4-8" with speed: "fast". The old model will stop working with fast mode on July 24. Test the new model now to avoid production failures.

#api#llm#en