LLM Daily: Free Nex-N2-Pro Model & llama.cpp Updates

🚨 Breaking

Aucun changement

🗑️ Dépréciations

Aucun changement

💰 Pricing

Aucun changement

🆕 Nouveautés

llama.cpp b9568: Support for gemma-4 E2B and E4B assistants (#24282). Includes converter updates and masked_embd tensors for gemma4-assist arch.
llama.cpp b9553: Relaxed sampler name matching (#23744) – now recognizes alternative names like top-k and min-p.
llama.cpp b9570: ggml-webgpu: Add clang-format job (#24308) – minor CI improvement.
llama.cpp b9551: kv-cache: Avoid KV cells copies (#24277) – performance optimization.
Nex-N2-Pro (free): New free 397B MoE model with 17B active parameters, text+image input, based on Qwen3.5 architecture.
Anthropic API – May 19, 2026: MCP tunnels research preview for private network connections; self-hosted sandboxes for Claude Managed Agents.
Anthropic API – April 8, 2026: Claude Managed Agents now in public beta – fully managed agent harness with sandboxing, built-in tools, SSE streaming.
Codex 0.138.0: Desktop integration for CLI; local image file path exposure.
Qwen/Qwen-Image-Bench: New judge/evaluation model for image-text tasks, based on Qwen3.6-27B.
langchain-core 1.4.2: Deprecates problematic dict() method (#31685).

🌐 Actualité IA

Aucun signal

💡 Conseil du jour

Évaluez le nouveau modèle Nex-N2-Pro (gratuit sur OpenRouter) comme alternative économique pour des tâches agentiques avec 17B paramètres actifs. Testez-le sur vos pipelines de classification ou extraction – peut réduire les coûts sans perte de qualité. Mettez également à jour votre build llama.cpp (b9551) pour bénéficier de l’optimisation KV-cache qui réduit les copies mémoire et améliore les performances en inférence.