LLM API Daily 2026-05-24: llama.cpp quad-release, browser-use hardening, AI cost reality check

ApiDelta · 2026-05-24 · 312 words · apidelta.maxiaworld.app

🚨 Breaking

None today. Zero breaking changes across all scanned providers and tools.

🗑️ Deprecations

None flagged in today's brief.

💰 Pricing

No pricing changes reported.

🆕 New

llama.cpp shipped four consecutive builds on 2026-05-23: b9297, b9296, b9295, and b9294.

pydantic-ai v2.0.0b3 (2026-05-22, release): Third V2 beta. An Upgrade Guide is published alongside it. Do not move production agent pipelines to V2 without reading the guide first; this is a major version.
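One way to keep a beta out of production by accident is a small pre-release check in CI. The sketch below is an illustration, not part of pydantic-ai: the helper name is_prerelease is hypothetical, and the pattern is a simplified reading of PEP 440 that only recognises a/b/rc and .dev suffixes.

```python
import re
from importlib import metadata

# Simplified PEP 440 pattern (an assumption, not the full grammar):
# a release segment followed by aN, bN, rcN, or .devN.
_PRE = re.compile(r"^v?\d+(?:\.\d+)*(?:(?:a|b|rc)\d+|\.dev\d+)", re.IGNORECASE)


def is_prerelease(version: str) -> bool:
    """Return True when the version string looks like an alpha, beta, rc, or dev build."""
    return _PRE.match(version.strip()) is not None


if __name__ == "__main__":
    try:
        installed = metadata.version("pydantic-ai")
    except metadata.PackageNotFoundError:
        installed = None
    if installed is not None and is_prerelease(installed):
        raise SystemExit(f"pydantic-ai {installed} is a pre-release; read the Upgrade Guide first")
```

Run as a script, it exits with an error only when a pre-release of pydantic-ai is installed, so it can sit in a deploy pipeline without affecting stable installs.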

browser-use 0.12.8 (release): Two security-adjacent hardening changes: the Unix socket file is now restricted to owner-only access, and evaluate() is refused on restricted browser profiles. Update if you run browser-use in shared-host or multi-tenant environments.
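If you want to confirm the owner-only restriction on a shared host, a generic permission check is enough. The sketch below uses only the Python standard library and makes no claim about browser-use internals: the socket path is a hypothetical placeholder, since the brief does not say where the socket file lives.

```python
import os
import stat


def mode_is_owner_only(mode: int) -> bool:
    """Return True when no group or other permission bits are set."""
    return (stat.S_IMODE(mode) & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def is_owner_only(path: str) -> bool:
    """Check a filesystem entry, for example a Unix socket file."""
    # lstat inspects the entry itself rather than following a symlink.
    return mode_is_owner_only(os.lstat(path).st_mode)


if __name__ == "__main__":
    # Hypothetical path (assumption); substitute the socket your deployment creates.
    socket_path = "/tmp/example-agent.sock"
    if os.path.exists(socket_path):
        print(socket_path, "owner-only:", is_owner_only(socket_path))
```

A mode such as 0o600 or 0o700 passes; anything that grants group or other access, such as 0o660, fails.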

OpenAI Codex CLI 0.134.0-alpha.3 (release): Alpha track. No changelog detail available in today's brief.

🌐 AI Industry

Microsoft disclosed that running AI agents is currently more expensive than paying human employees for equivalent work (Fortune, 2026-05-22). The independent tracker isaiprofitable.com is generating active HN discussion on AI unit economics. Both are concrete data points if you're defending or stress-testing AI infra spend internally.

💡 Today's Action

If you self-host llama.cpp with NVFP4 quantized models or Qwen3.5 MTP variants, pull b9297: it is the first build to wire up Qwen3.5 MTP scale tensors. If your Windows + Vulkan pipeline is failing, b9295 fixes the SPIRV-Headers find_package regression that broke clean builds.
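To decide whether a pinned build already covers one of these changes, you can compare build tags numerically. This is a minimal sketch with hypothetical helper names; it assumes tags follow the bNNNN form used above and that a higher build number includes the changes from lower ones.

```python
import re

_TAG = re.compile(r"^b(\d+)$")


def build_number(tag: str) -> int:
    """Extract the integer from a tag such as 'b9297'."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not a bNNNN build tag: {tag!r}")
    return int(match.group(1))


def includes_build(current: str, required: str) -> bool:
    """True when `current` is the required build or a later one (assumes linear history)."""
    return build_number(current) >= build_number(required)


if __name__ == "__main__":
    pinned = "b9296"  # example value; use the tag you actually deploy
    print("has Vulkan build fix (b9295):", includes_build(pinned, "b9295"))
    print("has Qwen3.5 MTP wiring (b9297):", includes_build(pinned, "b9297"))
```

Under those assumptions, a host pinned to b9296 already has the b9295 build fix but still needs to move to b9297 for the Qwen3.5 MTP change.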

#api #llm #en #llama.cpp #pydantic-ai #browser-use #ai-economics