A hands-on evaluation of Anthropic's mid-tier "Claude Sonnet 5," launched around June 30, 2026, has sparked debate among developers after the model was rated "worse than GLM 5.2 across the board" on custom coding-oriented tests. The evaluation used several benchmarks, including a custom agentic/coding test known as "Monica's apt test." The accompanying comparison video showed Sonnet 5 improving on its predecessor, Sonnet 4.6, while falling short of GLM 5.2, an open-weight model from Zhipu AI (Z.ai), on several items.