AI News Blitz

BREAKING

Anthropic Ships Claude Sonnet 5

92.4% on SWE-bench Verified

Sonnet 5: 92.4%

Opus 4.6: 80.8%

Gemini 3.1 Pro: 80.6%

GPT-5.4: 57.7%

Other benchmarks: OSWorld, GPQA Diamond, ARC-AGI-2

Pricing and specs: input / Mtok, output / Mtok, context window

Praise and Pushback

Praise

● Frontier perf at Sonnet price

● Stable agentic work

● Claude Code and Devin

Pushback

● Some scores below Sonnet 4.6

● Trails Opus 4.8

● Talk of benchnerfing

Mid-Tier Goes Frontier

Anthropic has officially launched Claude Sonnet 5, its new mid-tier model.