ainewsblitz.com

Research

Benchmarks and the research community

Benchmarks, papers, and research-community updates.

Foundation Models: StepFun Step 3.7 Flash Ranks Second on Claw-Eval

StepFun's Step 3.7 Flash placed second on Claw-Eval General, behind only Claude Opus 4.6, and performed well on long-horizon tasks. The result adds to a field where open-weight omni models like NVIDIA Cosmos 3 are pushing on both modality coverage and quality.