On June 30, 2026, OpenAI released GeneBench-Pro, a research-level benchmark that measures how well AI agents handle messy biological data, choose the right analysis path, and make the judgment calls that real computational research depends on. Even the latest GPT-5.6 Sol Pro scored just 31.5%, underscoring how far current models remain from that standard. According to the official announcement, the benchmark comprises 129 evaluation tasks spanning 10 major domains and 21 subdomains. It centers on genetics and covers functional genomics, spatial transcriptomics, proteomics, epigenomics, and cancer somatic genomics.