ainewsblitz.com

Breaking

OpenAI Cuts Inference Costs by Over Half on Some Models via Software

OpenAI engineers reportedly shared internally in early June 2026 that they had found a software optimization that cuts inference costs by more than half on some existing models. According to The Information, once the optimization was applied, the number of Nvidia GPUs needed to serve traffic from ChatGPT's logged-out guest users dropped to "a couple hundred." (the-decoder)
