OpenAI Turns to Cerebras as Real-Time AI Takes Priority
The partnership adds up to 750 megawatts of low-latency compute capacity as models grow longer, heavier, and harder to run.
[Image source: Chetan Jha/MITSMR India]
OpenAI is partnering with chipmaker Cerebras to cut latency and boost real-time performance across its AI products, adding up to 750 megawatts of ultra-low-latency compute capacity to its infrastructure.
The move reflects a growing focus on speed as AI systems handle longer outputs, complex reasoning, code generation, image creation, and agent-based workflows.
Cerebras builds systems purpose-designed to accelerate AI inference by consolidating massive amounts of compute, memory, and bandwidth onto a single wafer-scale chip. By avoiding the data-movement bottlenecks common in conventional hardware, the company claims its systems deliver significantly faster responses, particularly for long-form outputs.
AI responses typically involve an iterative loop of user input, model reasoning, and output generation, and latency at any stage can affect how users interact with the system.
Faster responses tend to translate into longer engagement and support for higher-value workloads.
The multi-year agreement will roll out capacity in phases through 2028 and is reported to be worth more than $10 billion.
“OpenAI’s compute strategy is to build a resilient portfolio that matches the right systems to the right workloads,” said Sachin Katti of OpenAI. “Cerebras adds a dedicated low-latency inference solution to our platform. That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people.”
Andrew Feldman, co-founder and CEO of Cerebras, framed the partnership as a step toward a broader shift in how AI is consumed. “Just as broadband transformed the internet, real-time inference will transform AI, enabling entirely new ways to build and interact with AI models,” he said.
The deal comes amid wider concerns about trust and accuracy on the internet, an issue Cerebras has highlighted independently.
In a recent post on X, the company said a majority of users no longer trust online information, citing the spread of AI-generated misinformation even on credible sources.