Nvidia Unveils ‘Rubin’ AI Platform as Demand for Compute Intensifies
Major cloud providers and AI labs, including AWS, Google Cloud, Microsoft Azure, Oracle Cloud, OpenAI, Anthropic, Meta, and xAI, are expected to adopt Rubin-based systems starting in 2026.
[Image source: Krishna Prasad/MITSMR Middle East]
Nvidia Corp., the world’s dominant supplier of AI computing chips, on Tuesday unveiled Rubin, its next-generation AI hardware platform, a rack-scale system built around six tightly integrated chips designed to cut the cost and time required to train and run large artificial intelligence models.
Rubin combines a new CPU, GPU, networking switches, data processing units, and interconnect technology into what Nvidia describes as a single AI supercomputer architecture.
The company said the platform is aimed at meeting rising demand for both training and inference, particularly for large reasoning models and agentic AI systems that process long data sequences.
The platform succeeds Nvidia’s Blackwell architecture and is named after astronomer Vera Rubin.
Nvidia said Rubin can train mixture-of-experts (MoE) models with fewer GPUs and deliver a lower cost per inference token than the previous generation, though independent performance benchmarks have yet to be released.
At the core of the system are the Vera CPU and Rubin GPU, connected through Nvidia’s sixth-generation NVLink interconnect.
The platform also integrates BlueField-4 data processing units, ConnectX-9 networking, and Spectrum-6 Ethernet switches, reflecting Nvidia’s push toward full-stack data center infrastructure rather than standalone accelerators.
Nvidia said Rubin introduces updates to its Transformer Engine, confidential computing capabilities, and reliability features designed to improve uptime in large AI clusters. The company is positioning Rubin for workloads such as multistep reasoning, large-scale inference, and video generation, where performance is increasingly constrained by memory bandwidth and networking, “not just raw compute.”
Major cloud providers and AI labs, including Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud, OpenAI, Anthropic, Meta, and xAI, are expected to adopt Rubin-based systems starting in 2026.
Server makers including Dell, HPE, Lenovo, and Supermicro have also said they plan to ship systems built on the new architecture.
Rubin-based products are expected to enter production environments in the second half of 2026, with initial deployments focused on hyperscale data centers and AI-focused cloud providers. Nvidia said the platform is designed to scale to very large clusters, as AI models continue to grow in size and complexity.
The launch highlights Nvidia’s strategy of tightly coupling compute, networking, and software as competition intensifies in AI infrastructure, and as cloud providers look for ways to manage rising costs associated with large-scale model training and deployment.