Google Unveils AI Chips for Training and Inference
The company said its new TPU 8t and TPU 8i are designed separately for model training and real-time serving.
Google Cloud on Wednesday unveiled the eighth generation of its custom artificial intelligence chips with two distinct designs, one focused on training models and the other on inference workloads.
The chips, named TPU 8t and TPU 8i, are part of Google’s Tensor Processing Unit family and are expected to become generally available later this year.
The company said the two designs are intended to address different computational requirements within AI systems.
“With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving,” said Amin Vahdat, Senior Vice-President and Chief Technologist for AI and Infrastructure at Google, in a blog post.
Training involves building an AI model from large datasets, while inference is the process of running a trained model on new inputs after deployment. Google said TPU 8t is designed for large-scale training workloads, while TPU 8i is built to handle latency-sensitive inference tasks.
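To make the distinction concrete, here is a minimal sketch in JAX, the framework most commonly used to program TPUs. The model, parameter names, and dimensions below are illustrative assumptions, not Google's code: a training step computes gradients and updates weights over large batches, while an inference step is a single forward pass with frozen weights.

```python
import jax
import jax.numpy as jnp

# Toy linear model; "w" and "b" are illustrative parameter names.
def init_params(key, in_dim=8, out_dim=2):
    w_key, _ = jax.random.split(key)
    return {"w": jax.random.normal(w_key, (in_dim, out_dim)) * 0.01,
            "b": jnp.zeros(out_dim)}

def predict(params, x):
    return x @ params["w"] + params["b"]

def loss_fn(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

# Training step: gradient computation plus a weight update (throughput-bound).
@jax.jit
def train_step(params, x, y, lr=0.1):
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Inference step: forward pass only, no gradients (latency-bound).
@jax.jit
def infer(params, x):
    return predict(params, x)

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (32, 8))  # batch of 32 synthetic inputs
y = jnp.zeros((32, 2))               # dummy targets

params = train_step(params, x, y)    # the workload a training chip targets
preds = infer(params, x[:1])         # the workload a serving chip targets
```

In this framing, a training-oriented chip is sized for the sustained, batched compute of `train_step` across many accelerators, while a serving-oriented chip prioritizes memory bandwidth and low latency for many concurrent `infer` calls.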
The company said TPU 8t is capable of delivering nearly three times the compute performance per pod compared with its previous generation. It also supports large-scale configurations, with systems designed to scale to thousands of chips connected through shared high-bandwidth memory.
TPU 8i, the inference-focused chip, incorporates higher memory bandwidth and increased on-chip memory. Google said this is intended to support workloads where systems process large volumes of requests and interactions in real time.
The architecture is designed “to deliver the massive throughput and low latency needed to concurrently run millions of agents cost-effectively,” Google Chief Executive Sundar Pichai wrote in a blog post.
Google said the inference chip offers about 80% better performance per dollar than the previous generation, while both chips are designed to improve power efficiency.
The company added that the systems are liquid-cooled and ship as an integrated hardware and software stack.
Google said its chips will be offered alongside hardware from Nvidia within its cloud platform.
The TPU systems have been used internally at Google for years and are also available to external customers; Google cited organizations such as Citadel Securities as users of its AI infrastructure.
Google said both TPU 8t and TPU 8i were developed with Google DeepMind and are designed to support a range of AI workloads.


