April 5, 2023

Google Claims Its TPU v4 Outperforms Nvidia A100

Jaime Hampton

A new scientific paper from Google details the performance of its Cloud TPU v4 supercomputing platform, claiming it provides exascale performance for machine learning with boosted efficiency.

The authors of the research paper claim the TPU v4 is 1.2x–1.7x faster and uses 1.3x–1.9x less power than the Nvidia A100 in similarly sized systems. The paper notes that Google has not compared TPU v4 to the newer Nvidia H100 GPUs due to their limited availability and 4nm process technology (vs. TPU v4’s 7nm process).
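A claim phrased as "N times less power" can be easier to read as a share of the baseline. The short Python sketch below does that conversion for the quoted range, assuming the claim means A100 power divided by TPU v4 power for comparable work. The function name `power_fraction` is hypothetical and used only for illustration.

```python
def power_fraction(times_less_power: float) -> float:
    """Convert an 'N times less power' claim into a fraction of baseline power.

    Assumption: 'N times less' means baseline_power / new_power == N.
    """
    if times_less_power <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / times_less_power


# Range quoted in the article for TPU v4 vs. Nvidia A100.
low_claim, high_claim = 1.3, 1.9

best_case = power_fraction(high_claim)   # smallest share of A100 power
worst_case = power_fraction(low_claim)   # largest share of A100 power

print(f"TPU v4 power as a share of A100 power: {best_case:.0%} to {worst_case:.0%}")
```

Under that reading, the claim puts TPU v4 at roughly 53% to 77% of the A100's power draw. The speed and power ranges are reported separately, so this sketch does not combine them into a single performance-per-watt figure.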

As machine learning models have grown larger and more complex, so have their compute resource needs. Google’s Tensor Processing Units (TPUs) are specialized hardware accelerators used for building machine learning models, specifically deep neural networks. They are optimized for tensor operations and can significantly boost efficiency in the training and inference of large-scale ML models. Google says the performance, scalability, and availability make TPU supercomputers the workhorses of its large language models like LaMDA, MUM, and PaLM.

Google CEO Sundar Pichai announcing TPU v4 at Google I/O 2021. (Source: Google)

The TPU v4 supercomputer contains 4,096 chips interconnected via proprietary optical circuit switches (OCS), which Google claims are faster and cheaper and use less power than InfiniBand, another popular interconnect technology. Google claims its OCS technology accounts for less than 5% of the TPU v4’s system cost and power, and says it dynamically reconfigures the supercomputer interconnect topology to improve scale, availability, utilization, modularity, deployment, security, power, and performance.

Google engineers and paper authors Norm Jouppi and David Patterson explained in a blog post that thanks to key innovations in interconnect technologies and domain-specific accelerators (DSAs), Google Cloud TPU v4 enabled a nearly 10x leap in scaling ML system performance over TPU v3. It also boosted energy efficiency by approximately 2x–3x compared to contemporary ML DSAs and reduced CO2e emissions by approximately 20x over DSAs in what the company calls typical on-prem datacenters.

The TPU v4 system has been operational at Google since 2020. The TPU v4 chip was unveiled at the company’s 2021 I/O developer conference. Google says the supercomputers are actively used by leading AI teams for ML research and production across language models, recommender systems, and other generative AI.

Regarding recommender systems, Google says its TPU supercomputers are also the first with hardware support for embeddings, a key component of Deep Learning Recommendation Models (DLRMs) used in advertising, search ranking, YouTube, and Google Play. This is because each TPU v4 is equipped with SparseCores, which are dataflow processors that accelerate models that rely on embeddings by 5x–7x but use only 5% of die area and power.
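For readers unfamiliar with embeddings, the operation at issue is essentially a table lookup followed by pooling: each categorical ID (a video, an app, a query term) maps to a row of learned numbers, and a model gathers and combines a handful of rows per example. The Python sketch below illustrates that access pattern only. It is not how SparseCores are implemented, and every name in it (`make_embedding_table`, `pooled_embedding`, `watched_video_ids`) is hypothetical.

```python
import random


def make_embedding_table(num_ids, dim, seed=0):
    """Build a toy table with one dim-length vector per categorical ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_ids)]


def pooled_embedding(table, ids):
    """Gather the rows for `ids` and sum them element-wise (sum pooling)."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in ids:
        row = table[i]          # data-dependent, scattered memory access
        for d in range(dim):
            pooled[d] += row[d]
    return pooled


# Toy example: 1,000 possible IDs, 4 numbers per ID.
table = make_embedding_table(num_ids=1000, dim=4)
watched_video_ids = [3, 42, 917]   # a sparse handful out of 1,000
user_vector = pooled_embedding(table, watched_video_ids)
print(user_vector)
```

The work here is dominated by irregular memory gathers rather than dense matrix arithmetic, which is the kind of workload a dedicated dataflow unit is meant to speed up.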

One-eighth of a TPU v4 pod from Google’s ML cluster located in Oklahoma, which the company claims runs on ~90% carbon-free energy. (Source: Google)

Midjourney, a text-to-image AI startup, recently selected TPU v4 to train the fourth version of its image-generating model: “We’re proud to work with Google Cloud to deliver a seamless experience for our creative community powered by Google’s globally scalable infrastructure,” said David Holz, founder and CEO of Midjourney, in a Google blog post. “From training the fourth version of our algorithm on the latest v4 TPUs with JAX, to running inference on GPUs, we have been impressed by the speed at which TPU v4 allows our users to bring their vibrant ideas to life.”

TPU v4 supercomputers are available to AI researchers and developers at Google Cloud’s ML cluster in Oklahoma, which opened last year. At nine exaflops of peak aggregate performance, Google believes the cluster is the largest publicly available ML hub that operates with 90% carbon-free energy. More details are available in the TPU v4 research paper.
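As a rough sanity check on the nine-exaflop figure, the Python sketch below works out how many 4,096-chip pods that would imply. The per-chip peak of 275 teraflops is an assumption: it is a commonly cited peak figure for TPU v4 but does not appear in this article, so the result is indicative only.

```python
CHIPS_PER_POD = 4096               # from the article
CLUSTER_PEAK_FLOPS = 9e18          # "nine exaflops", from the article
ASSUMED_CHIP_PEAK_FLOPS = 275e12   # ASSUMPTION: not stated in the article

# Peak throughput of one pod if every chip ran at its assumed peak.
pod_peak_flops = CHIPS_PER_POD * ASSUMED_CHIP_PEAK_FLOPS

# Number of pods needed to reach the quoted cluster peak.
implied_pods = CLUSTER_PEAK_FLOPS / pod_peak_flops

print(f"Peak per pod: {pod_peak_flops / 1e18:.2f} exaflops")
print(f"Implied pods in the cluster: {implied_pods:.1f}")
```

Under that assumption, one pod peaks at about 1.13 exaflops, and nine exaflops corresponds to roughly eight pods. Peak figures describe a theoretical ceiling, not sustained performance on real workloads.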
