March 22, 2022

NVIDIA RAPIDS Accelerator for Apache Spark Enables AT&T to Ring Up New Opportunities

March 22, 2022 — AT&T’s wireless network connects more than 100 million subscribers from the Aleutian Islands to the Florida Keys, spawning a big data sea.

Abhay Dabholkar runs a research group that acts like a lighthouse on the lookout for the best tools to navigate it.

“It’s fun, we get to play with new tools that can make a difference for AT&T’s day-to-day work, and when we give staff the latest and greatest tools it adds to their job satisfaction,” said Dabholkar, a distinguished AI architect who’s been with the company more than a decade.

Recently, the team tested the NVIDIA RAPIDS Accelerator for Apache Spark on GPU-powered servers. Apache Spark is software that spreads work across the nodes in a cluster.

It processed a month’s worth of mobile data — 2.8 trillion rows of information — in just five hours. That’s 3.3x faster at 60 percent lower cost than any prior test.
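For a sense of scale, the figures above can be turned into a rough throughput number. The short Python sketch below is a back-of-the-envelope calculation only. It assumes "3.3x faster" refers to total run time, which the article does not spell out, and the variable names are illustrative.

```python
# Figures quoted in the article.
ROWS_PROCESSED = 2_800_000_000_000  # 2.8 trillion rows
GPU_RUN_HOURS = 5                   # "just five hours"

# Throughput of the GPU run, in whole rows per second.
gpu_run_seconds = GPU_RUN_HOURS * 3600
rows_per_second = ROWS_PROCESSED // gpu_run_seconds

# ASSUMPTION: "3.3x faster" means the best prior test took 3.3 times as long.
# Written as 33 / 10 to keep the arithmetic exact.
implied_prior_hours = GPU_RUN_HOURS * 33 / 10

# "60 percent lower cost" leaves 40 percent of the prior cost.
relative_cost_percent = 100 - 60

print(f"{rows_per_second:,} rows per second")
print(f"implied prior run time: {implied_prior_hours} hours")
print(f"cost relative to prior test: {relative_cost_percent}%")
```

Under that reading, the GPU run sustained roughly 156 million rows per second, and the best prior test would have taken about 16.5 hours.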

A Wow Moment

“It was a wow moment because on CPU clusters it takes more than 48 hours to process just seven days of data — in the past, we had the data but couldn’t use it because it took such a long time to process it,” he said.

Specifically, the test benchmarked what’s called ETL, the extract, transform and load process that cleans up data before it can be used to train the AI models that uncover fresh insights.

“Now we’re thinking GPUs can be used for ETL and all sorts of batch-processing workloads we do in Spark, so we’re exploring other RAPIDS libraries to extend work from feature engineering to ETL and machine learning,” he said.

Today, AT&T runs ETL on CPU servers, then moves data to GPU servers for training. Doing everything in one GPU pipeline can save time and cost, he added.

Pleasing Customers, Speeding Network Design

The savings could show up across a wide variety of use cases.

For example, users could find out more quickly where they get optimal connections, improving customer satisfaction and reducing churn. “We could decide parameters for our 5G towers and antennas more quickly, too,” he said.

Identifying where in the AT&T fiber footprint to roll out a support truck can require time-consuming geospatial calculations, something RAPIDS and GPUs could accelerate, said Chris Vo, a senior member of the team who supervised the RAPIDS tests.

“We probably get 300-400 terabytes of fresh data a day, so this technology can have incredible impact — reports we generate over two or three weeks could be done in a few hours,” Dabholkar said.

Three Use Cases and Counting

The researchers are sharing their results with members of AT&T’s data platform team.

“We recommend that if a job is taking too long and you have a lot of data, turn on GPUs — with Spark, the same code that runs on CPUs runs on GPUs,” he said.
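The article does not describe AT&T's actual configuration. As a minimal sketch of what "turning on GPUs" can look like, the Python below builds the Spark settings that load the RAPIDS Accelerator plugin while the job code itself stays the same. The helper names `spark_settings` and `to_submit_args`, the app name and the GPU resource amounts are hypothetical placeholders. The `spark.plugins` and `spark.rapids.sql.enabled` keys come from the public RAPIDS Accelerator documentation, and the plugin jar is assumed to already be on the cluster's classpath.

```python
def spark_settings(use_gpu: bool) -> dict:
    """Return Spark settings for a job; only the settings differ between CPU and GPU."""
    settings = {"spark.app.name": "etl-job"}  # placeholder application name
    if use_gpu:
        settings.update({
            # Load the RAPIDS Accelerator SQL plugin and enable GPU execution.
            "spark.plugins": "com.nvidia.spark.SQLPlugin",
            "spark.rapids.sql.enabled": "true",
            # Standard Spark 3 GPU scheduling settings; the amounts are
            # ASSUMED example values, not recommendations.
            "spark.executor.resource.gpu.amount": "1",
            "spark.task.resource.gpu.amount": "0.25",
        })
    return settings


def to_submit_args(settings: dict) -> list:
    """Turn a settings dict into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(spark_settings(use_gpu=True))))
```

The same dictionary could instead be applied through `SparkSession.builder.config(key, value)` in PySpark. Operations the plugin does not support are expected to fall back to the CPU, so gains will vary by workload.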

So far, separate teams have found their own gains across three different use cases; other teams have plans to run tests on their workloads, too.

Dabholkar is optimistic business units will take their test results to production systems.

“We are a telecom company with all sorts of datasets processing petabytes of data daily, and this can significantly improve our savings,” he said.

Other users, including the U.S. Internal Revenue Service, are on a similar journey. It’s a path many will take, given that Apache Spark is used by more than 13,000 companies, including 80 percent of the Fortune 500.

Register free for GTC to hear AT&T’s Chris Vo talk about his work, learn more about data science at these sessions and hear NVIDIA CEO Jensen Huang’s keynote.

Source: Karthikeyan Rajendran, product manager, NVIDIA RAPIDS
