GPUs Seen Lifting SQL Analytic Constraints
Big data analytics has traditionally been about developing cutting-edge software to run on commodity hardware. But there are signs that innovation at the hardware layer, such as with GPUs, can take analytics to the next stage, and maybe eliminate bottlenecks slowing SQL-based analytics.
Organizations that are tired of waiting for big SQL analytic jobs to complete may want to take a look at what GPUs can provide, Nvidia (NASDAQ: NVDA) vice president and general manager Jim McHugh told Datanami at Strata + Hadoop World last week.
“They really hit a point where they’re frustrated with the data,” McHugh says. “When it takes 10 seconds to do a query, and you have 10 or 15 queries going on–it just becomes time consuming and wearisome to go through that process.”
Nvidia used its booth in the Strata + Hadoop World Expo last week to showcase the new DGX-1, a deep-learning supercomputer that it says is 250 times more powerful than a typical x86-based server. The company continually ran a live computer vision demo involving a GPU-based neural network that attempted to identify random objects, such as a car or an orange, placed before the camera. Attendees got a kick out of pointing the camera at other objects, such as Nvidia representatives or the ceiling, and seeing what the GPU would come up with.
While the DGX-1 brings a definite “wow” factor, Nvidia is likely to get more heavy-duty analytic work out of the trusty Tesla K80, which is finding its way into more cloud services. Amazon (NASDAQ: AMZN) Web Services made some hay with its announcement this week that it’s offering public cloud services based on the Tesla K80 GPUs ahead of its rival Microsoft (NASDAQ: MSFT) Azure.
Thanks to a collection of analytic software companies building in GPU technology, those K80s (and possibly the DGX-1 in the future) are set to drive SQL workloads to new heights, McHugh says.
“This is pure SQL speedup,” McHugh says. “I’m just speeding up the SQL database so it’s 10 to 100 times faster. If it was just one [GPU-based SQL database vendor] I’d say what’s your trick. But it’s not one–it’s Kinetica, MapD, and Sqream. They have SQL under the hood.”
We’ve written about how these companies are solving some big problems. For example, the United States Postal Service is using software from Kinetica (formerly GIS Federal) to track how its army of 200,000 mail carriers go about their routes each day, with an eye on identifying problem areas and improving routing.
We’ve also written about how MapD’s innovative use of in-memory and GPU computing can enable users to analyze vast amounts of data in an interactive manner. Datanami readers will also recall how SQream claims that customers can get data warehouse-like performance out of a shoebox-sized GPU-equipped system.
McHugh is clearly pleased with how the big data ecosystem is responding to his company’s GPUs. “The great thing about these tools, most of them, is you don’t need to learn a new toolset,” he says. “I’ve got my SQL guys, I have my SQL queries. Boom, I bring it in and we’re going.”
The speed-up that GPUs offer will allow customers to examine the outliers and the “long-tail” of the data, because they’re no longer CPU constrained, McHugh says. “I think that’s one of the coolest things going on,” he says. “I’ve been in the big data and analytics and business intelligence for a while, and that’s been a frustration point. Now with these types of solutions, it’s actually changing quite a bit.”
The availability of Nvidia K80s on AWS will be a turning point, MapD founder Todd Mostak wrote in a blog post last week.
“This is the time for GPUs and this event is a trigger point,” he wrote. “We see it everywhere, hear it from the smartest observers and see it in the application growth. Data growth has swamped the capacity of CPU-bound systems to keep up–it is truly the ‘Age of the GPU.'”