September 18, 2017

The Next Generation Analytics Database – Accelerated by GPUs

As organizations demand more and more from their analytics and data science teams, processing power looms as one of the fundamental roadblocks to easy success.

Organizations are facing a variety of processing-related data challenges: data centers sprawling to hundreds of nodes to provide processing power; data architects turning to convoluted pipelines to accommodate specialized tools; and business users frustrated by slow BI tools and latent results from batch queries.

A new generation of databases, accelerated by NVIDIA GPUs, is providing the way forward.

GPUs offer thousands of processing cores and are well suited to general-purpose parallel computation, not just video processing. Where a CPU typically has 8 or 16 cores, today’s GPUs have around 4,500 cores (computational units) per device. GPUs are now exploding in popularity in areas such as self-driving cars, medical imaging, computational finance, and bioinformatics, to name a few.

Analytics and data science tasks in particular benefit from parallelized compute on the GPU. Kinetica’s GPU-accelerated database vectorizes queries across the many thousands of GPU cores and can produce results in a fraction of the time taken by standard CPU-constrained databases. Analytics queries, such as SQL aggregations and GROUP BYs, are often reduced to seconds, down from several minutes with other analytics systems. Business users working with tools such as Tableau or Power BI see dashboards reload in an instant, with no time to even think about getting coffee!
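To make the idea concrete, here is a minimal CPU-only sketch of why a GROUP BY aggregation parallelizes well: each chunk of rows can be summed independently, and the partial results are then merged. It is not Kinetica’s implementation. The function names and the sample sales rows are hypothetical, and the chunks are processed one after another here, whereas a GPU or a cluster would hand each chunk to a separate core or node.

```python
from collections import defaultdict


def partial_group_sum(rows):
    """Aggregate one chunk: sum the amount per key (the work of one worker)."""
    totals = defaultdict(float)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def merge_partials(partials):
    """Combine the per-chunk subtotals into the final GROUP BY result."""
    merged = defaultdict(float)
    for partial in partials:
        for key, subtotal in partial.items():
            merged[key] += subtotal
    return dict(merged)


def parallel_style_group_sum(rows, n_chunks=4):
    """Split rows into chunks, aggregate each independently, then merge.

    The chunks run sequentially in this sketch; no chunk depends on any
    other, which is what lets real systems run them in parallel.
    """
    chunks = [rows[i::n_chunks] for i in range(n_chunks)]
    return merge_partials(partial_group_sum(chunk) for chunk in chunks)


# Hypothetical rows: (region, amount), roughly
# SELECT region, SUM(amount) FROM sales GROUP BY region
sales = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("north", 7.0), ("west", 1.0)]
totals = parallel_style_group_sum(sales)
```

Because the merge step only adds subtotals, the result does not depend on how the rows are split, so the same pattern scales from a handful of chunks to thousands of cores.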

Solving the compute bottleneck also substantially reduces hardware requirements. Organizations are replacing 300-node Spark clusters with just 30 nodes of a GPU-accelerated database. They’re running models and queries significantly faster than with any other analytics solution and also benefiting from huge savings on datacenter and data management overhead.

Many financial organizations have been innovating with GPUs for more than five years and are among the first businesses to realize their value. Some of these companies are deploying thousands of GPUs to run algorithms, including Monte Carlo simulations, on rapidly changing, streaming trading data, for example to compute the risk figures that are essential for regulatory compliance.
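As a rough illustration of the kind of Monte Carlo risk calculation mentioned above, the sketch below estimates a one-day value at risk (VaR) using only the Python standard library. The function name, the portfolio figures, and the assumption of independent, normally distributed returns are all illustrative and do not come from any particular firm’s model. Each simulated path is independent of the others, which is why this workload maps so well onto thousands of GPU cores.

```python
import random


def simulate_var(portfolio_value, mean_return, volatility,
                 n_paths=10_000, confidence=0.99, seed=42):
    """Estimate one-day value at risk from simulated daily returns.

    Assumption: returns are independent and normally distributed, a
    simplification that production risk models usually refine.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # A loss is the negative of the simulated profit for one path.
    losses = sorted(
        -portfolio_value * rng.gauss(mean_return, volatility)
        for _ in range(n_paths)
    )
    # VaR is the loss at the chosen confidence quantile.
    index = min(int(confidence * n_paths), n_paths - 1)
    return losses[index]


# Hypothetical portfolio: 1,000,000 in value, 2% daily volatility.
var_99 = simulate_var(1_000_000, mean_return=0.0005, volatility=0.02)
var_95 = simulate_var(1_000_000, mean_return=0.0005, volatility=0.02,
                      confidence=0.95)
```

With the same seed, the 95% figure can never exceed the 99% figure, since both are read from the same sorted list of simulated losses.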

A GPU-accelerated database that can natively perform custom computation on data distributed across multiple machines makes life easier for data science teams. Kinetica’s in-database analytics framework provides an API for running compute on the GPU using familiar languages such as Python. Customized data science workloads can now run alongside business analytics in a single solution and on a single copy of the data.

 

Enterprises can run sophisticated data science workloads in the same database that houses the rich information needed to run the business and drive day-to-day decisions. This neatly solves the data movement challenge because there isn’t any data movement, which leads to simpler Lambda architectures and more efficient machine learning and AI workloads. Quants, data scientists, and analysts can deploy a model from a deep learning framework via a simple API call, or train their models on the latest data. Users get the benefits of the GPU and of in-memory processing without needing to learn new programming languages.

With Kinetica, in-database compute can be extended to machine learning libraries such as TensorFlow, Caffe (a deep learning framework), and Torch (a machine learning and neural-network framework). These libraries can be extremely compute-hungry, especially on massive datasets, and benefit greatly from GPU horsepower and distributed compute capabilities.

GPU-accelerated database solutions can be found in utilities and energy, healthcare, genomics research, automotive, retail, telecommunications, and many other industries. Adopters are combining traditional and transactional data, streaming data, and data from blogs, forums, social media sources, orbital imagery, and IoT devices in a single, distributed, scalable solution.

Learn more: Advanced In-Database Analytics on the GPU

Download the eBook:  How GPUs are Defining the Future of Data Analytics

Pick up your free copy of the new O’Reilly book “Introduction to GPUs for Data Analytics” from Kinetica booth #825 at Strata Data Conference in NY next week.
