June 14, 2018

Dremio Announces the Gandiva Initiative for Apache Arrow

MOUNTAIN VIEW, Calif., June 14, 2018 -- Dremio, the Data-as-a-Service Platform company, today announced a new open source initiative for columnar in-memory analytics based on Apache Arrow. The Gandiva Initiative for Apache Arrow uses the LLVM Project, an open source compiler infrastructure, to significantly improve the speed and efficiency of in-memory analytics on Apache Arrow data. The improvements will be available to many languages and popular libraries, initially C++ and Java, and eventually others including Python, Ruby, Go, Rust, and JavaScript.

The Gandiva Initiative for Apache Arrow provides the following benefits to make analytical data transportable and more efficient:

  • Faster time to insight in analytics, machine learning, and data science.
  • Lower cost of operations on cloud infrastructure for analytics, machine learning, and data science.

“Apache Arrow was created to provide an industry-standard, columnar, in-memory data representation,” said Jacques Nadeau, co-founder and CTO of Dremio, and PMC Chair of Apache Arrow. “Dozens of open source and commercial technologies have since embraced Arrow as their standard for high-performance analytics. The Gandiva Initiative introduces a cross-platform data processing engine for Arrow, representing a quantum leap forward for processing data. Users will experience speed and efficiency gains of up to 100x in the coming months.”

The Power of LLVM

LLVM is an open source compiler infrastructure project originally developed by Swift language creator Chris Lattner. LLVM’s just-in-time (JIT) compilation capabilities incorporate runtime information to produce highly optimized machine code for fast expression evaluation.

By combining LLVM with Apache Arrow libraries, low-level operations on Apache Arrow in-memory buffers such as sorts, filters, and projections can be highly optimized for specific runtime environments, improving resource utilization and providing faster, lower-cost operations of analytical workloads.
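As a rough illustration of the idea described above, the following Python sketch generates one specialized loop per expression at run time and applies it to whole columns. It is only an analogy: it does not use Gandiva, LLVM, or the Arrow libraries, and the expression-tree format and the names `to_source`, `compile_projection`, and `compile_filter` are hypothetical, introduced purely for this example.

```python
from array import array

# Operators this toy generator accepts (an assumption made for the sketch).
ALLOWED_OPS = {"+", "-", "*", "/", "<", "<=", ">", ">=", "=="}


def to_source(node):
    """Render a hypothetical expression tree as a Python source fragment.

    A node is a column name (str), a numeric literal, or a tuple
    (op, left, right).
    """
    if isinstance(node, str):
        # Column reference; "i" and "n" are reserved by the generated loop.
        if not node.isidentifier() or node in ("i", "n"):
            raise ValueError(f"bad column name: {node!r}")
        return f"{node}[i]"
    if isinstance(node, (int, float)):
        return repr(node)
    op, left, right = node
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return f"({to_source(left)} {op} {to_source(right)})"


def _compile(body, columns):
    """Build a function whose loop body is specialized to one expression."""
    src = f"def kernel({', '.join(columns)}, n):\n    return {body}\n"
    namespace = {}
    exec(src, namespace)  # stands in for the JIT compilation step
    return namespace["kernel"]


def compile_projection(expr, columns):
    """Return a kernel that computes `expr` for every row."""
    return _compile(f"[{to_source(expr)} for i in range(n)]", columns)


def compile_filter(pred, columns):
    """Return a kernel that yields the indices of rows matching `pred`."""
    return _compile(f"[i for i in range(n) if {to_source(pred)}]", columns)


# Two in-memory columns stored as contiguous typed buffers.
price = array("d", [10.0, 20.0, 30.0])
qty = array("q", [1, 2, 3])

total = compile_projection(("*", "price", "qty"), ["price", "qty"])
expensive = compile_filter((">", "price", 15.0), ["price"])

print(total(price, qty, len(price)))   # [10.0, 40.0, 90.0]
print(expensive(price, len(price)))    # [1, 2]
```

The point of the sketch is that the expression tree is walked once, when the kernel is built, rather than once per row. A real JIT compiler applies the same principle but emits native machine code tuned to the host CPU.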

Availability

The Gandiva Initiative will be made available during the 2018 DataWorks Summit in San Jose. Attendees are encouraged to join the session “Using LLVM to Accelerate Processing of Data in Apache Arrow” on Thursday, June 21. For downloads, documentation, and ways to become involved with the Gandiva Initiative, visit www.dremio.com.

About Dremio

Dremio reimagines analytics for modern data. Created by veterans of open source and big data technologies, Dremio is a fundamentally new approach that dramatically simplifies and accelerates time to insight. Dremio empowers business users to curate precisely the data they need, from any data source, then accelerate analytical processing for BI tools, machine learning, data science, and SQL clients. Dremio starts to deliver value in minutes, and learns from your data and queries, making your data engineers, analysts, and data scientists more productive. For more information, visit www.dremio.com.


Source: Dremio

Datanami