April 6, 2016

MapR Converges SQL and JSON With Apache Drill v1.6

SAN JOSE, Calif., April 6 — MapR Technologies, Inc., provider of the industry’s only Converged Data Platform, today announced the availability of Apache Drill 1.6 as the unified SQL layer for the MapR Converged Data Platform via tighter integration with MapR-DB.  Customers and partners benefit from the flexibility of reporting and analytics on JSON data stored in MapR-DB tables, realizing faster time-to-value with insights gleaned from operational data.

According to Hadoop Weekly, “The Apache Drill project has one of the fastest release velocities in the Hadoop ecosystem with a new release nearly every month.”  Version 1.6 of Apache Drill, which is now available on the MapR Converged Data Platform, offers a new MapR-DB document database plugin, enhanced performance and scale, and an optimized experience with Tableau and other BI tools.

Interest in and adoption of Drill, which was recognized as one of the best open source big data technologies, continue to grow. Thousands of users have downloaded Drill and numerous organizations have it in production, interactively analyzing up to petabytes of data. Additionally, over 6,000 BI analysts and developers worldwide have completed Drill training courses provided by the free On-Demand Training program from MapR.

“Apache Drill is a game changer for us,” said Edmon Begoli, CTO of PYA Analytics. “Most recently, we have been able to query, in under 60 seconds, two years’ worth of flat PSV files of claims, billing, and clinical data from commercial and government entities, such as the Centers for Medicaid and Medicare Services. Drill has allowed us to bypass the traditional approach of ETL and data warehousing, convert flat files into efficient formats such as Parquet for improved performance, and use plain SQL against very large volumes of files.”

Highlights of Drill 1.6 include:

  • Flexible and operational analytics on NoSQL – The new MapR-DB document database plugin allows analysts to perform SQL queries directly on JSON data stored in MapR-DB tables. The plugin offers a variety of pushdown capabilities to provide an optimal interactive experience.
  • Enhanced query performance – Provides better query performance on data in Hadoop and NoSQL systems via numerous query planning improvements, such as partition pruning, metadata caching and other optimizations. Delivers up to 10-60X performance gains in query planning compared with previous releases of Drill.
  • Better memory management – Delivers greater stability and scale which enables customers to run not only larger but also more SQL workloads on a MapR cluster.
  • Improved integration with visualization tools like Tableau – Offers metadata query performance improvements and introduces client impersonation for end-to-end security from the visualization tool to data in Hadoop.  Version 1.6 also provides enhanced SQL Window functions.
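
As a rough illustration of the first highlight, the sketch below shows how an analyst might submit a SQL query over nested JSON documents to Drill through its REST interface. It is a minimal sketch under stated assumptions, not MapR's documented procedure: the host and port (localhost:8047), the table path /tables/customers, and the field names (name, address.city, address.state) are all hypothetical, and the helper names build_query_request and run_query are introduced here only for illustration.

```python
import json
import urllib.request

# Assumption: a Drillbit is reachable locally on Drill's default web port.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical table path and field names, chosen only for illustration.
# Nested JSON fields are addressed with dotted paths through a table alias.
SQL = """
SELECT t.name, t.address.city AS city
FROM dfs.`/tables/customers` t
WHERE t.address.state = 'CA'
LIMIT 10
"""


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) an HTTP POST request carrying a SQL query."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the result rows as a list of dicts."""
    with urllib.request.urlopen(build_query_request(sql, url)) as resp:
        return json.load(resp)["rows"]
```

Calling run_query(SQL) requires a running Drill instance with a matching table; build_query_request can be inspected on its own without one.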

Drill supports a variety of use cases.  For example, media companies can instantly query and analyze incoming content delivery network (CDN) files without requiring data transformations, allowing them to analyze several terabytes of CDN logs and reduce customer attrition.  High-tech chip manufacturers can develop offerings that allow them to better analyze dropped calls and provide that information to their handheld device partners, thereby improving quality of service.  Communications providers can instantly query and analyze logs from cell towers, enabling mobile operators to proactively monitor and improve subscriber experience.

“Operational analytics on document databases such as MapR-DB is a rapidly growing use case,” said Neeraja Rentachintala, senior director, Product Management, MapR Technologies. “For the first time, there is a stack that allows BI developers and business analysts to store and query data in native formats without cumbersome ETL or transformation, providing end-to-end flexibility and scale.”

About MapR Technologies

MapR provides the industry’s only converged data platform that integrates the power of Hadoop and Spark with global event streaming, real-time database capabilities, and enterprise storage, enabling customers to harness the enormous power of their data. Organizations with the most demanding production needs, including sub-second response for fraud prevention, secure and highly available data-driven insights for better healthcare, petabyte analysis for threat detection, and integrated operational and analytic processing for improved customer experiences, run on MapR. A majority of customers achieves payback in fewer than 12 months and realizes greater than 5X ROI. MapR ensures customer success through world-class professional services and with free on-demand training that over 50,000 developers, data analysts and administrators have used to close the big data skills gap. Amazon, Cisco, Google, HPE, Microsoft, SAP, and Teradata are part of the worldwide MapR partner ecosystem. Investors include Google Capital, Lightspeed Venture Partners, Mayfield Fund, NEA, Qualcomm Ventures and Redpoint Ventures. Connect with MapR on LinkedIn and Twitter.

Source: MapR
