October 16, 2014

Waterline Data Science Joins Hortonworks Technology Partner Program

NEW YORK, N.Y., Oct. 16 — Waterline Data Science today announced at Strata + Hadoop World New York that it has joined the Hortonworks Technology Partner program. Hortonworks is the leading contributor to and provider of Apache Hadoop. Waterline Data Science will integrate Hortonworks Data Platform (HDP) with Waterline Data Inventory to enable data self-service on Hadoop, allowing users to find, understand, and help govern Hadoop data.

By joining the Hortonworks Technology Partner program, Waterline Data Science will work to enable and accelerate the deployment of a modern data architecture, integrating with the Hortonworks Data Platform—the industry’s only 100-percent open source Hadoop distribution, explicitly architected, built, and tested for enterprise-grade deployments.

Companies are deploying Hadoop “data lakes” to provide unprecedented access to data for data science and analytics to uncover new business insight. But Hadoop’s advantages of frictionless ingest and flexible schema on read, combined with its lack of data governance, present problems for users trying to find and understand the data. Waterline Data Inventory addresses these problems by building a complete inventory of data assets in Hadoop and by opening access to Hadoop data through data self-service. As a result, data scientists can be more productive, business analysts can easily augment reporting and BI with Hadoop data without coding, and data governance teams can start controlling Hadoop data.

“There is no point building a predictive model of the wrong column, and without a data inventory, you don’t know if you have the wrong column,” said John Mount, co-author of the book Practical Data Science with R. A data inventory is also valuable for Hadoop data governance, according to Sunil Soares, author of Big Data Governance.

Alex Gorelik, founder and CEO of Waterline Data Science, said, “A major complaint with Hadoop is once you’ve loaded the data, extracting value is like finding a needle in a stack of needles. Waterline Data Inventory lets business users find the best needles in the stack of needles, without having to write code, and without having to wrangle the entire stack. That’s our secret sauce, and key to deliver faster time to value and broad Hadoop adoption.”

Hortonworks Data Platform was built by the core architects, builders and operators of Apache Hadoop and includes all of the necessary components to manage a cluster at scale and uncover business insights from existing and new big data sources. With a YARN-based architecture, HDP enables multiple workloads, applications and processing engines across single clusters with optimal efficiency. A reliable, secure and multi-use enterprise data platform, HDP is an important component of the modern data architecture, helping organizations mine, process and analyze large batches of unstructured data sets to make more informed business decisions.

“Hortonworks is dedicated to expanding and empowering the Apache Hadoop ecosystem, accelerating innovation and adoption of 100-percent open source enterprise Hadoop,” said John Kreisa, vice president of strategic marketing at Hortonworks. “We welcome Waterline Data Science to the Hortonworks Technology Partner Program and look forward to working with them to help strengthen Hadoop’s role as the foundation of the next-generation data architecture.”

Datanami