Databricks to Deliver Spark Distribution Offering for SAP HANA Platform
SAN FRANCISCO, Calif., July 1 – Databricks, the company founded by the creators of Apache Spark – the popular open-source processing engine – today announced a new partnership with SAP to deliver a Databricks-certified Apache Spark distribution offering for the SAP HANA platform. The announcement was made at Spark Summit 2014, being held June 30 – July 2 in San Francisco.
The Databricks-certified distribution offering for SAP HANA contains the Spark processing engine, which works with any Hadoop distribution out of the box and provides a more complete data store and processing layer for Hadoop. Because the offering is certified by Databricks as compatible with the Apache Spark distribution, the rapidly growing set of “Certified on Spark” applications can run on SAP HANA out of the box. This production-ready distribution offering is the first result of Databricks’ new partnership with SAP.
“We’re thrilled to be embarking on this journey with SAP to bring together two powerful technologies to better enable enterprises to derive value from their data,” said Ion Stoica, CEO of Databricks. “SAP HANA is both an incredibly powerful and fast analytics engine, as well as a repository for some of the most valuable enterprise data by virtue of the enterprise applications that it helps run. This integration will help enable the large and growing community of Hadoop and Spark developers and applications to harness these capabilities immediately via Spark as well as extend the reach of SAP HANA.”
SAP HANA integrated with Spark will help enable real-time applications and interactive analysis that combine corporate application data with content stored in the Hadoop Distributed File System (HDFS). Developers and data scientists working on Spark can also benefit from end-to-end data processing acceleration in SAP HANA by using its suite of in-memory engines and libraries for transactional applications, analytics, predictive analysis, machine learning, and text, graph and geospatial analysis. This helps simplify the integration of mission-critical applications with contextual data stored in Hadoop-like data stores. As a result, in-memory computation can happen where the data resides, which can help minimize costly and time-consuming data movement.
“SAP has continually been at the forefront of innovation to simplify and better serve customers, and bringing together Spark and SAP HANA is simply the latest example of this,” said Steve Lucas, president, Platform Solutions, SAP. “This can allow enterprises to build on SAP HANA’s value proposition by providing some of the best-of-breed capabilities across the full spectrum of data and processing needs without the need to painstakingly stitch together independent solutions.”
Developers and data scientists will be able to more easily create a new class of applications with SAP HANA and Spark. For example, such applications can span data domains: they can integrate inventory analysis with social media trends for retailers; combine sensor data with billing systems to deliver personalized resource and cost-saving recommendations for utilities; or combine patient data with epidemiological information to support better staffing decisions for healthcare providers.