GigaSpaces Closes Analytics-App Gap With Spark
Data analytics and cloud vendors are rushing to support enhancements to the latest version of Apache Spark that boost streaming performance while adding new features such as data set APIs and support for continuous, real-time applications.
In-memory computing specialist GigaSpaces this week joined IBM and others jumping on the Spark 2.1 bandwagon with the rollout of an upgraded transactional and analytical processing platform. The company said Wednesday (July 19) the latest version of its InsightEdge platform leverages Spark data science and analytics capabilities while combining the in-memory analytics juggernaut with its open source in-memory computing data grid.
The combination provides a distributed data store based on RAM and solid-state drives, the New York-based company added.
The upgrade was prompted by the new Spark capabilities along with growing market demand for real-time, scalable analytics.
The in-memory computing platform combines analytical and transactional workloads in an open source software stack, and supports streaming applications such as those processing Internet of Things (IoT) sensor data.
The analytics company said it is working with Magic Software (NASDAQ and TASE: MGIC), an application development and business integration software vendor, on an IoT project designed to speed ingestion of telemetry data using Magic’s integration and intelligence engine.
The partners said the sensor data integration effort targets IoT applications such as predictive maintenance and anomaly detection, in which data is ingested, prepped, correlated and merged. Data is then transferred from the GigaSpaces platform to Magic’s xpi engine, which serves as the orchestrator for predictive and other analytics tasks.
Beyond the IoT partnership and combined transactional and analytical processing, the Spark-powered in-memory computing platform offers machine learning and geospatial processing capabilities as well as multi-tier data storage for streaming analytics workloads, the company said.
Ali Hodroj, GigaSpaces’ vice president of products and strategies, said the platform upgrade responds to the growing enterprise requirement to integrate applications and data science infrastructure.
“Many organizations are simply not large enough to justify spending valuable time, resources and money building, managing, and maintaining an on-premises data science infrastructure,” Hodroj asserted in a blog post. “While some can migrate to the cloud to commoditize their infrastructure, those who cannot are challenged with the high costs and complexity of cluster-sprawling big data deployments.”
To reduce latency, GigaSpaces and others are embracing the fast data analytics capabilities of Spark 2.1, which was released late last year. (Spark 2.2 was released earlier this month.)
Vendors such as GigaSpaces are promoting tighter collaboration between DevOps and data science teams via a unified application and analytics platform. Others, including IBM, are leveraging Spark 2.1 for Hadoop and stream processing distributions.
IBM (NYSE: IBM) said this week the latest version of its SQL platform targets enterprise requirements for data lakes by integrating Spark 2.1 on the Hortonworks Data Platform, Hortonworks’ Hadoop distribution. It also connects with Hortonworks DataFlow, that company’s stream-processing platform.