Lockheed Aims to Make Apache Storm Easier to Use
The data analytics arm of defense contractor Lockheed Martin yesterday announced that it’s making a big data analysis tool available as open source. Dubbed StreamFlow, the tool is designed to make it easier for beginner programmers to work with Apache Storm.
Apache Storm is a distributed computing framework designed to process streams of data in real time. The technology, which was acquired and then released as open source by Twitter several years ago, is a key component of the emerging big data technology stack, and is often deployed alongside the Kafka messaging system and in Hadoop clusters.
While Storm is powerful and, in the right hands, can be used to extract “perishable insights” from fast-moving data, it also presents a new paradigm for developers and can be complex to set up and run, a complaint common to many big data technologies. The folks at Lockheed Martin Data Analytics developed StreamFlow to simplify Storm for less technical users and, effectively, to make it enterprise-ready.
“The ultimate goal of StreamFlow is to make working with Storm easier and faster, allowing non-developers and domain experts of all kinds to contribute to real-time data-driven solutions,” says Jason O’Connor, vice president of Analysis & Mission Solutions with Lockheed Martin Information Systems & Global Solutions.
StreamFlow includes several components: a Web-based interface for building and monitoring Storm topologies; an interactive topology builder; a dashboard for monitoring the performance of Storm topologies; a topology engine that handles some of Storm’s complexities, such as ClassLoader isolation, serialization, and metrics; and a modular framework for publishing new capabilities in the form of Spouts and Bolts.
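In Storm’s model, a topology wires Spouts (data sources that emit tuples) into Bolts (processing steps that consume them). As a rough conceptual sketch of that pipeline shape, in plain Python rather than Storm’s actual Java API (all names here are illustrative):

```python
# Conceptual sketch of a Storm-style topology: a spout emits a stream
# of tuples, and a bolt processes each one as it arrives.
# Illustrative only; these are not Storm or StreamFlow class names.

class WordSpout:
    """Data source: emits a stream of words."""
    def __init__(self, words):
        self.words = words

    def stream(self):
        yield from self.words


class CountBolt:
    """Processing step: counts occurrences of each word as tuples arrive."""
    def __init__(self):
        self.counts = {}

    def execute(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1


# "Wire" the topology and push the stream through it.
spout = WordSpout(["storm", "kafka", "storm"])
bolt = CountBolt()
for tup in spout.stream():
    bolt.execute(tup)

print(bolt.counts)  # {'storm': 2, 'kafka': 1}
```

A real Storm topology adds what this sketch omits: parallelism, distribution across a cluster, and fault tolerance, which is where much of the setup complexity StreamFlow targets comes from.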
Lockheed envisions StreamFlow helping users create real-time analytic pipelines, particularly in the areas of systems telematics (Internet of Things), cyber security, and medical care. Future plans include open sourcing additional frameworks and supporting other real-time processing systems, such as Apache Spark, the company says.
Lockheed Martin has decades of experience dealing with streaming sensor data from aircraft, spacecraft, and missile defense systems, and its Data Analytics arm is parlaying that experience into non-military areas, including cyber security, medical analytics, forecasting, and quantum computing.
The shift to real-time analytics is viewed as an inevitable move in the industry, particularly as the velocity and volume of data continue to expand at geometric rates and the Internet of Things (IoT) makes its impact felt. The first generation of Hadoop applications was built largely on the batch-oriented MapReduce paradigm; Apache Storm, Kafka, and other technologies are expected to help usher in the era of real-time analytics.
We’re still at the very beginning of the real-time era, however, and that newness brings unfamiliarity and growing pains. A 2014 Forrester survey found a 66 percent increase in firms’ use of streaming analytics. But while customers are starting to adopt streaming analytic tools like Apache Storm in larger numbers, the open source technologies still lack the enterprise features that many companies expect, Forrester says.
“The streaming application programming model is unfamiliar to most application developers,” Forrester analysts Mike Gualtieri and Rowan Curran wrote in The Forrester Wave in July. “It’s a different paradigm from normal programming where code execution controls data. In streaming applications, the incoming data controls the code.”
StreamFlow installs on Windows, Linux, and Unix machines, and is available from Lockheed’s GitHub account under an Apache 2 license.