July 22, 2014

Streaming Analytics Ready for Prime Time, Forrester Says

Alex Woodie

Analytic platforms that generate insights from data in real time are mature enough for enterprises to begin adopting them, Forrester says in its latest report. While open source streaming analytic products like Apache Storm are proving popular, Forrester says they lack key functionality found in the offerings of proprietary vendors, such as top-rated Software AG.

You don’t need a Forrester analyst to know that streaming analytics is red hot at the moment. If Hadoop has opened our eyes to what is possible with big data, then the excitement around real-time streaming is all about compressing and accelerating what vendors call “time to insight.”

Adoption of real-time analytic platforms has soared as enterprises begin to get a taste of what they can do. In just the past two years, Forrester has detected a 66 percent increase in firms’ use of streaming analytics, based on the group’s 2014 Business Technographics survey of nearly 750 decision makers.

It’s all about acting on what Forrester calls “perishable insight,” which is information that firms can “only detect and act upon at a moment’s notice.” Streaming analytics are just about the only game in town for harnessing the “white-water flow” of perishable insights originating from the Internet of Things, mobile phones, market data, sensors, Web clickstream, and transactions, the firm says.

Enterprises are fully aware of the need for streaming analytics. The big question is what platform design will resonate best, and what products will gain market share the quickest. Those are the types of questions where a Forrester analyst comes in quite handy.

The problem with real-time streaming platforms is that they don’t resemble the types of analytic applications that enterprises are used to. “The streaming application programming model is unfamiliar to most application developers,” write Forrester analysts Mike Gualtieri and Rowan Curran in The Forrester Wave on July 17. “It’s a different paradigm from normal programming where code execution controls data. In streaming applications, the incoming data controls the code.”

Developers are still familiarizing themselves with the streaming operators that are used in real-time analytic systems. There are a handful of operator types–such as filters, aggregators, correlators, and locators, as well as time-window operators, temporal operators, enrichment operators, and various custom and third-party operators–and developers typically must assemble them into a pipeline that produces the desired result.
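To make the idea of assembling operators concrete, here is a minimal sketch in plain Python. It is not taken from the Forrester report or from any vendor's product; the function names (`filter_op`, `enrich_op`, `tumbling_window_avg`), the event fields, and the sample data are all hypothetical. It chains a filter, an enrichment operator, and a time-window aggregator, and it assumes events arrive in timestamp order.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, object]  # hypothetical event shape: {"ts", "sensor", "value"}


def filter_op(events: Iterable[Event], predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Filter operator: pass along only the events that satisfy the predicate."""
    for event in events:
        if predicate(event):
            yield event


def enrich_op(events: Iterable[Event], regions: Dict[str, str]) -> Iterator[Event]:
    """Enrichment operator: attach a region looked up from reference data."""
    for event in events:
        yield {**event, "region": regions.get(event["sensor"], "unknown")}


def tumbling_window_avg(events: Iterable[Event], width: int) -> Iterator[Tuple[int, str, float]]:
    """Time-window aggregator: average value per region per fixed-width window.

    Assumes timestamps never go backwards; a window is emitted once an event
    from a later window arrives, or when the input ends.
    """
    current = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for event in events:
        start = int(event["ts"] // width) * width
        if current is not None and start != current:
            for region in sorted(sums):
                yield (current, region, sums[region] / counts[region])
            sums.clear()
            counts.clear()
        current = start
        sums[event["region"]] += event["value"]
        counts[event["region"]] += 1
    if current is not None:
        for region in sorted(sums):
            yield (current, region, sums[region] / counts[region])


def run_demo() -> List[Tuple[int, str, float]]:
    """Assemble the three operators over made-up sample events."""
    events = [
        {"ts": 0, "sensor": "a", "value": 10.0},
        {"ts": 3, "sensor": "b", "value": -1.0},  # dropped by the filter
        {"ts": 5, "sensor": "a", "value": 20.0},
        {"ts": 12, "sensor": "b", "value": 30.0},
    ]
    regions = {"a": "east", "b": "west"}
    valid = filter_op(events, lambda e: e["value"] >= 0)
    enriched = enrich_op(valid, regions)
    return list(tumbling_window_avg(enriched, width=10))
```

Calling `run_demo()` on the sample data yields one average for the `east` region in the window starting at 0 and one for `west` in the window starting at 10. Commercial platforms package operators like these behind graphical tools or query languages; the sketch only illustrates how they compose.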


Forrester Wave for Big Data Streaming Analytics Platforms, Q3 2014

Forrester ranked just a handful of streaming analytic products in its latest Wave, which favored bigger, more established vendors whose products have proven themselves across multiple industries.

The requirement to have streaming analytic operators eliminated open source favorite Apache Storm from inclusion in Forrester’s report. Despite the fact that the Hadoop-compatible add-on has seen several high-profile deployments at big outfits like The Weather Channel, Spotify, and Twitter (which released Storm into open source), Storm has its downsides, Forrester says, namely that it’s “a very technical platform that lacks the higher order tools and streaming operators that are provided by the vendor platforms.”

And while Apache Spark was not specifically mentioned by Forrester, one can imagine that its Spark Streaming platform may lack some of the industry-proven credentials that enterprises like to see before throwing their weight behind a product. There’s also the fact that Spark Streaming is not actually real-time streaming in the technical sense of the term, but more of a “micro batch” framework.
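The difference between the two models can be sketched in a few lines of plain Python. This is an illustration only, not Spark Streaming's API or implementation; the function names, the event format of (arrival time in seconds, value), and the one-second batch interval are assumptions made for the example. Both functions keep a running total, but they differ in when a result becomes available.

```python
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[float, int]  # hypothetical: (arrival time in seconds, value)


def per_event(events: Iterable[Event]) -> Iterator[Tuple[float, int]]:
    """Event-at-a-time model: emit an updated total as each event arrives."""
    total = 0
    for ts, value in events:
        total += value
        yield (ts, total)


def micro_batch(events: Iterable[Event], interval: float) -> Iterator[Tuple[float, int]]:
    """Micro-batch model: emit the total only when a batch interval closes.

    Assumes events arrive in time order. Each result is stamped with the end
    of the interval that contained the events.
    """
    total = 0
    batch_end = None
    for ts, value in events:
        end = (ts // interval + 1) * interval
        if batch_end is not None and end > batch_end:
            yield (batch_end, total)  # the previous batch has closed
        batch_end = end
        total += value
    if batch_end is not None:
        yield (batch_end, total)  # flush the final batch


def compare(events: List[Event], interval: float = 1.0):
    """Return the outputs of both models for the same input."""
    return list(per_event(events)), list(micro_batch(events, interval))
```

For events arriving at 0.1, 0.4, and 1.2 seconds, the per-event version produces three results stamped with the arrival times, while the micro-batch version produces two, stamped at the 1.0- and 2.0-second boundaries. The added delay is bounded by the batch interval, which is the trade-off the “micro batch” label refers to.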

Stringent inclusion criteria also prevented Forrester from including streaming analytic products from startups, such as DataTorrent, whose Hadoop-compatible Real-Time Streaming (RTS) product clocked in with the capability to process 1.5 billion events per second when DataTorrent released the product into general availability at last month’s Hadoop Summit. The report also doesn’t include vendors playing on the edge of real-time analytics, such as visualization software developer ZoomData; in-memory data grid developers, such as GridGain and ScaleOut Software; megavendors like Oracle and Microsoft, which don’t sell standalone streaming analytics tools; other open source products, like the S4 Apache Incubator project, or Apache Kafka and Apache Samza, which came out of LinkedIn; or any of the NoSQL and NewSQL vendors, such as DataStax or VoltDB, that offer some analytic functions atop a fast transactional database.

That leaves us with just seven real-time streaming platforms that made the cut in Forrester’s Wave for Q3 2014: those from Software AG, IBM, SAP, TIBCO, Informatica, Vitria, and SQLstream. Here’s a rundown on these top real-time streaming platforms, according to Forrester’s ranking:

  1. Software AG: The heart of Software AG’s real-time streaming offering is Apama, a product that has a long history as a complex event processing (CEP) platform. Since it was released way back in 2001, Apama has seen widespread use on Wall Street, where it powers algorithmic trading applications, but it has also seen use in retail banking, telecommunications, logistics, government, energy, and manufacturing. Software AG acquired Apama from Progress Software in 2013.
  2. IBM: Forrester says Big Blue’s InfoSphere Streams is “industrial strength,” and can support the “gnarliest of use-cases” (cowabunga, dudes!). It scored highest in performance and scalability, and is second in functionality only to Apama. The software came out of IBM Research in 2009 and has deployments in healthcare, financial services, telecommunications, government, energy, and utilities.
  3. SAP: The German software giant’s Event Stream Processor (ESP) has a “long, rich history” as one of the original CEP platforms. SAP’s roadmap calls for integrating ESP into the in-memory HANA database, which won SAP praise from Forrester.
  4. TIBCO: The 2013 acquisition of StreamBase gave TIBCO a reputable story to tell in the real-time streaming business, and complements The Information Bus Company’s 15-year history in the high-frequency trading market. Forrester liked StreamBase’s intuitive interfaces and the fact that it can be used by non-developers.
  5. Informatica: Forrester was impressed enough with Informatica’s 2011 decision to redesign its RulePoint business rules engine with support for streaming analytics to make it one of the Leaders in the current Wave report. It particularly likes that RulePoint applications can be configured using either streaming operator constructs or business rule constructs.
  6. Vitria Technology: The lone entry in the Strong Performers category, the Sunnyvale, California company has a “proven track record” of helping companies in a variety of industries. Forrester was particularly impressed with Vitria’s “unified platform” approach, which the group says gives users unique options.
  7. SQLstream: The San Francisco, California company’s reliance on ANSI SQL gives it a unique place in the world of streaming analytics, and helps to lessen the learning curve for developers. A relatively shallow market presence pulled down the total score for SQLstream, making it the only company in the Contenders category.

Forrester does not pretend that its Wave is a comprehensive list of real-time streaming systems. In addition to the products and vendors listed above that did not make the cut, there are many hosted streaming analytic offerings available that organizations may want to check out. These include Amazon Kinesis, which was launched in late 2013, and Google Cloud Dataflow, which was launched barely a month ago.

These are exciting times in the big data analytics world, and you can bet that real-time streaming platforms will continue to occupy a bigger chunk of the market as time goes on. The rise of in-memory systems and data grids, the ongoing reformation of Hadoop towards interactive and real-time processing, and trickle-up competition from NoSQL databases with analytic aims will combine to accelerate the evolution of real-time streaming into a standard component of the analytic stack.
