September 19, 2013

Apache Takes Storm Into Incubation

Isaac Lopez

Get used to saying it: “Apache Storm.”

On Wednesday night, Doug Cutting, Director for the Apache Software Foundation (ASF), announced that the organization will add the distributed real-time computation system known as Storm as the foundation’s newest Incubator podling.

Storm was created in early 2011 by Nathan Marz, then lead engineer at BackType, before the software (along with the entire company) was acquired by Twitter. At Twitter, Storm became the backbone of the social giant’s web analytics framework, tracking every click happening within the rapidly expanding Twittersphere. The Blue Bird also uses Storm as part of its “What’s Trending” widget.

In September of 2011, Marz announced that Storm would be released as open source. It has since been widely adopted, with users including Groupon, Yahoo!, InfoChimps, NaviSite, Nodeable, Ooyala, The Weather Channel, and more.

Today, Marz says that Storm fills the real-time processing gap that the batch-oriented Hadoop currently cannot. “The lack of a ‘Hadoop of real time’ has become the biggest hole in the data processing ecosystem,” said Marz in a recent statement. “Like how MapReduce greatly eases the writing of parallel batch processing, Storm’s primitives greatly ease the writing of parallel real time computation.”

Demonstrating Marz’s point, Twitter recently released a Hadoop-Storm hybrid called “Summingbird.” Developers at The Blue Bird say Summingbird fuses the two frameworks into one, letting developers use Storm for short-term processing and Hadoop for deep data dives without resorting to a gum-and-duct-tape approach to piece the two frameworks together.

While Storm is still just an Incubator podling, acceptance as a Top-Level Project seems to be just a matter of time. Having been open source for the last two years under the Eclipse Public License, Storm enters the Apache Incubator as a mature project with an active community, detailed documentation, and meetups around the world, including in the Bay Area, Boston, and London.

Related items:

Yahoo! Spinning Continuous Computing with YARN 

Twitter Conjures Up a Hadoop-Storm Hybrid, Ponders IPO 

LinkedIn Open Sources Samza Stream Processor 
