September 21, 2016

Flink Distro Now Available from data Artisans

The company behind Apache Flink, data Artisans, today launched the first commercial distribution of the upstart stream processing engine. The new software product, called the dA Platform, is identical to the open source Flink project at this time, and comes with 24/7 technical support from the team that develops Flink.

The availability of technical support will spur further adoption of Flink as the primary engine for streaming data processing, predicts data Artisans CEO and co-founder Kostas Tzoumas, one of the originators of Flink.

“Basically it’s a vehicle for more traditional enterprise companies to adopt Flink in a much easier way,” Tzoumas tells Datanami. “We actually found this was a barrier for adoption. Some companies, in order for them to adopt open source, they need a vendor to call. So this provides that.”

The German company, which currently has 17 workers, employs most of the committers behind the Apache Flink project, Tzoumas says. “The same Flink committers who are building Flink are the ones who are supporting the dA distribution,” he says. “So the know-how is the best we can get out there.”

The dA Platform is based on Apache Flink version 1.1.2. In fact, the dA Platform is currently identical to the code you can download from the Apache Flink website. However, that won’t be the case for long, as the plan calls for adding more proprietary components to the dA Platform, which is closed source and is not free.

Tzoumas says the company, which also has an office in San Francisco, is currently assessing the types of add-ons that will benefit dA Platform users the most. He wouldn’t say specifically what the company would build, but he did mention monitoring Flink jobs and deploying the software to infrastructure as possible areas where dA Platform customers may see improvements.

The company is adamant that Flink is not forking, and that the Apache Flink project will continue to be the main vehicle for delivering core enhancements. “Our philosophy is that if there’s a feature that’s a natural fit for Flink then it goes into Flink,” Tzoumas says. “If there’s a feature that does not naturally fit in, but it’s outside the scope of the Flink project, then it goes to the dA Platform.”

This is the same pattern followed in the Hadoop world. “Hadoop is not the only part of the Hadoop solution,” Tzoumas adds. “There’s lots of pieces around how do you manage a Hadoop cluster with thousands of Hadoop jobs at scale, that don’t fall within the scope of the Apache Hadoop project, but are, for example, in Hortonworks HDP or Cloudera CDH.”

While data Artisans has been supporting some customers with their Flink projects for a while, the launch of the dA Platform marks the beginning of data Artisans’ life as a product company. The company will be showcasing its status as the commercial face of Flink at the Strata + Hadoop World show next week in New York City. To that end, Tzoumas will be signing copies of a book on Flink that he co-wrote with MapR’s Ellen Friedman and that was published by O’Reilly.

Flink has garnered a lot of interest in the big data community for its capability to process large amounts of streaming data with low latency and exactly-once semantics. Like Apache Spark, Flink is YARN compatible for deployment atop existing Hadoop clusters. But Flink is seen as carrying key advantages over Spark Streaming, particularly because it was developed from the ground up as a streaming engine first, whereas Spark was originally developed for the batch paradigm.

Last week, more than 350 people attended Flink Forward, the annual user conference around the technology. The event featured presentations from prominent Flink users, including Airbnb, Uber, and Alibaba, which runs the biggest Flink cluster in the world and uses the technology to power the recommendation engine on its website.

Netflix is also exploring using Flink at large scale for its next-generation streaming infrastructure, which it presented at Flink Forward 2016. “We’re working with companies that operate on an extremely large scale and most of the adoption of Flink comes from there,” Tzoumas says. “We’re also working with vendors and Hadoop distributors, and some of them will make Flink available as part of their platform.”

Customers are charged separately for the dA Platform depending on whether they’re in development or in production. Exact pricing wasn’t available at press time.
