Apache Flink Survey Shows Enterprises Are Investing Heavily in Stream Processing
BERLIN and SAN FRANCISCO, Dec. 19, 2017 -- Enterprises are investing heavily in stream processing technology, according to the second annual Apache Flink user survey, announced today by data Artisans: the vast majority (87 percent) of organizations surveyed plan to deploy more applications powered by Apache Flink software in 2018. Of the dozens of new application types developers are building or planning to build, machine learning (64 percent), for both model scoring (34 percent) and model training (30 percent), anomaly detection/system monitoring (27 percent), and business intelligence/reporting (25 percent) are the most popular, followed by recommendation/decisioning engines (22 percent) and security/fraud detection (19 percent), which round out the top five. Most respondents (70 percent) say their team or department is growing and hiring in 2018, and 59 percent expect their team or departmental budget to increase.
Drawing on insights from 217 IT leaders, software engineers, application developers, and data/systems architects from 28 countries, the survey shows that the ability to react to data in the moment is becoming a top priority among enterprises of all sizes, from small organizations with under $1 million in annual sales (10 percent of respondents) to very large enterprises with over $1 billion in annual sales (18 percent of respondents). By adopting Flink and a data streaming architecture, enterprises can get insights from their data in milliseconds.
Current and Future Use
Global companies such as Alibaba, ING, Netflix, SK Telecom, Telefonica, and Uber use Flink as the stream processing platform of choice for large-scale stateful applications that manage high volumes of data. One quarter of respondents are processing at least 1 billion events per day, with 1 percent processing at least 1 trillion events per day:
- 1 percent process 1 trillion or more events per day
- 24 percent process between 1 billion and 999 billion events per day
- 18 percent process from 100-999 million events per day
- 43 percent process up to 99 million events per day
The volume of events is expected to grow exponentially as organizations implement more live data applications in the coming years. Of those who are planning to deploy more Flink applications in 2018:
- 62 percent expect to deploy one to five more applications
- 11 percent say six to 10 more applications
- 8 percent say 10+ applications
- 7 percent expect to deploy a whopping 20+ additional applications in 2018
Apache Flink’s streaming execution model can process both continuous (streaming) datasets and static (finite or batch) datasets, covering a broad range of data processing use cases within a single platform. Today, 46 percent of respondents use Flink only for continuous (streaming) data, 47 percent use it for a mixture of continuous data and static (finite) datasets, and 6 percent use it only for static datasets.
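The idea that one streaming model covers both kinds of input can be illustrated outside Flink. The following is a minimal Python sketch, not Flink's API: the function `running_counts`, the source `endless_clicks`, and the sample page names are all hypothetical. It only shows that the same record-at-a-time operator can consume a static (finite) dataset and a continuous one alike.

```python
from collections import defaultdict
from itertools import islice
from typing import Dict, Iterable, Iterator, Tuple


def running_counts(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Process events one at a time, emitting the updated count for each key.

    The operator never asks whether the input ends, so it works for
    finite and unbounded inputs alike.
    """
    counts: Dict[str, int] = defaultdict(int)
    for key in events:
        counts[key] += 1
        yield key, counts[key]


def endless_clicks() -> Iterator[str]:
    """Hypothetical unbounded source: cycles through page names forever."""
    pages = ["home", "cart", "home"]
    i = 0
    while True:
        yield pages[i % len(pages)]
        i += 1


# Static (finite) dataset: the stream simply ends.
batch_result = list(running_counts(["home", "cart", "home"]))

# Continuous dataset: same operator; here only the first five results are read.
stream_result = list(islice(running_counts(endless_clicks()), 5))
```

In this sketch a batch job is just a stream that happens to end, which is the sense in which a single execution model can serve both use cases.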
“This year’s survey presents clear evidence that stream processing is becoming widely adopted across enterprises of all sizes and in a variety of industries outside of technology, with financial services, insurance, real estate and telecommunications leading the pack,” said Kostas Tzoumas, co-founder and CEO of data Artisans and a PMC member of Apache Flink. “The market is expected to reach upwards of $13 billion USD by 2021, and we’re seeing a range of new applications being put into production, including machine learning, security and fraud detection, systems monitoring and Internet of Things. We are privileged to be part of such a vibrant community, and data Artisans is committed to ensuring Flink is constantly evolving to meet future use cases, and that we are providing the training, services, and support infrastructure to enable users to maximize the full potential of their data applications.”
Since implementing Flink, respondents have seen many benefits. Forty-six percent reported that high volumes of data are now available in real time (enabling them to move beyond batch processing), and 46 percent also said it is easier for them to build distributed applications. Other benefits include:
- 37 percent report improved scalability of applications
- 35 percent have seen improved performance of applications
- 32 percent cite simplified application design
- 29 percent report reduced application complexity
Respondents also credit Apache Flink with tangible business benefits that reach beyond the IT team: accelerating the innovation cycle, helping to keep systems up and running, and boosting revenue. These benefits will likely grow as companies expand their use of live data applications:
- 23 percent are able to bring new applications online faster
- 20 percent see improved reliability of applications
- 15 percent say their systems are more resilient
- 11 percent report cost savings
- 4 percent have seen an increase in revenue
Satisfaction and Areas of Focus and Development
Ninety-two percent of respondents expressed satisfaction with Flink, of whom 58 percent were very or completely satisfied. Looking at the specific areas that rank highest (very or completely satisfied), Flink’s strength in managing high-volume, high-velocity streaming datasets is evident in the top four areas of satisfaction:
- 76 percent for event time handling
- 74 percent for DataStream API (stream processing)
- 72 percent for throughput and latency
- 71 percent for windowing & watermarks
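For readers unfamiliar with the terms in this list, event time, windowing, and watermarks can be sketched in a few lines. This is a simplified plain-Python illustration under stated assumptions, not Flink's implementation or API: the window size, the lateness bound, the function name, and the sample events are all hypothetical, and Flink's actual firing rules differ in detail.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_SIZE = 10   # assumed: seconds per tumbling window
MAX_LATENESS = 5   # assumed: bound on out-of-order arrival, in seconds


def tumbling_event_time_counts(
    events: Iterable[Tuple[int, str]],
) -> List[Tuple[int, int]]:
    """Count events per event-time window.

    Each event is (event_timestamp_seconds, payload). A window is emitted
    as (window_start, count) once the watermark passes the window's end.
    """
    open_windows: Dict[int, int] = defaultdict(int)
    emitted: List[Tuple[int, int]] = []
    watermark = float("-inf")
    for timestamp, _payload in events:
        window_start = timestamp - timestamp % WINDOW_SIZE
        if window_start + WINDOW_SIZE <= watermark:
            continue  # window already closed: this late event is dropped
        open_windows[window_start] += 1
        # Watermark: highest timestamp seen so far minus the lateness bound.
        watermark = max(watermark, timestamp - MAX_LATENESS)
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                emitted.append((start, open_windows.pop(start)))
    return emitted


# Events arrive out of order; windows are assigned by event timestamp.
sample = [(1, "a"), (12, "b"), (8, "c"), (16, "d"), (3, "e"), (27, "f")]
result = tumbling_event_time_counts(sample)
```

Here the event stamped 8 arrives after the one stamped 12 but is still counted in the first window, because the watermark has not yet passed that window's end. The event stamped 3 arrives after the window has closed and is dropped. Windows still open when the input ends are not flushed in this sketch.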
Apache Flink is among the fastest-growing Apache Software Foundation projects. As more companies adopt and configure Flink to their organization’s specific needs, more user support will be needed. The top requests for new features or developments among the survey respondents were additional documentation, programming guides, and resources for getting started (55 percent); better tooling for non-engineering users (43 percent); and more support for programming languages beyond Java, Scala, and SQL (34 percent).
About the Survey
To better understand how organizations are using and plan to use Apache Flink software and to learn which features they like best and what features and improvements they would like to see in the future, data Artisans commissioned Researchscape International to conduct an online survey of 217 IT leaders, software engineers, application developers, data/systems architects, data scientists and analysts. The survey was fielded from November 6 to December 1, 2017.
Out of 28 countries, respondents were most often from the United States (24 percent), China (13 percent), and Germany (12 percent). Fifty-seven percent worked at organizations with 100-9,999 employees, one-third at organizations with up to 999 employees, and 14 percent at organizations with 10,000 or more employees. One-third of the organizations had annual sales of $1-100M and nearly one-fifth (18 percent) top $1B in annual sales, while 10 percent earned under $1M and 8 percent earned $100-500M.
About data Artisans
data Artisans was founded by the creators of open source Apache Flink to bring real-time data applications to the enterprise. dA Platform 2, with open source Apache Flink, provides turnkey stream processing to businesses, enabling them to manage and deploy live data applications so they can react to data instantaneously and make better business decisions. Global companies such as Alibaba, ING, Netflix and Uber use Flink as the stream processing engine to power large-scale stateful applications, including real-time analytics, search and content ranking, and fraud detection.
Source: data Artisans