Demand for Workflow Orchestration Solutions on the Rise, Says New Report
Prefect Technologies, producer of the open-source data workflow automation platform Prefect, announced the release of a new report by data and AI consulting firm Gradient Flow titled the “2022 State of Workflow Orchestration.”
Prefect, which sponsored the survey, said the report reveals rising demand for workflow orchestration tools, driven by companies looking to modernize their data infrastructure and by the growth of machine learning modeling. Orchestration solutions can assist data professionals with writing, scheduling, monitoring, and managing data and ML pipelines.
The online survey ran from Feb. 6 to April 4, 2022, and drew 581 respondents, including data scientists and analysts, data and ML engineers, software engineers and developers, and DevOps engineers.
The report’s introduction notes that this is a transitional time in which companies are moving from experimentation with artificial intelligence-powered data and machine learning to actual implementation. Firms are investing in technologies like data integration, DataOps, data governance, and data platforms as a strategy for implementing analytics and AI. The authors mention that recently, “machine learning researchers are rallying around ‘data-centric AI,’ — a collection of tools and techniques for cleaning, augmenting, and enhancing datasets to improve the accuracy of ML and AI models.”
Founder and Principal of Gradient Flow, Ben Lorica, said: “More organizations now want to enable analytics and AI to benefit their businesses, creating demand for not only data talent but also foundational data technologies. This includes everything from data integration and DataOps to data governance and platforms—as well as orchestration. Organizations need orchestrators to make data available for downstream applications including data science and AI systems. The growing demand for workflow orchestration has led to the emergence of a host of open source and SaaS solutions in an area ripe for innovation.”
The survey first asked what percentage of data professionals’ recurring tasks are handled by an orchestration solution. Of those surveyed, 43% said they use an orchestrator to run more than half of all recurring tasks, a figure that rose to 68% among respondents in data/ML engineering roles. In addition, 91% of data/ML engineers use orchestrators for more than a quarter of all recurring tasks.
Across all respondents, 29% reported data science as their primary use case. Among data/ML engineers, the three leading use cases were data movement (21%), machine learning/MLOps (16%), and data transformation (14%).
As for which orchestration systems respondents use, the report found that the two most popular frameworks among all those surveyed were Apache Airflow (36%) and Prefect (14%), with Airflow the dominant choice for data scientists. Among those in data/ML engineer roles, Prefect (17%) and Dagster (12%) joined Airflow (32%) as the most-used frameworks.
The survey’s authors then asked which features matter most in workflow orchestration solutions and found that the top three are ease of use (38%), caching (37%), and monitoring (37%).
The report concludes that with 91% of those surveyed using orchestration in at least a quarter of their data workflows, “data integration and data orchestration are active areas with well-funded startups building a wide array of solutions.” The authors also note that low-code and no-code tools are quickly cropping up to accommodate less technical users, given the current shortage of data professionals.
“This report shows that the market for orchestration is growing significantly and starting to evolve beyond legacy products,” added Jeremiah Lowin, Founder and CEO of Prefect. “Data scientists, engineers, and developers all ultimately need dataflow automation that will ensure their code runs and their data arrives as expected, while most importantly making it easy to identify when a failure occurs and how to fix it. The result is less time spent writing defensive code to protect against failure, and more time on productive objectives.”
Download the report here.