March 1, 2024

2024 State of Apache Airflow Report Shows Rapid Growth in Airflow Adoption


Organizations have spent years working to become more data-driven, with mixed results. The value of data is undeniable, and the emergence of AI and ML has made it more crucial than ever for organizations to function as data-driven entities.

The 2024 State of Apache Airflow report sheds light on the growth, usage, and user community of Airflow as organizations navigate the complexities of data orchestration, data pipelines, and data delivery.

The data for the report was compiled by surveying 281 professionals who were directly or indirectly responsible for managing their organization’s Airflow instances. This included data engineers, software engineers, DevOps engineers, and solutions architects. 

The report highlights three critical trends – rapid growth in Airflow adoption, increasing reliability of Airflow for mission-critical data delivery, and Airflow becoming the standard for AI pipelines. 

Apache Airflow is an open-source platform for creating, scheduling, and monitoring data and computing workflows. With its graphical UI, integration with major cloud platforms, and ease of use, Airflow is becoming a popular tool for orchestrating workflows or pipelines. 
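
To make the idea of orchestrating a workflow concrete, here is a minimal sketch in plain Python. It is not Airflow's API and is not taken from the report; the task names and the `run_order` helper are hypothetical. It only illustrates the core job of an orchestrator, which is to run each task after the tasks it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "train_model": {"transform"},
    "refresh_dashboard": {"transform"},
}


def run_order(dependencies):
    """Return one valid execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(pipeline)
print(order)
```

A real orchestrator such as Airflow layers scheduling, retries, and monitoring on top of this kind of dependency ordering.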

One of the key findings of the report is that AI and ML usage facilitated by Airflow rose by 24 percent year over year. The top applications for Airflow included internal analytics dashboards, GenAI, and ML. There has also been a 68 percent year-over-year increase in Airflow downloads, with over 165.7 million downloads. With 2.8K contributors, Airflow also outpaces fellow Apache projects Spark and Kafka.

While the AI market is projected to continue growing, there are some concerns, including fragmented tool stacks, siloed tools, and a looming threat of stricter compliance and operational requirements. 

According to Apache, a growing number of businesses are addressing these concerns using Astronomer, a unified data platform built on Apache Airflow. Astronomer allows organizations to ensure data privacy and unify data across clouds, teams, and deployments. It also helps accelerate development through seamless integrations.


The report reveals that Apache Airflow has evolved from a niche tool to a mission-critical asset. Two-thirds (67 percent) of companies have more than 6 people using Airflow. This indicates a growing trend of data teams using Airflow to deliver business outcomes. 

It is not just the number of people using Airflow that is impressive; the frequency of interaction is also high. The report shows that 55 percent of users interact with Airflow daily, with another 26 percent using it at least once a week. It appears that companies are growing more confident in Airflow and in their vision of what the platform can accomplish.

Around half (46 percent) of respondents said they consider Airflow a “very important” tool for their business operations. Outages and the need for scalability were the respondents' top two concerns. Respondents also said that the most concerning aspects of data downtime include internal systems being negatively affected, throttling of data team productivity, and issues with revenue-generating apps or services.

The results of the survey highlight how Apache Airflow meets the increasing challenges of data orchestration, scaling, and facilitation of best practices. Perhaps the greatest impact of Airflow has been in AI pipeline delivery. This momentum positions Airflow as one of the leading data orchestration platforms for AI development and deployment.
