Data Pros Are Maxed Out: Survey
The demands of the data-driven life are wearing on data professionals, according to survey results unveiled this week by data pipeline automation solution developer Ascend.io, which found that 96% of data professionals are at or over capacity.
In previous years, technology was the bottleneck to getting value out of growing volumes of data. But in its second annual DataAware Pulse Survey, Ascend.io concludes that the limitation now is finding the human capacity necessary to put big data and technology to work, particularly as it pertains to data pipelines.
The company states: “The majority (79%) of respondents indicated that their infrastructure and systems are able to scale to meet their increased data volume processing needs, which further highlights that the new problem with scale is focused on team capacity and less on technology capacity.”
Seventy-four percent of the 406 data professionals (data engineers, data scientists, data architects, and data analysts) whom Ascend.io surveyed during the second quarter said that their organizations’ needs for data products are growing faster than their team sizes, Ascend.io says.
Among data engineers–who are principally charged with building the data pipelines that are required to move data from source systems to the systems used for advanced analytics and AI–81% stated the data needs were growing faster than HR can find warm bodies to build and maintain the pipelines.
“Everyone is still feeling the pain, and bottlenecks still persist in the data landscape,” Ascend.io wrote in its survey results, which can be found here. “More than ever, all roads lead to data engineering.”
Ascend.io, of course, develops software designed to help automate the creation and maintenance of data pipelines. The company’s offering introduces a declarative approach to data pipeline building, which enables engineers to focus on high-level data movement while the software handles the nitty-gritty details of instantiating pipelines, filling them with data, performing periodic maintenance to optimize performance, and turning them off when they’re no longer needed.
“Data pipelines are fueling nearly every data-driven initiative across the business,” says Sean Knapp, who’s the CEO and founder of Ascend.io, in a press release. “However, as innovations at the infrastructure layer continue to enable processing of greater volumes and velocities of data, businesses face a new scaling challenge: how to enable their teams to achieve more, and faster.
“Our research shows that team sizes are not scaling at a fast enough rate to keep up with the needs of the business,” continues Knapp, who’s a 2021 Datanami Person to Watch. “Combined with our data that highlights almost every data professional today is already at capacity, this leaves little room for strategic work and innovation.”
The data pipeline building boom is in full swing, as organizations look to gather, move, and exploit this natural resource. The survey found that 93% of respondents anticipate the number of data pipelines in their organization to increase between now and the end of the year. What’s more, 56% expect the number of data pipelines to increase by more than 50% by December 2021.
Every category of data professional is feeling pressure to up their game to alleviate the data backlog. For example, enterprise architects were 2.7 times more likely to indicate data architecture was the most backlogged, Ascend.io found. For data analysts, the number was 4.5x, while for data scientists, 5.1x. But data engineers took the cake, as the survey found they were 7.1x more likely to identify data engineering as the reason for the delays in rolling out data-driven products and insights.
“Across the board, our research shows that data engineering has become a significant roadblock for many teams,” Knapp says. “Simply put, data engineers are overburdened with building and maintaining mission-critical yet fragile pipelines.”
The backlog in data engineering can lead to downstream delays, and potential confusion as users attempt to self-serve their own data, he continues. “As a result, each team becomes overwhelmed with their own workload – a dynamic that can quickly lead to organizational friction, if not properly addressed, as well as inhibit the ability to meet the data needs of the business,” he says.