ChaosSearch-Unisphere Survey Finds Data Quality and Timeliness Issues Have Increased in Past 3 Years
BOSTON, May 17, 2022 — ChaosSearch today announced the results of a survey of more than 200 IT leaders and professionals, which analyzed how organizations are leveraging modern data architectures, such as data warehouses, data lakes, data lakehouses, data mesh, and data fabric. The survey was conducted by Unisphere Research, a division of Information Today, Inc., and fielded among the subscribers of Database Trends and Applications (DBTA).
For decades, the data warehouse was the standard for archiving and retrieving data for analysis and reporting. However, the emergence of new data architectures, coupled with the exponential growth in the volume and velocity of data that organizations can analyze, has created new challenges and opportunities for IT teams. To search, access, and leverage the insights these new data sources offer, enterprises need to modernize the way they design, implement, and manage their data delivery and analytics systems.
“The heavy reliance on older data infrastructures and warehouses is hindering organizations’ ability to become data-driven,” said Joe McKendrick, research analyst and author of the 2022 Data Delivery and Consumption Patterns Survey. “This survey highlights the need for more modern, cloud-based solutions that expedite and democratize data analysis across the enterprise—without compromising data quality. It would be wise for every IT and data professional to consider how much time they spend organizing, transforming, and prepping data to be analyzed—and to consider how a new solution might solve for those headaches.”
The 2022 Data Delivery and Consumption Patterns Survey examined these trends in more detail to determine current data environments, purchasing plans for 2022 and beyond, and the perspectives of those working to build and leverage data environments. Key findings include:
- Data warehouses are still a foundational data architecture, though teams have started exploring new solutions. Data warehouses are used at 82% of the enterprises surveyed. Meanwhile, about half of enterprises have implemented data lakes, and one in five have adopted data lakehouses. Enterprises are also considering the role of data mesh and/or data fabric in supporting decision making.
- Investments in data lakes, specifically, are on the rise. More than half of data executives (56%) indicate they will be ramping up their spending on data lakes over the next three years. Existing data lake users are particularly well invested in their infrastructures, with 66% of this group planning for growth.
- Data quality and timeliness are the most pressing issues cited by respondents. Sixty-five percent report these issues have increased over the past three years, with 20% of the increase reported as “substantial.” Additionally, 69% of respondents using data warehouses say their data delivery issues have increased.
- There’s still latency and lag time for many organizations when it comes to delivering data to users within the enterprise. Forty-four percent say it takes a day or longer for users to get the data they need. A majority of traditional data warehouse users (51%) report such a lag.
- Data replication has increased over the last three years, according to 71% of respondents. Twenty-three percent report this increase as “significant”—which is likely causing greater issues tied to cost, complexity, and data access.
- Data lakes and data lakehouses support a diverse set of workloads. Respondents report that data lakes support not only analytics and decision support (cited by 98% of respondents) but also customer transactions, artificial intelligence/machine learning, and big data clusters.
“The need to store and access data has been critical to enterprise success for decades, but it has never been possible for companies to search and analyze that data from within the same architecture,” said Thomas Hazel, CTO, Founder, and Chief Scientist, ChaosSearch. “With the emergence of new, cloud-based, modern data platforms, organizations now have the power to accelerate their analytics initiatives and overcome the barriers created by traditional databases—such as manual data prep and transformation. This survey highlights the growing need amongst data professionals to deliver unified search and SQL analytics that power better business decisions, which is the exact opportunity our data lake platform aims to create.”
The survey was announced at Data Summit 2022, DBTA’s annual event in Boston, where Hazel is delivering a keynote today on modern data infrastructures. Additionally, Dave Armlin, VP solution architect & customer success, will be leading a session at 10:45 am ET on May 18 focused on the future of cloud analytics.
To learn more and to download a copy of the report, visit here.
The survey was conducted by Unisphere Research, a division of Information Today, Inc., in partnership with ChaosSearch. More than 200 IT leaders and professionals among the subscribers of DBTA were surveyed. The most common job titles among survey respondents were directors/managers of IT, directors/managers of analytics, and data architects. Popular industries included technology, financial services, healthcare, manufacturing, and telecommunications. Two out of five survey respondents were from companies with 5,000 or more employees.
ChaosSearch helps modern organizations Know Better® by activating the data lake for analytics. The ChaosSearch Data Lake Platform indexes customers’ cloud data, rendering it fully searchable and enabling analytics at scale with massive reductions of time, cost and complexity.
ChaosSearch was purpose-built for cost-effective, highly scalable analytics encompassing full text search, SQL and machine learning capabilities in one unified offering. The patented ChaosSearch technology instantly transforms your cloud object storage (Amazon S3, Google Cloud Storage) into a hot, analytical data lake. Cloud-based organizations including Equifax, Blackboard, and Digital River rely on the ChaosSearch Data Lake Platform to query their cloud object storage directly within their preferred visualization tools.