Transforming ‘Data Swamps’ into Data Lakes
Data lakes, along with a growing number of analytics tools for accessing their contents, are expanding the universe of business users able to reap insights through what a new survey calls “point-and-click” accessibility.
The vendor survey of more than 500 business users released on Tuesday (July 17) by visual analytics specialist Arcadia Data found that 72 percent use data lakes to gather business intelligence. Nevertheless, the survey identified usage gaps that often turn the repositories into “data swamps.”
The survey also makes the case that Hadoop remains viable despite a steady migration to the cloud: 62 percent of respondents said they deployed data lakes in Hadoop clusters. Regardless of where data lakes are deployed, a majority of those polled said having the right business intelligence tools was the key to analyzing data and acting on the resulting intelligence.
The survey findings also underscore how new tools are expanding access beyond data scientists and engineers to “casual” business users who lack the coding or scripting skills once required to tap into data stored in lakes. Vendors like Arcadia tout their visual analytics and business intelligence software as among the tools enabling easier access to data.
Accordingly, 61 percent of respondents said data lake access tools have enabled business users to author and edit reports and dashboards without coding. Three-quarters said those tools lead to better decisions.
Still, plenty of gaps remain, including the need for better data preparation tools and “advanced” analytics capabilities. For example, only half of those polled said business users were able to blend data sets stored in data lakes with outside data. Fewer still felt they could use existing tools to view “complex correlations” among data.
“Business users want direct, fast access to data,” said Priyank Patel, co-founder and chief product officer at Arcadia Data. “It’s no longer acceptable to be chained to mountains of data without a clear, simple and effective way to derive value from it.” With new tools, Patel added, “You don’t have to be a data scientist or a PhD to do so.”
Arcadia Data, San Mateo, Calif., has steadily upgraded its visual analytics platform as a bridge between business users and self-service access to big data. The latest enterprise version of its platform, released in March, is positioned as a “data-native” and “AI-driven” architecture that addresses what the company asserts is a growing requirement for business intelligence tools for both data lakes and data warehouses.
The latest version also attempts to address gaps in data preparation through the ability to handle more data formats on the fly without the help of the IT department. It also includes a free tool for KSQL, the streaming SQL engine for Apache Kafka, which allows users to analyze streaming data in real time.