January 4, 2021

Peering Into the Crystal Ball of Advanced Analytics


The world of advanced analytics was evolving quickly at the end of 2020. And according to our panel of experts who volunteered predictions on the topic, the accelerated pace of change in advanced data analytics will continue in 2021.

2021 kicks off a new decade for advanced analytics, and a new attitude is apparent. GoodData CEO Roman Stanek, for one, is bullish on the potential.

“The 2010s were the ‘Lost Decade’ for data, in large part due to Silicon Valley’s misplaced obsession with Hadoop,” Stanek tells Datanami. “The 2020s, in contrast, will be data’s ‘Decade of Growth.’ Snowflake captured an entire cloud data market and will change the data landscape as we know it. Standardized cloud storage will redefine data management and the data value chain. The result? Massive growth and the software industry’s first $100 billion IPO.”

Led by Snowflake, cloud data warehouses have become popular places to organize data used by analytic teams. But in 2021, the shine will begin to wear off cloud data warehouses, says Tomer Shiran, co-founder and chief product officer at Dremio.

“The cloud data warehouse vendors have leveraged the separation of storage from compute to deliver offerings with a lower cost of entry than traditional data warehouses, as well as improved scalability,” Shiran says. “However, the data itself isn’t separated from compute–it must first be loaded into the data warehouse, and can only be accessed through the data warehouse. This includes paying the data warehouse vendor to get the data into and out of their system. So, while upfront expenses for a cloud data warehouse may be less, the costs at the end of the year are likely significantly higher than expected. By leveraging modern cloud data lake engines and open source table formats like Apache Iceberg, however, companies can now query data in the data lake directly without any degradation of performance, resulting in an extreme reduction in complex and costly data copies and movement.”

The cloud will continue to gobble up analytics workloads in 2021 (Blackboard/Shutterstock)

Back in 2010, the market was focused on big data–obsessively, some might say. While data volumes continue to grow at geometric rates, the term “big data” just doesn’t pack the same punch that it used to, argues Justin Borgman, the CEO of the Presto-backer Starburst Data.

“‘Big data’ is irrelevant. [In 2021] we’ll see business leaders pointing instead to ‘wide data’ to make data-driven decisions, which encompasses all types of data no matter where it lives–in the cloud, on prem, in data lakes, or data warehouses,” Borgman says. “Organizations will face massive data deficit challenges. The fast pace at which things changed this year [2020] made old data irrelevant quickly. This year was a wake-up call for many business leaders and brought attention to the importance of combining all data when developing analytical models to ensure data is relevant, reliable, and up-to-date.”

Many data workloads that were slated for on-prem Hadoop clusters are now running in the cloud as containerized workloads. The container trend will grow in 2021, says Haoyuan “HY” Li, founder and CEO of Alluxio.

“Containerized application deployments and Kubernetes have started to gain traction with enterprises increasingly moving away from traditional Hadoop-based data lakes,” Li says. “While moving away, enterprises are realizing the benefit of abstracting the physical infrastructure while also adopting public clouds for agility. Vendor lock in is a concern but at the same time a uniform toolset across environments is a must to reduce spending on the expertise required to operate across environments, such as hybrid and multi-cloud. Container-based deployments for compute abstraction alongside new abstraction services for storage anywhere, will be the solution of choice for enterprises moving off Hadoop.”

Not everybody is buying into the mantra that data lakes are poised to replace data warehouses. You can count Paige Roberts, the open source relations manager for Micro Focus (owner of the Vertica data warehouse), among them.

“I think the data warehouse vendors have an unbeatable head start because building a solid, dependable analytical database like Vertica can take 10 years or more alone,” Roberts tells us. “The data lake vendors have only been around about 10 years and are scrambling to play catch-up.”

An analytic divide is predicted to emerge between the data-haves and the data have-nots

With so many on-prem and cloud-based data lakes, data warehouses, and databases in the mix, it’s clear that data will be on the move in 2021 as organizations seek to unify data for analytics, says Luke Han, co-founder and CEO of Kyligence.

“CDOs and CAOs will increasingly view their datasets and analytics beyond the boundaries of cloud and data platforms,” Han says. “While the expense of data movement will motivate data teams to leave data where it was born, many will pursue ways to engineer their analytics pipelines to source data from multiple public and private cloud platforms, and across cloud storage, data warehouses, and data lakes.”

Hand-crafted is great for IPAs and fine furniture, but not so much for data dashboards. In 2021, the amount of time that people spend manually building dashboards will decrease as standardized data feeds become more popular, says Peter Bailis, founder and CEO of Sisu Data.

“Manually curated dashboards will finally start to fall as the dominant way most business owners consume data. We’ll see greater adoption of platforms offering more dynamic views of data, including insight news feeds and personalized results,” Bailis predicts. “More data coming from standardized pipelines make common analyses magnitudes cheaper. Data platforms will provide templated, ‘off the shelf’ analyses that leverage these standardized schemas for few- to no-click, out-of-the-box insights.”

Advanced analytics capabilities will be on the upswing, but the benefits of those insights won’t accrue evenly across the board, laments Alan Jacobson, chief data and analytics officer at Alteryx.

“Like the much-publicized ‘digital divide’ we’re also seeing the emergence of an ‘analytic divide,’” Jacobson says. “Many companies were driven to invest in analytics due to the pandemic, while others have been forced to cut anything they didn’t view as critical to keep the lights on…and for these organizations, analytics was on the chopping block. This means that the analytic divide will further widen in 2021, and this trend will continue for many years to come. Without a doubt, winners and losers in every industry will continue to be defined by those that are leveraging analytics and those that are not.”

How structured SQL databases fare against unstructured data lakes will be closely watched in 2021 (Semisatch/Shutterstock)

Scattered data strategies will be reined in this year as companies get more targeted and disciplined in their approach to data, which will lead to more widespread success in analytics, predicts Sandhya Balakrishnan, Brillio’s head of analytics for the U.S. region.

“In our industry, the focus is shifting from ‘store every piece of data which might be useful’ to an emphasis on usability,” Balakrishnan says. “This shift will usher in larger investments in comprehensive data governance practices to incorporate automated data quality management, data catalogue, and lineage adhering to data security and privacy practices, data discovery, and data preparatory tools which will help users fish for data from the data lakes. As a result, 2021 will be the year of data analytics.”

Quit futzing around with your data, because time is running out, warns Raj Verma, CEO of SingleStore (formerly MemSQL).

“Managing your data is no longer a luxury, but a necessity and determines how successful you or your company will be,” Verma says. “If you can remove complexity or cost of managing data, you’ll be very effective. Ultimately, the winner of the space will take the complexity and cost out of data management, and workloads will be unified so you can write one single SQL query to manage and access all workloads across multiple data residencies.”

AI and BI have had an ongoing affair that has lasted years. Will they finally tie the knot in 2021? Ramprakash (Ram) Ramamoorthy, director of AI research at Zoho, is hearing wedding bells.

“With COVID-19 accelerating the need for smarter software technology, businesses, and consumers can expect to see more AI capabilities embedded in smart BI technology,” Ramamoorthy says. “With conversational AI advancing and gaining popularity as a business solution, conversational analytics will enable businesses to simply ask questions regarding sales, revenue reports, important metrics, and so on, with AI ingesting information and generating answers within seconds. AI will also play a role in automated storytelling, allowing users to determine key insights effortlessly. The system will be able to track changes and build a narrative that is easily digestible to users and points to key findings.”

Conversational analytics and natural language processing (NLP) are at an inflection point, thanks to tremendous improvements in accuracy wrought by deep learning. In 2021, organizations will start to put those technological benefits to good use, says Sam Mahalingam, the CTO of Altair.

“Just as we are using Google Home and Alexa in our everyday lives, conversational analytics through NLP will be the golden ticket for enterprises in extracting valuable big data insights from their business operations,” Mahalingam says. “This includes unearthing trends that may have gone unnoticed and allowing experts from within the enterprise to engage with data in a meaningful way.”

The coronavirus pandemic kicked off a struggle for survival that forced organizations to get creative with data. Those creations will start to pay dividends in 2021 as data executives go on offense with their data, says Alation’s co-founder and Chief Data and Analytics Officer (CDAO) Aaron Kalb.

NLP is poised for a break-out year in 2021 (Wright Studio/Shutterstock)

“A ‘proactive analytics phoenix’ will rise from the devastation of the COVID crisis,” Kalb says. “When the pandemic turned the world economy upside down, organizations were forced to invest rapidly in business intelligence and data catalog software just to understand what the heck was going on and make basic business decisions. As we enter a new normal in 2021, they’ll be able to leverage those reactive investments to do proactive business process optimization.”

The COVID work-from-home (WFH) mandate has exposed shortcomings in the user interfaces of business applications, but analytics can help get the most relevant data on the screen with the rise of the analytics-led Employee Experience (EX), opines Jeff Gallino, founder and CTO of CallMiner.

“There will be a big shift in focus to EX technology, especially as WFH environments have made it difficult for brands to manage, engage, equip, and evaluate remote customer service employees,” Gallino says.

The shift to self-service analytics will continue in 2021 as IT budgets get stretched. The good news is this will become easier with better tools and tech, says Krishna Tammana, CTO of Talend.

“As the pandemic continues in 2021, companies will look to further reduce dependencies on IT functions with self-serve analytics,” Tammana says. “This will help them turn data into valuable, shareable assets more quickly. Remote workforces and online expansions are draining IT resources. Automated data preparation, curation, stewardship, quality controls, and machine learning tools will help to stem the tide of IT demands.”

2020 was an emotional year all around (but don’t let’s start). The questions, then, become: How do emotions play with analytics, and what does this have to do with 2021? The answer could be a renewed focus on emotional analytics, per Paul Moxon, SVP of data architecture at Denodo.

“Emotion is a key factor affecting customer behavior and has a strong influence on brand loyalty,” Moxon says. “Therefore, it is increasingly useful for companies to find a way to measure emotions of customers during their decision-making processes. Emotional analytics focuses on studying and recognizing the full gamut of human emotions that includes mood, attitude and personality. It employs predictive models and AI/ML to analyze human movements, word choices, voice tones, and facial expressions.”
