January 3, 2022

Data Science and AI Predictions for 2022


The pace of technological change increased in 2021, and if history is any guide, will continue to accelerate in 2022. At the leading edge of high tech are data science and artificial intelligence, two disciplines that promise to keep the pace of change at a high level.

Interest in AI, machine learning, and data science is extremely high, if the number of predictions on these topics is any indication. We start this batch of predictions with DataKitchen CEO Chris Bergh, who notes that the global AI market is projected to grow at a compound annual growth rate (CAGR) of 33% through 2027. But that significant growth comes with a hidden risk: reputational harm due to bias and a lack of accountability in AI processes.

“The problem is that algorithms can absorb and perpetuate racial, gender, ethnic, and other social inequalities and deploy them at scale,” Bergh says. “Many in the data industry recognize the serious impact of AI bias and seek to take active steps to mitigate it. The data industry realizes that AI bias is simply a quality problem, and AI systems should be subject to this same level of process control as an automobile rolling off an assembly line. In 2022, data organizations will institute robust automated processes around their AI systems to make them more accountable to stakeholders.”

Python had a rip-roaring 2021, when it overtook Java to become the most popular language in the world, according to the TIOBE Index. What will the versatile scripting language bring in 2022? According to the folks at Anaconda, Python will continue slithering its way into our lives.

“Python will also continue expanding to newer use cases beyond data science in 2022,” the company writes in its 2022 predictions blog piece. “Stan [Seibert, senior director of community innovation] at Anaconda believes that, for use cases like microcontrollers and IoT devices, where other programming languages have typically dominated, we’ll see growth in the adoption of Python due to the rise of MicroPython and CircuitPython. Taking the question in a different direction, Joseph J. Currenti, Sr. Technical Account Manager at Anaconda, and Lucia said they expect Python to be used more in game development as developers look to AI to create more immersive gaming experiences.”

AI requires a combination of data and compute (normally in the cloud) to be successful. In 2022, we’ll see AI converge with data and the cloud, requiring a more cohesive management approach, says Anand Rao, global AI lead for PwC.

“By itself, AI can’t do much to solve important problems. It needs data and scalable computing power,” Rao says. “That’s why leading companies are increasingly administering data, AI and cloud (DAC) as a unified whole. We’ll see an influx of companies in 2022 take a lifecycle approach to managing these three interconnected operations and when developing AI governance. Businesses are continually looking at strategy, fine-tuning execution and enhancing operations. When data, AI and cloud work together smoothly, end-to-end, the result is a supple and powerful system that realizes more value from data and solves more problems faster.”

Analytics grew quickly in 2021, but organizations were still faced with a gap between what they have been able to accomplish within their data center, and the types of Web-scale deployments they would like to run in the cloud, according to Jeff Whitaker, the vice president of products at Excelero.

“In 2022, new performance infrastructure in the cloud for compute, networking, and storage is being built out, and we will see the convergence of analytics environments,” Whitaker says. “As a result, many companies will migrate their core business applications and database environments to the cloud, uniting their data in a central resource. From BI, database analytics and into the AI/ML environments, it’s now entirely possible for near-real time analysis of data to be done in the cloud, using cloud engines together with the Web-scale data platforms.”

Having good AI models is great. But it’s tough to beat better data, according to Omer Har, the co-founder and CTO of Explorium.

In lieu of better AI models, try adding more data (andromina/Shutterstock)

“Expert opinion is coalescing around the idea–championed by AI pioneer Andrew Ng–that the best way to improve AI performance is with better data, not better algorithms,” Har says. “That’s not to say algorithms aren’t important, but we’ve reached a point of diminishing returns. Research suggests organizations can improve AI performance much more, and much faster, by training existing algorithms on wider data that’s carefully curated. In 2022, we’ll see access to external data emerge as a strong competitive advantage. Where before businesses might have raced to be first with AI, now they’ll aim to outperform competitors by training their AI on the most up-to-date, relevant data.”

While we’ve succeeded in bringing AI to bear on information-rich fields, such as NLP and image analysis, Andrew Kasarskis, the chief data officer at Sema4, sees a significant hurdle in our ability to scale these AI deployments as a result of one fundamental unmet requirement: the efficient allocation of data curation resources.

“This is a need for technological and process innovation that I’d love to exist but don’t yet see happening,” Kasarskis says. “When obtaining those large corpuses of well-labeled data to train the AI, some human manual and semi-manual work is inevitably needed. This work is always expensive, never scales well, and frequently takes experts with esoteric knowledge away from important value-generating activities. Figuring out the most efficient way to allocate manual curation work seems, to me, like a significant unmet need that impedes progress in the use of data technology, particularly in biomedicine.”

Synthetic data has emerged to turbocharge some AI applications, and that trend will reach new heights in 2022, says Rev Lebaredian, the vice president of simulation technology in the Omniverse Engineering department at Nvidia.

Use of synthetic data for AI training is projected to grow in 2022 (image courtesy

“The rate of innovation in AI has been accelerating for the better part of a decade, but AI cannot advance without large amounts of high-quality and diverse data,” Lebaredian says. “Today, data captured from the real world and labeled by humans is insufficient both in terms of quality and diversity to jump to the next level of artificial intelligence. In 2022, we will see an explosion in synthetic data generated from virtual worlds by physically accurate world simulators to train advanced neural networks.”

Rob Gibbon, a product manager at Ubuntu publisher Canonical, sees wider and deeper adoption of ML and AI in ’22.

“Artificial intelligence has finally come of age, and that’s down in no small part to collaborative open source initiatives like the TensorFlow, Keras, PyTorch, and MXNet deep learning projects. Continuing into 2022, we will see ever broader adoption of machine learning and artificial intelligence in the widest variety of applications imaginable – from the most trivial and mundane to those that are truly transformative.”

Thinking about adopting robotic process automation in 2022? A better idea may be outfitting certain processes with AI and human-in-the-loop technologies and techniques, says Varun Ganapathi, a co-founder and CTO at AKASA.

“Digital transformation efforts in a number of industries have driven massive adoption of robotic process automation (RPA) during the past decade. The hard truth is that RPA is a decades-old technology that is brittle and has real limits to its capabilities–leaving a trail of broken bots which can be expensive and time-consuming to fix,” Ganapathi says. “Emerging machine-learning-based technology platforms combined with human-in-the-loop approaches to automation are already redefining what it is possible to automate across a number of industries where complexity, exceptions, and outliers train the AI to work smarter, making automation stronger.”

Will RPA give way to human-in-the-loop plus AI in 2022? (Thomas Soellner/Shutterstock)

We’ve been blessed with a plethora of advanced analytic and predictive tools and technologies. In 2022, we’ll reach a new pinnacle in our ability to piece them together in a hybrid manner to accomplish worthwhile goals, predicts Marco Varone, founder and CTO of

“Hybrid AI is a key trend and strategic direction we’ll see more of in 2022,” Varone says. “Recently we have been seeing an important advancement in natural language understanding (NLU) based on the combination of different techniques, symbolic AI and machine learning, to improve overall results and better tackle even more complex enterprises’ cognitive problems. This is the future of AI because you can harness the best techniques available to solve a problem.”

The spectrum of data science practitioners who have the experience and skills necessary to develop AI applications has been quite narrow. In 2022, amid a hiring crunch, we’ll see that aperture of personas widen, according to Alicia Frame, the director of product management for data science at Neo4j.

“In 2022, companies will need to embrace the role of the ‘citizen data scientist,’ which are employees who work with predictive/prescriptive analytics models but whose primary job function lies outside the field of data science and analytics,” Frame says. “The data science field is one of the fastest growing, and with the workforce currently experiencing ‘The Great Resignation,’ companies will need to make data science more accessible in order to help fill gaps on their teams.”

Legions of citizen data scientists will be called upon to help move their companies’ AI goals forward in 2022 (maximmmmum/Shutterstock)

Many companies aspire to become completely data-driven. But putting too many eggs in the AI basket would be a mistake, according to Domino Data Lab CEO Nick Elprin, who foresees another major public failure of an algorithmic business.

“While there’s no public post-mortem of Zillow’s dramatic exit from the iBuying market, Zillow is a warning sign about the risks of algorithmic business,” Elprin says. “Model-driven businesses are immensely powerful but also hard to get right. As more companies develop model-driven strategies, we will see more of them stumble — either because they didn’t properly manage probabilistic risks, didn’t properly integrate data science with business processes and domain knowledge, or relied on too much ‘magic’ without understanding fundamentals and statistics.”

Having an analytical insight is one thing, but actually doing something about it is something else entirely. In 2022, we’ll see more companies moving beyond just analytics into actions and decisions, says Michael O’Connell, chief analytics officer at TIBCO.

“Today’s fast-changing business climate demands real-time visibility and up-to-the-minute recommendations from data and analytics,” O’Connell says. “To survive the post-pandemic world, organizations need to be able to predict what’s going to happen next based on the data they have; and develop more discipline around decisions and actions. Processes for measuring impact and closing the decision intelligence loop will sharpen their focus.”

AI adopters have skewed towards larger companies in the past. But in 2022, we’ll start to see SMBs get in on the action, says Bob Lamendola, senior vice president of technology and head of the Digital Services Center at Ricoh USA.

“Today, AI is being deployed and explored in many pilot programs to better understand the technology, uncover challenges and validate outcomes,” Lamendola says. “And while the adoption of AI capabilities applies more toward larger enterprises now, we anticipate a significant shift in focus and adoption within mid-market and some smaller organizations next year. The opportunities to re-imagine IT Operations through an insight and analytics driven engine with the ability to self-heal and auto remediate is too compelling to be overlooked.”
