March 22, 2021

A ‘Glut’ of Innovation Spotted in Data Science and ML Platforms


These are heady days in data science and machine learning (DSML), according to Gartner, which identified a “glut” of innovation occurring in the market for DSML platforms. From established companies chasing AutoML or model governance to startups focusing on MLOps or explainable AI, a plethora of vendors are simultaneously moving in all directions with their products as they seek to differentiate themselves amid a very diverse audience.

“The DSML market is simultaneously more vibrant and messier than ever,” a gaggle of Gartner analysts led by Peter Krensky wrote in the Magic Quadrant for DSML Platforms, which was published earlier this month. “The definitions and parameters of data science and data scientists continue to evolve, and the market is dramatically different from how it was in 2014, when we published the first Magic Quadrant on it.”

The 2021 Magic Quadrant for DSML is weighted heavily toward the right side of the chart, which, as anybody familiar with Gartner’s quadrant-based assessment method knows, represents “completeness of vision.” No fewer than 13 of the 20 vendors that made the quadrant’s cut landed on the right side, which indicates active innovation.

Generating new DSML features and exploring new DSML methods is the name of the game in this fast-moving business, Gartner says. “There remains a glut of compelling innovations and visionary roadmaps,” the analysts wrote. “…[V]endors are heavily focused on innovation and differentiation, rather than pure execution. Innovation remains key to survival and relevance.”

The Connecticut-based analyst firm did not sound surprised to conclude that the cloud biggies have moved strongly into the space. “The long-expected gigantic presence in this market of Google and Amazon is now easily felt as they compete with Microsoft for supremacy in terms of DSML capabilities in the cloud,” the analysts wrote.

2021 Gartner Magic Quadrant for Data Science and Machine Learning platforms (Source: Gartner)

However, that does not mean that they are sucking all the air out of the room, as smaller companies have found success in the market, with a few achieving what Gartner termed “hypergrowth.” A few well-established leaders from the previous generation of statistical tools, like SAS, MathWorks, and IBM (SPSS), are also doing well, Gartner notes. In fact, those three vendors are collectively doing better than AWS, Google, and Microsoft when it comes to the ability to execute.

The DSML market is young and vibrant, and there are ample revenue and funding opportunities for companies that differentiate themselves on the product side, Gartner says. There is just a “moderate” level of M&A activity at this time, which indicates a growing market. With that said, the vendors who made Gartner’s cut had to prove themselves by meeting certain customer-count and financial performance criteria. And of course, they had to have a product that meets the definition of a DSML platform.

Which raises the question: Just what is a DSML platform? Gartner defines it as a place “to source data, build models and operationalize machine learning,” either by certified, card-carrying data scientists or people who are doing data science work, i.e., citizen data scientists, data engineers, or ML specialists.

Beyond that broad definition, Gartner identified 13 other capabilities that may (or may not) exist in a given DSML platform, including: data ingestion; data preparation; data exploration; feature engineering; model creation and training; model testing; deployment; monitoring; maintenance; data and model governance; explainable artificial intelligence (XAI); business value tracking; and collaboration.

Here’s a brief description of the pros and cons provided for each of the vendors listed in the Magic Quadrant, courtesy of Gartner:

Leaders Quadrant


Databricks Unified Data Platform

Pros: Scalable multi-cloud support; empowerment of data scientists; execution and expansion.

Cons: Lack of support for citizen data scientists; need for governance and responsible AI; growing cloud competition.

Dataiku Data Science Studio

Pros: Support for citizen data scientists; focus on business value; market traction.

Cons: Heavy use of extensions and plugins; emerging story around “XOps” (i.e. unified management of data, ML, models, and platforms); pricing for smaller teams.

IBM Watson Studio on IBM Cloud Pak for Data

Pros: Support for multiple personas; composite AI vision; responsible AI and governance.

Cons: Scope of auto AI features; doubts about Watson brand; lack of clarity in product bundling.

MathWorks MATLAB

Pros: Robust composite AI capabilities; integrated domain knowledge; verifiable and reliable ML.

Cons: Interface lacks usability among non-engineers and non-scientists; interpretability of ML models; lack of augmented DSML capabilities.

SAS Viya

Pros: Market understanding and presence; cloud-native architecture and open source integration; automated feature engineering and modeling.

Cons: Perceived high cost; product bundling; marketing strategy.

TIBCO Software (various products)

Pros: Leading edge DSML capabilities; integration of DS and BI/analytics; support for collaboration and applied analytics.

Cons: Limited ModelOps capabilities; lack of support for citizen data science capabilities; financial growth in 2020.

Visionaries Quadrant

AWS (various products)

Pros: Breadth and depth of cloud platform; performance and scalability; data labeling and human-in-the-loop capabilities.

Cons: Lack of attention on citizen data scientists; rapid rollout of products and maturity; maturity of on-prem, hybrid, and multi-cloud support.

DataRobot Enterprise AI Platform

Pros: Sales strategy and execution; high-touch customer service; successful acquisitions.

Cons: Complexity of product portfolio; resource-heavy onboarding; capability gaps.

Google Cloud AI Platform

Pros: Responsible AI vision and capabilities; research contributions; cohesion and simplification of consolidated products.

Cons: Rapid pace of change; steep learning curve; lack of capabilities for on-prem, hybrid, and multi-cloud deployments.

KNIME Analytics Platform

Pros: Breadth and depth of DSML capabilities; commitment to open source; visual workflow coherence.

Cons: Limitations in enterprise deployments; responsible AI vision; low market traction.

Microsoft Azure Machine Learning

Pros: Strong support for enterprise DS; support for multiple personas; openness and partnerships.

Cons: Requirement of use of other Azure services; immaturity of on-prem, hybrid, and multi-cloud capabilities; lack of support for augmented DSML capabilities.

RapidMiner (various products)

Pros: Support for multiple personas; “clear vision and delivery of aligned features”; expandability and governance.

Cons: Growth rate; average advanced analytics capabilities; academic perception of product.

H2O.ai (various products)

Pros: Vision for value creation; extensive automation; rich AI explainability features (XAI).

Cons: Lack of some data access and data prep features; OEM partner strategy; collaboration and cohesion.

Challengers Quadrant


Alteryx Analytics Process Automation

Pros: Support for multiple personas; product packaging and go-to-market strategy; customer support.

Cons: Changing product portfolio; high cost; lack of innovation.

Niche Players Quadrant

Alibaba Cloud’s Platform for AI (PAI) Studio and Data Science Workshop

Pros: Strong community in China; advanced use-case modeling; seamless integration.

Cons: Focus on Asia; lack of product vision; narrow usage and focus on professional data scientists.

Altair Knowledge Studio and Knowledge Works

Pros: Ease of use; support for data pipelines; customer satisfaction.

Cons: Functional gaps in lineup; limited rollouts in some industries; relatively slow growth.

Anaconda Enterprise

Pros: Trusted and flexible platform; based on open source; culture of collaboration.

Cons: Focus on technical audience; lack of model operationalization functions; runtime stability.

Cloudera Data Platform

Pros: Native Spark on Kubernetes; support for complex data workloads; metadata support for DataOps and MLOps.

Cons: No GUI for development; lack of coherence of products; domain-specific solutions.

Domino Data Lab Data Science Platform

Pros: Support for large, expert teams; mature MLOps capabilities; support for on-prem, hybrid, and multi-cloud.

Cons: Support for small, immature DS teams; low market visibility; open source vision.

Samsung SDS Brightics AI

Pros: Comprehensive ecosystem vision; data access, prep, and visualization; ease of use and collaboration.

Cons: Limited adoption outside of Asia; gaps in product vision; limited capabilities in ModelOps and explainability.

This is indeed a great time to be in the data science and machine learning business. Whether you’re a user of these tools or helping to develop them, the rapid pace of innovation is not only exciting but good for business as a whole.


 
