Data Science and ML Platform Market Heats Up
If you’re in the market for data science and machine learning tools, we have great news: The market is absolutely booming in 2020. With a ton of healthy competition, vendors are investing heavily to differentiate their products and drive innovation. The vibrant market is also diversifying, with separate tracks evolving for users with different skill levels and goals.
Gartner isn’t typically one to get overly exuberant about things. That’s just not the way in Stamford, Connecticut. But analysts with the storied firm opened up a bit in a recent report and stated that the market for data science and machine learning platforms is “beyond healthy” and “thrillingly innovative.”
“The broad mix of vendors offer a granular range of capabilities, with solutions appropriate for most levels of maturity,” the company wrote in its Magic Quadrant for Data Science and Machine Learning Platforms. “The definitions and parameters of data science and data scientists continue to evolve, and the space is dramatically different from this Magic Quadrant’s inception in 2014.”
Gartner placed six vendors in the Leaders Quadrant for the February 2020 report, up from four vendors the last time Gartner put the DS and ML report together, in January 2019. SAS and TIBCO Software (which bought Statistica) reprise their earlier appearances in the Leaders Quadrant. Dataiku, Alteryx, Databricks, and MathWorks, meanwhile, moved into the Leaders Quadrant from other quadrants.
KNIME and RapidMiner, which had previously been in Gartner’s Leaders Quadrant, found themselves in the Visionaries Quadrant, alongside vendors like Google, DataRobot, Domino Data Lab, and H2O.ai. IBM is the lone entry in the Challengers Quadrant, while Anaconda and Altair, which acquired Datawatch, occupy the Niche Players Quadrant.
Gartner heralded Alteryx for changing its market perception from that of a pure-play data preparation provider into a full-blown data science and ML platform. It credits two acquisitions – ClearStory Data and Feature Labs – with helping to drive some of this transformation. “Alteryx’s no-code approach is attractive to organizations that want to use ML but need an easy-to-use platform built for business analysts and citizen data scientists,” the Gartner analysts write.
Databricks was heralded for its strong growth and execution, as well as a healthy product roadmap, according to Gartner’s report. The scalability of Databricks’ Apache Spark-based cloud platform was also a plus, as was the availability of customer success engineers. But if you’re looking for a simple and easy-to-use development interface, Databricks probably isn’t the best solution, as the company has a “heavy slant towards technical audience,” including data scientists and machine learning engineers.
Dataiku was promoted to the Leaders Quadrant largely thanks to the work it’s done to enable collaboration among different users, including data scientists and data engineers, but also citizen data scientists. Its Data Science Studio software offers different UIs for folks with different skill levels, and even experts can be productive with its notebook-style interface. Governance and compliance with data regulations were also highlights.
MathWorks, which develops MATLAB, was promoted to the Leaders Quadrant largely through the work the vendor has done to bolster support for the full range of data science activities, from data pre-processing and model development all the way to production. Gartner was particularly impressed with the integrated simulation capability, thanks to hooks into its Simulink platform, while pre-built solutions for predictive maintenance were another highlight.
SAS is a perennial contender for any endeavor into advanced analytics and data science, and Gartner is impressed with the Cary, North Carolina, firm’s work in this space. In particular, Gartner singled out the model operationalization and management capabilities of SAS’s Visual Data Mining and Machine Learning, which includes features like performance monitoring, automated retraining, version control, and lineage.
TIBCO Software clawed its way back into the Leaders Quadrant for DS and ML platforms by successfully integrating all the acquisitions it’s made over the years, including Spotfire, Jaspersoft, Insightful, Statistica, Alpine Data, StreamBase Systems, and Orchestra Networks. “TIBCO is simplifying and streamlining TIBCO Data Science while keeping the platform open and supportive of the fast moving ML landscape,” Gartner says.
Domino Data Lab was heralded for its open architecture, which allows customers to consolidate data science assets and workloads, while helping to automate the ML workflow. Gartner indicated customers are happy with Domino’s capability to support collaboration between data science experts and business users.
DataRobot maintained its spot by successfully integrating a host of acquisitions (ParallelM, Paxata, Nutonian, and Nexosis) over the past three years, and appealing to a broad mix of users, from data scientists and statisticians to business analysts and developers, Gartner says.
Google earned props from Gartner for having one of the “largest ML stacks” in Cloud AI Platform, which is “an excellent choice for top-notch data science talent.” But it was dinged for not being a full end-to-end pipeline, since users must rely on other Google products, such as Cloud Data Fusion, Data Studio, and BigQuery. Usability by citizen data scientists was also questioned.
KNIME offers a popular open source product (KNIME Analytics Platform) that can be augmented through a commercial product (KNIME Server) to provide enterprise data science features, such as collaboration, automation, and operationalization. Revenues may not be growing fast, but Gartner likes KNIME for its “strong focus on innovation” and deep connections to the data science community.
H2O.ai received kudos from Gartner for having both open source (H2O-3 and Sparkling Water) and commercial products (Driverless AI), as well as a vision of important trends like augmented data science and explainability. The high performance of its open source software was also highlighted.
Microsoft is an up-and-coming vendor in this space through its Azure Machine Learning offering. Gartner likes how Azure ML gives citizen data scientists a drag-and-drop interface for augmented analytics, but without giving up more advanced capabilities for data science experts.
RapidMiner brings together data science experts and citizen data scientists with its offerings, which include a core open source product and a commercial offering that brings enterprise features. Gartner also likes the new compliance and auditing of ML models through its Automated Model Ops offering.
The Rest of the Class
IBM is moving forward with its main product, Watson Studio, which Gartner appreciated for its “strong fundamentals” around data management and information architecture. The product can be used by multiple groups of users, and support for multi-cloud and hybrid deployments is also a plus for IBM, according to Gartner. Like many other vendors, IBM offers AutoAI capability, but Gartner says more innovation is needed as AutoML is “set to become table stakes.”
Altair is in a period of transition at the moment, having acquired Datawatch just over a year ago, which in turn had just acquired Angoss. Its flagship offering, Knowledge Studio, has a strong UI and is well-liked by data scientists who are more visually oriented. Altair features a strong user base in the financial services sector, where its decision trees are popular, Gartner says.
Anaconda was heralded for offering a loosely coupled distribution where data scientists can tap into a stream of open source innovation in the Python and R ecosystems. Scalability and support for GPUs have bolstered the product, which is particularly strong among experienced data scientists.