Inside IBM’s Rejiggered Watson Lineup
Think you know Watson? Think again, as IBM has made extensive changes to the lineup of AI and cognitive products and services that bears the name of its founder. Sumit Gupta, the VP of IBM’s AI, machine learning, and HPC efforts, walks Datanami through the Watson re-positioning.
People may remember Watson as the Hadoop-based question-and-answer supercomputer that IBM built to beat Ken Jennings at Jeopardy! back in 2011. Or they may recall a cognitive system bearing the Watson name installed at the Sloan Kettering Cancer Center in 2013. Memories of IBM investing $1 billion in its new Watson Group in 2014 may crop up, or they might recall the Watson-branded APIs IBM launched following the acquisition of AlchemyAPI in 2015.
All those products (among others) still exist in the Watson brand. But Watson today is much, much more, Gupta explained during a meeting at the recent Nvidia GPU Technology Conference in San Jose, California. “Watson is the brand we use for all our AI software,” he says.
To better understand Watson, it’s useful to break Watson down into three main groups of products and services, Gupta says.
- On the bottom are do-it-yourself Watson tools that data scientists can use to build custom AI, machine learning, and deep learning applications;
- In the middle are Watson APIs that customers can use to call a variety of pre-built AI and cognitive services;
- At the top are pre-built Watson solutions that are geared to specific industries.
IBM’s repositioning of Watson came to light during IBM’s recent Think conference, where the company announced that its Watson software would be available on any public cloud (not just IBM’s), as well as on premises.
“That was the most important announcement we made” at IBM Think, Gupta says of Watson’s multi- and hybrid-cloud capabilities. “The other thing is we announced a unification of all our data scientist AI software.”
Watson’s DIY Data Science Tools
The Watson lineup includes several DIY data science tools, including Watson Studio, Watson Machine Learning, Watson Machine Learning Community Edition (WMLCE), Watson Machine Learning Accelerator (WMLA), and Watson Open Scale.
Watson Studio is the notebook-style development environment where data scientists write code for machine learning, deep learning, and AI applications. “Watson Studio is the artist formerly known as DSX,” Gupta says, referring to the Apache Spark- and Jupyter-based Data Science Experience tool that IBM launched in 2016.
Watson Studio is available with cloud, desktop, and local deployment options. It’s also integrated with a range of machine learning deployment tools and Watson API services (discussed below). It also hooks into a data catalog known as Watson Knowledge Catalog, which helps users organize, prep, and govern their data.
Watson Machine Learning, meanwhile, is where data scientists manage the deployment, monitoring, and retraining of models created in Watson Studio. The software was formally launched in 2017 after a 12-month beta.
IBM offers two versions of Watson Machine Learning. The first, WMLCE, is a free entry-level version of the product. The software runs on Power and x86, and includes popular open source machine learning frameworks, including TensorFlow and Caffe. It also includes SnapML, the open source machine learning framework based on scikit-learn that came out of IBM’s Zurich lab in the past few years.
Watson Machine Learning Accelerator, meanwhile, is an enterprise-level tool that IBM originally launched in 2017 as PowerAI Enterprise. WMLA is designed to handle large-scale deep learning and machine learning workloads that span large clusters of machines, which is something that WMLCE isn’t designed to do.
WMLA includes an advanced scheduler that’s based on an HPC-related acquisition that IBM made several years ago. The company has integrated Platform Computing’s Load Sharing Facility (LSF) technology into WMLA to enable customers to more efficiently divvy up available processing resources to keep teams of data scientists busy, instead of forcing them to wait around for resources to train their models, Gupta explains.
“What we’re able to do [with] Watson Machine Learning Accelerator is we can do elastic distributed training,” he said. “Let’s say you launch a job with 16 GPUs. I launch a job two hours later, and the scheduler will release eight GPUs. So it’ll take your job, shrink it from 16 to eight GPUs, very elegantly, so you don’t lose anything. It actually finishes the last iterations, but it does it within a few seconds, and it gives me eight of them.”
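The shrink-to-fit behavior Gupta describes can be illustrated with a toy scheduler. This is only a sketch of the general idea, not WMLA or LSF code: the `Job` class, the `submit` function, and the "shrink the largest job first" policy are all hypothetical names and assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Job:
    name: str
    gpus: int  # GPUs currently assigned to this job


def submit(running: list[Job], new_name: str, requested: int, total_gpus: int) -> Job:
    """Admit a new job, shrinking running jobs when the cluster is full.

    Assumed toy policy: reclaim GPUs from the largest running job first,
    and never shrink a job below one GPU. A real elastic scheduler would
    also wait for the current training iteration to finish before resizing.
    """
    free = total_gpus - sum(job.gpus for job in running)
    needed = max(0, requested - free)

    # Reclaim GPUs from running jobs until the request is covered.
    for job in sorted(running, key=lambda j: j.gpus, reverse=True):
        if needed == 0:
            break
        give = min(needed, job.gpus - 1)
        job.gpus -= give
        needed -= give

    # The new job gets whatever could be freed, up to its request.
    new_job = Job(new_name, requested - needed)
    running.append(new_job)
    return new_job


# Gupta's scenario: one job holds all 16 GPUs, then a second job asks for 8.
running = [Job("yours", 16)]
mine = submit(running, "mine", requested=8, total_gpus=16)
print([(job.name, job.gpus) for job in running])
```

In this scenario the first job shrinks from 16 to eight GPUs and the second job receives the other eight, which mirrors the example in the quote.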
Gupta says LSF, combined with the distributed storage provided by Elastic Storage Server (ESS) and Spectrum Scale, forms the core of IBM’s big data analytics strategy in a post-Hadoop world.
“This is our post Hadoop stack,” he said. “We essentially are offering a replacement for YARN [with LSF]. The problem with YARN is, it was really built for Hadoop jobs, which are batch and sort of slow. Hadoop by default is a slow batch-processing system. Analytics, AI, and deep learning — everything about it is high performance. In fact, it looks like HPC clusters, which is why LSF [is] the same engine that we use in Watson ML Accelerator to go after analytics and AI.”
Watson Open Scale, meanwhile, helps data scientists track the quality of the model, including whether its accuracy is high enough or whether bias has been introduced into the model. Everything in Open Scale is based on open source, Gupta says. The company has also open sourced the kernel of the product and is distributing it as AI Fairness 360.
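To give a sense of what a bias check on a model can look like, here is a minimal sketch of one widely used fairness metric, disparate impact, written in plain Python. It does not use the Open Scale or AI Fairness 360 APIs; the function name, the sample predictions, and the group labels are all made up for illustration.

```python
def disparate_impact(predictions, groups, privileged):
    """Ratio of favorable-outcome rates: unprivileged group / privileged group.

    predictions: 1 for a favorable model outcome, 0 otherwise.
    groups: group label for each prediction.
    privileged: the label treated as the privileged group.
    A value near 1.0 means both groups receive favorable outcomes at similar rates.
    """
    priv = [p for p, g in zip(predictions, groups) if g == privileged]
    unpriv = [p for p, g in zip(predictions, groups) if g != privileged]
    if not priv or not unpriv:
        raise ValueError("both groups need at least one prediction")

    priv_rate = sum(priv) / len(priv)
    unpriv_rate = sum(unpriv) / len(unpriv)
    if priv_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return unpriv_rate / priv_rate


# Hypothetical data: group "a" is approved 3 times out of 4, group "b" once out of 4.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
ratio = disparate_impact(predictions, groups, privileged="a")
print(f"disparate impact: {ratio:.2f}")
```

A common rule of thumb flags ratios below 0.8 for review, though the appropriate threshold depends on the application.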
In fact, openness is a key attribute of the entire Watson tools stack, he says. “This is all open infrastructure,” he says. “Let’s say you’re using Anaconda Enterprise. Anaconda works just fine with our software. We can take models that they built and import them into Watson Machine Learning and manage them. We can use Watson Machine Learning Accelerator to manage Anaconda jobs. If you’re using SAS, if you’re using any other analytics framework, we work with them.”
Watson Apps, Solutions, and APIs
As a brand, Watson is constantly evolving. In addition to the five DIY Watson tools discussed above, the brand is composed of the following 13 related products:
- Watson Assistant – Build an AI assistant for a variety of channels, including mobile devices, messaging platforms, and even robots;
- Watson Discovery – A product for extracting knowledge from data using natural language understanding (NLU) techniques;
- Watson Discovery News – A product built on Watson Discovery and D3.js that features a pre-enriched collection of news content from the Internet;
- Watson Natural Language Understanding – A product designed for advanced text analytics;
- Watson Knowledge Studio – A customizable cognitive solution designed to learn the linguistic nuances of particular companies and industrial domains;
- Watson Visual Recognition – A product for tagging and classifying visual content using machine learning;
- Watson Speech to Text – A product for converting audio and voice to written text;
- Watson Text to Speech – A product for converting written text into natural audio;
- Watson Language Translator – A deep learning-based product that translates written text to and from 23 languages;
- Watson Natural Language Classifier – A machine learning-based text classification and labeling product that supports nine languages;
- Watson Personality Insights – A product that uses linguistic techniques to detect personality characteristics in written text;
- Watson Tone Analyzer – A product that uses linguistic techniques to detect tone in written text;
- Watson Compare & Comply – A product that uses machine learning to make sense of complex text for the purpose of compliance.
But wait, there’s more Watson! In addition to those Watson applications, IBM offers the following eight Watson solutions:
- Watson Advertising – A new product based on The Weather Company’s advertising sales division;
- Watson Marketing – A pre-built solution used to analyze customer experiences across channels, build B2B and mobile marketing campaigns, and heighten personalization;
- Watson Education – A collection of applications designed to deliver personalized curriculum, AI-based tutoring, and vocabulary aids to students;
- Watson Financial Services – An AI-based application developed with partner Promontory for helping financial services firms comply with regulations;
- Watson Health – A multi-faceted solution designed to apply AI to various aspects of a healthcare delivery system, including clinicians, administrators, executives, IT, research, government and compliance, and strategy;
- Watson IoT – An AI solution designed to help companies manage and optimize real estate, facilities, and assets in the field;
- Watson Media – An application designed to infuse AI in a media workflow or video library;
- Watson Talent – An application that brings AI to human resources and human capital management challenges.
As if that wasn’t enough Watson for you, IBM also offers these 92 Watson APIs, conveniently housed in a public GitHub location. Clearly, Watson is much more than a single entity or thing these days. For data scientists or anybody building a cognitive application, it’s good to know that IBM is making some of its tools and services available for free downloads.