AWS Takes the ‘Muck’ Out of ML with SageMaker
Amazon Web Services aims to take the “muck” out of machine learning with SageMaker, a new end-to-end machine learning and deep learning stack unveiled today at the AWS re:Invent conference.
Machine learning has progressed tremendously over the years, AWS CEO Andy Jassy said during this morning’s keynote address from Las Vegas, Nevada. But it’s still mainly the domain of highly trained data scientists working at large technology companies.
The number and complexity of steps involved — from preparing the data to choosing the algorithm to training the model to tuning the hyperparameters to putting it into production at scale and then retraining the model — is daunting, Jassy said. “When you’re trying to build a machine learning model, it’s not easy,” he said. “There are blockers every step of the way.”
First, you have to figure out how to aggregate all your data, Jassy said. “You’ve got to have some way to visualize and explore your data so you know which algorithms might be worth training. You have to figure out how to pre-process the data so the algorithm can do the work.”
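The pre-processing Jassy alludes to can be as simple as putting all features on a comparable scale before training. The sketch below is purely illustrative — a plain-Python standardization pass, not SageMaker code — to show the kind of work he means:

```python
# Minimal sketch of the kind of pre-processing Jassy describes:
# standardizing each column to zero mean and unit variance so an
# algorithm treats all features on a comparable scale.
# (Illustrative only -- not SageMaker code.)

def standardize(rows):
    """Scale each column of `rows` to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0  # guard against zero variance
        for c, m in zip(cols, means)
    ]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Toy data: two features on very different scales (e.g. height, weight).
scaled = standardize([[180.0, 75.0], [160.0, 55.0], [170.0, 65.0]])
```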
Then you have to choose which algorithm and framework to use. “And there are lots of choices,” he said. “The reality, in the real world, is the first algorithm you choose often is not the one that works. So you have to try a lot of these, and that’s a lot of work.”
Then when you choose an algorithm, you have to train the model on your data. “And even relatively small models consume a lot of compute,” Jassy continued. “And you have to figure out how to reliably deploy it. It’s hard. Most companies who do machine learning today have separate teams just managing the training environment.”
Once you’ve selected the right algorithm and trained it on some data, then you have to tune it. “It turns out doing that tuning is quite difficult,” Jassy said. “There are thousands — and in some cases in the largest models, millions — of parameters you have to tune. And you usually start off with something random, and then there’s a lot of trial and error and a lot of hit or miss and it takes a lot of time.”
When you have a tuned, usable model, now you have to figure out how to deploy that model, “which is a different set of computer science skills,” Jassy said. “Then you have to figure out how to manage that at scale and run the infrastructure.
“That’s a lot of challenges just to build machine learning,” he continued. “Everyday developers just throw up their hands in frustration. It’s just too much work. I want to do it, but I don’t have enough time to do all those things and all that heavy lifting that’s required.”
Enter SageMaker
AWS’s answer to this dilemma is SageMaker, which Jassy described as “an easy way to build, train and deploy models for everyday developers.”
The SageMaker service includes several components. The key ones include:
- a Jupyter-based data science notebook for creating machine learning workflows;
- a selection of 10 pre-selected and pre-configured machine learning algorithms, such as K-means clustering and factorization machines;
- native and pre-configured integration with TensorFlow and MXNet;
- automated, one-click deployment to EC2 resources, including fast GPU instances;
- automated tuning through the Hyperparameter Optimization (HPO) service;
- A/B testing capabilities;
- health check monitoring, security, and routine maintenance.
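To make the pre-built algorithms concrete, here is a toy pure-Python sketch of K-means clustering — one of the ten algorithms on the list. This is Lloyd’s classic algorithm, purely for illustration; SageMaker’s optimized, distributed implementation is not shown here.

```python
# Hypothetical pure-Python sketch of K-means clustering (Lloyd's
# algorithm), one of the ten pre-built SageMaker algorithms. For
# illustration only -- not SageMaker's optimized implementation.
import random

def kmeans(points, k, iters=20, seed=0):
    """Cluster `points` (lists of floats) into `k` groups; return centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Move each center to the mean of its assigned points.
        centers = [
            [sum(x) / len(c) for x in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

# Two obvious clusters around (0, 0) and (5, 5).
pts = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]]
centers = kmeans(pts, k=2)
```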
Matt Wood, who heads up machine learning at the Seattle, Washington-based company, said that SageMaker “takes away most of the muck of machine learning.”
AWS has done “a ton of work” to natively optimize TensorFlow and MXNet into the SageMaker system, Jassy said. “So again you don’t have to worry about the behind-the-scenes setting up of the framework,” he said. “It’s all set up for you in SageMaker.”
The new HPO service “uses machine learning to inform the machine learning model,” Jassy said, and can eliminate the need to hand-tune upwards of a million individual hyperparameters in a neural network.
“What it means for machine learning model builders now is you don’t have to worry about the tuning of the parameters,” Jassy continued. “You just have to worry about should I change the amount and should I change the type of data. This is a huge weight off of builders’ backs.”
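The trial-and-error loop that HPO automates can be sketched in a few lines. The example below uses plain random search over a toy objective — a deliberately simple stand-in, not the HPO service, which AWS says guides the search with machine learning rather than sampling blindly:

```python
# Hypothetical sketch of automated hyperparameter search: the
# trial-and-error loop Jassy describes, run by a machine instead of a
# person. SageMaker's HPO service guides the search with ML; here we
# simply sample configurations at random and keep the best one.
import random

def random_search(train_and_score, space, trials=25, seed=0):
    """Try `trials` random configurations; return the best (score, config)."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = train_and_score(config)  # e.g. validation accuracy
        if best is None or score > best[0]:
            best = (score, config)
    return best

# Toy objective standing in for "train a model, return a validation
# score": best at learning_rate=0.1, momentum=0.9 (both made up).
def toy_score(cfg):
    return -((cfg["learning_rate"] - 0.1) ** 2 + (cfg["momentum"] - 0.9) ** 2)

space = {"learning_rate": (0.001, 1.0), "momentum": (0.0, 0.99)}
best_score, best_cfg = random_search(toy_score, space)
```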
SageMaker is modular and extensible, AWS says. Users can select pre-built Jupyter-based data science notebooks, or develop their own. Likewise, customers can use the pre-built algorithms and frameworks, or develop their own algorithms. TensorFlow and MXNet are the deep learning frameworks supported today, but the company plans to natively support all the major frameworks in the future, Jassy said.
Jassy said the performance of the 10 pre-configured machine learning algorithms will exceed that of any other cloud service as a result of the work AWS has done to optimize them. Eight of the 10 algorithms run 10x faster “than you’ll find anywhere else, and two of them run 3x faster,” he said.
AWS achieved those performance figures by configuring the algorithms to look at the data only once, instead of making multiple passes over it. “They only need to make one pass through that data, even if it’s petabytes in size,” he said.
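A textbook example of this single-pass style is Welford’s online algorithm, which computes a mean and variance while touching each value exactly once — so the data can be streamed rather than loaded and re-read. This is a standard technique offered only as an illustration of the idea, not AWS’s actual implementation:

```python
# Illustrative single-pass computation: Welford's online algorithm
# computes mean and variance while reading each value exactly once,
# so data can be streamed instead of re-read in multiple passes.
# (A textbook technique -- not AWS's actual implementation.)

def one_pass_stats(stream):
    """Return (count, mean, population variance) from one pass over `stream`."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # second factor uses the updated mean
    variance = m2 / count if count else 0.0
    return count, mean, variance

# An iterator, consumed once -- there is no second pass to make.
n, mu, var = one_pass_stats(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```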
The service is open, Jassy said. SageMaker customers can develop and run their machine learning applications on SageMaker, develop them on SageMaker and run them somewhere else, or develop them somewhere else and run them on AWS.
“This is a big deal for everyday developers and scientists in machine learning,” Jassy said. “This should make it much more accessible to customers.”
For more info on SageMaker, see this AWS blog post.