December 2, 2021

AWS Announces Six New Amazon SageMaker Capabilities

LAS VEGAS, Dec. 2, 2021 - Wednesday, at AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, announced six new capabilities for its industry-leading machine learning service, Amazon SageMaker, that make machine learning even more accessible and cost effective. Today’s announcements bring together powerful new capabilities, including a no-code environment for creating accurate machine learning predictions, more accurate data labeling using highly skilled annotators, a universal Amazon SageMaker Studio notebook experience for greater collaboration across domains, a compiler for machine learning training that makes code more efficient, automatic compute instance selection for machine learning inference, and serverless compute for machine learning inference. To get started with Amazon SageMaker, visit aws.amazon.com/sagemaker.

Driven by the availability of virtually infinite compute capacity, a massive proliferation of data in the cloud, and the rapid advancement of the tools available to developers, machine learning has become mainstream across many industries. For years, AWS has focused on making machine learning more accessible to a broader audience of customers. Today, Amazon SageMaker is one of the fastest growing services in AWS history with tens of thousands of customers, including AstraZeneca, Aurora, Capital One, Cerner, Discovery, Hyundai, Intuit, Thomson Reuters, Tyson, Vanguard, and many more customers who use the service to train machine learning models of all sizes. At the extreme, some of these models now consist of billions of parameters and make hundreds of billions of predictions every month. As customers further scale their machine learning model training and inference on Amazon SageMaker, AWS has continued to invest in expanding the service’s capability, delivering more than 60 new Amazon SageMaker features and functionalities in the past year alone. Today’s announcements build on these advancements to make it even easier to prepare and gather data for machine learning, train models faster, optimize the type and amount of compute needed for inference, and expand machine learning to an even broader audience.

  • Amazon SageMaker Canvas no-code machine learning predictions: Amazon SageMaker Canvas expands access to machine learning by providing business analysts (line-of-business employees supporting finance, marketing, operations, and human resources teams) with a visual interface that allows them to create more accurate machine learning predictions on their own, without requiring any machine learning experience or having to write a single line of code. As more companies seek to reinvent their businesses and customer experiences with machine learning, more people in their organizations need to be able to use advanced machine learning technology across different lines of business. However, machine learning has typically required specialized skills that can take years of formal education or intensive training with a challenging and evolving curriculum. Amazon SageMaker Canvas solves this challenge by providing a visual, point-and-click user interface that makes it easy for business analysts to generate predictions. Customers point Amazon SageMaker Canvas to their data stores (e.g. Amazon Redshift, Amazon S3, Snowflake, on-premises data stores, local files, etc.), and Amazon SageMaker Canvas provides visual tools to help users intuitively prepare and analyze data. Amazon SageMaker Canvas then uses automated machine learning to build and train machine learning models without any coding. Business analysts can review and evaluate models in the Amazon SageMaker Canvas console for accuracy and efficacy for their use case. Amazon SageMaker Canvas also lets users export their models to Amazon SageMaker Studio, so they can share them with data scientists to validate and further refine their models.
  • Amazon SageMaker Ground Truth Plus expert data labeling: Amazon SageMaker Ground Truth Plus is a fully managed data labeling service that uses an expert workforce with built-in annotation workflows to deliver high-quality data for training machine learning models faster and at lower cost with no coding required. Customers need increasingly larger datasets that are correctly labeled to train ever more accurate models and scale their machine learning deployments. However, producing large datasets can take anywhere from weeks to years and often requires companies to hire a workforce and create workflows to manage the process of labeling data. In 2018, AWS launched Amazon SageMaker Ground Truth to make it easier for customers to produce labeled data using human annotators through Amazon Mechanical Turk, third-party vendors, or their own private workforce. Amazon SageMaker Ground Truth Plus expands on this capability with a specialized workforce with specific domain and industry expertise, as well as qualifications to meet customers’ data security, privacy, and compliance requirements for highly accurate data labeling. Amazon SageMaker Ground Truth Plus has a multistep labeling workflow that includes pre-labeling powered by machine learning models, machine validation of human labeling to detect errors and low-quality labels, and assistive labeling features (e.g. 3D cuboid snapping, removal of distortion in 2D images, predict-next in video labeling, and auto-segment tools) to reduce the time required to label datasets and help reduce the cost of procuring high-quality annotated data. To get started, customers simply point Amazon SageMaker Ground Truth Plus to their data source in Amazon Simple Storage Service (Amazon S3) and provide their specific labeling requirements (e.g. instructions for how medical experts should label anomalies in radiology images of lungs). Amazon SageMaker Ground Truth Plus then creates a data labeling workflow and provides dashboards that allow customers to follow data annotation progress, inspect samples of completed labels for quality, and provide feedback to generate high-quality data so customers can build, train, and deploy highly accurate machine learning models more quickly.
  • Amazon SageMaker Studio universal notebooks: A universal notebook for Amazon SageMaker Studio (the first complete IDE for machine learning) provides a single, integrated environment to perform data engineering, analytics, and machine learning. Today, teams across different data domains want to collaborate using a range of data engineering, analytics, and machine learning workflows. The practitioners of these domains often cross areas of knowledge like data engineering, data analytics, and data science and want to be able to work across the various workflows without needing to switch data exploration tools. However, when customers are ready to integrate data across analytics and machine learning environments, they often have to juggle multiple tools and notebooks, which can be cumbersome, time consuming, and prone to error. Amazon SageMaker Studio now allows users to interactively access, transform, and analyze a wide range of data for multiple purposes all from within a universal notebook. With built-in integration with Spark, Hive, and Presto running on Amazon EMR clusters and data lakes running on Amazon S3, customers can now use Amazon SageMaker Studio to access and manipulate data in a universal notebook without having to switch services. In addition to developing machine learning models using their preferred framework (e.g. TensorFlow, PyTorch, or MXNet) to build, train, and deploy machine learning models in Amazon SageMaker Studio, customers can browse and query data sources, explore metadata and schemas, and start processing jobs for analytics or machine learning workflows—without leaving the universal Amazon SageMaker Studio notebook.
  • Amazon SageMaker Training Compiler for machine learning models: Amazon SageMaker Training Compiler is a new machine learning model compiler that automatically optimizes code to use compute resources more effectively and reduce the time it takes to train models by up to 50%. Today’s state-of-the-art deep learning models are so large and complex that they require specialized compute instances to accelerate training and can consume thousands of hours of graphics processing unit (GPU) compute time to train a single model. To further accelerate training times, data scientists typically try to augment training data or tune hyperparameters (variables that govern the machine learning training process) to find the best performing and least resource-intensive version of a model. This work is technically complicated, and data scientists often do not have time to optimize the frameworks needed to train models to run on GPUs. Amazon SageMaker Training Compiler is integrated with the versions of TensorFlow and PyTorch in Amazon SageMaker that have been optimized to run more efficiently in the cloud, so data scientists can use their preferred frameworks to train machine learning models through more efficient use of GPUs. With a single click, Amazon SageMaker Training Compiler automatically optimizes the model and compiles it to execute training up to 50% faster.
  • Amazon SageMaker Inference Recommender automatic instance selection: Amazon SageMaker Inference Recommender helps customers automatically select the best compute instance and configuration (e.g. instance count, container parameters, and model optimizations) to power a particular machine learning model. For large machine learning models commonly used for natural language processing or computer vision, selecting a compute instance with the best price performance is a complicated, iterative process that can take weeks of experimentation. Amazon SageMaker Inference Recommender removes the guesswork and complexity of determining where to run a model and can reduce the time to deploy from weeks to hours by automatically recommending the ideal compute instance configuration. Data scientists can use Amazon SageMaker Inference Recommender to deploy the model to one of the recommended compute instances, or they can use the service to run a performance benchmark simulation across a range of selected compute instances. Customers can review benchmark results in Amazon SageMaker Studio and evaluate the tradeoffs between different configuration settings including latency, throughput, cost, compute, and memory.
  • Amazon SageMaker Serverless Inference for machine learning models: Amazon SageMaker Serverless Inference offers pay-as-you-go pricing for inference on machine learning models deployed in production. Customers are always looking to optimize costs when using machine learning, and this becomes increasingly important for applications that have intermittent traffic patterns with long idle times. For example, applications like personalized recommendations based on consumer purchase patterns, chatbots fielding incoming customer calls, and forecasting demand based on real-time transactions can have peaks of activity based on external factors like weather conditions, promotional offerings, or holidays. Providing just the right amount of compute for machine learning inference is a difficult balancing act. In some cases, customers over-provision capacity to accommodate peak activity, which allows for consistent performance but wastes money when there is no traffic. In other cases, customers under-provision compute to constrain costs, at the expense of having enough compute capacity to perform inference when conditions change. Some customers try manually adjusting computing resources on the fly to accommodate changing conditions, but this is tedious work. Amazon SageMaker Serverless Inference for machine learning automatically provisions, scales, and turns off compute capacity based on the number of inference requests. When customers deploy their machine learning model into production, they simply select the serverless deployment option in Amazon SageMaker, and Amazon SageMaker Serverless Inference manages compute resources to provide the precise amount of compute needed. With Amazon SageMaker Serverless Inference, customers only pay for the compute capacity they use for each request and the amount of data processed, without having to manage the underlying infrastructure.
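For readers who want to see what selecting the serverless deployment option might look like in code, the following is a minimal Python sketch, not taken from the announcement. It assumes the AWS SDK for Python (boto3) and its SageMaker CreateEndpointConfig request, where a production variant carries a ServerlessConfig block in place of an instance type and count. The function names, the "AllTraffic" variant name, the defaults, and the list of allowed memory sizes are illustrative assumptions and should be checked against the current SageMaker API reference.

```python
from typing import Any, Dict

# Memory sizes assumed to be accepted by ServerlessConfig (an assumption;
# verify against the current SageMaker API reference before relying on it).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,
    max_concurrency: int = 5,
) -> Dict[str, Any]:
    """Return keyword arguments for a CreateEndpointConfig request (hypothetical helper)."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # illustrative name
                "ModelName": model_name,
                # No instance type or count here: capacity is left to the service.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def deploy(config: Dict[str, Any], endpoint_name: str) -> None:
    """Create the endpoint. Requires boto3, AWS credentials, and an existing model."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("sagemaker")
    client.create_endpoint_config(**config)
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config["EndpointConfigName"],
    )


if __name__ == "__main__":
    cfg = build_serverless_endpoint_config("demo-serverless-config", "demo-model")
    print(cfg["ProductionVariants"][0]["ServerlessConfig"])
```

The builder is kept separate from the deploy step so the request can be inspected or validated without touching an AWS account. In this sketch the only sizing decisions left to the customer are the memory size and the concurrency ceiling.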

“Customers across all industries and sizes are excited about how Amazon SageMaker has helped them scale their use of machine learning such that it has become a core part of their operations and allows them to invent new products, services, and experiences for the world,” said Bratin Saha, Vice President of Amazon Machine Learning at AWS. “We’re excited to expand our industry-leading machine learning service to an even broader group of customers, so they too can drive innovation in their business and help solve challenging problems. With these new Amazon SageMaker tools, we’re introducing a whole new group of users to the service while also providing additional capabilities for existing customers to make it easier to transform data into valuable insights, accelerate time to deployment, improve performance, and save money throughout the machine learning journey.”

About Amazon Web Services

For over 15 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud offering. AWS has been continually expanding its services to support virtually any cloud workload, and it now has more than 200 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 81 Availability Zones within 25 geographic regions, with announced plans for 27 more Availability Zones and nine more AWS Regions in Australia, Canada, India, Indonesia, Israel, New Zealand, Spain, Switzerland, and the United Arab Emirates. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs. To learn more about AWS, visit aws.amazon.com.

About Amazon

Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Amazon strives to be Earth’s Most Customer-Centric Company, Earth’s Best Employer, and Earth’s Safest Place to Work. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Career Choice, Fire tablets, Fire TV, Amazon Echo, Alexa, Just Walk Out technology, Amazon Studios, and The Climate Pledge are some of the things pioneered by Amazon. For more information, visit amazon.com/about.


Source: AWS
