June 10, 2020

AWS Upgrades SageMaker Labeling Tool

Amazon Web Services has added a 3D visualization capability to its SageMaker data labeling tool used to build training data sets for machine learning models.

AWS said this week that Ground Truth, its SageMaker data labeling service introduced in 2018, now includes a workflow for labeling point clouds, which are sets of data points generated by tools like 3D scanners or Lidar sensors. Among the applications is labeling the huge 3D data sets used to train models incorporated into self-driving car navigation systems. Those data sets can grow to hundreds of megabytes, making labeling extremely arduous.

The new 3D point cloud labeling tool is billed as a custom workflow that includes a built-in editor and new “assistive” labeling features, the company (NASDAQ: AMZN) said in a June 10 blog post.

SageMaker Ground Truth helps users label data used for machine learning models. Customers can choose automated labeling, in which a machine learning algorithm assists with the labeling, or they can tap a pool of human labelers, such as Amazon's Mechanical Turk crowdsourcing service.

Ground Truth now incorporates a sensor fusion feature that allows it to synchronize a point cloud with up to eight cameras. Users can then interchangeably apply labels to 2D images and 3D point clouds.

Once stored, input data for 3D point clouds must be described in what is known as a manifest file. With the manifest prepared, AWS said, Ground Truth allows users to create task types such as object detection, object tracking and semantic segmentation, the last of which involves partitioning the points in a cloud frame into pre-defined categories.

Another challenge is data sets containing a mix of 3D Lidar data and two-dimensional camera images. The Lidar data and images must be synchronized so data scientists can map 3D points to 2D coordinates on images captured by on-board cameras.

Stitching together the outputs requires a “global coordinate system” that accounts for the location of cameras on a car and where cameras are pointed. From there, the AWS tool computes the coordinates of all data points in the 3D cloud.
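
The geometry behind that mapping can be illustrated with a minimal sketch. It assumes a simplified vehicle pose (a position plus a heading angle only), an ideal pinhole camera and made-up numbers. The function names are hypothetical and this is not AWS's implementation or API.

```python
import math


def rotate_z(point, yaw):
    """Rotate a 3D point about the vertical (z) axis by `yaw` radians."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y, s * x + c * y, z)


def vehicle_to_global(point, vehicle_position, vehicle_yaw):
    """Map a Lidar point from the vehicle's frame into the global frame.

    Assumption: the pose is just a position and a heading; a real pose
    would normally carry a full 3D orientation.
    """
    rx, ry, rz = rotate_z(point, vehicle_yaw)
    px, py, pz = vehicle_position
    return (rx + px, ry + py, rz + pz)


def global_to_camera(point, camera_position, camera_yaw):
    """Map a global point into a camera frame that looks along +x."""
    shifted = tuple(p - c for p, c in zip(point, camera_position))
    return rotate_z(shifted, -camera_yaw)


def project_to_image(point_cam, focal_length, cx, cy):
    """Pinhole projection: x is depth, y points left, z points up.

    Returns pixel coordinates (u, v), or None if the point lies behind
    the camera and therefore cannot appear in the image.
    """
    depth, left, up = point_cam
    if depth <= 0:
        return None
    u = cx - focal_length * left / depth
    v = cy - focal_length * up / depth
    return (u, v)


# Illustrative values only: a vehicle heading along the global +y axis,
# with a camera mounted 1 m above the vehicle origin and facing forward.
vehicle_position = (10.0, 5.0, 0.0)
vehicle_yaw = math.pi / 2
camera_position = (10.0, 5.0, 1.0)
camera_yaw = math.pi / 2

lidar_point = (2.0, 1.0, 1.0)  # 2 m ahead, 1 m to the left, 1 m up
global_point = vehicle_to_global(lidar_point, vehicle_position, vehicle_yaw)
camera_point = global_to_camera(global_point, camera_position, camera_yaw)
pixel = project_to_image(camera_point, focal_length=1000.0, cx=640.0, cy=360.0)
print(global_point, pixel)
```

Routing every point through one global frame means each camera only needs its own position and heading in that frame, so the same labeled 3D point can be located in any of the synchronized images.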

Information such as the vehicle position and the locations of the Lidar data and images in AWS storage is also saved to the manifest file. From that point, the SageMaker managed service automates what would otherwise be a “significant workload” for machine learning model developers, the company said.

In a self-driving vehicle demonstration, AWS illustrated how the Ground Truth tool can be used to automatically label key video frames via the assistive labeling feature.

AWS said the point cloud labeling feature is now available on SageMaker Ground Truth in several North American, European and Asia Pacific regions.