March 31, 2021

AWS Adds Explainability to SageMaker


Amazon Web Services is adding an AI explainability reporting feature to its SageMaker machine learning model builder aimed at improving model accuracy.

SageMaker Autopilot now generates a model explainability report via SageMaker Clarify, the Amazon tool used to detect algorithmic bias while increasing the transparency of machine learning models. The reports would help model developers understand how individual attributes of training data contribute to a predicted result.

The combination is promoted as helping to identify and limit algorithmic bias and explain predictions, allowing users to make informed decisions based on how models arrived at conclusions, AWS said this week.

The reports also include “feature importance values” that express, as a percentage, how much each training data attribute contributed to a predicted result.

“The higher the percentage, the more strongly that feature impacts your model’s predictions,” the company said in a blog post unveiling the reporting capability.
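To make the percentage idea concrete, here is a minimal sketch that turns raw per-feature attribution scores into shares of the total. The feature names and scores are invented for illustration, and the normalization shown is an assumption about how such percentages could be derived, not a description of how Clarify computes its numbers.

```python
def importance_percentages(attributions):
    """Convert raw per-feature attribution scores into percentages of the total.

    Uses absolute values so that negative contributions still count as influence.
    """
    magnitudes = {name: abs(score) for name, score in attributions.items()}
    total = sum(magnitudes.values())
    if total == 0:
        # No feature carries any weight; report 0% for all of them.
        return {name: 0.0 for name in magnitudes}
    return {name: 100.0 * value / total for name, value in magnitudes.items()}


# Hypothetical attribution scores for a churn model (illustrative only).
raw_scores = {"tenure_months": 0.6, "monthly_charges": -0.3, "region": 0.1}
percentages = importance_percentages(raw_scores)

# Print features from most to least influential.
for name, pct in sorted(percentages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

In this toy data the feature with the largest share is the one the model leans on most, which matches the reading AWS describes.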

The feature importance values can be accessed using the SageMaker Autopilot APIs.
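As a rough sketch of what that API access might look like, the helper below pulls the explainability report location out of a job description. It assumes the response shape of the SageMaker `DescribeAutoMLJob` call as commonly documented (`BestCandidate`, `CandidateProperties`, `CandidateArtifactLocations`, `Explainability`). The sample response and S3 path are hand-written stand-ins, so check the field names against the current API reference before relying on them.

```python
def explainability_location(job_description):
    """Return the S3 prefix of the explainability report, or None if absent.

    `job_description` is assumed to be shaped like a DescribeAutoMLJob response.
    """
    best = job_description.get("BestCandidate", {})
    properties = best.get("CandidateProperties", {})
    locations = properties.get("CandidateArtifactLocations", {})
    return locations.get("Explainability")


# Hand-written stand-in for an API response (hypothetical names and paths).
sample_response = {
    "AutoMLJobName": "example-job",
    "BestCandidate": {
        "CandidateName": "example-candidate",
        "CandidateProperties": {
            "CandidateArtifactLocations": {
                "Explainability": "s3://example-bucket/example-job/explainability/"
            }
        },
    },
}

print(explainability_location(sample_response))
```

With boto3 installed and credentials configured, the dictionary would come from `boto3.client("sagemaker").describe_auto_ml_job(AutoMLJobName="...")` rather than being written by hand.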

The model explainability reports would also allow developers to remove less important attributes to accelerate model predictions. Further, developers could use Autopilot to check model accuracy and fairness by identifying those attributes causing bias and confirming their low priority as a model feature.
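The pruning step described above could be sketched as follows. The 5 percent cutoff, the feature names and the sample rows are all hypothetical. A real threshold would depend on the model and on how much accuracy a team is willing to trade for speed.

```python
def select_features(importance_pct, min_percent=5.0):
    """Keep only features whose importance meets the (assumed) cutoff."""
    return [name for name, pct in importance_pct.items() if pct >= min_percent]


def keep_columns(rows, columns):
    """Project each row (a dict) down to the selected columns."""
    return [{col: row[col] for col in columns} for row in rows]


# Hypothetical importance values, as percentages, from an explainability report.
importance_pct = {"tenure_months": 58.0, "monthly_charges": 39.5, "favorite_color": 2.5}

rows = [
    {"tenure_months": 3, "monthly_charges": 80.0, "favorite_color": "red"},
    {"tenure_months": 48, "monthly_charges": 35.5, "favorite_color": "blue"},
]

kept = select_features(importance_pct, min_percent=5.0)
slim_rows = keep_columns(rows, kept)
print(kept)
print(slim_rows)
```

A model retrained on the slimmed-down rows has fewer inputs to process per prediction, which is where the speedup would come from.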

Along with greater model transparency, SageMaker Autopilot is designed to allow model builders to automate the building, training and tuning of machine learning models based on an individual user’s data. It first loads tabular data from Amazon Simple Storage Service (S3) to train the model, then selects a target column for predictions. Once the preferred algorithm is selected, training and tuning are automated for the desired model.

A “model leaderboard” function allows users to optimize or retrain a model based on a ranked list of recommendations. Autopilot then deploys the model in production and monitors performance.

If data is missing, AWS said, Autopilot fills it in, then automatically infers the type of prediction best suited to the available data: binary classification, multi-class classification or regression.
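One simple way to picture that inference is a heuristic over the target column, sketched below. The rules and the `max_classes` cutoff are assumptions made for illustration. AWS has not said in this announcement how Autopilot actually decides.

```python
def infer_problem_type(target_values, max_classes=20):
    """Guess the prediction type from a target column (illustrative heuristic).

    Missing entries (None) are ignored before inspecting the values.
    """
    observed = [v for v in target_values if v is not None]
    if not observed:
        raise ValueError("target column has no observed values")

    distinct = set(observed)
    if len(distinct) == 2:
        return "binary classification"

    # Treat booleans as labels, not numbers.
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    if all_numeric and len(distinct) > max_classes:
        return "regression"
    return "multi-class classification"


churn_labels = ["yes", "no", None, "yes"]
risk_labels = ["low", "medium", "high", "low"]
house_prices = [100.5 + i for i in range(50)]

print(infer_problem_type(churn_labels))
print(infer_problem_type(risk_labels))
print(infer_problem_type(house_prices))
```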

Autopilot can also be used to select advanced algorithms such as decision trees and deep neural networks. Models are then trained and tuned based on those algorithms to match models to available data.

The model leaderboard ranks all machine learning models generated by a user’s data, accounting for accuracy and precision, then deploys the best model for a given use case.
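A leaderboard of this kind can be thought of as a sorted list of candidates. The sketch below ranks by accuracy and breaks ties on precision. That ordering rule, the candidate names and the scores are all hypothetical rather than taken from Autopilot.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float
    precision: float


def rank_candidates(candidates):
    """Order candidates best-first by accuracy, then by precision (assumed rule)."""
    return sorted(candidates, key=lambda c: (c.accuracy, c.precision), reverse=True)


# Hypothetical trial results.
candidates = [
    Candidate("xgboost-trial-1", accuracy=0.91, precision=0.88),
    Candidate("linear-trial-2", accuracy=0.86, precision=0.90),
    Candidate("mlp-trial-3", accuracy=0.91, precision=0.93),
]

leaderboard = rank_candidates(candidates)
best = leaderboard[0]

for position, candidate in enumerate(leaderboard, start=1):
    print(position, candidate.name, candidate.accuracy, candidate.precision)
```

The top entry is the one that would be handed off for deployment.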

AWS said use cases for SageMaker Autopilot include price predictions for financial services and real estate, customer churn prediction and risk assessments.

Editor’s note: A previous version of this story incorrectly described SageMaker Autopilot. The new feature is the ability to produce explainability reports.