IBM’s Uncertainty Quantification 360 Toolkit Boosts Trust in AI
June 9, 2021 — In a blog post, Prasanna Sattigeri and Vera Liao, researchers at IBM Research AI, discussed a new open-source toolkit that enables data science teams to communicate uncertainty in machine learning models. The blog post is excerpted below.
Would you feel safe in a self-driving car that confidently misidentifies the side of a tractor-trailer as a brightly lit sky and refuses to brake or warn the human driver? Probably not.
Unfortunately, such mishaps have indeed taken lives. AI systems based on deep learning have a reputation for making overconfident predictions, even when they are wrong—with serious consequences at times.
This is where Uncertainty Quantification (UQ) comes in—the tech enabling an AI to express that it is unsure, giving it intellectual humility and boosting the safety of its deployment. And this is what our Uncertainty Quantification 360 (UQ360) open-source toolkit is all about.
Released at the 2021 IBM Data & AI Digital Developer Conference, it’s aimed at giving data science practitioners and developers state-of-the-art algorithms to streamline the process of quantifying, evaluating, improving, and communicating uncertainty of machine learning models.
Building on the socially responsible tradition of other open source efforts released by IBM Research in the field of trustworthy AI — AI Fairness 360, AI Explainability 360, Adversarial Robustness 360, AI FactSheets 360 — it is the first comprehensive toolkit of its kind.
We invite you to use it and contribute to it.
The adverse effects of poor uncertainty estimates
It’s not just about self-driving cars. There are many other applications where it is safety-critical for AI to express uncertainty. For example, a chatbot that is unsure when a pharmacy closes but confidently provides a wrong answer may result in a patient not getting the medication they need.
Take sepsis, a disease whose complications kill many people each year. Early detection of sepsis is important, and AI can help — but only when AI predictions are accompanied by meaningful uncertainty estimates. Only then can doctors immediately treat patients the AI has confidently flagged as at risk, and order additional diagnostics for those it is less certain about. If the model produces unreliable uncertainty estimates, patients may die.
Common explainability techniques shed light on how AI works, but UQ exposes its limits and potential failure points. Users of a house-price prediction model would like to know the margin of error of the model’s predictions to estimate their potential gains or losses. Similarly, a product manager may see that an AI model predicts feature A will perform better than feature B on average, but to gauge the worst-case effect on KPIs, the manager also needs to know the margin of error of those predictions.
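To make the “margin of error” idea concrete, here is a minimal, self-contained sketch of one common way to obtain it: compute a quantile of absolute residuals on held-out calibration data (a split-conformal-style interval). This is a conceptual illustration only — the function names and toy house-price numbers are assumptions, and this is not the UQ360 API.

```python
# Conceptual sketch (NOT the UQ360 API): derive a margin of error
# for a point-prediction model from held-out calibration data.

def margin_of_error(cal_predictions, cal_actuals, coverage=0.9):
    """Return a margin m such that roughly `coverage` of calibration
    points satisfy |actual - prediction| <= m."""
    residuals = sorted(abs(a - p) for p, a in zip(cal_predictions, cal_actuals))
    # Index of the empirical `coverage` quantile of absolute residuals.
    k = min(len(residuals) - 1, int(coverage * len(residuals)))
    return residuals[k]

def prediction_interval(point_prediction, margin):
    """Turn a point prediction into a (low, high) interval."""
    return (point_prediction - margin, point_prediction + margin)

# Toy calibration data: predicted vs. actual house prices (in $1000s).
preds = [300, 450, 210, 520, 380, 290, 610, 330, 470, 255]
actuals = [310, 440, 230, 500, 395, 280, 640, 320, 455, 270]

m = margin_of_error(preds, actuals, coverage=0.9)
low, high = prediction_interval(400, m)  # interval around a $400k prediction
```

A buyer can then reason about worst-case losses from `low` rather than from the point prediction alone.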
High-quality uncertainty estimates and effective uncertainty communication can also improve human-AI collaboration.
Consider the following scenario: a nurse practitioner uses an AI system to help diagnose skin disease. If the AI’s confidence (the uncertainty estimate for a given diagnosis) is high, the nurse practitioner accepts the AI decision; otherwise, the AI recommendation is discarded, and the patient is referred to a dermatologist. Uncertainties serve as a form of communication between the AI system and the human user to achieve the best accuracy, robustness, and fairness.
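The triage scenario above can be sketched as a simple confidence-threshold rule: accept the AI diagnosis when its confidence clears a threshold, otherwise defer to a specialist. This is an illustrative sketch of the logic described, not code from UQ360; the threshold value and label strings are assumptions.

```python
# Illustrative sketch of confidence-based deferral (assumed names and
# threshold; not part of the UQ360 toolkit).

def route_case(prediction, confidence, threshold=0.85):
    """Accept the AI's diagnosis when confidence is high enough;
    otherwise discard it and refer the patient to a dermatologist."""
    if confidence >= threshold:
        return ("accept", prediction)
    return ("refer_to_dermatologist", None)

high_conf = route_case("benign nevus", 0.95)   # nurse accepts the AI decision
low_conf = route_case("melanoma", 0.60)        # case deferred to a specialist
```

The quality of this workflow hinges entirely on the confidence estimates being well calibrated — which is exactly what UQ aims to provide.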
Click here to read the full announcement.