In Automation We Trust: How to Build an Explainable AI Model
As AI becomes more advanced and complex, the algorithms and logic powering it become less transparent. This lack of clarity can be unnerving for some people. Recent high-profile AI failures illustrate this. For example, Amazon built AI-powered recruiting software to help review and recommend applicant resumes, but failed to recognize that the data feeding the algorithm was biased toward male job seekers. Apple and Goldman Sachs used machine learning to determine creditworthiness of Apple Card applicants—only to realize that the credit history data the algorithm was based on is inherently sexist.
Amazon and Apple learned in a very public way that AI models are only as good as the data that you feed into them, and if the data is compromised, discriminatory or otherwise flawed, you’re not going to get accurate results.
Consumers and regulators are catching on. Research suggests that consumers are more likely to use AI if they can modify its algorithms even slightly, and more likely to trust algorithms when they can play a role in the decision-making process and have some control over the outcomes, especially when those decisions are derived from their own user data. Consumers and regulators are pressing for a right to an explanation of an algorithm’s outcome, and they are investigating new methods for delivering explainable AI or bringing a human into the loop to gain deeper control over the algorithms and the data that drives their outcomes.
This change is crucial. It’s human nature to forgive human error, and hold AI to a much higher standard. Machines are supposed to be infallible, and they’re often marketed that way. Autonomous driving is a great example.
According to the National Highway Traffic Safety Administration, 37,000 Americans died in car accidents in 2018, and only one of those deaths involved an autonomous vehicle. Yet Americans continue to distrust self-driving technologies because they’re unfamiliar with how they work. Even more importantly, when a car accident happens today, the driver is the primary responsible party. Until manufacturers can explain what happened in every accident involving their cars and take accountability for it, the dream of a self-driving car cannot become a reality.
Explainable AI Provides Transparency and Builds User Trust
In order for AI to truly succeed on a large scale, consumers need to trust the algorithms that power the decision-making process. This requires transparency into AI models and the data that is being fed into the algorithms, giving users peace of mind that their personal data is being used appropriately to inform and improve decisions.
This can be achieved using explainable AI. From a backend development perspective, explainable AI allows model developers, business users, regulators and end users to better understand why certain predictions are made and to course correct as needed. Further, it enables them to correct and improve their models before they’re deployed at scale.
If Amazon or Apple had built explainable AI into their development processes, they likely would have identified the inherent bias in their data and been able to adjust the weight of certain criteria to get a more equitable result. They didn’t, and so faced very public, very embarrassing AI failures.
Here are five things to keep in mind when implementing explainable AI in your AI models to gain user trust:
1. Establish Principles for Algorithmic Accountability
A car manufacturer is not sued for injuries or deaths as a result of speeding. A maker of kitchen knives is not held accountable if one of their products results in a chopping accident during dinner prep. Humans make the poor decisions that lead to those actions, not the manufacturer or designer.
AI changes that dynamic. If AI is providing guidance to the user, the responsibility continues to sit with the user. If AI is automating the decision without human input, there is a shift of responsibility. Data scientists and data engineers need to understand that any decision a machine makes is ultimately their responsibility, and they need to take that responsibility to heart. Their employers need to realize that as well, and establish the corresponding ethics, code of conduct and literacy.
2. Ensure Quality of Your Data
In many cases, the source of the problem is not the algorithm itself but bias in the data it learns from. This happens when the data informing the decision-making process isn’t reliable, accurate or fair, for example because it reflects past human decisions that privileged one arbitrary group of users over others. The damage done by AI can be huge because AI can turn wrong or unfair decisions into something that is systemic and automatic and that runs at potentially unlimited scale.
Organizations need to understand that data quality drives AI successes and failures. Data can even be manipulated to trick an algorithm into making wrong decisions, much as a malicious virus can take control of your computer. Organizations urgently need to put in place an intensive data resiliency program, using powerful data quality engines to test, monitor, and tear apart the data in a million different ways to determine its quality. Run every test, assess every scenario, and figure out whether any wrong decisions could be made, how they would be made, and how you can prevent those unintended results. Test, test, and test again until all scenarios are exhausted. You simply cannot afford to put something into production that turns off users, embarrasses your organization or causes harm.
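One of the simplest tests in such a program is comparing outcome rates across groups in the historical data before it is used for training. The following is a minimal sketch in Python, not a complete data quality engine. The records, the field names and the 0.8 cutoff (borrowed from the "four-fifths" rule of thumb used in U.S. employment guidance) are assumptions made for illustration.

```python
from collections import defaultdict

def selection_rates(records, group_key, outcome_key):
    """Return the share of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        positives[group] += 1 if row[outcome_key] else 0
    return {g: positives[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected, so no group is favored
    return min(rates.values()) / highest

# Hypothetical historical hiring decisions that would feed a model.
history = [
    {"gender": "m", "hired": True}, {"gender": "m", "hired": True},
    {"gender": "m", "hired": True}, {"gender": "m", "hired": False},
    {"gender": "f", "hired": True}, {"gender": "f", "hired": False},
    {"gender": "f", "hired": False}, {"gender": "f", "hired": False},
]

rates = selection_rates(history, "gender", "hired")
ratio = disparate_impact_ratio(rates)
THRESHOLD = 0.8  # assumed cutoff; pick one that fits your own policy
needs_review = ratio < THRESHOLD
print(rates, round(ratio, 2), needs_review)
```

A low ratio does not prove discrimination on its own, but it is a cheap signal that the data deserves a closer look before a model learns from it.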
3. Build Explainable AI Into Your Agile Development Process
Agile enables a continuous development process while putting quality controls in place throughout the software lifecycle. These quality controls should have explainable AI built into them. Do this from the very beginning of your software selection process and product development, and then follow through as your data product goes through updates and adds features and functionality. Just as each build is tested for code errors and customer experience bottlenecks before it is put into production, it should also be tested for explainability. Run data through the updated software to ensure the expected and desired results continue to be produced, and then make any necessary changes. Note that if your business is subject to GDPR, this is not only a best practice but a compliance mandate, as stated in Article 22, “Automated individual decision-making, including profiling.”
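As an illustration of what a per-build check might look like, the sketch below runs sample records through a model twice, changing only a protected attribute, and fails the build if the score moves. The `score_applicant` function is a hypothetical stand-in for the model under test, and the field names are assumptions.

```python
def score_applicant(applicant):
    """Hypothetical stand-in for the model under test."""
    return 0.5 * applicant["years_experience"] / 10 + 0.5 * applicant["skills_match"]

def flip_test(model, applicants, attribute, values, tolerance=1e-9):
    """Return the applicants whose score changes when only `attribute` changes."""
    failures = []
    for applicant in applicants:
        # Score the same record once per possible value of the attribute.
        scores = [model({**applicant, attribute: v}) for v in values]
        if max(scores) - min(scores) > tolerance:
            failures.append(applicant)
    return failures

sample = [
    {"years_experience": 4, "skills_match": 0.7, "gender": "f"},
    {"years_experience": 9, "skills_match": 0.4, "gender": "m"},
]

failures = flip_test(score_applicant, sample, "gender", ["f", "m"])
assert not failures, f"score depends on gender for {len(failures)} applicant(s)"
```

A check like this only catches direct use of the attribute. It will not detect proxies such as a field that correlates with gender, so it complements the data quality tests rather than replacing them.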
4. Ensure There’s a Human in the Loop
Machines don’t feel remorse, regret, or any responsibility. It is critical that some sort of human touch is inserted somewhere in the AI loop. The point is not to check for errors or slow down the process, but to have someone act as a coach in the learning process that infers a fair decision, or as the conscience and controller of a decision. Someone needs to be asking why a decision is being made and what the ethical implications are of having a machine take an autonomous action.
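One way to give a person that controlling role is to decide, per decision, whether the machine may act alone. The sketch below is one possible policy, not a recommendation: the `Decision` type, the 0.9 confidence floor and the rule that every adverse outcome goes to a person are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    applicant_id: str
    approved: bool
    confidence: float  # the model's own estimate, between 0 and 1

def route(decision, min_confidence=0.9):
    """Return 'auto' or 'human_review' for a model decision."""
    if not decision.approved:
        return "human_review"  # adverse outcomes always go to a person
    if decision.confidence < min_confidence:
        return "human_review"  # the model is unsure, so a person decides
    return "auto"

queue = [
    Decision("a-1", approved=True, confidence=0.97),
    Decision("a-2", approved=True, confidence=0.62),
    Decision("a-3", approved=False, confidence=0.99),
]
routes = {d.applicant_id: route(d) for d in queue}
print(routes)
```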
5. Be Transparent About the Data Being Used to Inform Decisions
Transparency is absolutely critical when building user trust in your AI model. Everyone who could be impacted by your algorithms, including consumers when decisions are inferred from their personal data, needs to have a way to understand the decisions being made by your data product. You don’t necessarily need to publish your algorithm (that would be giving up intellectual property and would likely just confuse users), but you should bring clarity to the underlying decision path and the data that drives it.
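For a simple scoring model, the decision path can be surfaced as a ranked list of how much each input pushed the result up or down. The sketch below assumes a linear score with made-up weights and feature names; real models usually need dedicated explanation techniques, but the output a user sees can look much the same.

```python
# Hypothetical weights for a linear credit score; inputs are scaled to 0..1.
WEIGHTS = {"income": 0.4, "payment_history": 0.5, "debt_ratio": -0.3}

def explain(features, weights=WEIGHTS):
    """Return the score and each feature's contribution, largest effect first."""
    contributions = {name: weights[name] * features[name] for name in weights}
    score = sum(contributions.values())
    drivers = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return score, drivers

score, drivers = explain({"income": 0.6, "payment_history": 0.9, "debt_ratio": 0.5})
for name, value in drivers:
    print(f"{name}: {value:+.2f}")
```

The weights themselves can stay private. What the user receives is the list of drivers behind their own decision.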
What’s interesting in the Apple credit card example is that neither Apple nor Goldman Sachs intended to discriminate against women, but the reality is that they did it at scale. Although the problem was reported as a consumer discrimination scandal, it is also an example of a high-tech leader, and a poster child in the area of AI and machine learning, losing control of how its business operates.
Statistician W. Edwards Deming said, “In God we trust, all others must bring data.” He was highlighting the importance of statistical measurement and analysis in driving better, more transparent and more informed decisions. Data professionals are now starting to realize that this puts the burden of trust on their shoulders. We need to bring clarity to our data pipelines and data-driven processes. Does your organization have a code of conduct and related controls for your data-driven initiatives?
About the author: Jean-Michel Franco has dedicated his career to developing and broadening the adoption of innovative technologies and is currently the Senior Director of Product Marketing at Talend. He is an expert in GDPR and data privacy, working on the front lines with Talend’s clients every day to prepare them for compliance and help them learn the new roles employees will take in relation to data use. Prior to Talend, he created and developed a business intelligence practice for HP (formerly EDS) and worked for SAP EMEA as Director of Marketing Solutions in France and North Africa and later as Innovation Director for Business & Decision.