Enterprise AI and the Paradox of Accuracy
As more organizations begin to experiment with AI in their internal operations, an interesting paradox is developing. The best way to think about the current state of AI is that it behaves as a very powerful “mimic.” If you show it example data or model an existing process, AI will be able to perform that task at massive scale, delivering dramatically increased throughput and reduced processing times. However, the most common concern with implementing AI revolves around the notion of “accuracy”: does it perform the task accurately enough to be useful?
The paradox is that as an intelligent mimic, AI often uncovers significant inconsistencies in the example data or the existing process it has been applied to. As a result, the conclusion is often “AI is not yet smart enough to perform this task,” when in reality AI is simply uncovering existing inconsistencies in the human processes being automated.
Let’s look at this paradox in practice.
While customer-facing applications of AI get a lot of the press, many businesses are looking to apply AI to less glamorous back-office operations, for example:
- Automating business processes with the goal of making them more efficient and freeing up staff to focus on higher value activities;
- Analyzing large volumes of existing enterprise content to gain new insights about market trends, sales opportunities, customer sentiment, etc.
In both cases, existing approaches tend to be highly manual, with humans making decisions based on their subject-matter expertise and the information they are able to consume. Whether the task is reviewing market research data, analyzing contracts, responding to RFPs, or matching resumes to job descriptions, the accuracy and consistency of those decisions are never objectively measured. Instead, there is a heavy reliance on gut instinct and subjective judgement along the way.
When you introduce AI, rigorous measurement of the accuracy and consistency of these decisions is a given. As users create models and train them, they can immediately see how well they perform against the target outcome – what’s been defined by subject matter experts as the right answer or decision. They can see exactly how much it improves as the model is trained.
Yet it’s these detailed measurements that give many users pause. Given the choice between their gut sense and a machine learning model that performs at 65%, 75%, or 85% accuracy, users often opt to stick with the manual process because the machine learning model is “not accurate enough” in their view.
Many users have unrealistic, inflated expectations for AI. They think it’s a magic solution to their data problem and are disappointed when it is not 100% perfect. When their disappointment exceeds their enthusiasm, projects tend to lose momentum quickly.
This problem is exacerbated when users are not focused on a business outcome. They look at the accuracy of their AI project in a vacuum, rather than asking how much more accurate it is, or how much time it will save, compared to what they are currently doing manually.
What’s interesting is how AI becomes a forcing function for people to iron out their existing understanding of a problem. Often, the problem they think they are solving and the problem they ultimately solve end up being quite different. This is because successful AI projects require carefully defined inputs and outputs. If either is poorly defined, the results are disappointing.
The other key challenge is the notion of accuracy itself. In the machine learning world, data scientists generally bristle when they hear people ask about a model’s “accuracy,” as there isn’t a single metric whereby the efficacy of a machine learning algorithm is measured. Indeed, accuracy is a uniquely poor metric of a machine learning algorithm’s efficacy.
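A minimal sketch of why a single accuracy number can mislead, using made-up data: assume 1,000 contracts of which only 50 contain a risky clause, and a hypothetical "model" that never flags anything. The data and the function names are illustrative assumptions, not taken from any real system.

```python
# Hypothetical data: 1,000 contracts, 50 of which contain a risky clause (label 1).
labels = [1] * 50 + [0] * 950

# A deliberately useless "model" that never flags a contract.
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of the truly risky contracts that the model flagged."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


print(f"accuracy: {accuracy(labels, predictions):.2f}")  # 0.95
print(f"recall:   {recall(labels, predictions):.2f}")    # 0.00
```

The model scores 95% "accuracy" while finding none of the contracts anyone cares about, which is why the headline number alone says little about usefulness.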
Efficacy should instead be calculated in the context of the problem being solved. Specifically, the business owners and subject matter experts evaluating AI’s impact on a particular use case first need to understand the impact of the different types of errors an AI model may produce.
This is often referred to as the “precision-recall tradeoff.” The question to business process owners is posed as follows: Is it better for your AI model to flag anything which *could* be an issue (recall-biased), or to only flag instances which are certainly issues (precision-biased)? Understanding these tradeoffs at the outset will help determine training strategies and remediation processes that need to be put in place when errors arise.
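The tradeoff can be illustrated with a small sketch. It assumes a model that outputs a confidence score per item and a threshold above which an item is flagged. The labels, scores, and thresholds below are invented for illustration.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when every item scoring >= threshold is flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for t, f in zip(y_true, flagged) if t == 1 and f)
    fp = sum(1 for t, f in zip(y_true, flagged) if t == 0 and f)
    fn = sum(1 for t, f in zip(y_true, flagged) if t == 1 and not f)
    # Convention assumed here: precision is 1.0 when nothing is flagged.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth (1 = real issue) and model confidence scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.35, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05]

for threshold in (0.3, 0.5, 0.9):
    p, r = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
# threshold=0.3  precision=0.57  recall=1.00
# threshold=0.5  precision=0.75  recall=0.75
# threshold=0.9  precision=1.00  recall=0.25
```

A low threshold is the recall-biased choice: every real issue is caught, at the cost of more false alarms to review. A high threshold is the precision-biased choice: everything flagged is a real issue, but most issues slip through.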
There are a few key success factors that can help users and solution providers overcome these inflated expectations:
- Treat AI like other technology projects – not a godsend. IT projects are not perfect. They require business context to solve business problems and produce business results.
- Identify a business outcome up front. That will help take the focus off the imperfection and put it on how AI can truly help advance the desired business outcome.
- Involve the business SMEs more – and innovation labs less. At its best and most useful, AI is augmenting what SMEs are already doing – making them more productive and making their gut instinct smarter and more accurate.
- Measure the efficacy of existing processes to understand what a reasonable target for the machine learning algorithm’s accuracy is. If your existing process is 90% accurate, 85% accuracy could be a reasonable target, but if your existing process is only 20% accurate, then expecting 85% accuracy is unreasonable.
- Provide and expect model “explainability.” Too many AI solutions operate as a black box. It’s essential that users are able to understand the efficacy of their models in a very detailed way. Decisions need to come with audit trails: Why did the model arrive at this decision? What training data was used to arrive at this decision, and who added it and when? Make it clear what kinds of error modes the model is likely to experience and provide tools to explain the model’s “thought process” when making a decision.
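The advice above to measure the existing process before judging the model can be sketched as follows. This assumes a small sample of decisions for which a panel of SMEs has agreed on a reference answer. The sample, the labels, and the `agreement` helper are hypothetical.

```python
def agreement(reference, decisions):
    """Fraction of decisions that match the agreed reference answers."""
    matches = sum(1 for r, d in zip(reference, decisions) if r == d)
    return matches / len(reference)


# Hypothetical sample of ten cases (A = approve, R = reject).
reference = ["A", "R", "A", "A", "R", "A", "R", "A", "A", "R"]  # SME panel consensus
manual    = ["A", "R", "R", "A", "R", "A", "A", "A", "R", "R"]  # existing manual process
model     = ["A", "R", "A", "A", "R", "R", "R", "A", "A", "A"]  # machine learning model

manual_acc = agreement(reference, manual)
model_acc = agreement(reference, model)

print(f"manual process: {manual_acc:.2f}")            # 0.70
print(f"model:          {model_acc:.2f}")             # 0.80
print(f"difference:     {model_acc - manual_acc:+.2f}")  # +0.10
```

Viewed in isolation, 80% may look "not accurate enough." Viewed against a manual baseline of 70% on the same cases, it is an improvement, and that comparison is the one that matters for the business outcome.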
By taking into account some of the common misconceptions about AI and its accuracy, and taking steps to address them up front, we can turn our organizations’ attention to the real value and productivity improvements AI can deliver and start to put it to work for them.
About the Author: Tom Wilde brings 25 years of experience in solving the complex problems of digital content to his role as CEO of Indico, which provides Enterprise AI solutions for intelligent process automation. Prior to Indico, Tom was the Chief Product Officer at Cxense (see-sense), a leading data management provider, founder of Ramp, an enterprise video content management company, and held senior roles at Fast Search, Miva Systems, and Lycos. Tom is a frequent industry contributor and earned his MBA in Entrepreneurial Management from Wharton.