Mining New Opportunities in Text Analytics
Words. They’re piling up at an enormous rate, but what do they all mean? According to text analytics experts, in the rush to utilize AI, companies are missing the opportunity to use well-established techniques to harness the meanings, trends, and human emotions embodied in all those words, and in the process become a contextually driven enterprise.
Text analytics is a well-trod branch of data mining that essentially turns unstructured text into structured data, using natural language processing (NLP) and other techniques, so that it can be analyzed in an automated and scalable manner. While text is considered unstructured, there is an enormous amount of complexity and nuance contained in high-level human language, which makes text analytics extremely fertile ground for gleaning insights about people and what they’re thinking and feeling.
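To make the idea of turning unstructured text into structured data concrete, here is a minimal sketch in Python. It assumes a simple regular-expression tokenizer and a plain term-count table; the sample comments and function names are invented for illustration and do not come from any vendor's product.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def to_term_counts(documents):
    """Turn each free-text document into a row of term counts."""
    return [Counter(tokenize(doc)) for doc in documents]


# Hypothetical customer comments, invented for this example.
docs = [
    "The delivery was late and the box was damaged.",
    "Great service, fast delivery!",
]

rows = to_term_counts(docs)

# The shared vocabulary becomes the column headers of a table.
vocabulary = sorted(set().union(*rows))

# Document-term matrix: one row per document, one column per term.
matrix = [[row[term] for term in vocabulary] for row in rows]
```

Once the text is in this tabular form, it can be counted, filtered, and joined like any other structured data, which is what makes automated analysis at scale possible.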
Companies have been mining text for decades to identify trends and uncover insights contained within large collections of words. While the technologies behind text mining have evolved a bit thanks to novel machine learning approaches, some of the most effective text mining techniques have not changed significantly in years.
“Don’t fall for the hype that AI will solve all of your text analytics needs,” writes Forrester analyst Boris Evelson in his June round-up of text analytics platforms. “Just the opposite; in this evaluation we found that rules still rule. Mostly rules-based text analytics platforms are much more accurate out of the box and require much less training than platforms based mostly on machine learning.”
While ML is bound to take over some of the work from rules-based approaches in the next three to five years, Evelson writes, today’s top-performing text analytics platforms rely mostly on rules-based approaches, with some machine learning mixed in.
That’s the approach advocated by SAS, which was identified as one of the leaders in the Forrester report, along with IBM, Clarabridge, and Micro Focus (via HP’s IDOL platform). According to SAS text analytics expert Mary Beth Moore, the SAS Visual Text Analytics product uses unsupervised machine learning algorithms to do the first pass on a group of words, and then relies on rule-based techniques to take the next step.
“The unsupervised piece is great for saying, mathematically here are the relationships between terms, but it’s not going to automatically say, this is why it’s important,” Moore tells Datanami. “It doesn’t really ‘mean’ anything else until you start putting in the subject matter expert, the business rules, form a taxonomy, and then you’re adding in some Boolean logic or supervised logic to complement that.”
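The two-stage pattern Moore describes can be sketched roughly as follows. This is not SAS's implementation. It assumes term co-occurrence counts as a stand-in for the unsupervised pass, and a small hand-written taxonomy of Boolean rules as the expert layer. The categories, rules, and sample text are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurrence(documents):
    """Unsupervised pass: count how often two terms share a document."""
    pairs = Counter()
    for doc in documents:
        terms = sorted(set(tokenize(doc)))
        pairs.update(combinations(terms, 2))
    return pairs


# Hypothetical taxonomy written by a subject matter expert:
# category -> (terms that must all appear, terms that must not appear).
RULES = {
    "billing_issue": ({"refund", "charge"}, {"thanks"}),
    "shipping_issue": ({"late", "delivery"}, set()),
}


def apply_rules(text, rules=RULES):
    """Rule-based pass: return every category whose Boolean rule matches."""
    terms = set(tokenize(text))
    return sorted(
        category
        for category, (required, excluded) in rules.items()
        if required <= terms and not (excluded & terms)
    )


# Hypothetical customer messages, invented for this example.
docs = [
    "I want a refund for the double charge",
    "Delivery was late again",
    "Late delivery and a wrong charge, I need a refund",
]

pairs = cooccurrence(docs)
print(pairs.most_common(3))   # term pairs that tend to appear together
print(apply_rules(docs[2]))   # categories assigned by the expert rules
```

The co-occurrence counts show which terms travel together, but it takes the expert-written rules to say that "refund" alongside "charge" signals a billing problem.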
SAS does a lot around automated rule generation, Moore says. “You can just click on terms and the system will automatically generate the rule on which to search,” she says. “It’s a huge step in overcoming the manual nature of the data scientist writing rules. That opens up text analytics applications to people beyond data scientists.”
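As a rough illustration of what generating a rule from selected terms might look like, the sketch below builds a Boolean expression and a matching function from a list of terms. The `build_rule` function and its `mode` parameter are hypothetical names chosen for this example, and do not reflect how the SAS product works internally.

```python
import re


def build_rule(selected_terms, mode="any"):
    """Build a readable Boolean rule and a predicate from selected terms.

    mode="any" joins the terms with OR; mode="all" joins them with AND.
    """
    terms = sorted({term.lower() for term in selected_terms})
    if not terms:
        raise ValueError("select at least one term")
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")

    joiner = " OR " if mode == "any" else " AND "
    expression = joiner.join(terms)
    check = any if mode == "any" else all

    def matches(text):
        """Return True when the text satisfies the generated rule."""
        words = set(re.findall(r"[a-z]+", text.lower()))
        return check(term in words for term in terms)

    return expression, matches


# Terms a business user might have clicked on (hypothetical).
expression, matches = build_rule(["Refund", "chargeback"])
print(expression)
print(matches("Please process my refund"))
```

The point of the design is that the user only picks terms, while the rule syntax is produced for them.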
Growing Use Cases
The text analytics market is projected to grow from $3.9 billion in 2017 to more than $9 billion in 2025, nearly an 18% compound annual growth rate, according to a recent report by QY Reports. Some of the most popular use cases for text analytics are in sales and marketing, such as predicting purchases or calculating lifetime value. It’s also useful in predicting fraud and identifying supply chain or human resources issues.
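For readers who want to see how a compound annual growth rate is derived, here is a short sketch of the standard formula. It assumes the rounded endpoints quoted above ($3.9 billion and $9 billion) and an eight-year span. Those rounded figures work out to roughly 11% a year, so the report's 18% figure presumably rests on endpoints or a period other than the rounded ones cited here.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Rounded endpoints from the article, in billions of dollars (assumed exact here).
rate = cagr(3.9, 9.0, 2025 - 2017)
print(f"{rate:.1%}")
```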
IBM offers what it dubs “cognitive mining” with its Watson Explorer Deep Analytics Edition to find insights hidden in data. The software giant says an insurance company that adopted it achieved 90% accuracy in automatically coding medical claims, which corresponded with a 30% increase in claim processing efficiency and a 20% reduction in mistakenly unpaid claims.
Sentiment analysis is one of the most popular use cases for text analytics, and one of the easiest to get started with, thanks to the wide availability of social media data. Consumers are more likely today to post a complaint on Twitter or Facebook than to contact the company directly, which makes these important channels to monitor with text analytics.
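A minimal sketch of rule-based sentiment scoring on social media posts might look like the following. It assumes a tiny hand-built word list and a one-word negation rule, both invented for illustration. Production systems rely on far larger lexicons and richer rules.

```python
import re

# Tiny hypothetical lexicon, for illustration only.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"late", "broken", "terrible", "rude"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum word polarities, flipping a word that directly follows a negator."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical posts, invented for this example.
for post in ["Great support, fast and helpful",
             "My order arrived late and broken",
             "The agent was not helpful"]:
    print(label(post), "|", post)
```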
Companies that run call centers are also big users of text analytics, as well as the automated question-and-answer dialog that can be powered by more advanced forms of AI. But there are many ways that manufacturers, banks, and retailers have implemented text analytics, which is truly a flexible technology.
But just like everything in data analytics, there’s no silver bullet in text analytics. Instead, success largely hinges on the ability of data scientists and business users to work together to achieve an end. While AI techniques like deep learning have automated some aspects of big data analytics, they have yet to make a big impact on text analytics.
Unfortunately, the hype around AI has overshadowed some of the potential use cases that could be enabled with existing text analytics strategies, SAS’s Moore says. “It’s foundational,” she says. “They overlooked it.”
Getting Started with Text Analytics
Companies that want to get started with text analytics should keep a few things in mind, according to Moore, who cut her teeth on text analytics as an intelligence analyst for the United States Marine Corps. For starters, they should identify what they need to know from the text before they start collecting it.
“The use cases for pulling insights from unstructured text are far and wide,” Moore says. “So once you start actually implementing it, it’s understanding the intent of the information you need to get out of the data, what you plan to do with it, and making sure that it’s connected to the end business user side.”
Once you understand your purpose and have charted some intended outcomes from the analytics, the sky is the limit, Moore says. “From there you can scale your use case,” she says. “You can keep it as narrow as makes sense or you can expand it and scale it to an enterprise level. The process behind it is the same.”