How Big Data Is Changing How Businesses Use The News
In the old days, a folded copy of The Wall Street Journal was the sign of a well-informed executive. My, how times have changed. While the WSJ is still a good read, big data technology is super-charging how businesses use the news.
Some say that news isn’t big data. And while it may be true that the volume of machine-generated data dwarfs what mere mortals can create by hand, machines so far can’t give us the human context that a good reporter can. News, after all, is the first draft of history, as the 20th-century journalist Alan Barth wrote, so if you want to know what people are up to, “the news” is a good place to start.
While the rise of the Internet and citizen bloggers has sapped professional journalists of much of their influence in this new millennium, the Net has also given us access to many more news sources than we could ever hope to get by reading the paper. Now, businesses are figuring out how to gain advantage by pairing intelligent algorithms with this rich set of human-generated data.
One outfit on the cutting edge of news mining is AlchemyAPI, the Colorado-based company that was recently acquired by IBM. Earlier this spring it unveiled Alchemy Data News, a new service that continually scours the Internet for news, indexes it, and then allows users to ask intelligent questions of that data.
“We joke that Alchemy reads the news so you don’t have to,” says Elliot Turner, founder and CEO of AlchemyAPI, and now an IBMer. “It’s really aimed at people who have a question but don’t have the data to answer it with. So instead of coming in with a document, I’m coming in with a question and I’m asking the world’s news and blogs that question.”
The service differs from standard search engines in the way it extracts facts from articles and then makes those facts searchable in a semantic manner. For example, say you wanted to know every politician who travelled to Europe in the past two weeks, or every company that’s been acquired or sued in the past two weeks. There is no way to run those searches with a traditional keyword search engine, Turner says, other than manually procuring lists of politicians and companies and then manually running hundreds of thousands of searches.
Instead of taking that brute-force approach and hoping to find the needle in the haystack, AlchemyAPI uses a more intelligent approach that leverages the corpus of data that hard-working journalists have already assembled.
“We’re enabling semantic search. You can come in and ask much more semantic questions,” he says. “Even if the words ‘acquired’ or ‘sued’ aren’t in there, using these NLP [natural language processing] algorithms, we’ll find relevant content.”
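To make the difference from keyword search concrete, here is a minimal Python sketch of querying facts that have already been extracted from articles. It is an illustration only, not AlchemyAPI's interface: the `Fact` record, the event labels such as `was_acquired`, the `find_entities` helper, and all of the sample data are assumptions invented for this example. The point is that the query filters on a normalized entity type and event rather than on words in the text, so a headline that never uses the word "acquired" can still match.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record for one fact pulled out of a news article.
# The field names and event labels are assumptions made for this sketch.
@dataclass(frozen=True)
class Fact:
    entity: str        # who the fact is about
    entity_type: str   # e.g. "Company" or "Politician"
    event: str         # normalized event label, e.g. "was_acquired"
    published: date    # publication date of the source article
    headline: str      # original wording, kept only for reference


def find_entities(facts, entity_type, events, since):
    """Return sorted names of entities of one type that appear in any of
    the given events in articles published on or after `since`."""
    return sorted({
        f.entity
        for f in facts
        if f.entity_type == entity_type
        and f.event in events
        and f.published >= since
    })


# Invented sample data. Note that neither matching headline contains
# the words "acquired" or "sued".
facts = [
    Fact("Acme Corp", "Company", "was_acquired", date(2015, 6, 10),
         "Globex snaps up Acme Corp"),
    Fact("Initech", "Company", "was_sued", date(2015, 6, 5),
         "Initech hauled into court over patents"),
    Fact("Hooli", "Company", "was_acquired", date(2015, 5, 1),
         "Hooli agrees to buyout"),
    Fact("Jane Doe", "Politician", "travelled_to_europe", date(2015, 6, 12),
         "Jane Doe lands in Brussels for talks"),
]

# "Every company acquired or sued in the past two weeks", with the
# two-week window assumed to start on 1 June 2015.
recent = find_entities(facts, "Company", {"was_acquired", "was_sued"},
                       since=date(2015, 6, 1))
print(recent)
```

In a real system the hard part is the step this sketch skips: the natural language processing that turns raw headlines into records like these.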
Alchemy is continually checking 75,000 news and blog sites and indexing hundreds of thousands of articles per day, with the aim of making the facts included in those stories available to executives to make business decisions. “It allows me to do trend analysis, to extract facts, to get information about my competitors, without actually having someone tasked with going and reading that information,” Turner says. “We’re trying to enable the world’s news to be used as a programmatic signal for the world’s app developers.”
Early adopters include financial services companies that are looking for anything that affects the risks or costs they’re trying to manage or profit from. Retailers and logistics firms are also looking to use it to get a jump-start on emerging trends.
Another big data firm that’s got its eyes on the world’s news sites is Avention. The company (which was formerly called OneSource) monitors about 30,000 news and blog sites as part of its data enrichment service, which also taps into about 70 other procured data sources, such as Dun & Bradstreet, Thomson Reuters, Morningstar, and LexisNexis.
As Ray Renteria, vice president of product at Avention, describes it, the company does the hard work of monitoring various public and private data sources to give sales and marketing professionals an insider’s view into important activities occurring inside companies and organizations.
“Companies use our application and our data to both prospect for opportunities and also to provide deep insight for the sales engagement portion,” he says. “When a sales rep picks up the phone to reach out to a customer, they have an understanding of that company that their competitors might not have. They will understand recent news and product launches. They’ll also have enough information to navigate the account by understanding…the corporate hierarchy.”
Like AlchemyAPI, Avention has done the hard work of ensuring that the data it procures for customers can be easily integrated and used as a signal for a predictive application. Aggregating and parsing the news is critical to the “business signal development framework,” as the Massachusetts company calls it.
And while Alchemy and Avention are breaking down human-generated news into data that can be plugged into machines to automate various processes, other firms are working the other way: taking data collected by machines or people and using it to generate narratives fit for human consumption.
That’s part of what’s done in Quill, a data-understanding and storytelling system developed by an Illinois firm named Narrative Science. “It’s not just a natural language understanding platform–it’s a narrative generation platform, where the system itself understands what’s going on in the world, then explains what’s going on in the world,” says Narrative Science co-founder and chief scientist Kris Hammond.
Recent advances in the field of deep learning have made systems like Quill possible, he says. “For recognition-based learning systems, we have the ability for things to learn from massive amount of data, from massive number of examples, and huge propagation networks, and that’s terrific,” Hammond says. “It’s incredibly exciting to look at the world of big data and realize that we can now take…a combination of things we did in the past along with the analytic techniques that are currently in place to explain things based on this incredibly rich mass of data that people are amazingly frustrated with because of the gap between their data and their ability to understand it.”
There’s a lot going on in the world, and journalists and bloggers are often the first ones to tell us about it. Thanks to big data technology, it’s becoming more practical to incorporate that field intelligence–that first draft of human history, if you will–into the predictive models that executives increasingly rely on to make good, fast decisions.
This represents a radical re-thinking of business intelligence, according to AlchemyAPI’s Turner. “The idea of BI measuring what’s going on inside my business has been around for so long. I have sales performance, broken down by region, clicks on website, etc.,” Turner says. “But it only shows you part of the picture. How can I understand what’s going on with my funnel metrics if there’s something my competition has done outside my four walls, that’s in the world’s news but outside my BI system? Making this information accessible, making it so you can hyper segment based on time and space and topic and other things, we think, is going to disrupt what’s possible for allowing people to not only listen to what’s going on in public, but to use that information to drive automated processes.”
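What segmenting the news by time, space, and topic might look like in practice can be sketched in a few lines of Python. This is an assumption-laden illustration, not anything Turner or AlchemyAPI describes in detail: the article tuples, the company names, the `segment_counts` and `companies_to_flag` helpers, and the threshold of two mentions are all made up for this example.

```python
from collections import Counter
from datetime import date

# Invented article records: (published, region, topic, company mentioned).
articles = [
    (date(2015, 6, 1), "EU", "product_launch", "RivalCo"),
    (date(2015, 6, 2), "EU", "product_launch", "RivalCo"),
    (date(2015, 6, 2), "US", "lawsuit", "RivalCo"),
    (date(2015, 6, 3), "EU", "product_launch", "OtherCo"),
    (date(2015, 5, 1), "EU", "product_launch", "RivalCo"),
]


def segment_counts(articles, start, end, region, topic):
    """Count articles per company inside one time/region/topic segment."""
    return Counter(
        company
        for published, art_region, art_topic, company in articles
        if start <= published <= end
        and art_region == region
        and art_topic == topic
    )


def companies_to_flag(counts, threshold):
    """Return sorted companies mentioned at least `threshold` times."""
    return sorted(name for name, n in counts.items() if n >= threshold)


# Segment: European product-launch coverage in the first week of June 2015.
counts = segment_counts(articles, date(2015, 6, 1), date(2015, 6, 7),
                        "EU", "product_launch")

# The flagged list is the kind of signal that could feed a dashboard or
# kick off an automated process, alongside internal BI metrics.
flagged = companies_to_flag(counts, threshold=2)
print(flagged)
```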