December 6, 2011

Digital Reasoning Refreshes Text Analytics

Datanami Staff

This week another analytics company raked in a fresh round of funding: Digital Reasoning, a provider of entity-oriented text analytics software. While the amount has not been publicly stated, the small Tennessee-based company has scored some major federal contracts since 2004, and it claims the new investment will let it push smarter text analytics into the enterprise realm.

Digital Reasoning’s tagline describes the company as a provider of “automated understanding for big data,” a direct reference to the core of its business, the Synthesys technology. Its focus on entity-oriented analytics means it also aims to answer questions that haven’t been asked yet, a capability that holds great promise in the big data era.

According to the company’s CEO and founders, Digital Reasoning was founded with the idea that software should be intelligent enough to both read and understand text as humans do. They applied that work to combing through vast amounts of unstructured textual data from U.S. intelligence agencies in search of potential threats to national security. This purpose-built software for entity-oriented analytics became the core of the Synthesys technology at the root of their business.

Synthesys has four main components: the data ingestion platform, a scalable infrastructure piece, the entity-oriented analytics engine, and the visualization/workflow elements. The software pulls in both structured and unstructured text data and then looks for predefined elements in temporal, geographic and other contexts to come up with an “understanding” of those entities and their relationships to…well, the universe.
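For readers who want a feel for what "entity-oriented" means in practice, here is a deliberately tiny Python sketch. It is not how Synthesys works. The hard-coded `KNOWN_ENTITIES` list, the year pattern, and the function names `extract_entities` and `link_entities` are all assumptions made for illustration, and simple co-occurrence counting stands in for whatever relationship analysis the real product performs.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy lookup table standing in for "predefined elements".
# A real system would not rely on a hard-coded list like this.
KNOWN_ENTITIES = {
    "Digital Reasoning": "ORG",
    "Tennessee": "PLACE",
    "Synthesys": "PRODUCT",
}

# Very rough temporal cue: any four-digit year from 1900 to 2099.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_entities(text):
    """Return a sorted list of (mention, type) pairs found in the text."""
    found = {(name, kind) for name, kind in KNOWN_ENTITIES.items() if name in text}
    found |= {(year, "DATE") for year in YEAR.findall(text)}
    return sorted(found)


def link_entities(documents):
    """Count how often each pair of entities appears in the same document."""
    links = Counter()
    for doc in documents:
        names = [name for name, _ in extract_entities(doc)]
        links.update(combinations(names, 2))
    return links


# Sample sentences paraphrased from this article.
docs = [
    "Digital Reasoning is a Tennessee-based company.",
    "Digital Reasoning has held federal contracts since 2004.",
    "Synthesys is the core technology of Digital Reasoning.",
]
links = link_entities(docs)

for pair, count in sorted(links.items()):
    print(pair, count)
```

Even at this scale, the output is a small graph of entities tied together by the documents they share, which is the kind of structure that link analysis then explores.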

In essence, these bits of understanding let users step up a level to determine further relationships and see what is important. As the company claims, it “automates this understanding of large complex data by eliminating the requirement for taxonomy/ontology and combining a number of formerly point solutions into an integrated entity oriented analytics solution.”

As noted, the software grew out of work for U.S. intelligence operations. Its named entity recognition, geographic and temporal “reasoning”, contextual search, link analysis and foreign language support made it rather attractive, especially given that much of the data requiring textual analysis was in Arabic and other languages.

According to a spokesman from the U.S. Army Biometrics Intelligence Program, during the early testing of the software “against classified message traffic, the system was used against terrorist data to explore extracted concepts. With little training the system was cross-linking complex names, locations and events.” This meant that the team was not only able to find answers to the questions it already had, but could also find new connections that prompted questions it might never have thought to ask.

Part of the technical appeal of the company’s offering is that it plays well with frameworks that are already in place for dealing with large volumes of unstructured data. It integrates with Hadoop, HBase, Cassandra and other architectures.