June 11, 2012

Big Data is All in the Semantics

Datanami Staff

Last week the annual Semantic Technology Conference wrapped up in San Francisco, leaving attendees with a wealth of information about the use of semantic technologies in healthcare, financial services, life sciences and beyond. From tapping this tech for sentiment analysis to wrapping semantics around complex events like disasters, this was a conference that was packed with real-world applications.

According to Luca Scagliarini, Vice President of Strategy at Expert System, a small company trying to make big waves in the semantic field, the set of technologies around semantic understanding of data “are made for the growing amounts of diverse types of information that fall outside the realm of ERP, CRM and other structured formats. The advantage of the semantic approach is its ability to provide access to unstructured information that, when integrated into your existing data, adds great depth and insight.”

Expert System, which got its start in Italy in 1989, found its way onto the desktop systems of millions through its early Microsoft Office partnership. The company integrated its linguistic and semantic technologies into the software suite and has since gone on to reach a number of Fortune 500 customers via its flagship Cogito semantic platform. Chevron, Raytheon, ANA, and Telecom Italia are among the company’s top-tier clients.

We caught up with Scagliarini and the company’s CTO, Marco Varone, to discuss what users require from semantic platforms, how those platforms are being used, and how a smaller company with a pure-play offering can compete with the titans who are integrating semantic tools into their existing analytics and BI platforms.

What are some of the most unique uses of semantic technologies (using unstructured data) you’ve seen to date?

We are seeing a lot of innovative ways to effectively use unstructured information inside the enterprise. Often these applications are a bit obscure for the average person. For example, instead of going through the cumbersome process of manually adding metadata to any document or presentation produced, we are seeing applications for sharing and transferring knowledge within a company using automatic tagging and categorization.

We are also seeing innovative and (finally) strategic use of social media information taken from the communication “streams” (as I like to call them) between customers and the company. This is not sentiment analysis, but much more. By mining social media, RSS feeds and other sources, our customers are using semantics to identify signals that market dynamics are changing. In turn, they are able to measure the impact of marketing campaigns or to manage global supply chain risks, both strategic activities.

In a recent news release, you mention that you are building on the “strength” of Hadoop, but that it lacks the ability to let users tap into analytics/NLP capabilities. Can you explain this in more detail?

Traditionally Hadoop is a good approach to structured information. But today’s big data is more than rows and columns, and a lot of insight can be derived from text. For those who are using Hadoop (or are planning to), semantic components are now available (and easily integrated) for analytics, mining and natural language processing.

How does a platform like yours disambiguate text in real time? In other words, what powers your platform’s ability to be accurate?

Our ability to be accurate is due to the strength of our semantic engine, which processes text and performs morphological, grammatical and logical analysis (technically, it is a deep parser), and the richness of our proprietary semantic network. Our semantic network includes all the concepts of a language and all the relationships that exist between these concepts in a language. It has more than 400,000 concepts and 2 million links (relationships) for the English language alone (and English is only one of the languages we cover), and a rich set of attributes for each concept. Both the engine and the network are optimized to ensure fast analysis and high precision (more than 90% correctness) in disambiguation.
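To make the idea of disambiguating against a semantic network concrete, here is a minimal sketch in Python. It is not Expert System's engine or its deep parser: it uses a simplified, Lesk-style overlap heuristic over a tiny hand-made network, and every concept name, link, and function in it is hypothetical and invented for illustration.

```python
# Toy semantic network: each concept (word sense) links to related terms.
# All concept names and links are invented for illustration only.
SEMANTIC_NETWORK = {
    "bank/finance": {"money", "loan", "account", "deposit", "interest"},
    "bank/river": {"river", "water", "shore", "fishing", "mud"},
}

# Map from a surface word to the candidate concepts it may denote.
SENSES = {"bank": ["bank/finance", "bank/river"]}


def disambiguate(word, context):
    """Return the candidate concept whose linked terms overlap the context most.

    Returns None when the word has no entry in SENSES. On a tie, the first
    candidate listed in SENSES wins.
    """
    candidates = SENSES.get(word.lower())
    if not candidates:
        return None
    # Lowercase the context and strip basic punctuation from each token.
    context_terms = {token.lower().strip(".,;:!?") for token in context.split()}
    # Score each sense by counting linked terms that appear in the context.
    return max(candidates, key=lambda c: len(SEMANTIC_NETWORK[c] & context_terms))


print(disambiguate("bank", "She opened an account at the bank to deposit money."))
print(disambiguate("bank", "We went fishing on the muddy bank of the river."))
```

The first call selects the financial sense and the second the river sense, because those senses share the most linked terms with the surrounding words. A production system of the kind described above would work from parsed sentence structure and a network with hundreds of thousands of concepts, not from bag-of-words overlap.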

How will you compete with larger analytics platform vendors that already have text analytics integrated and ready to go?

Applying full semantic analysis and disambiguation allows you to achieve better performance on all of the text analysis activities (extraction and categorization) and also on search. This is very visible in tests comparing our semantic approach to the statistical or even linguistic approaches that most traditional technologies use. In addition, our technology is easy to integrate into an existing infrastructure (Microsoft SharePoint, for example), and this lowers the risk for customers who choose us.

To review some of the highlights of the Semantic Technology Conference where Expert System and others presented and exhibited platforms and semantic tools, you can stop by the main site’s listing of the week’s topics.