March 18, 2013

High Performance Big Data Use Cases

Isaac Lopez

Big data means a lot of things to a lot of different people, but as the largest market players’ strategies start to unfold, it is becoming increasingly clear that big data is about real-time analysis and data-driven decision-making.

As the saying goes, “time is money,” and big data installations aren’t cheap.  IBM’s James Kobielus published an article recently detailing the “hardcore big data use cases,” where big data is being applied at the extreme scale. 

Datanami examines these use cases here, giving enterprise managers food for thought on maximizing system and application ROI in the big data world.

 

Whole Population Analytics

Imagine being able to do real-time, interactive analytics on an entire customer base.

“Until big data came our way,” says Kobielus, “few data scientists have had the luxury of being able to amass petabytes of data on every relevant variable of every entity in the population under study.” This, however, is changing, he notes, as tools like Hadoop gain in popularity and the prices of storage, processing and bandwidth decrease.

“We will be able to do true 360-degree whole-population analysis,” says Kobielus, discussing the ability to have interactive access to the entire population of analytical data, rather than samples, subsets, or slices.

Microsegmentation Analytics

“Storing petabytes of data and having it accessible in real time means you can gain an ‘X-ray view’ of what’s going on inside their heads,” says Kobielus in explaining microsegmentation analytics. It’s the logical next step after whole-population analytics, giving enterprises the ability to segment by sentiments, propensities and experiences.

With the ability to drill “into the entire aggregated population of, say, customer data, including rich real-time behavioral data,” says Kobielus, enterprises will be able to “do more fine-grained target marketing, nuanced customer experience optimization, and context-sensitive next best action.”

Long-tail analytics also becomes more powerful, explains Kobielus, giving enterprises access to overlooked product niches of keen interest to specific customer segments.

 

Behavioral Analytics

A broad field with a wide range of applications, behavioral analytics is a powerful, extreme-scale use case where real-time analytics is necessary.

Social graph analysis is the best-known example of behavioral analytics, says Kobielus, which powers an array of applications where complex behavioral patterns must be rapidly identified, including:

  • Anti-fraud
  • Influence analysis
  • Sentiment monitoring
  • Market segmentation
  • Engagement optimization
  • Experience optimization
  • And others…

“Graph models are powerful enablers for fine-grained predictive modeling of human behaviors because they help identify the likely behaviors of individuals in their fuller context of groups, relationships, and influence,” explains Kobielus.

By examining individual behaviors, organizations are able to predict responses and optimize outcomes.
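As a rough illustration of the graph idea Kobielus describes, the sketch below builds a tiny social graph and scores each person by degree centrality, one simple proxy for influence. The names, the edge list, and the choice of degree centrality are assumptions made for illustration; they do not come from Kobielus, and a production system would work with far larger graphs and richer models.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Return, for each node, the fraction of other nodes it touches directly."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Hypothetical interaction data: each pair is one observed connection.
edges = [
    ("ana", "ben"),
    ("ana", "cal"),
    ("ana", "dee"),
    ("ben", "cal"),
    ("dee", "eli"),
]

graph = build_graph(edges)
scores = degree_centrality(graph)
most_connected = max(scores, key=scores.get)
print(most_connected, scores[most_connected])
```

Here the most connected individual is a candidate "influencer" whose behavior may be worth examining in the context of the group, which is the kind of relationship-aware view the graph model is meant to provide.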

Unstructured Analytics

Easily the most talked-about use case for big data is unstructured analytics, the “Variety” branch of the “Three V” triumvirate.

“The sheer size of unstructured formats, compared to structured relational data, makes managing it a big-data core use case from the word ‘go’,” says Kobielus. The list for this type of data is as exhaustive as a Bubba Gump menu:

  • Enterprise content management systems data
  • Social media data
  • Text data
  • Blog data
  • Log data
  • Sensor data
  • Event data
  • RFID data
  • Imaging data
  • Video data
  • Speech data
  • Geospatial data
  • More…

Gaining semantic insights into all this data is resource-heavy, requiring intensive processes such as natural language processing, text mining, and machine learning.
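To make "text mining" slightly more concrete, here is a minimal sketch of one of its most basic steps: tokenizing free text and counting term frequencies. The sample documents and the small stopword list are invented for illustration, and real pipelines would add far more (language detection, stemming, entity extraction, and so on).

```python
import re
from collections import Counter

# Hypothetical stopword list; real systems use much larger ones.
STOPWORDS = {"the", "a", "is", "and", "to", "of", "it", "was"}


def term_frequencies(documents):
    """Count how often each non-stopword term appears across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts


# Made-up customer comments standing in for unstructured text data.
docs = [
    "The delivery was late and the box was damaged.",
    "Late delivery again, but support was helpful.",
    "Support resolved the damaged box issue.",
]

freq = term_frequencies(docs)
for term, count in freq.most_common(5):
    print(term, count)
```

Even this toy version has to touch every token of every document, which hints at why the same work becomes resource-intensive at petabyte scale.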

Multistructured Analytics

If getting real-time semantic information from a stream of unstructured data weren’t process-intensive enough, putting that data into a contextual framework that includes several other types of both structured and unstructured data creates a host of new resource-intensive challenges.

Multistructured analysis, says Kobielus, refers to applications that require unified discovery, acquisition, storage, management and analysis of all data types. 

“For example, customer influence analysis often needs to mine unstructured social media alongside semi-structured call-center logs, structured transaction data and various geospatial coordinates.”

The benefit of this type of analysis is the creation of powerful relationship graph models for behavioral segmentation, as well as a deeper appreciation for customer awareness, sentiments, and propensities, says Kobielus.

Temporal Analytics

Data happens in real time, and its semantic meaning often depends on both what is happening and when it is happening. Temporal analytics aims to corral and correlate that data, providing a window of analysis as it relates to different aspects of time: historical, current, and predictive.

“Businesses require a 360-degree view of the world through the customer’s eyes that is updated moment-to-moment,” explains Kobielus. “Ideally, you’ll need to roll up a unified view that combines everything you already know about the customer with everything new that you can glean from their real-time online behavior, plus everything that you can predict about their likely behavior under various future scenarios.”

The compute power needed for this is considerable. Kobielus explains that these real-time customer experience optimizations require an automated decision-making process that leverages several different data streams, including historical transactions, real-time clickstreams, and predictive behavioral processes that continuously tune what the customer sees.

Multivariate Analytics

Multivariate statistical analytics is a huge field with high performance implications. Enterprises using multivariate analytics are looking for detailed, interactive, multidimensional statistical analysis, and are trying to find correlations among the variables.

“This requires a big data platform that can execute these models in a massively parallel manner,” says Kobielus.

Included in this category are the following uses:

  • Aggregation
  • Correlation (historical and current data)
  • Modeling & Simulation
  • What-if analysis
  • Forecasting of alternative future states
  • Semantic exploration of unstructured data (including real-time streaming & multimedia)

“Regression analysis, market basket analysis and other mainstays of advanced analytics all fall into this category,” says Kobielus.
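As a small illustration of the correlation use listed above, the sketch below computes pairwise Pearson correlation coefficients across a handful of variables. The variable names and values are invented for illustration, and the code runs on a single machine; it does not attempt the massively parallel execution Kobielus refers to.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every pair of named columns; keys are (name_a, name_b)."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical per-customer measurements.
data = {
    "visits": [1, 2, 3, 4, 5],
    "spend": [10, 20, 30, 40, 50],
    "returns": [5, 4, 3, 2, 1],
}

matrix = correlation_matrix(data)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(a, b, round(r, 3))
```

Each pair of variables is computed independently of the others, which is one reason this kind of workload lends itself to parallel execution as the number of variables grows.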

Multi-scenario Analytics

In the 1983 movie, “WarGames,” Matthew Broderick hacked into a supercomputer named WOPR (War Operation Plan Response; a.k.a. “Whopper”) capable of predictive modeling for nuclear war scenarios. This supercomputer was using multi-scenario analytics to plan responses to possible nuclear attacks.

The emergence of big data enables this type of planning in the enterprise, giving executives the ability to engage in what-if analysis and forecast possible future outcomes.

“Some of the more sophisticated data-science initiatives involve building complex models of multiple linked business scenarios across different business, process and subject-area domains,” explains Kobielus. He explains that employing high performance big data gives enterprises the ability to use such key features as strategy maps, ensemble modeling, and champion-challenger modeling.

Kobielus explains that enterprises taking this route will need to “develop models against multiple information types, including unstructured content and real-time event streams, while leveraging state-of-the-art algorithms in sentiment analysis and social network analysis.”  

Sensor Analytics

Often dubbed “The Internet of Things,” this area of analytics refers to the growing network of internet-connected sensors that continuously send feedback to a central repository for aggregation, correlation, and analytics.

“Some call it the ‘RFID Internet,’” says Kobielus, explaining that the sensors can be used to give digital identities to every component, subassembly and product within online supply chains.

Analysts say that sensor data will outstrip every other form of data collection in sheer size, bringing us into the era of the “brontobyte” (one billion exabytes) in the not-so-distant future. In the big data era, sensor data is a way to quantify the external world so that it can be fed to number-crunching algorithms.

“We find sensor analytics in medical monitoring, traffic management, hazard protection, emergency response, security incident and event monitoring, and many other critical real-world applications,” says Kobielus.
