Big Data • Big Analytics • Big Insight

June 19, 2013

IDC Talks Convergence in High Performance Data Analysis

Isaac Lopez

At the International Supercomputing Conference (ISC’13) this week, convergence is in the air, as many discussions touch on the merging of traditional high performance technical computing with the rising data tides of the enterprise. Putting data and use cases on display, the analysts at IDC gave their view of this convergence space – and shared their own name for it: High Performance Data Analysis (or HPDA).

As we discussed yesterday, the challenges of big data are nothing new to the technical users of high performance computing (HPC). But with the emergence of big data, enterprises are finding that they, too, require advanced data-intensive simulations and analytics to pan value from their data before the opportunity passes them by. For some vendors, the resulting swing in bottom-line dollars can approach a billion.

IDC analyst Steve Conway says that their shiny new term covers data-intensive modeling and simulation (as it has long been practiced in the original big data market, High Performance Computing), as well as the newer analytical methods gaining adoption among commercial entities moving into HPC for the first time.

Conway explained that within this convergence space, a set of use cases recurs often enough to be viewed as pursuable markets. These include the following:

  • Fraud and error detection
  • National security and crime fighting
  • Health care and medical informatics (including drug design, personalized medicine, outcomes-based diagnosis, and systems biology)
  • Customer acquisition/retention
  • Smart electrical grids
  • Design of social network architectures

Speaking to a specific use case, he mentioned PayPal, which is tasked with detecting fraud across all of the considerable eBay properties. “They were using Hadoop, Cassandra, and the usual stuff to try to detect fraud, but their volumes were so large that it took them two weeks – too late, the train left the station,” Conway exclaimed.

PayPal eventually concluded that to find suspicious patterns in related data sets in real time, they would need an HPC approach. This meant developing the complex algorithmic component to detect the patterns, as well as acquiring the hardware to run it on (not to mention the talent). Acquiring everything from InfiniBand networking to SGI storage and Altix ICE clusters, PayPal found themselves squarely in the HPC domain. The result, said Conway, was that in their first year of implementation they detected $710 million in fraud that they wouldn’t have been able to catch previously.
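PayPal’s actual algorithms aren’t public, but the kind of “suspicious pattern” check involved can be illustrated with a toy example. The sliding-window velocity rule below (the account IDs, window size, and threshold are all invented for illustration) flags accounts that transact faster than a set rate; the challenge Conway describes is running checks like this across PayPal-scale transaction volumes in real time rather than in a two-week batch.

```python
from collections import defaultdict, deque

# Illustrative sketch only: a toy sliding-window velocity check, one of the
# simplest fraud-screening signals. PayPal's real HPC pipeline is not public;
# the window and threshold here are invented for the example.
WINDOW_SECONDS = 60
MAX_TXNS_PER_WINDOW = 5

recent = defaultdict(deque)  # account id -> timestamps of recent transactions

def is_suspicious(account, timestamp):
    """Return True once an account exceeds the velocity threshold."""
    q = recent[account]
    q.append(timestamp)
    # Drop transactions that have fallen out of the sliding window.
    while q and timestamp - q[0] > WINDOW_SECONDS:
        q.popleft()
    return len(q) > MAX_TXNS_PER_WINDOW
```

A single-node version like this is trivial; the HPC problem is sharding state like `recent` across a cluster and evaluating far richer patterns over related data sets with sub-second latency.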

Of course, when you manage to deliver an ROI figure of that magnitude, finding ways to expand that goodness becomes a priority. Conway says that PayPal is looking to do just that, first by hiring more HPC people “as fast as they can find them.” He says they’re also planning to expand HPC use within the company into two more application areas: personalization of the consumer experience and management of their large systems infrastructure – “the whole thing,” said Conway.

Insurance quote giant Geico has also peeked into the use of HPC. Apparently, Geico is using a high performance cluster as part of its backbone process for insurance quoting. “Every Friday, Saturday, and Sunday for sixty hours of clock-time, Geico recalculates for every insurance product they have to create a quote for every adult American and household,” he explained. Every Monday morning the data has been crunched and a pristine new quote is ready for whoever happens to dial in that week. Conway noted that while this use case isn’t real-time, the amount of data and processing power required puts it squarely in the HPC wheelhouse.
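The weekend batch Conway describes is a classic embarrassingly parallel workload: each (product, household) pair can be priced independently, so the job spreads cleanly across a cluster. The sketch below shows only the shape of such a batch; the product names, household count, and pricing function are placeholders invented for illustration, not anything Geico actually runs.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

PRODUCTS = ["auto", "home", "renters"]   # placeholder product lines
HOUSEHOLDS = range(1000)                 # stand-in for household records

def price_quote(pair):
    """Price one (product, household) pair; a real job would run an actuarial model."""
    prod, household = pair
    return (prod, household, 100 + household % 900)  # fake deterministic premium

if __name__ == "__main__":
    # Each pair is independent, so the pool can fan them out across cores --
    # or, at Geico's scale, across cluster nodes for the sixty-hour weekend run.
    with ProcessPoolExecutor() as pool:
        quotes = list(pool.map(price_quote, product(PRODUCTS, HOUSEHOLDS), chunksize=100))
    print(len(quotes))  # one quote per (product, household) pair
```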

Addressing the server forecast for the HPDA market space, the IDC analysts showed that the space is still a sliver of the overall HPC market. For 2013, IDC predicts that the overall HPC market will be approximately $11.4 billion, with HPDA representing 6.9% of that, or $786 million. By 2015, IDC forecasts that HPDA will comprise approximately 7.3% of the HPC market’s projected $13.48 billion, or $989 million. Conway pointed out that while the percentages are still in single digits, the compound annual growth rate between 2010 and 2015 is projected at a healthy 10.4 percent.
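The quoted shares and dollar figures line up, which a quick check confirms (all inputs are the article’s own numbers):

```python
# Sanity-check the IDC figures quoted above (inputs are the article's numbers).
hpc_2013, hpda_share_2013 = 11.4e9, 0.069
hpc_2015, hpda_share_2015 = 13.48e9, 0.073

hpda_2013 = hpc_2013 * hpda_share_2013
hpda_2015 = hpc_2015 * hpda_share_2015

print(f"HPDA 2013: ${hpda_2013 / 1e6:.0f}M")  # ~$787M, matching the $786M quoted
print(f"HPDA 2015: ${hpda_2015 / 1e6:.0f}M")  # ~$984M, close to the $989M quoted

# A 10.4% CAGR over 2010-2015 ending at $989M implies a 2010 base of
# roughly $989M / 1.104**5, i.e. about $600M.
print(f"Implied 2010 base: ${989e6 / 1.104**5 / 1e6:.0f}M")
```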

Of course, the concept of HPDA is not new. IDC’s own figures show that as far back as 2009, this emerging segment represented 6.2% of the HPC market. Said Conway, “we’re not terminological inventors…fewer terms are usually better, but we ran into an issue that required us to do this, which is that this is turning out to be a convergence market in the sense that there are two communities coming together here. So high performance data analysis is the term that we use – it seems to be a compromise term that both sides are ok with.”

Related items:

HPC Analysts: Hadoop is a Tall Tree in a Broad Forest 

Turning Big Data Into Information with High-Performance Analytics 

On Algorithm Wars and Predictive Apps