February 15, 2013

IBM Targets Virtualized Big Data

Isaac Lopez

Big data and virtualization are two technologies that together show promise for improving how efficiently and effectively organizations can dissect their data and turn it into useful analytics. However, where real-time data streaming is involved, virtualization is often considered synonymous with degraded performance – particularly in the world of x86 (at least according to IBM as it cross-examines architectures).

This performance degradation can mean hiccups in the data stream that result in compromised data fidelity and the poisoning of the predictive analytics well.

IBM says that its new Power and Pure Systems address these performance issues, giving organizations the power to stream real-time data without the loss of performance that causes skewed results. Based on the new POWER7+ processors, the new systems, IBM says, make the link between real-time data streaming and loss of performance a thing of the past.

Datanami recently spoke with Ian Jarman, IBM Enterprise Systems Product Manager, and Nancy Kopp, IBM Director of Big Data Strategy, about their recent announcements surrounding Power and Pure Systems to find out more about how their offerings (utilizing IBM’s Power7 architecture) are optimized to handle big data workloads.

“The Power7 chip is optimized for big data performance,” explained Jarman. “We’ve optimized our Cognos and SPSS solutions from IBM Software Group on Power. Cognos, for example, gets approximately 40% benefit vs. x86 running on Power because of the optimizations as well as the features of the Power7 chip (which include the larger number of threads, higher L3 cache, etc.).”

The POWER7 grew out of a $244 million DARPA contract to develop a petascale supercomputer architecture, and its features may prove to be a boon for businesses that need real-time predictive analytics, says IBM. Jarman explained that the new family of Power Systems is optimized for IBM’s analytics software, offering performance benefits that differentiate it from x86 architectures used for similar tasks. “PowerVM and its efficiency is a key differentiator between IBM Power Systems and x86,” Jarman told Datanami, explaining that the new Power 750 and Power 760 members of the Power Systems family are specifically designed for the virtualized environment.

“When you think about how people are consuming business intelligence and analytics – you need that low latency, high concurrency, really fast ingest. The combination of Power7 and DB2 allows us to take advantage of that,” explained Kopp, referring to the interaction between IBM’s Power7 chip and its relational database server. “Having the ability to do both data in motion and data at rest, with the increased capabilities that we’re bringing to the market with the PureData System for Analytics on performance – and density – really pulls together a powerful solution for our customers. That’s what makes the solution unique compared to Oracle and Teradata, because you also wrap around all the data integration and governance, and some of the capabilities that we bring to the table, and having access to the data in much more of a virtualized architecture than we’ve ever really seen in the past.”

With this new family, IBM says it focused on key areas that benefit big data usage, including faster analytics performance. The company says it accomplished this through a redesign that increases scan rates and throughput across the entire system, leaving it balanced and tuned for performance out of the box. It also indicates that scaling was an important consideration in the design of the systems.

“Data growth is a key challenge that our customers are facing,” commented Kopp. “A customer like the New York Stock Exchange might see data growth of 100 to 200 percent and need systems that are significantly better from an efficiency standpoint. Our new system focuses on datacenter efficiency from an analytics standpoint, and has 50% more capacity than our previous system. It’s far more dense, has overall better capacity, and draws less power. It’s more dense than both Oracle and Teradata systems on the market today – so much more efficient.”

IBM says that telcos and utility companies are among the targets for its PureData System for Analytics running on the new Power Systems. “A telco provider has a lot of network data,” commented Kopp. “If they see anomalies, they need to pull [the network] down and maybe leverage SPSS to make a predictive model to be better aware of where the network failure may occur the next time. Once a model is created, [the administrator] can operationalize and push it out via the PureData System for Operational Analytics where there may be thousands of concurrent users at the same time.”

“A lot of customers, I’m hearing, are struggling with how difficult it is to manage these [x86] systems,” added Kopp. “There has been no more difficult time in terms of agility and keeping administrative costs down than now – and we’re seeing that with Hadoop.”

IBM says its price point should help address these cost concerns and offers another point of comparison. “For the first time, we are going to have an entry price point under $6,000 for Power7+ based servers – this is going to make it very price competitive with x86 workloads running on AIX or IBM i,” added Jarman. “That’s a big deal in addition to the performance.”
