IBM Fellow Tracks HPC, Big Data Meld
A quick glance at the agenda for last month’s International Supercomputing Conference (not to mention last November’s sessions for the annual Supercomputing Conference) reveals that the high performance computing community is finding renewed vigor in the debates about the “big data” problems enterprise and research users face.
From sessions exploring specific data-intensive applications to those broaching novel ways to speed and automate the underlying infrastructure, big data has increasingly been the star of the HPC show(s). This trend is expected to continue as HPC-oriented storage, network, system and software vendors make the case for their ability to handle data-intensive workloads that mine both structured and unstructured data.
This is not a surprise since the world’s top parallel supercomputers are designed for maximum performance on datasets that are larger than many enterprise users will ever encounter. From massive physics simulations to astrophysics research that recreates the universe’s birth, these systems are designed to quickly plow through incredible amounts of data.
In reality, the “big data” challenges (among them barriers of data volume, variety and velocity) that are receiving so much attention are problems the HPC community has been exploring for years. It’s no wonder that vendors in this arena, at both the systems and software levels, are finding an easy route to communicating the value of their wares to data-laden enterprise customers.
Dr. Guru Rao is an IBM Fellow in the Systems and Technology Group and leads the direction of the company’s enterprise systems arm. In a recent piece on the meld of HPC and big data, he claims that the convergence researchers are experiencing is also playing out in enterprise settings, across a growing range of industries.
IBM cleaned up both on this year’s Top500 list of the fastest supercomputers and on the newer Graph 500 benchmark.
Rao says that elements of the datacenter that once served small niches in the market are finding their way into more mainstream discussions as more companies look to advanced analytics and high performance hardware to solve their challenges. He says that since the datacenter is the hub that handles the surge of structured and unstructured data, it is forcing a rethink of how IT infrastructure is conceived and implemented.
To highlight this, Rao points to high performance computing, which he admits has often been the distinct realm of the national labs, universities and the top-tier companies that could afford supercomputing technology. He notes that the increased focus on advanced analytics and higher performing systems is revealing a mesh between high performance and technical computing—that mainstream enterprise users are using HPC for “workloads that require a lot of computing power and data, such as simulations, computer modeling and analytics.”
Rao continued, noting:
“Last Wednesday, IBM and LLNL announced a collaboration to help industrial partners use HPC to boost their competitiveness in the global economy. IBM will make researchers available to work hand-in-hand with businesses and organizations on specific projects in such areas as improving our electric grid, advancing manufacturing, discovering new materials and leveraging Big Data. This new collaboration will address a need for increased access to supercomputers expressed by businesses and government institutions that need to quickly process very large data sets.
As supercomputers improve, they become dramatically better in terms of affordability, performance and size. Sequoia enables a company or university to acquire a petaflop computer in just five racks. In contrast, if you used the technology in the second fastest computer in the world, it would take approximately 80 racks to create a one petaflop system. The difference in cost, size and power consumption is something that a company can apply for immediate competitive advantage.”