RAID on Enterprise Big Data
This week Dr. Ercan Kamber, CTO of high performance storage company RAID, Inc., released a research document that shows how both HPC and big data workloads can benefit from parallel file systems. In the report, Kamber also describes the architectural differences between Lustre and GPFS.
RAID, Inc. has been on the high performance computing scene since the mid-1990s with a line of I/O-targeted products, including solutions for storage and interconnect issues, mainly in HPC environments. Like many companies with a strong foothold in high performance computing, however, RAID seems to see a golden opportunity to broaden its appeal to enterprise markets, many of which are scrambling for solutions to meet growing data storage, analysis and management demands.
Kamber’s position is that parallel file systems are the key to solving the major I/O hurdles ahead, and he says that this opinion is well founded. According to his company, such systems “enable high performance by allowing system architects to use various storage technologies and high-speed, low latency interconnects to obtain the desired performance, accommodating any and all demands of even the most intensive computing environments.”
Of particular value in the whitepaper (free, but registration is required) is the breakdown of differences between GPFS, StorNext and Lustre. Kamber puts these differences in context, again with the needs of the enterprise (versus academic or traditional HPC environments) in mind. He makes it clear that with parallel file systems in cluster or departmental (multi-server) environments, the I/O subsystem can see performance boosts as well as improved scalability as data is shared in both heterogeneous and homogeneous environments.
Kamber says that the new way of doing business with big data is forcing enterprises to look at their data differently. Storage technology is not keeping up with data growth, so in a way, enterprises don’t have the technology to store their data, and many don’t have the technology to analyze it either. He said RAID wants to address these problems with tested solutions (not using too many experimental technologies), and the company is looking to HPC technologies that can be adapted to fit into enterprise environments.