November 8, 2021

Application Productivity Being Crushed by Data Bottlenecks?

The HPC Application Workload Tradeoff:  More Nodes or an Optimized System Fabric? Cornelis Networks Says, “Why Not Both For the Same Budget?”

The convergence of HPC, high performance data analytics, and AI creates massive data sets, prompting organizations to reevaluate their supporting HPC infrastructures. The volume and complexity of data being collected, stored, moved, processed, and analyzed put enormous pressure on the HPC system capabilities of many organizations, resulting in productivity bottlenecks.

Adding more processors is not, on its own, the solution to these bottlenecks. While this approach has worked over the years, it is no longer adequate by itself. Acquiring more nodes can be costly and will not provide a framework able to stand the test of time in big data environments. The best solution resides in a flexible and scalable HPC system fabric and a close working relationship between the vendor and the customer to find the right balance between the nodes and the fabric.

Why Focus on the Fabric?

The HPC communication fabric is foundational to application performance, and with any HPC system acquisition the fabric selection requires careful consideration to achieve optimal balance and performance. The HPC fabric must be planned and deployed for ever-increasing data traffic, ultimately delivering a highly productive system capable of supporting application challenges such as high-fidelity modeling and simulation, advanced visualization, and complex data analytics. There is no simple fix and no one-size-fits-all solution.

According to Cornelis Networks, the leading independent provider of High Performance Fabrics for HPC, HPDA and AI, typical big data environments place tremendous demands on the HPC fabric. The fabric is the heartbeat of an HPC infrastructure, where the balance and optimization of bandwidth, latency, message rate, and scaling limitations, among other factors, play a foundational role in determining overall system performance and, of course, productivity.

“In this converged era of massive data sets and complex computation requirements, smart organizations will take a close look at their HPC system fabrics, as the critical consideration to solving productivity bottlenecks,” said Mark Spargo, a senior vice president at Cornelis Networks.  “In some cases, additional nodes may make sense, but in all cases, the evaluation of the HPC system fabric is paramount.”

According to Steve Conway, senior advisor for HPC market dynamics at Hyperion Research, “The rapid rise of high-performance data analysis, including data-intensive simulation as well as AI and other advanced analytics, has elevated the importance of HPC fabrics and inaugurated a period of high innovation in this area. We advise HPC sites to consider their current and future requirements and investigate the growing range of fabric options, including companies such as Cornelis Networks, before making a system purchase.”

Selecting the Right HPC Fabric is a Financially Smart Approach

Evaluating the HPC system fabric before making a system purchase is a financially sound approach, and one that offers an additional benefit. The fabric typically makes up 20-30% of the overall cluster budget, but it’s not at all uncommon for organizations to pay more for the system fabric than is really needed. Focusing on the right, most efficiently configured fabric solution, and not buying fabric components that won’t be fully utilized, gives an organization an optimal fabric while, in many cases, actually saving money. That money can be allocated to purchasing additional nodes, increasing overall performance. It’s the best of both worlds.
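The budget tradeoff described above can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not a sizing tool: only the 20-30% fabric share comes from the article, while the total budget, the per-node cost, and the function name `nodes_affordable` are hypothetical values chosen for illustration. It also assumes the leaner fabric still meets the workload’s needs, which is exactly what a fabric evaluation would have to confirm.

```python
def nodes_affordable(total_budget: int, fabric_share_pct: int, cost_per_node: int) -> int:
    """Return how many whole nodes fit in the budget left after paying for the fabric.

    All arguments are hypothetical planning inputs; integer arithmetic is used
    so the result does not depend on floating-point rounding.
    """
    if not 0 <= fabric_share_pct < 100:
        raise ValueError("fabric_share_pct must be in the range 0-99")
    if cost_per_node <= 0:
        raise ValueError("cost_per_node must be positive")
    # Budget remaining for compute nodes once the fabric share is set aside.
    node_budget = total_budget * (100 - fabric_share_pct) // 100
    return node_budget // cost_per_node


# Hypothetical cluster: 1,000,000 total budget, 10,000 per node.
TOTAL_BUDGET = 1_000_000
COST_PER_NODE = 10_000

nodes_at_30 = nodes_affordable(TOTAL_BUDGET, 30, COST_PER_NODE)  # fabric at top of range
nodes_at_20 = nodes_affordable(TOTAL_BUDGET, 20, COST_PER_NODE)  # fabric at bottom of range

print(f"Fabric at 30% of budget: {nodes_at_30} nodes")
print(f"Fabric at 20% of budget: {nodes_at_20} nodes")
print(f"Extra nodes from a leaner fabric: {nodes_at_20 - nodes_at_30}")
```

With these assumed figures, moving the fabric from 30% to 20% of the budget frees enough money for ten more nodes. Real pricing, and whether a smaller fabric is adequate, will vary by installation.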

Experience is the Best Guide

Cornelis Networks has more than two decades of experience in the development and evolution of high-performance fabrics. With a seasoned leadership team, a world-class sales and support organization, and roots going back to fabric technology innovators including SilverStorm, PathScale, QLogic, Cray and Intel, Cornelis delivers high-performance fabric solutions on a global scale for leading scientific, commercial and government organizations.

According to Spargo, “Not conducting a careful analysis of the application and data challenges, current and future, and falling short of configuring an HPC fabric designed specifically for the anticipated workload, will very likely result in an HPC system infrastructure that will crumble under a data overload—a situation that no organization wants to face.”

But Spargo is passionate about the role he and the Cornelis Networks team would like to play. “We have a rather unique culture at Cornelis Networks.  While our business is architecting and enabling our partners to install HPC fabric solutions, we focus on building long-term relationships with our end-user customers.  We speak to organizations with complicated data challenges every day, and sincerely tell them all the same thing—bring us your most difficult challenges.  We can bring the collective power and experience of our solution partners—CPU and GPU vendors, OEM system vendors, resellers, and system integrators.  Our mission as a multi-partner, independent solution provider is focused on delivering the best fabric-optimized customer solutions possible—and establishing a long-term, mutually rewarding relationship.”

“We see dozens of companies building campaigns out of bullet points and throwing around all the standard buzzwords. Everyone wants to claim they are a leader in bandwidth or latency or some other metric.  The truth is, architecting an optimal, highly performant fabric for any HPC installation takes considerable expertise, earned over decades of delivering solutions. This has been the Cornelis DNA since early 2000.”

Our Customer-Centered Approach

Cornelis Networks is ramping up for a significant growth year in 2022. The company is delivering its fabric solutions in 100 Gbps increments to commercial, industrial and government installations today, and it reports strong, widespread interest as it takes pre-orders for its Omni-Path Express software enhancement, due in Q1 2022. Cornelis Omni-Path Express is the company’s next generation of high performance fabrics: a proven hardware foundation combined with the OpenFabrics Interfaces (OFI) software framework that, according to Cornelis, delivers the industry’s lowest latency, highest message rate, and best collectives performance, all at the industry’s lowest CPU utilization. Over the next several months, Cornelis will be making a number of announcements related to products, customer installations, new channel partners, and corporate growth.

Spargo added, “We want to hear from any organization looking to optimize their HPC environment to evaluate whatever data workload and workflow challenges they are facing.  I’d urge them to take a close look at Cornelis and reach out to us. We are all in this together—vendors, partners, and end user organizations, and we learn something from each of our customer installations. If we can’t deliver an optimal fabric based on an organization’s needs, we’ll be transparent and tell them just that. But, so far, that hasn’t happened.”
