November 27, 2012

Marking Spikes for the Enterprise Graph

Nicole Hemsoth

In some ways, the story of Sun Microsystems closely parallels what’s happening with the present merger between high performance computing and the more enterprise-geared momentum behind big data.

Shoaib Mufti, a former senior manager at Sun on the hardware, software, ASIC and test development team, points to the ways high performance computing research and development have recently been winding a steady path into enterprise settings.

In a recent sit-down, Mufti described the evolution of moving from an HPC-driven hardware business into a more general consumption IT market via the Oracle acquisition, stating that this can be done successfully, particularly when the market is at an inflection point, as it was during the Sun-Oracle blend. Mufti, whose Sun days gave way to a new life as the VP of R&D at Cray’s big data spinout, sees the same potential for this happening under the Cray umbrella, due in large part to the big data deluge.

For a company like Cray, which just bought cluster company Appro after setting the bones of its big data arm, called YarcData, the moral of the Sun story can be keenly felt. The market now has a freshly-diversified cluster champ combo (Cray-Appro) with a solid existing base (HPC) and some clever software layers and capabilities that can extend it beyond that base. Throw in a graph analytics play and, barring a lack of appeal for this analytical approach, Cray, with its many arms reaching into the unknown, could be onto something.

The problem is, how do you take a company that’s been tucked away in the quiet but busy corner of supercomputing and push it into a crowded huddle of other enterprise-aimed players? If you ask Mufti, it’s about presenting a high-value market with consumable technologies that no one else is offering—and in a way that leverages the strengths of the world it has historically played best in, which is supercomputing. These are lessons he remembers well from Sun, and ones that might translate into Cray’s refreshed approach to marching into the business IT world.

But for the typical enterprise shop, the thought of shipping in a Cray rings up at least a few dollar-sign warning dings. And even if money weren’t an issue, there’s an ease of use and learning curve element to contend with. The lofty realm of supercomputing is far enough away from many enterprise users on its own, but enter the still-fringe concept of using graph analytics approaches to solve day-to-day business problems and the whole concept of looking to Cray could seem a bit far-fetched. After all, who uses graph analytics to do anything but benchmark and tackle a very narrow set of problems that can be handled in a slightly less high performance (expensive) way?

Mufti says graph analytics has incredible potential to reach larger markets, but at this point it’s about proving true applicability for high-value verticals and applications. While the first generation of high performance big data applications and systems were about getting the architecture and tooling right, the next wave of innovation is going to happen around refining what is possible with data. Graph analytics, he says, isn’t just about hum-drum analytics—it’s about being able to discover new questions and make connections across mind-numbingly large collections of data points.

Some will point to Hadoop as the end-all for this sort of need, then extend their pointers to the price. After all, if your sole business isn’t riding on fast big graph analytics, does it make sense to kick investment into another box (in this case the uRiKA graph appliance, for instance) versus a commodity cluster to sit in the corner?

Mufti says that for the businesses that do rely on graph approaches, the ability to make increasingly sophisticated, fast connections between disparate, previously unlinked data points could be a game-changer and highly worthy of any investment. The financial services, government/intelligence and bioinformatics arenas are the most apt folks to look into graph analytics to solve their problems now, but he feels that market will grow, especially as ease of use, manageability and security components sweeten the option (as were added in the latest uRiKA release).

We talked loosely about the added capability of graph analytics for real-world problems. Mufti said that the real advantage is that users can tap graph approaches to discover new questions. For example, think of your list of Facebook friends as a visual cluster with mutual connections and even potential “people you might know” extended connections that are fed by having friends or organizations in common. Pretty cool stuff on its own, but where the added dash of power comes in is when you can “grade” these connections into far more intricate webs. For example, seeing connections that are scaled with importance (instead of your wife being just another friend, this is a higher-ranked connection and her connections reveal a detailed string of family members that form a mini-network).
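To make the “graded connections” idea concrete, here is a minimal, hypothetical sketch of a weighted social graph where edge weights rank the importance of each connection. This is purely illustrative (it does not represent YarcData’s actual appliance or software); all names and thresholds are invented.

```python
# Illustrative sketch only -- not the uRiKA API. Edge weights "grade"
# each connection's importance; following only strongly graded edges
# surfaces the tightly linked mini-network described above.

from collections import defaultdict

class WeightedGraph:
    def __init__(self):
        self.edges = defaultdict(dict)  # node -> {neighbor: weight}

    def connect(self, a, b, weight=1.0):
        # Undirected edge carrying an importance grade.
        self.edges[a][b] = weight
        self.edges[b][a] = weight

    def ranked_connections(self, node):
        # Neighbors sorted from most to least important.
        return sorted(self.edges[node].items(), key=lambda kv: -kv[1])

    def mini_network(self, node, min_weight):
        # Expand outward along edges graded at or above min_weight,
        # revealing the tight cluster (e.g. a family group).
        seen, stack = {node}, [node]
        while stack:
            current = stack.pop()
            for neighbor, w in self.edges[current].items():
                if w >= min_weight and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

g = WeightedGraph()
g.connect("me", "spouse", 0.9)      # high-grade connection
g.connect("me", "coworker", 0.2)    # casual connection
g.connect("spouse", "in-law", 0.8)  # spouse's family tie
g.connect("coworker", "stranger", 0.1)

print(g.ranked_connections("me")[0][0])   # -> spouse
print(sorted(g.mini_network("me", 0.5)))  # -> ['in-law', 'me', 'spouse']
```

The same pattern, scaled up by many orders of magnitude and fed by multiple data sources, is the kind of traversal-heavy workload that graph-analytics hardware is pitched at.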

Move this example out to a national security program that requires the ability to quickly pull together entire social networks based on interactions, importance of key players, and oftentimes fed by multiple data sources (including phone, web, credit card and social media, for example). Or more scientifically, imagine the ability to discover new connections between proteins during DNA and RNA research, which presents one of the best “puzzle piece” problems in computer science right now—and one that’s right in the graph analytics sweet spot, alongside financial risk and related pattern-discovery applications.  

The YarcData R&D guru said that proving the graph analytics story is going to be an uphill battle at first, but that the addition of use cases in an ever-widening stream of verticals will make the case for the high-value areas they’re seeking. “We have made progress in finding actual use cases with the machines. We are a slightly different area in big data than others—we’re in graph analytics, where you can not only search big data but discover interesting things and new questions. This is a new area and people are trying to figure out how they can fit the discovery aspect into their operational environments.”

Mufti discussed other approaches, including Hadoop, noting that, “with graph analytics, our focus has been on what the real problems are that can be solved with graph analytics. If you’re trying to find complex relationships inside of large datasets, that’s the way to go. A lot of customers are aware of what they want to do but don’t know how to start; that’s why we’re starting with actual user problems.”

Part of the reason Sun and other HPC-oriented companies weren’t able to grow to gargantuan size on their own was because they chose a market that was at the ultra-high end of the computing spectrum (HPC). While high-value orders might be the norm there, breaking into where the real money lies requires a company or offering with a clear, distinct enterprise market in mind. What Cray seems to be looking for is its Oracle—but whether or not it’s able to achieve that through an advanced analytics approach remains to be seen.

To move things closer to the Sun and Oracle parallel, if Cray is able to leverage its now-in-house cluster, storage and graph analytics approach to appeal to a new enterprise segment that is outside of its bread and butter in HPC-oriented verticals, it may be able to take the business IT world by surprise. While Sun produced plenty of enterprise/commodity hardware prior to Oracle, it managed to do so in a way that kept the highest-tier workload needs as a continuous priority through its HPC group. With the addition of an advanced business software framework to lay across its hardware background, it paved the way for a beneficial set of comprehensive offerings. If Cray is able to wrap a story around graph analytics and its supercomputing roots as they translate to the enterprise, then they’re set to be one of the hottest companies to watch in 2013.

Datanami