May 9, 2016

Why ONI May Be Our Best Hope for Cyber Security Now


The huge volume of network data has made it all but impossible for the good guys to detect new security threats, which has created space for the bad guys to operate. But thanks to a new Apache big data project called Open Network Insight (ONI), the good guys now have a powerful way to cut through the noise and identify bad guys and their malicious schemes.

Cybersecurity is very much an arms race. When the bad guys get too far ahead, the good guys innovate and build a better mousetrap, and the bad guys try to find a way around it. It’s all very Tom & Jerry-ish, and it’s been that way pretty much since the Internet was created.

But recently, it seems as though the bad guys have been getting around our mousetraps more and more frequently. The last great mousetrap built in this space–the security information and event management (SIEM) products that were first conceived and built in the early 2000s–simply can’t keep up with the volume of network data today. And the mice have proliferated.

Cloudera CEO Tom Reilly knows this better than most. Reilly previously ran the SIEM software vendor ArcSight, which was acquired some years ago by Hewlett-Packard, and then he ran HP’s Enterprise Security group before taking the top job at the biggest Hadoop distributor.

In an interview with Datanami, Reilly spelled out the deficiencies in the old SIEM approach, and why the new Open Network Insight (ONI) approach–which leverages the combination of standards, openness, and big data tech like Hadoop and machine learning–represents our best bet yet for a secure cyber future.

Enter the ONI

“I’m a big fan of ArcSight and the technology, but it’s getting outstripped by a number of things,” Reilly says. “First of all, they were not able to deal with the volume of data that’s needed today in order to do anomaly detection, especially on network traffic. You just could not put netflows or deep data packet captures into a SIEM. But Hadoop is very naturally designed for that.”

Mid-term storage poses another conundrum for SIEMs running on traditional relational databases. “Typically, SIEMs are optimized [to hold] 60 to 90 days of data, which makes it very hard to capture a low and slow attack that occurs over a six-month period,” he says. “But Hadoop is perfect for keeping data. As a matter of fact, we recommend customers never delete any of their data.”

The signature-based approach of traditional SIEMs, which leans heavily on the skill of human analysts to write and interpret detection rules, is another weakness that cybercriminals are only too happy to exploit.

“The cyber criminals are very savvy,” Reilly says. “Some of the greatest customers of SIEM tools were the cybercriminals. They’re saying ‘What are the rules these things ship with and how do I get around those rules?’ You can no longer have a backward-looking signature-based approach.”

Think about those words: You can no longer have a backward-looking signature-based approach. They are telling, especially coming from Reilly. From this point on, security solutions that use signatures are essentially obsolete. But how do you build something that’s forward-looking, and what does it look like?

Anomalous Flux

ONI was designed as a forward-looking tool that uses machine learning algorithms to detect emerging threats, which are the threats that everybody is worried about, since existing SIEM products do a fine job with the older known threats. The software, which Cloudera announced in March, is backed by tech giant Intel (NASDAQ: INTC) and a number of security startups; an eBay (NASDAQ: EBAY) data scientist is also involved, although the company is not formally backing the effort.

Reilly describes the impact that he anticipates ONI will have.

“What it allows companies very quickly to do is probably the most impactful thing in anomaly detection, which is look at network traffic, specific to their environment, without having to build the algorithms, without having to build the tools, and without having to go through vendor selection because it’s an open source project,” Reilly says. “With supervised learning, an analyst can look at these anomalies and determine if they require further forensics or if they’re benign and then they become part of the pattern.”
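The analyst feedback loop Reilly describes can be illustrated with a small Python sketch. This is not ONI's actual model, which the article does not detail; it is a simple frequency-based stand-in that assumes each network event is reduced to a hypothetical `(source_ip, destination_port)` key. The function names, the threshold, and the sample addresses are all made up for illustration.

```python
from collections import Counter

def score_events(events):
    """Score each distinct event key by rarity: 1 - (its count / total events)."""
    counts = Counter(events)
    total = len(events)
    return {key: 1 - n / total for key, n in counts.items()}

def triage(events, benign, threshold=0.9):
    """Return rare event keys to review, skipping keys an analyst marked benign."""
    scores = score_events(events)
    return sorted(k for k, s in scores.items() if s >= threshold and k not in benign)

# Synthetic traffic: two common patterns and two one-off events.
events = (
    [("10.0.0.5", 443)] * 50
    + [("10.0.0.6", 80)] * 48
    + [("10.0.0.9", 6667), ("10.0.0.8", 8443)]
)

# First pass: nothing has been reviewed yet, so both rare events are flagged.
first_pass = triage(events, benign=set())

# The analyst decides one of them is harmless; it stops being flagged.
second_pass = triage(events, benign={("10.0.0.8", 8443)})
```

In this toy version the analyst's verdict is just a set of keys to suppress. A production system would more plausibly feed those labels back into the model itself.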

What makes ONI potentially disruptive is not just its use of big data technology. Emerging platforms like Hadoop and NoSQL databases clearly have an edge over traditional SIEMs, as we have reported extensively at Datanami. But the fact that ONI combines that big data tech with a standard data model and an open, collaborative approach to information sharing makes it especially intriguing to security professionals.

“Say there are five banks and you’re each using the ONI data model to identify anomalies on your network,” Reilly says. “Prior to having a shared data model, Bank A would have had to call Bank B and say ‘Here’s what I’m seeing. I have an unknown device pinging to this site on a 47-second interval continually. Are you seeing that?’ And the other guy would say, ‘Well I’ve got to write a rule to go search and find that.’ Whereas in our world, you just pass over the algorithms and they should be able to run it very quickly.”
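To make Reilly's example concrete, here is a minimal sketch of the kind of portable check one bank might hand to another. It assumes both sides store flow records in the same hypothetical shape, `(timestamp_seconds, source, destination)`, which stands in for the shared data model. The function name, thresholds, and addresses are illustrative and are not taken from ONI.

```python
from collections import defaultdict
from statistics import mean, pstdev

def find_beacons(flows, min_events=5, max_jitter=0.1):
    """Return (src, dst, mean_interval) for pairs that connect at near-constant intervals.

    flows: iterable of (timestamp_seconds, src, dst) tuples (assumed shared schema).
    max_jitter: largest allowed ratio of interval standard deviation to mean interval.
    """
    times = defaultdict(list)
    for ts, src, dst in flows:
        times[(src, dst)].append(ts)

    beacons = []
    for (src, dst), stamps in times.items():
        if len(stamps) < min_events:
            continue  # too few connections to judge regularity
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg > 0 and pstdev(gaps) / avg <= max_jitter:
            beacons.append((src, dst, avg))
    return beacons

# Synthetic data: one device contacting a site every 47 seconds, plus irregular traffic.
flows = [(i * 47.0, "10.0.0.5", "203.0.113.9") for i in range(20)]
flows += [(t, "10.0.0.7", "198.51.100.2") for t in (3.0, 10.0, 250.0, 260.0, 900.0, 905.0)]

beacons = find_beacons(flows)
```

Because the function depends only on the agreed record shape, the receiving bank could run it against its own flows without first writing a site-specific rule, which is the point Reilly is making.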

Corporations are rather poor at collaborating right now, but the bad guys are very good at it. “They share code and rent out each other’s bot-net armies. They share techniques and post to bulletins,” Reilly says. “The beauty of this is, if we can get to a shared data model and shared collaboration, suddenly you’re leveraging the best data scientist, cybersecurity guy at another bank who wrote this amazing algorithm to detect the most advanced attack and he can share it with you very quickly.”

Crying Out for Openness

eBay’s Austin Leahy is a data scientist by trade. Since he started working in the security field, he’s played with a fair share of security tools and products from many vendors. None of them are up to the task that companies face today, he says.

Austin Leahy, Principal Data Scientist for Global Threat Management at eBay

“Security is still surprisingly nascent when you go out and look at the vendor landscape,” Leahy tells Datanami. “When you go out and look at a lot of the companies that are building these things, you see security products that … feel more like BI circa 2005 to 2008 than the advanced analytics that you sometimes see in the rest of the field.”

ONI feels different. “When I first saw ONI presented, I immediately knew it was something special,” he says. “The open source aspect of it, and the ability to contribute and the brain trust built up there, is a big part of what led me to become an individual contributor to it. It’s a really cool project.”

Apart from security events like RSA, there’s a decided lack of sharing in the security space. It’s not that security researchers don’t want to share, but more that the community lacks a way to share data efficiently. This is part of what attracted Leahy to ONI, and why he is contributing to the underlying codebase.

“It feels like an open source problem,” Leahy says. “I think there’s a significant lag in the marketplace that gives open source the ability to come in and capture the imagination and get a significant amount of traction.”

Leahy likes ONI from a purely functional standpoint too, and sees it possibly becoming that “single pane of glass” that security professionals can use to sift the wheat from the chaff.

“Being able to centralize, to be able to take an iPython notebook and instead of having to open up three more browser tabs in FireEye, or Cyphort or RSA Analytics, now you’re pulling data and you’re actually writing code to build a knowledgebase for your team–these are things that to me, they’re the classic open source dynamic that cries out for community contribution.”

Keeping Up with the CyberVor-ians

Don’t look at ONI as a silver bullet to your security problems, however. While Leahy says there’s some really interesting stuff happening with the machine learning, it’s not a replacement for human analysts and researchers who can apply context.

“There’s no cookie cutter out of the box solution here,” Leahy says. “The best thing that we can hope for is an open source project that’s incredibly extensible and a platform that’s incredibly extensible.”

ONI not only gives you the power of advanced analytics and parallel data processing, but it’s free and relatively easy to use (although you still need Hadoop, which is not always easy to use). ONI also provides a standard upon which the next generation of big data-based SIEM and user-based analytics solutions can build, and that is clearly important to Cloudera and its emerging ecosystem of security vendors.

At the end of the day, emerging security threats demand better approaches. Cyber criminals are using the huge volume of data as cover for their nefarious deeds, and they’re getting better and better at evolving their techniques to remain invisible. Machine learning approaches are clearly needed to give the good guys an edge, while open collaboration and standardization in data formats can help sustain that edge.

Projects like ONI clearly bring more upside than downside, especially considering the wide-open nature of cybersecurity today and the increase in attack surfaces that the mobile and IoT revolutions are bringing. If your security team doesn’t see the value in pooling resources with other security teams to build a collective wall against emerging cyber threats, then something is probably wrong.

Or as Leahy put it: “You want to be working for the companies that are feeding the threat intel feeds and not receiving them, because that’s the only way to stay ahead.”
