3D Visualizer Goes from Cybersecurity to Open Source
A 3D visualization tool developed by the network security firm OpenDNS to identify malicious websites and domains will soon be available to anybody via an open source license. While OpenGraphiti was developed to solve problems in the security realm, the GPU-powered software can be used to visualize large datasets for any use case.
OpenDNS security researcher Thibault Reuille started developing OpenGraphiti about a year ago to help the firm ferret out malicious websites and domains, and identify the cybercriminals controlling them. Reuille, who previously worked at GPU maker Nvidia, is an expert in big data visualization, and used his OpenGL programming skills to create OpenGraphiti.
“It’s a great tool to visualize an abstract bunch of data that at first glance doesn’t really look related,” says OpenDNS security expert and evangelist Andrew Hay, who is accompanying Reuille at the BlackHat2014 conference taking place this week in Las Vegas. “It really helps you focus your investigation going forward.”
Hay likens the tool’s 3D visualizations to “Minority Report” dashboards, in reference to the powerful reporting tools used by police in the 2002 sci-fi movie. “It’s not a 3D image, but a 3D model,” he tells Datanami. “You’re flying through the resulting graph data so you can grab different nodes and move them around, and drill in where you want to. It’s far more interactive than a pie chart and it’s definitely visually stunning and aesthetically pleasing.”
So far, company researchers have used the tool to identify about 10,000 malicious websites, which the company subsequently added to its blacklist of websites that it blocks for customers.
At the BlackHat conference, the company will demonstrate how the tool helped it find malicious websites by correlating email addresses with malicious domains. The analysis starts with email addresses (pulled from a proprietary WhoIs database of website registrations) that are known to have been used to register malicious websites, and then traces how those addresses are connected to unknown domains.
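The correlation described here can be pictured as a walk over a graph that links email addresses to the domains they registered. The following is a minimal sketch of that idea, not OpenDNS's actual method or any part of OpenGraphiti: the function names, the `max_hops` parameter, and the sample addresses and domains are all hypothetical, and a plain breadth-first search stands in for whatever analysis the company runs.

```python
from collections import defaultdict, deque


def build_graph(registrations):
    """Build an undirected adjacency map from (email, domain) pairs.

    Nodes are tagged tuples so an email and a domain can never collide.
    """
    graph = defaultdict(set)
    for email, domain in registrations:
        graph[("email", email)].add(("domain", domain))
        graph[("domain", domain)].add(("email", email))
    return graph


def suspect_domains(graph, bad_emails, known_bad_domains, max_hops=3):
    """Walk outward from known-bad emails (breadth-first).

    Returns a dict mapping each reachable domain that is not already
    known to be bad to its hop distance from the nearest starting email.
    """
    distance = {}
    queue = deque()
    for email in bad_emails:
        node = ("email", email)
        if node in graph and node not in distance:
            distance[node] = 0
            queue.append(node)

    while queue:
        node = queue.popleft()
        if distance[node] >= max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    return {
        name: hops
        for (kind, name), hops in distance.items()
        if kind == "domain" and name not in known_bad_domains
    }


if __name__ == "__main__":
    # Hypothetical registration records, invented for illustration.
    records = [
        ("a@x.test", "bad1.test"),
        ("a@x.test", "new1.test"),
        ("b@y.test", "new1.test"),
        ("b@y.test", "new2.test"),
    ]
    graph = build_graph(records)
    print(suspect_domains(graph, ["a@x.test"], {"bad1.test"}))
```

Domains that turn up only a hop or two from a known-bad address are the natural candidates for a closer look; the hop limit keeps the walk from sweeping in an entire registrar's worth of unrelated sites.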
The company also used the tool to investigate the Syrian Electronic Army, to cull common data points from the Verizon data breach report, and to analyze domains associated with Cryptolocker, a growing class of “ransomware” that uses Trojan horses to lock people’s computers until they pay cybercriminals to unlock them. “With all the adjacencies, it’s incredible what pops when you start looking at the data, especially if you start with a known bad domain or website, and just branch out from that,” Hay says. “You can see what else is malicious. It just becomes obvious.”
Security researchers are increasingly looking to graph databases for an edge against ever more sophisticated cybercriminals and malware. Security use cases have become common for popular graph databases such as Neo4j, GraphLab, Giraph, Sqrrl, and Titan, and now you can add OpenGraphiti to the list.
OpenGraphiti was written in C and C++, and uses OpenCL and OpenGL APIs to leverage the parallelized math processing capabilities of Nvidia GPUs. At OpenDNS, the tool has generated graphs with 80,000 nodes and 60,000 edges. That’s on a single Mac laptop, which took several hours to generate the graph. The company has not yet tested it on a cluster of GPU-enabled machines (it’s awaiting delivery of bigger GPU-enabled machines), but Hay expects the software to scale, owing to its OpenGL underpinnings.
The company decided about three months ago to place the tool in the open source realm, where it could be more widely used and further refined. It’s shown its usefulness in security, but Hay envisions it being adopted to visualize all types of data across industries. “Anybody with a sizable dataset that they’re tired of using their own Python scripts to do something with, this should help lower the bar to get those things visualized,” he says.
Hay has already experimented with crunching non-security-related datasets in OpenGraphiti, including mapping the entire Internet, analyzing 12 years’ worth of M&A transactions for publicly traded companies, and comparing multiple people’s LinkedIn contacts. “There’s a lot of questions you can ask of the data, and conclusions you can draw from the visualizations,” Hay says.
Any loosely related data that can be output in CSV or JSON formats can be loaded into OpenGraphiti, and no data science is required. “We’re trying to handle data sets that are too big for the regular tools to use and need to be manipulated by data scientists or security people who don’t have a really strong knowledge of graph theory or advanced mathematics,” Hay says.
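To give a sense of what preparing such data might involve, here is a short sketch that turns a CSV edge list into a JSON document of nodes and edges. The article does not describe OpenGraphiti's input schema, so the `source` and `target` column names and the `nodes`/`edges`/`id`/`label`/`src`/`dst` keys are assumptions made for illustration; the tool's real format may differ.

```python
import csv
import io
import json


def csv_edges_to_graph(csv_text):
    """Convert a CSV edge list with 'source' and 'target' columns
    into a dict of nodes and edges (hypothetical schema)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    node_ids = {}  # label -> integer id, assigned in first-seen order
    edges = []
    for row in reader:
        for label in (row["source"], row["target"]):
            node_ids.setdefault(label, len(node_ids))
        edges.append({"src": node_ids[row["source"]],
                      "dst": node_ids[row["target"]]})
    nodes = [{"id": i, "label": label} for label, i in node_ids.items()]
    return {"nodes": nodes, "edges": edges}


if __name__ == "__main__":
    # Invented sample rows: who is connected to whom.
    sample = "source,target\nalice,bob\nbob,carol\n"
    print(json.dumps(csv_edges_to_graph(sample), indent=2))
```

Assigning each distinct label one integer id keeps the edge list compact and guarantees that a value appearing in many rows becomes a single node in the graph.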
The software will be available from GitHub starting Wednesday.