How Analytics is Driving Military Intelligence
After the September 11 attacks, it became apparent that the United States intelligence agencies needed to get better at integrating disparate pieces of data. Since then, we’ve witnessed a massive increase in those agencies’ capabilities. And now some branches of the military, including the US Marine Corps, are experimenting with big data analytics technologies, such as Hadoop and graph databases, in pursuit of giving field commanders better intelligence.
One of the companies developing big data analytic solutions for the military’s intelligence services is Modus Operandi. Wave Exploitation Framework, the name of Modus Operandi’s flagship big data analytics offering, is designed to give forward-deployed units the capability to determine how people, organizations, places, and events are connected over time. The software utilizes technology such as Hadoop, graph databases, semantic triple-stores, and natural language processing algorithms to deliver Facebook-like insights into the lives of bad guys, with the ease of a Google-like query interface.
“The issue with 9/11 was, we had all the information, but nobody connected the dots,” says Modus Operandi’s chief scientist Dr. Eric Little. “Our system connects dots. We’re not about just passing through more data or warehousing data or just simply providing people more data. We actually make the data smart and we make the data easily consumable for people to actually use for real decision-making.”
Wave uses Hadoop applications such as HBase, Accumulo, and CloudBase to ingest massive amounts of poly-structured data into the system, and natural language processing algorithms to make sense of words and human expression. The data could be anything from field reports to social media postings, from public news accounts to e-commerce transactions, from bank records to records in classified government databases.
Once the data is mapped and reduced in Hadoop, it’s loaded into a graph database that’s hosted on a semantic triple-store (the company uses a variety of them). The graph database allows users to make connections between people, organizations, events, and places that would otherwise be difficult to make by hand. Facebook doesn’t track connections between people with Access and Excel–or pencil and paper, for that matter–and now, neither does the military.
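To make the idea concrete, here is a minimal Python sketch of how a chain of connections can be found once facts are stored as subject-predicate-object triples. It is not Modus Operandi’s implementation: the entity names, the predicates, and the `build_adjacency` and `find_path` helpers are all hypothetical, and a real deployment would query a triple-store rather than an in-memory list.

```python
from collections import defaultdict, deque

# Hypothetical facts, stored as (subject, predicate, object) triples.
TRIPLES = [
    ("person:A", "memberOf", "org:X"),
    ("person:B", "memberOf", "org:X"),
    ("person:B", "seenAt", "place:Market"),
    ("event:E1", "occurredAt", "place:Market"),
]


def build_adjacency(triples):
    """Index triples so each entity maps to its (predicate, neighbor) pairs."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        # Treat links as undirected: a connection can be followed either way.
        adjacency[subject].append((predicate, obj))
        adjacency[obj].append((predicate, subject))
    return adjacency


def find_path(triples, start, goal):
    """Breadth-first search for the shortest chain of entities linking start to goal."""
    adjacency = build_adjacency(triples)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _predicate, neighbor in adjacency[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # No chain of connections exists.
```

Calling `find_path(TRIPLES, "person:A", "event:E1")` walks from the person through the shared organization, a second person, and a place to reach the event, which is the kind of multi-hop link that is tedious to trace by hand.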
As Dr. Little explains, Wave excels at tracking high-value individuals and organizations of interest, such as suspected terrorists or affiliates of Al-Qaeda. “Our system provides the ability to link that kind of data together, and in near real time give the end user the ability to very rapidly connect data and search across large amounts of data,” he tells Datanami.
More recently, the company has developed a graphical interface called BLADE that sits atop the graph database and makes it easier for users to interact with. BLADE generates SPARQL queries from the comfort of a graphical interface.
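As a rough illustration of what generating SPARQL from a form might involve, the Python sketch below turns two selections (an entity and an optional relationship type) into a query string. The function name `build_connection_query`, its parameters, and the example IRIs are assumptions made for this example; the article does not describe how BLADE builds its queries.

```python
def build_connection_query(entity_iri, predicate_iri=None, limit=25):
    """Return a SPARQL SELECT query listing what an entity is linked to."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    for iri in (entity_iri, predicate_iri):
        # Reject characters that could break out of the <...> IRI syntax.
        if iri is not None and any(ch in iri for ch in '<>"{}|^`\\ \n'):
            raise ValueError(f"invalid IRI: {iri!r}")
    # With no predicate chosen, leave it as a variable so every link type matches.
    predicate = f"<{predicate_iri}>" if predicate_iri else "?predicate"
    selected = "?target" if predicate_iri else "?predicate ?target"
    return (
        f"SELECT {selected}\n"
        f"WHERE {{ <{entity_iri}> {predicate} ?target . }}\n"
        f"LIMIT {limit}"
    )


# Hypothetical IRIs, as a user's selections in the interface might supply them.
print(build_connection_query(
    "http://example.org/person/A",
    "http://example.org/rel/memberOf",
))
```

The user only picks items on screen; the query text, including the validation of anything that ends up inside it, stays behind the interface.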
Dr. Little calls BLADE a semantic wiki, and compares it to how Wikipedia works. “Instead of just writing pages that are roughly connected by hyperlink that have to be maintained by a body of experts, what makes it semantic is the wiki pages are auto-generated out of the graph,” he says.
“So if you input a new piece of information, like ‘This guy has a connection to this organization,’ that organization will appear on his page. And so you can click on that organization and it will take you to the page for the organization, then it gives you the links back to all the underlying reports where the information in the graph came from.”
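The behavior Dr. Little describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not BLADE’s code: the `FACTS` list, the report identifiers, the `[[...]]` link notation, and the `render_page` helper are all hypothetical.

```python
# Hypothetical graph facts: (subject, predicate, object, source report).
FACTS = [
    ("Person A", "member of", "Organization X", "report-001"),
    ("Person B", "member of", "Organization X", "report-002"),
    ("Person A", "seen at", "Market Square", "report-003"),
]


def render_page(entity, facts):
    """Build a plain-text page for an entity from every fact that mentions it."""
    lines = [entity, "=" * len(entity)]
    for subject, predicate, obj, source in facts:
        if subject == entity:
            lines.append(f"- {predicate} [[{obj}]] (source: {source})")
        elif obj == entity:
            # Show incoming links too, so the page for X lists its members.
            lines.append(f"- [[{subject}]] {predicate} this entity (source: {source})")
    return "\n".join(lines)


# Adding one fact to the graph changes the pages on both ends of the link.
facts = FACTS + [("Person B", "seen at", "Market Square", "report-004")]
print(render_page("Person B", facts))
print(render_page("Market Square", facts))
```

Because each page is rebuilt from the facts instead of being edited by hand, a new link shows up on both pages at once, and every line carries the report it came from.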
The US Marine Corps is currently conducting field tests with Wave and the BLADE interface. The idea is that Modus Operandi’s solution will be used by intelligence analysts in the field–19- or 20-year-olds with M-16 rifles who have been trained to gather and report intel from the field. The company is also working with the US Army.
“They are looking for people and groups and locations and things of interest,” Dr. Little says. “A lot of them are trying to predict things like the routes that are probably going to have roadside bombs planted on them, and who are the individuals in certain areas that we have to worry about, who is expressing the most anti-American sentiment. Or where are the soft targets, the market places and other places where somebody may decide to pull an event that would do harm to people.”
Gaining access to this sort of information in the field could potentially help US forces untangle the complicated web of connections and affiliations of the people and groups they meet on deployments. It certainly has applications in hunting terrorists, and could also be used in assessing the risks faced by American embassies in hostile nations.
Before coming to Modus Operandi, Dr. Little built several big analytical systems for companies in the oil and gas and pharmaceutical industries. He understands the power of this approach to overcome the constraints of small spaces and budgets.
“You can do a lot if you have a supercomputer. But we’re not building these things on multi-million dollar supercomputing machines,” he says. “Most people can’t afford that and the Army can’t go dragging those around. So the challenge is how do you provide people with analytics on a lot of data on modest machinery that’s still very fast. The techniques we’re using are very cutting edge, and allow you to run Hadoop on a large amount of data.”
Big, Smart and Easy
Modus Operandi aimed to build its big data solution to be three things: big, smart, and easy–as in, able to handle big data, able to enrich the data with intelligence, and easy to use. Normally, a defense contractor would content itself with hitting two of the three. But Modus Operandi is motivated to deliver on all three.
To do that, the company has hired people with a diverse array of backgrounds, including database design, game design, machine learning and semantics, ontological engineering, hardware, and even a little metaphysics (Dr. Little’s Ph.D. is in philosophy and cognitive neuroscience). “The secret sauce is the gluing together,” he says. “You can’t do this just with computer scientists or just with UI guys. The trick to pulling it off is having a highly diverse team.”
The product is a work in progress. But if it finds acceptance under the tough working conditions of war, there’s a good chance it might be adopted in other areas, such as bioinformatics and chemical informatics.