How Big Data Analytics Can Help Fight ISIS
The rapid rise of the terrorist group Islamic State in Iraq and Syria (ISIS) this summer grabbed the world’s attention and led to military action by the U.S. and its allies. While drones and smart bombs pick apart ISIS from the air, you can be assured that American cyber warriors and defense analysts are using big data analytic techniques to glean insights about the group as well.
Of course, the intelligence gathering and cyber warfare activities undertaken by the U.S. and its allies in the battle against ISIS are classified, so we have no firsthand knowledge of what actual technologies are being brought to bear. But by speaking with experts at analytic software firms that supply the defense and law enforcement communities, we can piece together the types of analytic approaches that the military and intelligence community are likely taking, and especially what role big data is playing in the conflict.
ISIS as Disgruntled Consumer
What’s striking is that the big data challenges and opportunities that government analysts face are not that different from the challenges and opportunities presented to data analysts in the private sector.
They both need to collect, store, join, and interpret multiple streams of structured and unstructured data. They’re forced to use different analytic tools to make sense of different data sets. They face data quality and stewardship issues, and sometimes they’re forced to make assumptions that could turn out wrong. And in the end, they’re both trying to get a more detailed picture of a certain group of people, what motivates them, who they’re connected to, and what they’re likely to do next.
“You can think of the Federal Government in a relationship where terrorists are, in effect, a kind of consumer that the government is attempting to understand at a greater level of granularity,” says Chris Sailer, vice president of professional services at Saffron Technology, a Cary, North Carolina-based developer of cognitive computing and artificial intelligence software that’s used by private companies, law enforcement, and non-governmental organizations.
“They’re actually trying to understand the same types of questions you’d ask of a consumer group,” Sailer continues. “If I’m an insurance company, a financial services entity, Walmart or Amazon, I want to understand my customers–what they intend on purchasing, what their habits are. And all those things and aspirations translate right over to what [the Federal Government is] trying to understand about an opponent and an entity like ISIS and its individual players.”
Peace Through Superior Analytics
Saffron isn’t working with the government on tracking ISIS. But Sailer imagines that advanced analytic technologies such as natural language processing (NLP), graph databases, and cognitive computing platforms would be prominent pieces of the project.
“If you’re looking at the national security level, what they’re doing is processing communications from across the world and they’re leveraging applications like this to essentially break down the tone, tenor, content and context of individual communications after they’ve been transcribed and converted from voice to text,” Sailer says. “The ability to do this is a combination of brute computing capacity and the ability to collect [data] at an astronomical scale. But the other part is leveraging advanced analytics and graph stores and cognitive computing platforms to be able to do what a human analyst can’t.”
Getting good data may be the hardest part of cracking the terrorist’s nut. While ISIS is more technologically sophisticated than earlier terrorist groups and is active in social media, its members don’t advertise their affiliations. And if they do use the Internet to communicate, they likely are using encryption to hide it, or speaking in code words that can be tough to decipher.
“The trick with these groups is they don’t have a registry. They’re not paying dues,” says Dr. Eric Little, chief scientist at Modus Operandi, a Melbourne, Florida-based company that develops big data analytics technology for the military. “It’s not like you can find out who’s exactly in the group and who’s not in the group. They’re secretive and they operate in a very covert fashion. It’s a huge challenge. And it’s also an area of the world where we don’t have an open capability to penetrate. We do a lot of our examination of these groups from afar.”
It’s a cat and mouse game, and the terrorists are careful. For example, when it became known that the U.S. government could intercept satellite phone communications in the early days of the war in Afghanistan, Al-Qaeda stopped using them and went back to traditional forms of communications. The wariness of Al-Qaeda was part of why it took 10 years to find Osama Bin Laden. “He was clever,” Little says. “He didn’t open his mouth where he could be easily tracked or tapped. He used runners and couriers.”
Asymmetrical Analytic Warfare
If big data is going to help topple ISIS, it will be in an indirect fashion. While it appears there’s not much direct information available about the group, there is supporting data that can tell us who’s supporting the group, who’s feeding and financing it, and who’s supplying it with weapons. That data can be mined and processed using different analytical techniques.
“I think the key to being successful is the ability to pull in multiple kinds of data,” Little says. “So if you know something about the person whose message you’re looking at, you may be more inclined to think this person is talking about a terrorist-like activity….There’s no silver bullet. You can’t just run NLP on things and expect it to give you your answers. You really have to use multiple kinds of technologies.”
For the past six years, the Department of Defense has funded a project called the Minerva initiative that’s aimed at spotting so-called “social contagions” that could cause civil unrest or insurgencies at home and abroad. According to published stories, the program employs data mining to analyze and determine threats based on data pulled from Twitter and other sources. The program has a strong sociological bent, and has been somewhat controversial due to its analysis of non-violent protestors (although determining how and why previously non-violent protestors take up arms appears to be one of the goals of the project).
In addition to social media, other data types that the U.S. government will use to fight ISIS are geo-spatial and temporal in nature. A Washington, D.C.-area company called GIS Federal is working with the U.S. Army’s Cyber Center of Excellence at Fort Gordon, Georgia, to track terrorist movements in Syria and Iraq using the company’s GPU-powered database, dubbed GPUdb.
The GPUdb is fed with surveillance data from drones (also known as unmanned aerial vehicles, or UAVs) and is able to plot that data across space and time. “We’re the computational engine or the database that is responsible for space and time data,” says Amit Vij, CEO of GIS Federal. “We basically are given that data from different teams and agencies across the IC [intelligence community].”
GPUdb gives military analysts the capability to blend the space-time plot data with other data sources, such as messages captured by the intelligence community’s signals intelligence operations (an unclassified Twitter feed stood in for those messages during a recent demo that GIS Federal presented to Datanami). This lets intelligence analysts make inferences and connections across multiple data types using live data in real time.
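To make the idea of blending space-time data with a message feed concrete, here is a minimal Python sketch. It is not how GPUdb works internally, which the article does not describe; it is a naive nested-loop join that pairs a track point with a message when the two fall within a chosen distance and time window. The record layouts, coordinates, thresholds, and function names are all hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Hypothetical records: (id, latitude, longitude, timestamp in seconds).
tracks = [
    ("t1", 36.20, 37.16, 1000),
    ("t2", 33.51, 36.29, 1000),
]
messages = [
    ("m1", 36.21, 37.15, 1200),  # close to t1 in both space and time
    ("m2", 36.21, 37.15, 9000),  # close to t1 in space, but far too late
    ("m3", 35.00, 38.00, 1100),  # close in time, but far from both tracks
]

def correlate(tracks, messages, max_km=5.0, max_seconds=600):
    """Pair each track point with messages inside the space and time window."""
    pairs = []
    for tid, tlat, tlon, ttime in tracks:
        for mid, mlat, mlon, mtime in messages:
            near_in_time = abs(ttime - mtime) <= max_seconds
            near_in_space = haversine_km(tlat, tlon, mlat, mlon) <= max_km
            if near_in_time and near_in_space:
                pairs.append((tid, mid))
    return pairs

matches = correlate(tracks, messages)
print(matches)  # [('t1', 'm1')]
```

A real system would replace the nested loop with a spatial index or, as in GIS Federal’s case, massively parallel hardware, but the underlying question is the same: which observations coincide in space and time?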
One non-obvious way an analyst might go about cracking the terrorist-identification problem is by tracking disease outbreaks. “One of the interesting things that’s come out of Syria has been the emergence of documented Polio cases,” Saffron’s Sailer says. “Foreign fighter movement hails from areas where there’s a disproportionate portion of infected individuals, like northern Nigeria, Waziristan, Pakistan, and Afghanistan. The groups that are present in these areas–Taliban, Boko Haram, and Al-Shabaab– have all taken really staunch exceptions and oppositions to vaccination efforts.”
The World Health Organization (WHO) has assembled detailed information on the polio outbreaks in Syria, and has used DNA tracking to determine where the particular strains of polio virus originated. Plotting this “wake vortex” of polio cases on a map could help tell an analyst where terrorist fighters or ISIS members came from.
The Power of Graph Analytics
Graph analytics and semantic triple stores are still fairly new in the commercial world, but the U.S. is probably starting to employ them to track terrorists as well. During the wars in Iraq and Afghanistan, it was common for U.S. military personnel to confiscate the cell phones and SIM cards of captured fighters sent to prison. The fighter and all of the phone numbers on the phone or SIM card would be plotted in a graph to determine hierarchies and identities.
“Graph databases coupled with strong visual analytics–you run data like that through a given prison population and the answers just jump right out at you,” Sailer says. “With graph databases, what you’re essentially doing is force multiplying the analyst. With the aid of analytics, a single analyst brings the potency of a 25-person analytic staff. That’s what we hope to bring to bear.”
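The SIM card approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phone labels and numbers are made up, and the ranking (how many captured phones store the same number) is one simple heuristic for surfacing a possible common contact, not a description of any tool the military actually used.

```python
from collections import defaultdict

# Hypothetical data: each captured phone maps to the numbers stored on it.
contacts = {
    "phone_A": ["555-0101", "555-0102", "555-0199"],
    "phone_B": ["555-0103", "555-0199"],
    "phone_C": ["555-0199", "555-0102"],
}

def build_graph(contact_lists):
    """Return an undirected adjacency map linking phones and stored numbers."""
    graph = defaultdict(set)
    for owner, numbers in contact_lists.items():
        for number in numbers:
            graph[owner].add(number)
            graph[number].add(owner)
    return graph

def shared_numbers(graph, owners, min_links=2):
    """Numbers stored on at least min_links captured phones, most shared first."""
    counts = {node: len(nbrs) for node, nbrs in graph.items() if node not in owners}
    hits = [(node, count) for node, count in counts.items() if count >= min_links]
    return sorted(hits, key=lambda item: (-item[1], item[0]))

graph = build_graph(contacts)
ranked = shared_numbers(graph, set(contacts))
print(ranked)  # [('555-0199', 3), ('555-0102', 2)]
```

A number that shows up on many otherwise unrelated phones is a candidate hub in the network, which is the kind of answer Sailer says tends to “jump right out” once the data is drawn as a graph.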
Little is a big believer in the power of graph analytics and semantic triple stores to help find answers in complex data. But they don’t work as well when the data set is incomplete or has a high level of uncertainty, which he suspects may be the case in the hunt for ISIS.
“You have to be able to weight your graphs and provide a kind of mathematically based set of heuristics on those graphs in order to shake out some of those associates,” he says. “That being said, we’re currently looking at the combination of both logic-based and math-based graph analytics, and using those inside of these scalable cloud architectures” like Hadoop.
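As a rough illustration of what weighting a graph might look like, the Python sketch below attaches a confidence between 0 and 1 to each link and scores every node by its strongest chain of evidence back to a known individual (the product of the weights along the best path). This is an assumed heuristic chosen for illustration, not Modus Operandi’s method; the node names, weights, and threshold are hypothetical.

```python
import heapq

# Hypothetical link confidences in (0, 1]; higher means stronger evidence.
edges = [
    ("seed", "a", 0.9),
    ("seed", "b", 0.4),
    ("a", "c", 0.8),
    ("b", "c", 0.5),
    ("c", "d", 0.5),
]

def association_scores(edge_list, source):
    """Best-path score from source to each reachable node (product of weights)."""
    graph = {}
    for u, v, w in edge_list:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale queue entry; a better path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return best

scores = association_scores(edges, "seed")
# Keep only nodes whose best chain of evidence clears a chosen threshold.
likely = sorted(n for n, s in scores.items() if n != "seed" and s >= 0.5)
print(likely)  # ['a', 'c']
```

Because every weight is at most 1, scores can only shrink along a path, so weakly supported or distant associations fall below the threshold while well-supported ones remain.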
We may never know exactly what big data analytics technologies and techniques the U.S. is bringing to bear against ISIS and other modern terrorist groups. Whatever they are, they may closely resemble the analytic capabilities of public companies like Google and Facebook. As the body of knowledge around ISIS and its members expands, the analytics will undoubtedly grow with it.