Data Aggregation in the Public Sector
Intelligence operations in the United States face a deluge of data from a diverse range of sources, all of which must be analyzed to protect the country’s security interests. In a video produced by MarkLogic, Bob Flores, founder of Applicology Incorporated and former Chief Technology Officer of the CIA, shared his views on how the public sector is dealing with this deluge, particularly how the intelligence community collects and aggregates data.
Flores began by noting that traditional relational databases have struggled to keep pace with the increasing data load that many institutions see. “The analytical methodologies of the past don’t scale to the data that we’re seeing today,” Flores said.
The idea that data is outgrowing the ability of relational databases to process it properly is nothing new. The opportunity for growth lies in methods that supplement or even supplant such databases. For Flores, one of the main keys is data aggregation.
“Data aggregation is a huge deal, it always has been, but even more so today because of the disparate nature of the data sources,” Flores said. In his case, those disparate sources include the fast-growing number of sensors used by the military, which relay information such as machine maintenance data from various locations and systems. According to Flores, that data can cause problems for relational databases if it is not aggregated properly.
“In the past we tried to bring in all the disparate data, normalize it, and stuff it into one relational database so we had this one great big database we could go to search for information. Of course, that’s only as good as your aggregation tools are and frankly these days there’s so much data that the scalability of the way we did that aggregation comes into question.”
As a result, Flores has found commercially available tools that go out and ‘massage the data,’ that is, alter it so that it is in a more standardized and possibly compressed format, to be rather useful. They allow end users and analysts to achieve their goals and find links among the data sources.
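To make the idea of ‘massaging’ disparate data concrete, here is a minimal Python sketch of normalizing records from two sensor feeds into one common format before aggregating them. It is an illustration only, not a description of any tool Flores mentioned: the two feeds, their field names (`id`, `ts`, `temp_f`, `asset`, `time`, `celsius`) and the target schema are all hypothetical assumptions.

```python
from datetime import datetime, timezone


def normalize_feed_a(record):
    # Hypothetical feed A: epoch-second timestamps, temperature in Fahrenheit.
    return {
        "source": "feed_a",
        "asset_id": record["id"],
        "timestamp": datetime.fromtimestamp(record["ts"], tz=timezone.utc),
        "temperature_c": round((record["temp_f"] - 32) * 5 / 9, 1),
    }


def normalize_feed_b(record):
    # Hypothetical feed B: ISO 8601 timestamps with a UTC offset, Celsius.
    return {
        "source": "feed_b",
        "asset_id": record["asset"],
        "timestamp": datetime.fromisoformat(record["time"]),
        "temperature_c": record["celsius"],
    }


def aggregate(feed_a, feed_b):
    # Convert every record to the shared schema, then merge in time order
    # so readings from different sources can be compared side by side.
    records = [normalize_feed_a(r) for r in feed_a]
    records += [normalize_feed_b(r) for r in feed_b]
    return sorted(records, key=lambda r: r["timestamp"])
```

Once every record shares the same field names, units, and timestamp type, an analyst can search or link readings across sources without knowing how each original feed was laid out.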
The intelligence community, per Flores, would like to reach the point where it can analyze data in a way that predicts future events. To that end, intelligence applications of big data will likely have to include social media, with government institutions analyzing the content and context of posts on Twitter, for example, to determine when major events are likely to happen.
“The big concern,” Flores said, “is ‘what’s going to happen next week’ or ‘what’s going to happen the week after that?’ As I’m looking at tweets that are coming out of various parts of the world, what kinds of things do I feel comfortable saying about that kind of data that will make the intelligence community smarter about future events?”
Predictive analytics is on the tip of the tongue at most institutions that employ some form of big data analytics, so such interest from the United States intelligence community is not surprising.