Big Data for the Common Good
The prospects of utilizing big data sets originating from governmental agencies are growing by leaps and bounds, and providing tantalizing new possibilities for bolstering not only private profits, but the public good as well. In just the last week, progress has been made with open data initiatives not only in the U.S. but in Europe too.
Last week, the White House Office of Science and Technology Policy (OSTP) and the federal government’s Networking and Information Technology R&D (NITRD) program hosted a “Data to Knowledge to Action” event in Washington D.C. The event was attended by representatives from dozens of governmental agencies, universities, non-profit organizations, and technology vendors, such as Teradata, Splunk, and IBM.
At the event, the OSTP announced it’s sponsoring a second round of big data projects that build off the first round unveiled in March 2012. That first event made big headlines, and also brought some big bucks to bear when the Obama Administration committed $200 million to help build big data projects.
This time around, the OSTP highlighted four new big data projects in particular that should improve the lives of Americans, alongside dozens more being driven by companies and organizations attending the event in areas such as cancer research, geointelligence, economics, and linguistics.
For starters, the OSTP is throwing its support behind CancerLinQ, a five-year, $80-million initiative aimed at establishing a network to centrally record the outcomes of cancer treatments around the country. Currently, good detailed information about cancer treatments and outcomes is only available through the 3 percent of cancer patients who participate in clinical trials. CancerLinQ aims to unlock information on the other 97 percent to improve the field of oncology.
Another project aims to increase participation rates in clinical trials. According to the OSTP, half of all clinical trials fail to reach their recruitment targets. To help with the situation, a group of drug companies (including Novartis, Pfizer, and Eli Lilly and Co.) is coming together to improve the clinicaltrials.gov website by providing more detailed and patient-friendly information about available trials.
The Earth sciences get a boost from a new collaboration between NASA and Amazon Web Services (AWS) to make space-based data about the Earth widely available to the public over the Internet. The OSTP says this will bolster projects like Citizen Science Alliance’s Zooniverse.org, a website for crowd-sourcing solutions to scientific problems that involve lots and lots of data.
The fourth new OSTP-backed initiative involves DataKind, a non-profit organization that aims to match data scientists with non-profit and non-governmental organizations. Hadoop distributor Pivotal has agreed to work with DataKind to help social organizations make better use of their big data sets. DataKind is also going to work with The Mission Continues, which aims to match military veterans with jobs, and Medic Mobile, a group that attempts to improve the health of under-serviced and disconnected communities.
Projects such as these improve the odds for making novel scientific discoveries, says Suzi Iacono, co-chair of NITRD’s Big Data Senior Steering Group. “We are seeing progress on so many fronts,” she says. “For example, developing the foundations of scalable algorithms, integrating human and machine reasoning in large-scale inferences, extracting knowledge from large, diverse, and complex data sets, and altogether facilitating powerful new approaches to discovery and decision-making.”
Another big data group making waves in the public sphere is the UK-based Open Data Institute (ODI). Speaking at the Strata Conference in London last week, Gavin Starks, the CEO of the ODI, said he is seeing progress in open data initiatives around the world.
“I was heartened when the G8 signed an open data charter,” he says in his keynote. “And it was an astonishing thing to see ‘machine readable’ in a presidential order,” he adds, referring to President Obama’s May 9 executive order requiring that all government data posted for free public use on data.gov be in a machine-readable format.
Open Data Institute CEO Gavin Starks
“But it’s not just about government,” Starks says. “It’s part of a global trend towards a government 2.0 landscape, but it’s broadening. We’ve seen at the ODI commercial companies like Virgin, Rackspace, and Deloitte.”
These for-profit companies realize they have much to gain by opening up their big data troves and sharing the riches for the benefit of humanity. “There’s a real positive intent,” Starks says. “Surprisingly we’ve had organizations come to us and say, ‘We have these data sets. We’re not innovating as much as we could. If we make it open, how can we make it open to help a broader community engage with our information?'”
Recently, the ODI announced a new program that enables any organization, for-profit ventures and non-profit entities alike, to set up its own open data organization, dubbed an ODI node. Currently there are 13 ODI nodes around the world, ranging from Chicago to Moscow.
The goal is to deliver more transparency not only with government, but with corporations as well. “To help people build trust, I think, is the key outcome that people are looking for,” Starks says. “How can we build trust with organizations that we’re dealing with?”