NOAA Launches Big Data Project
A big data project launched by the U.S. Commerce Department brings together major cloud analytics vendors and the Open Cloud Consortium to develop the infrastructure needed to access the more than 20 terabytes of satellite weather data generated each day.
The National Oceanic and Atmospheric Administration (NOAA), a Commerce Department agency, currently manages the daily torrent of observational data collected by weather satellites and other ocean- and land-based sensors. Last month, NOAA and Commerce officials announced a research agreement with cloud providers to develop the infrastructure needed for open, scalable access to weather data.
The partners include Amazon Web Services, Google Cloud Platform, IBM, Microsoft and the Open Cloud Consortium. These Infrastructure-as-a-Service (IaaS) providers will in turn form “data alliances” with potential users and resellers of government weather data.
NOAA said that in one possible deployment scenario, IaaS providers “could help to position NOAA’s data next to their own high performance computing, analytic and storage services,” allowing potential users to “take advantage of that positioning to run algorithms, perform research and create inventions.”
In a blog post this week, the White House Office of Science and Technology Policy, which is spearheading government open data efforts, said “industry saw great untapped economic potential in making NOAA’s environmental data more accessible, and that this economic potential could far outweigh the data distribution costs.”
Commerce Secretary Penny Pritzker unveiled the NOAA Big Data Project during a policy forum sponsored by the American Meteorological Society. “These collaborations will create open platforms where private industry, academia and individual innovators can access our data through the cloud on a completely new scale,” Pritzker said. “This announcement is a big deal. For the first time, we are giving the public the opportunity to mine our data.”
The data alliances will serve as prototypes for developing new distribution channels for NOAA weather data. The agency will meanwhile maintain its existing web sites and other portals to “ensure that all of the data distributed through the new ‘data alliance’ model is also in the public domain and accessible non-preferentially.”
The NOAA data project will seek to leverage a range of data analytics engines to process weather and climate datasets, including Google’s BigQuery and Cloud Dataflow services, IBM’s Bluemix platform and the Microsoft Azure Government platform.
The Open Cloud Consortium’s collaboration with NOAA aims “to make finding and accessing this data easier for the academic, non-profit and research communities, to enable scientific analysis and to drive discovery,” said Robert Grossman, consortium director.
Amazon Web Services said it is hosting an online information session on May 7 to discuss further collaboration with the AWS-led data alliance. The cloud provider also said it is looking for ways to shift more NOAA data onto its public cloud platform and “build an ecosystem of innovation around it.”
AWS already hosts several public datasets, including Landsat satellite imagery, NASA satellite images, a human genome project and U.S. census data.