Amazon Hosting 20 TB of Climate Data
Looking to save the world through data? Amazon, in conjunction with the NASA Earth Exchange (NEX) team, today released over 20 terabytes of NASA-collected climate data as part of its OpenNEX project. The goal, they say, is to make important datasets accessible to a wide audience of researchers, students, and citizen scientists in order to facilitate discovery.
“Up until now, it has been logistically difficult for researchers to gain easy access to this data due to its dynamic nature and immense size,” writes Amazon’s Jeff Barr in the Amazon blog. “Limitations on download bandwidth, local storage, and on-premises processing power made in-house processing impractical. Today we are publishing an initial collection of datasets available (over 20 TB), along with Amazon Machine Images (AMIs), and tutorials.”
The OpenNEX project aims to give earth science researchers open access to resources including data, virtual labs, lectures, and computing.
Per NASA’s OpenNEX:
OpenNEX is designed to engage the global community of Earth scientists in cross-disciplinary research by combining global Earth observation datasets, shared scientific tools and workflows, and the power of cloud computing to enhance scientific collaboration and accelerate progress towards understanding emerging changes in the Earth system. OpenNEX is developing resources for scientists seeking to enhance their skills on a variety of research topics. OpenNEX learning resources include online lectures from the world’s leading scientific experts and hands-on data analysis and modeling exercises enabled through virtual machines and shared workflows.
Datasets available on Amazon include the following:
- Data for Climate Assessment – This package is made up of climate scenarios created from General Circulation Models, which are mathematical models of atmospheric and oceanic circulation that account for an array of variables, including temperature and humidity.
- Landsat Global Land Survey – This dataset contains space-based mid-resolution data from the past four decades, letting researchers observe changes that have occurred over that period.
- MODIS Vegetation Indices – Used in such applications as biogeochemical modeling, land-use planning, and land cover change detection, the MODIS Vegetation Indices provide information on vegetation conditions over time.
Additionally, NASA is publishing its open source Webification tool, which provides easy access to all of the above datasets in a simplified format. According to Barr, the tool lets data be accessed via URL and returned in JSON or binary format.
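As a rough illustration of what URL-based access can look like from a client, the following Python sketch fetches a URL and decodes the body as JSON when the response is labelled as JSON, returning raw bytes otherwise. The tool's actual URL layout is not described here, so the URL is left as a caller-supplied argument, and the function names `decode_body` and `fetch` are hypothetical.

```python
import json
from urllib.request import urlopen


def decode_body(body, content_type):
    """Return a parsed object for JSON responses, raw bytes for anything else."""
    if "json" in content_type.lower():
        return json.loads(body.decode("utf-8"))
    return body


def fetch(url):
    """Fetch a URL and decode it according to its Content-Type header.

    Assumption: the server labels JSON responses with a JSON content type.
    """
    with urlopen(url) as response:
        body = response.read()
        content_type = response.headers.get("Content-Type", "")
    return decode_body(body, content_type)
```

Calling `fetch` with a URL that returns JSON would give back ordinary Python dictionaries and lists, while a binary response would come back as `bytes` for the caller to interpret.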
To help users get acquainted with the data and how to process it through AWS, NASA will host a series of virtual workshops.
Those interested in the datasets can find them stored in Amazon S3 at s3://nasanex.
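For readers who want to browse the bucket programmatically, here is a minimal Python sketch using only the standard library. It turns an `s3://` URI into a request against the S3 REST listing endpoint and parses the XML response into top-level prefixes and object keys. It assumes the bucket permits anonymous listing, which is not confirmed above; the helper names `listing_url`, `parse_listing`, and `list_bucket` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

# XML namespace used by S3 bucket listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(s3_uri, max_keys=20):
    """Build an HTTPS ListObjectsV2 URL from an s3://bucket[/prefix] URI."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    query = {"list-type": "2", "delimiter": "/", "max-keys": str(max_keys)}
    prefix = parsed.path.lstrip("/")
    if prefix:
        query["prefix"] = prefix
    return f"https://{parsed.netloc}.s3.amazonaws.com/?{urlencode(query)}"


def parse_listing(xml_data):
    """Extract (prefixes, keys) from a ListBucketResult XML document."""
    root = ET.fromstring(xml_data)
    prefixes = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    keys = [e.text for e in root.findall("s3:Contents/s3:Key", S3_NS)]
    return prefixes, keys


def list_bucket(s3_uri, max_keys=20):
    """Fetch and parse one page of a listing (assumes anonymous access works)."""
    with urlopen(listing_url(s3_uri, max_keys)) as response:
        return parse_listing(response.read())
```

With network access, `list_bucket("s3://nasanex")` would return one page of top-level prefixes and keys. The AWS command-line tools or an SDK are the more usual route for larger downloads.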