Big Data in Space: Martian Computational Archeology
Summer 1975. While Led Zeppelin was releasing “Kashmir”, the Viking spacecraft left Cape Canaveral (Florida, USA) bound for Mars. With a budget of $1 billion, it was NASA’s most ambitious mission to Mars.
Almost 40 years later. While Coldplay releases “Paradise”, the data gathered by the Vikings is being studied again, as it will help future missions to Mars. This time, 21st-century computing technology is lending a hand.
Each Viking consisted of a lander module that detached from an orbiter used for terrain survey and communications. Viking 1 landed at Chryse Planitia and Viking 2 at Utopia Planitia. Both Vikings were a total success in terms of data gathered and lifetimes well beyond expectations. Viking 1 lasted more than 6 years, 2245 sols (a sol is a Martian day, lasting 24 hours, 39 minutes and 35.244 seconds), before being lost to a faulty software update. Viking 2, in turn, was lost to a battery failure after more than 3 years of service, 1281 sols.
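The sol-to-Earth-time arithmetic behind those lifetimes is easy to check. A minimal Python sketch (the constant is the sol duration quoted above; the function name is ours):

```python
# One Martian sol expressed in SI seconds: 24 h 39 min 35.244 s.
SOL_SECONDS = 24 * 3600 + 39 * 60 + 35.244
EARTH_DAY_SECONDS = 86400.0

def sols_to_earth_days(sols):
    """Convert a count of Martian sols into Earth days."""
    return sols * SOL_SECONDS / EARTH_DAY_SECONDS

# Viking 1's 2245 sols come out at roughly 2307 Earth days (over 6 years);
# Viking 2's 1281 sols are about 1316 Earth days (over 3.5 years).
```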
Besides the cameras, the Vikings carried a variety of sensors and instruments that were used to study the biology, chemical composition, meteorology, seismology and magnetic properties of the Red Planet.
All lander instruments were controlled by NASA/JPL during the initial nominal mission, but the meteorological ones later became the responsibility of the University of Washington (Seattle). While the Planetary Data System (PDS) was used to publish the nominal mission data, much of the later data never reached the public.
The PDS is a distributed data system used by NASA to store and organize data obtained by robotic missions within our Solar System. The challenges faced by this system are not only about format standardization and storage reliability, but also about credibility, as data is peer reviewed to meet the high quality standards imposed by the PDS. Additionally, planetary scientists individually curate each of the science discipline nodes into which the PDS is divided.
(left) Viking’s Data Acquisition and Processing Units
At the end of 2005 the University of Washington closed the Viking Computing Facility. The “data torch” was then passed to the Finnish Meteorological Institute (FMI), a long-standing cooperation partner.
The FMI saw a great opportunity in this, as data reanalysis would be extremely useful for future missions like Mars MetNet. This mission brings together Finland, Russia and Spain, aiming to deploy a meteorological network on the Martian surface consisting of several tens of probes. We, the Distributed Systems Architecture Research Group, got involved by bringing cloud computing to the first of a long list of challenges involving onboard instruments.
Returning to the Vikings: the data and the sorting and analysis programs, including the processing environment, were optimized for a PRIME computer built in the late 70s. The FMI team ported all the data to a Linux environment using Perl scripts.
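The FMI port was done in Perl, and the actual PRIME record layout is not described here. Still, the core of such a conversion step can be sketched; the following Python fragment assumes a purely hypothetical fixed-width format of one big-endian 32-bit sol number followed by a big-endian 32-bit float reading per record:

```python
import struct

# Hypothetical record layout: big-endian uint32 sol + big-endian float32 reading.
RECORD = struct.Struct(">If")  # 8 bytes per record

def read_records(blob):
    """Decode a raw byte string into a list of (sol, reading) tuples."""
    return [RECORD.unpack_from(blob, offset)
            for offset in range(0, len(blob), RECORD.size)]
```

The real conversion would of course have to handle the PRIME machine’s actual word size, encodings and any header blocks, but the decode-loop shape is the same.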
(left) PRIME 400 ICF computing area (1977).
The original binary mission data fit on 2 DVDs, with 400 MB devoted to the meteorological instruments. This data is now being analyzed again to identify instrument failures and instrument calibration changes. When the analysis is complete, the full Viking meteorological data set will be available to the scientific community for the first time.
In this context, the MEIGA-MetNet Project team is applying a model for estimating the eclipses caused by Phobos, Mars’ biggest moon, and their correlation with the Viking landers’ data.
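The correlation idea can be illustrated with a toy sketch: given eclipse windows predicted by the model, check whether the lander time series dips below its baseline inside each window. Everything here (function name, threshold, data shapes) is illustrative, not the project’s actual model:

```python
def flag_eclipse_dips(times, values, windows, drop=0.5):
    """Return the predicted (start, end) windows in which the measured
    series dips more than `drop` below the out-of-window baseline."""
    hits = []
    for start, end in windows:
        inside = [v for t, v in zip(times, values) if start <= t <= end]
        outside = [v for t, v in zip(times, values) if not (start <= t <= end)]
        if inside and outside:
            baseline = sum(outside) / len(outside)  # crude local baseline
            if min(inside) < baseline - drop:
                hits.append((start, end))
    return hits
```

A real analysis would use a proper detrended baseline and instrument noise models, but the principle is the same: the model supplies the windows, the lander data confirms (or rejects) them.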
For this task, we designed a computing framework that takes advantage of public cloud infrastructures, as they provide cheap, on-demand and scalable resources for the sporadic processing required.
Framework for processing the Phobos eclipse data.
This framework is being extended to support the processing of other data, not only that from the Viking landers. All the upcoming data processing is conceived as a calibration of the cloud system, much like the calibration done with any mission-critical instrument, because live data will be fed through the framework once all the processing functions are available.
For the moment, no cloud storage system has been considered, because each individual task carries its own data. Once more of the functions mentioned above are implemented, different storage strategies will be studied, especially as the processing becomes (even) more complex.
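The “each task carries its own data” pattern removes any dependency on shared storage: every work unit bundles its input slice and travels to whichever worker picks it up. A minimal sketch of that pattern (the class and the placeholder analysis are ours, not the framework’s API):

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class MetTask:
    sol: int
    samples: list  # the task's own data slice travels with the task

def process(task):
    # Placeholder analysis: average the task's bundled samples.
    return task.sol, sum(task.samples) / len(task.samples)

def run_all(tasks, workers=4):
    """Fan the self-contained tasks out to a worker pool; no shared storage."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process, tasks))
```

Because tasks are self-contained, they map naturally onto cheap, short-lived cloud instances; a shared storage layer only becomes worthwhile once tasks start exchanging intermediate results.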
Space exploration has always returned great ideas and inventions on the investment made by humanity. Nowadays, when many companies are trying to find ways to extract useful information from old data, this article has tried to provide some of these ideas. Once again Albert Einstein was right: “Learn from yesterday, live for today, hope for tomorrow. The important thing is not to stop questioning”… or processing.
References
A.-M. Harri, W. Schmidt, P. Romero, L. Vazquez, G. Barderas, O. Kemppinen, C. Aguirre, J.L. Vazquez-Poletti, I.M. Llorente and H. Haukka: Phobos eclipse detection on Mars, theory and practice. Technical Report, 2012 (to be published), Finnish Meteorological Institute, Finland.
O. Kemppinen: Analysis of unpublished high-resolution Martian meteorological data from Viking landers 1 and 2, M.Sc. Thesis, 2011, Aalto University, Finland.
P. Romero, G. Barderas, J.L. Vázquez-Poletti and I.M. Llorente: Chronogram to detect Phobos Eclipses on Mars with the MetNet Precursor Lander. Planetary and Space Science, vol. 59, n. 13, 2011, pp. 1542-1550.
J. L. Vázquez-Poletti, G. Barderas, I. M. Llorente and P. Romero: A Model for Efficient Onboard Actualization of an Instrumental Cyclogram for the Mars MetNet Mission on a Public Cloud Infrastructure. PARA2010: State of the Art in Scientific and Parallel Computing, Reykjavik (Iceland), June 2010. Proceedings published in Lecture Notes in Computer Science (LNCS). Volume 7133. Springer Verlag.
About the Author
Jose Luis Vazquez-Poletti
Dr. Jose Luis Vazquez-Poletti is Assistant Professor in Computer Architecture at Complutense University of Madrid (UCM, Spain), and a Cloud Computing Researcher at the Distributed Systems Architecture Research Group (http://dsa-research.org/).
He is (and has been) directly involved in EU-funded projects, such as EGEE (Grid Computing) and 4CaaSt (PaaS Cloud), as well as many Spanish national initiatives.
From 2005 to 2009 his research focused on porting applications to Grid Computing infrastructures, an activity that put him “where the real action was”. These applications came from a wide range of areas, from Fusion Physics to Bioinformatics. During this period he acquired the skills needed to profile applications and make them benefit from distributed computing infrastructures. Additionally, he shared these skills in many training events organized within the EGEE Project and similar initiatives.
Since 2010 his research interests lie in different aspects of Cloud Computing, always with real-life applications in mind, especially those in the High Performance Computing domain.