January 30, 2012

A “Jurassic Park” for Archaeological Data

Nicole Hemsoth

Long, long ago—well, 1993, to be exact—author Michael Gruber prophesied a coming age of “digital archaeology.” By his definition, this was a discipline that didn’t yet exist, but that would soon have to be developed to deal with the problems of a new world of data.

To support his argument, Gruber pointed to NASA as a prime example of the need for digital archaeology. As he stated in his 1993 Wired article:

“Information sent back from space missions in the 1960s [is] stored on deteriorating magnetic tape, in formats designed for computers that were obsolete twenty years ago. NASA didn’t have the funds to transfer the data before the machines became junk. The National Center for Atmospheric Research has ‘thousands of terabits’ of data on aging media that will probably never be updated because it would take a century to do it. The archival tapes of Doug Engelbart’s Augment project – an important part of the history of computing – are decaying in a St. Louis warehouse.”

NASA went on to solve these problems, and as time marches on, other fields, including archaeology proper, have been following suit in creative ways.

But the problem facing curators of academic information collections isn’t just about data acquisition and basic management any longer—the data needs to be accessible, securely stored for future generations, and quickly and easily usable by researchers. This familiar problem exists in all disciplines, but this week archaeological data was the subject of some news and a fresh round of grant funding.

The Digital Archaeological Record, or tDAR, is the United States’ largest digital store of global archaeological data. It houses everything from 3D scans of artifacts to more traditional archaeological research materials, including digitized books and scholarly papers. In many ways, this is the “Jurassic Park” for big data—an incredibly large collection of critical archaeological data, but one that still requires a great deal of fine-tuning to ensure accessibility, longevity and usability.

As Arizona State University professor and sustainability scientist Keith Kintigh summarized, “In laboratory-based science, experiments can be repeated; however, you can’t dig a site twice… The archaeological record provides our only access to most of human history.”

Because scientists can’t “dig a site twice” and the research data is crucial to preserve, tDAR was formed in 2009. Initial support for tDAR came via a large grant from the Mellon Foundation, followed by another $1.2 million grant from the same organization this past week. The new funds will allow tDAR to respond to changes in how data is created, stored and accessed in this area of study—a major issue given the diversity of data types.

tDAR is overseen by Arizona State University’s Center for Digital Antiquity, which maintains the collection of hugely diverse media and data types. However, according to the center’s executive director, Francis P. McManamon, “approximately 40,000 archaeological investigations take place every year in the United States, yet only a handful thoroughly publish their findings and the supporting data in traditional, general distribution books. Most projects do produce limited distribution paper reports that end up in just a few of the thousands of state and federal agency offices and university libraries.”

In addition, researchers at the center note that there is no reliable way to discover the existence of reports relevant to a particular research topic, and that the reports that do exist are frequently difficult to use and expensive to obtain.

Without secure and efficient ways to store this data, a “tragic” situation can occur, according to Julie Newberg at Arizona State University. Newberg writes that loss of this information would be the loss of “irreplaceable information about our national and global heritage and [would] represent a waste of time, effort and public money that has been expended to collect, analyze and report the data.”

ASU’s Keith Kintigh put the importance of big archaeological data in context, noting that, “For example, human societies both contribute to and respond to gradual environmental change. Archaeological evidence allows us to better understand the conditions under which societies are resilient to long-term change, and the configurations that lead to collapse.”

As we have reported previously, there is a new push in both academia and enterprise settings to transform “tangible” information sources into archived, accessible, safely stored resources for future generations. tDAR is another prime example of how academics are addressing the challenges of massive datasets that are, in many senses, “mission critical” to the future of the field.

In this way, the data curation priorities are no different from those in enterprise contexts—meaning that academia and the business world could benefit from sharing stories and solutions when it comes to massive, “mission critical” data curation.

Related Stories

The Evolving Art (and Business) of Data Curation

Astronomers Leverage “Unprecedented” Data Set

Big Data in Space: Martian Computational Archeology
