We often hear about the advantages of mining massive data sets for the benefit of big business and scientific research, but this week scholars in the humanities demonstrated their ability to find new ways to extract value from a range of data repositories.
The drive to encourage new data processing and management tools in the humanities was spurred by a competition among academic institutions called the “Digging into Data Challenge.”
Fourteen teams from the Netherlands, Canada, the UK and the US have been awarded a combined total of $4.8 million in grant funding to find ways to apply computational techniques to big data to change the nature of humanities and social sciences research.
The challenge, which started in 2009, aims to address the problems of big data, which is reshaping the landscape for humanities and social sciences research. As the organizers claim, the world is becoming digitized, including the materials that researchers in the humanities and social sciences have been using for many decades (newspapers, books, etc.). The challenge is meant to spur the “research community to help create the new research infrastructure for 21st century scholarship.”
According to a release today, the projects that received funding covered a wide variety of topics, including “using information retrieval techniques to investigate changes in Western music; using high resolution imaging to study the ancient Egyptian mummification process; using data mining technology to shed light on the impacts of economic opportunity and spatial mobility on social structure; and using natural language processing to analyze large text archives in the study of human rights abuses.”
Some notable efforts selected for funding include a project to develop new ways of exploring the full-text content of digital historical records. Another project will analyze a vast set of Open Access research publications, using natural language processing and social network analysis methods to identify patterns and trends in research communities. A third effort will create a scalable workbench called InterDebates, with the goal of digging into data provided by hundreds of thousands of digitized print documents, including books.