Since the announcement last March of the federal “Big Data Research and Development Initiative”, big data has become the popular label for computer science research.
The rampant use of the label is quite justified: coping with big data is an active challenge for every area of computer science. Aside from the technology developments, another interesting feature of the "big data" phenomenon is that it is drawing researchers and practitioners together across many disciplines.
Within computer science, multiple areas are focused on solving big data challenges. Database researchers, for instance, are working on accessing huge data sets distributed across multiple computer nodes, while storage, network, and application researchers tend to their own specific big data problems.
To give a sense of some of the work that is taking place behind the scenes in big data research, consider the following:
Different technologies are emerging depending on whether access is needed for queries or for updates.
Networking researchers are pushing for moving large data sets at ever higher speeds.
Theorists working in machine learning have seen an explosion of interest as scientists and corporations alike look for new knowledge both in data elements and in the linkages between data elements, that is, in the graph structure of data.
Graphics and visualization researchers are inventing new techniques to summarize structures and insights from gigantic data sets in visual form for rapid human comprehension.
Numerical analysts are creating efficient simulation techniques for detailed results that necessitate handling large quantities of data.
All of this work is interrelated. For example, analysts running simulations need to move big data sets through networks and need visualization techniques to make sense of them.
Beyond drawing together computer science subfields, researchers in far more disparate fields are now finding common interests clustered around "big data". The same infrastructure and techniques that businesses use to develop marketing plans are of interest to humanists studying libraries of thousands of books. Researchers in both the natural and physical sciences dealing with "big data" can draw not only on advances in computer science but also on advances from work that is initially targeted at business and digital humanities.
Advances and challenges in "big data" from multiple points of view are the topic of an upcoming symposium presented by the Yale Department of Computer Science: "Big Data Science: A Symposium in Honor of Martin Schultz".
The symposium will feature speakers and poster presentations on big data topics ranging from the theory of computer science to applications in social media. Information about the event, which will be free and open to the public, can be found at this site.