September 3, 2012

What Sociologists Say About Big Data

Ian Armas Foster

Part of the purpose of big data research is to sift through the vast amount of data generated by social networking sites like Facebook and Twitter so that businesses can get a better idea of how their latest marketing campaigns are performing. Vendors and researchers have made enormous progress in collecting all this data and even in analyzing it a little.

However, social scientists are left jumping up and down and waving their arms around asking why they have yet to be consulted in all of this. After all, Facebook and Twitter are self-described “social” networks.

Sociologists believe that analyzing the realm of those social networks should not fall entirely into the hands of the data scientists. Eindhoven University of Technology’s Chris Snijders and Uwe Matzat teamed up with the University of Deusto’s Ulf-Dietrich Reips to publish a paper in the International Journal of Internet Science that outlined the contributions the social sciences could make to the big data industry.

“The interesting point,” the paper reads, “is that these limitations [in big data research] can (and have to) be addressed by theory guided research that is typically conducted by social scientists. Accordingly, opportunities emerge for those social and behavioral scientists who are willing to collaborate with the Big Data researchers in the natural, engineering, and computer sciences.”

According to the paper, social science has already produced results about the internet and the people who use it that would be useful to big data research. The major finding it cites comes from British researchers at Reuters and the Oxford Internet Institute, who found that social media has established itself as the “Fifth Estate” alongside the legislature, judiciary, executive, and press, the last of which it has already surpassed in importance.

Social media is truly a powerful thing. A British citizen was sent to jail for incendiary racial remarks on a Twitter account. The recent Libyan and Egyptian revolutions were both sparked by angry protesters who gathered and planned their demonstrations on Facebook and Twitter. It is this power that businesses and researchers wish to harness.

So what is it the social scientists want to help with? They do research by conducting surveys and hoping the responses they get are reliable and insightful. As the paper puts it, “In short, the crucial point is that the combination of large but sparse Big Data with smaller but rich survey data offers the opportunity to link the individual-level and the community-level characteristics with the individual online data.”

Put simply, social science could gather some basic insights from surveys that would help refine big data research. Take video games, for example. The paper notes that video games, especially those with massive online components, are more likely to succeed if they make gamers feel as if they belong to a special social group as a result of their play.

While this finding is not exactly outside the realm of common sense, it is incredibly difficult to gauge the social impact of a video game before it is released. Video games are designed with an eye toward ease of use, enjoyable gameplay, difficulty, an enthralling storyline, and a variety of other things that can be beta-tested easily before launch. However, the success of the game’s social aspect is impossible to judge before the game is released and the social center created.

Analyzing which micro-processes, as the paper calls them, lead to a successful social center is essential to the design of future video games and a job for the social scientists.

Of course, businesses other than video game companies wish to utilize the vast amount of data out there as well. People can be quick to say “Let’s analyze all this data!” without stopping to ask why they should. Even once that initial step is taken, it can be difficult to identify which data is important.

The paper claims that “one could consider empirical sociological and social-psychological analyses of processes of tie-formation and bring these back to a limited number of behavioral mechanisms, such as homophily of different kinds, reciprocity, scope of access to other nodes, etc. This knowledge can then be used as input for the selection and formulation of mathematically tractable models of tie-formation.”

In essence, sociologists know which behavioral traits (or mechanisms) tend to lead to certain product attachments (or tie-formation). For example, an environmentalist living in Seattle may be more likely to drive a Subaru because a Subaru handles Seattle’s inclement weather well and is fuel-efficient, whereas a construction worker in North Carolina may drive a Ford because Ford markets itself as the workman’s vehicle. It is insights like these that social scientists believe big data research is missing.
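To make two of the mechanisms named in the paper’s quote a little more concrete, here is a minimal Python sketch of how reciprocity and homophily might be measured on a small set of online ties. Everything in it is an illustrative assumption rather than anything taken from the paper: the people, the "follows" ties, the `interest` attribute, and the two helper functions are all hypothetical, and the measures are simple proportions, not the mathematically tractable tie-formation models the authors have in mind. Here a tie is read in the network sense, as a link from one person to another.

```python
# Hypothetical directed "follows" ties between people: (follower, followed).
ties = {
    ("ana", "ben"),
    ("ben", "ana"),
    ("ana", "cal"),
    ("cal", "dee"),
    ("ben", "dee"),
}

# Hypothetical attribute, such as one gathered from a small survey.
interest = {
    "ana": "gaming",
    "ben": "gaming",
    "cal": "cycling",
    "dee": "cycling",
}


def reciprocity(ties):
    """Share of directed ties whose reverse tie also exists."""
    if not ties:
        return 0.0
    mutual = sum(1 for (a, b) in ties if (b, a) in ties)
    return mutual / len(ties)


def homophily(ties, attribute):
    """Share of ties that connect two people with the same attribute value."""
    if not ties:
        return 0.0
    same = sum(1 for (a, b) in ties if attribute[a] == attribute[b])
    return same / len(ties)


print(f"reciprocity: {reciprocity(ties):.2f}")
print(f"homophily:   {homophily(ties, interest):.2f}")
```

The sketch also hints at the collaboration the paper describes: the ties would come from the large but sparse online data, while an attribute like `interest` is the kind of detail a smaller, richer survey could supply.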

The paper is optimistic about the ability of social science and big data to come together and do some good in the world. “Furthermore, many argue that the combination of Big Data efforts with social science theory would be useful for the prediction of social and economic crises. The FuturICT project is an outcome of (and a starting point for) researchers in several countries who share these hopes.”

As already mentioned, social media played a role in the recent North African and Middle Eastern revolutions. Big data, perhaps even Hadoop and its well-known prediction-friendly capabilities, could help to identify such budding revolutions before they happen.

Sociology has a place in big data research. Whether it has as big a place as it wants remains to be seen; however, its role in steering research toward where it can be most useful could be important.
