What Sociologists Say About Big Data
Part of the purpose of big data research is to sift the vast amount of data generated by social networking sites like Facebook and Twitter so that businesses can better gauge the success of their latest marketing campaigns. Vendors and researchers have made enormous progress in collecting this data and have begun to analyze it.
However, social scientists are left jumping up and down and waving their arms around asking why they have yet to be consulted in all of this. After all, Facebook and Twitter are self-described “social” networks.
Sociologists believe that analyzing the realm of those social networks should not fall entirely into the hands of the data scientists. Eindhoven University’s Chris Snijders and Uwe Matzat teamed up with the University of Deusto’s Ulf-Dietrich Reips to publish a paper in the International Journal of Internet Science that outlined the contributions the social sciences could make to the big data industry.
“The interesting point,” the paper reads, “is that these limitations [in big data research] can (and have to) be addressed by theory guided research that is typically conducted by social scientists. Accordingly, opportunities emerge for those social and behavioral scientists who are willing to collaborate with the Big Data researchers in the natural, engineering, and computer sciences.”
According to the paper, social science has already produced findings about the internet and the people who use it that would be useful to big data research. The major finding it cites comes from British researchers at Reuters and the Oxford Internet Institute, who found that social media has established itself as the “Fifth Estate” alongside the legislature, judiciary, executive, and press, the last of which it has already surpassed in importance.
Social media is truly a powerful thing. A British citizen was sent to jail for incendiary racial remarks on a Twitter account. Both the recent Libyan and Egyptian revolutions were sparked in part by angry protesters who gathered and planned their demonstrations on Facebook and Twitter. It is this power that businesses and researchers wish to harness.
So what is it the social scientists want to help with? They typically do research by conducting surveys and counting on the responses to be reliable and insightful. As the paper puts it: “In short, the crucial point is that the combination of large but sparse Big Data with smaller but rich survey data offers the opportunity to link the individual-level and the community-level characteristics with the individual online data.”
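To make the idea of linking the two kinds of data concrete, here is a minimal sketch in Python. It is not taken from the paper: the survey fields, user IDs, and activity log are all invented for illustration, and a real study would need far more care in matching respondents to accounts.

```python
from collections import defaultdict

# Hypothetical survey data: small but rich, one record per respondent.
survey = {
    "u1": {"community": "north", "trust_score": 4},
    "u2": {"community": "north", "trust_score": 2},
    "u3": {"community": "south", "trust_score": 5},
}

# Hypothetical online data: large but sparse, one (user_id, action) pair per event.
activity_log = [
    ("u1", "post"), ("u1", "like"), ("u2", "post"),
    ("u3", "like"), ("u3", "like"),
    ("u4", "post"),  # u4 never answered the survey
]

def link_survey_to_activity(survey, activity_log):
    """Attach each respondent's online action count to their survey answers."""
    counts = defaultdict(int)
    for user_id, _action in activity_log:
        counts[user_id] += 1
    # Only users who answered the survey appear in the linked result.
    return {
        user_id: {**answers, "actions": counts.get(user_id, 0)}
        for user_id, answers in survey.items()
    }

def actions_by_community(linked):
    """Roll individual online activity up to the community level."""
    totals = defaultdict(int)
    for record in linked.values():
        totals[record["community"]] += record["actions"]
    return dict(totals)

linked = link_survey_to_activity(survey, activity_log)
community_totals = actions_by_community(linked)
```

In this toy setup the survey supplies the individual-level and community-level characteristics, while the log supplies the online behavior; the join is what lets one be read against the other.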
Put simply, social science could gather some basic insights from surveys which would help refine big data research. Take video games, for example. The paper notes that video games, especially those with massive online components, are more likely to succeed if they make gamers feel as if they belong to a special social group as a result of their play.
While this finding is not exactly outside the realm of common sense, it is incredibly difficult to gauge the social impact of a video game before it is released. Video games are designed with a mind toward ease of use, enjoyable gameplay, difficulty, enthralling storyline, and a variety of other things that can be beta-tested easily before launch. However, the success of the game’s social aspect cannot really be judged until the game is released and its social center has formed.
Analyzing which micro-processes, as the paper calls them, lead to a successful social center is essential to the design of future video games and a job for the social scientists.
Of course, businesses other than video game companies wish to utilize the vast amount of data out there as well. People can be quick to say “Let’s analyze all this data!” without stopping to ask why they should. Even if that initial step is taken, it can be difficult to move on and identify which data is important.
The paper claims that “one could consider empirical sociological and social-psychological analyses of processes of tie-formation and bring these back to a limited number of behavioral mechanisms, such as homophily of different kinds, reciprocity, scope of access to other nodes, etc. This knowledge can then be used as input for the selection and formulation of mathematically tractable models of tie-formation.”
In essence, sociologists know which behavioral traits (or mechanisms) tend toward certain product attachment (or tie-formation). For example, an environmentalist living in Seattle may be more likely to drive a Subaru because a Subaru handles Seattle’s inclement weather well and is fuel-efficient, whereas a construction worker in North Carolina may drive a Ford because Ford markets itself as the workman’s vehicle. It is insights like these that social scientists believe big data research is missing.
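Two of the mechanisms the paper names, homophily and reciprocity, are simple enough to measure directly on network data. The following is a minimal Python sketch, not the paper's method: the people, ties, and interests are invented, and the two functions are just one plain way of turning each mechanism into a number.

```python
# Hypothetical directed ties: (sender, receiver) means sender follows receiver.
ties = [
    ("ann", "bob"), ("bob", "ann"),
    ("ann", "cat"),
    ("cat", "dan"), ("dan", "cat"),
    ("bob", "dan"),
]

# Hypothetical attribute used to check whether tied people are alike.
interest = {"ann": "gaming", "bob": "gaming", "cat": "cars", "dan": "cars"}

def homophily_share(ties, attribute):
    """Fraction of ties that connect two people with the same attribute value."""
    if not ties:
        return 0.0
    same = sum(1 for a, b in ties if attribute[a] == attribute[b])
    return same / len(ties)

def reciprocity_share(ties):
    """Fraction of distinct ties that are returned in the opposite direction."""
    tie_set = set(ties)
    if not tie_set:
        return 0.0
    mutual = sum(1 for a, b in tie_set if (b, a) in tie_set)
    return mutual / len(tie_set)

homophily = homophily_share(ties, interest)
reciprocity = reciprocity_share(ties)
```

Shares like these are the kind of compact, theory-driven quantities that could feed the "mathematically tractable models" the paper describes, rather than analyzing every available field in the raw data.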
The paper is optimistic about the ability of social science and big data to come together and do some good in the world. “Furthermore, many argue that the combination of Big Data efforts with social science theory would be useful for the prediction of social and economic crises. The FuturICT project is an outcome of (and a starting point for) researchers in several countries who share these hopes.”
As already mentioned, social media played a role in the recent North African and Middle Eastern revolutions. Big data analysis, perhaps running on platforms such as Hadoop, could help to identify those budding revolutions before they happen.
Sociology has a place in big data research. Whether it has as big a place as it wants remains to be seen; however, its role in steering research toward where it can be most useful could be important.