May 10, 2013

Nate Silver Warns Against Big Data Assumptions

Ian Armas Foster

While there are many uses today for big data, the general principle is this: more data means bigger sample sizes, from which more accurate representations can be drawn. However, noted statistically inclined prognosticator Nate Silver warned at the RMS Exceedance Conference in Boston this week that an overabundance of data can be dangerous and counterproductive if managed improperly.

Silver garnered national attention in November of last year when his statistical probability models correctly picked the winner of every state in the presidential election, along with every Senate race that year except for one. In Boston, he essentially argued that the more data is collected, the easier it becomes to cherry-pick numbers and create correlations that are not actually there. The apparent issue with this argument is that institutions looking to garner insights from swaths of data would seem unlikely to intentionally misrepresent their data.
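The point about correlations that are not actually there can be made concrete with a small simulation. The sketch below is an illustration only, not anything Silver presented: it generates a random target series, then compares it against a growing pool of equally random candidate series and reports the strongest correlation found. The function names, the sample size of 30, and the fixed seed are all assumptions chosen for the example.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_variables, n_samples=30, seed=0):
    """Largest |correlation| between pure noise and n_variables noise series.

    Every series is independent random noise, so any correlation found
    is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    strongest = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        strongest = max(strongest, abs(pearson(target, candidate)))
    return strongest


if __name__ == "__main__":
    # Hypothetical pool sizes: more candidate variables to search through.
    for n in (10, 100, 1000):
        print(n, round(strongest_spurious_correlation(n), 2))
```

Because the seed is fixed, each larger pool contains the smaller one, so the strongest correlation can only stay the same or grow as more variables are searched. No one has to misrepresent anything on purpose; picking the best-looking relationship out of a large pool is enough.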

It can be easy to forget, however, that while ‘big data’ has been an IT buzzword for the last two years, it is still a relatively recent phenomenon, especially in companies that are used to making decisions on gut instinct. As noted in last month’s Big Data in Sports feature, some coaches like the collection of big data for the wrong reason: it tells them things they already knew.

By the same token, a manager can just as easily stretch the data until it tells them what they thought they already knew.

Silver offered a couple of methods to combat these biases. The first is a different, though not drastically different, way of thinking.

“Think probabilistically,” Silver said. “Think in terms of probabilities and not in terms of absolutes.”

This is what Silver did in his presidential forecast, noting things like “Obama has an X percent chance of winning Virginia.” Per his numbers, there was actually only a one-in-five chance that he would correctly predict all 50 states. This approach has two immediate benefits. The first is that it makes prognostication impersonal (good for executives who may prefer not to be proven wrong). The second, and more useful, is that it recognizes and makes use of the margin of error.
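A figure like one-in-five can look surprising next to a perfect record, but it follows naturally from thinking in probabilities. As a rough sketch, if each state call is treated as independent (a simplifying assumption, since real state outcomes move together), the chance of getting every call right is the product of the individual probabilities. The probabilities below are invented for illustration and are not Silver's numbers.

```python
import math


def chance_all_calls_correct(call_probabilities):
    """Probability that every call is right, assuming independent calls."""
    return math.prod(call_probabilities)


# Hypothetical forecast: 40 near-certain states and 10 closer races.
safe_states = [0.99] * 40
close_states = [0.85] * 10

if __name__ == "__main__":
    p = chance_all_calls_correct(safe_states + close_states)
    print(f"Chance of a perfect 50-state map: {p:.1%}")
```

Even when every individual call is strongly favored, a clean sweep remains unlikely under these made-up inputs, which is why a forecaster who thinks probabilistically does not promise one.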

Another method Silver promoted was recognizing one’s biases. He noted an interesting experiment in which people examined identical resumes, one headed by a male name and the other by a female name. People who self-identified as having a gender bias actually judged the resumes more fairly, according to Silver, than those who claimed they held no such bias. Recognizing a bias, for Silver, means one can consciously act against it.

“Know where you’re coming from,” Silver said in his section on bias recognition. “You are defined by your weakest link,” he continued.

These tidbits of advice may not reach the granular level that institutions delving into the data ocean are hoping for. However, it is occasionally important to take a step back and understand why the data is being collected and analyzed in the first place, and what goals one hopes to reach. Failing to avoid these roadblocks to data-enhanced understanding can undermine those goals.


Applications: Predictive Analytics

Sectors: Government

Tags: Exceedance Conference, Nate Silver
