September 11, 2012

One Giant Leap for Psychohistory

Datanami Staff

For science fiction buffs, many news items coming out of the world of predictive and large-scale historical analysis evoke Asimov’s concept of psychohistory, in which probabilistic group patterns can predict major future events in history.

While big data platforms today may not be able to predict the eventual fall of Galactic Empires (although predicting revolutionary events from social data is a reality), they can generate insights based on large swaths of historical data.

In particular, Kalev Leetaru from the University of Illinois carried out a fascinating historical analysis project in which he mapped “the world according to Wikipedia” on SGI’s UV2, a system which the company has touted as “the world’s largest in-memory data mining system.” Leetaru explains the genesis of the project below.

Leetaru had already used similar analytics in publishing his Culturomics 2.0, where, according to SGI, he predicted the Arab Spring and the location of bin Laden’s hideout. When SGI’s Michael Woodacre approached him about the new UV2 system, which would apparently carry 4,000 processors and 64 terabytes of cache-coherent shared memory, he thought immediately of Wikipedia. “Wikipedia,” Leetaru said, “has become such a fundamental part of our daily life. What could we do if we made a map of this or a series of maps over time?”

So Leetaru set out to model the world according to Wikipedia’s English-language edition. The task itself is simple to comprehend: Leetaru wanted to mark down every mention of a name, date, or place found in Wikipedia.

“We used this UV2 system to pull out every geographic location across every page,” said Leetaru, “every date across every page, and every connection among those, basically capturing the spatial and temporal view of history as captured by Wikipedia’s pages… We can actually see history before our own eyes.”
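To make the idea concrete, here is a minimal Python sketch of that kind of extraction. It is not Leetaru’s script, which was written in Perl and is not reproduced in this article. It assumes pages arrive as plain text, treats any four-digit number from 1000 to 2099 as a year mention, and counts two years as connected when they appear on the same page. The function names and the sample pages are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Four-digit years from 1000 to 2099; a crude stand-in for real date parsing.
YEAR_RE = re.compile(r"\b(1\d{3}|20\d{2})\b")

def year_mentions(text):
    """Return every four-digit year mentioned in the text, in order."""
    return [int(y) for y in YEAR_RE.findall(text)]

def tally(pages):
    """Count year mentions and same-page year pairs across all pages."""
    mentions = Counter()
    links = Counter()
    for text in pages.values():
        years = year_mentions(text)
        mentions.update(years)
        # Each distinct pair of years on one page counts as one connection.
        links.update(combinations(sorted(set(years)), 2))
    return mentions, links

# Hypothetical sample pages, not real Wikipedia text.
SAMPLE_PAGES = {
    "Page A": "The war began in 1861 and ended in 1865.",
    "Page B": "Founded in 1865, the city was rebuilt in 1945.",
}

mentions, links = tally(SAMPLE_PAGES)
print(mentions.most_common(3))
print(links.most_common(3))
```

Holding both counters in memory for the whole corpus is the simple approach a large shared-memory machine allows; on a smaller system the same tallies would likely need to be split up and merged.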

Of course, there are over four million entries in the English version of Wikipedia, each of which has multiple references to any given date, place, or name. If those references are the neurons of Leetaru’s project, the connections are the synapses. Leetaru had to deal with and analyze one heck of a historical neural net.

UV2’s impressive in-memory capabilities made this possible for Leetaru. “I didn’t spend hours or days writing some fancy code that was distributed memory or using any of these fancy extensions, having to worry about memory management, allocating the right buffer sizes. I just wrote a ten-line Perl script in a matter of minutes and just ran it… If I had to summarize the advantage of the UV2 platform in a single sentence, I think it would be ‘Outcomes over algorithms.’”

The outcome is represented in a fascinating infographic on SGI’s Facebook page, which covers the number of date mentions per year, sentiment over time, and much more. For example, the sentiment-over-time graph shows sharp dips around the 1860s, 1910s, and 1940s. Those dips correspond with the American Civil War (the sharpest dip, perhaps shedding some light on the American bias in English-language Wikipedia articles) and both World Wars.
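The article does not say how the sentiment figures were computed, so the following Python sketch only illustrates the aggregation step behind a chart like this one. It assumes each date mention already carries a numeric tone score, negative for darker text, and averages those scores by decade. The function name and the scores are made up for illustration.

```python
from collections import defaultdict

def mean_tone_by_decade(scored_mentions):
    """Average tone scores per decade; input is (year, tone) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # decade -> [sum, count]
    for year, tone in scored_mentions:
        decade = (year // 10) * 10
        totals[decade][0] += tone
        totals[decade][1] += 1
    return {d: s / n for d, (s, n) in sorted(totals.items())}

# Made-up scores for illustration only, not data from the project.
SAMPLE = [(1861, -2.0), (1864, -4.0), (1955, 1.0), (1958, 3.0)]
print(mean_tone_by_decade(SAMPLE))
```

A dip in a plot of these averages would then mark a decade whose mentions sit in more negative text.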

There are plenty more insights to be gleaned and plenty to be extrapolated. Leetaru’s research shows that the world has become exponentially more interconnected over the last fifty years. This connectivity makes it easier to digitize human patterns and apply data analysis to them. Perhaps Asimov’s psychohistory is not thousands of years away after all.
