April 26, 2013

Stanford Receives DARPA Grant to Study Big Data

Experts in the analysis of big data have noticed a curious pattern among those who tweet: Twitter accounts with the most followers are more likely to attract new ones. It's just one of the many nuggets revealed when researchers peer into the ocean of data being stockpiled by social media channels.

Stanford researchers from diverse disciplines are developing new ways to find meaning in data. The promise of their work has grown so great that the Defense Advanced Research Projects Agency (DARPA) recently stepped up to the plate with a grant of $5.6 million to support their research.

The new project is called MEGA: Modern Graph Analysis for Dynamic Networks, and is led by Associate Professor Ashish Goel of Stanford's Management Science and Engineering department. A team of seven principal investigators, six of them Stanford faculty, will develop algorithms that model human communication and detect subtle patterns in huge data sets from social media.

DARPA is interested because, from a national security standpoint, big data holds the promise of recognizing threats in unusual or suspicious social interactions of terrorists and other foreign adversaries. But Goel, who also holds a courtesy appointment in computer science and serves on the technical advisory board of Twitter, Inc., said that the models and algorithms MEGA develops will also influence social media itself, leading to a more sophisticated, personalized experience for all users.

Our daily social communication is spread across many forms of interaction. E-mails, tweets, text messages and Facebook posts define our modern social lives. More than ever, information about this correspondence and behavior can be collected, stored and made available to computer scientists.

With access to billions of tweets, e-mails and text messages, a project like MEGA can build reliable mathematical models of social phenomena, such as the way news spreads through a network or how people choose their social connections, Goel said. "From an intellectual point of view, it's really exciting."

One goal of the MEGA project is to model human online behavior and find how it shapes social networks. The team can then transform these patterns into a more general, abstract theory and see if it applies across many social networks.
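As an illustration of what such a behavioral model can look like, the sketch below simulates the pattern noted at the top of this article, in which accounts with more followers attract new ones faster. It uses preferential attachment, a standard textbook model of network growth; nothing in this article says that MEGA uses this model, and the function name, account counts and seed are assumptions chosen for the example.

```python
import random


def simulate_follows(num_accounts, num_new_follows, seed=0):
    """Simulate new follows where popular accounts attract followers faster.

    Every account starts with one follower. Each new follow picks an account
    with probability proportional to its current follower count
    (preferential attachment). All parameters are illustrative.
    """
    rng = random.Random(seed)
    followers = [1] * num_accounts
    for _ in range(num_new_follows):
        chosen = rng.choices(range(num_accounts), weights=followers, k=1)[0]
        followers[chosen] += 1
    return followers


followers = simulate_follows(num_accounts=100, num_new_follows=5000)
top_ten_share = sum(sorted(followers, reverse=True)[:10]) / sum(followers)
print(f"Top 10 of 100 accounts hold {top_ten_share:.0%} of all followers")
```

Although every account starts out identical, small early leads compound, so the ten most-followed accounts typically end up with well over a tenth of all followers.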

The sheer number of communications and the speed at which a network changes have given rise to new challenges, Goel said, problems that more storage or more processing power cannot solve. When analyzing the masses of data flowing out of popular social media sites like Twitter, for instance, what happened yesterday might as well have happened last century. What matters most is now. The MEGA team wants to analyze the data as it arrives, not gather and organize it later.

"On a site like Twitter, you're not finding data that was there yesterday, you're finding data that was there last second. And even one second of this data is too big to process on a single machine," he said. To achieve real-time analysis, the data must be stored and explored across many different computers, which requires yet more new algorithms. This is a second component of MEGA's research: writing the step-by-step procedures for processing distributed data in real time.

Goel says they have had some early successes, and the group expects to publish high-impact results in the form of new models and algorithms within the project's first or second year.
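To make the idea of processing a stream across many machines more concrete, here is a minimal sketch of one common approach: hash-partitioning incoming items so that each worker sees only a slice of the stream, then merging the per-worker results. It is not MEGA's algorithm. The workers here are plain objects in a single process standing in for separate machines, and the names `shard_for`, `Worker` and `count_hashtags` are hypothetical.

```python
import zlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a key to a shard with a stable hash, so a given hashtag
    is always routed to the same worker."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class Worker:
    """Stand-in for one machine; it only counts the items routed to it."""

    def __init__(self):
        self.counts = Counter()

    def process(self, hashtag):
        self.counts[hashtag] += 1


def count_hashtags(stream, num_shards):
    """Route each incoming hashtag to a worker, then merge the local counts."""
    workers = [Worker() for _ in range(num_shards)]
    for hashtag in stream:
        workers[shard_for(hashtag, num_shards)].process(hashtag)
    merged = Counter()
    for worker in workers:
        merged.update(worker.counts)
    return merged, workers


stream = ["#bigdata", "#darpa", "#bigdata", "#stanford", "#bigdata", "#darpa"]
merged, workers = count_hashtags(stream, num_shards=3)
print(merged.most_common(1))
```

Because each hashtag lives on exactly one worker, no worker needs to see the whole stream, and the merge step only combines small summaries rather than raw data.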

Some of their algorithms and programs will be passed to DARPA to be used in a security context, but the team is also tackling long-standing theoretical problems in computer science. One such problem is the "travelling salesman" scenario studied by Amin Saberi and his students: if a salesman has a list of cities to visit, and he must visit each one exactly once before returning to where he started, how can we calculate the shortest possible route?

This problem may seem unrelated to the world of social media, but it deals with a network of access points – like mobile phones or computers on the Internet – combined with an algorithm for calculating the shortest path among them. Goel said it is important to keep making progress on these kinds of classical problems. Even when they don't have an immediate, real-world application, he said, they advance our understanding of computer science as a discipline.
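The travelling salesman scenario described above can be stated in a few lines of code. The sketch below is a brute-force solver that simply tries every possible ordering of the cities; it is not the approach taken by Saberi's group, and the city names and coordinates are made up for the example.

```python
from itertools import permutations
from math import dist


def shortest_tour(cities):
    """Return the shortest round trip that visits every city exactly once.

    `cities` maps a name to (x, y) coordinates. Every ordering is tried,
    so this is only practical for a handful of cities.
    """
    names = list(cities)
    start, rest = names[0], names[1:]
    best_route, best_length = None, float("inf")
    for order in permutations(rest):
        route = (start, *order, start)
        length = sum(dist(cities[a], cities[b]) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length


# Four hypothetical cities at the corners of a 4-by-3 rectangle.
cities = {"A": (0, 0), "B": (0, 3), "C": (4, 3), "D": (4, 0)}
route, length = shortest_tour(cities)
print(route, length)
```

The number of orderings grows factorially with the number of cities, which is why exhaustive search breaks down quickly and why better algorithms for the problem remain an active research topic.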

The team also plans to explore the connection between human behavior – the things we enjoy and choose to share in our social networks, or what we're looking for when we search online – and algorithms that help shape our online experience, like friend recommendations or search engine results.

MEGA's algorithms might, for example, lead to a search engine that takes into account not only keywords a user is typing in, but also that user's social connections and what's trending online at that moment. This system would essentially construct a brand new, highly personal search engine for each and every search, he said.
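A toy version of such a ranking might look like the following sketch, which blends keyword matches with a boost for posts by the user's connections and a boost for trending terms. The scoring formula, the weights and the document fields are all assumptions made for illustration; the article does not describe how MEGA's algorithms would combine these signals.

```python
def score(doc, query_terms, friends, trending, w_social=0.5, w_trend=0.25):
    """Score one document for one user. Weights are arbitrary illustrative values."""
    words = set(doc["text"].lower().split())
    keyword_hits = sum(1 for term in query_terms if term.lower() in words)
    if keyword_hits == 0:
        return 0.0  # documents that match no query term are never returned
    social_boost = 1.0 if doc["author"] in friends else 0.0
    trend_hits = sum(1 for word in words if word in trending)
    return keyword_hits + w_social * social_boost + w_trend * trend_hits


def personalized_search(docs, query, friends, trending):
    """Return document ids ranked for this user at this moment."""
    terms = query.split()
    scored = [(score(doc, terms, friends, trending), doc) for doc in docs]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc["id"] for _, doc in matches]


docs = [
    {"id": 1, "author": "stranger", "text": "big data grant"},
    {"id": 2, "author": "alice", "text": "big data at stanford"},
    {"id": 3, "author": "alice", "text": "lunch plans"},
]
print(personalized_search(docs, "big data", friends={"alice"}, trending={"stanford"}))
```

Two users typing the same query would see different orderings if their `friends` sets differ, and the same user would see a different ordering as the `trending` set changes.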

Helping things along, the MEGA team enjoys close ties to networking companies including Facebook, Twitter and Cisco. This means that their work may someday be used to drive new features on popular social media sites. "It happens only occasionally that you can design an abstract system that actually affects society and the economy on such a large scale," Goel said.

The project likewise unites a diverse group of experts. Goel's expertise lies in algorithm design, and he is responsible for several of Twitter's algorithmic products. Two other Management Science and Engineering professors, Amin Saberi and Ramesh Johari, will also contribute their algorithmic and modeling knowledge. Andrea Montanari, an associate professor of electrical engineering and statistics, will be the team's statistician and information theorist, while Associate Professor of Computer Science Jure Leskovec brings expertise in data mining and modeling. Economics professor Matthew Jackson has been collecting data from villages in India, which he hopes to compare to online networks like Facebook and Twitter. Also involved in the research is John Heidemann of USC's Information Sciences Institute.

"We were all having a lot of success in our individual research," Goel said, "but the DARPA grant allows us to work together to understand how social networks operate."
