March 28, 2012

Market Prediction Gives Sentiment the Slip

Nicole Hemsoth

While the idea of using Twitter and other social media outlets as the wellspring of semi-accurate predictions about the fate of financial markets is nothing new, researchers are now looking beyond mere sentiment to find more robust cues about the future of market activity.

As the Derwent example below highlights, sentiment can be a powerful indicator when all the world’s a stage, but researchers are finding that without context, sentiment itself is limited.

The “limitations of sentiment” shift is just part of what grabbed a few startup-watchers’ eyes earlier in the year when conversations hummed around companies like Knowsis, which was among the first to claim the ability to predict financial markets for a fee—all based on the tenor and content of social conversations.

A team based out of the University of California, Riverside claims to have developed a model that uses Twitter data to predict the traded volume and value of a stock the following day. This is not based on some collective sense of the Twittersphere—but on the nature of the conversations that serve as the source of the sentiment.

The researcher behind it, Vagelis Hristidis, an associate professor at UC Riverside’s Bourns College of Engineering, describes it as more of a “trading strategy.” The emphasis of his research is on large-scale data mining and pattern discovery, which is, again, a step above looking for just a sentiment needle in the haystack.

Working with a team composed of graduate students and others from Yahoo’s Spanish research division, he says the strategy was able to outperform “other baseline trading strategies by between 1.4 percent and nearly 11 percent” over the course of a four-month simulation.

Moving beyond sentiment, this research is distinguished by its emphasis on Twitter activity as it relates to overall stock prices and traded volume. As the researchers claim, until now “little research has focused on the volume of tweets and the ways that tweets are linked to other tweets, topics or users.”

The team also says that until this effort, other researchers have focused on overall stock market indexes rather than individual stocks.

At a high level, the team used the daily closing price and total number of trades from Yahoo! Finance across 150 randomly selected companies in the S&P 500 index and designed a set of filters to find relevant tweets about those companies. That is no small task, given that the algorithm needs to discern whether users are talking about the apple they had at lunch or the company that produced the iPad they’re tweeting from.
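The article does not describe the team's actual filters, so the following is only a minimal sketch of one common approach to the apple-versus-Apple problem: keeping a tweet only when it carries an explicit "$TICKER" tag. The function name `mentions_ticker`, the tag convention, and the sample tweets are all assumptions made for illustration, not details from the research.

```python
import re


def mentions_ticker(text, ticker):
    """Return True if the tweet contains an explicit $TICKER tag.

    Hypothetical relevance filter: this is NOT the researchers' method,
    only an illustration of filtering out ambiguous company names.
    """
    # Require "$" directly before the symbol, no word character before
    # the "$", and a word boundary after the symbol.
    pattern = re.compile(r"(?<!\w)\$" + re.escape(ticker) + r"\b", re.IGNORECASE)
    return pattern.search(text) is not None


# Made-up sample tweets for illustration only.
sample_tweets = [
    "$AAPL looks strong after the new iPad launch",
    "I ate an apple at lunch",
    "Thinking about $AAPLX funds today",
]
relevant = [t for t in sample_tweets if mentions_ticker(t, "AAPL")]
```

A filter this strict trades recall for precision: it would miss tweets that name the company without a tag, which is presumably why a real system would need a richer set of filters.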

At first, the team expected to find the number of trades to be on par with the number of tweets, but this was not the case. As the team notes, the number of trades is “slightly more correlated” with the number of “connected components,” that is, the number of posts about distinct topics related to one company. Using the Apple example, these might be separate tweets about particular products or other tangential but still Apple-specific news.
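To make the idea of connected components concrete, here is a minimal sketch, assuming that two tweets belong to the same topic when they share at least one tag. That linking rule, the helper names `count_components` and `pearson`, and the sample data are assumptions for illustration; the article does not say how the team linked tweets or measured correlation.

```python
from math import sqrt


def count_components(tweets):
    """Count groups of tweets connected through shared tags.

    `tweets` is a list of tag sets, one set per tweet. Uses union-find;
    the shared-tag linking rule is an assumption, not the paper's.
    """
    parent = list(range(len(tweets)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # tag -> index of the first tweet that used it
    for idx, tags in enumerate(tweets):
        for tag in tags:
            if tag in first_seen:
                parent[find(idx)] = find(first_seen[tag])
            else:
                first_seen[tag] = idx
    return len({find(i) for i in range(len(tweets))})


def pearson(xs, ys):
    """Pearson correlation of two equal-length series with nonzero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up data: three days of tagged tweets about one company.
days = [
    [{"ipad"}, {"ipad", "retina"}, {"earnings"}],
    [{"iphone"}],
    [{"ipad"}, {"earnings"}, {"lawsuit"}],
]
components_per_day = [count_components(day) for day in days]
```

With daily component counts in hand, `pearson(components_per_day, trades_per_day)` could be compared against the same correlation computed on raw tweet counts, where `trades_per_day` would be a hypothetical series of daily trade counts for the same company.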

While they might have some solid data to back up their recent success, the team understands that there are numerous potential holes in their approach. “First, the trading strategy worked in a period when the Dow Jones dropped, but it may not produce the same results when the Dow Jones is rising.

“There is also sensitivity related to the duration of the trading. For example, it took 30 days in the simulation to start outperforming the Dow Jones.”

While it’s a bit dense, the full paper is well worth the read, and it makes it easy to see why this was the topic du jour at this year’s Fifth ACM International Conference on Web Search & Data Mining in Seattle.
