September 21, 2012

Reworking Analytics at the World’s Largest Marketplace

Ian Armas Foster

“Consumers have a mall in their pocket,” said Bob Page, Platform Analytics VP of eBay, pointing to the predominance of smartphones, which have shifted the marketplace toward a new stage in online commerce.

However, the standards that worked in the “golden age” of a company like eBay, which had its heyday in the early Web 2.0 era, need to be reworked to address the new ways consumers make purchases and think about ecommerce in general.

As Page noted, in order to keep up in the online marketplace, companies need analytic tools that track not only the transactions themselves but also the behavioral patterns that lead to those transactions.

As an online marketplace, eBay is in a unique position. For the company, it matters little what gets bought or sold, simply that transactions continue to happen. As a result, it is important for eBay to analyze behavioral factors to better point customers to products they might want to buy. Page discussed how eBay’s analytics help it do that, and how the same principles apply to companies that do actually sell stuff, in this talk at the 2012 Analytics Conference in Huntington Beach, California.

While computers certainly played their part in advancing ecommerce, no technology, according to Page, has done more to digitalize consumer business than mobile devices such as smartphones and tablets. In 2009, eBay and PayPal hosted 700 million dollars’ worth of mobile transactions. That number jumped to 15 billion dollars over the first half of 2012 alone, roughly a 21-fold increase, or more than an order of magnitude.

This rapid increase in mobile usage can be a little overwhelming for businesses. “If you’re a merchant,” Page said, “you’re really not sure how to keep up with this. The old way was, we’re going to put up a shingle, we’re going to have a bunch of stuff on display…all the things they used to worry about, they still worry about. And now, they have all this other stuff to worry about. ‘Where do I market now? I don’t even know. Should I go on LivingSocial, should I have an Android App, how do I generate demand?’”

Eventually, Page believes, machine learning will play a big role in helping humans understand the swaths of data surrounding them. “There is just too much data for humans to understand the ramifications of all the data, but that’s been true for a long time. Machine learning is just going to be how we think about how to make things happen.” As of right now, machines do a decent job of learning over millions of trials.

A few years ago, Page noted, it was generally thought that ecommerce was for the most part kicking offline business to the curb. “Back when I started in analytics, it was called website analysis. And we thought, ‘online is everything, offline is going away. Bricks and mortar, oh wow, that’s the old stuff.’ Instead, there’s this confluence. So much of the offline stuff that’s happening starts online.”

For example, the monocle function on many food review apps helps a user determine which restaurants are in the area and how well they are rated. Then the user makes a decision, based on online information, to eat at a real restaurant.

This confluence of online and offline is not limited to the food industry. Page noted how commonplace the multitasking human has become, one who surfs on a tablet while watching a favorite television program. As a result, companies can get instant feedback on their marketing. eBay even introduced an app that displays items seen in the television show. “What we’re seeing in the future is, it’s not ecommerce, it’s just commerce.”

Collecting data on which restaurants users pay special attention to before actually making a decision could benefit the restaurant review website, helping it better direct customers’ attention. It would also obviously have great value to the restaurants themselves. “For the longest time, all we had was the tape that told you what you purchased. Now we can keep behavioral data, such as clicks, paths, habits, etc.”

With all that being said, there still exists far too much data for anyone to properly get their head around. However, Page noted that the cost of analytics platforms, such as Hadoop, has dropped dramatically over the past couple of years. In 2009, according to Page, eBay could not even afford to hold on to behavioral data, since the cost of storing and processing it was so high.

In 2010, they kept 6.5 petabytes of relational data (mostly transactional, not behavioral) along with 6 petabytes of semistructured data. In the last two years alone, they have expanded that to 42 petabytes of semistructured data, 48 petabytes of grid data, and 10 petabytes of relational data.
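
For a sense of scale, the storage figures above can be totaled in a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are illustrative, and it assumes the categories Page listed for each year are exhaustive and comparable, which the talk did not confirm (no grid figure was given for 2010).

```python
# Storage figures in petabytes, as quoted in the article.
# Assumption: the categories listed for each year cover all stored data.
storage_2010 = {"relational": 6.5, "semistructured": 6.0}
storage_2012 = {"semistructured": 42.0, "grid": 48.0, "relational": 10.0}

total_2010 = sum(storage_2010.values())
total_2012 = sum(storage_2012.values())

# Ratio of the 2012 total to the 2010 total.
growth = total_2012 / total_2010

print(f"2010 total: {total_2010} PB")
print(f"2012 total: {total_2012} PB")
print(f"Growth over two years: {growth:.1f}x")
```

Under those assumptions, the total goes from 12.5 petabytes to 100 petabytes, about an eightfold increase in two years.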

eBay’s analytics strategy uses three technologies: Hadoop, Singularity (both of which Page noted are mainly used by researchers), and EDW, a standard relational analytics platform. Hadoop and Singularity are the more interesting, as they delve into contextual analysis. eBay uses Singularity to compare current data with what it has collected over the last two years, while using Hadoop to recognize patterns and counterfeit claims.

As a result, eBay processes and analyzes an impressive 80 petabytes of data a day, including 50 terabytes of new data.

While many businesses may not have the resources of an eBay, the company’s remarkable growth in big data analytics suggests that those analytics are becoming more accessible to mid-sized businesses.
