December 18, 2017

Inside Overstock’s One-to-One Marketing Machine


Like many online retailers, Overstock closely tracks the behavior of visitors on its site. But transforming all those individual page views and clicks into actual revenue is easier said than done. The company recently discussed with Datanami how it overcame challenges in building its own one-to-one marketing analytics system, and what results it’s delivered this year.

Overstock emerged from the wreckage of the first dot-com boom with a winning business plan: sell the excess merchandise of failed retail outfits at below-wholesale levels, build a loyal following, rinse, and repeat. Now nearly two decades in, Overstock has expanded to sell new products, and it all adds up to nearly $2 billion in revenue annually.

Technology is obviously critical to running such a big ecommerce operation, especially this time of year, when big swings in website traffic can stress the scalability of systems. But technology is also important for staying ahead of the curve in big data analytics, and Overstock has an interesting story to tell in this regard.

According to Craig Kelly, a group product manager with the Midvale, Utah, company, Overstock faced a choice when it came to marketing analytics, and in particular to getting down to the granular one-to-one level instead of relying on aggregate data: It could either sign up with one of the big cloud marketing firms and trust that the vendor would keep up with emerging technology, or it could seek a technological advantage by composing its own system.

For various reasons, it selected the latter. “This is the approach that’s allowed us to stay relevant for 18 years, and that will allow us to stay ahead of the market,” Kelly tells Datanami.

Targeted Marketing

The benefits of one-to-one marketing are potentially huge for a consumer-facing company. Instead of grouping customers into behavioral and demographic buckets and then running promotional offers based on those aggregated features, companies can cater their promotions and offers directly to individuals.

This approach allows a much greater degree of precision to be applied by the marketer. However, there are big technical challenges to be solved with building a one-to-one marketing system, putting it into production, and keeping it running. Those technical challenges are a major reason why the marketing cloud providers are able to hold onto their customers for so long.

But for Overstock, the benefits of building its own system outweighed the costs, according to Kelly.

“Getting away from the segment-based view of the world to a user-based view is something that everybody has dreamed about and talked about for however many years – decades at this point,” Kelly says. “The challenge at this point is not that we don’t have the data to do that. The challenge is that there are very few systems that give us the ability to use that data effectively in order to deliver that user-level customizing and experience.”

Overstock would soon learn that getting all the pieces of the one-to-one marketing puzzle to fit would be a major challenge. Changes would be required not only to its SQL-based data exploration and data warehousing solutions, but also to its machine learning modeling and data science workflows.

Above all, the data engineering effort would be considerable. With all these challenges, Overstock opted to bring in trusted third-party partners to build its one-to-one marketing solution – including streaming data specialists, hosted cloud analytics providers, and machine learning experts.

Overcoming One-to-One Challenges

One of Overstock’s first data partners was Beeswax, a New York company that built a demand-side platform (DSP) that automates much of the work required for companies to participate in online advertising auctions. As part of that initiative, Overstock wanted to resolve user-level scores for re-targeting, a task that relied on machine learning models.

“The big issue there was you have these massive data sets that we’re used to looking at an aggregate level in our historic data warehouse solution,” Kelly says. “But the speed of the pipeline implementation was way too long. It took us about three months to deploy our initial pipeline to start bidding on Beeswax.”

To get help with data collection, Overstock selected mParticle, a service that helps companies gather and prep all of their consumer-level data, including behavioral data from Web and mobile apps, and pump it into a downstream analytical system.

At this point, Overstock was starting to see a light at the end of the tunnel. But there were still significant hurdles to overcome, according to Kelly, including finding the appropriate ways for its data scientists to explore and model the data in a cloud-based environment.

“We have this real-time stream of events coming from mParticle and we have all of our bid-exhaust coming from Beeswax,” Kelly says. “But it was a really big challenge to get that into our internal systems, because they’re not built for the modular world that we’re moving into in the cloud.”

Pileup on the S3

Overstock’s data scientists were falling behind as the user data and bid exhaust started to pile up in Overstock’s S3 repository. That’s when Kelly recalled his experience in working with Snowflake Computing’s cloud-based data warehousing service in a previous job.

Snowflake is a parallel SQL analytics program that runs on AWS and other public cloud platforms. The service handles various aspects of scaling up and scaling down the underlying computing resources, leaving the users free to explore their data without worrying too much about the technical nuts and bolts that make that possible.

Overstock’s marketing group started using Snowflake to power simple transformations using AWS’ Lambda triggering service, and it grew from there. Eventually, its data scientists were using Snowflake to identify features in the mParticle data that they wanted to model using Apache Spark’s machine learning capabilities, which were hosted in Databricks’ cloud.

The combination of Snowflake’s hosted data warehouse and Databricks’ hosted machine learning functions eliminated a lot of heavy lifting on the part of Overstock’s data scientists, Kelly says.

“We have this pipeline now where instead of the three months that it initially took us to deploy any machine learning model, we can deploy new models within a day,” he says.

This completed a virtuous data cycle for Overstock: data comes in through mParticle and gets forwarded to Snowflake, where individual users’ data is “featurized” and rolled up. Snowflake’s Spark connector then shunts the data over to Databricks to refine the models, which in turn inform the Beeswax ad buying.
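
To make the “featurized and rolled up” step more concrete, here is a minimal Python sketch of what rolling raw events up into one feature row per user might look like. It is an illustration only, not Overstock’s implementation: the event fields (`user_id`, `type`, `ts`), the feature names, and the `roll_up` helper are all hypothetical, and in the pipeline described above this work would happen in Snowflake SQL rather than in application code.

```python
from collections import defaultdict

# Hypothetical event records. The field names are illustrative assumptions,
# not mParticle's actual schema.
events = [
    {"user_id": "u1", "type": "page_view", "ts": 100},
    {"user_id": "u1", "type": "add_to_cart", "ts": 160},
    {"user_id": "u2", "type": "page_view", "ts": 120},
    {"user_id": "u1", "type": "page_view", "ts": 200},
]


def roll_up(events):
    """Aggregate raw behavioral events into one feature row per user."""
    features = defaultdict(
        lambda: {"page_views": 0, "add_to_carts": 0, "last_seen": 0}
    )
    for event in events:
        row = features[event["user_id"]]
        # Count the event types we care about; other types are ignored.
        if event["type"] == "page_view":
            row["page_views"] += 1
        elif event["type"] == "add_to_cart":
            row["add_to_carts"] += 1
        # Track the most recent activity timestamp for recency features.
        row["last_seen"] = max(row["last_seen"], event["ts"])
    return dict(features)


user_features = roll_up(events)
```

Each entry in `user_features` is the kind of per-user row that a downstream model could score, which is the contrast with segment-level aggregates that Kelly draws earlier in the article.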

The great thing about this particular system is how little work it demands of the data scientists, Kelly says. “We don’t have to build and maintain all of these different pipelines,” he says. “For me, that’s the unspoken — and in a lot of cases, unmeasured — experience in any advanced data driven company: all of the time spent on data engineering and data pipeline building. We don’t even have to think about it anymore.”

Automating some of the data engineering work allows Overstock’s data scientists to get back to iterating on model development rather than building data pipelines, says Overstock Vice President of Product and Analytics Joe Kambeitz.

“A common meme in the data science world is that data scientists spend 80 percent of their time prepping data and 20 percent of their time building models. We wanted to flip that ratio,” Kambeitz says.

At the end of the day, the system gives Overstock the advantages of one-to-one marketing without the vendor lock-in that comes with an established cloud marketing vendor. “We don’t want to be locked in when various vendors fall behind, which vendors are always going to do,” Kelly says.
