November 20, 2014

How Big Data Analytics Is Shining a Light on Anonymous Web Traffic

They arrive suddenly at your website with no identification or cookies, browse products for minutes or hours on end, and then leave abruptly without a word. They’re anonymous Web visitors, and they’re the bane of the data-driven marketer.

Attempts to catalog these shadowy creatures using traditional techniques often fail. The Web’s Wild West ways have led many people to crank up their privacy and security settings in an attempt to create a bubble of anonymity and protection from peering companies, governments, and cybercriminals. According to a 2013 Pew Research study, 86 percent of Internet users have taken steps to remove or mask their digital footprints.

But thanks to the power of advanced analytics, individual companies are able to burn through that bubble and shine the bright lights of segmentation, categorization, and personalization on those anonymous visitors. If you claim you’re a dog on the Internet, Google will probably know better, but without an advanced analytics system, the average company would struggle to get a clear picture.

Peter Steiner's 1993 cartoon in The New Yorker is less true today.

The key to greater understanding of these data-poor personas lies in the clickstream data that follows everybody on the Internet. Whether you like it or not, there’s a certain amount of data that you drag around to each subsequent website you visit, including what kind of Web browser, device, and plug-ins you’re using; your IP address and geographical location; your time zone and language preference; and what website you came from. The owner of any particular website knows even more about you, including which pages you view, what you search for, and how long you stay on each page.
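To make that list concrete, here is a minimal Python sketch of what a single clickstream record might look like. The `ClickstreamEvent` class, its field names, and the `traffic_source` helper are illustrative assumptions rather than any vendor's actual schema; the fields simply mirror the items named above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ClickstreamEvent:
    """One page-level event for an anonymous visitor (hypothetical schema)."""
    visitor_id: str            # assumed anonymous identifier, not a login
    timestamp: datetime
    page_url: str              # which page was viewed
    seconds_on_page: float     # how long the visitor stayed
    referrer: Optional[str]    # the website the visitor came from, if any
    user_agent: str            # browser, device, and plug-in hints
    ip_address: str            # also the basis for a rough location
    time_zone: str
    language: str
    search_query: Optional[str] = None  # on-site search term, if any


def traffic_source(event: ClickstreamEvent) -> str:
    """Roughly classify how the visitor arrived, using only the referrer."""
    if not event.referrer:
        return "direct"        # e.g. a typed-in URL or a bookmark
    if "google." in event.referrer or "bing." in event.referrer:
        return "search"
    return "referral"
```

Even a record this small says something about a visitor who never logs in: where they came from, what they looked at, and for how long.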

This data is a rich source of information that allows website operators to build models of website visitors who don’t otherwise say much about themselves. And these models, in turn, can be very useful for serving the personalized recommendations that are the ultimate goal of the modern marketing professional.

The folks at Syntasa are experts at building behavioral models that can categorize and describe anonymous visitors from the clickstream data. The Herndon, Virginia-based company got its start providing analytics for governmental agencies, and is now using its data science expertise to help companies boost conversions on their websites.

It takes about two weeks’ worth of clickstream data to build a suitable model for scoring behavior for a particular website, and several actions on the part of the visitor to actually do the scoring, says Grant Wagner, Syntasa’s vice president of sales and marketing. Once the models are in place, the company uses them to analyze the live clickstream feed and identify what a visitor is interested in, and how interested he or she is.
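Syntasa has not published how its models work, so the following Python sketch is only a stand-in for the general idea: wait until a visitor has taken several actions, then turn those actions into a per-category interest score. The action names, the hand-set `ACTION_WEIGHTS`, and the `MIN_ACTIONS` threshold are all assumptions; a real model would learn its parameters from the historical clickstream.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each action signals interest.
ACTION_WEIGHTS = {"view": 1.0, "search": 2.0, "add_to_cart": 5.0}

# Assumed minimum number of actions before a visitor is scored at all.
MIN_ACTIONS = 3


def score_interest(events):
    """Score a visitor's interest per product category.

    events: list of (action, category) tuples for one visitor.
    Returns {category: share of weighted interest}, or {} when the
    visitor has not yet done enough to be scored.
    """
    if len(events) < MIN_ACTIONS:
        return {}
    totals = defaultdict(float)
    for action, category in events:
        totals[category] += ACTION_WEIGHTS.get(action, 0.0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: weight / grand_total for category, weight in totals.items()}
```

The shares sum to 1, so the highest-scoring category indicates what the visitor is interested in, and the size of its share gives a rough sense of how interested.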

“We’re flying through massive amount of data with these machine learning models,” Wagner says. “The data outputs from those models … generates these behavioral segments, where you can send a personalized offer within a couple of minutes.”

That offer could take the form of a coupon for 25 percent off or free shipping on a particular product in which the anonymous visitor showed strong interest. Syntasa’s software tracks anonymous visitor data at the visitor, the visit, and the event levels, which enables it to grow and evolve the models over time.
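One common way to get from raw events to the visit and visitor levels is to split each visitor's events into visits wherever there is a long pause. The Python sketch below assumes a 30-minute inactivity cutoff, a widespread web analytics convention that the article does not attribute to Syntasa; the `group_into_visits` function and the tuple layout are likewise hypothetical.

```python
from datetime import datetime, timedelta

# Assumed inactivity cutoff that ends a visit.
SESSION_GAP = timedelta(minutes=30)


def group_into_visits(events):
    """Roll raw events up into visits per visitor.

    events: iterable of (visitor_id, timestamp, event_name) tuples.
    Returns {visitor_id: [visit, ...]}, where each visit is a list of
    (timestamp, event_name) pairs in time order.
    """
    visits = {}
    for visitor_id, ts, name in sorted(events, key=lambda e: (e[0], e[1])):
        visitor_visits = visits.setdefault(visitor_id, [])
        last_ts = visitor_visits[-1][-1][0] if visitor_visits else None
        if last_ts is not None and ts - last_ts <= SESSION_GAP:
            visitor_visits[-1].append((ts, name))    # same visit continues
        else:
            visitor_visits.append([(ts, name)])      # a new visit starts
    return visits
```

With the three levels separated this way, a score can be computed within a single visit or accumulated across all of a visitor's visits.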


The Syntasa Marketing Analytics Stack

Syntasa stores its customers’ data in a hybrid cloud/on-premises Hadoop cluster based on software from Cloudera. The company’s data scientists, including company advisor and George Mason University professor Dr. Kirk Borne, have developed a series of algorithms using R and other tools that allow customers to build and continually refine the behavioral models of anonymous visitors. At runtime, the company uses the real-time capabilities of Apache Spark to score the data against the model and serve up the personalized recommendations to the (increasingly less) anonymous visitor, Wagner says.

It’s all about taking a more holistic view of anonymous customers, including how they got to the website, which could be via paid ads, organic search, or entering a URL into the Web address bar. “Because it’s a silhouette of a person, you don’t know anything about them. But you certainly can drive a lot of insight from their behavior,” he says. “They’re not a complete blank sheet of paper. But there’s nowhere near as much information as you would have with a customer or an authenticated user that has an account.”

Anonymous visitors account for more than 95 percent of the traffic at some big-box retailers’ websites, Wagner says, while some popular properties, like, may have fewer than 5 percent anonymous visitors at any one time. About half the web traffic is anonymous at one communications firm that Syntasa is working with, he says. “It’s a pretty niche market, but it’s way bigger than what most people may think,” he says.

Companies are sitting on a goldmine of clickstream data, but often don’t realize it. “There’s massive amounts of data in there [representing] thousands of different events of what happens on a website,” Wagner says. “It tracks what’s going on in the website, what are people there for, what are they looking for, how many products are represented, and what types of processes are represented. All that stuff is available in that data stream.”


