Why Data Science is a Team Sport at AOL
As the world’s fifth-largest digital advertising network, AOL chews through a lot of data. That’s not surprising. What may surprise you, however, is the unconventional way that the company goes about assembling its data science team to ensure the highest return for its advertising clients.
America Online holds a special place in Internet lore. As the World Wide Web took off in the late 1990s, AOL provided an on-ramp that introduced the digital world to millions of people willing to pay $19.95 per month for Internet access. AOL grew and grew until it gobbled up Time Warner for $165 billion at the peak of the dot-com boom in early 2000. Just after the company completed what remains to this day the largest acquisition in history, it all came crashing down.
Today’s AOL Inc. is a much different animal. The company, which was acquired by Verizon Communications last year for $4.4 billion, is primarily an advertising platform. It serves about 500 million people per year, and is expected to gross more than $1 billion in advertising revenue for 2016 (according to eMarketer).
And like most ad-tech firms, AOL is awash in data and servers. AOL processes tens of billions of transactions per day on behalf of thousands of clients, who rely on it to serve the right advertisement to the right person at the right time. AOL’s data universe measures in the tens of petabytes, while its infrastructure spans many thousands of nodes.
Getting the personnel and products in place to make this business flow is neither easy nor trivial.
Man Vs. Machine
As the CTO of AOL Platforms, Seth Demsey’s job is to oversee the research, product development, and engineering. The former Google and Microsoft executive tells Datanami that getting the best out of man and machine requires a balanced approach.
“The general philosophy is tooling complements our needs, as far as computing and research go,” Demsey says. “For us, the technology is a means rather than an end, and we effectively do whatever we need to do to solve the problems at the scale and latency that we want to.”
As you might imagine, AOL uses all the latest big data tech, such as Hadoop, Spark, and Kafka, to power its ad tech business. It also runs a lot of its own proprietary code, such as its Streaming Application Framework (SAF), which it developed before Storm and Spark were stable enough to base a business on.
“We love standing on the shoulders of giants and using open source or leveraging the vitality of communities for certain projects that are going on,” Demsey continues. “But if there’s stuff that’s going on that doesn’t exist, we build it from scratch.”
Finding the right folks to code and operate the machines is a different matter entirely. Rob Luenberger, the chief scientist and senior vice president in AOL Platforms’ R&D department, says the company sources talent from unusual places.
“Some of our most recent hires have come out of biology and in particular genetics, where you deal with a lot of data and there’s data quality and modeling issues,” Luenberger says. “Some of the same challenges that come up, such as worrying about having too many models and having something be good by luck that you need to worry about.”
The AOL Platforms team has recently hired people with backgrounds in neuroscience, as well as finance. These are fields where practitioners are used to dealing with big data sets and taking a disciplined approach to solving problems with a wide variety of skills, like statistics, control theory, and linear programming, Luenberger says.
“It’s non-traditional in that they probably would not have identified as being a machine learning person,” he says. “But they learned and used those tools in schools to accomplish some other mission. That’s very much how the people here see themselves. They know those tools and now the mission is helping publishers or helping advertisers.”
It’s not that the company doesn’t look for people with the data science pedigree, who may even have an MS or PhD in data science itself. The universities are ramping up production of professionals with those skill sets. But finding those types of people can be tough at the moment, so the team approach works better for AOL Platforms.
“You can find that unicorn, but you shouldn’t need that unicorn to be successful,” Demsey says. “It’s all about leveraging diverse backgrounds and skill sets.”
Flexibility Is Key
The big data universe is moving so fast that success almost requires a certain degree of flexibility in the hiring process.
“I remember in the old days, which was not too long ago, you’d find folks and their only capability in data science was to program in Matlab,” Demsey says. “You know what? There’s nothing wrong with that. You can build earth-shaking value with that. But then it’s the translation of the research tooling into production systems that you need to think about.”
That Matlab code might get ported to C to get it productized, or the model may be rebuilt entirely in another language. Having a diverse team of data experts with a variety of skills makes it easier to be successful, Demsey says.
“It’s all about the application,” he says. “We think it’s a real advantage, rather than only looking for unicorns, only looking for machine learning people who code in XYZ framework. That flexibility is important.”
Once the right people are in place, AOL Platforms takes pains to ensure that they get involved, that their work has an impact on day-to-day operations, and that it doesn’t stay sequestered in the ivory tower.
“We don’t think of data science or the optimization as an afterthought or something that you put on top. That thinking goes all the way through,” Luenberger says. “We’re not set up as a lab where we’re just writing papers. There are people who are reading papers and using cutting-edge tools. But it’s also very much embedded in the business, trying to get things deployed in timelines that are in sync with the product and engineering teams.”
AOL Platforms’ data science team has grown by 50% over the past 18 months. It’s continually looking for new people to bring new insight and skill sets. For example, it’s experimenting with new deep learning approaches that show promise in identifying the right ads to show people.
To be sure, keeping a tight-knit group of coders together is harder as the group grows. There’s a general tendency among programmers to finish code and then throw it over the wall for the production team to implement. Getting everybody involved and keeping them involved isn’t easy, but it’s part of AOL Platforms’ game plan for success.
“It’s not like one group of people build a model or hand some prototype or paper to another group and say go build a production version of it. Those discussions start right at the beginning,” Luenberger says. “It’s almost like putting a car together. You know how to build each part, but making sure they all work together is something that just takes a lot of focus.”