June 23, 2014

Game-Changer: The Big Data Behind Social Gaming

The rise of social gaming is changing the fabric of the video game market. Instead of buying game consoles, people are increasingly logging into social gaming apps through iPhones and Facebook. In this booming market, how game developers utilize data makes all the difference in the world.

Zynga is credited with kicking off the social gaming craze in 2009, when it introduced FarmVille for Facebook. The hook was set, and before long, people started spending real money to gain credits in the virtual world of social gaming.

At any given hour, tens of millions of people are logged into their favorite social game, be it Candy Crush Saga from King, Clash of Clans from Supercell, or Bingo Blitz from Buffalo Studios. While social and mobile gaming is still a fraction of the overall $80 billion video game market, it’s catching up fast.

Huge growth in this space means even bigger data, and that’s leading the industry’s top gaming titles to prepare for a data deluge. The capability to ingest and analyze billions or tens of billions of data points per day is par for the social gaming course, says Barry Sohl, the chief technology officer at Buffalo Studios, which is owned by Caesars Interactive Entertainment.

“It’s absolutely critical,” Sohl says. “Our entire industry is very data driven. The vast majority of our revenue comes from a small percentage of our users, so figuring out how to increase that conversion rate by even a small amount…has a fundamental impact on revenue, and our ability to do that is almost entirely driven by the data.”

In social gaming, virtually every move a player makes is logged and tracked by the game provider. This creates a feedback loop between gamer and developer that simply doesn’t exist in the traditional video game console market. The opportunity to sell credits to customers also changes the financial dynamics of the market.

Game developers use a variety of tools to analyze their big data sets, with Hadoop being one of the most popular. Zynga’s big data setup is said to be one of the largest Hadoop clusters in the space, storing 1.4 PB of data. The company behind FarmVille 2.0 also uses Splunk, MySQL, and HP Vertica to collect and analyze about 60 billion rows of data per day.

Buffalo Studios also uses a high-performance Vertica data warehouse to run SQL reports on player data. In Buffalo Studios’ case, the company uses Apache Flume to ingest about a billion rows of data per day, or about 100 GB, into its data warehouse, which is hooked up to a Tableau visualization and reporting system.
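To put those ingest figures in perspective, here is a quick back-of-the-envelope calculation. It is only a rough sketch: it assumes decimal gigabytes and a perfectly even load across the day, neither of which the article states.

```python
# Back-of-the-envelope sizing from the figures quoted above.
ROWS_PER_DAY = 1_000_000_000       # "about a billion rows of data per day"
BYTES_PER_DAY = 100 * 10**9        # "about 100 GB", assuming decimal gigabytes
SECONDS_PER_DAY = 24 * 60 * 60     # assumes load is spread evenly over the day

bytes_per_row = BYTES_PER_DAY / ROWS_PER_DAY
rows_per_second = ROWS_PER_DAY / SECONDS_PER_DAY
megabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6

print(f"{bytes_per_row:.0f} bytes per row")
print(f"{rows_per_second:,.0f} rows per second on average")
print(f"{megabytes_per_second:.2f} MB per second on average")
```

Under those assumptions, each row averages about 100 bytes and the pipeline handles roughly 11,600 rows per second. Real traffic is likely to be burstier than this average suggests.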

This BI system is used to track a wide range of in-game metrics, including when to give away credits to customers during a game, whether new features added to the game are successful, and fraud detection. “There are a ton of different criteria to watch,” Sohl says. “Customers are buying credits and coins and spending them to play the game, but they also win credits and coins back, so you have to carefully manage how much currency you’re putting into the economy versus how much you’re taking out. It’s a very careful balance between keeping the fun factor of the game up, but also driving revenue.”
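The balancing act Sohl describes can be illustrated with a small sketch that tallies currency entering and leaving the in-game economy. The event names and amounts below are hypothetical, chosen for illustration only; the article does not describe Buffalo Studios' actual event types or reports.

```python
from collections import defaultdict

# Hypothetical event types; the names are illustrative, not from the article.
SOURCES = {"purchase", "win", "giveaway"}   # currency entering the economy
SINKS = {"wager", "powerup_spend"}          # currency leaving the economy


def economy_balance(events):
    """Return (credits_in, credits_out, net) for an iterable of (type, amount)."""
    totals = defaultdict(int)
    for event_type, amount in events:
        if event_type in SOURCES:
            totals["in"] += amount
        elif event_type in SINKS:
            totals["out"] += amount
        else:
            raise ValueError(f"unknown event type: {event_type}")
    return totals["in"], totals["out"], totals["in"] - totals["out"]


sample = [
    ("purchase", 500),
    ("wager", 300),
    ("win", 120),
    ("giveaway", 50),
    ("wager", 200),
]
credits_in, credits_out, net = economy_balance(sample)
print(credits_in, credits_out, net)
```

A persistently positive net means players are accumulating currency faster than they spend it, which may reduce their need to buy more; a persistently negative net may hurt the fun factor.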

Competition in the social gaming space is fierce, and the Santa Monica, California-based company is constantly adding new features to its top two titles, Bingo Blitz and Bingo Rush 2, in the hopes of attracting and retaining customers. The company releases at least one new feature every week for each of its top four platforms: iOS, Android, Facebook, and Amazon’s Kindle.

But connecting the dots between the addition of new features and their profitability was not always easy. In particular, Buffalo’s developers struggled to ensure that each new feature had the appropriate level of instrumentation that allowed the company’s business analysts to determine whether a new feature was helping or hurting.

“One of our biggest bottlenecks in the past was at the ETL layer,” Sohl says. “Every time we launch anything new in the game we need data warehouse instrumentation around that to measure the success and usage for those features. Previously it was extremely cumbersome in the ETL pipeline because that required going in and touching ETL code and modifying database schemas. So what happened more often than not, the functionality didn’t get instrumented or the data would make it into raw logs and never get ingested all the way to Vertica.”

In May, the company finished a major overhaul of its ETL system, and today it has the confidence to know that each event is appropriately logged. The new system utilizes Talend’s ETL tools and Flume to move events from the game servers into the Vertica warehouse.
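The article does not say how the new pipeline avoids the schema changes Sohl complains about. One common approach, sketched here purely as an assumption, is to log self-describing events whose feature-specific fields travel in a single generic attributes column, so a new feature needs no ETL or schema edits. The function names and the field layout are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_event(name, player_id, **attributes):
    """Serialize a game event as one JSON log line (hypothetical envelope format)."""
    return json.dumps(
        {
            "event": name,
            "player_id": player_id,
            "ts": datetime.now(timezone.utc).isoformat(),
            "attrs": attributes,  # feature-specific fields, free-form
        },
        sort_keys=True,
    )


def to_row(line):
    """Map a log line onto a fixed four-column row; attrs stay as a JSON string."""
    record = json.loads(line)
    return (
        record["event"],
        record["player_id"],
        record["ts"],
        json.dumps(record["attrs"], sort_keys=True),
    )


# A brand-new feature only needs a new event name and attributes.
row = to_row(make_event("daily_bonus_claimed", "p42", credits=25, platform="ios"))
print(row[0], row[3])
```

The trade-off in a design like this is that analysts must unpack the attributes column at query time, in exchange for never blocking a feature launch on a schema change.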

One of the reasons Buffalo Studios selected Talend is that its developers weren’t restricted to using the user interface. “We do a lot of fairly close-to-the-metal stuff for our system, and the ability to write those pieces as custom Java components, hook them into Talend, and still use the niceties of the higher-level UI was the big plus for us,” Sohl says.

The new ETL system today is giving Buffalo Studios’ analysts very close to real-time input on the effectiveness of new features. It’s also helped the company’s fraud detection. “Now we can detect fraudulent activity within 10 minutes,” Sohl says. “If we had a bug or an exploit and left it to run for an entire day, it could have a catastrophic effect on the game. So now we’re able to detect these things very quickly.”
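As a rough illustration of what near-real-time detection can look like, the sketch below flags any player whose winnings inside a rolling 10-minute window exceed a limit. The class name, the threshold, and the rule itself are assumptions made for this example; the article does not describe Buffalo Studios' actual fraud checks. The sketch also assumes events for each player arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)   # mirrors the 10-minute figure quoted above
MAX_CREDITS_WON = 10_000         # hypothetical threshold, not from the article


class WinRateMonitor:
    """Track credits won per player over a rolling time window."""

    def __init__(self, window=WINDOW, limit=MAX_CREDITS_WON):
        self.window = window
        self.limit = limit
        self.events = defaultdict(deque)   # player_id -> deque of (timestamp, credits)
        self.totals = defaultdict(int)     # player_id -> credits inside the window

    def record(self, player_id, timestamp, credits):
        """Add a win event; return True if the player now exceeds the limit."""
        queue = self.events[player_id]
        queue.append((timestamp, credits))
        self.totals[player_id] += credits
        # Evict events that have aged out of the window.
        while queue and timestamp - queue[0][0] > self.window:
            _, old_credits = queue.popleft()
            self.totals[player_id] -= old_credits
        return self.totals[player_id] > self.limit


monitor = WinRateMonitor()
start = datetime(2014, 6, 23, 12, 0)
first = monitor.record("p1", start, 6_000)
second = monitor.record("p1", start + timedelta(minutes=5), 5_000)
third = monitor.record("p1", start + timedelta(minutes=20), 100)
print(first, second, third)
```

In this run the second event pushes the player over the limit within the window, while the third arrives after the earlier wins have expired and raises no flag.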

Sohl says the new ETL setup is working so well that it may be adopted by other Caesars Interactive Entertainment properties, including Playtika, which develops the Slotomania and Farkle Pro titles, and World Series of Poker. And as Buffalo Studios attracts more users, the company may need to follow in the footsteps of other social gaming outfits and adopt Hadoop. “We’ll probably eventually end up in a hybrid model where we’re keeping a sliding window in Vertica and then archiving longer running stuff in Hadoop,” Sohl says. “But as of right now our BI team is addicted to the performance and ease of use of Vertica.”
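The hybrid model Sohl anticipates boils down to a retention rule: recent data stays in the fast warehouse and older data moves to cheaper archival storage. A minimal sketch of that rule follows, assuming daily partitions and a 90-day window. Both are hypothetical choices; the article gives neither a window length nor a partitioning scheme.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # hypothetical window length, not from the article


def split_partitions(partition_dates, today, retention_days=RETENTION_DAYS):
    """Split daily partitions into (keep_in_warehouse, move_to_archive)."""
    cutoff = today - timedelta(days=retention_days)
    keep = sorted(d for d in partition_dates if d >= cutoff)
    archive = sorted(d for d in partition_dates if d < cutoff)
    return keep, archive


partitions = [date(2014, 1, 1), date(2014, 3, 1), date(2014, 6, 1)]
keep, archive = split_partitions(partitions, today=date(2014, 6, 23))
print("keep:", keep)
print("archive:", archive)
```

Running a rule like this on a schedule keeps the warehouse at a roughly constant size while the archive grows.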
