Game-Changer: The Big Data Behind Social Gaming
The rise of social gaming is changing the fabric of the video game market. Instead of buying game consoles, people are increasingly logging into social gaming apps through iPhones and Facebook. In this booming market, how game developers utilize data makes all the difference in the world.
Zynga is credited with kicking off the social gaming craze in 2009, when it introduced FarmVille for Facebook. The hook was set, and before long, people started spending real money to gain credits in the virtual world of social gaming.
At any given hour, tens of millions of people are logged into their favorite social game, be it Candy Crush Saga from King, Clash of Clans from Supercell, or Bingo Blitz from Buffalo Studios. While social and mobile gaming is still a fraction of the overall $80 billion video game market, it’s catching up fast.
Huge growth in this space means even bigger data, and that’s leading the industry’s top gaming titles to prepare for a data deluge. The capability to ingest and analyze billions or tens of billions of data points per day is par for the social gaming course, says Barry Sohl, the chief technology officer at Buffalo Studios, which is owned by Caesars Interactive Entertainment.
“It’s absolutely critical,” Sohl says. “Our entire industry is very data driven. The vast majority of our revenue comes from a small percentage of our users, so figuring out how to increase that conversion rate by even a small amount…has a fundamental impact on revenue, and our ability to do that is almost entirely driven by the data.”
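Sohl’s point about conversion rates is, at bottom, simple arithmetic: when nearly all revenue comes from a small paying minority, a small lift in that minority moves total revenue almost one-for-one. A back-of-the-envelope sketch, with all figures hypothetical rather than Buffalo Studios numbers:

```python
# Back-of-the-envelope: revenue sensitivity to paying-user conversion.
# All numbers here are hypothetical illustrations, not company figures.

def monthly_revenue(users, conversion_rate, avg_spend_per_payer):
    """Revenue = (users who pay) x (what each payer spends)."""
    return users * conversion_rate * avg_spend_per_payer

users = 10_000_000   # monthly active users (hypothetical)
spend = 20.0         # average monthly spend per paying user (hypothetical)

base = monthly_revenue(users, 0.020, spend)    # 2.0% of users pay
lifted = monthly_revenue(users, 0.022, spend)  # 2.2% of users pay

print(f"base:   ${base:,.0f}")             # $4,000,000
print(f"lifted: ${lifted:,.0f}")           # $4,400,000
print(f"lift:   {lifted / base - 1:.0%}")  # 10% more revenue from +0.2 points
```

A 0.2-point change in conversion is invisible in raw user counts but is a 10% swing in revenue, which is why “figuring out how to increase that conversion rate by even a small amount” is worth a data pipeline.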
In social gaming, virtually every move a player makes is logged and tracked by the game provider. This creates a feedback loop between gamer and developer that simply doesn’t exist in the traditional video game console market. The opportunity to sell credits to customers also changes the financial dynamics of the market.
Game developers use a variety of tools to analyze their big data sets, with Hadoop being one of the most popular. Zynga’s big data setup is said to be one of the largest Hadoop clusters in the space, storing 1.4 PB of data. The company behind FarmVille 2 also uses Splunk, MySQL, and HP Vertica to collect and analyze about 60 billion rows of data per day.
Buffalo Studios also runs a high-performance Vertica data warehouse for SQL reports on player data. The company uses Apache Flume to ingest about a billion rows of data per day, or about 100GB, into the warehouse, which is hooked up to a Tableau visualization and reporting system.
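The Flume leg of a pipeline like this is ordinarily wired up with an agent configuration file. The fragment below is a generic illustration of the pattern, not Buffalo Studios’ actual setup: the agent, host, and path names are invented, and loading Vertica in practice typically goes through a custom or JDBC-style sink rather than the stock ones.

```properties
# Hypothetical Flume agent: tail game-server event logs into a channel,
# then hand batches to a downstream loader for the data warehouse.
agent.sources  = gamelogs
agent.channels = mem
agent.sinks    = warehouse

# Source: follow the game servers' append-only event log (invented path)
agent.sources.gamelogs.type = exec
agent.sources.gamelogs.command = tail -F /var/log/game/events.log
agent.sources.gamelogs.channels = mem

# Channel: buffer events in memory between source and sink
agent.channels.mem.type = memory
agent.channels.mem.capacity = 100000

# Sink: forward event batches to the warehouse loader (invented host)
agent.sinks.warehouse.type = avro
agent.sinks.warehouse.hostname = etl-loader.example.internal
agent.sinks.warehouse.port = 4141
agent.sinks.warehouse.channel = mem
```

At a billion rows a day, the memory channel’s capacity and the sink’s batching behavior are the knobs that determine whether the agent keeps up or starts dropping events.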
This BI system is used to track a wide range of in-game metrics, including when to give away credits to customers during a game, whether newly added features are succeeding, and signs of fraud. “There are a ton of different criteria to watch,” Sohl says. “Customers are buying credits and coins and spending them to play the game, but they also win credits and coins back, so you have to carefully manage how much currency you’re putting into the economy versus how much you’re taking out. It’s a very careful balance between keeping the fun factor of the game up, but also driving revenue.”
Competition in the social gaming space is fierce, and the Santa Monica, California-based company is constantly adding new features to its top two titles, Bingo Blitz and Bingo Rush 2, in the hopes of attracting and retaining customers. The company releases at least one new feature every week for each of its top four platforms: iOS, Android, Facebook, and Amazon’s Kindle.
But connecting the dots between the addition of new features and their profitability was not always easy. In particular, Buffalo’s developers struggled to ensure that each new feature had the appropriate level of instrumentation that allowed the company’s business analysts to determine whether a new feature was helping or hurting.
“One of our biggest bottlenecks in the past was at the ETL layer,” Sohl says. “Every time we launch anything new in the game we need data warehouse instrumentation around that to measure the success and usage for those features. Previously it was extremely cumbersome in the ETL pipeline because that required going in and touching ETL code and modifying database schemas. So what happened more often than not, the functionality didn’t get instrumented or the data would make it into raw logs and never get ingested all the way to Vertica.”
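One common way around the schema bottleneck Sohl describes is a generic event envelope: fixed columns the warehouse always expects, plus a free-form payload that new features can populate without anyone touching ETL code or database schemas. The sketch below is an illustrative pattern, not Buffalo Studios’ actual event format; the field and feature names are invented.

```python
# A generic event envelope lets new features log without schema changes:
# fixed columns for the warehouse, a free-form payload for feature data.
# Illustrative pattern only -- not Buffalo Studios' actual schema.
import json
import time

def make_event(player_id: str, feature: str, action: str, **payload) -> str:
    """Serialize one gameplay event as a single JSON log line."""
    return json.dumps({
        "ts": time.time(),        # event timestamp
        "player_id": player_id,   # who did it
        "feature": feature,       # which game feature fired
        "action": action,         # what happened
        "payload": payload,       # feature-specific fields, no schema edits
    })

# A brand-new feature can start logging immediately:
line = make_event("p123", "daily_tournament", "joined",
                  bracket="gold", buy_in=50)
event = json.loads(line)
print(event["feature"], event["payload"]["buy_in"])
```

With an envelope like this, the ETL layer only has to understand one record shape, and the analysts decide later which payload fields are worth promoting to warehouse columns.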
In May, the company finished a major overhaul of its ETL system, and today it has the confidence to know that each event is appropriately logged. The new system utilizes Talend’s ETL tools and Flume to move events from the game servers into the Vertica warehouse.
One of the reasons Buffalo Studios selected Talend is that its developers weren’t restricted to the graphical user interface. “We do a lot of fairly close-to-the-metal stuff for our system, and the ability to write those pieces as custom Java components, hook them into Talend, and still use the niceties of the higher-level UI was the big plus for us,” Sohl says.
The new ETL system now gives Buffalo Studios’ analysts near-real-time insight into the effectiveness of new features. It has also helped the company’s fraud detection. “Now we can detect fraudulent activity within 10 minutes,” Sohl says. “If we had a bug or an exploit and left it to run for an entire day, it could have a catastrophic effect on the game. So now we’re able to detect these things very quickly.”
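Catching an exploit within minutes rather than a day typically comes down to comparing each player’s recent activity against a plausibility threshold over a short sliding window. The toy sketch below illustrates the idea; the threshold, window, and event shapes are invented, not Buffalo Studios’ actual detection logic.

```python
# Toy sliding-window anomaly check: flag players whose credit winnings
# in the last 10 minutes exceed a plausible rate. All thresholds and
# data are illustrative, not Buffalo Studios' actual detection logic.
from collections import defaultdict, deque

WINDOW_SECONDS = 600             # 10-minute window
MAX_CREDITS_PER_WINDOW = 10_000  # hypothetical plausibility ceiling

windows = defaultdict(deque)     # player_id -> deque of (ts, credits_won)

def record_win(player_id, ts, credits):
    """Record a win and return True if the player now looks anomalous."""
    q = windows[player_id]
    q.append((ts, credits))
    # Evict events that have aged out of the window.
    while q and q[0][0] < ts - WINDOW_SECONDS:
        q.popleft()
    total = sum(c for _, c in q)
    return total > MAX_CREDITS_PER_WINDOW

# Example: a burst of wins trips the flag; older wins age out.
print(record_win("p1", 0, 4_000))    # False: 4,000 credits in window
print(record_win("p1", 400, 9_000))  # True: 13,000 credits in 10 minutes
```

Because the check runs per event rather than in a nightly batch, an exploit shows up as soon as the anomalous window closes over it, which is what turns a day-long exposure into a 10-minute one.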
Sohl says the new ETL setup is working so well that it may be adopted by other Caesars Interactive Entertainment properties, including Playtika, which develops the Slotomania and Farkle Pro titles, and World Series of Poker. And as Buffalo Studios attracts more users, the company may need to follow in the footsteps of other social gaming outfits and adopt Hadoop. “We’ll probably eventually end up in a hybrid model where we’re keeping a sliding window in Vertica and then archiving longer running stuff in Hadoop,” Sohl says. “But as of right now our BI team is addicted to the performance and ease of use of Vertica.”