Inside Fortnite’s Massive Data Analytics Pipeline
With 125 million players around the world, Fortnite has set a new standard of success for massively multiplayer games. But pulling together all the servers, databases, and data pipelines to manage 92 million events per minute was no small feat, as Epic Games' director of platform Chris Dyl recently shared.
Epic Games relies on Amazon Web Services' (AWS) public cloud data centers to keep Fortnite running 24 hours a day, 365 days per year. Dyl appeared at a recent AWS Summit event in New York City to share his company's AWS story.
The scale of Fortnite infrastructure running on AWS is immense. According to Dyl, Fortnite runs across 12 AWS data centers, encompassing 24 Availability Zones (AZs). Peak Fortnite load is 10x bigger than the smallest load, so Epic relies on the scale-up and scale-down features of AWS's Elastic Compute Cloud (EC2) infrastructure to keep its compute bill (somewhat) manageable.
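To see why a 10x swing between trough and peak matters for the compute bill, here is an illustrative sketch (not Epic's code) of the kind of target-tracking scaling decision EC2 Auto Scaling automates; the load and capacity numbers are hypothetical:

```python
# Hypothetical target-tracking scaling decision: size the fleet so each
# instance stays near a target utilization. Numbers are illustrative only.
import math

def desired_instances(current_load, capacity_per_instance, target_utilization=0.6):
    """Return the instance count needed to keep utilization near target."""
    required = current_load / (capacity_per_instance * target_utilization)
    return max(1, math.ceil(required))

trough = desired_instances(current_load=10_000, capacity_per_instance=100)
peak = desired_instances(current_load=100_000, capacity_per_instance=100)
print(trough, peak)  # the peak fleet is roughly 10x the trough fleet
```

With a fixed fleet sized for peak, Epic would pay for ~10x capacity around the clock; scaling down to the trough fleet during quiet hours is what keeps the bill manageable.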
On the data analytics front, the company ingests 92 million events per minute (or about 54 billion events per day) from Fortnite clients into AWS using Amazon's Kinesis Streams products. The company is big into Kinesis Streams, and has about 5,000 shards of Kinesis running on AWS (along with just about every other AWS service), according to Dyl.
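A bit of back-of-envelope arithmetic (not official sizing guidance) shows why a stream that size needs thousands of shards, assuming the published Kinesis per-shard ingest limit of 1,000 records per second:

```python
# Back-of-envelope shard sizing for 92M events/minute, assuming the
# Kinesis Data Streams ingest limit of 1,000 records/sec per shard.
import math

EVENTS_PER_MINUTE = 92_000_000
RECORDS_PER_SHARD_PER_SEC = 1_000

events_per_sec = EVENTS_PER_MINUTE / 60  # ~1.53M records/sec
min_shards = math.ceil(events_per_sec / RECORDS_PER_SHARD_PER_SEC)

print(f"{events_per_sec:,.0f} records/sec -> at least {min_shards:,} shards")
```

The bare minimum works out to roughly 1,500 shards; a fleet of ~5,000 leaves headroom for traffic spikes and unevenly distributed partition keys.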
The company runs real-time and batch analytics on the data flowing through those pipelines. “The real-time pipeline is largely driven off of Spark and DynamoDB for temporary storage, which feeds a number of different sources [including] Grafana for scorecards and some limited time ad-hoc SQL type stuff that we do,” he says.
On the batch side, Epic relies heavily on Amazon's version of Apache Hadoop, Elastic MapReduce (EMR), for processing. The company uses 22 production EMR clusters (encompassing more than 4,000 EC2 instances) to run more than 8,000 batch ETL jobs per day. Those ETL jobs summarize data into Hive tables, which are then provided to analysts to explore via Tableau's BI tool, as well as via ad-hoc SQL analysis, Dyl says.
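As a toy sketch of what such a summarization pass does (the event schema here is hypothetical, not Epic's): raw per-player events get rolled up into compact aggregates before they land in Hive tables for analysts.

```python
# Toy ETL summarization: roll raw events up into per-event-type counts.
# The schema is invented for illustration; real jobs run at EMR scale.
from collections import Counter

raw_events = [
    {"player": "p1", "event": "match_start"},
    {"player": "p1", "event": "elimination"},
    {"player": "p2", "event": "match_start"},
    {"player": "p2", "event": "match_start"},
]

summary = Counter(e["event"] for e in raw_events)
print(dict(summary))  # {'match_start': 3, 'elimination': 1}
```

In the real pipeline this aggregation would be expressed as a Hive or Spark query running across billions of rows; the point is that analysts query the small summary tables, not the raw firehose.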
Underlying the batch and real-time analytics is S3, which Epic uses as a data warehouse. Its S3 data warehouse currently holds 14 PB of data, and it's growing at a rate of 2 PB per month.
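Those two figures imply something about the events themselves. Assuming most of the 2 PB/month growth comes from the 54 billion daily events (a rough simplification), the average stored footprint per event falls out of simple arithmetic:

```python
# Rough arithmetic implied by the published figures, assuming the
# monthly S3 growth is dominated by event data (a simplification).
PB = 10**15
monthly_growth_bytes = 2 * PB
events_per_day = 54 * 10**9

daily_growth = monthly_growth_bytes / 30          # ~67 TB/day
bytes_per_event = daily_growth / events_per_day   # ~1.2 KB/event
print(f"~{bytes_per_event:,.0f} bytes stored per event")
```

A little over a kilobyte per event is plausible for structured telemetry, which suggests the raw firehose is being stored with relatively little reduction.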
The company uses its big data analytics pipeline for a number of things, including detecting any issues or problems that may be occurring. Some problems can only be detected by analyzing data from Fortnite clients, including PC, Web, and mobile interfaces. The company also analyzes players’ social media interaction to assess design decisions, identify player sentiment, and adjust the game.
“The clients are really a great, great place to actually understand exactly what the user experience is and what kind of problems they’re running into, so we use this as an early detection system for a lot of the issues and a lot of problems,” he says. “Some things you can’t even detect from backend services, such as ISP issues or other things that go on.”
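An early-detection system of this kind can be as simple as comparing the latest client-reported error rate against a recent baseline. The following is an illustrative sketch (hypothetical, not Epic's detector) of that idea:

```python
# Hypothetical spike detector: flag the latest client error rate if it
# sits several standard deviations above the recent baseline.
from statistics import mean, stdev

def is_anomalous(history, latest, sigmas=3.0):
    """Flag `latest` if it exceeds baseline by more than `sigmas` deviations."""
    baseline, spread = mean(history), stdev(history)
    return latest > baseline + sigmas * spread

error_rates = [0.010, 0.012, 0.011, 0.009, 0.010, 0.011]  # per-minute rates
print(is_anomalous(error_rates, 0.011))  # a normal minute
print(is_anomalous(error_rates, 0.060))  # an ISP-style outage spike
```

Because the signal comes from the clients themselves, this kind of check can surface problems, like a regional ISP outage, that look perfectly healthy from the backend's point of view.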
Fortnite has exploded in popularity over the past nine months, growing by 100x, according to Epic. Now the company is looking to technologies like Kubernetes to help it manage all the back-end servers and services, including dozens of microservices built with Java, Akka, and Go, that it needs to run its game.
“With all the growth we really want to look at better ways of managing all the microservices, so we’re looking at things like EKS (Amazon’s Elastic Container Service for Kubernetes) and Kubernetes to manage this stuff,” Dyl says.
The company is investigating the Amazon GuardDuty service to keep tabs on security threats that constantly arise. “We have a lot of threats coming in all the time and we just really want to make sure we’re on top of that stuff.”
It’s also looking into machine learning technologies, specifically through the Amazon ML Solutions Lab, an Amazon program that connects Amazon machine learning experts with AWS customers and partners.