March 28, 2014

Playing Ball with MLB’s New Analytic Data Feed

Alex Woodie

Baseball has always been a game of numbers and statistics. But Major League Baseball is taking that tradition to new heights this season with a pair of technology products: a play tracking system that will generate a big data feed of player movements on every play, and a video replay system. Both are new for 2014.

Major League Baseball Advanced Media (MLBAM) unveiled the new play tracking system earlier this month at the Sloan Sports Analytics Conference in Boston. While MLBAM isn’t sharing all the details of the new system, it’s believed to use the same type of camera and radar technology as the PITCHf/x system, which has tracked every pitch in every ballpark for the last few years. But instead of tracking just the velocity and movement of pitches, the new system will also track other important elements of the game, including hit balls, defensive players, and baserunners.

“The goal is to revolutionize the way people evaluate baseball, by presenting for the first time the tools that connect all actions that happen on a field to determine how they work together,” MLB.com says in a March 1 story introducing the new system. “This new datastream will enable the industry to understand the whole play on the field–batting, pitching, fielding and baserunning–and enable new metrics for evaluation by clubs, scouts, players and fans.”

Today, many of baseball’s statistics focus on a hitter’s ability to hit for average or power. But the new player tracking system is expected to open up a new world of statistics that rate how well defensive players and baserunners are doing their jobs. You can expect to hear the words “route efficiency” this season to describe, for example, an outfielder’s ability to get a good jump and take a good angle to balls hit into the outfield.
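MLBAM has not published how it calculates route efficiency, so the following is only an illustrative sketch. It assumes the metric is the ratio of the straight-line distance between a fielder's starting point and the spot where he reaches the ball to the distance he actually ran, and it assumes tracked positions arrive as (x, y) coordinates in feet. The function names and the sample coordinates are hypothetical.

```python
import math


def path_length(points):
    """Total distance covered along a sequence of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Straight-line distance divided by distance actually traveled.

    This definition is an assumption made for illustration, not
    MLBAM's published formula. A perfectly direct route scores 1.0;
    a wandering route scores lower.
    """
    if len(points) < 2:
        raise ValueError("need at least two tracked positions")
    traveled = path_length(points)
    if traveled == 0:
        return 1.0  # the fielder never moved, so no distance was wasted
    direct = math.dist(points[0], points[-1])
    return direct / traveled


# Hypothetical outfielder route: drifts sideways before breaking back.
route = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0)]
print(f"route efficiency: {route_efficiency(route):.1%}")
```

Under this assumed definition, a fielder who runs 70 feet to reach a ball that was 50 feet away in a straight line would score about 71 percent.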

Mashing the movement data into geometric data required extensive algorithmic work, says Claudio Silva, PhD and professor of computer science and engineering at NYU Polytechnic School of Engineering. “It’s really very complex algorithms that are going into making this thing work, into the validation process, and actually eventually into all the analysis that people are going to be doing on the metrics,” Silva told MLB.com.

Silva said he felt privileged to be one of the people with access to the new datastream, which he says is going to open up a new world to baseball fans. “I believe that this data is so rich, there are so many interesting things we can do with it, we’re going to be able to comb through this data and find layers and layers of features that we never could see before,” he told MLB.com.

Just as Oakland A’s general manager Billy Beane introduced baseball to the power of objective analytics with his “Moneyball” version of sabermetrics a decade ago, mining the new data feed for insights into players’ defensive and baserunning tendencies could eliminate some of the subjectivity that professional scouts bring to the game.

“There is a lot of quality defensive statistics out there, but they’re not completely accurate,” former MLB general manager Jim Duquette told MLB.com. “Some players . . . range to their left better, some range better to their right, some come in on ground balls better than others, some have better first-step quickness,” he said. “The exciting thing about this new technology is, you can start to take the subjectivity that is given to you by the scout and blend it with raw data now, and come up with a truer picture of evaluating a player.”

The new system, which doesn’t appear to have a name yet, is slated to be installed in only three ballparks this season: Miller Park in Milwaukee, Target Field in Minnesota, and Citi Field in New York. However, it’s expected to be installed in every major league ballpark for the start of the 2015 season, MLB says.

It’s unclear if MLBAM will make the datastreams from these three parks available to everybody, as it does with the PITCHf/x system, or if it will restrict the data to the teams and TV operators. Eventually, all of the data will be open, MLBAM CEO Bob Bowman says.

“The goal over time, and hopefully certainly by this season, is to make these plays available in real time and start the debates,” Bowman says in the MLB.com story. “But we have to make sure baseball operations sees it and they agree that these are accurate renderings. But this year, fans will be able to see these data and these videos.”

MLB’s new replay center in NYC

Speaking of videos, MLBAM is also finishing up the installation of the new instant replay center in its office in the Chelsea neighborhood of Manhattan. Eight umpires will staff the 900-square-foot command center, on call to provide instant replay monitoring for each game. There are 30 massive high-definition TVs in the command center, one for each stadium, and they’re fed by 12 cameras installed in each stadium that can catch each play from multiple angles.

To get the umpires and replay technicians up to speed, MLBAM looked at 50,000 close, reviewable plays from the 2013 season and found that 377 of them would have been reversed, which corresponds to one wrong call every 6.4 games. The analysis found that a game with two reversible calls would have come up once every 90 games, and a game with three once every 810 games. No game had four reversible calls, which shows you how accurate human umpires have been in baseball.
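The one-in-6.4-games figure can be checked with some quick arithmetic. The sketch below assumes a full regular-season schedule of 2,430 games (30 teams playing 162 games each, with two teams per game), a number that does not appear in MLBAM's analysis. The function names are hypothetical.

```python
def scheduled_games(teams=30, games_per_team=162):
    """Regular-season games in a full schedule (two teams per game)."""
    return teams * games_per_team // 2


def games_per_reversal(total_games, reversals):
    """Average number of games played per reversed call."""
    return total_games / reversals


def reversal_rate(reversals, reviewed_plays):
    """Share of reviewed plays that would have been overturned."""
    return reversals / reviewed_plays


total = scheduled_games()  # assumed schedule size, not from the article
print(f"games in the season: {total}")
print(f"games per reversed call: {games_per_reversal(total, 377):.2f}")
print(f"share of close plays reversed: {reversal_rate(377, 50_000):.2%}")
```

Under that assumption the result is about 6.45 games per reversed call, in line with the figure MLBAM reported, and well under 1 percent of the close plays reviewed would have been overturned.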

Numbers have always been a part of baseball. But with the new capability to transform unstructured video feeds and radar data into a real-time stream of game statistics, it’s a whole new ballgame.

Datanami