Big Data’s Next Big Thing: Sports Training and Personalized Medicine
Big data analytics is having a huge impact across lots of industries, with financial services and retail leading the pack. But according to one expert, there’s a curious lack of human analytics talent focused on the potential to apply big data techniques to personalized athletic training and medicine.
There’s massive untapped potential to use big data analytics to develop personalized regimens to improve athletic performance and people’s overall health, says David Epstein, the author of The Sports Gene and longtime contributor to Sports Illustrated.
“There’s a tremendous opportunity now, from the highest level of sports to just getting people healthier,” Epstein said at the Hadoop Summit last month. “I hope that people here will apply some of the magic they’ve applied in other industries.”
During a fascinating 30-minute keynote, Epstein shared a short history of big data in sports, which was all the more impressive because he didn’t mention “Moneyball” once. From obscure sports like handball, shot put, and long jump to more visible ones like baseball, basketball, and golf, Epstein showed how athletes are extracting non-intuitive patterns from big data collections and using that knowledge to make small adjustments to improve their performance in the field.
Take, for example, the work that trainers did preparing Great Britain’s athletes for the 2012 Summer Olympic Games. An analysis of the biomechanics of the long jump shows there are three main variables that affect the result: the speed of the runner when he hits the board, the force he exerts on the board, and the angle at which he takes off from the board.
“Her job was just to see which of these three variables they could do something with,” Epstein said. “She figured out really quickly….she couldn’t change sprint speed at all at that point and she couldn’t change force on the board because everybody has pretty good technique. Technique is the thing that most of the people at that level spend all their time working on. She did find she could change angle very easily. People never work on angle. It seems like something silly.”
For two years prior to the Games, the trainer worked with Great Britain’s best jumper, and focused on nothing but the jump angle. “Come the Olympics, he doesn’t have one of the 10 fastest approaches to board, or one of five highest forces on board. But he jumps at the perfect angle and wins the gold medal,” Epstein says. “Here’s a guy who probably isn’t one of the 10 best athletes in the world at his sport, but by drilling down into big data and figuring out what mattered, they figured out what they could actually do something about, and what they could change by adding a little science to what they found.”
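To see why takeoff angle alone can move the result, here is a minimal sketch that treats the jumper as an idealized point-mass projectile taking off and landing at the same height. That is a textbook simplification, not the model the British trainers used: it ignores the height of the jumper’s center of mass, air resistance, and the fact that a steeper takeoff usually costs approach speed. The function name and the sample speed and angles are hypothetical, chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def flight_distance_m(takeoff_speed_ms: float, takeoff_angle_deg: float) -> float:
    """Horizontal distance covered by an idealized projectile.

    Assumes takeoff and landing at the same height and no air resistance,
    so distance = v^2 * sin(2 * angle) / g.
    """
    angle_rad = math.radians(takeoff_angle_deg)
    return takeoff_speed_ms ** 2 * math.sin(2 * angle_rad) / G


# Hypothetical jumper: identical takeoff speed, three different takeoff angles.
speed_ms = 9.0
for angle_deg in (18, 20, 22):
    distance = flight_distance_m(speed_ms, angle_deg)
    print(f"{angle_deg} degrees -> {distance:.2f} m of flight")
```

Even in this toy model, a few degrees of takeoff angle change the flight distance noticeably while speed stays fixed, which is the kind of lever the trainer was looking for.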
Big Sporting Data
Those sorts of stories are becoming more common as athletic trainers figure out how to use big data to their advantage. In many cases, Epstein says, this big data approach can help amateur athletes learn in a matter of months what it takes world-class athletes a lifetime of training to develop.
World-class athletes may appear to have super-human abilities. How else can you explain how soccer superstar Cristiano Ronaldo perfectly heads a ball into the net when the lights are turned off just after the pass? It’s all about how athletes learn to “chunk” data in their brains, says Epstein.
“The way that elite athletes do what they do, to make it look like they have super-human reaction speed, is they pick up body cues–the rotation of shoulders and torso, shifts of the lower leg and rotation of the ball,” Epstein said. “So they see the future of where the ball is going before it gets there. It turns out that kind of information processing is the hallmark of expertise, and we’ve learned a lot about it by gaze tracking.”
Armed with the insight from gaze tracking machines, amateur athletes are emulating their elite colleagues and shortcutting the training required to achieve super-human capabilities, Epstein says. In golf, for example, gaze tracking has detected the “quiet eye” period that is the hallmark of the world’s top professional golfers. Golf instructors have figured out they can get amateurs to calm their wandering eyes by telling them to read a word under the golf ball, and the result is that they take on average 1.2 strokes off their round, Epstein said.
“That quiet-eye period is something that’s learned implicitly over years. Now it can be taught in weeks,” Epstein said. “For the first time ever, we’re starting to use huge data collections, big data, and gaze tracking from expert athletes, combining it with a little bit about what we know about the science, to be able to start teaching people skills that only experts have much quicker.”
It can also tell us what not to do. Take baseball, for example. Because the minimum human reaction time eats up a large share of the time it takes a pitched baseball to travel the 60.5 feet from the pitcher’s mound to home plate, the old adage “keep your eye on the ball” is actually impossible to follow.
“If you have kids and you tell them to keep their eye on the ball in Little League — We actually don’t have a visual system capable of tracking objects that are moving that fast….so keeping eyes on the ball is total nonsense,” Epstein said. “You literally cannot do it. Close your eyes when the ball is half way, and it wouldn’t make any difference.”
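The arithmetic behind that claim can be sketched in a few lines. The example below computes how long a pitch takes to cover the 60.5 feet at constant speed and compares it with a reaction time. The 200-millisecond reaction time is an assumed round figure for illustration, not a number from Epstein’s talk, and the calculation ignores air drag and the fact that pitchers release the ball in front of the rubber.

```python
MOUND_TO_PLATE_FT = 60.5   # distance quoted in the article
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

# Assumption for illustration only: a round-number human reaction time.
ASSUMED_REACTION_MS = 200.0


def flight_time_ms(pitch_speed_mph: float,
                   distance_ft: float = MOUND_TO_PLATE_FT) -> float:
    """Time in milliseconds for a ball at constant speed to cover distance_ft."""
    speed_ft_per_s = pitch_speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return distance_ft / speed_ft_per_s * 1000.0


for mph in (70, 80, 90, 100):
    total_ms = flight_time_ms(mph)
    remaining_ms = total_ms - ASSUMED_REACTION_MS
    share = ASSUMED_REACTION_MS / total_ms
    print(f"{mph} mph: {total_ms:.0f} ms in flight, "
          f"{remaining_ms:.0f} ms left after reacting "
          f"({share:.0%} of the flight spent reacting)")
```

Under these assumptions, the faster the pitch, the larger the share of the flight that is gone before a hitter can respond to anything seen at release, which is why picking up early cues matters more than tracking the ball all the way in.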
Epstein doesn’t appear to take delight in squashing time-honored hitting advice that’s been handed down over generations. For him, it’s all about the data. And if the data says what you’re doing isn’t working, then you should move on to something else.
But Epstein doesn’t always play the cold statistician. In fact, he shared with the Hadoop Summit audience a fascinating account of his own history in track and how he came to a broad realization of the uniqueness of human physiology largely through guts and determination.
As a walk-on runner in college, Epstein wasn’t expected to last long. Perhaps to discourage the slow-running newbie, Epstein was paired up with a roommate who was a star athlete and a member of Canada’s national team. In the 800-meter race, Epstein was 20 seconds slower than his roommate–eons in that event. Things didn’t look good.
“But I stuck with it,” he said. While he couldn’t handle his roommate’s running workload, he did scaled-down versions of the same workouts. As time went on, Epstein got faster, while his roommate stagnated, despite the larger workload. Eventually, Epstein passed his roommate on the track, which led him to receive an award for succeeding despite an “unusual challenge and difficulty.” His racing career took off, while his once-promising roommate was written off as a “head case.”
The results felt good, but Epstein knew there had to be something else going on. He read about the HERITAGE Family Study, a multi-generational study that found links between genetics and response to exercise. He researched another study that identified a collection of 21 genes that are good predictors of who will respond most to aerobic training.
Lo and behold, Epstein had most of those genes. “I was the classic low-baseline, high-responder,” he said. “I got winded walking up stairs. A pulmonologist said I had symptoms consistent with the early stages of emphysema. That was two years before I ran at the U.S. National Championships. I assume most people like that quit.”
The point Epstein made is that there are loads of non-intuitive connections out there that have yet to be discovered, and it’s going to take data scientists and analysts to tease them out. Advances in gene sequencing are opening up enticing possibilities for personalized training and personalized healthcare, but the personnel and expertise are woefully lacking.
“I was at NIH last week, at the sequencing lab,” Epstein said. “They’re really proud they can sequence the whole genome in three days. Before it took six years. They’re bragging about how quickly they can sequence, but they’re so far behind in being able to analyze it. That’s where the bottleneck is.”
While the Human Genome Project may have opened the door to amazing things, people have yet to really benefit from them. “It was amazing what we did, but the results coming out of it haven’t really changed how healthcare and exercise are applied, because we’re so far behind the actual data,” he said. “There’s so much opportunity to revolutionize healthcare to do what we’ve done for music, for social interaction, to personalize their environment, to do that for actually taking control of their health.”
Instead of gathering more data, people should be focused on analyzing the healthcare data that we already have, he says. “We don’t need more studies. We need someone to put the data together, to revolutionize the way athletes are trained,” he said. “Sports is drowning in data but the bottom line is it needs people to analyze it. All we’re getting is the low-hanging fruit–stuff that’s more fun for fans to see the inside of the game as opposed to…actually speed up people’s skill development.”
Epstein encouraged the Hadoop Summit audience to consider the potential impact that they can have in the field of sports training and personalized medicine. “Across the consumer world, we’re allowing people to personalize the things they do. But very little attention is being paid to the areas that have a lot of interesting and commercial potential,” he said. “There’s a lot of money in exercise, there’s a lot of money to be had in people who can stem the tide of chronic disease…. Right now, even if we weren’t gathering any more [data], we can be doing things that would affect billions of people, literally, in how they take care of themselves.”