Fifty percent of big data projects fail, Jim Kaskade, CEO of Infochimps, told an audience at the Strata + Hadoop World conference last month. One of the chief reasons that people's projects fail: their eye is not on the ball. It’s the application, not the data, that should be the primary focus, he says.
This theme, that we should be focused on applications and not technology, has started to reverberate in the conference halls, analyst reports, and vendor press releases as the industry tries to cut through the hype around big data and get to the meat: the applications. Kaskade said it bluntly:
“It all starts with the application, so stop building your big data sandboxes. Stop building your big data stacks. Stop building your big data Hadoop clusters without a purpose,” he admonished his audience. “If you start with the application -- the use case -- you have a purpose. You have focus. You narrow the scope of your technology choices, and most importantly, you accelerate time to value.”
Decreasing time to value was ultimately the message Kaskade intended to deliver when he took the Strata + Hadoop World stage. He relayed a personal story about his mother’s fight with leukemia and highlighted the work Dr. Björn Brücher is doing to advance the use of big data technologies in cancer treatment.
Echoing Kaskade’s admonitions, Brücher has been critical of how technology has been applied in health-related research. “It seems humans did not learn from the last 30 years of research,” he wrote publicly earlier this year. “We should have learned much more from the mistakes, during and after the Human Genome Project, but they are repeated.”
Brücher voiced his displeasure at how billions of dollars have been spent over the last 30 years, saying that their "effects and translational benefits had been marginal."
“Leadership here should not mean, just declaring visions and spending money on it,” he said. “Differentiated strategies are needed, which are very well thought through for enabling it, concentrating joined forces on different goals: elaborating different missions and taking one step after another.”
Brücher’s concerns target the tendency of people to get carried away with the hype and throw money where they should instead be throwing ideas. According to Kaskade, Brücher is hoping to reverse these trends by starting with an application first. “It’s Dr. Brücher, and our belief that we can help people with cancer by predicting individual outcomes, and then proactively applying preventative measures, and possibly giving them normal life spans.”
While Kaskade doesn’t go into detail about how these applications will work, his point is about how they are being approached. Once you have an application or use case, he says, what’s next is the easy part: applying the data. “I’ll argue, it’s very simple. You’re probably a Global 2000 company that’s only used 15% of your data assets. So let’s expand that to 100% of your internal data assets. Let’s add another 100% of external data assets.”
Only once the application is defined and the data is in place, Kaskade says, is it time to apply the analytics piece.
Ultimately (and maybe somewhat between the lines) Kaskade makes the point that modern businesses live in the webscale world created by the webscale giants, who built their infrastructures with their own enormous needs in mind. Everyone else is surfing in their wake. Rather than getting caught up in whether or not they have a bottle of “Big Data” to rub on their problems, businesses should come up with applications that solve problems, and then see what the wide world of ever-pervasive webscale technology offers to bring those solutions to bear.
Not putting the cart before the horse? It just might be crazy enough to work.