How to Get a ‘Network Effect’ from Your Big Data Lake
One of the hidden benefits of being a data-driven organization is a so-called “network effect” that occurs around data and analytics. When an organization has several successful big data analytics projects under its belt, it often becomes easier to see how data can be used to benefit the organization in profound new ways.
Creating a Hadoop-based data lake is often the first step in going down the big data analytics road. Without data and a place to put it (often a Hadoop cluster), you have nowhere to begin, says Ben Werther, the CEO of Hadoop application developer Platfora.
“As more and more raw data gets put into this Hadoop-based cluster in its raw form, it becomes a resource for the questions you ask,” Werther says. “Nobody is going to jump there in one step and say ‘I’m going to put everything in my universe in the data lake.’ But you also can’t answer questions until you have data in the data lake.”
The next step: pick a use case. It may be a glaring problem at the company, such as an increase in fraudulent transactions. Or perhaps the challenge is about improving the status quo, such as getting store associates to stick around a little longer. In any case, you should be able to bring data to bear on the problem (hopefully lots of it).
The most common big data use cases among Platfora customers revolve around customer analytics, cybersecurity, and the Internet of Things (IoT), Werther says. “Each of them is intrinsically different because of the high volume and the variety of data,” he says. “We can answer really interesting questions about customer behavior that are big data in nature in each of these areas.”
Platfora is keen to present its product as an “end-to-end” solution that allows regular business users to get insight out of data stored in Hadoop. Platfora customers don’t need to buy separate products for the various stages of a big data analytics project, such as data munging and preparation, the actual analytics and algorithms, and final presentation.
After their first successful big data project, Platfora customers often get that “aha!” moment, Werther says. “You move on to a second use case and third use case, and pretty soon you have a network effect of the data that’s starting to be productionalized in that data lake that inspires new questions and new capabilities,” he says.
This network effect can lead an organization to find creative new uses for its existing data, and to seek out new data sources to answer a widening circle of questions. “There’s organic growth because of the new capabilities it brings,” Werther says. “You end up with more and more data ending up there, but you do it step-wise by solving problems.”
One of the hottest workloads on Hadoop at the moment is the enterprise data warehouse (EDW) offload. By moving all of their SQL and ETL workloads from a Teradata or an Oracle environment into Hadoop, the thinking goes, companies can save a lot of money on expensive EDWs. Werther, however, is not particularly enamored with trying to shoehorn existing business intelligence processes onto Hadoop. “That’s where there’s been false promises. That’s a path to failure,” he says. “Just because it’s familiar doesn’t mean it’s the best thing to do. There’s a much, much better way.”
The rise of the Hadoop ecosystem has given people new ways of utilizing data and analytics. It has also rewritten the economics of storing vast amounts of data that organizations previously discarded. This has piqued the curiosity of many organizations, prompting them to get started with Hadoop and begin filling their own data lakes.
Platfora commonly runs into frustrated business analysts who say they’re feeling data-rich but can’t get answers to any of their questions. Requests for reports get thrown over the wall to an ETL developer, who comes back months later with a report. But by then, the demands have changed, or the opportunity to act on the data has passed.
With a well-designed Hadoop application, analytic tasks that previously took months can be accomplished in weeks. In Platfora’s application, users can explore and interact with Hadoop data graphically. “It’s sub-second and visual. It feels like the data is right here in front of you,” Werther says.
Platfora is one of many vendors innovating in a vibrant community around Hadoop. As Werther sees it, Platfora is contributing to an overall pace of innovation that will result in data being “on tap” for everybody within three years. “It won’t be all about the bottleneck through IT,” Werther says. “There will be all sorts of rich capabilities to enhance that analysis, and to share and publish and drive actions from that. And you don’t have to be a data scientist.”
The network effect will take hold once these business analysts get a taste for what’s possible with big data analytics, and then things will really start to get interesting. “People are swimming in data,” Werther says. “But the biggest barrier for a lot of people is the imagination to know what’s possible.”