Modeling Customer Behavior with Analytics and Big Data
For decades, companies spent big bucks on outside specialists to tell them what they were doing wrong and how they could better serve their customers. Today, companies can improve their businesses from the inside by using the latest analytic technology against massive stores of data to generate highly accurate models of customer behavior.
Tom Kersnick, director of big data solutions at Pactera, recently provided some tips on how to build a big data analytic system that can predict customer behavior. By discovering the connections that exist between customers, products, pricing, and sales, companies can improve their sales, increase cross-selling, decrease churn, and (last but not least) make the customer happy and more likely to return.
In the video, Kersnick describes how companies should go about architecting a predictive model framework. For starters, the data scientist should take a close look at the data and identify its potential “golden nuggets.” If the unstructured data is not cooperating, it’s time to put on the “data janitor” hat and bring some order to it, he says.
Next, the scientist should ask what is not being tracked, and how that limits what can be predicted. What questions can be answered with the data? What cannot be answered? “Make sure you understand the modeling objective,” Kersnick says. “Think of model building as working backwards. Start with the desired outcomes and then work your way back to what needs to happen in order to see those outcomes. There is no use in building a great model for the wrong problem.”
Kersnick also advises the budding data scientist to create a training set and play around with the data a little bit. “Get a feel for your data,” he says. “After you’ve achieved clarity on a modeling objective, it is often a good idea to first play around with your data set. You need to know which variables to expect in your model, and then investigate new and/or derived variables as the most powerful way to look for big improvement in prediction accuracy.”
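The workflow of carving out a training set and engineering derived variables can be sketched in a few lines. The sketch below uses hypothetical customer fields (monthly spend, support calls, churn flag) purely for illustration; it is not from the video.

```python
import random

# Hypothetical toy records: (monthly_spend, support_calls, churned).
random.seed(42)
records = [(random.uniform(10, 200), random.randint(0, 8), random.random() < 0.3)
           for _ in range(100)]

# Derived variable: support calls per dollar spent -- a new feature
# engineered from two raw ones, the kind of derived variable Kersnick
# suggests investigating.
featured = [(spend, calls, calls / spend, churned)
            for spend, calls, churned in records]

# Hold out 20% as a test set so the training data can be explored freely.
random.shuffle(featured)
split = int(len(featured) * 0.8)
train, test = featured[:split], featured[split:]
print(len(train), len(test))  # 80 20
```

Exploring `train` (distributions, correlations, odd values) before any modeling is what "getting a feel for your data" looks like in practice.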
Then the real fun begins: algorithms! There is a range of classification algorithms a data scientist can use to identify patterns within a big data set. Kersnick says the support vector machine (SVM) is not easy to program, but is widely considered the best classification algorithm. Another, k-nearest neighbor, is simple and easy to program, while the decision tree delivers fast results with a beautiful interface.
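Of the three, k-nearest neighbor really is simple enough to program by hand. A minimal sketch, on toy churn data invented for illustration: classify a new point by majority vote among its k closest labeled neighbors.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled
    neighbors in `train`, a list of (feature_tuple, label) pairs."""
    neighbors = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data: two clusters of customers labeled "stay" / "churn".
train = [((1.0, 1.0), "stay"), ((1.2, 0.8), "stay"), ((0.9, 1.1), "stay"),
         ((5.0, 5.0), "churn"), ((5.2, 4.8), "churn"), ((4.9, 5.1), "churn")]
print(knn_predict(train, (1.1, 0.9)))  # stay
print(knn_predict(train, (5.1, 5.0)))  # churn
```

SVM and decision trees take considerably more machinery, which is why in practice all three usually come from a library rather than being written from scratch.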
The usability of an algorithm can be as big a factor in the quality of the results as the breadth of its features. “The algorithm people are most familiar with always gets the best result,” Kersnick says. “This of course has less to do with the suitability of the algorithm than the skill level of the analyst.”
Once a model has been built, keep improving upon it. “Predictive modeling is always a continuous cycle,” Kersnick says. “Use the model to generate predictions and see if it can improve your company’s performance. Put your model into action. Find a playground in your company to test and measure the changes in churn, conversion, risk, or whatever you want to model.”
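Measuring the model in a "playground" can be as simple as comparing a metric between a control segment and a segment acted on using the model's predictions. The segments and churn flags below are hypothetical stand-ins, not real results.

```python
def churn_rate(customers):
    """Fraction of customers flagged as churned."""
    return sum(c["churned"] for c in customers) / len(customers)

# Hypothetical playground: one segment left alone, one targeted with
# model-driven retention offers. Flags here are invented for illustration.
control = [{"churned": c} for c in (True, False, False, True, False,
                                    False, False, True, False, False)]
treated = [{"churned": c} for c in (False, False, False, True, False,
                                    False, False, False, False, False)]

print(f"control churn: {churn_rate(control):.0%}")
print(f"treated churn: {churn_rate(treated):.0%}")
```

The gap between the two rates, measured on segments that are otherwise comparable, is the evidence that the model is actually improving the business rather than just scoring well offline.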
Better results can also be had by improving or expanding the data. “Check how more data can improve your model. You can add data in two ways. Simply add more data points to the same data set, or you can add more features to the data set,” he says. But if the data set gets too big, parse it out into a smaller set to keep accuracy high. Always back test new data sets against the previous model as a way to foster continuous improvement.
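The two ways of adding data map directly onto the shape of the data set: new rows versus new columns. A small sketch with invented fields (the tenure column is a hypothetical feature joined from another system):

```python
# Existing data set: rows of (feature_tuple, label).
data = [((120.0, 2), "stay"), ((80.0, 7), "churn")]

# Way 1: add more data points -- new rows with the same features.
data.append(((150.0, 1), "stay"))

# Way 2: add more features -- a new column on every row, e.g. a
# hypothetical tenure-in-months field joined from another system.
tenure = [24, 3, 36]
data = [((*features, t), label) for (features, label), t in zip(data, tenure)]
print(data[0])  # ((120.0, 2, 24), 'stay')
```

Either way, the expanded data set should be back-tested against the previous model on the same held-out examples before the new version replaces it.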
“Maybe the model you just created the first time isn’t going to be as great as the next one you’re going to do. Put it into action and see how it works. Does it improve the previous results? Most likely it does,” Kersnick says. “The secret here is to try multiple models to see which one gives the best result. Continue to iterate to find the best fit.”
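Trying multiple models and keeping the best amounts to scoring each candidate on the same held-out set and picking the winner. The candidates below are hand-written threshold rules standing in for trained classifiers, purely to show the selection loop.

```python
def accuracy(predict, test_set):
    """Fraction of held-out (input, label) examples predicted correctly."""
    return sum(predict(x) == y for x, y in test_set) / len(test_set)

# Hypothetical candidates -- in practice these would be trained models
# (SVM, k-nearest neighbor, decision tree), not hand-written thresholds.
candidates = {
    "threshold@3": lambda x: x > 3,
    "threshold@4": lambda x: x > 4,
    "threshold@6": lambda x: x > 6,
}

# Same held-out examples for every candidate, so scores are comparable.
test_set = [(2, False), (3.5, False), (4.5, True), (5, True), (7, True)]
best = max(candidates, key=lambda name: accuracy(candidates[name], test_set))
print(best)  # threshold@4
```

Each iteration of the cycle repeats this comparison with the latest models and data, which is what "continue to iterate to find the best fit" means operationally.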