Interview: SAP Solidifies Predictive Strategy
Today SAP ushered in a new set of predictive capabilities around its HANA in-memory platform via its Business Objects Predictive Analysis offering.
According to Mani Gill, VP and General Manager of BI solutions at SAP, the company is giving statisticians what they need most: the ability to be productive via a modern, easy-to-use interface designed for building predictive models as well as testing, visualizing, discovering insights and sharing them.
The predictive analytics offering is built on top of HANA, so users can take advantage of its speed and power for real-time operations. It is also possible to layer the company’s analytics on top of non-SAP databases, a point the company says is “proof of commitment to heterogeneity,” a commitment it made when it took over Business Objects almost five years ago.
We talked with SAP’s Mani Gill and Jason Kuo about the announcement in advance of the formal news. Kuo told us that “With all the influx of volume and variety that’s taking place in big data, it’s clear that the leverage that organizations can get out of predictive is much more amplified now. We wanted a predictive solution moving forward that could fully leverage our big data strategy, including HANA.”
Kuo said that when it came to looking at predictive solutions, the big data buzz provided some big incentive. As he told us before our formal Q&A, “[what] we’re bringing to market is what we call ‘from database to decision, predictive analysis.’”
Where does HANA sit in the overall predictive technology?
HANA, as a high-performance, in-memory database, can be used for predictive, but it often isn’t. In a lot of the initial use cases, and in what we see of how HANA is being used today, it serves as an appliance for an almost dedicated datamart or data warehouse, oftentimes for a certain subject area. More and more, the big announcements our company continues to make are about our investments in HANA and how it becomes the foundation not only for analysis but also for transactional applications.
The role of HANA in a data management strategy is broad and deep, beyond just predictive. That said, because all of that data, transactional, analytical and more, is funneling through HANA, and because HANA processes it in memory, it makes a lot of sense to embed predictive capabilities and to run the predictive algorithms on HANA. There is a huge performance advantage in that: you can run those predictive algorithms at high speed and make them available to other environments very quickly. That’s the role of HANA: to run and process the data very quickly. HANA includes a predictive algorithm library. That hasn’t been a huge point that we have talked about in the press, but it has been in there and we’ve created some applications. You might have heard of the smart meter analytics product. That product runs off of HANA and uses the statistical processing predictive algorithms in HANA.
What this new predictive analysis product does is bring the design and visualization front end to people who don’t want to use HANA; if they are using HANA, it’s a powerful front end for the design and visualization of all the processing they’re going to be doing on HANA. It really helps unlock the value of big data and the value that you have in HANA.
You guys are taking advantage of in-memory processing. How would you say that compares to similar analytics offerings?
We’ve got a head start on many of these competitors who are now experimenting with in-memory. It’s clear we’ve made the technology investment in that market. We’ve got a lot of skin in the game there, and we have the customers, the performance and the benchmarks to show for it. We’re making further investments where predictive is actually in the database. We have that with this predictive algorithm library, and you are going to hear about other investments in this area throughout this year. We’re continuing to add to those predictive capabilities in HANA. We’re also bringing this productive design and visualization environment into the offering. The two complement each other very well and go even further when you start to deploy those applications off of the whole back end.
Other vendors will start to have predictive design environments, and they may have an in-memory database, but not the richness and the success we have had with in-memory... They haven’t been running on in-memory as long. They don’t necessarily have the predictive capabilities in the database, and they’re not building applications for business users.
Is R the backbone of this new offering?
R is one of the compelling methods of access. The vast majority [of predictive and data mining users] are using R and they also use other tools. They use some other industry tools as well. Being able to effectively, efficiently provide a design and visualization environment on top of R and use the R algorithms is pretty compelling. It’s a very important part of a typical predictive statistician’s skill set. Whenever there is an opportunity, there are advantages at times to running more local predictive algorithms. Maybe you’re not an R user, which would be one case. Sometimes the predictive algorithms run faster. For example, the predictive algorithms in HANA typically run really fast. We built them into the database as such. It’s not uncommon for customers to use a combination of both R and other predictive libraries.
Where does one of the external software partners’ offerings like that from Sybase fit in?
Sybase brings us IQ, a high-performance data warehouse, and IQ has a rich history of predictive capabilities in it. It has R integration and its own predictive analysis libraries within it too. It tends to use a lot of partner capabilities for that; Fuzzy Logix is one of them and KXEN is another. Like HANA and like predictive analysis, it’s a powerful piece of infrastructure that allows open source R access and also has its own analysis library. The predictive analysis product that we’re announcing works on Sybase IQ as well.
Going back to R. Is it the backbone of this software, or is it tied in?
I’d say it’s tied in. If customers aren’t deploying R, they’re still going to love this product and want it. You need a design and visualization environment. Some people try to do without, and sometimes they just try to use R on its own, but without a visualization and design environment, you’re not nearly as productive. Design and visualization is important, and you’re going to need algorithms. Whether this is a smaller project or a big data project, you’re going to need predictive algorithms to do the design with, to employ in the model. The algorithms can come from a number of places. There are homegrown ones you could use, there are packages you could purchase on the market, or there is R. We support many of those options. You need algorithms, and customers usually like a variety of algorithm sources and algorithm functions that they can use. Some shops are all R and very few are none of the above. Some are a little bit of both. Is R the cornerstone of our strategy? No, not by any stretch. Is it a critical must-have for us? Yes.
(Gill steps in with this addition): We’ve made the design of algorithms a seamless process regardless of where they’re coming from, whether from SAP or from R. So, from the perspective of an end user who is using those algorithms, you do the same thing.
Speaking of other data sources to tie into the future… Can we get some more details about those data sources?
The predictive analysis product can access HANA, Sybase IQ, other databases and flat files like Excel and CSV files, and it can also access Business Objects universes. As you know, the Business Objects universe is a cornerstone technology that’s been around for 20 years in the Business Objects client. The universes can themselves access all kinds of different data sources, so the product has a whole array of data sources available to it. We still believe the primary sources will likely be HANA, universes, some flat files and some other databases. Given our customer base, the ecology of our customers, you can see that it’s most likely going to be predictive deployments in conjunction with the BI environment. There are a lot of BusinessObjects customers wanting this capability and a lot of HANA customers who are excited about this product and how it can unleash the value of the big data that they’re processing through HANA.
Can you tell me any of the users who were involved in the ramp-up program?
At this point, we’re inviting customers in and don’t have any customers to publicly announce. That said, there are applications running right now off of this predictive technology, for example the predictive library that I mentioned, including that smart meter analytics one. We have customer testimonials on that. There are definitely SAP customers who are using our predictive technology today and are referenceable and vocal about it.
So Smart Meter analytics is one type of workload, do we have another 1-2 examples of workloads?
We do. There is one in supply chain and one in sales automation.
How do you see this offering assisting with needle-in-the-haystack problems?
For a long time, there has been the ability, technically, to identify hidden risks and uncover new opportunities that lie within the database. For too long, though, that’s been limited to a very small segment of the business population, and that’s the data analysts. We’re saying that finding those needles in the haystack is going to become much easier and more valuable now that we have a high-productivity tool for those statisticians, one that can very quickly screen and search through the ever-growing haystack.
What are SAP’s R&D priorities for big data analysis?
If you look at HANA and our BI investment and our information management platform, those are all 100% dedicated to big data scenarios. It’s about increasing volumes of data, going through HANA, going through our information management layer, increasing varieties of data, both structured and unstructured, coming from and including various social media and all other types of sources. Hence, we have had acquisitions recently of Netbase and around social media analytics, as well as the processing of that information, sharing it throughout the organization and deriving the value from that. Hence, information management, analytics, HANA, database technologies and applications are all revolving around big data. It’s what we do.