August 19, 2014

Cloud Era Rising for Analytics?

Big organizations have traditionally run analytics on-site, for a number of reasons, including data security concerns. But the rise of flexible cloud deployment models and the rapid pace of product development are helping cloud offerings to chip away at on-site deployments, especially for analyzing data that originates in the cloud.

The IT world has not been the same since Amazon Web Services first took to the Net way back in 2006. Suddenly, a business of any size had an easy onramp to the Web. Businesses no longer had to worry about buying servers and hiring staff to manage them. Instead of the size of your infrastructure, what mattered was the quality of your developers and how quickly you could innovate. The Web 2.0 era of REST protocols, JSON documents, and hyper-scalability was born.

Fast forward to 2014, and the software market has changed dramatically. Companies like streaming media giant Netflix have outsourced practically their entire IT infrastructure to IT megalith AWS, which manages millions of whitebox servers in huge data centers scattered across the country. Large portions of whole industries, such as online and mobile gaming, run entirely on AWS and other massive public clouds, such as Google Compute Engine and Microsoft Azure. Many software vendors have followed suit by letting businesses run their apps either on-prem or in the cloud, while others, like Salesforce and NetSuite, live entirely in the cloud.

But while all of this has been going on, two major application types have resisted the cloud urge: ERP and analytics. Concerns with data security have largely prevented big and mid-sized companies (the Global 2000) from moving their core ERP systems to the cloud, even though it would bring cost savings. The same has largely been true of the analytic systems that drive decision-making in the C-suite. The cost of data leakage was just too great.

But there are indications this wall is starting to crumble, at least for newer analytic systems, and especially for any analytic solutions crunching data originating in the cloud. It’s not surprising that AWS is at the center of this transformation. In the past week, two providers of advanced analytic solutions, SAS and RapidMiner, have unveiled new analytic solutions that run on AWS.

The new RapidMiner Cloud offering lets customers build complex models and then run the predictive analytic workload on the AWS platform. The new offering, which RapidMiner announced today at its user conference in Boston, allows customers to dial the server capacity up or down as needed, and pay only for what they use.

“In many instances you have two phases to an analytic project, one where you take the data and build the predictive model, and one where you apply the model to new and unseen data point to get the prediction,” explains Ingo Mierswa, CEO of RapidMiner. “The phase where you build the model is usually much more computational intensive or demanding than when you apply the model.”

With RapidMiner Cloud, users no longer have to buy a server to handle the big modeling workloads, only to see utilization dwindle when the models are scoring new data. With a few limitations, RapidMiner will dial up the underlying server—say moving a workload from a server with 32GB of RAM to one with 320GB of RAM—automatically.

“Amazon is maintaining the underlying infrastructure, but RapidMiner has built another layer on top of that to make sure you can make use of the elastic assignment of different computational resources,” Mierswa tells Datanami. “We are basically managing a huge farm of different servers and we distribute the different tasks to the different servers for you, so you don’t need to maintain it all.”

Meanwhile, analytics leader SAS is gearing up to make another cloud push by enabling its software to run on AWS. Unveiled last week, the new offerings span its data management, business intelligence, and analytics products, and work with AWS services including Amazon Redshift, an online data warehouse built on Actian’s ParAccel database, and Amazon Elastic MapReduce, a hosted Hadoop implementation.

While SAS already had some cloud offerings, the new deal with Amazon is in response to unmet customer demand for cloud-based analytics. “We are seeing rapid adoption of customers who are looking for cloud-based options,” says Scott Van Valkenburgh, senior director of alliances at SAS, in a story in Triangle Business Journal.

Two SAS customers adopting the cloud solutions are computer giant Lenovo and Dignity Health, a network of 39 hospitals. Dignity Health CIO Deanna Wise says the SAS cloud offerings will assist the company with connecting and sharing data across hospitals, health centers, and the provider network, while Anthony Volpe, Lenovo’s chief corporate analytics officer, says the software will reduce costs and speed the decision-making process.

The majority of new data generated today comes from consumer activity on the Internet. Source: IDC

The three main forces that are driving organizations to the cloud (faster access to new functionality, lower capital IT costs, and better use of existing resources) are also driving the adoption of analytic cloud solutions, says Dan Vesset, an analytics analyst with IDC. “These are all critical factors to maximizing the value of analytics,” he says. “Deploying analytics within the cloud gives a business the ability to use more than just existing in-house systems to manage and analyze their data.”

The barrier that data security represented to the adoption of cloud-based analytics is not as big a concern when most of the data being analyzed lives in the cloud anyway. “More and more companies and organizations store their data in different cloud-based data stores,” RapidMiner’s Mierswa says. “The data lives in the cloud.”

SAS and RapidMiner join a growing list of major analytic application and tool vendors that have set up shop in AWS, including Attunity, Birst, Jaspersoft (now owned by TIBCO), Looker, MicroStrategy, Revolution Analytics, ScaleOut Software, SAP’s Business Objects, and Syncsort. Google, Microsoft, and IBM offer a range of analytic applications in their own cloud environments, while startups like Qubole, Metascale, and Altiscale are having success running cloud-based Hadoop instances for their clients.

SAS also announced that it will be working with other vendors, notably Intel, to form a Cloud Innovation Council to help formalize best practices for running SAS in the cloud. Other vendors that will be participating include Hadoop vendors Cloudera and Hortonworks, as well as Capgemini, CoreCompete, Pinnacle Solutions, and Selerity.
