Why Analytics Are Now Assumed to Run in the Cloud
Not long ago, companies considered big data analytics such a strategic advantage that they wouldn’t dream of letting it outside their firewall. That hesitation has almost completely melted away, to the point where spending on on-premises analytics is actually declining and companies expect analytics to run in the cloud.
The seminal event in cloud computing occurred 10.5 years ago when e-commerce giant Amazon decided to rent out its excess x86 processing and storage capacity. Thus Amazon Web Services (AWS) was born, and the IT world was never the same.
At first, AWS sucked up the easy workloads, like email serving and hosting websites. As enterprise software evolved thanks to the growing maturity of Web 2.0 and AJAX technologies, consumer-focused products like CRM and collaboration went up into the cloud too. However, core back-office applications, like ERP and business intelligence, largely remained safely ensconced in on-premises systems.
‘Great Cloud Migration’
That began to change in 2014, according to IDC, which noted an uptick that year in cloud deployments of BI and analytic applications, as well as the supporting data management and integration apparatus.
“2015 saw an influx of cloud BDA solutions from all of the large IT vendors,” IDC analyst Dan Vesset and company wrote in a recent “FutureScape” on big data analytics (hosted courtesy of Cloudera, which has many cloud customers).
“The big story of 2015 was the beginning of ‘the great migration’ to the cloud,” Vesset and company write in another IDC report published earlier this year (this one courtesy of SAS, also a cloud practitioner). “In 2015, the on-premises portion of the overall [worldwide business analytics software market] contracted by 1.4%, while the public cloud services revenue grew 26.5%.”
Those numbers are truly astounding, and show you how far cloud analytics has come. However, it still has a lot of growth ahead of it before it can match the on-prem footprint, when you consider that the public cloud portion of the worldwide analytics market now represents 17% of the market by spending.
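To get a rough feel for how much growth that is, the IDC figures above can be plugged into a back-of-the-envelope projection. The sketch below is only illustrative and rests on assumptions that are not in the IDC reports: that the remaining 83% of spending is all on-premises, and that the 2015 rates (26.5% growth for public cloud, a 1.4% contraction for on-premises) hold constant every year. The function name `years_until_parity` is hypothetical.

```python
def years_until_parity(cloud_share, onprem_share, cloud_growth, onprem_growth,
                       max_years=100):
    """Count whole years until cloud spending meets or exceeds on-prem spending.

    Shares are in any common unit (here, percent of the market today).
    Growth rates are annual fractions, e.g. 0.265 for +26.5%, -0.014 for -1.4%.
    Returns None if parity is not reached within max_years.
    """
    cloud, onprem = cloud_share, onprem_share
    for year in range(1, max_years + 1):
        # Assumption: both rates stay constant, which real markets rarely do.
        cloud *= 1 + cloud_growth
        onprem *= 1 + onprem_growth
        if cloud >= onprem:
            return year
    return None


if __name__ == "__main__":
    # 17% cloud share from the article; 83% on-prem is an assumed remainder.
    print(years_until_parity(17.0, 83.0, 0.265, -0.014))
```

Under those simplified assumptions, cloud spending would catch up with on-premises spending in about seven years. Real growth rates will shift from year to year, so this is a way to read the numbers, not a forecast.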
“Despite the rapid growth in cloud deployments, the vast majority of solutions remain on-premises,” IDC concludes. However, that is changing, and the Massachusetts analyst firm predicts cloud-based BDA solutions will grow 3x to 4.5x faster than on-premises deployments.
One of the analytic software providers seeing more demand for cloud-based solutions is Hadoop distributor Hortonworks (NASDAQ: HDP). For years the company has had a tight relationship with Microsoft (NASDAQ: MSFT), which uses the Hortonworks Data Platform (HDP) as the basis for its HDInsight offering that runs on its Azure cloud.
Last month, Hortonworks unveiled the Hortonworks Data Cloud for AWS service, which offers a pre-configured HDP cluster through the AWS Marketplace. According to Shaun Connolly, chief strategy officer for Hortonworks, the company has an entire team dedicated to ushering customers up the “cloud onramp.”
“The path to getting the customer up and running–the friction is removed, so the productivity is faster,” Connolly says. “There’s a much more direct exchange of value, if you will.”
Companies of all sizes are looking to run their analytics in the cloud, not just lean-and-mean startups that can’t justify spending millions to buy a large cluster to run Hadoop. “Right now we see north of 25% of our large customers” running in the cloud, Connolly says. “They have both an on-prem and a cloud footprint.”
Even traditionally risk-averse giants in financial services, healthcare, and telecommunications are moving analytic workloads to the cloud, or starting new workloads there, according to Brett Procek, the cloud architect at Infogix, the Illinois-based provider of a full-stack analytic solution running on AWS that uses Hadoop, Redshift, machine learning, and visualization tools.
“I’m seeing the hesitation [to run in the cloud] dwindling every year that goes by,” Procek says. “We’re engaged with several large financial organizations that are considering leveraging our cloud offering, or are interested in hearing about our analytics in the cloud platform.”
Data Gravity Shifts to Cloud
One of the truisms of “big data” is that when data sets get too big to move, you don’t move them; you just analyze them in place. This thinking helped propel Hadoop to become a “one-stop shop” for storing and processing petabyte-scale data that was too big to sit in relational and OLAP databases. And it helped drive analytic workloads on AWS, particularly for data that’s born on the Web (i.e., in an AWS data center).
However, that thinking is starting to dissolve. Increasingly, companies are eager to upload data born on-prem into the cloud, just to take advantage of the huge diversity and wealth of analytics available there.
More companies are able to take advantage of analytics thanks to the cloud, according to Itamar Ankorion, chief marketing officer of ETL and data replication provider Attunity (NASDAQ: ATTU), which last week launched its Compose ETL solution in the AWS cloud.
“Some customers never thought of doing a data warehouse before because they thought it would be cost prohibitive,” he tells Datanami. “But now they have an option to do that. We’re seeing lots of adoption in the cloud for data warehousing for customers of different sizes. The cloud is a good solution for those as well. It’s something we’re seeing.”
Running on AWS, Compose helps do the grunt work of preparing the data for analyses in Redshift, which is an on-demand version of the ParAccel columnar MPP database. In the past year, we’ve seen other similar MPP databases move to the cloud, including the one from data warehousing giant Teradata (NYSE: TDC), which offered its eponymous database on AWS for the first time in 2016.
Plenty of data is already headed from the ground up to the cloud. With the advent of 100TB AWS “Snowball” devices and the even bigger Snowmobile, the 45-foot, 100-petabyte big rig that AWS launched last week, plenty more will follow.
Security is another concern that has traditionally prevented enterprises from using the cloud. Those concerns are also melting away, or at least being addressed by the vendors.
That does not mean that big banks, phone companies, and healthcare companies aren’t interested in security. “They have a gauntlet of upfront security questions that really drill into our security practices,” Infogix’s Procek says. “We are hit up very hard up front with those questions. But we’ve gone beyond the early fears of security. Those fears are quickly being allayed.”
Hortonworks’ Connolly also sees enterprises raising security questions of their prospective cloud providers, and coming away satisfied with the answers. “The security and data governance are important to them,” he says. “Because they’re large global multinational companies, provenance and where data can reside is important. The public cloud vendors and their ability to have data centers in different countries is important.”
The cloud has transformed from a place to test out new workloads to a place where enterprises feel comfortable running them. Companies are no longer analyzing only “low value” Web data there, but are actively uploading critical business data born in on-prem ERP systems.
As the tools, applications, and architectures powering big data analytics become more critical and more complex, the cloud will increasingly offer a salvation of simplicity. Companies that just want to get their work done and not employ legions of specialists in running Hadoop or sharding data across hundreds of nodes, or ETL and data warehousing specialists who take months to get work done, will find the cloud increasingly to their liking.
“It’s almost like there’s a drumbeat of more and more companies willing to get those cost savings and willing to alleviate the burden of enterprise managing and maintaining the solution,” Procek says. “The days of companies having the burden of having to deploy complex distributed systems, and having to procure the hardware and manage and maintain them in-house, are quickly dwindling.”