May 1, 2015

Tracking the Rapid Rise in Cloud Data

Ellie Fields

In the last year we’ve seen organizations put more and more data in the cloud, but where are they putting all these data?

The data cloud boom we’re seeing is the result of a radical change in economics (the cost of space in the cloud continues to drop drastically) and improvements in products that store data in the cloud. Additionally, the rise of “the Internet of Things” is generating massive amounts of data that need to be stored somewhere and, perhaps more important, accessed from many places. IT has responded by searching for more flexible solutions to adapt to these changing conditions; setting up boxes upon boxes to accommodate new data is not a nimble strategy.

I’ve watched the evolution of the data cloud market closely. I work at a company that is both a provider of cloud analytics and a customer of cloud solutions. And as a curious, data-driven person, I wanted to dig into the data-in-the-cloud trend and understand where all these data are going. Is cloud technology really changing where data lives? How fast is it changing?

So I turned to the data about usage of Tableau Online, our cloud product. Of course, this one set of data doesn’t necessarily represent the whole market, but I decided it would be an incomplete but interesting set of data to analyze as I began diving into why and how we’re seeing a shift in the market.

Tableau Online data showed us which data sources are used, in aggregate. I want to point out that we didn’t, and we don’t, look at any customer data directly. Customer data is inaccessible to all but site administrators who must help with customer issues. What we analyzed was the metadata, the data about the type of services used.

Cloud Database Adoption Is Higher Than Expected

The first interesting finding was that cloud data sources represent 8.7% of the total data sources used. Now, because the people who use this cloud product tend to be friendlier to the idea of the cloud than the general population, we assume that adoption here is higher than in the overall market for analytics.

Sources of data stored on Tableau’s cloud.

But considering that most cloud data solutions are only a few years old versus the decades-old alternatives, that 8.7% is still quite a remarkable number. And when you consider cloud as a percentage of all databases (excluding flat files) then it is about 17%.

Cloud as a percent of all data sources, excluding flat files
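
The two percentages above can be related with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes both figures describe the same population of data sources and that "about 17%" is a rounded value, so the result is only approximate. The function name is hypothetical.

```python
def implied_flat_file_share(cloud_share_of_all, cloud_share_of_databases):
    """Estimate the share of flat files implied by the two cloud percentages.

    If cloud sources are 8.7% of everything but 17% once flat files are
    excluded, then non-flat-file sources must make up 8.7 / 17 of the total.
    """
    database_share_of_all = cloud_share_of_all / cloud_share_of_databases
    return 1.0 - database_share_of_all


# Figures quoted in the article; the 17% is approximate.
flat_file_share = implied_flat_file_share(0.087, 0.17)
print(f"Implied flat-file share: {flat_file_share:.1%}")
```

Under those assumptions, flat files would account for a little under half of all data sources.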

Cloud Data Is Less Popular Than Files and Relational Databases, But Growing Fast

Next we wondered how that has changed over time. Our cloud product was launched in July of 2013, and we have data from the beta period going back to May 31, 2013. At first most of the data sources were file-based as people tried the system out. Gradually more robust data sources, such as relational databases, were used more. Over that time, cloud data sources climbed from an insignificant level, less than 2% of all data sources, to their current level of 8.7%, more than a 4x increase in the last 18 months.
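
As a quick check on the "more than 4x" figure, here is a small sketch of the arithmetic. It uses 2% as the starting share, which the text gives only as an upper bound, so the true multiple is at least this large. The monthly rate is an illustration that assumes smooth compounding over 18 months, which the data above do not claim; both function names are hypothetical.

```python
def growth_multiple(start_share, end_share):
    """Return how many times larger the ending share is than the starting share."""
    return end_share / start_share


def implied_monthly_growth(multiple, months):
    """Return the constant monthly growth rate that would produce `multiple`.

    Assumes smooth compounding, which is a simplification.
    """
    return multiple ** (1.0 / months) - 1.0


# Start: "less than 2%" (upper bound used here); end: 8.7%.
multiple = growth_multiple(0.02, 0.087)
monthly_rate = implied_monthly_growth(multiple, 18)

print(f"Growth multiple: at least {multiple:.2f}x")
print(f"Implied monthly growth if compounding were smooth: {monthly_rate:.1%}")
```

Because the starting share was below 2%, the multiple of roughly 4.35 is a lower bound, which is consistent with the "more than 4x" description.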

Growth of Specific Cloud Data Sources

The next logical question was, which cloud data sources are used the most? At first it was Google Analytics and Salesforce, but very quickly the cloud data warehouse solution Amazon Redshift took the lead. Currently, Amazon Redshift is the most-used cloud data source on Tableau Online and Google Analytics is the second-most used.

When I spoke to my colleagues, they added that the cloud data source number we have here is a floor. The actual number is significantly higher: according to an analysis done about a month ago, about 20% of the relational databases actually pointed to Amazon or Azure databases. So several of the data sources counted in the relational database number are actually relational databases in the cloud.
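
The adjustment my colleagues describe can be sketched as a simple formula: the reported cloud share plus the cloud-hosted portion of the relational share. The text above does not state the relational share of all data sources, so the values looped over below are purely hypothetical placeholders, not measured figures, and the function name is an assumption for illustration.

```python
def adjusted_cloud_share(cloud_share, relational_share, relational_in_cloud=0.20):
    """Add the cloud-hosted portion of relational databases to the cloud share.

    cloud_share: share of sources already counted as cloud (8.7% in the article).
    relational_share: share of sources counted as relational (NOT given in the text).
    relational_in_cloud: portion of relational sources hosted in the cloud (about 20%).
    """
    return cloud_share + relational_in_cloud * relational_share


# Hypothetical relational shares, for illustration only.
for hypothetical_relational_share in (0.20, 0.30, 0.40):
    adjusted = adjusted_cloud_share(0.087, hypothetical_relational_share)
    print(f"relational share {hypothetical_relational_share:.0%} "
          f"-> adjusted cloud share {adjusted:.1%}")
```

Whatever the true relational share is, the adjusted figure sits above 8.7%, which is why the reported number is best read as a floor.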

Why Do Companies Choose To Put Their Data In the Cloud?

This final question is the key to understanding the data cloud trend. For some companies, it’s the most scalable way to collect big data. I’ve spoken to one of our customers, Sling Media, which is collecting usage data on a new consumer device that it designs and sells. The usage data is huge, but the company can scale as needed to accommodate that data by putting it in Amazon Redshift. Then it simply leaves the data in the cloud to analyze it, rather than moving all that data around. Others are looking for some of the unique attributes of cloud data sources, for example, the metered pricing model or the ability to scale up and down as needed.

For a variety of reasons, there is tremendous growth in cloud data and analytics. We expect the growth to keep going. We also expect that new scenarios, enabled by new product capabilities and customers pushing limits, will drive additional growth. Two things seem certain: in a few years, the cloud will be bigger, and it will look different. We’ll keep watching to see how.

About the author: Ellie Fields is the Vice President of Product Marketing at Tableau Software, where she’s responsible for new product launch, Tableau’s community and Tableau Public. Her data geek credentials come from time served in technology and finance companies. She’s seen a lot of ugly data, beautiful data, and downright mean data. She’s a passionate believer that data used well can inform great discussions. She has an engineering degree from Rice University and an M.B.A. from The Stanford Graduate School of Business.
