Snowflake Rides Cloud Wave to Great Heights
Snowflake Computing started quietly enough back in 2012, at the height of Hadoop hype, when the idea of a relational analytics database running in the cloud was interesting, if not exactly groundbreaking. Fast forward seven years, and this little startup from San Mateo, California has become one of the dominant players in the burgeoning cloud data warehouse market.
A lot has changed over the past seven years, both for Snowflake Computing and the market for analytics software in general. As Snowflake gets ready to host its inaugural Snowflake Summit next week in San Francisco, it’s worth taking some time to reflect on its journey and where Snowflake may go next.
Big Cloud Growth
Back in 2012, many CIOs were just starting to hear how Silicon Valley Web giants were innovating on this new computing architecture called Hadoop. The idea of linking together hundreds of Lintel nodes to create a data lake that could house and process all of an enterprise’s data – including data stored in high-performance SQL databases from Teradata, IBM Netezza, and Oracle – held a lot of promise.
But Benoit Dageville, Thierry Cruanes, and Marcin Zukowski had a different idea. Instead of building a SQL layer atop Hadoop, as Cloudera, Hortonworks, and company did with projects like Hive and Impala, the trio set out to redesign the SQL analytics database from the ground up to run natively on the cloud.
It’s not obvious now, but the cloud wasn’t the major data destination back then that it is today. While Amazon Web Services and other clouds were growing in 2012, nobody was really thinking that enterprises would move thousands of petabytes to the cloud. The thinking back then was that the cloud simply couldn’t deliver the levels of security and stability that enterprises demanded, and so the majority of data would remain on prem.
Boy, were they wrong! The three major cloud platforms today – AWS, Microsoft Azure, and Google Cloud – are growing at exceptional rates, as concerns about the security and stability of the cloud have melted away. Companies may occasionally be surprised at the high cost of cloud computing, but for many, the flexibility of having compute-on-demand is worth it, and the cloud has exploded.
Reimagining the Data Warehouse
Dageville, Cruanes, and Zukowski couldn’t have foreseen the tectonic shift that the IT industry was about to undergo, and the change from an “on-premise first” to a “cloud first” mentality. But the Snowflake founders definitely can be credited with foreseeing the need for new cloud-native architectures when it came to running large-scale SQL analytics.
The trio all have technical backgrounds, and utilized their low-level knowledge of high performance relational databases to help them create Snowflake’s new architecture.
- Zukowski was the CEO and co-founder of Vectorwise, and led the development of the SQL analytics database for the Amsterdam, Netherlands-based company from its inception until 2010. He also worked at Actian, which continues to sell the database, now called Vector.
- Cruanes is an expert in query optimization and parallel execution, and spent 13 years at Oracle, where he worked on its eponymous database. He also spent several years working on data mining with IBM. He holds a PhD and more than 40 patents.
- Dageville spent 16 years at Oracle, and was an architect in the database server manageability group, which focused on SQL optimization and execution. Before that, Dageville spent time working on databases for Bull’s NUMA-based clusters.
Whereas AWS was having some degree of success running a traditional analytical data warehouse, ParAccel (i.e. Redshift), in the cloud, the three Snowflake co-founders theorized that even greater levels of efficiency and cost-saving could be had by building a new cloud database architecture from scratch.
In a 2015 interview, Bob Muglia, who was at that time Snowflake’s CEO, described the difference between Snowflake’s approach and that of its closest competitor, AWS Redshift.
“What Amazon did is they acquired rights to ParAccel and then they hosted it in the AWS cloud environment,” Muglia says. “They’ve done a very good job of doing that. It’s super easy in Amazon to instantiate a new Redshift cluster. But that’s kind of where it ends. They help you back it up. There’s a few things they do. But all the administrative tasks you have to do. You still have to vacuum it, you still have to manage it, you still have to determine your distribution keys.
“All of the things that you had to do with ParAccel or really with any shared-nothing database, you have to do with Redshift,” the former Microsoft Server Division boss says. “That’s one of the big differentiators that Snowflake has is that all those tasks don’t exist. We don’t use a traditional architecture like shared disk or shared nothing. In fact we have a new architecture that has never existed before that we call multi-cluster shared data that essentially makes this administrative work go away, and provides us with an incredible degree of elasticity and almost limitless scalability.”
Snowflake’s Winning Workload
Snowflake’s message of cloud simplicity has paid off, particularly in comparison to Hadoop’s high level of complexity. Smaller companies that may have thought about or started building their own Hadoop clusters to analyze data are increasingly turning to cloud data warehouse providers, and Snowflake – which runs on AWS and Microsoft Azure – appears to be capturing its share of that movement.
From mid-2015, when Snowflake GA’d its service, until late 2018, it attracted more than 1,000 customers, including companies like Netflix, Office Depot, Yamaha, and Blackboard, the educational company profiled by Datanami in 2017. Snowflake had the top score in GigaOm’s 2017 Cloud Analytics Database report, and was ranked 49th in LinkedIn’s analysis of the companies that people want to work for the most.
Snowflake is a private company, so it doesn’t report revenue. We do know that the 900-person outfit has brought in around $1 billion in outside financing, and that Snowflake sported a valuation of about $3.5 billion during its last big round in October.
We may soon get more information about Snowflake, since Muglia hinted that an IPO could be in Snowflake’s future. In any event, that decision is now in the hands of Frank Slootman, the former ServiceNow executive who replaced Muglia as Snowflake’s CEO earlier this month.
Slootman will get his first real public exposure next week at the company’s first ever user conference, which is expected to attract about 1,500 attendees. Tickets for Snowflake Summit sold out, which is a sign of the level of interest in this company.
The future would appear to be bright for Snowflake as it tackles a market that IDC says will be worth $18 billion next year. The cloud has emerged as a legitimate platform faster than analysts thought, and when coupled with Hadoop’s struggles and the drawbacks of running traditional analytics databases in the cloud, it points to a potentially rosy future for Snowflake.