Snowflake: Not What You May Think It Is
Snowflake CEO Frank Slootman is the toast of the big data community and, following the $3.4 billion IPO, a favorite on Wall Street too. But this whole Snowflake exercise could have turned out dramatically differently, the CEO says, if the founders had pursued their original premise for what the company should be.
“I was really fascinated with the fact the original idea for Snowflake is not what Snowflake became,” Slootman said during a media roundtable Monday with author Steve Hamm, with whom he is writing a book that will be released shortly. “Not many people know that. But it’s sort of typical of how things happen.”
Slootman’s point is that, while Snowflake co-founders Benoit Dageville, Thierry Cruanes, and Marcin Zukowski originally set out to build a cloud data warehouse (essentially a column-oriented, massively parallel processing, or MPP, relational database running in the cloud), the company has actually become more than that.
That’s not to say that Snowflake is not a cloud data warehouse company. It is that. But its mantra, its raison d’être if you will, has morphed over time. It has become not only a place where people can run big SQL queries on big data sets in the modern cloud style (i.e., rapidly scaling compute up and down thanks to the separation of compute and storage), but also a place to buy and sell data. Snowflake has become a data marketplace in its own right, a platform for data itself.
Or as Snowflake is wont to call it, it has become a “data cloud.”
“Here’s the point,” Slootman says. “A data cloud is not just about workload execution. It’s actually the combination of workload execution for operations and creating a global data universe where users can gain and provide unfettered access to data. So it’s the combination of execution and access to data [that] is really what constitutes the data cloud.”
The data processing component may be what many people think about when they hear the name “Snowflake.” And that’s not a bad thing. But Slootman understands that it’s a crowded market, and users have all kinds of options for cloud data warehouses these days.
All of the big cloud providers (on which Snowflake depends for computing and storage resources, by the way) offer cloud data warehouses based on MPP databases, except for Google Cloud’s BigQuery, which is a little different. Teradata, which for years was the gold standard for data warehouses and still has some technological advantages, is pivoting to calling itself a “cloud data analytics company.” Databricks, which ostensibly has been targeting data engineering and data science workloads with its Spark-based offering, just launched a SQL Analytics offering designed to gobble up more workloads.
Rapid growth and a very successful IPO have put a target on Snowflake’s back, and Slootman knows it, which is why he’s (wisely) moving the target. Data analytics may be the lion’s share of the current business plan, but Slootman sees the potential for Snowflake to gain a foothold in the emerging market for massive data platforms, which could end up being a much larger market.
“Obviously in our business, we end up talking 90% of the time about workload characteristics, and not nearly enough about how to blow up the silos and bunkers that have emerged over the years,” Slootman says. “We’re sort of at risk of perpetuating the past in the way we’re thinking about data platforms, which is why we came up with the notion of the data cloud, which very much puts data access center stage.
“We can now erase data silos and bunkers because of public cloud scale,” he continues. “A lot of customers right now are building data silos because they approach data operations as one workload. We see it every day, which is why we have to have this conversation every day as well.”
Snowflake’s vision for a data cloud came into clearer focus earlier this week, when it expanded its cloud. Among the new capabilities Snowflake released are Snowpark, a development tool that lets data engineers build ETL/ELT pipelines in Python, Scala, and Java (in addition to SQL), and Data Services, which allows customers to expose analytic routines developed in Snowflake to their outside customers or partners.
Rather than leaving data trapped in silos and bunkers, Snowflake is hoping that customers park more of their data in its cloud. Indeed, customers should be thinking about how they can leverage their own “data cloud for the enterprise,” Slootman says.
“The future needs are very dissimilar from the historical, from the legacy needs,” he says. “Data scientists are going to have demands on data and data access and people have a hard time envisioning that at the moment. We want to prepare them for it and also create the optionality for it, so whatever comes up, they’ll be okay.”
At the current rate of data growth, today’s data fragmentation will become an even more unmanageable problem in the future. With thousands of SaaS applications and data repositories, not to mention all the on-prem sources, data risks becoming “fragmented in a million places,” Slootman says.
A unifying data cloud, or even a “global data universe,” would be a force for data centralization, eliminating many of today’s data integration challenges and setting in motion a massive data network effect. Snowflake, of course, is positioning itself in the middle of it all, ready to crystalize data opportunities and deliver data and analytics services as needed. It’s a bold bet, to be sure, but according to Slootman, Snowflake is fully committed to making it a reality.
“It’s not a half-hearted strategy,” he says. “You gotta go all in to create a data cloud and really unlock the potential that is there. It requires fortitude. You’re not going to just end up there by luck or happenstance.”
Incidentally, the book that Slootman and Hamm are writing is called “Rise of the Data Cloud.” It should be available on Amazon in a few weeks.