Newly ‘Headquarterless’ Snowflake Makes a Flurry of Announcements
Snowflake is best known as a cloud data warehouse, but it’s delivering capabilities that go well beyond fast answers to SQL queries. These wider “data cloud” ambitions are on display this week, as the newly “headquarterless” company holds its first Snowflake Summit since its massive IPO last year.
According to Christian Kleinerman, Snowflake’s SVP of product, the biggest announcement to come out of Snowflake Summit this week revolves around Snowpark, the new development tool and runtime it unveiled last November at the Data Cloud Summit.
Snowpark gives customers the ability to develop and run Java-based programs against data they store in Snowflake. These programs could perform ETL/ELT, data transformation, or feature engineering tasks that are needed for data analytics, data science, and data engineering workflows.
“It’s an alternative to Spark or Dask or all those frameworks that exist to program to data in Java or Python,” Kleinerman tells Datanami. “It’s a programming model on top of the Snowflake entity.”
Snowpark will support Scala (a JVM-compatible language) first. All Snowpark customers on AWS will have it by next Monday, according to Kleinerman. Support for Java, Python, and related libraries and routines is expected later this year.
On a related note, Snowflake is bringing to Snowpark a new Java user-defined function (UDF) capability, which will enable users and partners to bring their custom Java code and run it within the Snowflake paradigm. This is still in private preview; a public preview is expected soon.
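The core idea behind Snowpark — dataframe-style code that is compiled into SQL and executed inside the warehouse, next to the data, rather than pulled out to a client — can be illustrated with a toy query builder. This is a self-contained sketch of the pattern, not the actual Snowpark API; every name below is hypothetical.

```python
# Toy illustration of the Snowpark idea: dataframe-style calls are recorded
# lazily, then the whole chain collapses ("pushes down") into one SQL
# statement that the warehouse runs next to the data.
# All names here are hypothetical -- this is NOT the real Snowpark API.

class ToyDataFrame:
    def __init__(self, table, filters=None, columns=None):
        self.table = table
        self.filters = filters or []
        self.columns = columns or ["*"]

    def filter(self, condition):
        # Lazily record the predicate; nothing is executed yet.
        return ToyDataFrame(self.table, self.filters + [condition], self.columns)

    def select(self, *cols):
        # Lazily record the projection.
        return ToyDataFrame(self.table, self.filters, list(cols))

    def to_sql(self):
        # "Pushdown": emit a single SQL statement for the whole chain.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql

df = ToyDataFrame("orders").filter("amount > 100").select("customer_id", "amount")
print(df.to_sql())
# SELECT customer_id, amount FROM orders WHERE amount > 100
```

A Java UDF follows the same principle in the other direction: instead of the dataframe code becoming SQL, custom code becomes a function that SQL queries can call in place.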
About 50 partners have already adopted Snowpark or have committed to adopting it, which is proof that Snowpark is getting traction, Kleinerman says. “[Snowpark] has been rolling out to different customers and partners for the last three months or so, and right now it’s ramping up,” he says. “We have customers and partners talking benefits, performance, throughput, and cost.”
Snowflake is also announcing support for unstructured data, such as images, video, and text. According to Kleinerman, this will help complete the data analytics picture for customers with diverse data ambitions.
“Snowflake was born with structured data and semi-structured data as first-class capabilities,” the product manager says. “I hear customers say, I like the no-silo story. But I want all my data there, not just structured and semi-structured. So now we’re bringing full support for unstructured data in the form of file support.”
Customers can now store any file in Snowflake, and the company will provide the same guarantees around data governance, management, and replication atop that data, Kleinerman says. What’s more, with Snowpark providing support for Java-based programs (and soon Python-based programs built on libraries like PyTorch and TensorFlow), customers can start to do analytics atop that data.
“For example, customers could perform sentiment analysis on text data or voice data,” Kleinerman says. “I have some speech. I can use some library to convert it to text. Then I can use some other library to extract sentiment from it.”
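The two-stage pipeline Kleinerman describes — speech-to-text, then sentiment extraction — could be sketched as below. The speech-to-text step is stubbed out, and the lexicon-based scorer is a deliberately simple stand-in for a real NLP library; both are illustrative assumptions, not Snowflake features.

```python
# Minimal stand-in for the pipeline described above: a speech-to-text step
# (stubbed here) followed by sentiment extraction. A real workload would
# call proper libraries for both stages; this lexicon scorer is a toy.

POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "hate", "poor", "angry"}

def speech_to_text(audio_bytes):
    # Placeholder: a real pipeline would invoke a speech-recognition model.
    return "I love the new dashboard but the latency is bad"

def sentiment_score(text):
    # Score = (#positive words - #negative words) / total words.
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(len(words), 1)

text = speech_to_text(b"...")
print(sentiment_score(text))  # one positive and one negative word -> 0.0
```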
Snowflake is a central player in the ongoing battle that pits cloud data warehouses and cloud data lakes against each other. Proponents of cloud data warehouses, like Snowflake, proclaim that customers are better off using a more closely managed (and proprietary) data warehouse to analyze data, whereas data lake supporters, such as Dremio, argue that customers are better off using less closely managed (and open) data lakes. Features like support for unstructured data and the ability to bring Java- and Python-based functions to bear on that data indicate that Snowflake is responding to these customer concerns, at least in part.
Snowflake is also announcing that customers are benefiting from an across-the-board increase in compression rates, in some instances by up to 30%. Kleinerman says this is exactly the type of improvement that users can expect because Snowflake closely manages its data format.
The 30% increase, which comes atop compression rates that are already around 10x for some data types, actually led Snowflake’s CFO to announce on the analyst call last quarter that its annual revenue will decline by $13 million, Kleinerman says. “It’s direct money that we are not recognizing because the economics are better for customers,” he says. “Each time we make the system faster, we hurt our topline a little bit. But we’re in this for the long run.”
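The compounding here is worth making explicit: a further 30% reduction in stored bytes on top of an existing ~10x ratio lifts the effective compression ratio to roughly 14x. (The 10x baseline and 30% figure come from the article; the calculation below simply works through the arithmetic.)

```python
# Compounding the announced improvement: data already compressed ~10x
# shrinks by a further (up to) 30%, raising the effective ratio to ~14x.
baseline_ratio = 10.0   # prior compression for some data types (per article)
improvement = 0.30      # up-to-30% further reduction in stored bytes

stored_fraction = (1 / baseline_ratio) * (1 - improvement)  # bytes kept on disk
effective_ratio = 1 / stored_fraction
print(round(effective_ratio, 1))  # 14.3
```

Since customers pay for the bytes they store, that smaller footprint flows directly into the revenue hit Kleinerman describes.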
Snowflake also is making news on the data marketplace front. Buyers and sellers using the company’s Data Marketplace, which it launched in 2019, can now complete their transactions within the marketplace instead of completing their deals offline. Snowflake is also implementing a usage-based pricing model for the marketplace, which calculates costs based on the compute time associated with a given piece of data.
The marketplace has doubled in size in the past year, and now has about 500 data listings from 160 providers, the company says. “It’s growing quite well,” Kleinerman says. “We’re trying to lower the bar on how easy it is for organizations to monetize their data.”
Selling or sharing data in the marketplace can be done more securely, thanks to steps that Snowflake has taken to prevent sensitive data from leaking. This includes a new sensitive data classifier that can automatically spot potentially problematic combinations of data, Kleinerman says.
Researchers have shown that, even in data that has been aggregated and isn’t explicitly tied to individuals’ identities, people can be re-identified by linking together disparate pieces of data. “If you take anyone’s date of birth, gender, and ZIP Code, you can pretty much uniquely identify them,” Kleinerman says. “Our classifier not only will tell you this is sensitive, but it also has a concept of quasi-identifiers, so it will help customers identify combinations of data that might be identifying.”
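The risk Kleinerman describes is easy to demonstrate: count how many records share each (date of birth, gender, ZIP) combination, and any combination held by exactly one record uniquely identifies that person. The check below is a toy illustration of the quasi-identifier idea, not Snowflake’s classifier; the sample records are invented.

```python
from collections import Counter

# Toy demonstration of quasi-identifier risk: how many records are uniquely
# identified by the (date of birth, gender, ZIP) combination alone?
# Invented sample data; this is an illustration, not Snowflake's classifier.

records = [
    {"dob": "1984-03-01", "gender": "F", "zip": "94105"},
    {"dob": "1984-03-01", "gender": "F", "zip": "94105"},  # shares a group
    {"dob": "1990-07-22", "gender": "M", "zip": "59715"},  # unique combination
    {"dob": "1975-11-30", "gender": "F", "zip": "10001"},  # unique combination
]

def uniquely_identified(rows, quasi_ids=("dob", "gender", "zip")):
    # Count occurrences of each quasi-identifier combination, then return
    # the rows whose combination appears exactly once -- those individuals
    # can be re-identified by linking on these fields.
    counts = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return [r for r in rows if counts[tuple(r[q] for q in quasi_ids)] == 1]

print(len(uniquely_identified(records)))  # 2 of the 4 records stand alone
```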
The company has also launched something called anonymized views, which provides an anonymized version of a data set that reduces the risk of re-identification while still providing analytic value. The technology uses the k-anonymity and differential privacy algorithms, Kleinerman says. “We think this is going to accelerate the confidence that people have to share data with one another,” he says.
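K-anonymity, one of the two techniques named above, requires that every combination of quasi-identifier values appear in at least k records, so no single row stands out. Here is a minimal sketch of that property, assuming values have already been generalized (age bands instead of birth dates, ZIP prefixes instead of full codes) — the data and field names are invented, and this is not Snowflake’s implementation.

```python
from collections import Counter

# Minimal k-anonymity check: a table satisfies k-anonymity over its
# quasi-identifiers if every combination of their values occurs at least
# k times. Values below are already generalized (age bands, ZIP prefixes),
# which is how an anonymized view trades precision for privacy.

def is_k_anonymous(rows, quasi_ids, k):
    counts = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return all(c >= k for c in counts.values())

generalized = [
    {"age_band": "30-34", "zip3": "941", "diagnosis": "flu"},
    {"age_band": "30-34", "zip3": "941", "diagnosis": "cold"},
    {"age_band": "30-34", "zip3": "941", "diagnosis": "flu"},
    {"age_band": "45-49", "zip3": "100", "diagnosis": "asthma"},
    {"age_band": "45-49", "zip3": "100", "diagnosis": "flu"},
]

print(is_k_anonymous(generalized, ("age_band", "zip3"), k=2))  # True
print(is_k_anonymous(generalized, ("age_band", "zip3"), k=3))  # False
```

Differential privacy takes a different approach — adding calibrated noise to query results — but serves the same goal of keeping individual rows unrecoverable.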
Last but not least, Snowflake today is announcing its “Powered by Snowflake” program to help build and grow its data cloud. Snowflake has worked with partners for years, but the new program will more clearly lay out the benefits partners receive in areas such as application development, go-to-market strategy, and tech support.
Snowflake CEO Frank Slootman will be speaking at 9 a.m. PT today at the Snowflake Summit. The event will be virtual, just like the company, which has abandoned its Silicon Valley headquarters and announced it has become fully distributed, or “headquarterless,” save for its “principal executive office” in Bozeman, Montana, which is where Slootman and CFO Mike Scarpelli share a ZIP Code. For more info and the conference agenda, see www.snowflake.com/summit/agenda/.
Editor’s note: This story has been corrected. Snowflake’s annual revenue will take a $13 million hit as a result of the up-to 30% data compression that Snowflake just implemented, not $13 million per quarter as first reported. Datanami regrets the error. It was also updated to reflect the timing of support for Scala, Java, and Python in Snowpark.