Snowflake Gives Everybody a Little Something at Summit
Whether you’re a data engineer building data pipelines, a data scientist creating AI models, or a CFO trying to minimize cloud spending, Snowflake gave you something today at its annual user conference in Las Vegas, Nevada.
Snowflake kicked off the four-day Snowflake Summit 2023 today with announcements in three main areas. For starters, there are enhancements to the Snowflake platform itself, including new Document AI functionality, Snowflake versions of Iceberg tables, and a new performance index. The cloud analytics company also launched a public preview of the Snowflake Native App Framework, which will give partners the ability to develop and sell native apps on Snowflake. Lastly, Snowpark Container Services widens the types of workloads customers can bring to Snowflake.
Let’s take the announcements one at a time:
Snowflake Data Platform Enhancements
The headliner in terms of core enhancements to the Snowflake platform is the new Document AI functionality, which is currently in private preview.
Document AI represents Snowflake’s entry into the burgeoning large language model (LLM) and generative AI space. The new offering, which is based on AI technology that Snowflake obtained with its September 2022 acquisition of Applica, enables customers to use AI technologies to interrogate documents loaded into Snowflake.
“This allows you to take unstructured document and unstructured files and convert them into structured data that can be moved into traditional analytics or AI or even other ML processes,” Christian Kleinerman, SVP of products for Snowflake, said during a press conference last week.
Snowflake is starting with documents, but the company plans to eventually add support for doing AI on other types of unstructured data, including images. “We think that we are at the forefront of productizing this type of solution.”
Last year, Snowflake unveiled its support for Apache Iceberg, an open table format that gives customers the confidence of knowing their data will remain consistent as a variety of users and engines touch it. While Iceberg may not be as performant as Snowflake’s own proprietary data format in some cases, the company acknowledged that openness is sometimes more important to its customers.
“The goal here is we want to give choice to our customers and for those that prefer open platforms and open table formats, we believe will have either one of the most or the most performant engine on open clouds out there,” Kleinerman said.
Snowflake is building on that existing Iceberg support with a new offering dubbed Snowflake Iceberg Tables that will extend Snowflake’s performance and governance capabilities to data stored in open formats.
According to Kleinerman, it’s a “unified table type that will let customers either have Snowflake manage the data and be the entity that writes the data, or just be an entity that reads and consumes data produced by another Iceberg-compliant data engine.”
The company is also unveiling the public preview of a new budgeting tool that alerts users when their consumption of Snowflake resources begins to track over budget. “We’ve heard the feedback loud and clear on, if you don’t manage Snowflake spend, it can get higher than expected or higher than desired,” Kleinerman said.
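The idea of consumption "tracking over budget" can be illustrated with a minimal sketch. This is not Snowflake's budgeting tool or its logic, which the announcement does not detail; the function names and the simple linear projection below are assumptions made purely for illustration.

```python
import calendar
from datetime import date


def projected_month_spend(credits_used: float, today: date) -> float:
    """Linearly extrapolate month-to-date credit usage to the end of the month.

    Hypothetical helper: assumes spending continues at the average daily
    rate observed so far this month.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return credits_used / today.day * days_in_month


def is_tracking_over_budget(credits_used: float, monthly_budget: float, today: date) -> bool:
    """Return True when the projected month-end spend exceeds the budget."""
    return projected_month_spend(credits_used, today) > monthly_budget


if __name__ == "__main__":
    # Halfway through June with 600 credits used against a 1,000-credit budget.
    print(is_tracking_over_budget(600.0, 1000.0, date(2023, 6, 15)))
```

In this sketch, an account that has used 600 credits by June 15 projects to 1,200 credits for the month, so a 1,000-credit budget would trigger an alert well before the budget is actually exhausted.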
Along the same lines, the company will be releasing a Snowflake Performance Index, which shows the user how much Snowflake has bolstered its performance over a period of time. According to Kleinerman, the SPI shows that the company has delivered a 15% improvement over the past 12 months, “which translates to 15% faster results and 15% better economics for our customers,” he said.
Finally, Snowflake is rolling out a series of “ML Power Functions” designed to give customers ready access to machine learning capabilities, even if they don’t possess data science expertise. Kleinerman said the company will deliver SQL-based ML Power Functions for things such as forecasting and anomaly detection.
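For readers unfamiliar with anomaly detection, the basic idea can be sketched in a few lines of Python. This is a generic z-score check, not Snowflake's ML Power Functions, whose SQL syntax and underlying models are not described here; the function name and threshold are illustrative assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the indexes of values more than `threshold` sample standard
    deviations away from the mean (a simple z-score test).

    Hypothetical helper for illustration only.
    """
    if len(values) < 2:
        return []  # stdev needs at least two data points
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    daily_orders = [10, 11, 9, 10, 12, 10, 11, 50]
    print(find_anomalies(daily_orders))  # the spike at index 7 is flagged
```

A packaged function of the kind Kleinerman describes would presumably hide this sort of logic, and far more sophisticated models, behind a single SQL call.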
Snowflake Native App Framework
Snowflake is rolling out a public preview of the Snowflake Native App Framework, which will not only speed development of applications designed to work with the Snowflake Data Cloud, but also provide a path for software developers to monetize their apps.
“This is really beneficial because it accelerates the ability of application developers to get into those environments,” Snowflake product manager Christopher Child said during the press conference last week. “They don’t have to go through lengthy security review process. They don’t have to go through lengthy procurement processes. They’re able to tap into the security governance… that Snowflake already has and provides.”
There are already 25 “native apps” from the Snowflake Marketplace that are certified to be sold to Snowflake users, including apps from vendors like Bond Brand Loyalty, Capital One Software, DTCC, Goldman Sachs, LiveRamp, Matillion, and My Data Outlet.
Just about any bit of code developed in Snowflake can be turned into a native app, Child said. That includes Python and Java applications developed in Snowpark, SQL code, Streamlit applications, and any object that lives in Snowflake.
“You can package these up and then actually distribute them on the Marketplace and instantiate them in the customer accounts,” he said. “Once they’re in the customer account, they run in a sandbox, which both protects the IP of the application from the person using it, but also guarantees that it can’t exfiltrate data or pull anything out of the customer’s account unless it’s explicitly given the ability to. So this means it’s safe to run these applications, even on your most sensitive data, without even necessarily give the developer of the application a copy of that data.”
The native app framework is in public preview on AWS. Snowflake is also working to make monetization a breeze for app developers by enabling Snowflake users to spend their pre-purchased Snowflake credits on these native apps.
“So this allows both data providers and application providers to tap into that and not have to go through their own procurement process and not have to find a separate budget,” Child said. “So we think this is really going to enable a lot of these developers and data providers to get access to customers in a much faster way, all while staying within the Snowflake ecosystem.”
Snowpark Container Services
Snowflake gave customers the capability to run non-SQL code (mostly Python and Java applications) within its Snowpark offering, which it launched in 2021. This year, the company is building Snowpark out with new container services.
The idea behind Snowpark Container Services is to enable Snowflake customers to tap into a large catalog of third-party software and apps that they can run in their Snowflake environment, the company said. According to Snowflake, customers could bring in large language models (LLMs), data science notebooks, and MLOps tools, among others.
“We’re continuing to make investment into the libraries and the runtime to help you to really deploy and process non-SQL code on the Snowflake platform for data pipeline and machine learning use cases and native applications and more,” Snowflake Product Manager Torsten Grabs said during the press conference.
The company also unveiled a public preview of new Snowpark ML APIs that it says will provide for more efficient model development, as well as a private preview of a Snowpark Model Registry for scalable MLOps. Finally, the company will soon have a public preview of Streamlit in Snowflake, which will turn models into interactive apps and bring advanced streaming capabilities.
“Our goal here is to provide support for Streamlit running on completely native within Snowflake,” Grabs said. “Your end-to-end application can deliver compelling visualization into activity through Streamlit while never leaving the secure boundary of your Snowflake account.”