March 24, 2022

Snorkel AI Announces GA of Its Data-Centric AI Platform, Snorkel Flow

SAN FRANCISCO, March 24, 2022 — Snorkel AI, the data-centric AI platform company powered by programmatic data labeling, today announced the general availability of its data-centric AI platform, Snorkel Flow, for enterprises to accelerate AI application development 10-100x with automated data labeling. The generally available version of Snorkel Flow includes enhancements to its programmatic data labeling, integrated ML modeling, collaborative AI application development, guided data iteration capabilities, and pre-built templates for classification of and information extraction from documents. Register for the first in a series of Snorkel Flow demo events on March 29th, 2022, at 11 am Pacific Time.

Today, enterprise data scientists and domain experts spend over 80 percent of AI development time gathering, organizing, and manually labeling the training data used to train machine learning models (Cognilytica 2020 report). Manual labeling is notoriously expensive and slow, limiting development teams’ ability to build, iterate, adapt, or audit applications in a systematic and privacy-compliant manner. The training data bottleneck is a primary reason why 87 percent of AI projects never make it into production.

To solve the training data bottleneck, Snorkel Flow provides the world’s first data-centric AI platform for enterprise teams to label training data programmatically, use error analysis to guide training data and model iteration in tandem, and adapt to real-world changes with a few clicks rather than complete manual relabeling. With Snorkel Flow, organizations have achieved state-of-the-art machine learning model accuracy in days rather than weeks or months.

Snorkel AI recently announced that Snorkel Flow is deployed at Chubb, the world’s largest publicly traded property and casualty insurer. Snorkel AI’s customer base continues to grow rapidly and includes Memorial Sloan Kettering Cancer Center, the world’s largest and oldest cancer center; two of the three top US banks; other Fortune 500 organizations in the biotech, oil and gas, and telecom sectors; and several government agencies.

“A significant need for AI models is labeled data, often tedious and expensive to generate,” said Janet Mak, Deputy CIO and VP of Digital Solutions, Memorial Sloan Kettering Cancer Center. “We are leveraging modern data-centric AI approaches, using generative learning with weak supervision, for machine-labeling data. We have applied Snorkel Flow to two use cases using pathology reports. We accurately labeled a few thousand pathology reports (95% accuracy, 85% precision) using one SME in days versus weeks. In addition to these material time savings, Snorkel Flow allows our teams to collaborate on the data accuracy and provides time efficiencies for our highly valued physicians and medical professionals.”

“Snorkel Flow is the result of over half a decade of research and close partnership with our customers. With a focus on speed, privacy, and collaboration, the platform delivers what Fortune 500 companies need to build mission-critical AI applications that power their business, protect data, and scale the use of AI,” said Alex Ratner, co-founder and CEO, Snorkel AI.

The generally available version of Snorkel Flow delivers a data-centric development workflow for data science and machine learning practitioners to tackle document intelligence applications. This includes:

  • Programmatic data labeling: No-code and Python SDK interfaces for programmatic labeling, with state-of-the-art weak supervision algorithms.
  • Integrated ML modeling suite: No-code, continuous training of leading, pre-configured models and modeling tools like AutoML available in-platform.
  • Collaborative AI application development: Workflows for domain experts to encode labeling insight and rationale at scale and platform tools for real-time troubleshooting.
  • Guided data iteration: Actionable error analysis and active learning workflows to improve training data quality and achieve production-worthy model accuracy faster.
  • Accelerated document intelligence: Built-in pipeline templates with pre- and post-processing operators, models, and business logic for document classification and extraction applications.
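For readers unfamiliar with programmatic labeling, the idea behind the first capability can be sketched in a few lines of plain Python. This is an illustrative sketch only, not Snorkel Flow's SDK: the label names, the keyword rules, and the functions `majority_vote` and `label_documents` are all hypothetical, and a simple majority vote stands in for the weak supervision algorithms the platform is described as using.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical label scheme for a document classification task.
ABSTAIN = -1
INVOICE = 0
CONTRACT = 1

# Each labeling function encodes one heuristic and may abstain.
def lf_mentions_invoice(text: str) -> int:
    return INVOICE if "invoice" in text.lower() else ABSTAIN

def lf_mentions_amount_due(text: str) -> int:
    return INVOICE if "amount due" in text.lower() else ABSTAIN

def lf_mentions_agreement(text: str) -> int:
    return CONTRACT if "agreement" in text.lower() else ABSTAIN

def lf_mentions_parties(text: str) -> int:
    return CONTRACT if "the parties" in text.lower() else ABSTAIN

LABELING_FUNCTIONS: List[Callable[[str], int]] = [
    lf_mentions_invoice,
    lf_mentions_amount_due,
    lf_mentions_agreement,
    lf_mentions_parties,
]

def majority_vote(votes: List[int]) -> int:
    """Combine votes by simple majority; abstain on no votes or a tie."""
    counted = Counter(v for v in votes if v != ABSTAIN)
    if not counted:
        return ABSTAIN
    ranked = counted.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

def label_documents(docs: List[str]) -> List[int]:
    """Apply every labeling function to every document and combine the votes."""
    return [majority_vote([lf(doc) for lf in LABELING_FUNCTIONS]) for doc in docs]

if __name__ == "__main__":
    sample = [
        "Invoice #12: amount due on receipt",
        "This agreement is entered into by the parties",
        "Meeting notes from Tuesday",
    ]
    print(label_documents(sample))
```

Changing a rule and re-running the functions relabels the whole dataset at once, which is the contrast with manual relabeling that the release draws.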

Snorkel Flow is built for the modern enterprise, featuring cloud-agnostic Kubernetes deployment options, role-based access controls, SSO integrations, encryption in transit and at rest, and more. In addition to the generally available capabilities, several enhancements were recently released in beta, including the new Studio experience, annotation workspace, PDF extraction and conversational AI pipelines, sequence tagging capabilities, and more.

Existing customers have access to the generally available version of Snorkel Flow at no additional cost. New customers can request a demo or visit the Snorkel AI website for more information.

Snorkel Flow Demo Event

Register for a Snorkel Flow demonstration on March 29th, 2022, at 11 am Pacific Time. Braden Hancock, Snorkel AI Co-founder and Head of Technology, will build a text classification application based on a real-world financial services use case with a Fortune 50 bank using Snorkel Flow.

About Snorkel AI

Founded by a team spun out of the Stanford AI Lab, Snorkel AI makes AI application development fast and practical by unlocking the power of machine learning without the bottleneck of manually labeled training data. Snorkel Flow is the first data-centric AI platform powered by programmatic labeling. Backed by Addition, Greylock, GV, In-Q-Tel, Lightspeed Venture Partners, and funds and accounts managed by BlackRock, the company is based in Palo Alto. For more information, visit the Snorkel AI website.

Source: Snorkel AI