July 9, 2020

cnvrg.io and NetApp Partner to Deliver MLOps Dataset Caching

SAN FRANCISCO, July 9, 2020 – cnvrg.io, the data science platform simplifying model management and introducing advanced MLOps to the industry, announced its partnership with NetApp, which is the first to leverage the cnvrg.io dataset caching tool, a set of capabilities for pulling datasets from cache immediately for any machine learning job. cnvrg.io is the first ML platform to use dataset caching for end-to-end machine learning development. Caching allows datasets to be ready to use in seconds rather than hours, and cached datasets can be authorized and used by multiple teams in the same compute cluster connected to the cached data. Dataset caching is already used in production by cnvrg.io customers.

It’s not uncommon to have hundreds of datasets feeding models. However, those datasets may live far away from the compute that is training the models, such as in the public cloud or in a data lake. With NetApp and cnvrg.io’s dataset caching capability, users can cache the datasets they need (or specific versions of them) and make sure that they are located in the ONTAP® AI storage attached to the GPU or CPU compute cluster that is running the training. Once the needed datasets are cached, they can be used multiple times by different team members.

The cnvrg.io dataset caching feature can be used by any cnvrg.io user with an ONTAP AI storage server. Once the storage is connected to an organization, data scientists can cache commits of their datasets on that Network File System (NFS) storage. When a commit is cached, users can attach it to jobs for immediate, high-throughput access to the data, and the job does not need to clone the dataset on start-up. cnvrg.io’s dataset caching feature creates the following business advantages:

Increased productivity – Datasets are ready to be used in seconds rather than hours.
Improved sharing and collaboration – Cached datasets can be authorized and used by multiple teams in the same compute cluster connected to the cached data.
Reduced cost – Models pull datasets from the cache instead of downloading them again, reducing per-download charges.
Operationalizing hybrid cloud – The dataset cache acts as an on-premises, high-performance mirror of the data.
Multi-cloud dataset mobility – The on-premises cache serves as the control point for the data.
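
As a rough illustration of the caching flow described above, the following Python sketch keys a cache directory by dataset name and commit, and lets a job read straight from the cache when the commit is present instead of cloning it at start-up. It is not the cnvrg.io API: the function names, the directory layout, and the use of a plain directory copy to stand in for both "cache a commit" and "clone a dataset" are all assumptions made for this example, and `cache_root` stands in for a mounted NFS export.

```python
import shutil
from pathlib import Path
from typing import Tuple


def cache_path(cache_root: Path, dataset: str, commit: str) -> Path:
    """Directory on the shared mount holding one cached dataset commit (hypothetical layout)."""
    return cache_root / dataset / commit


def cache_commit(source_dir: Path, cache_root: Path, dataset: str, commit: str) -> Path:
    """Copy a dataset commit into the cache once; later calls reuse the existing copy."""
    target = cache_path(cache_root, dataset, commit)
    if not target.exists():
        # copytree creates any missing parent directories for the target.
        shutil.copytree(source_dir, target)
    return target


def resolve_for_job(source_dir: Path, cache_root: Path, dataset: str,
                    commit: str, workdir: Path) -> Tuple[Path, bool]:
    """Return (data_dir, cache_hit) for a job.

    On a cache hit the job reads the shared copy directly. On a miss it
    falls back to cloning the commit into its own working directory.
    """
    cached = cache_path(cache_root, dataset, commit)
    if cached.is_dir():
        return cached, True
    clone = workdir / dataset / commit
    shutil.copytree(source_dir, clone)  # stand-in for a slow remote clone
    return clone, False
```

In this sketch, a team member would call `cache_commit` once per dataset commit, after which every job that calls `resolve_for_job` with the same dataset and commit gets the shared directory back without copying any data.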

“Deep Learning workloads are unique in that they need access to random data samples from a large dataset that may be sourced from diverse data sources and dispersed locations,” said Santosh Rao, Senior Technical Director, NetApp AI & Data Engineering. “Further, Deep Learning requires high performance data close to the GPU Compute clusters and this requires the combination of High Performance Flash Storage Systems, Connectors into Edge, Core and Cloud for dispersed data location access and the support of widely used Data Source formats across NFS or other filesystems on a unified Data Platform. NetApp and cnvrg.io form a first of its kind partnership to bring these capabilities to customers worldwide adopting Deep Learning to transform their business.”

“Our partnership with NetApp drives productivity and efficiency for data teams,” said Yochay Ettun, CEO & Co-founder of cnvrg.io. “We’re excited to launch our dataset caching for machine learning, to offer NetApp users and cnvrg.io users faster and simplified access to their datasets with tools for advanced data management and data versioning features that will allow data teams to focus on data science over technical complexity.”

To read more about the partnership and the dataset caching, visit https://cnvrg.io/solutions/netapp/.

About cnvrg.io

cnvrg.io is an AI OS, transforming the way enterprises manage, scale and accelerate AI and data science development from research to production. The code-first platform is built by data scientists, for data scientists and offers unrivaled flexibility to run on-premises or in the cloud.

Source: cnvrg.io
