October 2, 2019

ArangoDB Extends Open Source Solution with ArangoML Pipeline

SAN FRANCISCO and COLOGNE, GERMANY, Oct. 2, 2019 – ArangoDB, the leading open source native multi-model database, today announced the release of ArangoML Pipeline, an open source project that provides a common metadata layer for production-grade data science and machine learning (ML) platforms, and which the company describes as the first multi-model metadata layer for ML pipelines. ArangoML Pipeline is the first offering in ArangoDB’s new extension, ArangoML.

The need to monitor, manage and audit ML pipelines is a common challenge. Machine learning pipelines differ from project to project and from company to company. They contain a number of different components, from distributed training and Jupyter Notebooks to hyperparameter optimization and feature stores. Most of these components produce metadata. For DataOps teams, it is critical to have a common view across their ML production platforms to answer questions such as which models were derived from which dataset and how a particular model can be rebuilt reproducibly.

ArangoML Pipeline centralizes the metadata produced across the entire pipeline, giving data scientists a history of how the ML models they write are trained and how they perform over time. As a native multi-model database, ArangoDB can easily accommodate and unite unstructured, highly interlinked data, such as inference and model descriptions. This not only gives data scientists easier access to the data they need to optimize their ML models, but also helps companies in highly regulated industries meet auditing requirements in areas such as risk management or insurance incident handling. For example, consumers in many countries have the legal right to understand why their loan application or insurance claim has been declined. For ML- and AI-based case processing and risk analysis, enterprises need to provide a detailed audit trail, which the common metadata of an ML pipeline can supply.

“In most machine learning production scenarios, Data Scientists and DataOps not only want to build a single accurate model, but also have a pipeline where they can build, rebuild and serve multiple machine learning models,” said Jörg Schad, Head of Engineering and Machine Learning at ArangoDB. “The metadata produced by these pipelines is often overlooked but highly valuable in terms of, for example, finding lineage and audit information, as well as optimizing the model serving policy. ArangoML Pipeline is the first comprehensive solution on the market that captures, analyzes and monitors any kind of metadata, answers arbitrary complex questions and works for any kind of pipeline setup.”

As a native multi-model database, ArangoDB is a natural fit for ML use cases that involve unstructured data and also need to track the relationships between different entities. ArangoDB unites graph, document, and key/value data models, along with a full-text search engine, natively in a single C++ core with the same query language. By uniting multiple data models in a single database, ArangoDB simplifies the process of accessing different data models, finding connections between them, and extracting value out of them, which is what ML is all about.
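To make the idea of combining document and graph data for pipeline metadata concrete, the following is a minimal sketch in plain Python. It does not use ArangoDB or the ArangoML Pipeline API. The document IDs, attribute names, and the `derived_from` helper are all hypothetical and chosen only for illustration. Metadata records are modeled as documents, lineage as directed edges, and a simple traversal answers the question "which models have been derived from which dataset".

```python
from collections import deque

# Hypothetical metadata documents, keyed by an illustrative ID scheme.
documents = {
    "datasets/loans_2019": {"type": "dataset", "rows": 120000},
    "features/loans_v1": {"type": "featureset", "columns": 42},
    "models/risk_v1": {"type": "model", "accuracy": 0.91},
    "models/risk_v2": {"type": "model", "accuracy": 0.93},
}

# Directed lineage edges: (source, target) means "target was derived from source".
edges = [
    ("datasets/loans_2019", "features/loans_v1"),
    ("features/loans_v1", "models/risk_v1"),
    ("features/loans_v1", "models/risk_v2"),
]


def derived_from(start, wanted_type):
    """Walk the lineage edges breadth-first from `start` and return the
    sorted IDs of all reachable documents whose type is `wanted_type`."""
    # Build an adjacency list so each node's outgoing edges are easy to look up.
    outgoing = {}
    for source, target in edges:
        outgoing.setdefault(source, []).append(target)

    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        for nxt in outgoing.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            if documents[nxt]["type"] == wanted_type:
                found.append(nxt)
    return sorted(found)


models = derived_from("datasets/loans_2019", "model")
print(models)  # ['models/risk_v1', 'models/risk_v2']
```

In a multi-model database, the same question would typically be expressed as a single graph traversal in the query language over stored documents and edges, rather than in application code as sketched here.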

Learn more

To get started with ArangoML Pipeline: Visit the GitHub repository
For more details on ArangoML: Read the blog
To join a webinar for a more in-depth overview of ArangoML with Jörg Schad, ArangoDB Head of Engineering and Machine Learning: Register here

About ArangoDB
One database, one query language, and three data models. With more than 7 million downloads and over 8,000 stargazers on GitHub, ArangoDB is the leading open source native multi-model database. It combines the power of graphs with JSON documents, a key-value store, and a full-text search engine, enabling developers to access and combine all of these data models with a single, elegant, declarative query language.

Simplifying complexity and increasing productivity is the mission of ArangoDB Inc., the company behind the project. Founded in 2014, ArangoDB Inc. is a privately held company backed by Bow Capital and Target Partners. It is headquartered in San Francisco and Cologne, with offices and employees around the world. Learn more at www.arangodb.com.

Source: ArangoDB

Tags: ArangoML, data pipeline, data science, machine learning, multi-modal
