September 7, 2017

ML Toolkit Aims to Ease Data Scientists’ Pain

George Leopold


As the AI services ecosystem expands, vendors are offering automation tools designed to make life easier for embattled data scientists through toolkits used to build machine and deep learning models, and then move those trained models to production.

That’s the premise behind the upgraded version of a machine learning “lab” from a toolkit vendor called Neptune. The AI startup, based in Palo Alto, Calif., positions its toolkit as enabling data scientists to build machine-learning models in their preferred framework, including Keras, the open source neural network library, and TensorFlow, the Google-developed machine-learning framework.

The platform also is being promoted as a way to reduce complex technology stack and infrastructure management tasks, freeing data scientists to focus instead on model deployment and maintenance. “Users can share and compare their results in leaderboards and choose the best models for further development and deployment,” noted company CTO Piotr Niedźwiedź.

The Neptune toolkit also addresses the increasing complexity associated with creating machine learning models. That complexity has seeded an emerging market for what’s being called “automated machine learning.”

“Designing and tuning a machine learning model is not for the faint of heart,” noted data science analyst James Kobielus. “If you’re a working data scientist, you must sort your way through a bewildering range of parameters in an attempt to get it right. For starters, you must select a feature model that contains the right set of independent variables to drive your intended machine-learning outcome.”

The upgraded version of Neptune includes an interactive prototyping feature with Jupyter Notebooks, the open source web application that allows the creation and sharing of documents containing live code, equations and visualizations. The platform also allows users to track and reproduce machine-learning experiments hosted in public clouds.

Neptune also supports collaboration and eliminates the need to analyze logs by including a user interface designed to make it easier for data scientists to monitor model training, the company said.

Earlier this year, the startup participated in a competition sponsored by the U.K.’s Defence Science and Technology Laboratory designed to identify and categorize objects in satellite imagery. The laboratory provided 1-km by 1-km satellite images and the task was to detect different objects such as buildings, vehicles, trees or roads scattered across the landscape. Users had to identify an AI algorithm or develop software that would help evaluate large and complex data sets.

The startup’s data science team finished fourth in the competition, ahead of 400 other international teams.
