Orchestrator Emerges to Speed ML Models to Production
As the pace of machine learning model development accelerates, vendors are beginning to offer orchestration tools designed to help data scientists manage the testing, retraining and redeployment of predictive analytics models with short shelf lives. The latest entrant is Hitachi Vantara Labs, which unveiled a model manager this week designed to speed the deployment of “supervised” models in production.
The company said Tuesday (March 6) its machine learning model manager can be used in a data pipeline based on Hitachi subsidiary Pentaho Corp.’s data integration platform. The intent is to reduce business risks by making it easier to retrain machine learning models in response to new data inputs and market conditions.
Cloud and other tool vendors note that machine learning models built by enterprises seldom make it to production, a reality that has spurred the market for new tools for easing model development and deployment.
The same holds true for retraining and redeploying predictive analytics models that must be updated, some as often as daily, the Hitachi unit said. On the premise that outdated models increase business risks, the company promotes an approach that gets models to production faster, boosts model accuracy once in production and governs model applications at scale.
To that end, the orchestrator automates “algorithmic specific” data preparation and cleansing. In keeping with the trend toward freeing data scientists from dealing with IT issues like standing up a server, the Hitachi tool also offloads requirements like having to write and maintain code.
Once models hit production, evaluation statistics can be used to identify outdated models that can be updated on the fly using a “challenger” model approach. Since the results of A/B testing are delivered faster, Hitachi claims its approach updates production models faster.
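The “challenger” idea can be illustrated with a minimal Python sketch. It is not Hitachi's implementation: the `Candidate` class, the `pick_production_model` function, the accuracy metric and the `min_gain` threshold are all hypothetical names and choices made for illustration. The sketch assumes both models are scored on the same recent labeled data and that the challenger replaces the production model only if it is better by a set margin.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    """A named model wrapper (hypothetical; stands in for any trained model)."""
    name: str
    predict: Callable[[float], int]


def accuracy(model: Candidate, features: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of examples the model labels correctly."""
    if not labels or len(features) != len(labels):
        raise ValueError("features and labels must be non-empty and equal length")
    correct = sum(1 for x, y in zip(features, labels) if model.predict(x) == y)
    return correct / len(labels)


def pick_production_model(
    champion: Candidate,
    challenger: Candidate,
    features: Sequence[float],
    labels: Sequence[int],
    min_gain: float = 0.02,  # assumed margin; a real deployment would tune this
) -> Candidate:
    """Keep the champion unless the challenger beats it by at least min_gain."""
    champion_score = accuracy(champion, features, labels)
    challenger_score = accuracy(challenger, features, labels)
    if challenger_score - champion_score >= min_gain:
        return challenger
    return champion


# Toy evaluation data: the true decision boundary sits near 0.5.
recent_features = [0.1, 0.4, 0.6, 0.9]
recent_labels = [0, 0, 1, 1]

# The current production model uses a stale threshold; the retrained one does not.
champion = Candidate("champion", lambda x: int(x > 0.3))
challenger = Candidate("challenger", lambda x: int(x > 0.5))

winner = pick_production_model(champion, challenger, recent_features, recent_labels)
print(winner.name)
```

Requiring a minimum gain rather than any improvement at all is one way to avoid swapping models on noise in a small evaluation set.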
The company also addressed transparency and data governance as model developers demand greater visibility into how algorithms make decisions. Hence, the model orchestrator tracks the data lineage of model development steps while providing greater visibility into the data sources and features used to create machine learning models.
That capability, Hitachi asserts, allows data scientists to share data and data pipelines, thereby promoting standardization and reuse, steps required to build new machine learning applications faster. “Data scientist and IT operation teams will need to move newly trained models into production faster than ever before, which can jeopardize model accuracy, collaboration and governance,” said John Magee, vice president of product marketing at Hitachi Vantara Corp., based in Santa Clara, Calif.
The machine learning model manager is available now as a plug-in through the Pentaho Marketplace. Plug-ins not currently supported will be available for testing. Some may be integrated into Pentaho’s data integration platform as part of future versions of the orchestrator, the company added.