How To Know If MLOps Is Worth The Cost For Your Problem
Enterprise data intake is growing exponentially, and leaders are looking toward any automated solutions to prevent disarray and arm themselves for the future. Machine learning (ML) will unlock the potential of collected data, but success requires time and resources.
MLOps, the discipline of deploying and monitoring machine learning models, can be extremely effective when applied to the right problems. It borrows many practices from DevOps to govern the models, availability, and evolution of ML applications. As enterprises mature and increase ML adoption, MLOps will be more important than ever. In fact, Gartner projects that around 70% of organizations will have operationalized AI by 2025.
Data Quality First
It should go without saying, but a successful MLOps investment requires access to large volumes of quality data. Automation is indeed vital to the future, but eager leaders must first ensure that there is something of value fueling the process.
Unfortunately, data quality continues to be a challenge for many businesses. According to a recent survey, nearly 1 in 2 organizations reported that ensuring data quality is a challenge. Furthermore, of all the challenges they face in using data effectively, data quality was the top concern.
This is especially difficult for smaller companies, which rarely have the structure or resources necessary to generate data and feed an ML model. Insurance companies, for example, have access to an abundance of data and likely have the talent aboard to ensure data quality is up to standard. Smaller businesses, by contrast, either lack the data or lack the right processes to collect it, and they also lack the hands to support complex data operations.
In order to successfully kick off an MLOps program, businesses of this size should look toward data lakes or warehouses, along with a data management solution that allows them to bring in significant volumes and maintain high quality. Test models can then be built with this data, and tasks that previously drained valuable talent will eventually be automated.
Unfortunately, some leaders misunderstand the scale of problem necessary to justify MLOps adoption.
This oversight will incur business costs if models fail and, because most business processes are dynamic, they will fail. The cost varies by use case. Generally, companies in finance, e-commerce, and technology have been the first to implement MLOps techniques because they have relatively easy access to quality data and resources. Now that machine learning is gaining mainstream adoption, MLOps will be needed to help companies traditionally less exposed to technology realize the benefits of machine learning at scale.
In a poll conducted across different industries by Algorithmia, the three top ML use cases identified were: 1) reducing company cost; 2) generating advanced customer insights; and 3) improving customer experience, which extensively use recommendations produced by ML models.
The degree of automation in decision making, along with the performance of the ML models used, can be combined to determine the risk/reward of the ML application. Gartner discusses degrees of automation in the article Would You Let Artificial Intelligence Make Your Pay Decisions?, where three levels are defined: 1) decision automation; 2) decision augmentation (recommendations); and 3) decision support (insights).
As ML performance increases, so does the potential reward, and vice versa. If the output of the model is acted on fully automatically, as in many cost-reduction use cases, risk increases significantly. By contrast, recommendations are less risky even when model performance is lower, since a human user decides whether or not to adopt each recommendation.
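The trade-off between automation level and model performance can be made concrete with a rough expected-cost calculation. The following is a minimal sketch, not a method taken from Gartner or from this article: the function name `expected_error_cost`, the error rate, the share of errors a human reviewer catches, and the cost per error are all hypothetical values chosen only for illustration.

```python
def expected_error_cost(error_rate, decisions, cost_per_error, human_catch_rate=0.0):
    """Rough expected cost of wrong model outputs that actually get acted on.

    human_catch_rate is the assumed share of bad outputs that a human
    reviewer rejects before they take effect; 0.0 models full automation.
    """
    if not 0.0 <= error_rate <= 1.0 or not 0.0 <= human_catch_rate <= 1.0:
        raise ValueError("rates must be between 0 and 1")
    acted_on_errors = decisions * error_rate * (1.0 - human_catch_rate)
    return acted_on_errors * cost_per_error


# Hypothetical numbers: 10,000 decisions, 5% model error rate, 200 per error.
automated = expected_error_cost(0.05, 10_000, 200.0)
# Same model used only for recommendations, assuming reviewers catch 80% of errors.
augmented = expected_error_cost(0.05, 10_000, 200.0, human_catch_rate=0.8)
print(f"full automation: {automated:,.0f}; recommendations: {augmented:,.0f}")
```

Under these assumed figures the same model is several times cheaper to get wrong when a person reviews its output, which is why a weaker model may be acceptable for recommendations but not for full automation.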
Recently, a trucking logistics company adopted ML to track and route trucks. When location equipment malfunctioned, the lack of incoming data validation meant the fault went unnoticed, and a truck was left idle for many hours. This is a known and recurring problem that could have been detected as data drift, and detection techniques for it are commonly available when implementing MLOps.
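Incoming data validation of the kind missing in this example can be quite simple. The sketch below is an illustration only and does not describe the company's actual system: the record fields (`lat`, `lon`, `timestamp`), the 15-minute staleness window, and the function name `validate_ping` are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def validate_ping(ping, now, max_age=timedelta(minutes=15)):
    """Return a list of problems found in one location record (empty if clean)."""
    problems = []
    lat, lon, ts = ping.get("lat"), ping.get("lon"), ping.get("timestamp")
    if lat is None or lon is None:
        problems.append("missing coordinates")
    elif not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        problems.append("coordinates out of range")
    elif lat == 0.0 and lon == 0.0:
        # (0, 0) is often a placeholder when no position is available.
        problems.append("null-island reading")
    if ts is None:
        problems.append("missing timestamp")
    elif now - ts > max_age:
        problems.append("stale reading")
    return problems


# Hypothetical records, checked against a fixed reference time.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
good = {"lat": 48.85, "lon": 2.35, "timestamp": now - timedelta(minutes=2)}
bad = {"lat": 0.0, "lon": 0.0, "timestamp": now - timedelta(hours=3)}
print(validate_ping(good, now))
print(validate_ping(bad, now))
```

Records that fail such checks can be held back from the routing model and raised as alerts, so that a faulty device is noticed within minutes rather than hours.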
MLOps aims to streamline and operationalize the development, testing, and deployment of ML models in production. It generally includes consistency checks on the input data distribution, monitoring of model performance, and continuous retraining of models to adapt to process changes, among others. Governance tools prevent the developmental chaos that would otherwise arise as data scientists run many experiments. This requires DevOps techniques, as any application will have to manage environments, handle accounts, perform authentication, enable data access, and more.
The lack of MLOps also affects productivity as the number of data professionals, including data engineers and data scientists, increases, a trend that shows no signs of slowing down. For many organizations, delivering a model to production takes between 31 and 90 days, as revealed in the Algorithmia poll, a challenge that will be exacerbated by the need to support new data scientists coming on board.
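One common way to check the consistency of an input data distribution is to compare a feature's values in production against the values seen at training time. The sketch below uses the population stability index (PSI) as one possible measure; it is an assumption of this example, not a technique the article prescribes. The bin count, the floor applied to empty bins, and the sample data are illustrative choices, and the often-quoted alert threshold of 0.25 is a rule of thumb rather than a standard.

```python
import math


def population_stability_index(reference, current, bins=10):
    """PSI between two numeric samples, binned on the reference sample's range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical feature values: training-time sample vs. two production samples.
reference = [i / 100 for i in range(1000)]
unchanged = list(reference)
shifted = [v + 5.0 for v in reference]  # same shape, moved upward

psi_unchanged = population_stability_index(reference, unchanged)
psi_shifted = population_stability_index(reference, shifted)
print(psi_unchanged, psi_shifted)
```

A monitoring job could compute such a score on a schedule and trigger an alert or a retraining run when it crosses a threshold agreed on for that feature.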
Due to exposure to machine learning in everyday applications, consumer and customer expectations are growing and most companies will need to implement the technology to avoid falling behind. Delivering models that work at scale, in production, is a complex and resource-heavy process. However, MLOps implementation will ultimately exhibit a strong ROI if key decision-makers identify a legitimate use case, engage in a cost-benefit analysis, and ensure individuals with the proper skillset are brought aboard.
About the Author: Thibaut Gourdel has served in various technical product marketing roles at Talend since 2017. Previously he was a data engineer with Orange. Thibaut’s areas of interest include data management, data governance, and cloud technologies.