Why Data Scientists and ML Engineers Shouldn’t Worry About the Rise of AutoML
Low-code and no-code development tools are becoming increasingly popular, and the pandemic only accelerated this trend. When we think of low-code or no-code development, we’re usually referring to tools that allow a non-software-engineer to create a digital app (or workflow) in a plug-and-play manner that doesn’t require extensive technical knowledge.
But the idea of low-code or no-code engineering also extends to tools for machine learning and data science—and today, we’re seeing a proliferation of options in this category, too, sometimes referred to as AutoML. As with low-code dev tools, the allure of these offerings is that they enable businesses to implement data science and ML workflows without needing the resources or expertise to build them from scratch. AutoML tools allow a user to input a dataset and then, with minimal data science knowledge needed, deploy a model to run over the data and generate results. It’s tempting to think that AutoML fully breaks down the barriers to AI, allowing anyone to do this type of work, but for reasons I’ll explain later, that’s not really the case—quite the opposite, in fact.
AutoML does have some potential benefits. An article from Deloitte notes two advantages in particular. The first is increased productivity for data scientists, who can speed up specific steps of the ML lifecycle through automation. This will ultimately enable data scientists to increase their value contribution to the business and to focus on more complex problems.
A second benefit is enabling non-technical business leaders to gain some access to ML, which makes particular sense in the context of the well-documented demand for data scientists. Some have speculated that AutoML might ease the talent crunch for data scientists, if it does in fact allow existing employees to do the same type of ML work without specialized training. Amid COVID-19 cost-cutting, questions have been raised about whether demand for data scientists would begin to cool, especially since it’s a field that can struggle to show clear ROI in some business settings. How will the rise of AutoML fit into the mix?
I do think that AutoML will impact the data science field. As AutoML tools become more widespread, we’ll see a corresponding increase in ML adoption among businesses. For a long time, enterprise ML was the province of the few: tech giants, innovative startups, and “traditional” businesses that were large enough to fund in-house AI centers. Tools like AutoML will make basic ML models and outputs more accessible to other types of companies. This doesn’t mean that the neighborhood florist is going to suddenly have a system like J.A.R.V.I.S. running the place; as an article from McKinsey rightly notes, “at present, the technology is best suited to streamlining the development of common forecasting tasks.”
As AutoML lowers the barriers and increases enterprise ML adoption, enterprises will in fact find that they have a greater need for expert data scientists, not a reduced one. As organizations adopt more and more ML technologies and their use cases become more specific, they’ll outgrow the one-size-fits-all approach. At that point, they’ll need qualified data scientists and ML engineers to help them continue on that growth trajectory. This is true not only because of the limitations of AutoML, but also because of the need for human oversight to account for ethical concerns like bias as ML usage becomes more prevalent.
Additionally, ML workflows are not typically a “set it and forget it” process: as business conditions change over time, data drift or concept drift may cause ML models to become less accurate. A skilled data scientist can detect and correct for these types of problems; they can also improve overall model performance by adjusting the training data as needed, to avoid the classic “garbage in, garbage out” scenario. While AutoML can improve access to basic ML workflows that a business can build on, experienced data scientists are needed to enable peak performance of those workflows and to provide the most nuanced, useful interpretation of results.
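To make the idea of data drift concrete, here is a minimal sketch of one common check, the Population Stability Index (PSI), which compares how a feature was distributed in the training data with how it looks in recent data. The function name `psi`, the bin count, the synthetic samples, and the 0.25 alert threshold are all illustrative assumptions. The threshold is a widely quoted rule of thumb, not a standard, and a real monitoring setup would tune it per feature.

```python
import math
import random


def psi(reference, current, bins=10):
    """Population Stability Index between two numeric samples.

    Bins are equal-width and derived from the reference sample only.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic data standing in for a single model input feature (assumption).
rng = random.Random(0)
training = [rng.gauss(50, 10) for _ in range(5000)]  # what the model was trained on
same = [rng.gauss(50, 10) for _ in range(5000)]      # recent data, no drift
shifted = [rng.gauss(60, 10) for _ in range(5000)]   # recent data, mean has moved

psi_same = psi(training, same)
psi_shifted = psi(training, shifted)

DRIFT_THRESHOLD = 0.25  # rule-of-thumb value, treated here as an assumption
for label, score in [("same", psi_same), ("shifted", psi_shifted)]:
    status = "drift suspected" if score > DRIFT_THRESHOLD else "stable"
    print(f"{label}: PSI={score:.3f} ({status})")
```

A check like this only flags that the inputs have moved. Deciding whether the shift matters, and whether to retrain, reweight, or collect new data, is still the kind of judgment call the paragraph above assigns to a data scientist.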
The reality is that we’ll never automate away the need for data scientists, even if we do automate some of their tasks or improve accessibility to basic ML workflows for non-technical business people. If anything, growing adoption of AutoML will drive increased need for real, live data science expertise. Putting companies on a more equal footing in terms of their ability to incorporate ML into their businesses is a good thing, as are efforts to further democratize data science and AI. But we’ll always need expert data scientists to guide implementations, especially as they become more use-case-specific or begin to more directly impact the public.
About the author: Kevin Goldsmith serves as the Chief Technology Officer for Anaconda, the data science platform that has more than 25 million users. In his role, he brings more than 29 years of experience in software development and engineering management to the team, where he oversees innovation for Anaconda’s current open-source and commercial offerings. Goldsmith also works to develop new solutions to bring data science practitioners together with innovators, vendors, and thought leaders in the industry. Prior to Anaconda, Kevin served as CTO of AI-powered identity management company Onfido. Other roles have included CTO at Avvo, vice president of engineering, consumer at Spotify, and nine years at Adobe Systems as a director of engineering. He has also held software engineering roles at Microsoft and IBM.