Meet Andrew Ng, a 2022 Datanami Person to Watch
Andrew Ng is one of the most influential individuals in big data and AI. He’s also one of the busiest, with stints at Google and Baidu, not to mention co-founding Coursera and his latest ventures, Landing AI and DeepLearning.AI. We’d be thrilled if he added one more credential to his stellar resume: Datanami Person to Watch for 2022.
Ng kindly responded to our questionnaire, which follows.
Datanami: You’ve had a storied career, from your work at Google and Baidu to founding Coursera and now Landing AI. What do you attribute your great success to?
Andrew Ng: I find this hard to answer, since I don’t think I’m that successful! But looking back, two things that had a large impact:
- I was willing to stick to a contrarian position that I believed in. For example, scaling up deep learning algorithms (Google Brain), building AI in China (Baidu), trying to take a top university education to millions (Coursera), or building AI for manufacturing (Landing AI) all seemed like “bad” ideas to some when I started. But many of the most impactful projects result from spotting an opportunity that others have not yet spotted and bucking the trend to push in a new direction.
- In terms of personal productivity, I think many people underestimate the importance of forming helpful habits, such as learning a bit every week, so that your habits can pull you forward without needing to constantly count on willpower to get things done.
Having great teammates is also a big part of success.
Datanami: Can you tell us more about Landing AI and LandingLens?
Ng: LandingLens, the flagship product of Landing AI, is a tool that makes it easy to build and deploy cutting-edge deep learning algorithms for vision applications. Our initial focus is manufacturing and industrial automation.
Many factories count on visual inspection to ensure products are free of defects. Modern deep learning has significantly expanded the set of defects that are now possible to detect with machine vision, but is not yet widely used. Why is this? Two key barriers to adoption are:
- Manufacturing datasets are small, and most deep learning algorithms were developed for much larger datasets.
- Each factory makes a unique product, and thus needs a custom model trained to detect its defects. Unfortunately, there isn’t enough AI talent to build the number of custom models the industry needs.
LandingLens is an MLOps (machine learning operations) platform that enables manufacturers to use data-centric AI development (that is, to systematically engineer the data fed to the learning algorithm) to train a custom model that achieves high accuracy, often even if only a small amount of data is available. We find that taking Landing AI’s data-centric AI approach also shortens the time to development and time to deployment.
Datanami: Large transformer networks like BERT have been instrumental in bringing powerful NLP to the masses. Do you think they have brought us closer to artificial general intelligence (AGI)?
Ng: I find the idea of AGI exciting, but it is also very overhyped. While BERT and other transformer neural networks (such as GPT-3) are a wonderful step forward for application builders, and also demonstrated an ability to generate language across a surprisingly large range of topics and styles, they are realistically only minuscule steps toward AGI.
Building a ladder was a step toward putting humans on the moon, not because we got there by building a 239,000-mile ladder, but because we could hardly have built rockets if we didn’t have a ladder. Transformer models (like BERT) also seem like an important step, but I believe the path toward the AGI dream will take many more decades, or perhaps centuries, of fundamental research and breakthroughs.
Datanami: What do you hope to see from the big data community in the coming year?
Ng: Big data has been instrumental to AI’s development, and I hope all of us in big data will embrace and contribute to the data-centric AI movement.
AI systems are built using code (which implements a learning algorithm) and data (used to train the system). For many years, the conventional approach in AI was to download some dataset and work on the code. Thanks to this paradigm of development, for many applications today, the code aspect of an AI system is mostly a solved problem. You can download a model from GitHub that works well enough for your project. Rather than spend time on the algorithm, in many instances it’s now more useful to work on the data — by systematically iterating on the dataset using data-engineering processes and principles.
Data-centric AI is an emerging technology approach that is developing principles and tools for systematically engineering the data needed to build a successful AI application. I started to talk about data-centric AI in a YouTube video on March 24, 2021, and since then I’ve noticed the phrase popping up on more and more corporate websites. I’m hopeful that these companies, and many others, will create tools that enable many people to apply these ideas in a more systematic way.
This will be key to democratizing access to AI systems, and also to opening up many new applications where the amount of data available is not large.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
Ng: Most people don’t know that I love the arts. When I was growing up, my father was a medical doctor who taught me about science, computer programming, and also AI. And my mother ran arts festivals, and frequently took my brother and me to musical concerts, opera and the theater. So my upbringing had a healthy dose of both the arts and the sciences.
Today, my own artistic skill is limited to hand sketching the occasional panda for my daughter (who loves them) or playing the “Baby Shark” tune on the piano for my son. But I still appreciate the creativity of artists who create pieces or performances that move the human spirit.
You can read the rest of our interviews from the 2022 Datanami People to Watch program at this link.