Patterns of Progress: Andrew Ng Eyes a Revolution in Computer Vision
Andrew Ng sees a pattern in AI. The visionary computer scientist recently gave a keynote address at the AI Hardware Summit, where he predicted a revolution in computer vision that could radically transform how machines perceive and interact with the visual world. Just as large transformer models have revolutionized text processing, Ng says, a similar advance is coming to vision processing.
“There’s definitely something in the air in computer vision in the same way that three years ago there was something in the air at NLP conferences,” he said, referring to the atmosphere at this year’s Conference on Computer Vision and Pattern Recognition (CVPR).
So why now? Ng says that the last ten years have been dominated by large-scale supervised learning, where bigger neural networks and more data led to better performance and significant business results. More recent years have seen a diversification of AI tools, with generative AI coming to the forefront.
Ng then highlighted the practicality and potential of AI in everyday applications, suggesting that we are transitioning from a period of centralized, cloud-based AI to a more distributed model where AI operates at the edge. “On the consumer side but also on the industrial side, there are a lot of AI applications that will make sense to run at the edge rather than just in the cloud for privacy or agency reasons,” he noted.
This shift is particularly evident in industries where deploying AI at the edge can enhance privacy, reduce latency, and tailor solutions to specific, on-site needs. The sentiment analysis example he provided was a case in point: a task that once required substantial computational resources can now be accomplished with smaller models directly on consumer devices.
Moreover, Ng emphasized the importance of tailored applications of AI, such as the inspection of pizzas in a factory to ensure quality or the precision harvesting of wheat in agriculture. These examples illustrate his broader point: AI is not just about monumental, one-size-fits-all solutions but also about myriad “smaller” applications that cumulatively represent significant economic value and practical benefits. These use cases, often overlooked by major tech firms focused on consumer internet applications, demonstrate the untapped potential of AI in specialized, niche markets.
As AI tools become more accessible and customizable, thanks to developments in generative AI and prompting, organizations of various sizes can now harness the power of AI for unique challenges. This democratization could lead to a more equitable distribution of AI’s benefits across different sectors and industries.
Ng predicts that just as the groundbreaking advancements in natural language processing models have brought about a renaissance in text understanding, a similar watershed moment is on the horizon for computer vision. It is quite possible that the way machines “see” and “understand” images and visual data is about to undergo a revolutionary change.
“I think that revolution is underway; it’s not yet fully here,” Ng said. “But I think it’ll be exciting to see what we’re seeing in texts turn into a revolution in the vision space as well.”
What the NLP domain experienced a few years ago seems to be coming to fruition for computer vision now. The technology trends are there, Ng says, and teams around the world are building upon vision transformers just as others did with text only a short time before.
Andrew Ng sees a pattern in AI, and hopefully that pattern is no hallucination.