GIST Researchers Make Robot Vision Breakthrough
Robot vision increasingly pervades processes ranging from manufacturing, where robots must manipulate difficult objects, to autonomous driving, where cars must identify and respond to different kinds of obstacles. But these systems often struggle when objects are occluded (not fully visible). Now, researchers from the Gwangju Institute of Science and Technology (GIST) have developed a novel framework that identifies these occluded objects more successfully than previous approaches.
Typically, robot vision systems have relied on identifying an object based solely on its visible elements. But this new system, called "unseen object amodal instance segmentation" (UOAIS), quite literally introduces a new layer into the equation. When it encounters an object of interest, it isolates the visible elements of that object and then determines whether the object is occluded, segmenting the image into a "visible mask" and an "amodal mask" and inferring the remainder of the object.
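The actual UOAIS model is a learned neural network, but the relationship between the two masks it predicts can be illustrated with a toy sketch. Assuming simple boolean arrays for the masks (the function name and arrays here are hypothetical, not from the paper), the occluded region is whatever the amodal mask covers beyond the visible pixels:

```python
import numpy as np

def occlusion_from_masks(visible_mask, amodal_mask):
    """Given boolean visible and amodal masks for one object instance,
    derive the occluded (hidden) region and an occlusion rate."""
    visible = visible_mask.astype(bool)
    amodal = amodal_mask.astype(bool)
    # The amodal mask covers the object's full extent, so the hidden
    # part is whatever the amodal mask adds beyond the visible pixels.
    occluded = amodal & ~visible
    rate = occluded.sum() / max(amodal.sum(), 1)
    return occluded, float(rate)

# Toy example: a 1x6 "object" whose right half is hidden.
visible = np.array([[1, 1, 1, 0, 0, 0]])
amodal = np.array([[1, 1, 1, 1, 1, 1]])
occluded, rate = occlusion_from_masks(visible, amodal)
print(occluded.astype(int))  # [[0 0 0 1 1 1]]
print(rate)                  # 0.5
```

In the real system both masks are predicted from the image; the point of the sketch is only that reasoning over the amodal mask gives the robot an explicit estimate of the hidden region rather than just the visible one.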
“Previous methods are limited to either detecting only specific types of objects or detecting only the visible regions without explicitly reasoning over occluded areas,” explained Seunghyeok Back, a PhD student at GIST who worked with Kyoobin Lee (an associate professor at GIST) to lead the UOAIS development team. “By contrast, our method can infer the hidden regions of occluded objects like a human vision system. This enables a reduction in data collection efforts while improving performance in a complex environment.”
Training traditional robot vision systems can be a tedious process with mixed results. “We expect a robot to recognize and manipulate objects they have not encountered before or been trained to recognize,” Back said. “In reality, however, we need to manually collect and label data one by one as the generalizability of deep neural networks depends highly on the quality and quantity of the training dataset.”
To train UOAIS, Lee and Back fed the model a dataset of 45,000 synthetic photorealistic images with modeled depth information. The team said that this dataset, which they characterized as fairly limited, was, when combined with a hierarchical occlusion modeling scheme, able to achieve state-of-the-art performance on three benchmarks. "Perceiving unseen objects in a cluttered environment is essential for amodal robotic manipulation," Back said. "Our UOAIS method could serve as a baseline on this front."
To learn more about this research, read the paper, “Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling,” which was accepted at the 2022 IEEE International Conference on Robotics and Automation. The paper was written by Seunghyeok Back, Joosoon Lee, Taewon Kim, Sangjun Noh, Raeyoung Kang, Seongho Bak, and Kyoobin Lee.