New AI Computer Vision System Is Capable of Human-Like Identification

The computer vision system can identify objects based on only partial glimpses, like with these photo snippets of a motorcycle. (Image courtesy of the UCLA Samueli School of Engineering.)

Machine vision systems are still a long way from performing like the human brain, and today’s systems remain limited in how they “see” without human intervention. However, a new computer system might be the breakthrough the field needs: it can discover and identify real-world objects using the same method of visual learning that humans use.

The system represents an advance in “computer vision” technology, which enables computers to read and identify visual images.

The approach has three steps. First, the system breaks an image into smaller pieces, which the researchers call “viewlets.” Second, it learns how the viewlets fit together to form the object in the image. Third, it looks at other objects in the surrounding area to determine whether information about them is relevant to describing and identifying the object in question. This process is similar to how humans identify and study symbols and objects.
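The three steps can be made concrete with a toy sketch. This is not the researchers' algorithm: the function names (`split_into_patches`, `learn_adjacency`, `context_score`), the fixed square patches, and the simple neighbor counting are all assumptions chosen for illustration, and the third function is only a rough stand-in for the contextual step described above.

```python
from collections import Counter


def split_into_patches(image, size):
    """Step 1 (toy): cut a 2D grid into non-overlapping size x size patches.

    `image` is a list of equal-length rows. Returns a dict that maps each
    patch's (row, col) grid position to the patch as a tuple of tuples.
    """
    patches = {}
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patch = tuple(tuple(image[r + i][c:c + size]) for i in range(size))
            patches[(r // size, c // size)] = patch
    return patches


def learn_adjacency(patches):
    """Step 2 (toy): count which patches sit to the right of or below which."""
    counts = Counter()
    for (r, c), patch in patches.items():
        for dr, dc in ((0, 1), (1, 0)):  # right neighbor, then lower neighbor
            neighbor = patches.get((r + dr, c + dc))
            if neighbor is not None:
                counts[(patch, neighbor, (dr, dc))] += 1
    return counts


def context_score(model, patches):
    """Step 3 (rough stand-in): share of neighboring patch pairs in a new
    image whose arrangement was already seen in the learned model."""
    observed = learn_adjacency(patches)
    total = sum(observed.values())
    if total == 0:
        return 0.0
    known = sum(n for key, n in observed.items() if key in model)
    return known / total


# Hypothetical 4x4 "image" made of four distinct 2x2 regions.
image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
model = learn_adjacency(split_into_patches(image, 2))

blank = [[0] * 4 for _ in range(4)]
print(context_score(model, split_into_patches(image, 2)))  # 1.0
print(context_score(model, split_into_patches(blank, 2)))  # 0.0
```

A real system would work on learned image features instead of exact pixel matches, but the sketch shows the basic idea of learning how parts fit together without any labels.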

To help it learn more like humans do, the engineers immersed the new system in an Internet replica of the environments humans live in.

"Fortunately, the Internet provides two things that help a brain-inspired computer vision system learn the same way humans do," said Vwani Roychowdhury, a UCLA professor of electrical and computer engineering and the study's principal investigator. "One is a wealth of images and videos that depict the same types of objects. The second is that these objects are shown from many perspectives—obscured, bird's eye, up-close—and they are placed in different kinds of environments."

The researchers drew insights from the fields of cognitive psychology and neuroscience in developing the system’s framework.

"Starting as infants, we learn what something is because we see many examples of it, in many contexts," Roychowdhury explained. "That contextual learning is a key feature of our brains, and it helps us build robust models of objects that are part of an integrated worldview where everything is functionally connected."

The system was shown more than 9,000 images depicting an assortment of people and objects. After studying the images, it was able to build a detailed model of the human body without any labels or guidance from the engineers.

The engineers also ran similar tests with images of common modes of transportation, such as motorcycles, cars and airplanes. The system was able to perform significantly better in all cases compared to traditional computer vision systems that had undergone years of training.

While existing AI computer vision systems are powerful and capable, they are still limited to task-specific activities. That is, their ability to identify what they see is limited by the extent of training and programming provided by humans.

Current computer vision systems are not autonomous by design. Rather, they are trained on exactly the material they need to learn, typically by reviewing thousands of images in which the objects to be identified are already labeled. This makes the process closer to memorization than to learning.

Computers also have no rationale for determining what an object represents, because these AI-based systems do not progressively build a human-like model of the objects and symbols they learn.

The new system is a significant step toward AI systems that can reason and interact the way humans do. The researchers hope that, with more testing, it can become more intuitive and learn on its own.

The findings and method are published in the Proceedings of the National Academy of Sciences. The system was developed by researchers from the UCLA Samueli School of Engineering and Stanford University.
