All roads lead to vision for AI-powered industrial processes

The next frontier for industrial digitization and automation is the convergence of artificial intelligence (AI) and machine vision.


AI-powered machine vision promises to transform the way industrial manufacturers conduct their business, according to experts at a recent webinar hosted by the Association for Advancing Automation (A3). The webinar, titled Harnessing AI-Powered Machine Vision for Industrial Success, brought together industry leaders to discuss how these two tools open up many possibilities for industrial companies to maximize their competitiveness, from improving quality control to enhancing safety to optimizing production processes.


Michael Kleiner, VP of Edge AI Solutions at OnLogic; Prateek Sachdeva, Co-Founder and Chief Product Officer at Invisible.AI; and Gareth Powell, Product Marketing Director at Prophesee, discussed the potential of AI-powered machine vision to help industrial companies optimize their operations and compete in a global economy.


Converging trends lower the barrier to deploying AI in industrial settings

Kleiner identified two trends facilitating the use of machine vision in industrial digital transformation: more powerful CPUs and improvements in model compression.


“We’re seeing a real increase in the AI capabilities of what’s within the CPU package,” said Kleiner. This is particularly true in the x86 CPU architectures commonly used in industry. The growth is largely due to the inclusion of AI-optimized GPU hardware within the physical CPU: more and more integrated GPUs (iGPUs) are being included, and these are better suited to AI workloads than conventional CPU cores. More recently, neural processing units (NPUs) have been added as well.


“With these additions, in terms of AI power, it’s growing faster than linear, more according to Moore’s Law, which is really helpful if we want to do AI at the edge,” said Kleiner. “We’re seeing a good amount of growth; adding these architectures and optimizing them for AI workloads is giving us more power.”
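The practical upshot is that a single industrial PC can often run inference on whichever accelerator its CPU package happens to include. As a rough illustration only (not something shown in the webinar), here is a minimal Python sketch using Intel’s OpenVINO toolkit to pick among the CPU, an iGPU, or an NPU; the model file name and the device-preference order are assumptions for illustration.

```python
# Minimal sketch: pick the best available accelerator inside the CPU package.
# Assumes OpenVINO is installed; older releases use `from openvino.runtime import Core`.
from openvino import Core

core = Core()
print("Available devices:", core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU']

# "model.xml" is a placeholder for a network already exported to OpenVINO IR.
model = core.read_model("model.xml")

# Prefer an NPU, then an integrated GPU, and fall back to the CPU cores.
device = next((d for d in ("NPU", "GPU") if d in core.available_devices), "CPU")
compiled = core.compile_model(model, device)
print(f"Running inference on: {device}")
```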


Along with more powerful CPUs, techniques to compress machine learning models have also been improving, enabling CPU architectures to do more with the data than ever before. These compression techniques include model choice, quantization and reduced-precision data types, pruning and sparsity optimization, knowledge distillation, low-rank factorization, and more.
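To make two of those techniques concrete, here is a minimal, hedged sketch of post-training dynamic quantization and magnitude pruning using PyTorch; the toy model and the 50% sparsity level are assumptions for illustration, not anything presented in the webinar.

```python
# Minimal sketch of two compression techniques: dynamic quantization (int8 weights)
# and unstructured magnitude pruning.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy model standing in for a real classification head (assumption).
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Quantization: store Linear weights as int8, dequantized on the fly at runtime.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# 2) Pruning/sparsity: zero out the 50% smallest-magnitude weights in each layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the sparsity permanent

# Both variants accept the same input shape as the original model.
x = torch.randn(1, 512)
print(quantized(x).shape, model(x).shape)
```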


This improvement is driven by the sheer memory footprint those models require: they need to be compressed to a size manageable for the CPU’s compute power, which is particularly beneficial for CPUs at the edge, where resource constraints are significant when running AI programs. “Increasing compute power within the CPU package, and model compression so we can do more with a given architecture, really helps enable a growing number of AI use cases that can handle the inference tasks with the computational power of what’s in the CPU,” said Kleiner.


The convergence of these trends means less physical complexity in edge systems, more options in compute hardware because specialized high-end systems won’t be needed, and greater efficiency in power use and emissions. “All of this helps to simplify processing data in real-time at the edge and lowers the barrier of entry for many AI deployments, including machine vision,” said Kleiner.


Deploying AI-powered sensors at the edge

Sachdeva said devices that capitalize on that convergence can help industry. His company, Invisible.AI, has developed an intelligent camera for manufacturing, along with a software platform that monitors and learns from the camera’s recordings to deliver insights for process optimization, safety, and continuous improvement.


“Manufacturing changes every single day, every week, and you’re conducting optimization on the line,” said Sachdeva. “To be able to do data collection for every scenario is just not practical. Your solution with AI needs to work quickly, needs to be able to deploy day one, week one, not weeks or months from now, and not depend on a lot of data collection.”


AI-powered machine vision devices like Invisible.AI’s sensor help operators achieve an accurate, real-time understanding of the entirety of a company’s processes, enabling companies to implement the Japanese concept of Genchi Genbutsu: going to and directly observing an operation to understand and solve problems faster and more efficiently. Manufacturing has always placed great importance on this kind of understanding, from pen and clipboard decades ago to today’s video-native software, which layers digital context onto video data so operators can implement solutions and find efficiencies.


Many companies have tried to streamline their operations with Industry 4.0 technologies such as IIoT sensors and other text- and number-based tools, but it hasn’t been enough. Thousands of digital signals exist, but they tell only a partial story, and manufacturers have lacked complete operational visibility. According to Invisible.AI, “video is the only way to digitize and understand the physical world at scale. AI is essential for making sense of the vast amounts of data generated on the production floor. Video-native software like computer vision, coupled with AI, represent the next generation of manufacturing technology.”


Collecting the right data for more efficient computing

While Sachdeva talked about sensor innovations, Powell presented details on the kind of data efficiency that enables such sensors to deploy AI at the edge.


Powell’s company, Prophesee, powers its sensors with an innovative approach to data collection. Its sensors use an event-based approach rather than conventional frame-based image capture. By selectively focusing on the changes, or events, in a series of images and ignoring static background objects, Prophesee claims its sensors can produce up to 1,000 times less data than a conventional sensor while increasing temporal resolution to the equivalent of more than 10,000 frames per second.


“It addresses many of the issues that we have with conventional sensors in an AI system,” said Powell.


Compared with conventional sensors, Prophesee’s event sensors are complete imaging systems in their own right. A conventional sensor may have two transistors linked to each pixel, while Prophesee’s sensors have 80 to 100, sometimes even more. The sensors work on contrast detection: each pixel continuously tracks changes in light level.


Should that change exceed a certain threshold, it triggers an “event” that directs the sensor to pay closer attention only to the elements in the image that are moving or changing, ignoring the rest of the image. As a result, the data the sensor generates per event is much more efficient to process than anything a conventional RGB camera produces.
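To illustrate the idea in code (a simplified simulation, not Prophesee’s actual pixel circuitry or SDK), the following sketch generates “events” from two consecutive frames by thresholding per-pixel changes in log intensity; the threshold value and synthetic frames are arbitrary assumptions.

```python
# Simplified simulation of event generation: emit an event only where the
# per-pixel change in log intensity exceeds a contrast threshold. Real event
# sensors do this asynchronously in analog circuitry at each pixel.
import numpy as np

THRESHOLD = 0.2  # arbitrary contrast threshold (assumption)

def events_from_frames(prev_frame: np.ndarray, curr_frame: np.ndarray):
    """Return (y, x, polarity) for pixels whose log-intensity change exceeds the threshold."""
    log_prev = np.log(prev_frame.astype(np.float32) + 1.0)
    log_curr = np.log(curr_frame.astype(np.float32) + 1.0)
    diff = log_curr - log_prev

    ys, xs = np.nonzero(np.abs(diff) > THRESHOLD)
    polarity = np.sign(diff[ys, xs]).astype(np.int8)  # +1 brighter, -1 darker
    return ys, xs, polarity

# Two synthetic 8-bit frames: a bright square moves one pixel to the right.
prev = np.zeros((64, 64), dtype=np.uint8)
curr = np.zeros((64, 64), dtype=np.uint8)
prev[20:30, 20:30] = 200
curr[20:30, 21:31] = 200

ys, xs, pol = events_from_frames(prev, curr)
# Only the moving edges produce events; the static background produces none.
print(f"{len(ys)} events out of {prev.size} pixels")
```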


Powell claimed event-based processing has several benefits. Ultra-low latency and high temporal resolution mean that inference at virtually any rate is possible, limited only by computation time. That computation time is also reduced: models only have to learn simple patterns and features and don’t need to learn invariance to the static background. If invariance is incorporated, it can be minimal, enabling greater and easier generalization.


Echoing Kleiner’s observation about the compute limitations of edge devices, Powell stated that these innovations are beneficial for edge processing, where power and memory are more restrictive.


“All roads lead to vision”

With more powerful sensors and more efficient processing capabilities, machine vision is increasingly within reach for industrial businesses looking to bring AI into their factories. “All roads lead to vision,” said Sachdeva. “To be able to get to the digital twin, to be able to run your whole factory in an automated fashion: that future requires digitizing the real world, and understanding through video what’s happening.”

For example, an automated forklift might drive two to three miles per hour (mph) at most, limited because it can’t see what’s ahead. If the entire plant were digitized and the forklift knew the full route it needed to drive, it could travel 10 mph and deliver the product much faster, because it could see when there is nothing in its path. “Building towards that requires you to blanket your facilities in these vision solutions and get an understanding of what’s happening,” Sachdeva said.