Intel Reveals Neural Compute Engine in Movidius Myriad X

Intel has been on a tear lately. On August 21, the company hosted a Facebook Live session at its facility in Oregon, where it announced its 8th-generation family of processors.

Following this event, Intel made an announcement that could enable faster artificial intelligence processing capabilities in robotics, cameras, drones, virtual reality and mixed reality devices, as well as IoT devices.

The Movidius Myriad X is a vision processing unit (VPU) capable of more than 4 trillion operations per second (TOPS). Movidius was founded in 2005 to design and produce low-power processor chips for deep learning and computer vision tasks. (Image courtesy of Intel.)

Acquired by Intel in 2016, Movidius has produced ISAAC, the Myriad 1, the Myriad 2 and, most recently, Fathom, a USB stick carrying the Myriad 2 for use with Advanced RISC Machine (ARM) processors, which are found in drones, robotics, IoT devices and video surveillance systems. It costs just $79.

The Myriad X, which is being touted as the world’s first system on a chip (SoC) with a dedicated neural compute engine, is built to speed up the inferences made by deep learning programs at the network’s edge. In other words, the neural compute engine is low-power, on-chip hardware designed specifically to run deep neural networks. Such networks enable a broad array of devices and machines to “see” and react to changes in their environment in real time. Because the VPU can run neural networks directly on the chip, devices outfitted with the Myriad X can respond to visual input with greater autonomy.
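To make “running deep neural networks for inference” concrete, here is a minimal, illustrative sketch in plain NumPy (not the Myriad X toolchain or any Intel API) of the kind of multiply-accumulate-heavy computation a neural compute engine executes in hardware; the layer sizes and weights are arbitrary stand-ins:

```python
import numpy as np

def relu(x):
    # Standard rectified-linear activation used between layers.
    return np.maximum(x, 0.0)

def infer(image, weights):
    """Run a tiny feed-forward network over a flattened image.

    On an edge device, this loop of matrix-vector products is the
    workload that dedicated silicon accelerates, so inference happens
    on-chip instead of being shipped to a server.
    """
    activation = image.flatten()
    for w, b in weights:
        activation = relu(w @ activation + b)
    return activation

# Toy 8x8 "camera frame" and two randomly initialized layers
# (illustrative only -- a real network would be trained).
rng = np.random.default_rng(0)
frame = rng.random((8, 8))
weights = [(rng.standard_normal((16, 64)), np.zeros(16)),
           (rng.standard_normal((4, 16)), np.zeros(4))]
scores = infer(frame, weights)
print(scores.shape)  # four output scores, e.g. one per class
```

The point of doing this on-chip is that the pixel data never has to leave the device, which is what allows a drone or camera to react to what it sees without a round trip to the cloud.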

Highlights of the Movidius Myriad X VPU:

  • It has 16 programmable 128-bit very long instruction word (VLIW) vector processors optimized for computer vision workloads, making it capable of running multiple vision applications and imaging workflows at once.
  • It supports up to 8 HD-resolution RGB camera inputs connected through the Myriad X’s 16 Mobile Industry Processor Interface (MIPI) lanes, which together can handle 700 million pixels per second of image and signal processing throughput.
  • It has 20 hardware accelerators dedicated to tasks such as optical flow and stereo depth estimation, which run without consuming additional compute resources.
  • A centralized memory architecture with 2.5 MB of homogeneous on-chip memory delivers up to 450 GB per second of internal bandwidth, reducing latency and power consumption by minimizing off-chip data transfers.
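A quick back-of-the-envelope check puts the camera figures above in perspective. Using only the numbers quoted in this article (8 camera inputs sharing 700 million pixels per second), the per-camera budget works out to roughly 42 full-HD frames per second:

```python
# Illustrative arithmetic based solely on the figures quoted above.
total_throughput = 700e6      # 700 million pixels per second (quoted)
cameras = 8                   # up to 8 HD RGB camera inputs (quoted)

per_camera = total_throughput / cameras   # pixels/sec available per camera
pixels_1080p = 1920 * 1080                # pixels in one full-HD frame

fps_per_camera = per_camera / pixels_1080p
print(round(fps_per_camera, 1))  # ~42.2 fps per 1080p camera
```

In other words, the quoted throughput is enough to ingest all eight 1080p streams at better-than-30-fps rates simultaneously.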

As deep learning and computer vision become standard features of the billions of new devices expected to ship over the next five years, VPUs like the Intel Movidius Myriad X will become increasingly indispensable to the design and engineering of computing-based electronics, robotics and video surveillance systems.