NVIDIA Goes Deep, Extends GPU Hardware and Software for Deep Learning

At the GTC conference in San Jose, NVIDIA announced a series of updates to its GPU-powered hardware and software. Taken together, the improvements promise a 10X performance increase for data scientists over the previous generation, according to the company.

These include doubling the memory in the NVIDIA Tesla V100, offering a GPU interconnect fabric called NVIDIA NVSwitch, updating the company’s software stack, and introducing the world’s first two-petaflop deep learning system, the NVIDIA DGX-2.

The NVIDIA DGX-2, a two-petaflop deep learning system that will be available in Q3, draws 10,000 watts of power and weighs 350 lbs. (Image courtesy of NVIDIA.)

First, with memory doubled to 32GB in NVIDIA’s most powerful datacenter GPU, the Tesla V100 can support deeper, larger, and more accurate deep learning models. The new Tesla V100 32GB GPU is available immediately across all of NVIDIA’s DGX systems. In a statement, computer manufacturers Cray, Hewlett Packard Enterprise, IBM, Lenovo, Supermicro and Tyan confirmed that they will offer the new 32GB GPU in their products beginning next quarter. Oracle Cloud Infrastructure will also offer it in the cloud later this year.
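To see why doubling GPU memory matters for model size, consider a rough back-of-the-envelope sketch of how much memory a model's parameters alone consume. The parameter counts and precisions below are illustrative assumptions, not figures from NVIDIA:

```python
# Rough sketch: memory footprint of model parameters at different precisions.
# Parameter counts here are illustrative assumptions, not from the article.

def param_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the memory in GB needed just to store the parameters."""
    return num_params * bytes_per_param / 1024**3

billion = 10**9
for n_params in (1 * billion, 4 * billion):
    fp32 = param_memory_gb(n_params, 4)  # single precision: 4 bytes/param
    fp16 = param_memory_gb(n_params, 2)  # half precision: 2 bytes/param
    print(f"{n_params // billion}B params: "
          f"{fp32:.1f} GB (FP32), {fp16:.1f} GB (FP16)")
```

Activations, gradients, and optimizer state typically multiply this footprint several times over during training, which is why the jump from 16GB to 32GB per GPU directly enables deeper and larger models.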

Next, building on its original NVLink interconnect, NVIDIA introduced NVSwitch, a GPU interconnect fabric that can link up to 16 Tesla V100 GPUs in a single server node. The resulting increase in GPU-to-GPU bandwidth translates into more capacity for bigger and more demanding workloads.
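One reason that GPU-to-GPU bandwidth matters so much: in data-parallel training, gradients must be combined across all GPUs on every step, commonly via an all-reduce over the interconnect. The following is a pure-Python teaching sketch of a ring all-reduce; production libraries such as NCCL implement far more optimized variants on top of NVLink/NVSwitch, so this is illustrative only:

```python
# Illustrative ring all-reduce: each "GPU" holds a gradient vector; after a
# reduce-scatter phase and an all-gather phase, every node holds the
# element-wise sum. A teaching sketch, not how NCCL actually works.

def ring_allreduce(grads):
    """grads: list of equal-length lists, one per node.
    Returns a new list where every node holds the element-wise sum."""
    n = len(grads)                       # number of nodes ("GPUs") in the ring
    size = len(grads[0])
    chunks = [list(g) for g in grads]    # working copies, one buffer per node
    # Partition indices into n contiguous chunks
    bounds = [(i * size // n, (i + 1) * size // n) for i in range(n)]

    # Reduce-scatter: after n-1 steps, node i holds the complete sum
    # for chunk (i+1) % n
    for step in range(n - 1):
        for i in range(n):
            dst = (i + 1) % n
            lo, hi = bounds[(i - step) % n]
            for k in range(lo, hi):
                chunks[dst][k] += chunks[i][k]

    # All-gather: circulate the completed chunks until every node has them all
    for step in range(n - 1):
        for i in range(n):
            dst = (i + 1) % n
            lo, hi = bounds[(i + 1 - step) % n]
            for k in range(lo, hi):
                chunks[dst][k] = chunks[i][k]
    return chunks
```

Each of the 2(n-1) steps moves only 1/n of the data per node, so the algorithm's cost is dominated by link bandwidth between neighbors, which is exactly what a switched NVLink fabric improves.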

Also aimed at developers is a fully optimized update to NVIDIA’s software stack, including new versions of NVIDIA CUDA, TensorRT, NCCL and cuDNN, plus a new Isaac software developer kit for robotics.

For its grand finale, NVIDIA presented the NVIDIA DGX-2, “the first single server capable of delivering two petaflops of computational power.” According to a press release, “DGX-2 has the deep learning processing power of 300 servers occupying 15 racks of datacenter space, while being 60x smaller and 18x more power efficient.” It achieves this by combining sixteen Tesla V100 32GB GPUs, linked by twelve NVSwitches, with other optimized components in a single server.

Jensen Huang, NVIDIA founder and CEO, said these developments “will help revolutionize healthcare, transportation, science exploration and countless other areas.”