Tesla AI Day 2022: “Tesla Is Not a Car Company”

Tesla has been pushing forward with its robot concept since unveiling the design last year. (Source: Tesla.)

Tesla held its second annual AI Day late last month, where CEO Elon Musk and a team of engineers showcased the company’s work on artificial intelligence (AI), robotics, supercomputing and autonomous driving.

The event raised the question of whether Tesla should still be considered a car company—or something more diversified and ambitious. Musk’s answer: “Tesla is not a car company.”

Let’s take a closer look at Tesla’s work in the AI field over the past year—and consider what the future could hold for the company and for all of us.

Optimus Robot

Tesla’s robot grabbed most of the headlines from the event. The company unveiled a working prototype called Bumble C, which performed rudimentary movements: it waved, flexed at the waist and pumped its arms upward. Musk claimed that the robot could do much more but said the team played it safe with the demonstration to keep the robot from falling over.

Optimus will consume about 100 watts when idle, rising to about 500 watts when walking briskly. It weighs about 161 pounds and will have a humanlike, though more limited, range of motion. While the human body has more than 200 degrees of freedom, including 27 in each hand, Optimus will have 28 degrees of freedom in its body and 11 in its hands, enough to allow it to handle tools.

Optimus’ actuators highlighted in orange, with its electrical components shown in blue. (Source: Tesla.)

Optimus uses 28 structural actuators designed and manufactured in-house by Tesla: tightly packaged motor-and-gear assemblies that replicate the work of human muscles, moving the robot’s torso, arms, legs and fingers. Tesla also designed and built Optimus’ battery pack and control system. Housed in the robot’s torso, the battery holds 2.3 kWh, which Tesla anticipates is enough for a full day’s work, and all of the battery electronics are contained on a single circuit board.
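Those numbers permit a quick back-of-the-envelope check on the full-day claim. The Python sketch below runs the arithmetic for a few walking/idle duty cycles; the duty cycles themselves are illustrative assumptions, not Tesla figures.

```python
# Rough runtime estimates from the stated specs: a 2.3 kWh pack,
# ~100 W draw at idle and ~500 W while walking briskly.
BATTERY_WH = 2300
IDLE_W, WALK_W = 100, 500

# Assumed duty cycles: all idle, an even split, and all walking.
for walk_fraction in (0.0, 0.5, 1.0):
    avg_watts = walk_fraction * WALK_W + (1 - walk_fraction) * IDLE_W
    print(f"{walk_fraction:.0%} walking -> {BATTERY_WH / avg_watts:.1f} hours")
```

At a 50/50 split this works out to roughly 7.7 hours on one charge, which is at least consistent with the full-shift claim.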

Also found in the torso is the robot’s computer brain, which adapts both the hardware and the software of Tesla’s Full Self Driving (FSD) system. The FSD neural networks run much as they do in a vehicle but have been retrained for the robot’s tasks. As in a vehicle, data from Optimus’ sensors is fed into neural networks that determine the robot’s behavior.

Tesla is leveraging its FSD technology to help Optimus navigate autonomously. (Source: Tesla.)
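As a loose illustration of that shared-stack idea, the sketch below runs one network architecture with two different sets of weights, one standing in for the car and one for the robot. Tesla has not published these interfaces; the SharedPolicy class, its shapes and its random “weights” are all hypothetical.

```python
# Toy sketch of a shared perception-to-action stack: the same network
# architecture serves car and robot; only the (re)trained weights differ.
import numpy as np

class SharedPolicy:
    def __init__(self, n_inputs=16, n_actions=4, seed=0):
        # Random weights stand in for training; per Tesla's description,
        # the robot's network is the FSD net retrained on robot data.
        self.w = np.random.default_rng(seed).normal(size=(n_inputs, n_actions))

    def act(self, sensor_frame):
        # Sensor data in, behavior out, for both platforms.
        return int(np.argmax(sensor_frame @ self.w))

car = SharedPolicy(seed=1)      # weights "trained" on driving data
optimus = SharedPolicy(seed=2)  # same architecture, robot-specific weights
frame = np.random.default_rng(3).normal(size=16)
print(car.act(frame), optimus.act(frame))
```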

Tesla’s approach to Optimus is the same as its approach to vehicles: a focus on mass manufacturing, so that the robots can be produced in high volume, at low cost and with high reliability. Unlike Boston Dynamics’ Atlas, which is a research platform, Optimus is meant to be a commercially viable product. Design and production are therefore optimized for cost and efficiency, which means reducing the number of parts and the power consumption wherever possible. For example, Optimus will use only six distinct actuator designs, each optimized for cost, ease of manufacturing and functional efficiency. Musk said that Tesla is aiming to “produce the robot as quickly as possible and have it be useful as quickly as possible.”

Musk estimated that Optimus could be commercially available within three to five years.

Full Self Driving

While it makes sense for Tesla to adapt its autonomous driving technology to its robot, that technology has come under scrutiny, and Tesla’s engineers seem well aware that the FSD platform needs improvement. After all, Optimus will rely on AI for everything it does, whereas a Tesla vehicle has a human driver who can take the wheel if difficulties arise.

Tesla used AI Day 2022 to showcase the improvements it has made to FSD, both for Optimus and for the Tesla fleet. In 2021, about 2,000 customers had FSD deployed in their vehicles; since then, that number has grown to 160,000, and the company has put extensive work into upgrading the software’s robustness and capability. “The FSD Beta software is quite capable of driving the car,” said Ashok Elluswamy, Tesla’s director of Autopilot Software. “It should be able to navigate from parking lot to parking lot, handling city street driving, stopping for traffic lights and stop signs, negotiating with objects at intersections, making turns and so on.”

Tesla has expanded its FSD training infrastructure by 40 to 50 percent in the past year and has put work into its AI compiler, significantly boosting the capability of the neural networks in each car, according to the company.

FSD’s decision-making has been sped up from considering options in milliseconds to doing so in 100 microseconds. The system has an improved occupancy network, the model it builds of the physical world in 3D. The network takes the video streams of all eight cameras as input and produces a single, unified volumetric occupancy model, and it can predict the probability that objects are hidden behind other objects. For each location it navigates, the system generates semantic labels such as curb, pedestrian, road debris and vehicle. It also predicts the motion and flow of objects, even erratic motion such as another vehicle losing control.
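Tesla hasn’t published the occupancy network’s architecture, but the general shape of the idea (multiple camera streams encoded, fused and lifted into a voxel grid, with occupancy, semantic and flow heads) can be sketched in a few lines of PyTorch. Every name, every shape and the simplistic learned “lift” below are assumptions for illustration only.

```python
# Toy occupancy-style network: 8 camera streams in, one voxel grid out.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyOccupancyNet(nn.Module):
    def __init__(self, n_cams=8, feat=32, depth=16, side=64, n_classes=4):
        super().__init__()
        self.side = side
        # Shared per-camera image encoder (stand-in for Tesla's backbones).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Naive learned "lift" of fused camera features into a depth axis.
        self.lift = nn.Linear(n_cams * feat, depth)
        # Per-voxel heads: occupancy probability, semantic class, 3D flow.
        self.occ_head = nn.Conv3d(1, 1, 1)
        self.sem_head = nn.Conv3d(1, n_classes, 1)
        self.flow_head = nn.Conv3d(1, 3, 1)

    def forward(self, images):                      # (B, n_cams, 3, H, W)
        b = images.shape[0]
        feats = self.encoder(images.flatten(0, 1))  # (B*n, feat, H/4, W/4)
        feats = F.adaptive_avg_pool2d(feats, self.side)
        feats = feats.reshape(b, -1, self.side, self.side)  # fuse cameras
        vox = self.lift(feats.permute(0, 2, 3, 1))  # (B, S, S, depth)
        vox = vox.permute(0, 3, 1, 2).unsqueeze(1)  # (B, 1, depth, S, S)
        occ = torch.sigmoid(self.occ_head(vox))     # occupancy probability
        return occ, self.sem_head(vox), self.flow_head(vox)

net = ToyOccupancyNet()
occ, sem, flow = net(torch.rand(1, 8, 3, 256, 256))
print(occ.shape, sem.shape, flow.shape)  # voxel-aligned outputs
```

The production system is far more sophisticated, but the input/output contract is the useful takeaway: eight camera streams in, one volumetric grid of occupancy, semantics and motion out.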

Tesla demonstrated the improvements in FSD with a case study involving an oddly positioned car at an intersection. Previous FSD iterations would have labeled that car as parked, though it was actually waiting to cross the intersection.

Tesla uses massive amounts of data to better train FSD. (Source: Tesla.)

Tesla’s engineers built a tool to identify mispredicted objects in videos of scenarios like the one above. The tool corrects the label and adds the video clip to an evaluation set. For this scenario alone, Tesla built an evaluation set of 126 test videos and enhanced it with data mined and boosted from 13,900 other videos. That data was then used to train a solution. The result: FSD no longer predicts the crossing vehicle as parked but correctly identifies it as waiting to cross.

This process is applied to every challenge the engineers identify: parked vehicles at turns, large vehicles such as buses, traffic on curvy roads, parking lots and other situations. Tesla can do this at scale because it has an entire fleet of vehicles feeding data into the model, and the fix derived from one vehicle’s problem can be rolled out to every FSD-enabled Tesla on the road without changing the architecture of the model itself.
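As a rough sketch, the closed loop Tesla describes (flag mispredictions, correct the labels, grow an evaluation set, mine similar fleet clips and retrain) might look something like the toy Python below. The Clip fields, the exact-match “mining” rule and the model itself are all stand-ins, not Tesla’s implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Clip:
    signature: str   # stand-in for video features
    true_label: str  # corrected label, e.g. "waiting_to_cross"

class ToyModel:
    def __init__(self):
        self.rules = {}                                  # signature -> label
    def predict(self, clip):
        return self.rules.get(clip.signature, "parked")  # naive default
    def train(self, clips):
        for c in clips:
            self.rules[c.signature] = c.true_label

def data_engine_iteration(model, fleet_clips, eval_set, train_set):
    # 1. Flag clips the current model mispredicts (cf. Tesla's labeling tool).
    misses = [c for c in fleet_clips if model.predict(c) != c.true_label]
    # 2. Corrected clips join the evaluation set (cf. the 126 test videos).
    eval_set.extend(misses)
    # 3. Mine similar fleet clips to boost the training data
    #    (cf. the 13,900 mined videos); "similar" here is exact-match.
    miss_sigs = {m.signature for m in misses}
    train_set.extend(c for c in fleet_clips if c.signature in miss_sigs)
    # 4. Retrain, then score against the evaluation set.
    model.train(train_set)
    return sum(model.predict(c) == c.true_label
               for c in eval_set) / max(len(eval_set), 1)

fleet = [Clip("crossing_but_stopped", "waiting_to_cross")]
print(data_engine_iteration(ToyModel(), fleet, [], []))  # 1.0 after retraining
```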

“A biological neural net with two cameras on a slow gimbal [in other words, a human being] can drive a semitruck,” said Musk. “If you’ve got eight cameras with continuous 360-degree vision operating at a higher frame rate and a much higher reaction rate, I think it’s obvious it should be able to drive a semi—or any vehicle—much better than a human.”

The improvements have Tesla confident that FSD will enable a vehicle to navigate the road safely and efficiently, and that it will enable Optimus to navigate a home or workplace with equal ease.

Dojo Supercomputer

Enhancing FSD’s neural networks will require massive computing power. That’s where Dojo, the company’s supercomputing platform, comes in. Dojo will be a crucial component in Tesla’s pursuit of creating Level 4 autonomous driving in the near future and Level 5 autonomous vehicles in the long run.

At last year’s AI Day, Tesla showcased two components, its D1 chip and its training tile, and hinted that its end goal was an ExaPOD: a supercomputer delivering one exaflop of compute. This year, the company provided more details about Dojo.

The system tray is a key part of Tesla’s vision of Dojo as a single giant accelerator. It connects six training tiles, both within a cabinet and between cabinets, enabling uniform, high-bandwidth communication among them. The tray also offers high-speed connectivity and dense integration (each one is 75 millimeters tall, roughly the width of a cell phone, and weighs 135 kg) and delivers 54 petaflops of compute.

The interface processor feeds data to the training tiles. It offers 800 GB per second of total memory bandwidth and ingests data into the tiles at full bandwidth via the company’s custom Tesla Transport Protocol (TTP), which is used for communication across the entire accelerator. The processor also supports high-speed Ethernet, which lets TTP extend over standard Ethernet, and it connects to the host through a standard Gen 4 PCIe interface. Tesla provides native hardware support for all of this with minimal software overhead.

Tesla pairs 20 of these interface processor cards with each system tray, resulting in 640 GB of high-bandwidth DRAM that serves as a disaggregated memory layer for the training tiles. The host interface is integrated directly under the trays and features 512 x86 cores and eight terabytes of memory in total. The hosts handle ingest processing and connect to the interface processors through PCIe; they also provide hardware video-decoder support for video-based training. User applications on the hosts run in a standard x86 Linux environment.
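A quick cross-check ties these numbers together. The 9-petaflop-per-tile figure comes from Tesla’s 2021 presentation (BF16/CFP8 precision) and is an outside assumption here; the per-card memory simply falls out of the totals quoted above.

```python
# Sanity-checking the quoted tray-level specs.
PFLOPS_PER_TILE = 9   # per training tile, per Tesla's 2021 AI Day (assumed)
TILES_PER_TRAY = 6
print(TILES_PER_TRAY * PFLOPS_PER_TILE, "petaflops per tray")  # 54, as quoted

CARDS_PER_TRAY = 20
TRAY_DRAM_GB = 640
print(TRAY_DRAM_GB // CARDS_PER_TRAY, "GB of DRAM per interface card")  # 32
```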

Two of these assemblies of tiles, trays and host go into a single cabinet, paired with redundant power supplies that convert 3-phase 480-volt AC power directly to 52-volt DC.

“By focusing on density at every level, we can realize the vision of a single accelerator,” said Bill Chang, principal system engineer for Tesla’s Autopilot. The uniform nodes on Tesla’s custom D1 dies are connected into a fully integrated training tile, and these tiles can be seamlessly connected to each other within cabinets and across cabinet boundaries to form the Dojo accelerator.

Tesla is accelerating the development of its own supercomputer. (Source: Tesla.)

Two full Dojo accelerators can be housed in a single ExaPOD, providing one exaflop of compute, 1.3 TB of high-speed SRAM and 13 TB of high-bandwidth DRAM. Tesla plans to house seven ExaPODs at its Palo Alto, Calif., facility, and it already operates a GPU-based training platform built on some 14,000 GPUs. Tesla doesn’t plan to sell the ExaPODs themselves, though it may sell compute time on Dojo in a way similar to how Amazon Web Services operates: “just have it be a service that you can use that’s available online and where you can train your models way faster and for less money,” said Musk.

Tesla’s Future May Not Be in Vehicles

The second annual Tesla AI Day showcased an impressive amount of progress since last year’s presentation while clearly positioning Tesla for a future that isn’t only about its electric vehicles.

“I think the mission effectively does somewhat broaden with the advent of Optimus—to make the future awesome,” Musk said. While he does have a history of making aspirational promises, the work of his engineering team signals an interesting future for the company.