Adding Bugs to Make Software Better

Large-scale automated vulnerability addition (LAVA) puts bug-detection software to the test. (Image courtesy of MIT.)
The software used to detect computer bugs is limited: potentially hundreds of bugs go undetected in any given program. This is due in part to our inability to measure how effective our bug-finding tools actually are.

Now, a counterintuitive approach is being used to solve this problem: engineers are intentionally adding bugs to software by the hundreds of thousands.

This process has been dubbed “large-scale automated vulnerability addition” (LAVA) by its creators.

Using this method, they have determined that many available bug finders detect a mere two percent of vulnerabilities.


How LAVA Works

The effectiveness of bug-finding programs is measured by two metrics: the false positive rate (Type I errors) and the false negative rate (Type II errors). These metrics are difficult to calculate because a tool may flag bugs that turn out not to be real (Type I) or miss vulnerabilities that are actually present (Type II), and without knowing the true number of bugs in a program there is no ground truth to measure against.
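
As a rough illustration, the sketch below (in C, with entirely hypothetical counts) shows how both rates become simple arithmetic once the true number of bugs is known, which is precisely the ground truth LAVA supplies:

```c
/* Minimal sketch with hypothetical numbers: computing the two metrics
 * when ground truth is known. With LAVA, the exact number of injected
 * bugs is known, so these rates can actually be calculated. */
#include <stdio.h>

int main(void) {
    int injected_bugs  = 1000;  /* bugs deliberately added (ground truth)   */
    int reported       = 150;   /* findings reported by the bug-finding tool */
    int true_positives = 20;    /* reports that match an injected bug        */

    int false_positives = reported - true_positives;       /* Type I errors  */
    int false_negatives = injected_bugs - true_positives;  /* Type II errors */

    printf("False positive rate: %.1f%% of reports\n",
           100.0 * false_positives / reported);
    printf("False negative rate: %.1f%% of real bugs missed\n",
           100.0 * false_negatives / injected_bugs);
    return 0;
}
```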

"The only way to evaluate a bug finder is to control the number of bugs in a program, which is exactly what we do with LAVA," said Brenden Dolan-Gavitt, an assistant professor at the NYU Tandon School of Engineering and co-creator of LAVA.

Using an automated system, his team seeds programs with synthetic vulnerabilities that resemble real, naturally occurring bugs.

This approach costs far less than designing custom vulnerabilities by hand. Because the injected bugs are realistic, inexpensive to produce and embedded in the program's normal control and data flow, the team could thoroughly study how bug-finding tools perform. To accumulate enough data, LAVA created thousands of novel bugs.
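
To illustrate the general idea, the hypothetical C snippet below sketches the kind of flaw such an injector might plant: a memory-corruption bug that sits in ordinary control and data flow but only fires when a specific input value reaches it. The function, variable names and trigger value here are invented for illustration; LAVA's real injection machinery is more involved.

```c
/* Illustrative sketch only: a trigger-guarded bug of the sort an automated
 * injector might plant. Not actual LAVA output; names and the magic value
 * are hypothetical. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

static void process_record(const char *data, size_t len) {
    char buf[16];
    uint32_t tag = 0;
    if (len >= 4)
        memcpy(&tag, data, 4);          /* value taken from normal input flow */

    /* Injected flaw: a specific input value bypasses the length check,
     * so only inputs carrying the "magic" tag reach the overflow. */
    if (len < sizeof(buf) || tag == 0x6c617661u /* "lava" */)
        memcpy(buf, data, len);         /* out-of-bounds write when triggered */

    printf("processed %zu bytes (tag=0x%08x)\n", len, tag);
}

int main(void) {
    const char benign[] = "hello";
    process_record(benign, sizeof(benign) - 1);  /* behaves normally */
    return 0;
}
```

Because the bug only manifests on inputs containing the trigger value, the injected program behaves normally on ordinary inputs, yet the injector knows exactly where the bug is and what input exposes it.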


Beyond Bug Testing

The end goal is to use LAVA to significantly improve the detection rates of bug-finding tools. In an open competition this summer, developers and other researchers will be able to request a LAVA-bugged version of a piece of software, attempt to find its bugs and receive a score based on their accuracy.

"There has never been a performance benchmark at this scale in this area, and now we have one," Dolan-Gavitt said. "Developers can compete for bragging rights on who has the highest success rate in bug finding, and the programs that will come out of the process could be stronger."

This is yet another example of how breaking something can actually lead to making it better. For one more, find out why engineers are intentionally crashing drones.