Using Machine Learning to Predict Metallic Defects

Dominant defect type predictions from the r-MART model for 946 B2-type intermetallics. Colors indicate the relationship between predictions and calculations, as shown in the legend. (Image courtesy of Bharat Medasani, Berkeley Lab/PNNL.)
For the first time, researchers at the Lawrence Berkeley National Laboratory (Berkeley Lab) have built and trained machine learning algorithms to predict defect behavior in certain intermetallic compounds with high accuracy. The researchers believe this method will accelerate research into new advanced alloys and lightweight materials for applications ranging from automotive to aerospace.

Their results were published in the journal npj Computational Materials.

Materials are never chemically pure and structurally flawless. They almost always contain defects, which play an important role in dictating their properties. These defects may appear as vacancies, which are essentially 'holes' in the substance's crystal structure, or as antisite defects, which are atoms sitting on the wrong crystal site. Understanding such point defects is crucial for researchers designing materials because they can have a dramatic effect on long-term structural stability and strength.

Traditionally, researchers have used a computational quantum mechanical method known as density functional calculations to predict what kinds of defects can be formed in a given structure and how they affect the material's properties. Although effective, this approach is very computationally expensive to execute for point defects, limiting the scope of such investigations.

"Density functional calculations work well if you are modeling one small unit, but if you want to make your modeling cell bigger the computational power required to do this increases substantially," said Bharat Medasani, a former Berkeley Lab postdoc and lead author of the paper. "And because it is computationally expensive to model defects in a single material, doing this kind of brute force modeling for tens of thousands of materials is not feasible."

To overcome these computing challenges, Medasani and his colleagues developed and trained machine learning algorithms to predict point defects in intermetallic compounds, focusing on the widely observed B2 crystal structure. Initially, they selected a sample of 100 of these compounds from the Materials Project Database and ran density functional calculations on supercomputers at the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility at Berkeley Lab, to identify their defects.

Because they had a small data sample to work from, Medasani and his team used an ensemble approach called gradient boosting to bring their machine learning method to high accuracy. In this approach, additional machine learning models were built successively and combined with prior models to minimize the difference between the models’ predictions and the results from density functional calculations.
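To make the idea of building models successively more concrete, here is a minimal sketch of gradient boosting for a squared-error objective, written in plain Python. It is not the team's r-MART model: the weak learners are simple one-feature threshold splits ("stumps"), the data are synthetic stand-ins for calculated reference values, and every function name is hypothetical. Each round fits a new stump to the errors left by the models before it, then adds a damped copy of that stump to the ensemble.

```python
# Minimal gradient boosting sketch (squared-error loss, one input feature).
# Assumption: the data are synthetic and all names are illustrative only.

def fit_stump(xs, residuals):
    """Least-squares threshold split on x; needs at least two distinct x values."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]  # (threshold, left value, right value)

def predict_stump(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x <= t else right_mean

def fit_gradient_boosting(xs, ys, n_rounds=100, learning_rate=0.1):
    base = sum(ys) / len(ys)          # start from a constant prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic "reference" data standing in for expensive calculations.
train_x = [i / 10 for i in range(21)]
train_y = [x ** 2 for x in train_x]

baseline = fit_gradient_boosting(train_x, train_y, n_rounds=0)
model = fit_gradient_boosting(train_x, train_y, n_rounds=100)
print("baseline MSE:", mean_squared_error(baseline, train_x, train_y))
print("boosted MSE: ", mean_squared_error(model, train_x, train_y))
```

The learning rate damps each new stump's contribution, so no single weak learner dominates. A real study would use many descriptors per compound and evaluate on compounds held out from training, neither of which this toy example attempts.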

"This work is essentially a proof of concept. It shows that we can run density functional calculations for a few hundred materials, then train machine learning algorithms to accurately predict point defects for a much larger group of materials," said Medasani, who is now a postdoctoral researcher at the Pacific Northwest National Laboratory.

"The benefit of this work is now we have a computationally inexpensive machine learning approach that can quickly and accurately predict point defects in new intermetallic materials " said Andrew Canning, a Berkeley Lab computational scientist and co-author of the paper. "We no longer have to run very costly first principle calculations to identify defect properties for every new metallic compound."

"This tool enables us to predict metallic defects faster and robustly, which will in turn accelerate materials design," added Kristin Persson, a Berkeley Lab scientist and director of the Materials Project, an initiative aimed at drastically reducing the time needed to invent new materials by providing open, web-based access to computed information on known and predicted materials.

As an extension of this work, an open-source Python toolkit for modeling point defects in semiconductors and insulators (PyCDT) has been developed.
