Federated Learning Project Will Train AI to Detect Brain Tumors Early

According to the National Brain Tumor Society, “An estimated 700,000 Americans are living with a brain tumor. 69.8% of brain tumors are benign, and 30.2% of brain tumors are malignant. An estimated 87,240 people will receive a primary brain tumor diagnosis in 2020. The average survival rate for all malignant brain tumor patients is only 36%.”

(Image courtesy of Vincent Florent and the French Institute of Health and Medical Research.)

Picture A is a preoperative MRI image of a 2-year-old boy with a brain tumor, and picture B is a postoperative MRI scan showing a small enhanced tumor remnant, which was then treated with adjuvant chemotherapy. The American Brain Tumor Association (ABTA) states that 4,600 children will be diagnosed with brain tumors in 2020.

Intel Labs recently announced a major effort to apply federated learning to the detection of brain tumors. In cooperation with the Perelman School of Medicine at the University of Pennsylvania (Penn Medicine), Intel Labs is joining forces with 29 research and health care institutions to improve brain tumor detection using federated learning, among other machine learning techniques. Training models for this task requires a massive amount of data.

Federated learning is currently used by several entities in major industries that require the highest levels of privacy and security, including pharmaceuticals, telecommunications and defense. The technique trains a shared algorithm across many distributed servers and edge devices that collect and hold data samples, without exchanging those samples between them. This privacy-centric approach is a far cry from the highly centralized approach of most machine learning techniques, which gather all of the training data in one place.
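To make the idea concrete, here is a minimal sketch of one widely used federated scheme, federated averaging, written in plain Python. It is an illustration only, not the method used by Intel Labs or Penn Medicine: the toy model (a line y = w*x + b), the function names, the learning rate and the three synthetic "sites" are all assumptions made for the example. Each site trains on its own data, and only the resulting model weights are sent back and averaged.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for the line y = w*x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(site_weights, site_sizes):
    """Average the site models, weighting each by its number of samples."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(wts[i] * size for wts, size in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


def train_federated(sites, rounds=100):
    """Coordinate training; only weights, never raw samples, are combined."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


def make_demo_sites(seed=0):
    """Hypothetical data: three sites whose samples all follow y = 2x - 1."""
    rng = random.Random(seed)
    sites = []
    for n in (40, 60, 100):
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        sites.append([(x, 2.0 * x - 1.0) for x in xs])
    return sites
```

Calling `train_federated(make_demo_sites())` should return weights close to `[2.0, -1.0]`, even though the coordinating loop only ever sees the weights returned by each site. A real deployment would add secure communication and a far larger model.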

(Image courtesy of Intel.)

Intel Labs and Penn Medicine first brought federated learning to light in this area with the publication and presentation of a 2018 research paper called “Multi-institutional Deep Learning Modeling Without Sharing Patient Data: A Feasibility Study on Brain Tumor Segmentation.” The paper describes the importance of semantic segmentation of images from large datasets when creating deep learning models. It also addresses the complicated legal, technical and privacy-oriented challenges of creating such a dataset, emphasizing that federated learning could address these issues with a large-scale collaboration and proper expertise. The paper also reports that a federated learning approach could “train a model to over 99% of the accuracy of a model trained in the traditional, non-private method.”

To address the challenges of verifying the efficacy of federated learning in detection of brain tumors, Penn Medicine created an ongoing initiative called the Brain Tumor Segmentation (BraTS) Challenge. Semantic segmentation is a form of image segmentation. The technique groups the parts of an image that belong to the same object class using pixel-level prediction, in which each pixel in an image receives a specific categorical classification.
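Pixel-level prediction can be sketched in a few lines of Python. The example below assumes a model has already produced a score for each class at every pixel; the two class names, the tiny 2x3 score map and the helper names are hypothetical and chosen only for illustration. Each pixel is assigned the class with the highest score, and pixels that share a class form a segment.

```python
LABELS = ("background", "tumor")  # hypothetical classes for this sketch


def predict_pixel(scores):
    """Return the index of the highest-scoring class for one pixel."""
    return max(range(len(scores)), key=lambda k: scores[k])


def segment(score_map):
    """Turn a grid of per-pixel class scores into a grid of class labels."""
    return [[predict_pixel(scores) for scores in row] for row in score_map]


def pixels_with_label(label_map, label):
    """List the (row, column) positions of every pixel assigned to a class."""
    return [
        (r, c)
        for r, row in enumerate(label_map)
        for c, value in enumerate(row)
        if value == label
    ]


# A made-up 2x3 image: each pixel holds [background score, tumor score].
demo_scores = [
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],
    [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]],
]

label_map = segment(demo_scores)
tumor_pixels = pixels_with_label(label_map, LABELS.index("tumor"))
```

Here `label_map` is `[[0, 0, 0], [0, 1, 1]]`, so the two pixels at the lower right make up the "tumor" segment.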

BraTS 2020 will evaluate segmentation of brain tumors in multimodal magnetic resonance imaging (MRI) scans in four phases:

  1. Using preoperative MRI scans to segment heterogeneous brain tumors, specifically gliomas.
  2. Predicting overall survival of patients.
  3. Distinguishing tumor pseudoprogression (an increase in lesion size following treatment) from true tumor recurrence through machine learning algorithms and “integrative analyses of radiomic features.”
  4. Evaluating the algorithmic uncertainty of brain tumor segmentation.
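
As a rough illustration of what uncertainty in a segmentation can mean, the sketch below scores each pixel by the Shannon entropy of its predicted class probabilities: a pixel split 50/50 between two classes is maximally uncertain, while a pixel predicted with near certainty scores close to zero. This is one common way to express per-pixel uncertainty, not the scoring method used by BraTS; the function names and the 0.8-bit threshold are assumptions made for the example.

```python
import math


def pixel_entropy(probs):
    """Shannon entropy, in bits, of one pixel's class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def uncertainty_map(prob_map):
    """Compute the entropy of every pixel in a grid of class probabilities."""
    return [[pixel_entropy(probs) for probs in row] for row in prob_map]


def flag_uncertain(prob_map, threshold=0.8):
    """Mark pixels whose entropy exceeds a (hypothetical) review threshold."""
    return [
        [pixel_entropy(probs) > threshold for probs in row]
        for row in prob_map
    ]
```

For example, `flag_uncertain([[[0.5, 0.5], [0.99, 0.01]]])` marks the first pixel for review and leaves the second alone.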

Bottom Line

The consortium includes institutions from India, Switzerland, Germany, Canada, the United States, the UK and the Netherlands. Penn Medicine, Intel Labs and the 29 international health care and research institutions will create a state-of-the-art AI model trained on the largest brain tumor dataset to date. By using federated learning, the group hopes to ensure that all data used to train the model will remain private and local.