Brain Disease Diagnosis Using Federated Deep Learning
-----------------------------------------------------

Description
-----------

This study proposes a Federated Learning (FL) framework for training a model collaboratively across decentralized institutions without sharing raw data. A hybrid deep learning model combining Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) is developed to detect glioblastoma brain tumors and predict MGMT gene expression from MRI scans in the BraTS 2021 dataset. The model's hyperparameters are tuned with two algorithms: the Bayesian Search Optimization Algorithm and the Sparrow Search Optimization Algorithm, a recent swarm intelligence method.

The FL framework was deployed across ten institutions and achieved performance comparable to models trained on centralized data. The proposed model, BrainGeneDeepNet, attained an accuracy of 0.9758, a loss of 0.0769, an AUC of 0.9980, a recall of 0.9770, and a precision of 0.9782. These results demonstrate the viability of Federated Learning for secure, collaborative medical imaging analysis and biomarker prediction.

Dataset Information
-------------------

Name: BraTS 2021 Task 1 Dataset
Source: https://www.kaggle.com/datasets/dschettler8845/brats-2021-task1
Description: The dataset comprises a large collection of multi-institutional, routinely acquired clinical multi-parametric MRI (mpMRI) scans of glioma, with pathologically confirmed diagnoses and available MGMT promoter methylation status. These scans serve as training, validation, and testing data for the BraTS challenge. For Task 1, the dataset has been extended with many more routinely acquired clinical mpMRI scans since BraTS'20. Ground truth annotations of the tumor sub-regions are created and approved by expert neuroradiologists for every subject, so that predicted tumor segmentations can be evaluated quantitatively.

Code Information
----------------

This repository contains the code for the Federated Deep Learning framework for brain disease diagnosis. The main components are:

- DisplayNIFITI.ipynb: Jupyter notebook for displaying NIfTI images.
- FedOptAggregationAlgo.py: Python script implementing the Federated Optimization (FedOpt) aggregation algorithm for use with the OpenFL framework.
- MGMTDATA.ipynb: Jupyter notebook for processing and analyzing MGMT-related data.
- MGMTModel.ipynb: Jupyter notebook for developing and training the deep learning model for MGMT prediction.
- nii_reader.py: Python script for reading and processing NIfTI files.
- NormlizedNIFITI.ipynb: Jupyter notebook for normalizing NIfTI images.

Usage Instructions
------------------

1. Data Preparation: Download the NIfTI images from the BraTS 2021 Task 1 dataset. Preprocess them, including normalization, with the NormlizedNIFITI.ipynb notebook and the nii_reader.py script.
2. Data Analysis: Use MGMTDATA.ipynb for initial data exploration and analysis related to MGMT gene expression.
3. Model Training: Develop and train the deep learning model with MGMTModel.ipynb. The model combines CNNs and RNNs for tumor detection and MGMT gene expression prediction.
4. Federated Learning Deployment: The FedOptAggregationAlgo.py script, integrated with the OpenFL framework, handles the federated aggregation of models trained across decentralized institutions.

Requirements
------------

The project relies on deep learning, federated learning, and swarm intelligence algorithms. Key dependencies include:

- Python 3.x
- OpenFL framework (openfl)
- TensorFlow / Keras
- NumPy
- pandas
- Matplotlib
- Seaborn
- nibabel
- torchio (for NIfTI normalization)
- pydicom
- scikit-learn

Specific library versions may be required for compatibility and best performance.

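As an illustration of the normalization mentioned under Data Preparation, the sketch below z-scores the non-background voxels of a scan. It is not taken from NormlizedNIFITI.ipynb or from torchio. Z-score normalization is an assumption about what "normalization" means here, and the function name `znormalize_foreground`, the zero background value, and the flattened list input are all hypothetical. Plain Python lists keep the example dependency-free; on real volumes the same arithmetic would normally be applied to a NumPy array loaded with nibabel.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def znormalize_foreground(voxels: Sequence[float], background: float = 0.0) -> List[float]:
    """Z-score the non-background voxels and leave background voxels unchanged.

    Hypothetical helper for illustration only; not the repository's code.
    """
    foreground = [v for v in voxels if v != background]
    if not foreground:
        # Empty or all-background input: nothing to normalize.
        return list(voxels)

    mean = fmean(foreground)
    std = pstdev(foreground)  # population standard deviation of the foreground
    if std == 0.0:
        std = 1.0  # constant foreground: avoid division by zero

    return [v if v == background else (v - mean) / std for v in voxels]


# Toy "volume" flattened to a list; a real scan holds millions of voxels.
flat_volume = [0.0, 0.0, 10.0, 20.0, 30.0]
normalized = znormalize_foreground(flat_volume)
print(normalized)
```

Restricting the statistics to foreground voxels keeps the large zero-valued background of a skull-stripped scan from dominating the mean and standard deviation.
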
Methodology
-----------

1. Preprocessing: Raw NIfTI images from the BraTS 2021 Task 1 dataset are processed and normalized using NormlizedNIFITI.ipynb and the helper functions in nii_reader.py.
2. Exploration: Exploratory data analysis and visualization are performed using MGMTDATA.ipynb and DisplayNIFITI.ipynb.
3. Model development: A hybrid deep learning model, BrainGeneDeepNet, combining CNNs and RNNs, is developed in MGMTModel.ipynb to detect glioblastoma brain tumors and predict MGMT gene expression.
4. Hyperparameter optimization: Hyperparameters are tuned using the Bayesian Search Optimization Algorithm and the Sparrow Search Optimization Algorithm.
5. Federated deployment: The model is deployed within a Federated Learning (FL) framework using FedOptAggregationAlgo.py and OpenFL, enabling collaborative training across multiple institutions without direct data sharing.
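
To make the federated aggregation step more concrete, here is a minimal sketch of one server-side round in the FedOpt style. The server takes the sample-weighted average of the client updates as a pseudo-gradient and applies it with a server learning rate. This is not the contents of FedOptAggregationAlgo.py and it does not use the OpenFL API. The function `aggregate_round`, the `Weights` layout, and the plain SGD server step are assumptions made for illustration; adaptive FedOpt variants replace that step with an optimizer such as Adam.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: layer name -> flattened list of parameters.
Weights = Dict[str, List[float]]


def aggregate_round(
    global_weights: Weights,
    client_weights: Sequence[Weights],
    client_sizes: Sequence[int],
    server_lr: float = 1.0,
) -> Weights:
    """Apply one FedOpt-style server update with a plain SGD server step.

    With server_lr = 1.0 this reduces to sample-weighted federated averaging.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of client samples must be positive")

    new_weights: Weights = {}
    for name, current in global_weights.items():
        # Sample-weighted mean of the client updates (client minus global).
        avg_delta = [0.0] * len(current)
        for weights, size in zip(client_weights, client_sizes):
            share = size / total
            for i, value in enumerate(weights[name]):
                avg_delta[i] += share * (value - current[i])
        # Server step: move the global model along the averaged update.
        new_weights[name] = [w + server_lr * d for w, d in zip(current, avg_delta)]
    return new_weights


# Two toy "institutions" holding 100 and 300 training samples.
global_model = {"dense": [0.0, 0.0]}
clients = [{"dense": [1.0, 2.0]}, {"dense": [3.0, 6.0]}]
updated = aggregate_round(global_model, clients, client_sizes=[100, 300])
print(updated)
```

Only model parameters and sample counts reach the server in this scheme; the MRI scans themselves stay at each institution.
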