• 43007 Tarragona, Spain

Title of Project: Privacy-preserving FL pipeline for multi-institutional collaboration
Student Name: Faisal Ahmed

To develop ML/DL models for medical image analysis based on an FL approach by removing the main barriers to multi-institutional collaboration: (a) non-IID local data sampling; (b) differences in image acquisition protocols and labeling methodologies across institutions; and (c) privacy and security concerns. The resulting ML/DL models will be tested on radiological and pathological image analysis for BC relapse prediction. To reach these objectives, the following tasks will be performed:
(1) Review the learning framework procedures required to set up FL in multi-institutional collaborations (i.e., parallel training, institutional incremental learning (IIL), and cyclic institutional incremental learning (CIIL)), and select the most appropriate one for the purposes of the BosomShield project.
(2) For non-IID local data, define a mathematically grounded criterion for the optimal training iteration at which each user should start improving their individual model (personalization phase) from the shared model trained by all users (collaboration phase), so that individual model performance is uniform regardless of whether local data are IID or non-IID.
(3) Investigate privacy-preserving federated learning algorithms to ensure that medical/personal data cannot be reconstructed by the model manager or an external intruder, design security mechanisms to prevent model poisoning by malicious participants, and ensure that model accuracy does not suffer from the privacy and security defenses.
(4) Evaluate the performance of the developed FL model on the classification of molecular subtypes of BC developed by DC1 (URV). The publicly available INbreast (115 cases) and DDSM (1168 cases) datasets will be used to build the shared model, and the multi-institutional radiological images from our partners HUSJR, KTH, and UKCM will be used to update the weights of the shared model; the resulting model will also be tested on data from RADC not seen during training.
(6) Apply the developed federated learning procedure to pathological images, using the DL models of the other DCs as shared models, to predict BC relapse.
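
As a rough illustration of the three learning procedures named in task (1), the sketch below contrasts parallel training, IIL, and CIIL on a toy one-parameter linear model. It is a minimal sketch under stated assumptions, not the project's implementation: the scalar model, the synthetic site data, the learning rate, and all function names (`local_train`, `parallel_round`, `iil`, `ciil`) are hypothetical and chosen only for illustration. Real IIL and CIIL schedules (for example, how long each institution trains before handing the model over) would be set by the project.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy setup: each institution holds (x, y) pairs and fits y ~ w * x.
Dataset = Sequence[Tuple[float, float]]


def local_train(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few epochs of gradient descent on the local mean squared error."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def parallel_round(w: float, sites: List[Dataset]) -> float:
    """Parallel training: every site starts from the same shared weight and the
    locally trained weights are averaged, weighted by local sample count."""
    total = sum(len(d) for d in sites)
    return sum(local_train(w, d) * len(d) for d in sites) / total


def iil(w: float, sites: List[Dataset]) -> float:
    """IIL: the model visits each site once, in sequence, and each site
    continues training from the weight left by the previous site."""
    for d in sites:
        w = local_train(w, d)
    return w


def ciil(w: float, sites: List[Dataset], cycles: int = 3) -> float:
    """CIIL: repeat the sequential pass over all sites for several cycles."""
    for _ in range(cycles):
        w = iil(w, sites)
    return w


if __name__ == "__main__":
    # Synthetic sites of different sizes, all generated from y = 2 * x.
    sites = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)]]
    w_parallel = 0.0
    for _ in range(20):
        w_parallel = parallel_round(w_parallel, sites)
    print("parallel:", w_parallel)
    print("IIL:", iil(0.0, sites))
    print("CIIL:", ciil(0.0, sites))
```

In all three procedures only model weights move between institutions; the local `(x, y)` pairs never leave their site. The procedures differ in whether updates are aggregated centrally (parallel training) or passed from site to site (IIL, CIIL), which is the trade-off task (1) sets out to review.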

The main outcome of this project is a modularized federated learning framework for real-world healthcare that will: 1) train high-performing and generalizable DL models in healthcare, mainly for BC relapse prediction, without private identifiable data changing hands; 2) achieve security and privacy in federated learning systems by mitigating model attacks, data reconstruction, and disguised local model training; 3) provide fine-tuned DL models that exploit the FL paradigm to classify molecular subtypes of BC from radiological images and to predict relapse from radiological and pathological images; and 4) boost the performance of CAD systems for hospitals with smaller datasets.