Federated Learning for Renal Tumor Segmentation and Classification on Multi-Center MRI Dataset.

Authors

Nguyen DT, Imami M, Zhao LM, Wu J, Borhani A, Mohseni A, Khunte M, Zhong Z, Shi V, Yao S, Wang Y, Loizou N, Silva AC, Zhang PJ, Zhang Z, Jiao Z, Kamel I, Liao WH, Bai H

Affiliations (10)

  • Tufts University School of Medicine, Boston, Massachusetts, USA.
  • Radiology AI Lab, Brown University, Providence, Rhode Island, USA.
  • Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.
  • Department of Radiology, Second Xiangya Hospital of Central South University, Changsha, China.
  • Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, Maryland, USA.
  • Department of Radiology, Mayo Clinic, Rochester, Minnesota, USA.
  • Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania, USA.
  • Department of Diagnostic Imaging, Rhode Island Hospital, Providence, Rhode Island, USA.
  • Department of Radiology, University of Colorado, Boulder, Colorado, USA.
  • Department of Radiology, Xiangya Hospital of Central South University, Changsha, China.

Abstract

Deep learning (DL) models for accurate renal tumor characterization may benefit from multi-center datasets for improved generalizability; however, data-sharing constraints necessitate privacy-preserving solutions like federated learning (FL). This retrospective multi-center study assessed the performance and reliability of FL for renal tumor segmentation and classification in multi-institutional MRI datasets. A total of 987 patients (403 female) from six hospitals were included for analysis. Of these, 73% (723/987) had malignant renal tumors, primarily clear cell carcinoma (n = 509). Patients were split into training (n = 785), validation (n = 104), and test (n = 99) sets, stratified across three simulated institutions. MRI was performed at 1.5 T and 3 T using T2-weighted imaging (T2WI) and contrast-enhanced T1-weighted imaging (CE-T1WI) sequences. Both the FL and non-FL approaches used nnU-Net for tumor segmentation and ResNet for tumor classification. The FL models were trained across three simulated institutional clients with central weight aggregation, while the non-FL approach used centralized training on the full dataset. Segmentation was evaluated using Dice coefficients, and classification between malignant and benign lesions was assessed using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). FL and non-FL performance was compared using the Wilcoxon test for segmentation Dice and DeLong's test for AUC, with p < 0.05 considered significant. No significant difference was observed between FL and non-FL models in segmentation (Dice: 0.43 vs. 0.45, p = 0.202) or classification (AUC: 0.69 vs. 0.64, p = 0.959) on the test set. For classification, no significant difference was observed between the models in accuracy (p = 0.912), sensitivity (p = 0.862), or specificity (p = 0.847) on the test set. FL demonstrated comparable performance to non-FL approaches in renal tumor segmentation and classification, supporting its potential as a privacy-preserving alternative for multi-institutional DL models.

Evidence Level: 4. Technical Efficacy: Stage 2.
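
The abstract mentions "central weight aggregation" and Dice-based evaluation without giving details. As a rough illustration only, the sketch below assumes a FedAvg-style rule (a sample-size-weighted mean of client parameters) and the standard Dice formula for binary masks. The function names, the flat-list parameter layout, and the choice of FedAvg are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: each client's model is a dict of named, flattened parameter lists.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Sample-size-weighted mean of client parameters (FedAvg-style; assumed rule)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated: Weights = {}
    for name, reference in client_weights[0].items():
        # Average each parameter position across clients, weighted by client size.
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    denominator = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen here: two empty masks count as perfect agreement.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


if __name__ == "__main__":
    # Toy example with two clients holding 1 and 3 training cases.
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(federated_average(clients, [1, 3]))
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a real federated setup, each client would train locally for some number of steps before the server applies an aggregation like this and broadcasts the result; the number of rounds and local epochs are not stated in the abstract.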

Topics

Journal Article