Federated Privacy-Preserving Multi-Modal Deep Learning for Breast Cancer Diagnosis: A Physics-Aware Approach.
Authors
Affiliations (3)
- Department of Computer Engineering, Faculty of Engineering, Istanbul Aydin University, Istanbul 34295, Turkey.
- Defne Telekomünikasyon A.Ş., Maslak Mahallesi, Maslak Meydan Sokak, Spring Giz Plaza, No: 5, İç Kapı: 37, Kat: 9, Sarıyer, Istanbul 34485, Turkey.
- Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Atlas University, Anadolu Caddesi No: 40, Kağıthane, Istanbul 34408, Turkey.
Abstract
<b>Background/Objectives:</b> Breast cancer remains a leading cause of cancer-related mortality among women worldwide. This study presents a systematically justified multi-modal breast cancer classification pipeline that combines established, physically motivated preprocessing operations, modality-specific deep learning models, late-fusion inference, and a deployment-aware federated learning evaluation. Rather than introducing new image restoration or federated optimization algorithms, this work formalizes how standard preprocessing methods can be organized according to the dominant degradation characteristics of ultrasound, MRI, and mammography, and evaluates their contribution under centralized and simulated federated learning settings.
<b>Methods:</b> Patient-wise stratified five-fold cross-validation was applied across ultrasound (BUSI, n=780), dynamic contrast-enhanced MRI (DUKE, n=922), and mammography (CBIS-DDSM, n=400). A five-algorithm federated learning comparison, including FedAvg, FedProx, SCAFFOLD, FedNova, and FP16-FedAvg, was conducted under IID and non-IID conditions using a Dirichlet distribution with α=0.5. The evaluation reports diagnostic performance together with per-round training time, communication time, latency-related measurements, and cumulative bandwidth. Ablation experiments, McNemar's test, Cohen's <i>h</i> effect sizes, and confidence intervals were used to support the analysis.
<b>Results:</b> Per-modality models achieved 92.50 ± 1.2%, 90.63 ± 1.5%, and 92.00 ± 1.3% accuracy for ultrasound, MRI, and mammography, respectively, with statistically significant improvements over the corresponding baselines according to McNemar's test (p<0.05). Weighted late fusion achieved 93.10 ± 1.1% accuracy and improved performance compared with the best individual modality (p=0.031). FP16 transmission reduced cumulative bandwidth from 8.14 GB to 1.23 GB (-84.9%) without a statistically significant performance difference compared with FP32 transmission (p=0.74), while SCAFFOLD achieved the highest non-IID accuracy (90.50%).
<b>Conclusions:</b> The findings demonstrate internal technical validity and deployment-relevant trade-offs, but they should be interpreted cautiously because the federated evaluation is simulation-based, key-slice extraction may require annotation-assisted assumptions, and external multi-center validation remains necessary before clinical deployment. Reported improvements are statistically significant in several comparisons, but corresponding Cohen's <i>h</i> effect sizes are small, and clinical meaningfulness requires independent validation rather than inference from <i>p</i>-values alone.
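To make the closing remark about small effect sizes concrete, the following is a minimal sketch of how Cohen's <i>h</i> could be computed from two reported accuracies, together with a check of the stated bandwidth reduction. It uses the standard arcsine definition of Cohen's <i>h</i>. Treating the mean accuracies from the abstract as the two proportions is an assumption made here for illustration; the abstract does not state which comparisons the authors computed <i>h</i> for, and the function names `cohens_h` and `relative_reduction` are hypothetical.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h for two proportions, using the arcsine transformation."""
    return 2.0 * math.asin(math.sqrt(p1)) - 2.0 * math.asin(math.sqrt(p2))


def relative_reduction(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# Accuracies taken from the abstract, expressed as proportions.
# Assumption: the mean accuracies stand in for the compared proportions.
fusion_acc = 0.9310       # weighted late fusion
best_single_acc = 0.9250  # best individual modality (ultrasound)

h = cohens_h(fusion_acc, best_single_acc)

# Cumulative bandwidth in GB, FP32 versus FP16 transmission, from the abstract.
bandwidth_cut = relative_reduction(8.14, 1.23)

print(f"Cohen's h (fusion vs. best single modality): {h:.3f}")
print(f"FP16 cumulative bandwidth reduction: {bandwidth_cut:.1f}%")
```

Under Cohen's conventional thresholds, values of <i>h</i> below 0.2 are read as small, which is consistent with the caution that a significant <i>p</i>-value does not by itself establish a clinically meaningful gain.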