Security Analysis of a Federated Learning Framework for Medical Image-to-Image Translation.
Authors
Affiliations (5)
- Institute of Biomedical Engineering, Karlsruhe Institute of Technology, Fritz-Haber-Weg 1, 76131, Karlsruhe, Baden-Württemberg, Germany. [email protected].
- Department of Physics and Astronomy, Heidelberg University, Im Neuenheimer Feld 226, 69120, Heidelberg, Baden-Württemberg, Germany.
- Department of Radiation Oncology, University Hospital Schleswig-Holstein, Feldstrasse 21, 24105, Kiel, Schleswig-Holstein, Germany.
- Department of Experimental and Clinical Medicine, Magna Graecia University, Viale Europa, 88100, Catanzaro, Calabria, Italy.
- Institute of Biomedical Engineering, Karlsruhe Institute of Technology, Fritz-Haber-Weg 1, 76131, Karlsruhe, Baden-Württemberg, Germany.
Abstract
Federated Learning (FL) has emerged as a privacy-preserving paradigm for collaborative training of deep learning models across institutions without sharing patient data. This approach has been applied to complex tasks such as medical image-to-image (I2I) translation, including MRI-to-synthetic CT (sCT) generation. However, existing federated I2I frameworks often assume privacy preservation as an inherent property of FL rather than a requirement to be explicitly validated, leaving their robustness to representative adversarial threat scenarios largely unexplored. In this study, we evaluated the vulnerability of a federated MRI-to-sCT translation framework (FedSynthCT-Brain) to three representative attack classes: Deep Leakage from Gradients (DLG), Federated Membership Inference Attack (FedMIA), and data poisoning. The efficacy of corresponding defense mechanisms, such as Secure Aggregation (SecAgg) and Byzantine-robust median aggregation (FedMedian), was also assessed. DLG enabled only the recovery of coarse anatomical structures, with no clinically identifiable details (SSIM ≤ 0.16, PSNR ≤ 11 dB) across clients, suggesting limited vulnerability under the evaluated DLG setting. In contrast, FedMIA achieved high membership discrimination, with AUC scores between 0.92 and 0.99, revealing a critical privacy vulnerability. Introducing SecAgg reduced AUC values to near-random levels (0.23 to 0.56) across all centers without impacting synthesis quality. Under high-noise poisoning, standard federated averaging (FedAvg) rendered the federation inoperative, while FedMedian restored performance close to the no-poisoning baseline in most scenarios, although significant residual degradation remained in specific center configurations. At low noise levels, the advantage of FedMedian was less consistent, as low-level noise injection may be indistinguishable from natural heterogeneity across centers, potentially enabling stealthy degradation.
These findings demonstrate that federated I2I translation frameworks are not inherently secure and require explicit, multi-layered evaluation. As FL is increasingly adopted in clinical workflows, our results underscore the necessity of integrating cryptographic, algorithmic, and infrastructural safeguards for secure deployment.
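To illustrate why a coordinate-wise median can tolerate a poisoned client where a mean cannot, the following is a minimal toy sketch and not the FedSynthCT-Brain implementation. It assumes each client update is a flat parameter vector; the function names `fedavg` and `fedmedian`, the five-client setup, and the noise scales are hypothetical choices made only for this example.

```python
import numpy as np


def fedavg(updates, weights=None):
    """Aggregate client vectors by (optionally weighted) mean."""
    return np.average(np.stack(updates), axis=0, weights=weights)


def fedmedian(updates):
    """Aggregate client vectors by coordinate-wise median."""
    return np.median(np.stack(updates), axis=0)


rng = np.random.default_rng(0)
true_params = np.ones(8)  # toy stand-in for the "clean" model parameters

# Four honest clients: small deviations, mimicking mild heterogeneity (assumed scale).
honest = [true_params + rng.normal(0.0, 0.01, size=8) for _ in range(4)]

# One poisoned client: high-variance noise added to its update (assumed scale).
poisoned = true_params + rng.normal(0.0, 100.0, size=8)

updates = honest + [poisoned]

# Worst-case coordinate error of each aggregate relative to the clean parameters.
avg_err = np.abs(fedavg(updates) - true_params).max()
med_err = np.abs(fedmedian(updates) - true_params).max()

print(f"FedAvg max error:    {avg_err:.3f}")
print(f"FedMedian max error: {med_err:.3f}")
```

In this toy setting the single outlier shifts the mean by roughly one fifth of its magnitude, whereas the median of five values always lies within the range spanned by the four honest clients. If the poisoning scale is lowered toward the honest spread (for example 0.01), the two aggregates become hard to tell apart, which is consistent in spirit with the low-noise observation in the abstract, though the sketch makes no claim about the reported results.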