Federated Learning for Medical Image Classification: A Comprehensive Benchmark
Authors
Abstract
Federated learning (FL) is well suited to medical image analysis because it enables machine learning on isolated multi-center data while protecting the privacy of the participating parties. However, current research on FL optimization algorithms often focuses on a limited set of datasets and scenarios, mostly involving natural images, and offers few comparative experiments in medical contexts. In this work, we comprehensively evaluate several state-of-the-art FL algorithms for medical imaging. We fairly compare classification models trained with various FL algorithms across multiple medical imaging datasets. We also evaluate system performance metrics, such as communication cost and computational efficiency, under different FL architectures. Our findings show that medical imaging datasets pose substantial challenges for current FL optimization algorithms: no single algorithm consistently delivers the best performance across all medical FL scenarios, and many optimization algorithms may underperform on these datasets. Our experiments provide a benchmark and guidance for future research on and application of FL in medical imaging. Furthermore, we propose an efficient and robust method that augments datasets by combining generative techniques based on denoising diffusion probabilistic models with label smoothing, broadly improving FL performance on classification tasks across various medical imaging datasets.
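As a rough illustration of the label smoothing component mentioned above, the following minimal sketch builds a smoothed target distribution and computes the corresponding cross-entropy loss. It uses the standard uniform label smoothing formulation; the abstract does not specify the exact variant or hyperparameters used, so the function names (`smooth_labels`, `cross_entropy`) and the value `epsilon=0.1` are illustrative assumptions rather than the authors' implementation.

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """Return a smoothed target distribution for one integer class label.

    Standard uniform label smoothing: the true class receives
    1 - epsilon + epsilon / K and every other class epsilon / K.
    epsilon=0.1 is an assumed value, not taken from the paper.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must lie in [0, num_classes)")
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[label] += 1.0 - epsilon
    return target


def cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = max(logits)
    log_z = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, target))


if __name__ == "__main__":
    # Hypothetical 4-class example, e.g. a label attached to a synthetic image.
    target = smooth_labels(label=1, num_classes=4, epsilon=0.1)
    loss = cross_entropy([0.2, 2.0, -1.0, 0.5], target)
    print(target, loss)
```

Softening the targets in this way reduces the penalty for a confident prediction that disagrees with a possibly imperfect label, which is one plausible reason to pair it with generated samples.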