Addressing data heterogeneity in distributed medical imaging with heterosync learning.
Authors
Affiliations (10)
- Department of Medical Ultrasonics, Institute of Diagnostic and Interventional Ultrasound, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
- School of Physics and Electronic Information, Guangxi Minzu University, Nanning, China.
- Department of Medical Ultrasound, the First Affiliated Hospital of Guangxi Medical University, Nanning, China.
- Department of Medical Ultrasonics, the Sixth Affiliated Hospital of Sun Yat-sen University (Guangdong Gastrointestinal Hospital), Guangzhou, China.
- Research Center of Big Data and Artificial Intelligence for Medicine, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
- Center of Hepato-Pancreato-Biliary Surgery, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
- School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, China. [email protected].
- Department of Medical Ultrasonics, Institute of Diagnostic and Interventional Ultrasound, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China. [email protected].
- Center of Hepato-Pancreato-Biliary Surgery, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China. [email protected].
- Department of Medical Ultrasonics, Institute of Diagnostic and Interventional Ultrasound, the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China. [email protected].
Abstract
Data heterogeneity critically limits distributed artificial intelligence (AI) in medical imaging. We propose HeteroSync Learning (HSL), a privacy-preserving framework that addresses heterogeneity through two components: (1) a Shared Anchor Task (SAT) for cross-node representation alignment, and (2) an Auxiliary Learning Architecture that coordinates the SAT with local primary tasks. We validate HSL in large-scale simulations (feature, label, quantity, and combined heterogeneity) and in a real-world multi-center thyroid cancer study. HSL outperforms local learning, 12 benchmark methods (FedAvg, FedProx, SplitAVG, FedRCL, FedCOME, etc.), and foundation models (e.g., CLIP), with better stability and up to 40% higher area under the curve (AUC), and it matches central learning performance. On out-of-distribution pediatric thyroid cancer data, HSL achieves an AUC of 0.846 (outperforming other methods by 5.1-28.2%), demonstrating superior generalization. Visualizations confirm that HSL homogenizes heterogeneous distributions. This work provides an effective solution for distributed medical AI, enabling equitable collaboration across institutions and advancing the democratization of healthcare AI.
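The abstract does not specify how the SAT is coordinated with the local primary task. As a minimal sketch only, one common way to realize auxiliary learning is to have each node minimize a weighted sum of its primary-task loss and the loss on the shared anchor task. The function names, the cross-entropy losses, and the `anchor_weight` parameter below are illustrative assumptions, not details taken from the paper.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))

def joint_loss(primary_probs, primary_label, anchor_probs, anchor_label, anchor_weight=0.5):
    """Hypothetical per-node objective: local primary loss plus a weighted anchor-task loss.

    anchor_weight is an assumed hyperparameter; the source does not state
    how HSL balances the two tasks.
    """
    primary = cross_entropy(primary_probs, primary_label)
    anchor = cross_entropy(anchor_probs, anchor_label)
    return primary + anchor_weight * anchor

# One node, one sample from each task (made-up probabilities for illustration).
loss = joint_loss([0.2, 0.8], 1, [0.7, 0.2, 0.1], 0)
print(round(loss, 4))
```

In this reading, every node sees the same anchor task, so the anchor term pulls the nodes toward a common representation while the primary term fits the local data. Setting `anchor_weight` to zero recovers plain local learning.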