
Bridging radiology and pathology: domain-generalized cross-modal learning for clinical.

February 16, 2026

Authors

Zhong X, Gu Z, Shanmuganathan M, Li M, Sun H, Du M, Chen Q, Jiang G

Affiliations (8)

  • Department of General Surgery, The Second Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China.
  • University of Tabuk, Faculty of Computers and Information Technology, Tabuk, Kingdom of Saudi Arabia.
  • School of Nano-Tech and Nano-Bionics, University of Science and Technology of China, Hefei, Anhui, China.
  • CAS Key Laboratory of Nano-Bio Interface, Division of Nanobiomedicine and i-Lab, Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy of Sciences, Suzhou, Jiangsu, China.
  • Wolfson Institute for Biomedical Research, University College London, London, UK. [email protected].
  • CAS Key Laboratory of Nano-Bio Interface, Division of Nanobiomedicine and i-Lab, Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy of Sciences, Suzhou, Jiangsu, China. [email protected].
  • Medical Science and Technology Innovation Center, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School of Nanjing Medical University, Suzhou, Jiangsu, China. [email protected].
  • Department of General Surgery, The Second Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China. [email protected].

Abstract

Reliable interpretation of clinical imaging requires integrating complementary evidence across modalities, yet most AI systems remain limited by single-modality analysis and poor generalization across institutions. We propose a unified cross-modal framework that bridges mammography and histopathology for breast cancer diagnosis through: (1) a shared vision transformer encoder with lightweight modality-specific adapters, (2) a weakly supervised patient-level contrastive alignment module that learns cross-modal correspondences without pixel-level supervision, (3) domain generalization strategies combining MixStyle augmentation and invariant risk minimization, and (4) causal test-time adaptation for unseen target domains. The model jointly addresses classification, lesion localization, and pathological grading while generating reasoning-guided attention maps that explicitly link suspicious mammographic regions with corresponding histopathological evidence. Evaluated on four public benchmarks (CBIS-DDSM, INbreast, BACH, CAMELYON16/17), the framework consistently outperforms state-of-the-art unimodal, multimodal, and domain generalization baselines, achieving a mean AUC of 0.90 under rigorous leave-one-domain-out evaluation and substantially smaller domain gaps (0.03 vs. 0.06-0.10). Visualization and interpretability analyses further confirm that predictions align with clinically meaningful features, supporting transparency and trust. By advancing multimodal integration, cross-institutional robustness, and explainability, this study represents a step toward clinically deployable AI systems for diagnostic decision support.
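
The paper provides no code, but the patient-level contrastive alignment in component (2) can plausibly be read as a CLIP-style symmetric InfoNCE objective: the embeddings of one patient's mammogram and histopathology slide are pulled together, while cross-patient pairs in the batch act as negatives. The sketch below assumes PyTorch; the names patient_level_infonce, mammo_emb, and path_emb are hypothetical and stand in for pooled per-patient embeddings produced by the shared encoder's two modality adapters.

    import torch
    import torch.nn.functional as F

    def patient_level_infonce(mammo_emb, path_emb, temperature=0.07):
        """Symmetric InfoNCE over a batch of patients (a sketch, not the
        authors' implementation): the mammography and histopathology
        embeddings of the same patient form the positive pair; every
        cross-patient pair in the batch is treated as a negative."""
        mammo = F.normalize(mammo_emb, dim=-1)   # (N, D), one row per patient
        path = F.normalize(path_emb, dim=-1)     # (N, D)
        logits = mammo @ path.t() / temperature  # (N, N) cosine similarities
        targets = torch.arange(logits.size(0), device=logits.device)
        loss_m2p = F.cross_entropy(logits, targets)      # mammography -> pathology
        loss_p2m = F.cross_entropy(logits.t(), targets)  # pathology -> mammography
        return 0.5 * (loss_m2p + loss_p2m)

Under this reading, supervision enters only through the patient identity that pairs the two modalities, so no pixel-level annotation is required, consistent with the "weakly supervised" framing in the abstract.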

Topics

Journal Article
