
A multi-modal deep learning framework for enhanced breast cancer diagnosis using mammograms and clinical data.

May 3, 2026

Authors

Ibrahim AM, Li J, Akhtar F, Chaudhary AA, Ali MAM, Khan H, Osman M

Affiliations (6)

  • Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China.
  • Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China. [email protected].
  • Department of Computer Science, Sukkur IBA University, Sukkur, 65200, Pakistan. [email protected].
  • Department of Biology, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia. [email protected].
  • Department of Biology, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia.
  • Department of Electrical and Electronic Engineering, College of Engineering and Computer Sciences, Jazan University, Jazan, Saudi Arabia.

Abstract

Breast cancer remains one of the leading causes of death among women worldwide. Although mammography is the standard screening tool, its interpretation is limited by tissue overlap, high breast density, and subtle lesion characteristics. To address these challenges, we propose a multi-modal deep learning framework that integrates mammographic images with structured clinical data (note: synthetic clinical variables, generated from known risk-factor distributions, were employed due to the lack of real paired data) to improve diagnostic accuracy. The architecture uses a hierarchical fusion approach that incorporates a cross-modal attention mechanism and dynamic modality weighting to efficiently integrate heterogeneous features. In addition, a region-of-interest (ROI) attention module is integrated to highlight diagnostically important anatomical regions in mammograms. Experimental evaluations on a benchmark dataset yield a classification accuracy of 94.6% (95% CI: 92.8-96.4%) and an AUC of 0.963 (95% CI: 0.951-0.975), significantly outperforming prevailing baseline models (p < 0.01). These results suggest that the proposed architecture has significant potential as a proof of concept for reliable and interpretable clinical decision-support tools, and that its architectural developments establish promising directions for future multi-modal approaches with real clinical data.
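The paper does not publish an implementation, so as a rough illustration only, the two fusion ideas named in the abstract (cross-modal attention between image and clinical features, and a learned dynamic weighting of the two modalities) can be sketched in NumPy. All function names, shapes, and the gating scheme below are hypothetical simplifications; the actual framework is a trained deep network, not this toy.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(img_feats, clin_feats):
    """Scaled dot-product attention: mammogram region embeddings
    (n_regions, d) attend over clinical-variable embeddings (n_vars, d),
    returning a clinical context vector per image region."""
    d = img_feats.shape[-1]
    scores = img_feats @ clin_feats.T / np.sqrt(d)   # (n_regions, n_vars)
    attn = softmax(scores, axis=-1)                  # rows sum to 1
    return attn @ clin_feats                         # (n_regions, d)

def dynamic_modality_fusion(img_vec, clin_vec, w_gate):
    """Dynamic modality weighting: a (hypothetical) learned gate w_gate
    of shape (2, 2d) maps the concatenated modality vectors to two
    logits, whose softmax weights the convex combination."""
    logits = w_gate @ np.concatenate([img_vec, clin_vec])
    alpha = softmax(logits)                          # (2,), sums to 1
    fused = alpha[0] * img_vec + alpha[1] * clin_vec
    return fused, alpha
```

In the real model the attention and gate parameters would be learned end to end alongside the image and clinical encoders; this sketch only shows how heterogeneous features can be combined with per-sample, data-dependent weights rather than a fixed concatenation.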

Topics

Journal Article
