MSML-DenseXmer: harnessing vision transformers through integration with novel dense networks for medical image fusion.

May 29, 2026

Authors

Pathak DM, Jain D, Srivastava S, Kachhava R, Tewari TK, Sharma S

Affiliations (5)

  • Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, Uttar Pradesh, India.
  • Department of CSE-AIML, ABES Engineering College, Ghaziabad, Uttar Pradesh, India.
  • Computer Science Department, Indian Institute of Information Technology Kota, Kota, Rajasthan, India.
  • CSE and IT Department, Jaypee Institute of Information Technology, Noida, Uttar Pradesh, India.
  • Department of Computer Science and Engineering, Manipal University Jaipur, Jaipur, 303007, Rajasthan, India. [email protected].

Abstract

The integration of multiple modalities in medical imaging allows a thorough representation of structural and functional details, resulting in improved diagnosis and treatment. Deep learning methods outperform conventional methods by automating the extraction of pertinent features and fusing them while preserving both structural and textural integrity. Existing methods lack the ability to capture complex global structures, small-scale textural features, and long-range dependencies, which leads to incomplete feature representation. This study presents a novel deep-learning framework that combines an improved DenseNet for capturing local fine-grained features with a Swin Transformer for extracting global structural details and long-range relationships, thereby facilitating a more comprehensive fused output. A modified hybrid approach utilizing the L1, L2, and infinity norms is used to generate attention weights in the feature fusion-oriented row-column vector dimension technique. The model is trained on different modalities in the Whole Brain Atlas dataset and the Lung-PET-CT-Dx dataset using a novel loss function. This function improves fusion by integrating pixel loss, structural similarity, and textural preservation. The evaluation of the fused image's quality involves multiple metrics that assess image clarity, structural integrity, feature retention, contrast improvement, and overall visual accuracy, providing a thorough analysis. The MSML-DenseXmer framework demonstrates improved performance compared to existing approaches across multiple medical imaging modalities. Specifically, it achieves a rise of over 9.94% in MRI-SPECT fusion, a gain above 6.82% in MRI-PET fusion, an increase of at least 18.37% in MRI-CT fusion, and a gain of at least 2.83% on the Lungs PET-CT dataset, indicating its potential in improving fusion quality.
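
The norm-based attention weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the three norms are combined or how the row and column vectors enter the weighting, so everything below is an assumption made for illustration: the names `hybrid_norm`, `fusion_weights`, and `fuse` are hypothetical, the three norms are mixed with equal weight, and the activity of a pixel is taken as the sum of the hybrid norms of its row and its column. It should not be read as the authors' implementation.

```python
import math


def hybrid_norm(vec):
    """Mean of the L1, L2 and infinity norms of a vector.

    Equal mixing of the three norms is an assumption; the abstract
    only states that all three are used.
    """
    l1 = sum(abs(v) for v in vec)
    l2 = math.sqrt(sum(v * v for v in vec))
    linf = max(abs(v) for v in vec)
    return (l1 + l2 + linf) / 3.0


def activity_map(feat):
    """Per-pixel activity: hybrid norm of the pixel's row plus that of its column.

    This row/column combination is a hypothetical reading of the
    "row-column vector dimension" wording in the abstract.
    """
    rows, cols = len(feat), len(feat[0])
    row_act = [hybrid_norm(row) for row in feat]
    col_act = [hybrid_norm([feat[i][j] for i in range(rows)]) for j in range(cols)]
    return [[row_act[i] + col_act[j] for j in range(cols)] for i in range(rows)]


def fusion_weights(feat_a, feat_b, eps=1e-8):
    """Weight of feat_a at each pixel; feat_b receives 1 minus this weight."""
    act_a, act_b = activity_map(feat_a), activity_map(feat_b)
    return [
        [a / (a + b + eps) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(act_a, act_b)
    ]


def fuse(feat_a, feat_b):
    """Weighted average of two equally sized 2D feature maps."""
    w = fusion_weights(feat_a, feat_b)
    return [
        [w[i][j] * feat_a[i][j] + (1.0 - w[i][j]) * feat_b[i][j]
         for j in range(len(feat_a[0]))]
        for i in range(len(feat_a))
    ]
```

Under these assumptions, the feature map with larger row and column norms around a pixel contributes more to the fused value there, and two identical inputs are returned unchanged up to floating-point tolerance.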

Topics

Journal Article
