Leveraging an Image-Enhanced Cross-Modal Fusion Network for Radiology Report Generation.

August 11, 2025

DOI: 10.1177/15578666251365959 PMID: 40785548

Authors

Guo Y, Hou X, Liu Z, Zhang Y

Affiliations (1)

School of Information Science and Technology, Dalian Maritime University, Dalian, China.

Abstract

Radiology report generation (RRG) tasks leverage computer-aided technology to automatically produce descriptive text reports for medical images, aiming to ease radiologists' workload, reduce misdiagnosis rates, and lessen the pressure on medical resources. However, previous works have yet to focus on enhancing feature extraction of low-quality images, incorporating cross-modal interaction information, and mitigating latency in report generation. We propose an Image-Enhanced Cross-Modal Fusion Network (IFNet) for automatic RRG to tackle these challenges. IFNet includes three key components. First, the image enhancement module enhances the detailed representation of typical and atypical structures in X-ray images, thereby boosting detection success rates. Second, the cross-modal fusion networks efficiently and comprehensively capture the interactions of cross-modal features. Finally, a more efficient transformer report generation module is designed to optimize report generation efficiency while being suitable for low-resource devices. Experimental results on public datasets IU X-ray and MIMIC-CXR demonstrate that IFNet significantly outperforms the current state-of-the-art methods.
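
The abstract does not say how the cross-modal fusion networks compute interactions between image and text features. One common way to model such interactions is scaled dot-product cross-attention, in which text-side vectors act as queries over image-patch vectors. The sketch below illustrates only that general technique and is not IFNet's actual design. The function names, the toy feature vectors, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (hypothetical helper).

    queries: vectors from one modality (here: text tokens)
    keys, values: vectors from the other modality (here: image patches)
    No learned projections are applied; a real model would add them.
    """
    dim = len(keys[0])
    value_dim = len(values[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        w = softmax(scores)
        # Weighted average of the value vectors.
        fused.append([
            sum(w[j] * values[j][c] for j in range(len(values)))
            for c in range(value_dim)
        ])
        weights.append(w)
    return fused, weights


# Toy, made-up features: three image patches and two text tokens.
image_patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[2.0, 0.0], [0.0, 2.0]]

fused, weights = cross_attention(text_tokens, image_patches, image_patches)
for token, row in zip(text_tokens, weights):
    print(token, [round(x, 3) for x in row])
```

Each row of `weights` sums to 1, so every fused vector is a weighted average of the image-patch vectors, with more weight on patches that align with the querying text token.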

Topics

Journal Article
