Cross-Fusion Adaptive Feature Enhancement Transformer: Efficient high-frequency integration and sparse attention enhancement for brain MRI super-resolution.
Authors
Affiliations (6)
- School of Artificial Intelligence, Chongqing University of Technology, 401135, Chongqing, China.
- School of Artificial Intelligence, Chongqing University of Technology, 401135, Chongqing, China.
- College of River and Ocean Engineering, Chongqing Jiaotong University, 402247, Chongqing, China.
- School of Artificial Intelligence, Chongqing University of Technology, 401135, Chongqing, China.
- School of Artificial Intelligence, Chongqing University of Technology, 401135, Chongqing, China.
- School of Artificial Intelligence, Chongqing University of Technology, 401135, Chongqing, China.
Abstract
High-resolution magnetic resonance imaging (MRI) is essential for diagnosing and treating brain diseases. Transformer-based approaches show strong potential for MRI super-resolution because they capture long-range dependencies effectively. However, existing Transformer-based super-resolution methods face several challenges: (1) they focus primarily on low-frequency information and underuse high-frequency information; (2) they lack effective mechanisms for integrating low-frequency and high-frequency information; (3) they struggle to eliminate redundant information during reconstruction. To address these issues, we propose the Cross-fusion Adaptive Feature Enhancement Transformer (CAFET), which combines the complementary strengths of CNNs and Transformers. The model consists of four key blocks: a high-frequency enhancement block that extracts high-frequency information; a hybrid attention block, comprising channel attention and shifted rectangular window attention, that captures global context while fitting local detail; a large-window fusion attention block that integrates local high-frequency features with global low-frequency features; and an adaptive sparse overlapping attention block that dynamically retains key information and strengthens the aggregation of cross-window features. Extensive experiments validate the effectiveness of the proposed method. On the BraTS and IXI datasets at an upsampling factor of ×2, CAFET achieves maximum PSNR improvements of 2.4 dB and 1.3 dB over state-of-the-art methods, with SSIM improvements of up to 0.16% and 1.42%. At an upsampling factor of ×4, it achieves maximum PSNR improvements of 1.04 dB and 0.3 dB over the current leading methods, with SSIM improvements of up to 0.25% and 1.66%. Our method reconstructs high-quality super-resolution brain MRI images and shows significant clinical potential.
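To make the four-stage design above concrete, the following is a minimal PyTorch sketch of how such a pipeline could be wired together. It is not the authors' implementation: every module name (HighFrequencyEnhancement, HybridAttentionBlock, FusionAttention, SparseAttention, CAFETSketch) and hyperparameter is an illustrative assumption; plain full self-attention stands in for shifted rectangular window attention, per-query top-k masking stands in for adaptive sparse overlapping attention, and a PixelShuffle tail stands in for the (unspecified) upsampler.

```python
# Hypothetical sketch, assuming PyTorch. All module names, shapes, and
# hyperparameters are illustrative, not the paper's implementation.
import torch
import torch.nn as nn

class HighFrequencyEnhancement(nn.Module):
    """Refine and re-inject the residual of a low-pass filter as detail."""
    def __init__(self, channels: int):
        super().__init__()
        self.low_pass = nn.AvgPool2d(3, stride=1, padding=1)
        self.refine = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.refine(x - self.low_pass(x))  # edges and texture

class HybridAttentionBlock(nn.Module):
    """Channel attention + spatial self-attention. Plain full attention
    stands in for the paper's shifted rectangular window attention."""
    def __init__(self, channels: int, heads: int = 4, reduction: int = 8):
        super().__init__()
        self.channel_attn = nn.Sequential(  # squeeze-and-excitation style
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        x = x * self.channel_attn(x)           # reweight channels globally
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        t = self.norm(tokens)
        out, _ = self.attn(t, t, t)
        return (tokens + out).transpose(1, 2).reshape(b, c, h, w)

class FusionAttention(nn.Module):
    """Cross-attention: low-frequency features query high-frequency ones
    (a stand-in for the large-window fusion attention block)."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.cross = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, low, high):
        b, c, h, w = low.shape
        q = low.flatten(2).transpose(1, 2)
        kv = high.flatten(2).transpose(1, 2)
        fused, _ = self.cross(q, kv, kv)
        return (q + fused).transpose(1, 2).reshape(b, c, h, w)

class SparseAttention(nn.Module):
    """Per-query top-k masking: keep only the strongest scores, one simple
    way to retain key information and suppress redundancy (a stand-in for
    adaptive sparse overlapping attention)."""
    def __init__(self, channels: int, top_k: int = 64):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, 1)
        self.top_k = top_k
        self.scale = channels ** -0.5

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).flatten(2).chunk(3, dim=1)  # (B, C, N) each
        scores = torch.einsum('bcn,bcm->bnm', q, k) * self.scale
        kth = scores.topk(min(self.top_k, scores.shape[-1]), -1).values[..., -1:]
        attn = scores.masked_fill(scores < kth, float('-inf')).softmax(-1)
        return x + torch.einsum('bnm,bcm->bcn', attn, v).reshape(b, c, h, w)

class CAFETSketch(nn.Module):
    """Wires the four blocks together; the PixelShuffle tail is assumed."""
    def __init__(self, channels: int = 64, scale: int = 2):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)  # single-channel MRI
        self.hfe = HighFrequencyEnhancement(channels)
        self.hybrid = HybridAttentionBlock(channels)
        self.fusion = FusionAttention(channels)
        self.sparse = SparseAttention(channels)
        self.tail = nn.Sequential(
            nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, lr):
        feat = self.head(lr)
        high = self.hfe(feat)           # local high-frequency branch
        low = self.hybrid(feat)         # global low-frequency branch
        fused = self.fusion(low, high)  # cross-branch integration
        return self.tail(self.sparse(fused))

print(CAFETSketch(scale=2)(torch.randn(1, 1, 64, 64)).shape)  # (1, 1, 128, 128)
```

The key structural point the sketch illustrates is the two-branch split after the shallow head: a CNN-style high-frequency branch and an attention-based low-frequency branch, merged by cross-attention before sparse attention prunes redundant responses ahead of upsampling.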