
CineMA: A Foundation Model for Cine Cardiac MRI

Yunguan Fu, Weixi Yi, Charlotte Manisty, Anish N Bhuva, Thomas A Treibel, James C Moon, Matthew J Clarkson, Rhodri Huw Davies, Yipeng Hu

arXiv preprint · May 31, 2025
Cardiac magnetic resonance (CMR) is a key investigation in clinical cardiovascular medicine and has been used extensively in population research. However, extracting clinically important measurements such as ejection fraction for diagnosing cardiovascular diseases remains time-consuming and subjective. We developed CineMA, a foundation AI model automating these tasks with limited labels. CineMA is a self-supervised autoencoder model trained on 74,916 cine CMR studies to reconstruct images from masked inputs. After fine-tuning, it was evaluated across eight datasets on 23 tasks from four categories: ventricle and myocardium segmentation, left and right ventricle ejection fraction calculation, disease detection and classification, and landmark localisation. CineMA is the first foundation model for cine CMR to match or outperform convolutional neural networks (CNNs). CineMA demonstrated greater label efficiency than CNNs, achieving comparable or better performance with fewer annotations. This reduces the burden of clinician labelling and supports replacing task-specific training with fine-tuning foundation models in future cardiac imaging applications. Models and code for pre-training and fine-tuning are available at https://github.com/mathpluscode/CineMA, democratising access to high-performance models that otherwise require substantial computational resources, promoting reproducibility and accelerating clinical translation.
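
The self-supervised stage described above is masked-image reconstruction. As a rough illustration of that general idea (not the authors' published architecture, which is available in the linked repository), a minimal PyTorch sketch with placeholder patch size, mask ratio, and encoder depth might look like this:

```python
# Minimal sketch of masked-image reconstruction pre-training, the idea behind
# CineMA's self-supervised stage. The encoder/decoder, patch size, and mask
# ratio below are illustrative placeholders, not the published configuration.
import torch
import torch.nn as nn

class TinyMaskedAutoencoder(nn.Module):
    def __init__(self, patch=16, dim=128, mask_ratio=0.75):
        super().__init__()
        self.patch, self.mask_ratio = patch, mask_ratio
        self.embed = nn.Linear(patch * patch, dim)       # patch -> token
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
        self.decoder = nn.Linear(dim, patch * patch)      # token -> pixels

    def forward(self, img):                               # img: (B, 1, H, W)
        B, _, H, W = img.shape
        p = self.patch
        patches = img.unfold(2, p, p).unfold(3, p, p)     # (B, 1, H/p, W/p, p, p)
        patches = patches.reshape(B, -1, p * p)           # (B, N, p*p)
        tokens = self.embed(patches)
        # randomly zero out a fraction of tokens (a crude stand-in for masking)
        mask = torch.rand(B, tokens.shape[1], 1, device=img.device) < self.mask_ratio
        recon = self.decoder(self.encoder(tokens * ~mask))
        # reconstruction loss is computed only on the masked patches
        return ((recon - patches) ** 2)[mask.expand_as(patches)].mean()

model = TinyMaskedAutoencoder()
loss = model(torch.randn(2, 1, 128, 128))   # one self-supervised step
loss.backward()
```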

Fully automated measurement of aortic pulse wave velocity from routine cardiac MRI studies.

Jiang Y, Yao T, Paliwal N, Knight D, Punjabi K, Steeden J, Hughes AD, Muthurangu V, Davies R

PubMed · May 30, 2025
Aortic pulse wave velocity (PWV) is a prognostic biomarker for cardiovascular disease, which can be measured by dividing the aortic path length by the pulse transit time. However, current MRI techniques require special sequences and time-consuming manual analysis. We aimed to fully automate the process using deep learning to measure PWV from standard sequences, facilitating PWV measurement in routine clinical and research scans. A deep learning (DL) model was developed to generate high-resolution 3D aortic segmentations from routine 2D trans-axial SSFP localizer images, and the centerlines of the resulting segmentations were used to estimate the aortic path length. A further DL model was built to automatically segment the ascending and descending aorta in phase contrast images, and pulse transit time was estimated from the sampled flow curves. Quantitative comparison with trained observers was performed for path length, aortic flow segmentation and transit time, using either an external clinical dataset in which both localizers and paired 3D images had been acquired, or a sample of UK Biobank subjects. Potential application to clinical research scans was evaluated on 1053 subjects from the UK Biobank. Aortic path length measurement was accurate, with no major difference between the proposed method (125 ± 19 mm) and manual measurement by a trained observer (124 ± 19 mm) (P = 0.88). Automated phase contrast image segmentation was similar to that of a trained observer for both the ascending (Dice vs manual: 0.96) and descending (Dice: 0.89) aorta, with no major difference in transit time estimation (proposed method = 21 ± 9 ms, manual = 22 ± 9 ms; P = 0.15). 966 of 1053 (92%) UK Biobank subjects were successfully analyzed, with a median PWV of 6.8 m/s, increasing by 27% per decade of age and by 6.5% per 10 mmHg of systolic blood pressure. We describe a fully automated method for measuring PWV from standard cardiac MRI localizers and a single phase contrast imaging plane. The method is robust, can be applied to routine clinical scans, and could unlock the potential of measuring PWV in large-scale clinical and population studies. All models and deployment code are available online.
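
The measurement itself is simple arithmetic: PWV is the aortic path length divided by the pulse transit time. Plugging in the mean values reported above (path length 125 mm, transit time 21 ms) gives a velocity in the expected range; the snippet below reproduces only that final division, not the segmentation or centerline steps:

```python
# PWV = aortic path length / pulse transit time.
# Values below are the mean path length and transit time quoted in the abstract;
# this reproduces only the final arithmetic, not the DL segmentation pipeline.
path_length_mm = 125.0      # centerline length from the 3D aortic segmentation
transit_time_ms = 21.0      # delay between ascending and descending flow curves

pwv_m_per_s = (path_length_mm / 1000.0) / (transit_time_ms / 1000.0)
print(f"PWV = {pwv_m_per_s:.1f} m/s")   # about 6.0 m/s, near the reported 6.8 m/s median
```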

ACM-UNet: Adaptive Integration of CNNs and Mamba for Efficient Medical Image Segmentation

Jing Huang, Yongkang Zhao, Yuhan Li, Zhitao Dai, Cheng Chen, Qiying Lai

arXiv preprint · May 30, 2025
The U-shaped encoder-decoder architecture with skip connections has become a prevailing paradigm in medical image segmentation due to its simplicity and effectiveness. Many recent works aim to improve this framework by designing more powerful encoders and decoders, employing advanced convolutional neural networks (CNNs) for local feature extraction, Transformers or state space models (SSMs) such as Mamba for global context modeling, or hybrid combinations of both; however, these methods often struggle to fully utilize pretrained vision backbones (e.g., ResNet, ViT, VMamba) due to structural mismatches. To bridge this gap, we introduce ACM-UNet, a general-purpose segmentation framework that retains a simple UNet-like design while effectively incorporating pretrained CNNs and Mamba models through a lightweight adapter mechanism. This adapter resolves architectural incompatibilities and enables the model to harness the complementary strengths of CNNs and SSMs, namely fine-grained local detail extraction and long-range dependency modeling. Additionally, we propose a hierarchical multi-scale wavelet transform module in the decoder to enhance feature fusion and reconstruction fidelity. Extensive experiments on the Synapse and ACDC benchmarks demonstrate that ACM-UNet achieves state-of-the-art performance while remaining computationally efficient. Notably, it reaches an 85.12% Dice score and 13.89 mm HD95 on the Synapse dataset with 17.93G FLOPs, showcasing its effectiveness and scalability. Code is available at: https://github.com/zyklcode/ACM-UNet.
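
The adapter idea, learning a small projection so that feature maps from an off-the-shelf pretrained backbone match the channel widths a UNet-style decoder expects, can be sketched in a few lines. The channel sizes and 1x1-convolution design below are illustrative assumptions rather than the published ACM-UNet adapter:

```python
# Sketch of a lightweight adapter that maps pretrained-backbone feature maps onto
# the channel widths a UNet-style decoder expects. All channel sizes are assumptions.
import torch
import torch.nn as nn

class StageAdapter(nn.Module):
    """1x1 conv + norm that reconciles backbone channels with decoder channels."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.proj = nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=1),
                                  nn.BatchNorm2d(out_ch), nn.GELU())

    def forward(self, x):
        return self.proj(x)

# e.g. ResNet-like stages emit 256/512/1024/2048 channels, decoder expects 64/128/256/512
backbone_channels = [256, 512, 1024, 2048]
decoder_channels = [64, 128, 256, 512]
adapters = nn.ModuleList(StageAdapter(b, d)
                         for b, d in zip(backbone_channels, decoder_channels))

feats = [torch.randn(1, c, 56 // 2**i, 56 // 2**i) for i, c in enumerate(backbone_channels)]
skips = [a(f) for a, f in zip(adapters, feats)]   # now usable as UNet skip connections
print([s.shape for s in skips])
```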

Bidirectional Projection-Based Multi-Modal Fusion Transformer for Early Detection of Cerebral Palsy in Infants.

Qi K, Huang T, Jin C, Yang Y, Ying S, Sun J, Yang J

PubMed · May 30, 2025
Periventricular white matter injury (PWMI) is the most frequent magnetic resonance imaging (MRI) finding in infants with Cerebral Palsy (CP). We aim to detect CP and identify subtle, sparse PWMI lesions in infants under two years of age with immature brain structures. Based on the characteristic that the responsible lesions are located within five target regions, we first construct a multi-modal dataset including 243 cases with mask annotations of the five target regions for delineating anatomical structures on T1-Weighted Imaging (T1WI) images, masks for lesions on T2-Weighted Imaging (T2WI) images, and categories (CP or Non-CP). Furthermore, we develop a bidirectional projection-based multi-modal fusion transformer (BiP-MFT), incorporating a Bidirectional Projection Fusion Module (BPFM) for integrating the features between the five target regions on T1WI images and lesions on T2WI images. Our BiP-MFT achieves subject-level classification accuracy of 0.90, specificity of 0.87, and sensitivity of 0.94. It surpasses the best results of nine comparative methods, with improvements of 0.10, 0.08, and 0.09 in classification accuracy, specificity, and sensitivity, respectively. Our BPFM outperforms eight compared feature fusion strategies using Transformer and U-Net backbones on our dataset. Ablation studies on the dataset annotations and model components validate the effectiveness of our annotation method and the rationality of the model design. The proposed dataset and codes are available at https://github.com/Kai-Qi/BiP-MFT.
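
The fusion step pairs anatomical-region features from T1WI with lesion features from T2WI. As a loose, generic stand-in for that pattern (not the authors' BPFM, whose exact projection scheme is in the linked repository), a bidirectional cross-attention sketch with placeholder dimensions is shown below:

```python
# Generic bidirectional cross-attention between two modality token sets, as a
# rough stand-in for fusing T1WI region features with T2WI lesion features.
# Dimensions and head counts are placeholders, not the published BPFM design.
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.t1_to_t2 = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2_to_t1 = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, t1_tokens, t2_tokens):
        # T1 region tokens query the T2 lesion tokens, and vice versa
        t1_fused, _ = self.t1_to_t2(t1_tokens, t2_tokens, t2_tokens)
        t2_fused, _ = self.t2_to_t1(t2_tokens, t1_tokens, t1_tokens)
        return t1_fused + t1_tokens, t2_fused + t2_tokens   # residual fusion

fusion = BidirectionalCrossAttention()
t1 = torch.randn(2, 5, 256)    # e.g. five target-region tokens per subject
t2 = torch.randn(2, 32, 256)   # e.g. lesion patch tokens
f1, f2 = fusion(t1, t2)
```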

Deep Modeling and Optimization of Medical Image Classification

Yihang Wu, Muhammad Owais, Reem Kateb, Ahmad Chaddad

arXiv preprint · May 29, 2025
Deep models, such as convolutional neural networks (CNNs) and vision transformers (ViTs), demonstrate remarkable performance in image classification. However, these deep models require large amounts of data for fine-tuning, which is impractical in the medical domain due to data privacy concerns. Furthermore, despite the feasible performance of contrastive language-image pre-training (CLIP) in the natural domain, the potential of CLIP has not been fully investigated in the medical field. To address these challenges, we considered three scenarios: 1) we introduce a novel CLIP variant using four CNNs and eight ViTs as image encoders for the classification of brain cancer and skin cancer, 2) we combine 12 deep models with two federated learning techniques to protect data privacy, and 3) we involve traditional machine learning (ML) methods to improve the generalization ability of those deep models on unseen domain data. The experimental results indicate that maxvit shows the highest averaged (AVG) test metrics (AVG = 87.03%) on the HAM10000 dataset with multimodal learning, while convnext_l demonstrates a remarkable test F1-score of 83.98% compared to swin_b with 81.33% in the federated learning setting. Furthermore, the use of a support vector machine (SVM) can improve the overall test metrics by an AVG of ~2% for the Swin Transformer series on ISIC2018. Our codes are available at https://github.com/AIPMLab/SkinCancerSimulation.
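
The third scenario, fitting a traditional ML classifier on top of frozen deep features, follows a common pattern. A minimal scikit-learn sketch is given below; the random features, dataset size, and SVM hyperparameters are placeholders rather than the paper's configuration:

```python
# Sketch of the "classical ML on top of deep features" idea: fit an SVM on
# embeddings from a frozen backbone. The random arrays below stand in for real
# image embeddings; the paper's actual extractor and data are not reproduced here.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
features = rng.normal(size=(600, 512))          # pretend: frozen ViT/CNN embeddings
labels = rng.integers(0, 2, size=600)           # pretend: binary lesion labels

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)))
```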

ADC-MambaNet: A Lightweight U-Shaped Architecture with Mamba and Multi-Dimensional Priority Attention for Medical Image Segmentation.

Nguyen TN, Ho QH, Nguyen VQ, Pham VT, Tran TT

PubMed · May 29, 2025
Medical image segmentation is an increasingly crucial step in assisting with disease detection and diagnosis. However, medical images often exhibit complex structures and textures, so highly complex methods are needed. In particular, Deep Learning methods often require large-scale pretraining, leading to significant memory demands and increased computational costs. The well-known Convolutional Neural Networks (CNNs) have become the backbone of medical image segmentation tasks thanks to their effective feature extraction abilities, but they often struggle to capture global context due to the limited sizes of their kernels. To address this, various Transformer-based models have been introduced to learn long-range dependencies through self-attention mechanisms; however, these architectures typically incur relatively high computational complexity.
Methods: To address the aforementioned challenges, we propose a lightweight and computationally efficient model named ADC-MambaNet, which combines conventional depthwise convolutional layers with the Mamba algorithm to address the computational complexity of Transformers. In the proposed model, a new feature extractor named the Harmonious Mamba-Convolution (HMC) block and a Multi-Dimensional Priority Attention (MDPA) block are designed. These blocks enhance the feature extraction process, thereby improving the overall performance of the model; in particular, they enable the model to effectively capture local and global patterns from the feature maps while keeping computational costs low. A novel loss function called the Balanced Normalized Cross Entropy is also introduced, which delivers promising performance compared to other losses. Evaluations on five public medical image datasets (ISIC 2018 Lesion Segmentation, PH2, Data Science Bowl 2018, GlaS, and Lung X-ray) demonstrate that ADC-MambaNet achieves higher evaluation scores while maintaining compact parameters and low computational complexity.
Conclusion: ADC-MambaNet offers a promising solution for accurate and efficient medical image segmentation, especially in resource-limited or edge-computing environments. Implementation code will be publicly accessible at: https://github.com/nqnguyen812/mambaseg-model.
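
One building block named in the Methods, the depthwise convolution, is what keeps parameter counts low: each channel is filtered independently and then mixed by a 1x1 pointwise convolution. The sketch below shows only that standard depthwise-separable pattern, not the HMC or MDPA blocks themselves:

```python
# Standard depthwise-separable convolution, the low-cost convolutional primitive
# the abstract refers to. This is the generic pattern, not the HMC/MDPA blocks.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # groups=in_ch => one filter per channel (depthwise), then a 1x1 pointwise mix
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

block = DepthwiseSeparableConv(32, 64)
print(sum(p.numel() for p in block.parameters()))   # far fewer parameters than a full 3x3 conv
print(block(torch.randn(1, 32, 64, 64)).shape)      # torch.Size([1, 64, 64, 64])
```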

Gaussian random fields as an abstract representation of patient metadata for multimodal medical image segmentation.

Cassidy B, McBride C, Kendrick C, Reeves ND, Pappachan JM, Raad S, Yap MH

PubMed · May 29, 2025
Growing rates of chronic wound occurrence, especially in patients with diabetes, have become a concerning trend. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact on patients and clinicians. We present a novel multimodal segmentation method which allows for the introduction of patient metadata into the training workflow, whereby the patient data are expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on a different metadata category. Using the Diabetic Foot Ulcer Challenge 2022 test set, when compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908), we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf.
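
A Gaussian random field can be produced by smoothing white noise with a Gaussian kernel and then stacked as an extra input channel. The abstract does not specify how metadata values modulate the fields, so the simple scaling used in the sketch below is purely an illustrative assumption:

```python
# Sketch of turning a scalar patient-metadata value into a 2D Gaussian random
# field that can be concatenated as an extra input channel. Generating the field
# by Gaussian-smoothing white noise is standard; scaling it by the metadata value
# is an assumption for illustration, not the paper's mapping.
import numpy as np
from scipy.ndimage import gaussian_filter

def metadata_to_grf(value, shape=(256, 256), sigma=16.0, seed=0):
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=shape)
    field = gaussian_filter(noise, sigma=sigma)           # smoothed noise -> GRF
    field = (field - field.mean()) / (field.std() + 1e-8) # normalise
    return field * value                                  # assumed: scale by metadata

age_channel = metadata_to_grf(value=0.65)                 # e.g. normalised age
print(age_channel.shape, age_channel.std())
```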

Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning

Jinquan Guan, Qi Chen, Lizhou Liang, Yuhang Liu, Vu Minh Hieu Phan, Minh-Son To, Jian Chen, Yutong Xie

arXiv preprint · May 29, 2025
Artificial intelligence (AI)-based chest X-ray (CXR) interpretation assistants have demonstrated significant progress and are increasingly being applied in clinical settings. However, contemporary medical AI models often adhere to a simplistic input-to-output paradigm, directly processing an image and an instruction to generate a result, where the instructions may be integral to the model's architecture. This approach overlooks the modeling of the inherent diagnostic reasoning in chest X-ray interpretation. Such reasoning is typically sequential, where each interpretive stage considers the images, the current task, and the contextual information from previous stages. This oversight leads to several shortcomings, including misalignment with clinical scenarios, contextless reasoning, and untraceable errors. To fill this gap, we construct CXRTrek, a new multi-stage visual question answering (VQA) dataset for CXR interpretation. The dataset is the first designed to explicitly simulate the diagnostic reasoning process employed by radiologists in real-world clinical settings. CXRTrek covers 8 sequential diagnostic stages, comprising 428,966 samples and over 11 million question-answer (Q&A) pairs, with an average of 26.29 Q&A pairs per sample. Building on the CXRTrek dataset, we propose a new vision-language large model (VLLM), CXRTrekNet, specifically designed to incorporate the clinical reasoning flow into the VLLM framework. CXRTrekNet effectively models the dependencies between diagnostic stages and captures reasoning patterns within the radiological context. Trained on our dataset, the model consistently outperforms existing medical VLLMs on the CXRTrek benchmarks and demonstrates superior generalization across multiple tasks on five diverse external datasets. The dataset and model can be found in our repository (https://github.com/guanjinquan/CXRTrek).

DeepChest: Dynamic Gradient-Free Task Weighting for Effective Multi-Task Learning in Chest X-ray Classification

Youssef Mohamed, Noran Mohamed, Khaled Abouhashad, Feilong Tang, Sara Atito, Shoaib Jameel, Imran Razzak, Ahmed B. Zaky

arXiv preprint · May 29, 2025
While Multi-Task Learning (MTL) offers inherent advantages in complex domains such as medical imaging by enabling shared representation learning, effectively balancing task contributions remains a significant challenge. This paper addresses this critical issue by introducing DeepChest, a novel, computationally efficient and effective dynamic task-weighting framework specifically designed for multi-label chest X-ray (CXR) classification. Unlike existing heuristic or gradient-based methods that often incur substantial overhead, DeepChest leverages a performance-driven weighting mechanism based on effective analysis of task-specific loss trends. Given a network architecture (e.g., ResNet18), our model-agnostic approach adaptively adjusts task importance without requiring gradient access, thereby significantly reducing memory usage and achieving a threefold increase in training speed. It can be easily applied to improve various state-of-the-art methods. Extensive experiments on a large-scale CXR dataset demonstrate that DeepChest not only outperforms state-of-the-art MTL methods by 7% in overall accuracy but also yields substantial reductions in individual task losses, indicating improved generalization and effective mitigation of negative transfer. The efficiency and performance gains of DeepChest pave the way for more practical and robust deployment of deep learning in critical medical diagnostic applications. The code is publicly available at https://github.com/youssefkhalil320/DeepChest-MTL
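
The weighting mechanism is gradient-free: task weights are updated from the recent trend of each task's loss rather than from gradient statistics. One plausible rule of that kind, giving more weight to tasks whose loss is improving slowly, is sketched below; the actual DeepChest update is in the linked repository and may differ:

```python
# Rough sketch of gradient-free task weighting driven by per-task loss trends:
# tasks whose loss is improving more slowly receive a larger weight. This is an
# illustrative rule, not the exact DeepChest update.
import numpy as np

def update_weights(loss_history, temperature=1.0):
    """loss_history: dict task -> list of recent scalar losses (oldest first)."""
    trends = {}
    for task, losses in loss_history.items():
        # relative improvement over the recent window; small or negative => stagnating
        trends[task] = (losses[0] - losses[-1]) / (losses[0] + 1e-8)
    # softmax over the *negative* improvement: slow-improving tasks get more weight
    scores = np.array([-trends[t] / temperature for t in loss_history])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return dict(zip(loss_history, weights))

history = {"atelectasis": [0.70, 0.55, 0.45],   # improving quickly
           "effusion":    [0.60, 0.59, 0.58]}   # nearly flat
print(update_weights(history))                  # the flat task receives the larger weight
```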

ImmunoDiff: A Diffusion Model for Immunotherapy Response Prediction in Lung Cancer

Moinak Bhattacharya, Judy Huang, Amna F. Sher, Gagandeep Singh, Chao Chen, Prateek Prasanna

arXiv preprint · May 29, 2025
Accurately predicting immunotherapy response in Non-Small Cell Lung Cancer (NSCLC) remains a critical unmet need. Existing radiomics and deep learning-based predictive models rely primarily on pre-treatment imaging to predict categorical response outcomes, limiting their ability to capture the complex morphological and textural transformations induced by immunotherapy. This study introduces ImmunoDiff, an anatomy-aware diffusion model designed to synthesize post-treatment CT scans from baseline imaging while incorporating clinically relevant constraints. The proposed framework integrates anatomical priors, specifically lobar and vascular structures, to enhance fidelity in CT synthesis. We also introduce a novel cbi-Adapter, a conditioning module that ensures pairwise-consistent multimodal integration of imaging and clinical data embeddings. In addition, a clinical variable conditioning mechanism refines the generative process by leveraging demographic data, blood-based biomarkers, and PD-L1 expression. Evaluations on an in-house NSCLC cohort treated with immune checkpoint inhibitors demonstrate a 21.24% improvement in balanced accuracy for response prediction and a 0.03 increase in c-index for survival prediction. Code will be released soon.