Scale-Aware Super-Resolution Network With Dual Affinity Learning for Lesion Segmentation From Medical Images.

Luo L, Li Y, Chai Z, Lin H, Heng PA, Chen H

PubMed · Jun 1 2025
Convolutional neural networks (CNNs) have shown remarkable progress in medical image segmentation. However, lesion segmentation remains a challenge for state-of-the-art CNN-based algorithms due to the variance in lesion scales and shapes. On the one hand, tiny lesions are hard to delineate precisely from medical images, which are often of low resolution. On the other hand, segmenting large lesions requires large receptive fields, which exacerbates the first challenge. In this article, we present a scale-aware super-resolution (SR) network to adaptively segment lesions of various sizes from low-resolution (LR) medical images. Our proposed network contains dual branches to simultaneously conduct lesion mask SR (LMSR) and lesion image SR (LISR). Meanwhile, we introduce scale-aware dilated convolution (SDC) blocks into the multitask decoders to adaptively adjust the receptive fields of the convolutional kernels according to the lesion sizes. To guide the segmentation branch to learn from richer high-resolution (HR) features, we propose a feature affinity (FA) module and a scale affinity (SA) module to enhance the multitask learning of the dual branches. On multiple challenging lesion segmentation datasets, our proposed network achieved consistent improvements over other state-of-the-art methods. Code will be available at: https://github.com/poiuohke/SASR_Net.
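
The abstract does not detail the internal design of the SDC block, but the general idea of a scale-aware dilated convolution can be sketched as parallel dilated convolutions mixed by an input-dependent gate. The block below is a minimal illustration only; the dilation rates, gating mechanism, and overall structure are assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a "scale-aware" dilated convolution block: parallel dilated
# convolutions whose outputs are mixed by a learned, input-dependent gate.
import torch
import torch.nn as nn

class ScaleAwareDilatedConv(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):   # dilation rates are assumed
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        # Global pooling + 1x1 conv predicts a soft weight per dilation branch.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(dilations), kernel_size=1),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        weights = self.gate(x)                                      # (B, K, 1, 1)
        outs = torch.stack([b(x) for b in self.branches], dim=1)   # (B, K, C, H, W)
        return (weights.unsqueeze(2) * outs).sum(dim=1)            # (B, C, H, W)

x = torch.randn(2, 32, 64, 64)
print(ScaleAwareDilatedConv(32)(x).shape)  # torch.Size([2, 32, 64, 64])
```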

CineMA: A Foundation Model for Cine Cardiac MRI

Yunguan Fu, Weixi Yi, Charlotte Manisty, Anish N Bhuva, Thomas A Treibel, James C Moon, Matthew J Clarkson, Rhodri Huw Davies, Yipeng Hu

arXiv preprint · May 31 2025
Cardiac magnetic resonance (CMR) is a key investigation in clinical cardiovascular medicine and has been used extensively in population research. However, extracting clinically important measurements such as ejection fraction for diagnosing cardiovascular diseases remains time-consuming and subjective. We developed CineMA, a foundation AI model automating these tasks with limited labels. CineMA is a self-supervised autoencoder model trained on 74,916 cine CMR studies to reconstruct images from masked inputs. After fine-tuning, it was evaluated across eight datasets on 23 tasks from four categories: ventricle and myocardium segmentation, left and right ventricle ejection fraction calculation, disease detection and classification, and landmark localisation. CineMA is the first foundation model for cine CMR to match or outperform convolutional neural networks (CNNs). CineMA demonstrated greater label efficiency than CNNs, achieving comparable or better performance with fewer annotations. This reduces the burden of clinician labelling and supports replacing task-specific training with fine-tuning foundation models in future cardiac imaging applications. Models and code for pre-training and fine-tuning are available at https://github.com/mathpluscode/CineMA, democratising access to high-performance models that otherwise require substantial computational resources, promoting reproducibility and accelerating clinical translation.
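
A minimal sketch of the masked-reconstruction pre-training objective described above: a random subset of image patches is hidden and an autoencoder is trained to reconstruct them, with the loss computed only on the hidden patches. The patch size, mask ratio, and toy encoder/decoder below are illustrative assumptions, not CineMA's architecture.

```python
# Sketch of masked-image pre-training: hide random patches, reconstruct them.
import torch
import torch.nn as nn

def mask_patches(img, patch=16, mask_ratio=0.75):
    """Zero out a random subset of non-overlapping patches; return masked image and keep-mask."""
    b, c, h, w = img.shape
    gh, gw = h // patch, w // patch
    keep = torch.rand(b, 1, gh, gw, device=img.device) > mask_ratio
    mask = keep.float().repeat_interleave(patch, dim=2).repeat_interleave(patch, dim=3)
    return img * mask, mask

autoencoder = nn.Sequential(  # trivial stand-in for the real encoder/decoder
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

cine_frame = torch.randn(4, 1, 128, 128)        # dummy cine CMR frames
masked, mask = mask_patches(cine_frame)
recon = autoencoder(masked)
# Reconstruction loss is computed only on the masked (hidden) patches.
loss = ((recon - cine_frame) ** 2 * (1 - mask)).mean()
loss.backward()
```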

ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Ruiming Min, Minghao Liu

arXiv preprint · May 31 2025
With the advancement of modern medicine and the development of technologies such as MRI, CT, and cellular analysis, it has become increasingly critical for clinicians to accurately interpret various diagnostic images. However, modern medical education often faces challenges due to limited access to high-quality teaching materials, stemming from privacy concerns and a shortage of educational resources (Balogh et al., 2015). In this context, image data generated by machine learning models, particularly generative models, presents a promising solution. These models can create diverse and comparable imaging datasets without compromising patient privacy, thereby supporting modern medical education. In this study, we explore the use of convolutional neural networks (CNNs) and CycleGAN (Zhu et al., 2017) for generating synthetic medical images. The source code is available at https://github.com/mliuby/COMP4211-Project.
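
For readers unfamiliar with CycleGAN (Zhu et al., 2017), its central component is a cycle-consistency loss: translating an image to the other domain and back should recover the original. The snippet below sketches only that term, with trivial placeholder generators; the project's actual networks and full objective (which also includes adversarial losses) are not specified in the abstract.

```python
# Sketch of CycleGAN's cycle-consistency term with placeholder generators.
import torch
import torch.nn as nn

G_ab = nn.Conv2d(1, 1, 3, padding=1)  # domain A -> domain B (placeholder network)
G_ba = nn.Conv2d(1, 1, 3, padding=1)  # domain B -> domain A (placeholder network)
l1 = nn.L1Loss()

real_a = torch.randn(2, 1, 64, 64)
real_b = torch.randn(2, 1, 64, 64)

# A -> B -> A and B -> A -> B should each reconstruct the input.
cycle_loss = l1(G_ba(G_ab(real_a)), real_a) + l1(G_ab(G_ba(real_b)), real_b)
cycle_loss.backward()  # combined with adversarial losses in the full objective
```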

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Wei Dai, Peilin Chen, Chanakya Ekbote, Paul Pu Liang

arXiv preprint · May 31 2025
Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating the performance imbalance caused by skewed clinical data distributions. Training on 2.61 million instruction-tuning pairs spanning 9 clinical domains, we show that DRPO boosts diagnostic performance by 43% in macro-F1 on average across all visual domains compared with other critic-free training methods such as GRPO. Furthermore, because QoQ-Med is trained on intensive segmentation data, it can highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while matching the performance of OpenAI o4-mini. To foster reproducibility and downstream research, we release (i) the full model weights, (ii) the modular training pipeline, and (iii) all intermediate reasoning traces at https://github.com/DDVD233/QoQ_Med.
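
The abstract describes DRPO as scaling group-normalized rewards by domain rarity and modality difficulty. A rough sketch of that idea, with an assumed weighting formula (the paper's exact hierarchical scaling is not given here), might look like this:

```python
# Illustrative sketch: group-normalized advantages (as in critic-free methods
# like GRPO) re-weighted by domain rarity and difficulty. The scaling formula
# and constants below are assumptions for illustration only.
import torch

def drpo_advantages(rewards, domain_freq, domain_difficulty, eps=1e-6):
    """rewards: (G,) rewards for G rollouts of one prompt from a given domain."""
    adv = (rewards - rewards.mean()) / (rewards.std() + eps)  # group normalization
    scale = (1.0 / domain_freq) * domain_difficulty           # assumed weighting
    return adv * scale

rewards = torch.tensor([0.0, 1.0, 1.0, 0.0])  # e.g. correctness of 4 sampled answers
print(drpo_advantages(rewards, domain_freq=0.02, domain_difficulty=1.5))
```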

Physician-level classification performance across multiple imaging domains with a diagnostic medical foundation model and a large dataset of annotated medical images

Thieme, A. H., Miri, T., Marra, A. R., Kobayashi, T., Rodriguez-Nava, G., Li, Y., Barba, T., Er, A. G., Benzler, J., Gertler, M., Riechers, M., Hinze, C., Zheng, Y., Pelz, K., Nagaraj, D., Chen, A., Loeser, A., Ruehle, A., Zamboglou, C., Alyahya, L., Uhlig, M., Machiraju, G., Weimann, K., Lippert, C., Conrad, T., Ma, J., Novoa, R., Moor, M., Hernandez-Boussard, T., Alawad, M., Salinas, J. L., Mittermaier, M., Gevaert, O.

medRxiv preprint · May 31 2025
A diagnostic medical foundation model (MedFM) is an artificial intelligence (AI) system engineered to accurately determine diagnoses across various medical imaging modalities and specialties. To train MedFM, we created the PubMed Central Medical Images Dataset (PMCMID), the largest annotated medical image dataset to date, comprising 16,126,659 images from 3,021,780 medical publications. Using AI- and ontology-based methods, we identified 4,482,237 medical images (e.g., clinical photos, X-rays, ultrasounds) and generated comprehensive annotations. To optimize MedFM's performance and assess biases, 13,266 images were manually annotated to establish a multimodal benchmark. MedFM achieved physician-level performance in diagnosis tasks spanning radiology, dermatology, and infectious diseases without requiring specific training. Additionally, we developed the Image2Paper app, allowing clinicians to upload medical images and retrieve relevant literature; the correct diagnosis appeared within the top ten results in 88.4% of cases, and at least one relevant differential diagnosis appeared in 93.0%. MedFM and PMCMID have been made publicly available. Funding: Research reported here was partially supported by the National Cancer Institute (NCI) (R01 CA260271), the Saudi Company for Artificial Intelligence (SCAI) Authority, and the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under the project DAKI-FWS (01MK21009E). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Bidirectional Projection-Based Multi-Modal Fusion Transformer for Early Detection of Cerebral Palsy in Infants.

Qi K, Huang T, Jin C, Yang Y, Ying S, Sun J, Yang J

PubMed · May 30 2025
Periventricular white matter injury (PWMI) is the most frequent magnetic resonance imaging (MRI) finding in infants with cerebral palsy (CP). We aim to detect CP and identify subtle, sparse PWMI lesions in infants under two years of age with immature brain structures. Based on the observation that the responsible lesions are located within five target regions, we first construct a multi-modal dataset of 243 cases, including mask annotations of the five target regions delineating anatomical structures on T1-weighted imaging (T1WI) images, masks for lesions on T2-weighted imaging (T2WI) images, and categories (CP or non-CP). We then develop a bidirectional projection-based multi-modal fusion transformer (BiP-MFT), incorporating a Bidirectional Projection Fusion Module (BPFM) for integrating features between the five target regions on T1WI images and lesions on T2WI images. Our BiP-MFT achieves subject-level classification accuracy of 0.90, specificity of 0.87, and sensitivity of 0.94, surpassing the best results of nine comparative methods by 0.10, 0.08, and 0.09 in classification accuracy, specificity, and sensitivity, respectively. Our BPFM outperforms eight compared feature fusion strategies using Transformer and U-Net backbones on our dataset. Ablation studies on the dataset annotations and model components confirm the effectiveness of our annotation method and the soundness of the model design. The proposed dataset and code are available at https://github.com/Kai-Qi/BiP-MFT.
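
The BPFM mechanism is not detailed in the abstract; one generic way to realise "bidirectional" fusion between the T1WI region stream and the T2WI lesion stream is two cross-attention passes, one in each direction, as sketched below. Treat this purely as an illustration of the concept, not the paper's design.

```python
# Generic bidirectional cross-attention fusion between two modality streams.
import torch
import torch.nn as nn

class BidirectionalFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.t1_to_t2 = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2_to_t1 = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, t1_tokens, t2_tokens):
        # Each stream queries the other; the updated streams are concatenated.
        t1_upd, _ = self.t2_to_t1(t1_tokens, t2_tokens, t2_tokens)
        t2_upd, _ = self.t1_to_t2(t2_tokens, t1_tokens, t1_tokens)
        return torch.cat([t1_upd, t2_upd], dim=1)

t1 = torch.randn(2, 196, 64)   # dummy token sequences from each modality
t2 = torch.randn(2, 196, 64)
print(BidirectionalFusion()(t1, t2).shape)  # torch.Size([2, 392, 64])
```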

ACM-UNet: Adaptive Integration of CNNs and Mamba for Efficient Medical Image Segmentation

Jing Huang, Yongkang Zhao, Yuhan Li, Zhitao Dai, Cheng Chen, Qiying Lai

arXiv preprint · May 30 2025
The U-shaped encoder-decoder architecture with skip connections has become a prevailing paradigm in medical image segmentation due to its simplicity and effectiveness. Many recent works aim to improve this framework by designing more powerful encoders and decoders, employing advanced convolutional neural networks (CNNs) for local feature extraction, Transformers or state space models (SSMs) such as Mamba for global context modeling, or hybrid combinations of both; however, these methods often struggle to fully utilize pretrained vision backbones (e.g., ResNet, ViT, VMamba) due to structural mismatches. To bridge this gap, we introduce ACM-UNet, a general-purpose segmentation framework that retains a simple UNet-like design while effectively incorporating pretrained CNNs and Mamba models through a lightweight adapter mechanism. This adapter resolves architectural incompatibilities and enables the model to harness the complementary strengths of CNNs and SSMs, namely fine-grained local detail extraction and long-range dependency modeling. Additionally, we propose a hierarchical multi-scale wavelet transform module in the decoder to enhance feature fusion and reconstruction fidelity. Extensive experiments on the Synapse and ACDC benchmarks demonstrate that ACM-UNet achieves state-of-the-art performance while remaining computationally efficient. Notably, it reaches an 85.12% Dice score and 13.89 mm HD95 on the Synapse dataset with 17.93G FLOPs, showcasing its effectiveness and scalability. Code is available at: https://github.com/zyklcode/ACM-UNet.
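
As a rough illustration of the "lightweight adapter" idea, per-scale 1x1 projections can map features from a pretrained CNN backbone and a pretrained Mamba/SSM backbone into a common channel width for a plain UNet-style decoder. The channel sizes and fusion-by-addition below are assumptions, not ACM-UNet's actual adapter.

```python
# Sketch: per-scale 1x1 projections align two backbones' features for a shared decoder.
import torch
import torch.nn as nn

class FeatureAdapter(nn.Module):
    def __init__(self, cnn_channels, ssm_channels, out_channels):
        super().__init__()
        self.cnn_proj = nn.ModuleList(nn.Conv2d(c, o, 1) for c, o in zip(cnn_channels, out_channels))
        self.ssm_proj = nn.ModuleList(nn.Conv2d(c, o, 1) for c, o in zip(ssm_channels, out_channels))

    def forward(self, cnn_feats, ssm_feats):
        # Fuse per-scale features from the two backbones by projection + addition.
        return [pc(fc) + ps(fs)
                for pc, ps, fc, fs in zip(self.cnn_proj, self.ssm_proj, cnn_feats, ssm_feats)]

adapter = FeatureAdapter([256, 512], [96, 192], [64, 128])   # channel widths are assumed
cnn_feats = [torch.randn(1, 256, 56, 56), torch.randn(1, 512, 28, 28)]
ssm_feats = [torch.randn(1, 96, 56, 56), torch.randn(1, 192, 28, 28)]
print([f.shape for f in adapter(cnn_feats, ssm_feats)])
```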

Fully automated measurement of aortic pulse wave velocity from routine cardiac MRI studies.

Jiang Y, Yao T, Paliwal N, Knight D, Punjabi K, Steeden J, Hughes AD, Muthurangu V, Davies R

PubMed · May 30 2025
Aortic pulse wave velocity (PWV) is a prognostic biomarker for cardiovascular disease, which can be measured by dividing the aortic path length by the pulse transit time. However, current MRI techniques require special sequences and time-consuming manual analysis. We aimed to fully automate the process using deep learning to measure PWV from standard sequences, facilitating PWV measurement in routine clinical and research scans. A deep learning (DL) model was developed to generate high-resolution 3D aortic segmentations from routine 2D trans-axial SSFP localizer images, and the centerlines of the resulting segmentations were used to estimate the aortic path length. A further DL model was built to automatically segment the ascending and descending aorta in phase contrast images, and pulse transit time was estimated from the sampled flow curves. Quantitative comparison with trained observers was performed for path length, aortic flow segmentation, and transit time, using either an external clinical dataset with both localizers and paired 3D images, or a sample of UK Biobank subjects. Potential application to clinical research scans was evaluated on 1053 subjects from the UK Biobank. Aortic path length measurement was accurate, with no major difference between the proposed method (125 ± 19 mm) and manual measurement by a trained observer (124 ± 19 mm) (P = 0.88). Automated phase contrast image segmentation was similar to that of a trained observer for both the ascending (Dice vs manual: 0.96) and descending (Dice 0.89) aorta, with no major difference in transit time estimation (proposed method = 21 ± 9 ms, manual = 22 ± 9 ms; P = 0.15). 966 of 1053 (92%) UK Biobank subjects were successfully analyzed, with a median PWV of 6.8 m/s, increasing by 27% per decade of age and by 6.5% per 10 mmHg increase in systolic blood pressure. We describe a fully automated method for measuring PWV from standard cardiac MRI localizers and a single phase contrast imaging plane. The method is robust, can be applied to routine clinical scans, and could unlock the potential of measuring PWV in large-scale clinical and population studies. All models and deployment code are available online.
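
The PWV calculation itself is simple arithmetic once the two automated measurements are available: path length divided by transit time, with units converted to m/s. The values below are taken from the means reported in the abstract for illustration.

```python
# Illustrative PWV calculation using the mean values quoted in the abstract.
path_length_mm = 125.0   # from the centerline of the 3D aortic segmentation
transit_time_ms = 21.0   # from ascending vs. descending aortic flow curves

pwv_m_per_s = (path_length_mm / 1000.0) / (transit_time_ms / 1000.0)
print(f"PWV = {pwv_m_per_s:.1f} m/s")  # ~6.0 m/s, close to the reported median of 6.8 m/s
```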

Gaussian random fields as an abstract representation of patient metadata for multimodal medical image segmentation.

Cassidy B, McBride C, Kendrick C, Reeves ND, Pappachan JM, Raad S, Yap MH

PubMed · May 29 2025
The growing rate of chronic wound occurrence, especially in patients with diabetes, has become a concerning trend. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact on patients and clinicians. We present a novel multimodal segmentation method which allows for the introduction of patient metadata into the training workflow, whereby the patient data are expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on a different metadata category. Using the Diabetic Foot Ulcer Challenge 2022 test set, compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908), we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf.
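
The abstract does not specify how metadata values are turned into Gaussian random fields; one plausible construction, shown below as a sketch only, smooths white noise with a Gaussian kernel and shifts it by the normalized metadata value before concatenating it to the image as an extra channel.

```python
# Sketch: express a scalar metadata value as a smooth random field channel.
import numpy as np
from scipy.ndimage import gaussian_filter

def metadata_to_grf(value, shape=(256, 256), smoothing=16, seed=0):
    """Smooth a white-noise field and shift it by the (normalized) metadata value."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(shape)
    field = gaussian_filter(noise, sigma=smoothing)
    field = (field - field.mean()) / (field.std() + 1e-8)
    return field + value          # metadata shifts the field's mean

age_channel = metadata_to_grf(value=0.63, seed=42)        # e.g. normalized age
image = np.random.rand(256, 256, 3)                       # dummy wound photo
model_input = np.dstack([image, age_channel[..., None]])  # (256, 256, 4)
print(model_input.shape)
```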