Page 320 of 343 (3,422 results)

Deep Learning-Based Breast Cancer Detection in Mammography: A Multi-Center Validation Study in Thai Population

Isarun Chamveha, Supphanut Chaiyungyuen, Sasinun Worakriangkrai, Nattawadee Prasawang, Warasinee Chaisangmongkon, Pornpim Korpraphong, Voraparee Suvannarerg, Shanigarn Thiravit, Chalermdej Kannawat, Kewalin Rungsinaporn, Suwara Issaragrisil, Payia Chadbunchachai, Pattiya Gatechumpol, Chawiporn Muktabhant, Patarachai Sereerat

arxiv logopreprintMay 29 2025
This study presents a deep learning system for breast cancer detection in mammography, developed using a modified EfficientNetV2 architecture with enhanced attention mechanisms. The model was trained on mammograms from a major Thai medical center and validated on three distinct datasets: an in-domain test set (9,421 cases), a biopsy-confirmed set (883 cases), and an out-of-domain generalizability set (761 cases) collected from two different hospitals. For cancer detection, the model achieved AUROCs of 0.89, 0.96, and 0.94 on the respective datasets. The system's lesion localization capability, evaluated using metrics including Lesion Localization Fraction (LLF) and Non-Lesion Localization Fraction (NLF), demonstrated robust performance in identifying suspicious regions. Clinical validation through concordance tests showed strong agreement with radiologists: 83.5% classification and 84.0% localization concordance for biopsy-confirmed cases, and 78.1% classification and 79.6% localization concordance for out-of-domain cases. Expert radiologists' acceptance rate also averaged 96.7% for biopsy-confirmed cases and 89.3% for out-of-domain cases. The system achieved a System Usability Scale score of 74.17 for the source hospital and 69.20 for the validation hospitals, indicating good clinical acceptance. These results demonstrate the model's effectiveness in assisting mammogram interpretation, with the potential to enhance breast cancer screening workflows in clinical practice.
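The AUROC figures above (0.89–0.96) summarize how well the model's continuous scores separate cancer from non-cancer cases. As an illustration only (not the authors' evaluation code), AUROC can be computed from labels and scores via the Mann-Whitney U statistic:

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case is scored above a random negative.
    Ties count as one half."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: scores that mostly separate positives from negatives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 0.75
```

In practice a library routine such as scikit-learn's `roc_auc_score` would be used; the pairwise form above is just the definition made explicit.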

Deep Learning CAIPIRINHA-VIBE Improves and Accelerates Head and Neck MRI.

Nitschke LV, Lerchbaumer M, Ulas T, Deppe D, Nickel D, Geisel D, Kubicka F, Wagner M, Walter-Rittel T

pubmed logopapersMay 29 2025
The aim of this study was to evaluate image quality for contrast-enhanced (CE) neck MRI with a deep learning-reconstructed VIBE sequence with acceleration factors (AF) 4 (DL4-VIBE) and 6 (DL6-VIBE). Patients referred for neck MRI were examined in a 3-Tesla scanner in this prospective, single-center study. Four CE fat-saturated (FS) VIBE sequences were acquired in each patient: Star-VIBE (4:01 min), VIBE (2:05 min), DL4-VIBE (0:24 min), DL6-VIBE (0:17 min). Image quality was evaluated by three radiologists with a 5-point Likert scale and included overall image quality, muscle contour delineation, conspicuity of mucosa and pharyngeal musculature, FS uniformity, and motion artifacts. Objective image quality was assessed with signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and quantification of metal artifacts. 68 patients (60.3% male; mean age 57.4±16 years) were included in this study. DL4-VIBE was superior for overall image quality, delineation of muscle contours, differentiation of mucosa and pharyngeal musculature, vascular delineation, and motion artifacts. Notably, DL4-VIBE exhibited exceptional FS uniformity (p<0.001). SNR and CNR were superior for DL4-VIBE compared to all other sequences (p<0.001). Metal artifacts were least pronounced in the standard VIBE, followed by DL4-VIBE (p<0.001). Although DL6-VIBE was inferior to DL4-VIBE, it demonstrated improved FS homogeneity, delineation of pharyngeal mucosa, and CNR compared to Star-VIBE and VIBE. DL4-VIBE significantly improves image quality for CE neck MRI with a fraction of the scan time of conventional sequences.

Motion-resolved parametric imaging derived from short dynamic [<sup>18</sup>F]FDG PET/CT scans.

Artesani A, van Sluis J, Providência L, van Snick JH, Slart RHJA, Noordzij W, Tsoumpas C

pubmed logopapersMay 29 2025
This study aims to assess the added value of utilizing short-dynamic whole-body PET/CT scans and implementing motion correction before quantifying metabolic rate, offering more insights into physiological processes. While this approach may not be commonly adopted, addressing motion effects is crucial due to their demonstrated potential to cause significant errors in parametric imaging. A 15-minute dynamic FDG PET acquisition protocol was utilized for four lymphoma patients undergoing therapy evaluation. Parametric imaging was obtained using a population-based input function (PBIF) derived from twelve patients with full 65-minute dynamic FDG PET acquisitions. AI-based registration methods were employed to correct misalignments, both between PET and ACCT and from PET frame to PET frame. Tumour characteristics were assessed using both parametric images and standardized uptake values (SUV). The motion correction process significantly reduced mismatches between images without significantly altering voxel intensity values, except for SUV<sub>max</sub>. Following the alignment of the attenuation correction map with the PET frame, an increase in SUV<sub>max</sub> in FDG-avid lymph nodes was observed, indicating its susceptibility to spatial misalignments. In contrast, the Patlak K<sub>i</sub> parameter was highly sensitive to misalignment across PET frames, which notably altered the Patlak slope. Upon completion of the motion correction process, the parametric representation revealed heterogeneous behaviour among lymph nodes compared to SUV images. Notably, a reduced volume of elevated metabolic rate was observed in the mediastinal lymph nodes despite an SUV of 5 g/mL, indicating potential perfusion or inflammation. Motion-resolved short-dynamic PET can enhance the utility and reliability of parametric imaging, an aspect often overlooked in commercial software.
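The Patlak K<sub>i</sub> discussed above is the slope of a linear fit on late time frames, which is why frame-to-frame misalignment distorts it so strongly. A minimal sketch of the graphical analysis (not the authors' pipeline; the time-activity curves here are synthetic):

```python
import numpy as np

def patlak_ki(t, ct, cp, t_star=10.0):
    """Estimate the Patlak influx constant Ki by linear regression.

    x(t) = integral of the plasma input Cp up to t, divided by Cp(t)
    y(t) = tissue activity Ct(t) / Cp(t)
    For t > t* (quasi-steady state), y = Ki * x + V, so Ki is the slope.
    """
    t, ct, cp = map(np.asarray, (t, ct, cp))
    # Cumulative integral of the input function (trapezoidal rule).
    int_cp = np.concatenate(([0.0], np.cumsum(np.diff(t) * (cp[1:] + cp[:-1]) / 2)))
    mask = t > t_star
    x = int_cp[mask] / cp[mask]
    y = ct[mask] / cp[mask]
    ki, v = np.polyfit(x, y, 1)
    return ki, v

# Synthetic curves with a known slope (Ki = 0.03, V = 0.5).
t = np.linspace(0, 60, 61)
cp = np.exp(-0.05 * t) + 0.2
int_cp = np.concatenate(([0.0], np.cumsum(np.diff(t) * (cp[1:] + cp[:-1]) / 2)))
ct = 0.03 * int_cp + 0.5 * cp
ki, v = patlak_ki(t, ct, cp)
print(round(ki, 4))  # 0.03
```

Because the fit is voxel-wise across frames, any spatial shift between frames mixes different tissues into one regression, changing the recovered slope — the sensitivity the study reports.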

Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs

Zheng Sun, Yi Wei, Long Yu

arxiv logopreprintMay 29 2025
Multimodal Large Language Models (MLLMs) have broad applications across many domains, such as multimodal understanding and generation. With the development of diffusion models (DM) and unified MLLMs, the performance of image generation has improved significantly; however, image screening remains understudied, and MLLM performance on it is unsatisfactory due to the lack of data and the weak image aesthetic reasoning ability of MLLMs. In this work, we propose a complete solution to address these problems in terms of data and methodology. For data, we collect a comprehensive medical image screening dataset with 1500+ samples, each consisting of a medical image, four generated images, and a multiple-choice answer. The dataset evaluates aesthetic reasoning ability under four aspects: \textit{(1) Appearance Deformation, (2) Principles of Physical Lighting and Shadow, (3) Placement Layout, (4) Extension Rationality}. For methodology, we utilize long chains of thought (CoT) and Group Relative Policy Optimization with a Dynamic Proportional Accuracy reward, called DPA-GRPO, to enhance the image aesthetic reasoning ability of MLLMs. Our experimental results reveal that even state-of-the-art closed-source MLLMs, such as GPT-4o and Qwen-VL-Max, perform akin to random guessing in image aesthetic reasoning. In contrast, by leveraging the reinforcement learning approach, we are able to surpass the scores of both large-scale models and leading closed-source models using a much smaller model. We hope our work on medical image screening will serve as a standard configuration for image aesthetic reasoning in the future.
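The GRPO part of DPA-GRPO replaces a learned value network with group-relative normalization: several responses are sampled per prompt, and each response's advantage is its reward standardized against its own group. The sketch below illustrates that mechanic with a simple proportional-accuracy reward; the paper's dynamic reweighting is not detailed in the abstract, so both function bodies are assumptions for illustration:

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each sampled response's reward
    against the mean and std of its own sample group (no critic)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

def proportional_accuracy_reward(pred_choices, true_choices):
    """Hypothetical proportional-accuracy reward: fraction of the
    multiple-choice slots answered correctly. DPA additionally
    reweights this dynamically (details not given in the abstract)."""
    correct = sum(p == t for p, t in zip(pred_choices, true_choices))
    return correct / len(true_choices)

# One group of 4 sampled answers to a 4-slot screening question.
truth = ["A", "C", "B", "D"]
samples = [["A", "C", "B", "D"], ["A", "C", "D", "B"],
           ["A", "B", "C", "D"], ["D", "C", "B", "A"]]
rewards = [proportional_accuracy_reward(s, truth) for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards)  # [1.0, 0.5, 0.5, 0.5]
```

The fully correct sample gets a positive advantage and the rest negative, so the policy gradient pushes probability mass toward it without any learned baseline.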

Menopausal hormone therapy and the female brain: Leveraging neuroimaging and prescription registry data from the UK Biobank cohort.

Barth C, Galea LAM, Jacobs EG, Lee BH, Westlye LT, de Lange AG

pubmed logopapersMay 29 2025
Menopausal hormone therapy (MHT) is generally thought to be neuroprotective, yet results have been inconsistent. Here, we present a comprehensive study of MHT use and brain characteristics in females from the UK Biobank. 19,846 females with magnetic resonance imaging data were included. Detailed MHT prescription data from primary care records was available for 538. We tested for associations between the brain measures (i.e. gray/white matter brain age, hippocampal volumes, white matter hyperintensity volumes) and MHT user status, age at first and last use, duration of use, formulation, route of administration, dosage, type, and active ingredient. We further tested for the effects of a history of hysterectomy ± bilateral oophorectomy among MHT users and examined associations by APOE ε4 status. Current MHT users, not past users, showed older gray and white matter brain age, with a difference of up to 9 mo, and smaller hippocampal volumes compared to never-users. Longer duration of use and older age at last use post-menopause was associated with older gray and white matter brain age, larger white matter hyperintensity volume, and smaller hippocampal volumes. MHT users with a history of hysterectomy ± bilateral oophorectomy showed <i>younger</i> gray matter brain age relative to MHT users without such history. We found no associations by APOE ε4 status and with other MHT variables. Our results indicate that population-level associations between MHT use and female brain health might vary depending on duration of use and past surgical history. 
The authors received funding from the Research Council of Norway (LTW: 223273, 249795, 273345, 298646, 300768), the South-Eastern Norway Regional Health Authority (CB: 2023037, 2022103; LTW: 2018076, 2019101), the European Research Council under the European Union's Horizon 2020 research and innovation program (LTW: 802998), the Swiss National Science Foundation (AMGdL: PZ00P3_193658), the Canadian Institutes for Health Research (LAMG: PJT-173554), the Treliving Family Chair in Women's Mental Health at the Centre for Addiction and Mental Health (LAMG), womenmind at the Centre for Addiction and Mental Health (LAMG, BHL), the Ann S. Bowers Women's Brain Health Initiative (EGJ), and the National Institutes of Health (EGJ: AG063843).

Can Large Language Models Challenge CNNs in Medical Image Analysis?

Shibbir Ahmed, Shahnewaz Karim Sakib, Anindya Bijoy Das

arxiv logopreprintMay 29 2025
This study presents a multimodal AI framework designed for precisely classifying medical diagnostic images. Utilizing publicly available datasets, the proposed system compares the strengths of convolutional neural networks (CNNs) and different large language models (LLMs). This in-depth comparative analysis highlights key differences in diagnostic performance, execution efficiency, and environmental impacts. Model evaluation was based on accuracy, F1-score, average execution time, average energy consumption, and estimated $CO_2$ emission. The findings indicate that although CNN-based models can outperform various multimodal techniques that incorporate both images and contextual information, applying additional filtering on top of LLMs can lead to substantial performance gains. These findings highlight the transformative potential of multimodal AI systems to enhance the reliability, efficiency, and scalability of medical diagnostics in clinical settings.
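The abstract lists estimated $CO_2$ emission among its evaluation metrics. A common way to derive such an estimate (not necessarily the authors' method, and the grid-intensity figure below is an assumed placeholder) is to multiply measured energy consumption by a grid carbon-intensity factor:

```python
def estimated_co2_g(energy_kwh, grid_intensity_g_per_kwh=475.0):
    """Estimate grams of CO2 emitted by a model run from its measured
    energy use. The default grid intensity (g CO2 per kWh) is an
    assumed rough global-average figure, not a value from the paper."""
    return energy_kwh * grid_intensity_g_per_kwh

# E.g. an inference run consuming 0.2 kWh:
print(estimated_co2_g(0.2))  # 95.0
```

The interesting comparisons then come from differences in average execution time and energy between CNN and LLM pipelines, since the conversion factor is a constant.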

Deep learning radiomics fusion model to predict visceral pleural invasion of clinical stage IA lung adenocarcinoma: a multicenter study.

Zhao J, Wang T, Wang B, Satishkumar BM, Ding L, Sun X, Chen C

pubmed logopapersMay 28 2025
To assess the predictive performance, risk stratification capabilities, and auxiliary diagnostic utility of radiomics, deep learning, and fusion models in identifying visceral pleural invasion (VPI) in lung adenocarcinoma. A total of 449 patients (female:male, 263:186; 59.8 ± 10.5 years) diagnosed with clinical IA stage lung adenocarcinoma (LAC) from two distinct hospitals were enrolled in the study and divided into a training cohort (n = 289) and an external test cohort (n = 160). The fusion models were constructed at the feature level and the decision level, respectively. A comprehensive analysis was conducted to assess the prediction ability and prognostic value of the radiomics, deep learning, and fusion models. The diagnostic performance of radiologists of varying seniority, with and without the assistance of the optimal model, was compared. The late fusion model demonstrated superior diagnostic performance (AUC = 0.812) compared to the clinical (AUC = 0.650), radiomics (AUC = 0.710), deep learning (AUC = 0.770), and early fusion models (AUC = 0.586) in the external test cohort. Multivariate Cox regression analysis showed that the VPI status predicted by the late fusion model was independently associated with patient disease-free survival (DFS) (p = 0.044). Furthermore, model assistance significantly improved radiologist performance, particularly for junior radiologists; the AUC increased by 0.133 (p < 0.001), reaching levels comparable to the senior radiologist without model assistance (AUC: 0.745 vs. 0.730, p = 0.790). The proposed decision-level (late fusion) model significantly reduces the risk of overfitting and demonstrates excellent robustness in multicenter external validation; it can predict VPI status in LAC, aid in prognostic stratification, and assist radiologists in achieving higher diagnostic performance.
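Decision-level (late) fusion, which performed best here, combines the final outputs of independently trained models rather than their features. A minimal sketch of the idea, assuming a weighted average of predicted probabilities (the paper's exact combination rule and weights are not given in the abstract):

```python
import numpy as np

def late_fusion(prob_radiomics, prob_deep, weights=(0.5, 0.5)):
    """Decision-level (late) fusion: each model outputs its own VPI
    probability, and the final score is their weighted average.
    Equal weights are an illustrative assumption."""
    w1, w2 = weights
    return w1 * np.asarray(prob_radiomics) + w2 * np.asarray(prob_deep)

# Three patients scored by two independent models.
p_rad = [0.30, 0.80, 0.55]
p_dl = [0.50, 0.90, 0.25]
fused = late_fusion(p_rad, p_dl)
print(fused)
```

Because each branch is trained separately and only the scores are merged, late fusion adds no joint parameters to fit, which is consistent with the reduced-overfitting behavior the authors report relative to early (feature-level) fusion.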

RadCLIP: Enhancing Radiologic Image Analysis Through Contrastive Language-Image Pretraining.

Lu Z, Li H, Parikh NA, Dillman JR, He L

pubmed logopapersMay 28 2025
The integration of artificial intelligence (AI) with radiology signifies a transformative era in medicine. Vision foundation models have been adopted to enhance radiologic imaging analysis. However, the inherent complexities of 2D and 3D radiologic data present unique challenges that existing models, which are typically pretrained on general nonmedical images, do not adequately address. To bridge this gap and harness the diagnostic precision required in radiologic imaging, we introduce radiologic contrastive language-image pretraining (RadCLIP): a cross-modal vision-language foundational model that utilizes a vision-language pretraining (VLP) framework to improve radiologic image analysis. Building on the contrastive language-image pretraining (CLIP) approach, RadCLIP incorporates a slice pooling mechanism designed for volumetric image analysis and is pretrained using a large, diverse dataset of radiologic image-text pairs. This pretraining effectively aligns radiologic images with their corresponding text annotations, resulting in a robust vision backbone for radiologic imaging. Extensive experiments demonstrate RadCLIP's superior performance in both unimodal radiologic image classification and cross-modal image-text matching, underscoring its significant promise for enhancing diagnostic accuracy and efficiency in clinical settings. Our key contributions include curating a large dataset featuring diverse radiologic 2D/3D image-text pairs, pretraining RadCLIP as a vision-language foundation model on this dataset, developing a slice pooling adapter with an attention mechanism for integrating 2D images, and conducting comprehensive evaluations of RadCLIP on various radiologic downstream tasks.
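The slice pooling adapter mentioned above must collapse per-slice 2D embeddings of a 3D volume into a single embedding for CLIP-style contrastive alignment. A sketch of one attention-based pooling scheme, assuming a learned scoring vector (`w_query` is hypothetical; RadCLIP's actual adapter design is not specified in the abstract):

```python
import numpy as np

def attention_slice_pooling(slice_embeddings, w_query):
    """Pool per-slice embeddings of a 3D volume into one vector using
    softmax attention: each slice gets a scalar relevance score, and
    the output is the attention-weighted sum of slice embeddings."""
    e = np.asarray(slice_embeddings)      # (num_slices, dim)
    scores = e @ w_query                  # one relevance score per slice
    scores = scores - scores.max()        # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()
    return attn @ e                       # (dim,) pooled volume embedding

rng = np.random.default_rng(0)
slices = rng.normal(size=(32, 8))         # e.g. 32 CT slices, 8-dim embeddings
w = rng.normal(size=8)
pooled = attention_slice_pooling(slices, w)
print(pooled.shape)  # (8,)
```

The pooled vector can then enter the usual CLIP objective, contrasting volume embeddings against their paired report-text embeddings in a shared space.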

Operationalizing postmortem pathology-MRI association studies in Alzheimer's disease and related disorders with MRI-guided histology sampling.

Athalye C, Bahena A, Khandelwal P, Emrani S, Trotman W, Levorse LM, Khodakarami Z, Ohm DT, Teunissen-Bermeo E, Capp N, Sadaghiani S, Arezoumandan S, Lim SA, Prabhakaran K, Ittyerah R, Robinson JL, Schuck T, Lee EB, Tisdall MD, Das SR, Wolk DA, Irwin DJ, Yushkevich PA

pubmed logopapersMay 28 2025
Postmortem neuropathological examination, while the gold standard for diagnosing neurodegenerative diseases, often relies on limited regional sampling that may miss critical areas affected by Alzheimer's disease and related disorders. Ultra-high resolution postmortem MRI can help identify regions that fall outside the diagnostic sampling criteria for additional histopathologic evaluation. However, there are no standardized guidelines for integrating histology and MRI in a traditional brain bank. We developed a comprehensive protocol for whole-hemisphere postmortem 7T MRI-guided histopathological sampling with whole-slide digital imaging and histopathological analysis, providing a reliable pipeline for high-volume brain banking in heterogeneous brain tissue. Our method uses patient-specific 3D printed molds built from postmortem MRI, allowing standardized tissue processing with a permanent spatial reference frame. To facilitate pathology-MRI association studies, we created a semi-automated MRI-to-histology registration pipeline and developed a quantitative pathology scoring system using weakly supervised deep learning. We validated this protocol on a cohort of 29 brains with diagnoses on the AD spectrum, which revealed correlations between cortical thickness and phosphorylated tau accumulation. This pipeline has broad applicability across neuropathological research and brain banking, facilitating large-scale studies that integrate histology with neuroimaging. The innovations presented here provide a scalable and reproducible approach to studying postmortem brain pathology, with implications for advancing diagnostic and therapeutic strategies for Alzheimer's disease and related disorders.

Deep Learning-Based Fully Automated Aortic Valve Leaflets and Root Measurement From Computed Tomography Images - A Feasibility Study.

Yamauchi H, Aoyama G, Tsukihara H, Ino K, Tomii N, Takagi S, Fujimoto K, Sakaguchi T, Sakuma I, Ono M

pubmed logopapersMay 28 2025
The aim of this study was to retrain our existing deep learning-based fully automated aortic valve leaflets/root measurement algorithm, using computed tomography (CT) data for root dilatation (RD), and to assess its clinical feasibility. 67 ECG-gated cardiac CT scans were retrospectively collected from 40 patients with RD to retrain the algorithm. An additional 100 patients' CT data with aortic stenosis (AS, n=50) and aortic regurgitation (AR) with/without RD (n=50) were collected to evaluate the algorithm; 45 of the AR patients had RD. The algorithm provided patient-specific 3-dimensional aortic valve/root visualization. The measurements of the 100 cases automatically obtained by the algorithm were compared with an expert's manual measurements. Overall, there was a moderate-to-high correlation, with differences of 6.1-13.4 mm<sup>2</sup> for the virtual basal ring area, 1.1-2.6 mm for sinus diameter, 0.1-0.6 mm for coronary artery height, 0.2-0.5 mm for geometric height, and 0.9 mm for effective height, except for the sinotubular junction of the AR cases (10.3 mm), where the border was indistinct over the dilated sinuses, compared with 2.1 mm in AS cases. The measurement time per case by the algorithm (122 s) was significantly shorter than those of the experts (618-1,126 s). This fully automated algorithm can assist in evaluating aortic valve/root anatomy for planning surgical and transcatheter treatments while saving time and minimizing workload.
