Latest Papers on Radiology AI. Category: preprint, Sources: medrxiv, Order: Best Match, Limit: 10.

Deep Learning for Pneumonia Diagnosis: A Custom CNN Approach with Superior Performance on Chest Radiographs

Mehta, A., Vyas, M.

•preprint•May 26 2025

Pneumonia is a major global health issue that causes serious illness and death, which underlines the need to identify and treat it quickly and precisely. Although imaging technology has advanced, radiologists' manual reading of chest X-rays remains the basic method for pneumonia detection, which delays both diagnosis and treatment. This study proposes a pneumonia detection method that automates the process using deep learning techniques. The approach employs a bespoke convolutional neural network (CNN) trained on pneumonia-positive and pneumonia-negative cases from several healthcare providers. Various pre-processing steps were applied to the chest radiographs to increase integrity and efficiency before training the model. In a comparison study with VGG19, ResNet50, InceptionV3, DenseNet201, and MobileNetV3, our bespoke CNN model was found to be the most efficient in balancing accuracy, recall, and parameter complexity. It achieves 96.5% accuracy and a 96.6% F1 score. This study contributes to the development of an automated and reliable pneumonia detection system, which could improve patient outcomes and increase healthcare efficiency. The full project is available here.

X-Ray Classification Chest Methodology In Silico Academic Lab
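
The abstract above reports accuracy and F1 score. As a minimal sketch of how these metrics relate to a binary confusion matrix, the Python below computes them from raw counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the preprint.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (NOT from the study): 95 pneumonia cases, 105 normal cases.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(metrics)
```

Accuracy counts all correct predictions, while F1 is the harmonic mean of precision and recall for the positive class, so the two can diverge when classes are imbalanced.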

Novel Deep Learning Framework for Simultaneous Assessment of Left Ventricular Mass and Longitudinal Strain: Clinical Feasibility and Validation in Patients with Hypertrophic Cardiomyopathy

Park, J., Yoon, Y. E., Jang, Y., Jung, T., Jeon, J., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Chun, E. J., Cho, G.-Y., Chang, H.-J.

•preprint•May 23 2025

Background: This study aims to present the Segmentation-based Myocardial Advanced Refinement Tracking (SMART) system, a novel artificial intelligence (AI)-based framework for transthoracic echocardiography (TTE) that incorporates motion tracking and left ventricular (LV) myocardial segmentation for automated LV mass (LVM) and global longitudinal strain (LVGLS) assessment. Methods: The SMART system performs LV speckle tracking based on motion vector estimation, refined by structural information using endocardial and epicardial segmentation throughout the cardiac cycle. This approach enables automated measurement of LVM_SMART and LVGLS_SMART. The feasibility of SMART is validated in 111 hypertrophic cardiomyopathy (HCM) patients (median age: 58 years, 69% male) who underwent TTE and cardiac magnetic resonance imaging (CMR). Results: LVGLS_SMART showed a strong correlation with conventional manual LVGLS measurements (Pearson's correlation coefficient [PCC] 0.851; mean difference 0 [-2 to 0]). When compared to CMR as the reference standard for LVM, the conventional dimension-based TTE method overestimated LVM (PCC 0.652; mean difference: 106 [90 to 123]), whereas LVM_SMART demonstrated excellent agreement with CMR (PCC 0.843; mean difference: 1 [-11 to 13]). For predicting extensive myocardial fibrosis, LVGLS_SMART and LVM_SMART exhibited performance comparable to conventional LVGLS and CMR (AUC: 0.72 and 0.66, respectively). Patients identified as high-risk for extensive fibrosis by LVGLS_SMART and LVM_SMART had significantly higher rates of adverse outcomes, including heart failure hospitalization, new-onset atrial fibrillation, and defibrillator implantation. Conclusions: The SMART technique provides a comparable LVGLS evaluation and a more accurate LVM assessment than conventional TTE, with predictive value for myocardial fibrosis and adverse outcomes. These findings support its utility in HCM management.

Ultrasound Segmentation Cardiac Retrospective Clinical Clinical Pilot Startup
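
The abstract above reports both a Pearson correlation coefficient and a mean difference against the reference method. A minimal sketch of why both are needed follows: a measurement can correlate perfectly with the reference and still be systematically biased. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def mean_difference(x: list[float], y: list[float]) -> float:
    """Mean of paired differences x - y (the bias of x relative to y)."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty samples of equal length")
    return sum(a - b for a, b in zip(x, y)) / len(x)


# Hypothetical LV mass values in grams (NOT from the study).
reference = [100.0, 120.0, 140.0, 160.0, 180.0]
biased_method = [r + 50.0 for r in reference]  # tracks the reference, but overestimates

r = pearson_r(biased_method, reference)
bias = mean_difference(biased_method, reference)
print(r, bias)
```

In this toy case the correlation is perfect while every measurement is 50 g too high, which is why agreement studies report the mean difference alongside the correlation.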

Artificial Intelligence enhanced R1 maps can improve lesion detection in focal epilepsy in children

Doumou, G., D'Arco, F., Figini, M., Lin, H., Lorio, S., Piper, R., O'Muircheartaigh, J., Cross, H., Weiskopf, N., Alexander, D., Carmichael, D. W.

•preprint•May 23 2025

Background and Purpose: MRI is critical for the detection of subtle cortical pathology in epilepsy surgery assessment. This can be aided by improved MRI quality and resolution using ultra-high field (7T), but poor access and long scan durations limit widespread use, particularly in a paediatric setting. AI-based learning approaches may provide similar information by enhancing data obtained with conventional MRI (3T). We used a convolutional neural network trained on matched 3T and 7T images to enhance quantitative R1-maps (longitudinal relaxation rate) obtained at 3T in paediatric epilepsy patients and to determine their potential clinical value for lesion identification. Materials and Methods: A 3D U-Net was trained using paired patches from 3T and 7T R1-maps from n=10 healthy volunteers. The trained network was applied to enhance paediatric focal epilepsy 3T R1 images from a different scanner/site (n=17 MRI lesion positive / n=14 MR-negative). Radiological review assessed image quality, as well as lesion identification and visualization of enhanced maps in comparison to the 3T R1-maps without clinical information. Lesion appearance was then compared to 3D-FLAIR. Results: AI-enhanced R1 maps were superior in terms of image quality in comparison to the original 3T R1 maps, while preserving and enhancing the visibility of lesions. After exclusion of 5/31 patients (due to movement artefact or incomplete data), lesions were detected in AI-enhanced R1 maps for 14/15 (93%) MR-positive and 4/11 (36%) MR-negative patients. Conclusion: AI-enhanced R1 maps improved the visibility of lesions in MR-positive patients, as well as providing higher sensitivity in the MR-negative group compared to either the original 3T R1-maps or 3D-FLAIR. This provides promising initial evidence that 3T quantitative maps can outperform conventional 3T imaging via enhancement by an AI model trained on 7T MRI data, without the need for pathology-specific information.

MRI Image Synthesis Neurological Retrospective Clinical In Silico Academic Lab GenAI

FLAMeS: A Robust Deep Learning Model for Automated Multiple Sclerosis Lesion Segmentation

Dereskewicz, E., La Rosa, F., dos Santos Silva, J., Sizer, E., Kohli, A., Wynen, M., Mullins, W. A., Maggi, P., Levy, S., Onyemeh, K., Ayci, B., Solomon, A. J., Assländer, J., Al-Louzi, O., Reich, D. S., Sumowski, J. F., Beck, E. S.

•preprint•May 22 2025

Background and Purpose: Assessment of brain lesions on MRI is crucial for research in multiple sclerosis (MS). Manual segmentation is time-consuming and inconsistent. We aimed to develop an automated MS lesion segmentation algorithm for T2-weighted fluid-attenuated inversion recovery (FLAIR) MRI. Methods: We developed FLAIR Lesion Analysis in Multiple Sclerosis (FLAMeS), a deep learning-based MS lesion segmentation algorithm based on the nnU-Net 3D full-resolution U-Net and trained on 668 FLAIR 1.5 and 3 tesla scans from persons with MS. FLAMeS was evaluated on three external datasets: MSSEG-2 (n=14), MSLesSeg (n=51), and a clinical cohort (n=10), and compared to SAMSEG, LST-LPA, and LST-AI. Performance was assessed qualitatively by two blinded experts and quantitatively by comparing automated and ground truth lesion masks using standard segmentation metrics. Results: In a blinded qualitative review of 20 scans, both raters selected FLAMeS as the most accurate segmentation in 15 cases, with one rater favoring FLAMeS in two additional cases. Across all testing datasets, FLAMeS achieved a mean Dice score of 0.74, a true positive rate of 0.84, and an F1 score of 0.78, consistently outperforming the benchmark methods. For other metrics, including positive predictive value, relative volume difference, and false positive rate, FLAMeS performed similarly to or better than benchmark methods. Most lesions missed by FLAMeS were smaller than 10 mm³, whereas the benchmark methods missed larger lesions in addition to smaller ones. Conclusions: FLAMeS is an accurate, robust method for MS lesion segmentation that outperforms other publicly available methods.

MRI Segmentation Neurological Retrospective Clinical In Silico Academic Lab
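
The FLAMeS abstract above compares automated and ground truth lesion masks using the Dice score. A minimal sketch of that overlap metric on flattened binary masks is shown below. The function name and the toy masks are hypothetical, and real pipelines typically operate on 3D arrays rather than Python lists.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2*|A intersect B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2 * intersection / size_sum


# Toy masks (NOT study data): 1 marks a lesion voxel, 0 marks background.
predicted = [1, 1, 1, 0, 0, 0]
ground_truth = [0, 1, 1, 1, 0, 0]
score = dice_score(predicted, ground_truth)
print(round(score, 3))
```

Here two of three predicted voxels overlap a three-voxel lesion, giving 2*2/(3+3), roughly 0.667. Because the denominator is the lesion size, small lesions are penalized heavily by even a one-voxel miss.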

Radiomics-Based Early Triage of Prostate Cancer: A Multicenter Study from the CHAIMELEON Project

Vraka, A., Marfil-Trujillo, M., Ribas-Despuig, G., Flor-Arnal, S., Cerda-Alberich, L., Jimenez-Gomez, P., Jimenez-Pastor, A., Marti-Bonmati, L.

•preprint•May 22 2025

Prostate cancer (PCa) is the most commonly diagnosed malignancy in men worldwide. Accurate triage of patients based on tumor aggressiveness and staging is critical for selecting appropriate management pathways. While magnetic resonance imaging (MRI) has become a mainstay in PCa diagnosis, most predictive models rely on multiparametric imaging or invasive inputs, limiting generalizability in real-world clinical settings. This study aimed to develop and validate machine learning (ML) models using radiomic features extracted from T2-weighted MRI (alone and in combination with clinical variables) to predict ISUP grade (tumor aggressiveness), lymph node involvement (cN) and distant metastasis (cM). A retrospective multicenter cohort from three European sites in the CHAIMELEON project was analyzed. Radiomic features were extracted from prostate zone segmentations and lesion masks, following standardized preprocessing and ComBat harmonization. Feature selection and model optimization were performed using nested cross-validation and Bayesian tuning. Hybrid models were trained using XGBoost and interpreted with SHAP values. The ISUP model achieved an AUC of 0.66, while the cN and cM models reached AUCs of 0.77 and 0.80, respectively. The best-performing models consistently combined prostate zone radiomics with clinical features such as PSA, PIRADSv2 and ISUP grade. SHAP analysis confirmed the importance of both clinical and texture-based radiomic features, with entropy and non-uniformity measures playing central roles in all tasks. Our results demonstrate the feasibility of using T2-weighted MRI and zonal radiomics for robust prediction of aggressiveness, nodal involvement and distant metastasis in PCa. This fully automated pipeline offers an interpretable, accessible and clinically translatable tool for first-line PCa triage, with potential integration into real-world diagnostic workflows.

MRI Classification Abdominal Retrospective Clinical In Silico Consortium Benchmark SOTA

A Deep Learning Vision-Language Model for Diagnosing Pediatric Dental Diseases

Pham, T.

•preprint•May 22 2025

This study proposes a deep learning vision-language model for the automated diagnosis of pediatric dental diseases, with a focus on differentiating between caries and periapical infections. The model integrates visual features extracted from panoramic radiographs using methods of non-linear dynamics and textural encoding with textual descriptions generated by a large language model. These multimodal features are concatenated and used to train a 1D-CNN classifier. Experimental results demonstrate that the proposed model outperforms conventional convolutional neural networks and standalone language-based approaches, achieving high accuracy (90%), sensitivity (92%), precision (92%), and an AUC of 0.96. This work highlights the value of combining structured visual and textual representations in improving diagnostic accuracy and interpretability in dental radiology. The approach offers a promising direction for the development of context-aware, AI-assisted diagnostic tools in pediatric dental care.

X-Ray Classification Methodology In Silico Academic Lab GenAI

Cardiac Magnetic Resonance Imaging in the German National Cohort: Automated Segmentation of Short-Axis Cine Images and Post-Processing Quality Control

Full, P. M., Schirrmeister, R. T., Hein, M., Russe, M. F., Reisert, M., Ammann, C., Greiser, K. H., Niendorf, T., Pischon, T., Schulz-Menger, J., Maier-Hein, K. H., Bamberg, F., Rospleszcz, S., Schlett, C. L., Schuppert, C.

•preprint•May 21 2025

Purpose: To develop a segmentation and quality control pipeline for short-axis cardiac magnetic resonance (CMR) cine images from the prospective, multi-center German National Cohort (NAKO). Materials and Methods: A deep learning model for semantic segmentation, based on the nnU-Net architecture, was applied to full-cycle short-axis cine images from 29,908 baseline participants. The primary objective was to determine data on structure and function for both ventricles (LV, RV), including end diastolic volumes (EDV), end systolic volumes (ESV), and LV myocardial mass. Quality control measures included a visual assessment of outliers in morphofunctional parameters, inter- and intra-ventricular phase differences, and LV time-volume curves (TVC). These were adjudicated using a five-point rating scale, ranging from five (excellent) to one (non-diagnostic), with ratings of three or lower subject to exclusion. The predictive value of outlier criteria for inclusion and exclusion was analyzed using receiver operating characteristics. Results: The segmentation model generated complete data for 29,609 participants (incomplete in 1.0%) and 5,082 cases (17.0%) were visually assessed. Quality assurance yielded a sample of 26,899 participants with excellent or good quality (89.9%; exclusion of 1,875 participants due to image quality issues and 835 cases due to segmentation quality issues). TVC was the strongest single discriminator between included and excluded participants (AUC: 0.684). Of the two-category combinations, the pairing of TVC and phases provided the greatest improvement over TVC alone (AUC difference: 0.044; p<0.001). The best performance was observed when all three categories were combined (AUC: 0.748). Extending the quality-controlled sample to include acceptable quality ratings, a total of 28,413 (95.0%) participants were available. Conclusion: The implemented pipeline facilitated the automated segmentation of an extensive CMR dataset, integrating quality control measures. This methodology ensures that ensuing quantitative analyses are conducted with a diminished risk of bias.

MRI Segmentation Cardiac Retrospective Clinical In Silico Academic Lab
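
The abstract above uses the area under the ROC curve (AUC) to measure how well outlier criteria separate excluded from included participants. As a minimal sketch, AUC can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name, scores, and labels are hypothetical; this is not the authors' analysis code.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical outlier scores (NOT study data); label 1 = case excluded at review.
outlier_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
excluded = [1, 1, 1, 0, 0, 0]
auc = roc_auc(outlier_scores, excluded)
print(round(auc, 3))
```

In this toy example 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of cases; for a cohort of this size a rank-based implementation from a statistics library would be the practical choice.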

Longitudinal Validation of a Deep Learning Index for Aortic Stenosis Progression

Park, J., Kim, J., Yoon, Y. E., Jeon, J., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Cho, G.-Y., Chang, H.-J., Park, J.-H.

•preprint•May 19 2025

Aims: Aortic stenosis (AS) is a progressive disease requiring timely monitoring and intervention. While transthoracic echocardiography (TTE) remains the diagnostic standard, deep learning (DL)-based approaches offer potential for improved disease tracking. This study examined the longitudinal changes in a previously developed DL-derived index for AS continuum (DLi-ASc) and assessed its value in predicting progression to severe AS. Methods and Results: We retrospectively analysed 2,373 patients (7,371 TTEs) from two tertiary hospitals. DLi-ASc (scaled 0-100), derived from parasternal long- and/or short-axis views, was tracked longitudinally. DLi-ASc increased in parallel with worsening AS stages (p for trend <0.001) and showed strong correlations with AV maximal velocity (Vmax) (Pearson correlation coefficient [PCC] = 0.69, p<0.001) and mean pressure gradient (mPG) (PCC = 0.66, p<0.001). Higher baseline DLi-ASc was associated with a faster AS progression rate (p for trend <0.001). Additionally, the annualised change in DLi-ASc, estimated using linear mixed-effect models, correlated strongly with the annualised progression of AV Vmax (PCC = 0.71, p<0.001) and mPG (PCC = 0.68, p<0.001). In Fine-Gray competing risk models, baseline DLi-ASc independently predicted progression to severe AS, even after adjustment for AV Vmax or mPG (hazard ratio per 10-point increase = 2.38 and 2.80, respectively). Conclusion: DLi-ASc increased in parallel with AS progression and independently predicted severe AS progression. These findings support its role as a non-invasive imaging-based digital marker for longitudinal AS monitoring and risk stratification.

Ultrasound Classification Cardiac Retrospective Clinical In Silico Academic Lab
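
The abstract above reports hazard ratios per 10-point increase in DLi-ASc. As a minimal sketch of how such a figure can be read at other increments, the Python below rescales a hazard ratio under the assumption that the log hazard is linear in the index, as in standard regression models of this type. The function name is hypothetical, and the rescaled values are illustrative arithmetic, not results reported by the authors.

```python
import math


def rescale_hazard_ratio(hr: float, from_units: float, to_units: float) -> float:
    """Rescale a hazard ratio from one increment size to another.

    Assumes a log-linear effect: HR(delta) = exp(beta * delta).
    """
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    if from_units == 0:
        raise ValueError("from_units must be non-zero")
    beta = math.log(hr) / from_units  # log hazard ratio per single unit
    return math.exp(beta * to_units)


# 2.38 per 10 points is the value quoted in the abstract (adjusted for AV Vmax).
hr_per_10 = 2.38
hr_per_20 = rescale_hazard_ratio(hr_per_10, from_units=10, to_units=20)
hr_per_1 = rescale_hazard_ratio(hr_per_10, from_units=10, to_units=1)
print(round(hr_per_20, 2), round(hr_per_1, 2))
```

Under this assumption a 20-point difference corresponds to 2.38 squared, about 5.66, and a single point to about 1.09. The effect compounds multiplicatively, so it should not be read as an additive increase in risk per point.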

Harnessing Artificial Intelligence for Accurate Diagnosis and Radiomics Analysis of Combined Pulmonary Fibrosis and Emphysema: Insights from a Multicenter Cohort Study

Zhang, S., Wang, H., Tang, H., Li, X., Wu, N.-W., Lang, Q., Li, B., Zhu, H., Chen, X., Chen, K., Xie, B., Zhou, A., Mo, C.

•preprint•May 18 2025

Combined Pulmonary Fibrosis and Emphysema (CPFE), formally recognized as a distinct pulmonary syndrome in 2022, is characterized by unique clinical features and pathogenesis that may lead to respiratory failure and death. However, the diagnosis of CPFE presents significant challenges that hinder effective treatment. Here, we assembled three-dimensional (3D) reconstruction data of the chest High-Resolution Computed Tomography (HRCT) of patients from multiple hospitals across different provinces in China, including Xiangya Hospital, West China Hospital, and Fujian Provincial Hospital. Using this dataset, we developed CPFENet, a deep learning-based diagnostic model for CPFE. It accurately differentiates CPFE from COPD, with performance comparable to that of professional radiologists. Additionally, we developed a CPFE score based on radiomic analysis of 3D CT images to quantify disease characteristics. Notably, female patients demonstrated significantly higher CPFE scores than males, suggesting potential sex-specific differences in CPFE. Overall, our study establishes the first diagnostic framework for CPFE, providing a diagnostic model and clinical indicators that enable accurate classification and characterization of the syndrome.

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

The effect of medical explanations from large language models on diagnostic decisions in radiology

Spitzer, P., Hendriks, D., Rudolph, J., Schläger, S., Ricke, J., Kühl, N., Hoppe, B., Feuerriegel, S.

•preprint•May 18 2025

Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N = 2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P = 0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P = 0.040) and the differential diagnosis format (+9.7%; P = 0.004). We further assessed the robustness of these findings across case difficulty and different physician backgrounds such as general vs. specialized radiologists. Evidently, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of how explanations in medical LLMs are generated to maximize their utility in clinical practice. By designing explanations to support the reasoning processes of physicians, LLMs can improve diagnostic performance and, ultimately, patient outcomes.

LLM Radiology Report Prospective Clinical Pilot Academic Lab GenAI
