Latest Papers on Radiology AI. Sources: medrxiv, Order: Best Match, Limit: 10.

Multivariate whole brain neurodegenerative-cognitive-clinical severity mapping in the Alzheimer's disease continuum using explainable AI

Murad, T., Miao, H., Thakuri, D. S., Darekar, G., Chand, G.

•preprint•Jul 11 2025

Neurodegeneration and cognitive impairment are commonly reported in Alzheimer's disease (AD); however, their multivariate links are not well understood. To map the multivariate relationships between whole brain neurodegenerative (WBN) markers, global cognition, and clinical severity in the AD continuum, we developed explainable artificial intelligence (AI) methods, validated them on semi-simulated data, and applied the best-performing method systematically to large-scale experimental data (N=1,756). The best-performing explainable AI method showed robust performance in predicting cognition from regional WBN markers and identified the ground-truth simulated dominant brain regions contributing to cognition. This method also showed excellent performance on experimental data and identified several prominent WBN regions hierarchically and simultaneously associated with cognitive declines across the AD continuum. These multivariate regional features also correlated with clinical severity, suggesting their clinical relevance. Overall, this study mapped the multivariate regional WBN-cognitive-clinical severity relationships in the AD continuum, thereby advancing the understanding of AD-relevant neurobiological pathways.

MRI Classification Neurological Methodology In Silico Academic Lab GenAI

A View-Agnostic Deep Learning Framework for Comprehensive Analysis of 2D-Echocardiography

Anisuzzaman, D. M., Malins, J. G., Jackson, J. I., Lee, E., Naser, J. A., Rostami, B., Bird, J. G., Spiegelstein, D., Amar, T., Ngo, C. C., Oh, J. K., Pellikka, P. A., Thaden, J. J., Lopez-Jimenez, F., Poterucha, T. J., Friedman, P. A., Pislaru, S., Kane, G. C., Attia, Z. I.

•preprint•Jul 11 2025

Echocardiography traditionally requires experienced operators to select and interpret clips from specific viewing angles. Clinical decision-making is therefore limited for handheld cardiac ultrasound (HCU), which is often collected by novice users. In this study, we developed a view-agnostic deep learning framework to estimate left ventricular ejection fraction (LVEF), patient age, and patient sex from any of several views containing the left ventricle. Model performance was: (1) consistently strong across retrospective transthoracic echocardiography (TTE) datasets; (2) comparable between prospective HCU and TTE (625 patients; LVEF r² 0.80 vs. 0.86, LVEF (> or ≤40%) AUC 0.981 vs. 0.993, age r² 0.85 vs. 0.87, sex classification AUC 0.985 vs. 0.996); (3) comparable between prospective HCU data collected by experts and by novice users (100 patients; LVEF r² 0.78 vs. 0.66, LVEF AUC 0.982 vs. 0.966). This approach may broaden the clinical utility of echocardiography by lessening the need for user expertise in image acquisition.

Ultrasound Classification Cardiac Prospective Clinical Pilot Academic Lab Benchmark SOTA
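
The echocardiography abstract above reports AUC for separating LVEF ≤40% from LVEF >40%. As a rough illustration of how such a figure can be computed from continuous LVEF predictions, the sketch below binarizes the reference LVEF at 40% and computes AUC as the fraction of (reduced, preserved) patient pairs that the model ranks correctly. The function name and all numbers are made up for illustration; this is not the authors' evaluation code.

```python
def auc_from_scores(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical reference and predicted LVEF values (percent) for five patients.
reference_lvef = [30, 35, 55, 60, 45]
predicted_lvef = [33, 42, 50, 62, 41]

# Positive class: reduced LVEF (<= 40%) according to the reference reading.
labels = [1 if v <= 40 else 0 for v in reference_lvef]
# A lower predicted LVEF should rank as "more likely reduced", so negate it.
scores = [-v for v in predicted_lvef]

auc = auc_from_scores(scores, labels)
print(f"AUC for LVEF <= 40%: {auc:.3f}")
```

The pairwise formulation is quadratic in the number of patients, which is fine for a sketch; a rank-based implementation would be the usual choice for larger cohorts.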

Incremental diagnostic value of AI-derived coronary artery calcium in 18F-flurpiridaz PET Myocardial Perfusion Imaging

Barrett, O., Shanbhag, A., Zaid, R., Miller, R. J., Lemley, M., Builoff, V., Liang, J., Kavanagh, P., Buckley, C., Dey, D., Berman, D. S., Slomka, P.

•preprint•Jul 11 2025

Background: Positron Emission Tomography (PET) myocardial perfusion imaging (MPI) is a powerful tool for predicting coronary artery disease (CAD). Coronary artery calcium (CAC) provides incremental risk stratification to PET-MPI and enhances diagnostic accuracy. We assessed the additive value of the CAC score, derived from PET/CT attenuation maps, to stress total perfusion deficit (TPD) results using the novel 18F-flurpiridaz tracer in detecting significant CAD. Methods and Results: Patients from the 18F-flurpiridaz phase III clinical trial who underwent PET/CT MPI with the 18F-flurpiridaz tracer, had available CT attenuation correction (CTAC) scans for CAC scoring, and underwent invasive coronary angiography (ICA) within a 6-month period between 2011 and 2013 were included. TPD was quantified automatically, and CAC scores from CTAC scans were assessed using artificial intelligence (AI)-derived segmentation and manual scoring. Obstructive CAD was defined as ≥50% stenosis in the Left Main (LM) artery, or 70% or more stenosis in any of the other major epicardial vessels. Prediction performance for CAD was assessed by comparing the area under the receiver operating characteristic curve (AUC) for stress TPD alone and in combination with CAC score. Among 498 patients (72% males, median age 63 years), 30.1% had CAD. Incorporating CAC score resulted in a greater AUC: manual scoring (AUC=0.87, 95% Confidence Interval [CI] 0.34-0.90; p=0.015) and AI-based scoring (AUC=0.88, 95% CI 0.85-0.90; p=0.002) compared to stress TPD alone (AUC 0.84, 95% CI 0.80-0.92). Conclusions: Combining automatically derived TPD and CAC score enhances 18F-flurpiridaz PET MPI accuracy in detecting significant CAD, offering a method that can be routinely used with PET/CT scanners without additional scanning or technologist time.
Condensed abstract. Background: We assessed the added value of CAC score from hybrid PET/CT CTAC scans combined with stress TPD for detecting significant CAD using the novel 18F-flurpiridaz tracer. Methods and results: Patients from the 18F-flurpiridaz phase III clinical trial (n=498, 72% male, median age 63) who underwent PET/CT MPI and ICA within 6 months were included. TPD was quantified automatically, and CAC scores were assessed by AI and manual methods. Adding CAC score to TPD improved AUC for manual (0.87) and AI-based (0.88) scoring versus TPD alone (0.84). Conclusions: Combining TPD and CAC score enhances 18F-flurpiridaz PET MPI accuracy for CAD detection. Graphical Abstract: Overview of the study design.

Mixed Modality Segmentation Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA
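
The 18F-flurpiridaz abstract above defines obstructive CAD with two stenosis thresholds: at least 50% in the left main artery, or at least 70% in any other major epicardial vessel. A minimal sketch of that rule follows. The function name, the vessel keys ("LM", "LAD", "LCx", "RCA") and the example values are assumptions made for illustration, not taken from the study.

```python
LEFT_MAIN_THRESHOLD = 50  # percent stenosis, left main artery
OTHER_VESSEL_THRESHOLD = 70  # percent stenosis, any other major epicardial vessel


def is_obstructive_cad(stenosis_pct):
    """Apply the two-threshold rule to a mapping of vessel name -> % stenosis.

    The key "LM" (a hypothetical naming choice) marks the left main artery;
    every other key is treated as another major epicardial vessel.
    """
    for vessel, pct in stenosis_pct.items():
        threshold = LEFT_MAIN_THRESHOLD if vessel == "LM" else OTHER_VESSEL_THRESHOLD
        if pct >= threshold:
            return True
    return False


# Hypothetical angiography readings for three patients.
patients = {
    "A": {"LM": 50, "LAD": 20, "LCx": 0, "RCA": 10},   # left main at threshold
    "B": {"LM": 40, "LAD": 60, "LCx": 30, "RCA": 65},  # nothing reaches a threshold
    "C": {"LM": 0, "LAD": 10, "LCx": 0, "RCA": 70},    # RCA at threshold
}
results = {name: is_obstructive_cad(v) for name, v in patients.items()}
print(results)
```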

Automated Detection of Lacunes in Brain MR Images Using SAM with Robust Prompts via Self-Distillation and Anatomy-Informed Priors

Deepika, P., Shanker, G., Narayanan, R., Sundaresan, V.

•preprint•Jul 10 2025

Lacunes, which are small fluid-filled cavities in the brain, are signs of cerebral small vessel disease and have been clinically associated with various neurodegenerative and cerebrovascular diseases. Hence, accurate detection of lacunes is crucial and is one of the initial steps for the precise diagnosis of these diseases. However, developing a robust and consistently reliable method for detecting lacunes is challenging because of the heterogeneity in their appearance, contrast, shape, and size. To address these challenges, we propose a lacune detection method using the Segment Anything Model (SAM), guided by point prompts from a candidate prompt generator. The prompt generator initially detects potential lacunes with a high sensitivity using a composite loss function. The SAM model selects true lacunes by delineating their characteristics from mimics such as the sulcus and enlarged perivascular spaces, imitating the clinicians' strategy of examining the potential lacunes along all three axes. False positives were further reduced by adaptive thresholds based on the region-wise prevalence of lacunes. We evaluated our method on two diverse, multi-centric MRI datasets, VALDO and ISLES, comprising only FLAIR sequences. Despite diverse imaging conditions and significant variations in slice thickness (0.5-6 mm), our method achieved sensitivities of 84% and 92%, with average false positive rates of 0.05 and 0.06 per slice in the ISLES and VALDO datasets, respectively. The proposed method outperformed the state-of-the-art methods, demonstrating its effectiveness in lacune detection and quantification.

MRI Detection Neurological Retrospective Clinical In Silico Benchmark SOTA
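
The lacune detection abstract above reports two figures per dataset: sensitivity and average false positives per slice. The sketch below shows how these two quantities are typically derived from detection counts. The function name and the counts are invented for illustration and do not come from VALDO or ISLES.

```python
def detection_metrics(true_positives, false_negatives, false_positives, n_slices):
    """Return (sensitivity, false positives per slice) from raw detection counts."""
    n_lesions = true_positives + false_negatives
    if n_lesions == 0:
        raise ValueError("sensitivity is undefined without any true lesions")
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    sensitivity = true_positives / n_lesions
    fp_per_slice = false_positives / n_slices
    return sensitivity, fp_per_slice


# Hypothetical counts: 10 annotated lacunes, 9 found, 3 spurious detections
# across 60 slices.
sensitivity, fp_per_slice = detection_metrics(
    true_positives=9, false_negatives=1, false_positives=3, n_slices=60
)
print(f"sensitivity = {sensitivity:.2f}, false positives per slice = {fp_per_slice:.2f}")
```

Normalizing false positives by slice count rather than by scan makes results more comparable across acquisitions with very different slice thickness, which is relevant given the 0.5-6 mm range mentioned in the abstract.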

A Unified Platform for Radiology Report Generation and Clinician-Centered AI Evaluation

Ma, Z., Yang, X., Atalay, Z., Yang, A., Collins, S., Bai, H., Bernstein, M., Baird, G., Jiao, Z.

•preprint•Jul 8 2025

Generative AI models have demonstrated strong potential in radiology report generation, but their clinical adoption depends on physician trust. In this study, we conducted a radiology-focused Turing test to evaluate how well attendings and residents distinguish AI-generated reports from those written by radiologists, and how their confidence and decision time reflect trust. We developed an integrated web-based platform comprising two core modules: Report Generation and Report Evaluation. Using the web-based platform, eight participants evaluated 48 anonymized X-ray cases, each paired with two reports from three comparison groups: radiologist vs. AI model 1, radiologist vs. AI model 2, and AI model 1 vs. AI model 2. Participants selected the AI-generated report, rated their confidence, and indicated report preference. Attendings outperformed residents in identifying AI-generated reports (49.9% vs. 41.1%) and exhibited longer decision times, suggesting more deliberate judgment. Both groups took more time when both reports were AI-generated. Our findings highlight the role of clinical experience in AI acceptance and the need for design strategies that foster trust in clinical applications. The project page of the evaluation platform is available at: https://zachatalay89.github.io/Labsite.

X-Ray Report Generation Retrospective Clinical In Silico Academic Lab GenAI

Automated Deep Learning-Based 3D-to-2D Segmentation of Geographic Atrophy in Optical Coherence Tomography Data

Al-khersan, H., Oakley, J. D., Russakoff, D. B., Cao, J. A., Saju, S. M., Zhou, A., Sodhi, S. K., Pattathil, N., Choudhry, N., Boyer, D. S., Wykoff, C. C.

•preprint•Jul 7 2025

Purpose: We report on a deep learning-based approach to the segmentation of geographic atrophy (GA) in patients with advanced age-related macular degeneration (AMD). Method: Three-dimensional (3D) optical coherence tomography (OCT) data was collected from two instruments at two different retina practices. This totaled 367 and 348 volumes, respectively, of routinely collected clinical data. For all data, the accuracy of a 3D-to-2D segmentation model was assessed relative to ground-truth manual labeling. Results: Dice Similarity Scores (DSC) averaged 0.824 and 0.826 for the two data sets. Correlations (r²) between manual and automated areas were 0.883 and 0.906, respectively. The inclusion of near-infrared imagery as an additional information channel to the algorithm did not notably improve performance. Conclusion: Accurate assessment of GA in real-world clinical OCT data can be achieved using deep learning. With the advent of therapeutics to slow the rate of GA progression, reliable, automated assessment is a clinical objective and this work validates one such method.

OCT Segmentation Retrospective Clinical In Silico
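
The geographic atrophy abstract above scores segmentation with the Dice Similarity Score. For readers less familiar with it, the sketch below computes Dice as twice the overlap divided by the summed sizes of the two masks, using small hand-made 2D masks. The function name, the masks, and the convention for two empty masks are assumptions for illustration, not the authors' implementation.

```python
def dice_score(pred_mask, truth_mask):
    """Dice similarity between two binary masks given as nested lists of 0/1."""
    flat_pred = [v for row in pred_mask for v in row]
    flat_truth = [v for row in truth_mask for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")
    overlap = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical en-face (2D) masks: manual labeling vs. automated output.
manual = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
automated = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = dice_score(automated, manual)
print(f"Dice = {score:.3f}")
```

Here the manual mask has 4 foreground pixels, the automated mask has 3, and all 3 overlap, so Dice is 2·3 / (4 + 3) = 6/7.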

Development and International Validation of a Deep Learning Model for Predicting Acute Pancreatitis Severity from CT Scans

Xu, Y., Teutsch, B., Zeng, W., Hu, Y., Rastogi, S., Hu, E. Y., DeGregorio, I. M., Fung, C. W., Richter, B. I., Cummings, R., Goldberg, J. E., Mathieu, E., Appiah Asare, B., Hegedus, P., Gurza, K.-B., Szabo, I. V., Tarjan, H., Szentesi, A., Borbely, R., Molnar, D., Faluhelyi, N., Vincze, A., Marta, K., Hegyi, P., Lei, Q., Gonda, T., Huang, C., Shen, Y.

•preprint•Jul 7 2025

Background and aims: Acute pancreatitis (AP) is a common gastrointestinal disease with rising global incidence. While most cases are mild, severe AP (SAP) carries high mortality. Early and accurate severity prediction is crucial for optimal management. However, existing severity prediction models, such as BISAP and mCTSI, have modest accuracy and often rely on data unavailable at admission. This study proposes a deep learning (DL) model to predict AP severity using abdominal contrast-enhanced CT (CECT) scans acquired within 24 hours of admission. Methods: We collected 10,130 studies from 8,335 patients across a multi-site U.S. health system. The model was trained in two stages: (1) self-supervised pretraining on large-scale unlabeled CT studies and (2) fine-tuning on 550 labeled studies. Performance was evaluated against mCTSI and BISAP on a hold-out internal test set (n=100 patients) and externally validated on a Hungarian AP registry (n=518 patients). Results: On the internal test set, the model achieved AUROCs of 0.888 (95% CI: 0.800-0.960) for SAP and 0.888 (95% CI: 0.819-0.946) for mild AP (MAP), outperforming mCTSI (p = 0.002). External validation showed robust AUROCs of 0.887 (95% CI: 0.825-0.941) for SAP and 0.858 (95% CI: 0.826-0.888) for MAP, surpassing mCTSI (p = 0.024) and BISAP (p = 0.002). Retrospective simulation suggested the model's potential to support admission triage and serve as a second reader during CECT interpretation. Conclusions: The proposed DL model outperformed standard scoring systems for AP severity prediction, generalized well to external data, and shows promise for providing early clinical decision support and improving resource allocation.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Predicting Cardiopulmonary Exercise Testing Performance in Patients Undergoing Transthoracic Echocardiography - An AI Based, Multimodal Model

Alishetti, S., Pan, W., Beecy, A. N., Liu, Z., Gong, A., Huang, Z., Clerkin, K. J., Goldsmith, R. L., Majure, D. T., Kelsey, C., vanMaanan, D., Ruhl, J., Tesfuzigta, N., Lancet, E., Kumaraiah, D., Sayer, G., Estrin, D., Weinberger, K., Kuleshov, V., Wang, F., Uriel, N.

•preprint•Jul 6 2025

Background and Aims: Transthoracic echocardiography (TTE) is a widely available tool for diagnosing and managing heart failure but has limited predictive value for survival. Cardiopulmonary exercise test (CPET) performance strongly correlates with survival in heart failure patients but is less accessible. We sought to develop an artificial intelligence (AI) algorithm using TTE and electronic medical records to predict CPET peak oxygen consumption (peak VO2) ≤ 14 mL/kg/min. Methods: An AI model was trained to predict peak VO2 ≤ 14 mL/kg/min from TTE images, structured TTE reports, demographics, medications, labs, and vitals. The training set included patients with a TTE within 6 months of a CPET. Performance was retrospectively tested in a held-out group from the development cohort and an external validation cohort. Results: 1,127 CPET studies paired with concomitant TTE were identified. The best performance was achieved by using all components (TTE images, all structured clinical data). The model performed well at predicting a peak VO2 ≤ 14 mL/kg/min, with an AUROC of 0.84 (development cohort) and 0.80 (external validation cohort). It performed consistently well using higher (≤ 18 mL/kg/min) and lower (≤ 12 mL/kg/min) cut-offs. Conclusions: This multimodal AI model effectively categorized patients into low- and high-risk groups by predicted peak VO2, demonstrating the potential to identify previously unrecognized patients in need of advanced heart failure therapies where CPET is not available.

Ultrasound Classification Cardiac Retrospective Clinical In Silico Academic Lab

Artificial Intelligence in Prenatal Ultrasound: A Systematic Review of Diagnostic Tools for Detecting Congenital Anomalies

Dunne, J., Kumarasamy, C., Belay, D. G., Betran, A. P., Gebremedhin, A. T., Mengistu, S., Nyadanu, S. D., Roy, A., Tessema, G., Tigest, T., Pereira, G.

•preprint•Jul 5 2025

Background: Artificial intelligence (AI) has shown promise in interpreting ultrasound imaging through flexible pattern recognition and algorithmic learning, but implementation in clinical practice remains limited. This study aimed to investigate the current application of AI in prenatal ultrasounds to identify congenital anomalies, and to synthesise challenges and opportunities for the advancement of AI-assisted ultrasound diagnosis. This comprehensive analysis addresses the clinical translation gap between AI performance metrics and practical implementation in prenatal care. Methods: Systematic searches were conducted in eight electronic databases (CINAHL Plus, Ovid/EMBASE, Ovid/MEDLINE, ProQuest, PubMed, Scopus, Web of Science and Cochrane Library) and Google Scholar from inception to May 2025. Studies were included if they applied an AI-assisted ultrasound diagnostic tool to identify a congenital anomaly during pregnancy. This review adhered to PRISMA guidelines for systematic reviews. We evaluated study quality using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) guidelines. Findings: Of 9,918 records, 224 were identified for full-text review and 20 met the inclusion criteria. The majority of studies (11/20, 55%) were conducted in China, with most published after 2020 (16/20, 80%). All AI models were developed as an assistive tool for anomaly detection or classification. Most models (85%) focused on single-organ systems: heart (35%), brain/cranial (30%), or facial features (20%), while three studies (15%) attempted multi-organ anomaly detection. Fifty percent of the included studies reported exceptionally high model performance, with both sensitivity and specificity exceeding 0.95 and AUC-ROC values ranging from 0.91 to 0.97. Most studies (75%) lacked external validation, with internal validation often limited to small training and testing datasets.
Interpretation: While AI applications in prenatal ultrasound showed potential, current evidence indicates significant limitations in their practical implementation. Much work is required to optimise their application, including the external validation of diagnostic models with clinical utility to have real-world implications. Future research should prioritise larger-scale multi-centre studies, developing multi-organ anomaly detection capabilities rather than the current single-organ focus, and robust evaluation of AI tools in real-world clinical settings.

Ultrasound Detection Review In Silico Academic Lab Benchmark SOTA

Explainable machine learning for post PKR surgery follow-up

Soubeiran, C., Vilbert, M., Memmi, B., Georgeon, C., Borderie, V., Chessel, A., Plamann, K.

•preprint•Jul 5 2025

Photorefractive Keratectomy (PRK) is a widely used laser-assisted refractive surgical technique. In some cases, it leads to temporary subepithelial inflammation or fibrosis linked to visual haze. To our knowledge, there are no physics-based, quantitative tools to monitor these symptoms. Here we present a comprehensive machine learning-based algorithm for the detection of fibrosis based on spectral domain optical coherence tomography images recorded in vivo on standard clinical devices. Because of the rarity of these phenomena, we trained the model on corneas presenting Fuchs dystrophy, which causes similar but permanent fibrosis symptoms, and applied it to images from patients who have undergone PRK surgery. Our study shows that the model output (probability of Fuchs dystrophy classification) provides a quantified and explainable indicator of corneal healing for post-operative follow-up.

OCT Classification Methodology In Silico
