Latest Papers on Radiology AI

Cross-dataset Evaluation of Dementia Longitudinal Progression Prediction Models

Zhang, C., An, L., Wulan, N., Nguyen, K.-N., Orban, C., Chen, P., Chen, C., Zhou, J. H., Liu, K., Yeo, B. T. T., Alzheimer's Disease Neuroimaging Initiative, Australian Imaging Biomarkers and Lifestyle Study of Aging

•preprint•Jun 11 2025

Introduction: Accurately predicting Alzheimer's Disease (AD) progression is useful for clinical care. The 2019 TADPOLE (The Alzheimer's Disease Prediction Of Longitudinal Evolution) challenge evaluated 92 algorithms from 33 teams worldwide. Unlike typical clinical prediction studies, TADPOLE accommodates (1) a variable number of observed timepoints across patients, (2) missing data across modalities and visits, and (3) prediction over an open-ended time horizon, which better reflects real-world data. However, TADPOLE only used the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, so how well top algorithms generalize to other cohorts remains unclear. Methods: We tested five algorithms in three external datasets covering 2,312 participants and 13,200 timepoints. The algorithms included FROG, the overall TADPOLE winner, which utilized a unique Longitudinal-to-Cross-sectional (L2C) transformation to convert variable-length longitudinal histories into feature vectors of the same length across participants. We also considered two FROG variants. One variant unified all XGBoost models from the original FROG into a single feedforward neural network (FNN), which we referred to as L2C-FNN. We also included minimal recurrent neural networks (MinimalRNN), which was ranked second at publication time, as well as AD Course Map (AD-Map), which outperformed MinimalRNN at publication time. All five models (three FROG variants, MinimalRNN and AD-Map) were trained on ADNI and tested on the external datasets. Results: L2C-FNN performed the best overall. For predicting cognition and ventricle volume, L2C-FNN and AD-Map were the best. For clinical diagnosis prediction, L2C-FNN was the best, while AD-Map was the worst. L2C-FNN also maintained its edge over other models, regardless of the number of observed timepoints, and regardless of the prediction horizon from 0 to 6 years into the future.
Conclusions: L2C-FNN shows strong potential for both short-term and long-term dementia progression prediction. Pretrained ADNI models are available: https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/predict_phenotypes/Zhang2025_L2CFNN.
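
The idea behind the Longitudinal-to-Cross-sectional (L2C) transformation mentioned in the Methods, turning histories of different lengths into vectors of one fixed length, can be illustrated with a minimal sketch. The function name `l2c_features` and the particular summary statistics are assumptions chosen for illustration; the abstract does not list the features used by FROG or L2C-FNN.

```python
from statistics import mean


def l2c_features(times, values, forecast_time):
    """Collapse one participant's history into a fixed-length vector.

    Hypothetical helper: the statistics below are illustrative
    assumptions, not the feature set of FROG or L2C-FNN.
    `times` is assumed to be sorted in ascending order (years).
    """
    # Skip missing visits (None) so gaps do not break the summary.
    observed = [(t, v) for t, v in zip(times, values) if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    obs_values = [v for _, v in observed]
    last_time, last_value = observed[-1]
    return [
        last_value,                 # most recent measurement
        mean(obs_values),           # average over the observed history
        min(obs_values),
        max(obs_values),
        float(len(observed)),       # number of observed timepoints
        forecast_time - last_time,  # how far ahead the prediction is
    ]


# Two participants with different numbers of visits (made-up scores).
short = l2c_features([0.0, 0.5], [28, 27], forecast_time=2.0)
long = l2c_features([0.0, 0.5, 1.0, 2.0], [30, None, 29, 26], forecast_time=3.0)
print(len(short), len(long))  # both vectors have the same length
```

Because every participant maps to a vector of the same length, a standard cross-sectional model (for example gradient-boosted trees or a feedforward network) can be trained on the result regardless of how many visits each participant has.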

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA Open Code

Autonomous Computer Vision Development with Agentic AI

Jin Kim, Muhammad Wahi-Anwa, Sangyun Park, Shawn Shin, John M. Hoffman, Matthew S. Brown

•preprint•Jun 11 2025

Agentic Artificial Intelligence (AI) systems leveraging Large Language Models (LLMs) exhibit significant potential for complex reasoning, planning, and tool utilization. We demonstrate that a specialized computer vision system can be built autonomously from a natural language prompt using Agentic AI methods. This involved extending SimpleMind (SM), an open-source Cognitive AI environment with configurable tools for medical image analysis, with an LLM-based agent, implemented using OpenManus, to automate the planning (tool configuration) for a particular computer vision task. We provide a proof-of-concept demonstration that an agentic system can interpret a computer vision task prompt and plan a corresponding SimpleMind workflow by decomposing the task and configuring appropriate tools. From the user input prompt, "provide sm (SimpleMind) config for lungs, heart, and ribs segmentation for cxr (chest x-ray)", the agent LLM was able to generate the plan (tool configuration file in YAML format) and execute SM-Learn (training) and SM-Think (inference) scripts autonomously. The computer vision agent automatically configured, trained, and tested itself on 50 chest x-ray images, achieving mean Dice scores of 0.96, 0.82, and 0.83 for lungs, heart, and ribs, respectively. This work shows the potential for autonomous planning and tool configuration that has traditionally been performed by a data scientist in the development of computer vision applications.
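
For readers unfamiliar with the Dice score reported above, here is a minimal sketch of how it is computed for a pair of binary segmentation masks. The function name `dice_score` and the toy masks are illustrative assumptions and are not taken from SimpleMind.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values with identical shape.
    """
    flat_pred = [bool(p) for row in pred for p in row]
    flat_truth = [bool(t) for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 in common.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
score = dice_score(pred, truth)
print(round(score, 3))
```

A score of 1.0 means the predicted and reference masks overlap perfectly; 0.0 means they share no pixels.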

X-Ray Segmentation Chest Methodology In Silico Open Source GenAI Open Code

ADAgent: LLM Agent for Alzheimer's Disease Analysis with Collaborative Coordinator

Wenlong Hou, Guangqian Yang, Ye Du, Yeung Lau, Lihao Liu, Junjun He, Ling Long, Shujun Wang

•preprint•Jun 11 2025

Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disease. Early and precise diagnosis of AD is crucial for timely intervention and treatment planning to alleviate the progressive neurodegeneration. However, most existing methods rely on single-modality data, which contrasts with the multifaceted approach used by medical experts. While some deep learning approaches process multi-modal data, they are limited to specific tasks with a small set of input modalities and cannot handle arbitrary combinations. This highlights the need for a system that can address diverse AD-related tasks, process multi-modal or missing input, and integrate multiple advanced methods for improved performance. In this paper, we propose ADAgent, the first specialized AI agent for AD analysis, built on a large language model (LLM) to address user queries and support decision-making. ADAgent integrates a reasoning engine, specialized medical tools, and a collaborative outcome coordinator to facilitate multi-modal diagnosis and prognosis tasks in AD. Extensive experiments demonstrate that ADAgent outperforms SOTA methods, achieving significant improvements in accuracy, including a 2.7% increase in multi-modal diagnosis, a 0.7% improvement in multi-modal prognosis, and enhancements in MRI and PET diagnosis tasks.

Mixed Modality Classification Neurological Methodology In Silico Academic Lab GenAI

AI-based Hepatic Steatosis Detection and Integrated Hepatic Assessment from Cardiac CT Attenuation Scans Enhances All-cause Mortality Risk Stratification: A Multi-center Study

Yi, J., Patel, K., Miller, R. J., Marcinkiewicz, A. M., Shanbhag, A., Hijazi, W., Dharmavaram, N., Lemley, M., Zhou, J., Zhang, W., Liang, J. X., Ramirez, G., Builoff, V., Slipczuk, L., Travin, M., Alexanderson, E., Carvajal-Juarez, I., Packard, R. R., Al-Mallah, M., Ruddy, T. D., Einstein, A. J., Feher, A., Miller, E. J., Acampa, W., Knight, S., Le, V., Mason, S., Calsavara, V. F., Chareonthaitawee, P., Wopperer, S., Kwan, A. C., Wang, L., Berman, D. S., Dey, D., Di Carli, M. F., Slomka, P.

•preprint•Jun 11 2025

Background: Hepatic steatosis (HS) is a common cardiometabolic risk factor frequently present but under-diagnosed in patients with suspected or known coronary artery disease. We used artificial intelligence (AI) to automatically quantify hepatic tissue measures for identifying HS from CT attenuation correction (CTAC) scans during myocardial perfusion imaging (MPI) and evaluate their added prognostic value for all-cause mortality prediction. Methods: This study included 27039 consecutive patients [57% male] with MPI scans from nine sites. We used an AI model to segment liver and spleen on low-dose CTAC scans and quantify the liver measures, and the difference of liver minus spleen (LmS) measures. HS was defined as mean liver attenuation < 40 Hounsfield units (HU) or LmS attenuation < -10 HU. Additionally, we used seven sites to develop an AI liver risk index (LIRI) for comprehensive hepatic assessment by integrating the hepatic measures and two external sites to validate its improved prognostic value and generalizability for all-cause mortality prediction over HS. Findings: Median (interquartile range [IQR]) age was 67 [58, 75] years and body mass index (BMI) was 29.5 [25.5, 34.7] kg/m², with diabetes in 8950 (33%) patients. The algorithm identified HS in 6579 (24%) patients. During median [IQR] follow-up of 3.58 [1.86, 5.15] years, 4836 (18%) patients died. HS was associated with increased mortality risk overall (adjusted hazard ratio (HR): 1.14 [1.05, 1.24], p=0.0016) and in subpopulations. LIRI provided higher prognostic value than HS after adjustments overall (adjusted HR 1.5 [1.32, 1.69], p<0.0001 vs HR 1.16 [1.02, 1.31], p=0.0204) and in subpopulations. Interpretation: AI-based hepatic measures automatically identify HS from CTAC scans in patients undergoing MPI without additional radiation dose or physician interaction. Integrated liver assessment combining multiple hepatic imaging measures improved risk stratification for all-cause mortality.
Funding: National Heart, Lung, and Blood Institute/National Institutes of Health. Research in context. Evidence before this study: Existing studies show that fully automated hepatic quantification analysis from chest computed tomography (CT) scans is feasible. While hepatic measures show significant potential for improving risk stratification and patient management, CT attenuation correction (CTAC) scans from patients undergoing myocardial perfusion imaging (MPI) have rarely been utilized for concurrent and automated volumetric hepatic analysis beyond their current use for attenuation correction and coronary artery calcium burden assessment. We conducted a literature review on PubMed and Google Scholar on April 1st, 2025, using the following keywords: ("liver" OR "hepatic") AND ("quantification" OR "measure") AND ("risk stratification" OR "survival analysis" OR "prognosis" OR "prognostic prediction") AND ("CT" OR "computed tomography"). Previous studies have established approaches for the identification of hepatic steatosis (HS) and its prognostic value in various small-scale cohorts using either invasive biopsy or non-invasive imaging approaches. However, for CT-based non-invasive imaging, existing research predominantly focuses on manual region-of-interest (ROI)-based hepatic quantification from selected CT slices or on identifying hepatic steatosis without comprehensive prognostic assessment in large-scale and multi-site cohorts, which hinders the association evaluation of hepatic steatosis for risk stratification in clinical routine with less precise estimates, weak statistical reliability, and limited subgroup analysis to assess bias effects. No existing studies investigated the prognostic value of hepatic steatosis measured in consecutive patients undergoing MPI. These patients usually present with multiple cardiovascular risk factors such as hypertension, dyslipidemia, diabetes and family history of coronary disease.
Whether hepatic measures could provide added prognostic value over existing cardiometabolic factors is unknown. Furthermore, despite the diverse hepatic measures on CT in existing literature, integrated AI-based assessment has not been investigated before, though it may improve risk stratification further over HS. Lastly, previous research relied on dedicated CT scans performed for screening purposes. CTAC scans obtained routinely with MPI had never been utilized for automated HS detection and prognostic evaluation, despite being readily available at no additional cost or radiation exposure. Added value of this study: In this multi-center (nine sites) international (three countries) study of 27039 consecutive patients undergoing myocardial perfusion imaging (MPI) with PET or SPECT, we used an innovative artificial intelligence (AI)-based approach for automatically segmenting the entire liver and spleen volumes from low-dose ungated CT attenuation correction (CTAC) scans acquired during MPI, followed by the identification of hepatic steatosis. We evaluated the added prognostic value of several key hepatic metrics--liver measures (mean attenuation, coefficient of variation (CoV), entropy, and standard deviation), and similar measures for the difference of liver minus spleen (LmS)--derived from volumetric quantification of CTAC scans with adjustment for existing clinical and MPI variables. An HS imaging criterion (HSIC: a patient has moderate or severe hepatic steatosis if the mean liver attenuation is < 40 Hounsfield units (HU) or the difference of liver mean attenuation and spleen mean attenuation is < -10 HU) was used to detect HS. These hepatic metrics were assessed for their ability to predict all-cause mortality in a large-scale and multi-center patient cohort.
Additionally, we developed and validated an eXtreme Gradient Boosting decision tree model for integrated liver assessment and risk stratification by combining the hepatic metrics with the demographic variables to derive a liver risk index (LIRI). Our results demonstrated strong associations between the hepatic metrics and all-cause mortality, even after adjustment for clinical variables, myocardial perfusion, and atherosclerosis biomarkers. Our results revealed significant differences in the association of HS with mortality in different sex, age, and race subpopulations. Similar differences were also observed in various chronic disease subpopulations such as obese and diabetic subpopulations. These results highlighted the modifying effects of various patient characteristics, partially accounting for the inconsistent association observed in existing studies. Compared with individual hepatic measures, LIRI showed significant improvement over HSIC-based HS in mortality prediction in external testing. Together, these results demonstrate the feasibility of HS detection and integrated liver assessment from cardiac low-dose CT scans acquired during MPI. The approach is also expected to apply to generic chest CT scans that cover the liver and spleen, whereas prior studies used dedicated abdominal CT scans for such purposes. Implications of all the available evidence: Routine point-of-care analysis of hepatic quantification can be seamlessly integrated into all MPI using CTAC scans to noninvasively identify HS at no additional cost or radiation exposure. The automatically derived hepatic metrics enhance risk stratification by providing additional prognostic value beyond existing clinical and imaging factors, and the LIRI enables comprehensive assessment of the liver and further improves risk stratification and patient management.
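
The HS imaging criterion stated in the abstract (mean liver attenuation < 40 HU, or liver minus spleen attenuation < -10 HU) is simple enough to write down directly. The sketch below assumes the mean attenuations have already been measured from the segmented organs; the function and parameter names are hypothetical.

```python
LIVER_HU_THRESHOLD = 40.0   # mean liver attenuation cutoff from the abstract
LMS_HU_THRESHOLD = -10.0    # liver-minus-spleen (LmS) cutoff from the abstract


def meets_hs_criterion(liver_mean_hu, spleen_mean_hu):
    """Return True if either attenuation rule from the abstract is met.

    Hypothetical helper: inputs are mean attenuations in Hounsfield
    units, assumed to come from an upstream segmentation step.
    """
    lms = liver_mean_hu - spleen_mean_hu
    return liver_mean_hu < LIVER_HU_THRESHOLD or lms < LMS_HU_THRESHOLD


# Illustrative (made-up) measurements, not patient data.
print(meets_hs_criterion(35.0, 50.0))  # low liver attenuation
print(meets_hs_criterion(45.0, 60.0))  # liver 15 HU below spleen
print(meets_hs_criterion(45.0, 50.0))  # neither rule met
```

Both comparisons are strict, matching the "<" signs in the abstract, so a liver mean of exactly 40 HU does not meet the first rule.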

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Test-Time-Scaling for Zero-Shot Diagnosis with Visual-Language Reasoning

Ji Young Byun, Young-Jin Park, Navid Azizan, Rama Chellappa

•preprint•Jun 11 2025

As a cornerstone of patient care, clinical decision-making significantly influences patient outcomes and can be enhanced by large language models (LLMs). Although LLMs have demonstrated remarkable performance, their application to visual question answering in medical imaging, particularly for reasoning-based diagnosis, remains largely unexplored. Furthermore, supervised fine-tuning for reasoning tasks is largely impractical due to limited data availability and high annotation costs. In this work, we introduce a zero-shot framework for reliable medical image diagnosis that enhances the reasoning capabilities of LLMs in clinical settings through test-time scaling. Given a medical image and a textual prompt, a vision-language model generates multiple descriptions or interpretations of visual features. These interpretations are then fed to an LLM, where a test-time scaling strategy consolidates multiple candidate outputs into a reliable final diagnosis. We evaluate our approach across various medical imaging modalities -- including radiology, ophthalmology, and histopathology -- and demonstrate that the proposed test-time scaling strategy enhances diagnostic accuracy for both our and baseline methods. Additionally, we provide an empirical analysis showing that the proposed approach, which allows unbiased prompting in the first stage, improves the reliability of LLM-generated diagnoses and enhances classification accuracy.
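
The abstract does not say how the candidate outputs are consolidated. One common test-time scaling strategy is majority voting over several sampled answers, sketched below as an illustration only; it should not be read as the authors' method. The function name `consolidate` and the example labels are assumptions.

```python
from collections import Counter


def consolidate(candidates):
    """Pick the most frequent candidate diagnosis and its vote share.

    Majority voting is an assumed stand-in for the paper's
    consolidation strategy, which the abstract does not specify.
    """
    # Normalize case and whitespace so trivially different strings agree.
    normalized = [c.strip().lower() for c in candidates if c.strip()]
    if not normalized:
        raise ValueError("no non-empty candidates to consolidate")
    label, votes = Counter(normalized).most_common(1)[0]
    return label, votes / len(normalized)


# Four made-up candidate outputs for one image.
label, share = consolidate(["Pneumonia", "pneumonia ", "Normal", "pneumonia"])
print(label, share)
```

The vote share gives a rough agreement signal: a low share suggests the candidates disagree and the final answer deserves less trust.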

Mixed Modality Classification Methodology In Silico Academic Lab GenAI

Vector Representations of Vessel Trees

James Batten, Michiel Schaap, Matthew Sinclair, Ying Bai, Ben Glocker

•preprint•Jun 11 2025

We introduce a novel framework for learning vector representations of tree-structured geometric data focusing on 3D vascular networks. Our approach employs two sequentially trained Transformer-based autoencoders. In the first stage, the Vessel Autoencoder captures continuous geometric details of individual vessel segments by learning embeddings from sampled points along each curve. In the second stage, the Vessel Tree Autoencoder encodes the topology of the vascular network as a single vector representation, leveraging the segment-level embeddings from the first model. A recursive decoding process ensures that the reconstructed topology is a valid tree structure. Compared to 3D convolutional models, this proposed approach substantially lowers GPU memory requirements, facilitating large-scale training. Experimental results on a 2D synthetic tree dataset and a 3D coronary artery dataset demonstrate superior reconstruction fidelity, accurate topology preservation, and realistic interpolations in latent space. Our scalable framework, named VeTTA, offers precise, flexible, and topologically consistent modeling of anatomical tree structures in medical imaging.

CT Segmentation Vascular Methodology In Silico Academic Lab Breakthrough

Towards a general-purpose foundation model for fMRI analysis

Cheng Wang, Yu Jiang, Zhihao Peng, Chenxin Li, Changbae Bang, Lin Zhao, Jinglei Lv, Jorge Sepulcre, Carl Yang, Lifang He, Tianming Liu, Daniel Barron, Quanzheng Li, Randy Hirschtick, Byung-Hoon Kim, Xiang Li, Yixuan Yuan

•preprint•Jun 11 2025

Functional Magnetic Resonance Imaging (fMRI) is essential for studying brain function and diagnosing neurological disorders, but current analysis methods face reproducibility and transferability issues due to complex pre-processing and task-specific models. We introduce NeuroSTORM (Neuroimaging Foundation Model with Spatial-Temporal Optimized Representation Modeling), a generalizable framework that directly learns from 4D fMRI volumes and enables efficient knowledge transfer across diverse applications. NeuroSTORM is pre-trained on 28.65 million fMRI frames (>9,000 hours) from over 50,000 subjects across multiple centers and ages 5 to 100. Using a Mamba backbone and a shifted scanning strategy, it efficiently processes full 4D volumes. We also propose a spatial-temporal optimized pre-training approach and task-specific prompt tuning to improve transferability. NeuroSTORM outperforms existing methods across five tasks: age/gender prediction, phenotype prediction, disease diagnosis, fMRI-to-image retrieval, and task-based fMRI classification. It demonstrates strong clinical utility on datasets from hospitals in the U.S., South Korea, and Australia, achieving top performance in disease diagnosis and cognitive phenotype prediction. NeuroSTORM provides a standardized, open-source foundation model to improve reproducibility and transferability in fMRI-based clinical research.

MRI Classification Neurological Methodology In Silico Academic Lab Open Code Benchmark SOTA

Patient perspectives on AI in radiology: Insights from the United Arab Emirates.

El-Sayed MZ, Rawashdeh M, Moossa A, Atfah M, Prajna B, Ali MA

•papers•Jun 11 2025

Artificial intelligence (AI) enhances diagnostic accuracy, efficiency, and patient outcomes in radiology. Patient acceptance is essential for successful integration. This study examines patient perspectives on AI in radiology within the UAE, focusing on their knowledge, attitudes, and perceived barriers. Understanding these factors can address concerns, improve trust, and guide patient-centered AI implementation. The findings aim to support effective AI adoption in healthcare. A cross-sectional study was conducted involving 205 participants undergoing radiological imaging in the UAE. Data was collected through an online questionnaire, developed based on a literature review, and pre-tested for reliability and validity. Non-probability sampling methods, including convenience and snowball sampling, were employed. The questionnaire assessed participants' knowledge, attitudes, and perceived barriers regarding AI in radiology. Data was analyzed, and categorical variables were expressed as frequencies and percentages. Most participants (89.8 %) believed AI could improve diagnostic accuracy, and 87.8 % acknowledged its role in prioritizing urgent cases. However, only 22 % had direct experience with AI in radiology. While 81 % expressed comfort with AI-based technology, concerns about data security (80.5 %), lack of empathy in AI systems (82.9 %), and insufficient information about AI (85.8 %) were significant barriers. Additionally, 87.3 % of participants were concerned about the cost of AI implementation. Despite these concerns, 86.3 % believed AI could improve the quality of radiological services, and 83.9 % were satisfied with its potential applications. UAE patients generally support AI in radiology, recognizing its potential for improved diagnostic accuracy. However, concerns about data security, empathy, and understanding of AI technologies necessitate improved patient education, transparent communication, and regulatory frameworks to foster trust and acceptance.

Retrospective Clinical Post Market Academic Lab Ethics Policy

Evaluation of Semi-Automated versus Fully Automated Technologies for Computed Tomography Scalable Body Composition Analyses in Patients with Severe Acute Respiratory Syndrome Coronavirus-2.

Wozniak A, O'Connor P, Seigal J, Vasilopoulos V, Beg MF, Popuri K, Joyce C, Sheean P

•papers•Jun 11 2025

Fully automated, artificial intelligence (AI)-based software has recently become available for scalable body composition analysis. Prior to broad application in the clinical arena, validation studies are needed. Our goal was to compare the results of a fully automated, AI-based software with a semi-automated software in a sample of hospitalized patients. A diverse group of patients with Coronavirus-2 (COVID-19) and evaluable computed tomography (CT) images were included in this retrospective cohort. Multiple aspects of body composition were compared between the fully automated and semi-automated software. Bland-Altman analyses and correlation coefficients were used to calculate average bias and trend of bias for skeletal muscle (SM), visceral adipose tissue (VAT), subcutaneous adipose tissue (SAT), intermuscular adipose tissue (IMAT), and total adipose tissue (TAT, the sum of SAT, VAT, and IMAT). A total of 141 patients (average (standard deviation (SD)) age of 58.2 (18.9), 61% male, and 31% White Non-Hispanic, 31% Black Non-Hispanic, and 33% Hispanic) contributed to the analysis. Average bias (mean ± SD) was small (in comparison to the SD) and negative for SM (-3.79 cm² ± 7.56 cm²) and SAT (-7.06 cm² ± 19.77 cm²), and small and positive for VAT (2.29 cm² ± 15.54 cm²). A large negative bias was observed for IMAT (-7.77 cm² ± 5.09 cm²), where fully automated software underestimated intermuscular adipose tissue quantity relative to the semi-automated software. The discrepancy in IMAT calculation was not uniform across its range given a correlation coefficient of -0.625; as average IMAT increased, the bias (underestimation by fully automated software) was greater. When compared to a semi-automated software, a fully automated, AI-based software provides consistent findings for key CT body composition measures (SM, SAT, VAT, TAT).
While our findings support good overall agreement as evidenced by small biases and limited outliers, additional studies are needed in other clinical populations to further support validity and advanced precision, especially in the context of body composition and malnutrition assessment.
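
The Bland-Altman quantities named in this abstract (average bias, its SD, and the trend of bias across the measurement range) can be computed as in the sketch below. The function name, the 1.96 SD limits of agreement, and the input values are illustrative assumptions; the numbers are made up and are not study data.

```python
from statistics import mean, stdev


def bland_altman(auto, semi):
    """Summarize agreement between two paired measurement methods.

    Bias is the mean of (auto - semi). The trend is the Pearson
    correlation between each pair's difference and its average,
    which is one common way to check whether bias grows with size.
    Requires at least two pairs with non-constant values.
    """
    diffs = [a - s for a, s in zip(auto, semi)]
    avgs = [(a + s) / 2 for a, s in zip(auto, semi)]
    bias, sd = mean(diffs), stdev(diffs)
    avg_mean = mean(avgs)
    cov = sum((x - avg_mean) * (d - bias) for x, d in zip(avgs, diffs))
    var_avg = sum((x - avg_mean) ** 2 for x in avgs)
    var_diff = sum((d - bias) ** 2 for d in diffs)
    return {
        "bias": bias,
        "sd": sd,
        # Conventional 95% limits of agreement (an assumption here).
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
        "trend_r": cov / (var_avg * var_diff) ** 0.5,
    }


# Made-up areas in cm²; the automated method reads lower as size grows.
result = bland_altman(auto=[10, 20, 30, 40], semi=[12, 23, 34, 45])
print(result["bias"], round(result["trend_r"], 3))
```

A negative bias means the first method reads lower on average, and a negative trend means that underestimation grows as the measured quantity increases, which is the pattern the abstract describes for IMAT.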

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

RCMIX model based on pre-treatment MRI imaging predicts T-downstage in MRI-cT4 stage rectal cancer.

Bai F, Liao L, Tang Y, Wu Y, Wang Z, Zhao H, Huang J, Wang X, Ding P, Wu X, Cai Z

•papers•Jun 11 2025

Neoadjuvant therapy (NAT) is the standard treatment strategy for MRI-defined cT4 rectal cancer. Predicting tumor regression can guide the resection plane to some extent. Here, we collected pre-treatment MRI of 363 cT4 rectal cancer patients receiving NAT and radical surgery from three hospitals: Center 1 (n = 205), Center 2 (n = 109) and Center 3 (n = 52). We propose a machine learning model named RCMIX, which incorporates a multilayer perceptron algorithm based on 19 pre-treatment MRI radiomic features and 2 clinical features in cT4 rectal cancer patients receiving NAT. The model was trained on 205 cT4 rectal cancer patients, achieving an AUC of 0.903 (95% confidence interval, 0.861-0.944) in predicting T-downstage. It also achieved AUCs of 0.787 (0.699-0.874) and 0.773 (0.646-0.901) in the two independent test cohorts, respectively. cT4 rectal cancer patients who were predicted as Well T-downstage by the RCMIX model had significantly better disease-free survival than those predicted as Poor T-downstage. Our study suggests that the RCMIX model demonstrates satisfactory performance in predicting T-downstage by NAT for cT4 rectal cancer patients, which may provide critical insights to improve surgical strategies.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab
