
SMURF: Scalable method for unsupervised reconstruction of flow in 4D flow MRI

Atharva Hans, Abhishek Singh, Pavlos Vlachos, Ilias Bilionis

arXiv preprint · May 18, 2025
We introduce SMURF, a scalable and unsupervised machine learning method for simultaneously segmenting vascular geometries and reconstructing velocity fields from 4D flow MRI data. SMURF models geometry and velocity fields using multilayer perceptron-based functions incorporating Fourier feature embeddings and random weight factorization to accelerate convergence. A measurement model connects these fields to the observed image magnitude and phase data. Maximum likelihood estimation and subsampling enable SMURF to process high-dimensional datasets efficiently. Evaluations on synthetic, in vitro, and in vivo datasets demonstrate SMURF's performance. On synthetic internal carotid artery aneurysm data derived from CFD, SMURF achieves quarter-voxel segmentation accuracy across noise levels of up to 50%, reaching up to twice the accuracy of the state-of-the-art segmentation method. In an in vitro experiment on Poiseuille flow, SMURF reduces velocity reconstruction RMSE by approximately 34% compared to raw measurements. In in vivo internal carotid artery aneurysm data, SMURF attains nearly half-voxel segmentation accuracy relative to expert annotations and decreases median velocity divergence residuals by about 31%, with a 27% reduction in the interquartile range. These results indicate that SMURF is robust to noise, preserves flow structure, and identifies patient-specific morphological features. SMURF advances 4D flow MRI accuracy, potentially enhancing the diagnostic utility of 4D flow MRI in clinical applications.
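The abstract gives no implementation details; as a rough illustration of the representation it describes, the sketch below builds a coordinate-based MLP with random Fourier feature embeddings in PyTorch. It is not the authors' code: random weight factorization is omitted, and the layer sizes, embedding scale, and input dimensionality are assumptions.

```python
# Minimal sketch, not the authors' code: an MLP over Fourier-feature-embedded
# space-time coordinates, the kind of field representation SMURF reportedly uses.
import torch
import torch.nn as nn


class FourierFeatures(nn.Module):
    def __init__(self, in_dim=4, n_features=128, sigma=10.0):
        super().__init__()
        # Fixed random projection B ~ N(0, sigma^2); not trained.
        self.register_buffer("B", torch.randn(in_dim, n_features) * sigma)

    def forward(self, x):                          # x: (batch, in_dim), e.g. (t, x, y, z)
        proj = 2 * torch.pi * x @ self.B
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)


class FieldMLP(nn.Module):
    """Maps space-time coordinates to a field value, e.g. three velocity components."""
    def __init__(self, in_dim=4, out_dim=3, width=256, n_features=128):
        super().__init__()
        self.embed = FourierFeatures(in_dim, n_features)
        self.net = nn.Sequential(
            nn.Linear(2 * n_features, width), nn.GELU(),
            nn.Linear(width, width), nn.GELU(),
            nn.Linear(width, out_dim),
        )

    def forward(self, coords):
        return self.net(self.embed(coords))


# Usage: evaluate the velocity field at 1024 random space-time points.
velocity = FieldMLP()(torch.rand(1024, 4))         # shape (1024, 3)
```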

Attention-Enhanced U-Net for Accurate Segmentation of COVID-19 Infected Lung Regions in CT Scans

Amal Lahchim, Lazar Davic

arXiv preprint · May 18, 2025
In this study, we propose a robust methodology for automatic segmentation of infected lung regions in COVID-19 CT scans using convolutional neural networks. The approach is based on a modified U-Net architecture enhanced with attention mechanisms, data augmentation, and postprocessing techniques. It achieved a Dice coefficient of 0.8658 and mean IoU of 0.8316, outperforming other methods. The dataset was sourced from public repositories and augmented for diversity. Results demonstrate superior segmentation performance. Future work includes expanding the dataset, exploring 3D segmentation, and preparing the model for clinical deployment.
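For readers unfamiliar with attention-enhanced U-Nets, the sketch below shows a generic attention gate on a skip connection in PyTorch. It is illustrative only: channel sizes are assumptions and this is not the authors' exact architecture.

```python
# A generic attention gate for U-Net skip connections (illustrative only).
import torch
import torch.nn as nn


class AttentionGate(nn.Module):
    """Weights an encoder skip feature map x by a gating signal g from the decoder."""
    def __init__(self, g_ch, x_ch, inter_ch):
        super().__init__()
        self.W_g = nn.Conv2d(g_ch, inter_ch, kernel_size=1)
        self.W_x = nn.Conv2d(x_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, g, x):
        # Assumes g and x share spatial size; att is a per-pixel weight in [0, 1].
        att = self.psi(torch.relu(self.W_g(g) + self.W_x(x)))
        return x * att                              # suppress irrelevant regions


# Usage: gate a 64-channel skip connection with a 64-channel decoder signal.
gated = AttentionGate(64, 64, 32)(torch.randn(1, 64, 64, 64), torch.randn(1, 64, 64, 64))
```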

ChatGPT-4-Driven Liver Ultrasound Radiomics Analysis: Advantages and Drawbacks Compared to Traditional Techniques.

Sultan L, Venkatakrishna SSB, Anupindi S, Andronikou S, Acord M, Otero H, Darge K, Sehgal C, Holmes J

PubMed · May 18, 2025
Artificial intelligence (AI) is transforming medical imaging, with large language models such as ChatGPT-4 emerging as potential tools for automated image interpretation. While AI-driven radiomics has shown promise in diagnostic imaging, the efficacy of ChatGPT-4 in liver ultrasound analysis remains largely unexamined. This study evaluates the capability of ChatGPT-4 in liver ultrasound radiomics, specifically its ability to differentiate fibrosis, steatosis, and normal liver tissue, compared to conventional image analysis software. Seventy grayscale ultrasound images from a preclinical liver disease model, including fibrosis (n=31), fatty liver (n=18), and normal liver (n=21), were analyzed. ChatGPT-4 extracted texture features, which were compared to those obtained using Interactive Data Language (IDL), a traditional image analysis software. One-way ANOVA was used to identify statistically significant features differentiating liver conditions, and logistic regression models were employed to assess diagnostic performance. ChatGPT-4 extracted nine key textural features (echo intensity, heterogeneity, skewness, kurtosis, contrast, homogeneity, dissimilarity, angular second moment, and entropy), all of which significantly differed across liver conditions (p < 0.05). Among individual features, echo intensity achieved the highest F1-score (0.85). When combined, ChatGPT-4 attained 76% accuracy and 83% sensitivity in classifying liver disease. ROC analysis demonstrated strong discriminatory performance, with AUC values of 0.75 for fibrosis, 0.87 for normal liver, and 0.97 for steatosis. Compared to IDL, ChatGPT-4 exhibited slightly lower sensitivity (0.83 vs. 0.89) but showed moderate correlation (R = 0.68, p < 0.0001) with IDL-derived features. However, it significantly outperformed IDL in processing efficiency, reducing analysis time by 40%, highlighting its potential for high-throughput radiomic analysis. Despite slightly lower sensitivity than IDL, ChatGPT-4 demonstrated high feasibility for ultrasound radiomics, offering faster processing, high-throughput analysis, and automated multi-image evaluation. These findings support its potential integration into AI-driven imaging workflows, with further refinements needed to enhance feature reproducibility and diagnostic accuracy.
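As a point of reference, the nine features listed above can also be computed with standard open-source tools; the sketch below (first-order statistics plus a scikit-image GLCM) is an illustrative equivalent, not the ChatGPT-4 or IDL pipeline used in the study.

```python
# Hedged sketch: compute the nine texture features with scipy/scikit-image.
import numpy as np
from scipy import stats
from skimage.feature import graycomatrix, graycoprops


def liver_texture_features(roi_uint8):
    """roi_uint8: 2-D grayscale ROI (uint8) cropped from an ultrasound image."""
    px = roi_uint8.astype(float).ravel()
    feats = {
        "echo_intensity": px.mean(),               # first-order statistics
        "heterogeneity": px.std(),
        "skewness": stats.skew(px),
        "kurtosis": stats.kurtosis(px),
    }
    glcm = graycomatrix(roi_uint8, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    for name in ("contrast", "homogeneity", "dissimilarity", "ASM"):
        feats[name] = graycoprops(glcm, name)[0, 0]
    p = glcm[:, :, 0, 0]
    feats["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return feats


# Usage on a synthetic ROI.
roi = (np.random.rand(64, 64) * 255).astype(np.uint8)
print(liver_texture_features(roi))
```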

Deep learning feature-based model for predicting lymphovascular invasion in urothelial carcinoma of bladder using CT images.

Xiao B, Lv Y, Peng C, Wei Z, Xv Q, Lv F, Jiang Q, Liu H, Li F, Xv Y, He Q, Xiao M

PubMed · May 18, 2025
Lymphovascular invasion significantly impacts the prognosis of urothelial carcinoma of the bladder. Traditional lymphovascular invasion detection methods are time-consuming and costly. This study aims to develop a deep learning-based model to preoperatively predict lymphovascular invasion status in urothelial carcinoma of the bladder using CT images. Data and CT images of 577 patients across four medical centers were retrospectively collected. The largest tumor slices from the transverse, coronal, and sagittal planes were selected and used to train CNN models (InceptionV3, DenseNet121, ResNet18, ResNet34, ResNet50, and VGG11). Deep learning features were extracted and visualized using Grad-CAM. Principal Component Analysis reduced the features to 64 dimensions. Using the extracted features, Decision Tree, XGBoost, and LightGBM models were trained with 5-fold cross-validation and ensembled in a stacking model. Clinical risk factors were identified through logistic regression analyses and combined with DL scores to enhance lymphovascular invasion prediction accuracy. The ResNet50-based model achieved an AUC of 0.818 in the validation set and 0.708 in the testing set. The combined model showed an AUC of 0.794 in the validation set and 0.767 in the testing set, demonstrating robust performance across diverse data. We developed a robust radiomics model based on deep learning features from CT images to preoperatively predict lymphovascular invasion status in urothelial carcinoma of the bladder. This model offers a non-invasive, cost-effective tool to assist clinicians in personalized treatment planning. The deep learning feature-based stacking model predicts lymphovascular invasion in patients with urothelial carcinoma of the bladder from CT; the maximum cross-sections from the three CT planes are used to train the CNN models, and six CNN architectures, including ResNet50, were compared.
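The stacking step described above maps naturally onto scikit-learn; the sketch below is an assumed reconstruction (feature dimensionality, hyperparameters, and the logistic-regression meta-learner are guesses), not the authors' code.

```python
# Illustrative stacking sketch: PCA-reduced deep features -> DT/XGBoost/LightGBM
# base learners -> meta-learner, with 5-fold CV for the base-learner predictions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

X = np.random.rand(200, 512)                        # placeholder CNN features
y = np.random.randint(0, 2, 200)                    # placeholder LVI labels

model = make_pipeline(
    PCA(n_components=64),                           # reduce deep features to 64 dims
    StackingClassifier(
        estimators=[
            ("dt", DecisionTreeClassifier(max_depth=4)),
            ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
            ("lgbm", LGBMClassifier(n_estimators=200)),
        ],
        final_estimator=LogisticRegression(),
        cv=5,
    ),
)
model.fit(X, y)
dl_score = model.predict_proba(X[:5])[:, 1]         # per-patient deep-learning score
```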

MedSG-Bench: A Benchmark for Medical Image Sequences Grounding

Jingkun Yue, Siqi Zhang, Zinan Jia, Huihuan Xu, Zongbo Han, Xiaohong Liu, Guangyu Wang

arXiv preprint · May 17, 2025
Visual grounding is essential for precise perception and reasoning in multimodal large language models (MLLMs), especially in medical imaging domains. While existing medical visual grounding benchmarks primarily focus on single-image scenarios, real-world clinical applications often involve sequential images, where accurate lesion localization across different modalities and temporal tracking of disease progression (e.g., pre- vs. post-treatment comparison) require fine-grained cross-image semantic alignment and context-aware reasoning. To remedy the underrepresentation of image sequences in existing medical visual grounding benchmarks, we propose MedSG-Bench, the first benchmark tailored for Medical Image Sequences Grounding. It comprises eight VQA-style tasks, formulated into two paradigms of the grounding tasks, including 1) Image Difference Grounding, which focuses on detecting change regions across images, and 2) Image Consistency Grounding, which emphasizes detection of consistent or shared semantics across sequential images. MedSG-Bench covers 76 public datasets, 10 medical imaging modalities, and a wide spectrum of anatomical structures and diseases, totaling 9,630 question-answer pairs. We benchmark both general-purpose MLLMs (e.g., Qwen2.5-VL) and medical-domain specialized MLLMs (e.g., HuatuoGPT-vision), observing that even the advanced models exhibit substantial limitations in medical sequential grounding tasks. To advance this field, we construct MedSG-188K, a large-scale instruction-tuning dataset tailored for sequential visual grounding, and further develop MedSeq-Grounder, an MLLM designed to facilitate future research on fine-grained understanding across medical sequential images. The benchmark, dataset, and model are available at https://huggingface.co/MedSG-Bench
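The abstract does not state the benchmark's scoring rule; purely as a generic illustration, grounding answers are often judged by bounding-box IoU against the reference region, for example counting a prediction correct when IoU is at least 0.5.

```python
# Generic box-IoU check (an assumed example, not MedSG-Bench's official metric).
def box_iou(a, b):
    """a, b: (x1, y1, x2, y2) boxes; returns intersection-over-union."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


pred, ref = (30, 40, 90, 110), (35, 38, 95, 105)
print(box_iou(pred, ref) >= 0.5)                    # True -> counted as a correct grounding
```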

Feasibility of improving vocal fold pathology image classification with synthetic images generated by DDPM-based GenAI: a pilot study.

Khazrak I, Zainaee S, M Rezaee M, Ghasemi M, C Green R

PubMed · May 17, 2025
Voice disorders (VD) are often linked to vocal fold structural pathologies (VFSP). Laryngeal imaging plays a vital role in assessing VFSPs and VD in clinical and research settings, but challenges like scarce and imbalanced datasets can limit the generalizability of findings. Denoising Diffusion Probabilistic Models (DDPMs), a subtype of generative AI, have gained attention for their ability to generate high-quality and realistic synthetic images to address these challenges. This study explores the feasibility of improving VFSP image classification by generating synthetic images using DDPMs. A total of 404 laryngoscopic images depicting vocal folds with and without VFSPs were included. DDPMs were used to generate synthetic images to augment the original dataset. Two convolutional neural network architectures, VGG16 and ResNet50, were applied for model training. The models were first trained only on the original dataset and then on the augmented datasets. Evaluation metrics were analyzed to assess the performance of the models for both binary classification (with/without VFSPs) and multi-class classification (seven specific VFSPs). Realistic and high-quality synthetic images were generated for dataset augmentation. The models initially failed to converge when trained only on the original dataset, but they converged successfully and achieved low loss and high accuracy when trained on the augmented datasets. The best performance for both binary and multi-class classification was achieved when the models were trained on an augmented dataset. Generating realistic images of VFSPs using DDPMs is feasible, can enhance AI-based classification of VFSPs, and may support VD screening and diagnosis.
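As one way to realize the augmentation step, the sketch below samples synthetic images from a trained DDPM with the Hugging Face diffusers library; the paper does not name its implementation, and the checkpoint path is hypothetical.

```python
# Hedged sketch: generate synthetic laryngoscopic images from a per-class DDPM
# and save them for mixing with the original training set (diffusers-based).
import os
import torch
from diffusers import DDPMPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = DDPMPipeline.from_pretrained("path/to/ddpm-vfsp-class").to(device)  # hypothetical checkpoint

os.makedirs("augmented", exist_ok=True)
synthetic = pipe(batch_size=16, num_inference_steps=1000).images           # list of PIL images
for i, img in enumerate(synthetic):
    img.save(f"augmented/vfsp_synthetic_{i}.png")   # added alongside real images for VGG16/ResNet50 training
```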

ML-Driven Alzheimer's disease prediction: A deep ensemble modeling approach.

Jumaili MLF, Sonuç E

PubMed · May 17, 2025
Alzheimer's disease (AD) is a progressive neurological disorder characterized by cognitive decline due to brain cell death, typically manifesting later in life. Early and accurate detection is critical for effective disease management and treatment. This study proposes an ensemble learning framework that combines five deep learning architectures (VGG16, VGG19, ResNet50, InceptionV3, and EfficientNetB7) to improve the accuracy of AD diagnosis. We use a comprehensive dataset of 3,714 MRI brain scans collected from specialized clinics in Iraq, categorized into three classes: NonDemented (834 images), MildDemented (1,824 images), and VeryDemented (1,056 images). The proposed voting ensemble model achieves a diagnostic accuracy of 99.32% on our dataset. The effectiveness of the model is further validated on two external datasets: OASIS (achieving 86.6% accuracy) and ADNI (achieving 99.5% accuracy), demonstrating competitive performance compared to existing approaches. Moreover, the proposed model exhibits high precision and recall across all stages of dementia, providing a reliable and robust tool for early AD detection. This study highlights the effectiveness of ensemble learning in AD diagnosis and shows promise for clinical applications.
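The abstract does not detail the voting scheme; the sketch below shows plain soft voting (averaging class probabilities from the five fine-tuned CNNs), which is one common way such an ensemble is combined.

```python
# Assumed soft-voting sketch over the three dementia classes.
import numpy as np

CLASSES = ["NonDemented", "MildDemented", "VeryDemented"]


def ensemble_predict(models, batch):
    """models: fitted classifiers exposing .predict() -> class probabilities; batch: (N, H, W, 3)."""
    probs = np.mean([m.predict(batch) for m in models], axis=0)   # average softmax outputs
    return probs.argmax(axis=1)


# Usage with stand-in models (the real ones would be VGG16, VGG19, ResNet50, InceptionV3, EfficientNetB7).
class Stub:
    def predict(self, x):
        p = np.random.rand(len(x), len(CLASSES))
        return p / p.sum(axis=1, keepdims=True)


preds = ensemble_predict([Stub() for _ in range(5)], np.zeros((4, 224, 224, 3)))
print([CLASSES[i] for i in preds])
```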

Fully Automated Evaluation of Condylar Remodeling after Orthognathic Surgery in Skeletal Class II Patients Using Deep Learning and Landmarks.

Jia W, Wu H, Mei L, Wu J, Wang M, Cui Z

PubMed · May 17, 2025
Condylar remodeling is a key prognostic indicator in maxillofacial surgery for skeletal class II patients. This study aimed to develop and validate a fully automated method leveraging landmark-guided segmentation and registration for efficient assessment of condylar remodeling. A V-Net-based deep learning workflow was developed to automatically segment the mandible and localize anatomical landmarks from CT images. Cutting planes were computed based on the landmarks to segment the condylar and ramus volumes from the mandible mask. The stable ramus served as a reference for registering pre- and post-operative condyles using the Iterative Closest Point (ICP) algorithm. Condylar remodeling was subsequently assessed through mesh registration, heatmap visualization, and quantitative metrics of surface distance and volumetric change. Experts also rated the concordance between automated assessments and clinical diagnoses. In the test set, condylar segmentation achieved a Dice coefficient of 0.98, and landmark prediction yielded a mean absolute error of 0.26 mm. The automated evaluation process was completed in 5.22 seconds, approximately 150 times faster than manual assessments. The method accurately quantified condylar volume changes, ranging from 2.74% to 50.67% across patients. Expert ratings for all test cases averaged 9.62. This study introduced a consistent, accurate, and fully automated approach for condylar remodeling evaluation. The well-defined anatomical landmarks guided precise segmentation and registration, while deep learning supported an end-to-end automated workflow. The test results demonstrated its broad clinical applicability across various degrees of condylar remodeling and high concordance with expert assessments. By integrating anatomical landmarks and deep learning, the proposed method improves efficiency by 150 times without compromising accuracy, thereby facilitating an efficient and accurate assessment of orthognathic prognosis. The personalized 3D condylar remodeling models aid in visualizing sequelae, such as joint pain or skeletal relapse, and guide individualized management of TMJ disorders.
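The registration step can be illustrated with Open3D's ICP, as in the hedged sketch below; the paper does not specify its ICP implementation, and the file names and correspondence threshold here are assumptions.

```python
# Hedged sketch: register pre- to post-operative rami with ICP, apply the transform
# to the pre-operative condyle, and measure point-to-surface distances.
import numpy as np
import open3d as o3d

ramus_pre = o3d.io.read_point_cloud("ramus_pre.ply")        # stable reference region
ramus_post = o3d.io.read_point_cloud("ramus_post.ply")
condyle_pre = o3d.io.read_point_cloud("condyle_pre.ply")
condyle_post = o3d.io.read_point_cloud("condyle_post.ply")

icp = o3d.pipelines.registration.registration_icp(
    ramus_pre, ramus_post, max_correspondence_distance=2.0, init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)
condyle_pre.transform(icp.transformation)                    # bring both condyles into one frame

# Input for a surface-distance heatmap: distance of each post-op point to the pre-op condyle.
dists = np.asarray(condyle_post.compute_point_cloud_distance(condyle_pre))
print(f"mean surface change: {dists.mean():.2f} mm")
```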

A Robust Automated Segmentation Method for White Matter Hyperintensity of Vascular-origin.

He H, Jiang J, Peng S, He C, Sun T, Fan F, Song H, Sun D, Xu Z, Wu S, Lu D, Zhang J

PubMed · May 17, 2025
White matter hyperintensity (WMH) is a primary manifestation of small vessel disease (SVD), leading to vascular cognitive impairment and other disorders. Accurate WMH quantification is vital for diagnosis and prognosis, but current automatic segmentation methods often fall short, especially across different datasets. The aims of this study are to develop and validate a robust deep learning segmentation method for WMH of vascular-origin. In this study, we developed a transformer-based method for the automatic segmentation of vascular-origin WMH using both 3D T1 and 3D T2-FLAIR images. Our initial dataset comprised 126 participants with varying WMH burdens due to SVD, each with manually segmented WMH masks used for training and testing. External validation was performed on two independent datasets: the WMH Segmentation Challenge 2017 dataset (170 subjects) and an in-house vascular risk factor dataset (70 subjects), which included scans acquired on eight different MRI systems at field strengths of 1.5T, 3T, and 5T. This approach enabled a comprehensive assessment of the method's generalizability across diverse imaging conditions. We further compared our method against LGA, LPA, BIANCA, UBO-detector and TrUE-Net in optimized settings. Our method consistently outperformed others, achieving a median Dice coefficient of 0.78±0.09 in our primary dataset, 0.72±0.15 in the external dataset 1, and 0.72±0.14 in the external dataset 2. The relative volume errors were 0.15±0.14, 0.50±0.86, and 0.47±1.02, respectively. The true positive rates were 0.81±0.13, 0.92±0.09, and 0.92±0.12, while the false positive rates were 0.20±0.09, 0.40±0.18, and 0.40±0.19. None of the external validation datasets were used for model training; instead, they comprise previously unseen MRI scans acquired from different scanners and protocols. This setup closely reflects real-world clinical scenarios and further demonstrates the robustness and generalizability of our model across diverse MRI systems and acquisition settings. As such, the proposed method provides a reliable solution for WMH segmentation in large-scale cohort studies.
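For reference, the Dice coefficient, relative volume error, and true positive rate reported above have standard voxel-wise definitions; the sketch below computes them from binary masks (the abstract does not specify how its false positive rate is defined, so that metric is left out).

```python
# Standard voxel-wise metrics for WMH segmentation (assumed to match the paper's usage).
import numpy as np


def wmh_metrics(pred, ref):
    """pred, ref: boolean 3-D arrays (predicted and manually segmented WMH masks)."""
    tp = np.logical_and(pred, ref).sum()
    dice = 2 * tp / (pred.sum() + ref.sum())
    rel_vol_err = abs(int(pred.sum()) - int(ref.sum())) / ref.sum()
    tpr = tp / ref.sum()
    return dice, rel_vol_err, tpr


pred = np.random.rand(64, 64, 32) > 0.7
ref = np.random.rand(64, 64, 32) > 0.7
print(wmh_metrics(pred, ref))
```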

Foundation versus Domain-Specific Models for Left Ventricular Segmentation on Cardiac Ultrasound

Chao, C.-J., Gu, Y., Kumar, W., Xiang, T., Appari, L., Wu, J., Farina, J. M., Wraith, R., Jeong, J., Arsanjani, R., Garvan, K. C., Oh, J. K., Langlotz, C. P., Banerjee, I., Li, F.-F., Adeli, E.

medRxiv preprint · May 17, 2025
The Segment Anything Model (SAM) was fine-tuned on the EchoNet-Dynamic dataset and evaluated on external transthoracic echocardiography (TTE) and Point-of-Care Ultrasound (POCUS) datasets from CAMUS (University Hospital of St Etienne) and Mayo Clinic (99 patients: 58 TTE, 41 POCUS). Fine-tuned SAM was superior or comparable to MedSAM. The fine-tuned SAM also outperformed EchoNet and U-Net models, demonstrating strong generalization, especially on apical 2-chamber (A2C) images (fine-tuned SAM vs. EchoNet: CAMUS-A2C: DSC 0.891 ± 0.040 vs. 0.752 ± 0.196, p<0.0001) and POCUS (DSC 0.857 ± 0.047 vs. 0.667 ± 0.279, p<0.0001). Additionally, the SAM-enhanced workflow reduced annotation time by 50% (11.6 ± 4.5 sec vs. 5.7 ± 1.7 sec, p<0.0001) while maintaining segmentation quality. We demonstrated an effective strategy for fine-tuning a vision foundation model for enhancing clinical workflow efficiency and supporting human-AI collaboration.
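The fine-tuning setup can be sketched with Meta's segment_anything package as below; the frozen/trainable split, loss, and checkpoint choice are assumptions rather than the authors' exact recipe.

```python
# Hedged sketch: freeze SAM's image and prompt encoders and fine-tune only the
# mask decoder on echocardiography frames (segment_anything package).
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # official ViT-B weights
for p in sam.image_encoder.parameters():
    p.requires_grad_(False)
for p in sam.prompt_encoder.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(sam.mask_decoder.parameters(), lr=1e-4)
loss_fn = torch.nn.BCEWithLogitsLoss()

# Training loop outline (EchoNet-Dynamic frames with LV masks and box prompts):
#   for images, boxes, gt_masks in echo_loader:
#       pred_logits = ...  # run SAM's encoders + mask decoder on the prompted images
#       loss = loss_fn(pred_logits, gt_masks)
#       loss.backward(); optimizer.step(); optimizer.zero_grad()
```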