
Augmenting Continual Learning of Diseases with LLM-Generated Visual Concepts

Jiantao Tan, Peixian Ma, Kanghao Chen, Zhiming Dai, Ruixuan Wang

arXiv preprint · Aug 5, 2025
Continual learning is essential for medical image classification systems to adapt to dynamically evolving clinical environments. The integration of multimodal information can significantly enhance continual learning of image classes. However, while existing approaches do utilize textual modality information, they rely solely on simplistic templates with a class name, thereby neglecting richer semantic information. To address these limitations, we propose a novel framework that harnesses visual concepts generated by large language models (LLMs) as discriminative semantic guidance. Our method dynamically constructs a visual concept pool with a similarity-based filtering mechanism to prevent redundancy. Then, to integrate the concepts into the continual learning process, we employ a cross-modal image-concept attention module, coupled with an attention loss. Through attention, the module can leverage the semantic knowledge from relevant visual concepts and produce class-representative fused features for classification. Experiments on medical and natural image datasets show that our method achieves state-of-the-art performance, demonstrating its effectiveness and superiority. We will release the code publicly.
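
The cross-modal image-concept attention described here can be pictured as standard multi-head attention in which pooled image features act as queries over a bank of LLM-generated concept embeddings. The sketch below is a minimal, hypothetical PyTorch version; the class name, dimensions, and residual fusion are assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class ImageConceptAttention(nn.Module):
    """Minimal sketch: image features attend over a pool of concept embeddings."""
    def __init__(self, dim=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, image_feats, concept_embeds):
        # image_feats:    (B, 1, D) pooled CNN/ViT feature per image
        # concept_embeds: (B, K, D) text embeddings of K retrieved visual concepts
        fused, attn_weights = self.attn(query=image_feats,
                                        key=concept_embeds,
                                        value=concept_embeds)
        fused = self.norm(fused + image_feats)   # residual fusion
        return fused.squeeze(1), attn_weights    # class-representative feature

# toy usage
module = ImageConceptAttention(dim=512)
img = torch.randn(4, 1, 512)
concepts = torch.randn(4, 10, 512)
feats, weights = module(img, concepts)
print(feats.shape, weights.shape)  # torch.Size([4, 512]) torch.Size([4, 1, 10])
```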

Beyond unimodal analysis: Multimodal ensemble learning for enhanced assessment of atherosclerotic disease progression.

Guarrasi V, Bertgren A, Näslund U, Wennberg P, Soda P, Grönlund C

PubMed · Aug 5, 2025
Atherosclerosis is a leading cardiovascular disease typified by fatty streaks accumulating within arterial walls, culminating in potential plaque ruptures and subsequent strokes. Existing clinical risk scores, such as the systematic coronary risk estimation and the Framingham risk score, profile cardiovascular risks based on factors like age, cholesterol, and smoking, among others. However, these scores display limited sensitivity in early disease detection. In parallel, ultrasound-based risk markers, such as the carotid intima media thickness, while informative, offer only limited predictive power. Notably, current models largely focus on either ultrasound image-derived risk markers or clinical risk factor data without combining both for a comprehensive, multimodal assessment. This study introduces a multimodal ensemble learning framework to assess atherosclerosis severity, especially in its early sub-clinical stage. We utilize a multi-objective optimization targeting both performance and diversity, aiming to integrate features from each modality effectively. Our objective is to measure the efficacy of models using multimodal data in assessing vascular aging, i.e., plaque presence and vascular age, over a six-year period. We also delineate a procedure for optimal model selection from a vast pool, focusing on best-suited models for classification tasks. Additionally, through eXplainable Artificial Intelligence techniques, this work delves into understanding key model contributors and discerning unique subject subgroups.
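
A simplified way to picture the performance-plus-diversity selection the abstract describes is to score candidate ensembles by mean accuracy plus mean pairwise disagreement and pick the best subset. The sketch below uses synthetic data and a weighted single objective as a stand-in for the paper's multi-objective optimization; all names, weights, and models are illustrative.

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in for fused clinical + ultrasound features.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# A small pool of candidate models (the study uses a much larger, optimized pool).
pool = [RandomForestClassifier(random_state=i).fit(X_tr, y_tr) for i in range(3)]
pool += [LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
         SVC(random_state=0).fit(X_tr, y_tr)]

preds = [m.predict(X_val) for m in pool]
acc = [accuracy_score(y_val, p) for p in preds]

def diversity(idx):
    """Mean pairwise disagreement among the selected ensemble members."""
    pairs = list(combinations(idx, 2))
    return np.mean([(preds[i] != preds[j]).mean() for i, j in pairs]) if pairs else 0.0

# Score every subset of size 3 on a weighted performance/diversity objective.
best = max(combinations(range(len(pool)), 3),
           key=lambda c: np.mean([acc[i] for i in c]) + 0.5 * diversity(c))
print("selected members:", best)
```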

Automated vertebral bone quality score measurement on lumbar MRI using deep learning: Development and validation of an AI algorithm.

Jayasuriya NM, Feng E, Nathani KR, Delawan M, Katsos K, Bhagra O, Freedman BA, Bydon M

PubMed · Aug 5, 2025
Bone health is a critical determinant of spine surgery outcomes, yet many patients undergo procedures without adequate preoperative assessment due to limitations in current bone quality assessment methods. This study aimed to develop and validate an artificial intelligence-based algorithm that predicts Vertebral Bone Quality (VBQ) scores from routine MRI scans, enabling improved preoperative identification of patients at risk for poor surgical outcomes. This study utilized 257 lumbar spine T1-weighted MRI scans from the SPIDER challenge dataset. VBQ scores were calculated through a three-step process: selecting the mid-sagittal slice, measuring vertebral body signal intensity from L1-L4, and normalizing by cerebrospinal fluid signal intensity. A YOLOv8 model was developed to automate region of interest placement and VBQ score calculation. The system was validated against manual annotations from 47 lumbar spine surgery patients, with performance evaluated using precision, recall, mean average precision, intraclass correlation coefficient, Pearson correlation, RMSE, and mean error. The YOLOv8 model demonstrated high accuracy in vertebral body detection (precision: 0.9429, recall: 0.9076, mAP@0.5: 0.9403, mAP@[0.5:0.95]: 0.8288). Strong interrater reliability was observed with ICC values of 0.95 (human-human), 0.88 and 0.93 (human-AI). Pearson correlations for VBQ scores between human and AI measurements were 0.86 and 0.9, with RMSE values of 0.58 and 0.42 respectively. The AI-based algorithm accurately predicts VBQ scores from routine lumbar MRIs. This approach has potential to enhance early identification and intervention for patients with poor bone health, leading to improved surgical outcomes. Further external validation is recommended to ensure generalizability and clinical applicability.
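
The three-step VBQ computation reduces to a single ratio once the ROI intensities are available. The sketch below assumes the vertebral-body values are aggregated with a median, as in the common VBQ definition, since the abstract does not state the exact aggregation; all numbers are hypothetical.

```python
import numpy as np

def vbq_score(vertebral_roi_means, csf_mean):
    """VBQ sketch: median of the L1-L4 vertebral-body mean signal intensities,
    normalized by the CSF signal intensity (the median aggregation is an
    assumption; the abstract only states 'normalizing by CSF signal intensity')."""
    return float(np.median(vertebral_roi_means) / csf_mean)

# hypothetical ROI mean intensities from the mid-sagittal T1 slice
l1_to_l4 = [412.0, 398.5, 405.2, 420.1]    # L1, L2, L3, L4 vertebral bodies
csf = 165.3                                # CSF reference ROI
print(round(vbq_score(l1_to_l4, csf), 2))  # ~2.47
```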

Altered effective connectivity in patients with drug-naïve first-episode, recurrent, and medicated major depressive disorder: a multi-site fMRI study.

Dai P, Huang K, Hu T, Chen Q, Liao S, Grecucci A, Yi X, Chen BT

PubMed · Aug 5, 2025
Major depressive disorder (MDD) has been diagnosed through subjective and inconsistent clinical assessments. Resting-state functional magnetic resonance imaging (rs-fMRI) with connectivity analysis has been valuable for identifying neural correlates of patients with MDD, yet most studies rely on single-site and small sample sizes. This study utilized large-scale, multi-site rs-fMRI data from the Rest-meta-MDD consortium to assess effective connectivity in patients with MDD and its subtypes, i.e., drug-naïve first-episode (FEDN), recurrent (RMDD), and medicated MDD (MMDD) subtypes. To mitigate site-related variability, the ComBat algorithm was applied, and multivariate linear regression was used to control for age and gender effects. A random forest classification model was developed to identify the most predictive features. Nested five-fold cross-validation was used to assess model performance. The model effectively distinguished FEDN subtype from healthy controls (HC) group, achieving 90.13% accuracy and 96.41% AUC. However, classification performance for RMDD vs. FEDN and MMDD vs. FEDN was lower, suggesting that differences between the subtypes were less pronounced than differences between the patients with MDD and the HC group. Patients with RMDD exhibited more extensive connectivity abnormalities in the frontal-limbic system and default mode network than the patients with FEDN, implying heightened rumination. Additionally, treatment with medication appeared to partially modulate the aberrant connectivity, steering it toward normalization. This study showed altered brain connectivity in patients with MDD and its subtypes, which could be classified with machine learning models with robust performance. Abnormal connectivity could be the potential neural correlates for the presenting symptoms of patients with MDD. These findings provide novel insights into the neural pathogenesis of patients with MDD.
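
The classification setup (a random forest evaluated with nested five-fold cross-validation) can be sketched with scikit-learn as below; the ComBat harmonization and age/gender regression steps are only indicated by a comment, and the features and hyperparameter grid are illustrative rather than those of the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Illustrative stand-in for effective-connectivity features (e.g., FEDN vs. HC).
X, y = make_classification(n_samples=400, n_features=100, n_informative=20,
                           random_state=42)
# NOTE: in the study, ComBat site harmonization and regression of age/gender
# effects would be applied to the feature matrix before this step.

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [200, 500], "max_depth": [None, 10]},
    scoring="roc_auc", cv=inner)

# The outer loop gives an unbiased estimate of the tuned model's AUC.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer)
print(f"nested CV AUC: {outer_auc.mean():.3f} +/- {outer_auc.std():.3f}")
```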

A novel lung cancer diagnosis model using hybrid convolution (2D/3D)-based adaptive DenseUnet with attention mechanism.

Deepa J, Badhu Sasikala L, Indumathy P, Jerrin Simla A

PubMed · Aug 5, 2025
Existing Lung Cancer Diagnosis (LCD) models have difficulty in detecting early-stage lung cancer due to the asymptomatic nature of the disease, which leads to an increased death rate among patients. It is therefore important to diagnose lung disease at an early stage to save the lives of affected persons. Hence, this work aims to develop an efficient lung disease diagnosis approach using deep learning techniques for the early and accurate detection of lung cancer. This is achieved as follows. Initially, the proposed model collects the required CT images from standard benchmark datasets. Then, lung cancer segmentation is performed with the developed Hybrid Convolution (2D/3D)-based Adaptive DenseUnet with Attention mechanism (HC-ADAM). Hybrid Sewing Training with Spider Monkey Optimization (HSTSMO) is introduced to optimize the parameters of the HC-ADAM segmentation approach. Finally, the segmented lung nodule images are passed to the lung cancer classification stage, where a Hybrid Adaptive Dilated Network with Attention mechanism (HADN-AM) is implemented as a serial cascade of ResNet and Long Short-Term Memory (LSTM) networks for better classification performance. The accuracy, precision, and F1-score of the developed model on the LIDC-IDRI dataset are 96.3%, 96.38%, and 96.36%, respectively.
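
The serial cascade of ResNet and LSTM in the classification stage can be pictured as a per-slice CNN encoder followed by a recurrent aggregator. The following is a generic PyTorch sketch of that pattern, not the authors' HADN-AM: the dilated convolutions and attention mechanism are omitted, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ResNetLSTMClassifier(nn.Module):
    """Generic slice-sequence classifier: ResNet per slice, LSTM across slices."""
    def __init__(self, n_classes=2, hidden=256):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()            # keep 512-d slice embeddings
        self.encoder = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, volume):                 # volume: (B, S, 1, H, W) CT slices
        b, s = volume.shape[:2]
        x = volume.reshape(b * s, 1, *volume.shape[3:]).repeat(1, 3, 1, 1)
        feats = self.encoder(x).reshape(b, s, 512)
        _, (h, _) = self.lstm(feats)           # last hidden state summarizes the stack
        return self.head(h[-1])

model = ResNetLSTMClassifier()
logits = model(torch.randn(2, 16, 1, 224, 224))
print(logits.shape)  # torch.Size([2, 2])
```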

Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Namdar K, Wagner MW, Ertl-Wagner BB, Khalvati F

PubMed · Aug 4, 2025
As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol, to investigate the effects of radiomics feature extraction settings on the reproducibility of the results. We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and a non-small cell lung cancer (NSCLC) survival analysis dataset from the Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol can be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets: the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. We used Random Forest classifiers, and for each radiomics dataset, we repeated the training-validation-test (60%/20%/20%) experiment with different data splits and model random states 100 times (28,800 test results in total) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC). Unlike binWidth and image normalization, the tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, they were irreproducible. Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performances that are irreproducible.
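
One of the 288 configurations could be reproduced along the following lines: extract PyRadiomics features with a fixed binWidth and normalization, then repeat random-forest experiments over many data splits and seeds and average the AUROC. The sketch simplifies the protocol's 60/20/20 split to a single train/test split per repetition; the image/mask paths and feature matrix are placeholders.

```python
import numpy as np
from radiomics import featureextractor                      # PyRadiomics
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# One configuration out of the 288: fixed binWidth, normalization, sequence, subregion.
extractor = featureextractor.RadiomicsFeatureExtractor(binWidth=25, normalize=True)

def extract_case(image_path, mask_path):
    """Return a numeric feature vector for one case (diagnostic keys dropped)."""
    result = extractor.execute(image_path, mask_path)
    return np.array([v for k, v in result.items()
                     if not k.startswith("diagnostics")], dtype=float)

def repeated_auroc(X, y, n_repeats=100):
    """X: (n_cases, n_features) built with extract_case(); y: 0 = LGG, 1 = HGG."""
    aurocs = []
    for seed in range(n_repeats):                            # vary split and model seed
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=seed)
        clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
        aurocs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    return np.mean(aurocs), np.std(aurocs)
```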

CT-Based 3D Super-Resolution Radiomics for the Differential Diagnosis of Brucella vs. Tuberculous Spondylitis using Deep Learning.

Wang K, Qi L, Li J, Zhang M, Du H

PubMed · Aug 4, 2025
This study aims to improve the accuracy of distinguishing Tuberculous Spondylitis (TBS) from Brucella Spondylitis (BS) by developing radiomics models using Deep Learning and CT images enhanced with Super-Resolution (SR). A total of 94 patients diagnosed with BS or TBS were randomly divided into training (n=65) and validation (n=29) groups in a 7:3 ratio. In the training set, there were 40 BS and 25 TBS patients, with a mean age of 58.34 ± 12.53 years. In the validation set, there were 17 BS and 12 TBS patients, with a mean age of 58.48 ± 12.29 years. Standard CT images were enhanced using SR, improving spatial resolution and image quality. The lesion regions of interest (ROIs) were manually segmented, and radiomics features were extracted. ResNet18 and ResNet34 were used for deep learning feature extraction and model training. Four multi-layer perceptron (MLP) models were developed: clinical, radiomics (Rad), deep learning (DL), and a combined model. Model performance was assessed using five-fold cross-validation, ROC, and decision curve analysis (DCA). Statistical significance was assessed, with key clinical and imaging features showing significant differences between TBS and BS (e.g., gender, p=0.0038; parrot beak appearance, p<0.001; dead bone, p<0.001; deformities of the spinal posterior process, p=0.0044; psoas abscess, p<0.001). The combined model outperformed the others, achieving the highest AUC (0.952), with ResNet34 and SR-enhanced images further boosting performance. Sensitivity reached 0.909 and specificity 0.941. DCA confirmed clinical applicability. The integration of SR-enhanced CT imaging and deep learning radiomics appears to improve diagnostic differentiation between BS and TBS. The combined model, especially when using ResNet34 and GAN-based super-resolution, demonstrated better predictive performance. High-resolution imaging may facilitate better lesion delineation and more robust feature extraction. Nevertheless, further validation with larger, multicenter cohorts is needed to confirm generalizability and reduce potential bias from the retrospective design and imaging heterogeneity. This study suggests that integrating Deep Learning Radiomics with Super-Resolution may improve the differentiation between TBS and BS compared to standard CT imaging. However, prospective multicenter studies are necessary to validate its clinical applicability.
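
The "combined" model in this design amounts to concatenating the clinical, radiomics, and deep-learning feature blocks before a small MLP. The sketch below illustrates that fusion with scikit-learn on random placeholder arrays; the block sizes, hidden-layer width, and cross-validation setup are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score, StratifiedKFold

# Hypothetical per-patient feature blocks (shapes illustrative only).
clinical = np.random.rand(94, 5)      # e.g. gender, dead bone, psoas abscess, ...
rad      = np.random.rand(94, 30)     # handcrafted radiomics from SR-enhanced ROIs
dl       = np.random.rand(94, 128)    # ResNet34 embeddings of the same ROIs
y        = np.random.randint(0, 2, 94)   # 0 = BS, 1 = TBS (placeholder labels)

X_combined = np.hstack([clinical, rad, dl])   # input to the "combined" model

model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(64,),
                                    max_iter=2000, random_state=0))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(model, X_combined, y, scoring="roc_auc", cv=cv).mean())
```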

A Novel Dual-Output Deep Learning Model Based on InceptionV3 for Radiographic Bone Age and Gender Assessment.

Rayed B, Amasya H, Sezdi M

PubMed · Aug 4, 2025
Hand-wrist radiographs are used in bone age prediction. Computer-assisted clinical decision support systems offer solutions to the limitations of radiographic bone age assessment methods. In this study, a multi-output prediction model was designed to predict bone age and gender from digital hand-wrist radiographs. The InceptionV3 architecture was used as the backbone, and the model was trained and tested on the open-access dataset of the 2017 RSNA Pediatric Bone Age Challenge. A total of 14,048 samples were divided into training, validation, and testing subsets with a ratio of 7:2:1, and additional specialized convolutional neural network layers, such as a Squeeze-and-Excitation block, were implemented for robust feature management. The proposed model achieved a mean squared error of approximately 25 and a mean absolute error of 3.1 for predicting bone age. In gender classification, an accuracy of 95% and an area under the curve of 97% were achieved. The intra-class correlation coefficient for the continuous bone age predictions was 0.997, while Cohen's κ coefficient for the gender predictions was 0.898 (p < 0.001). The proposed model aims to increase model efficiency by identifying common and discrete features. Based on the results, the proposed algorithm is promising; however, the mid-to-high-end hardware requirement may be a limitation for its use on local machines in the clinic. Future studies may consider increasing the dataset size and simplifying the algorithms.
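
A dual-output network of this kind pairs a shared trunk with a regression head for bone age and a classification head for gender, trained with a weighted joint loss. Below is a minimal PyTorch sketch using torchvision's InceptionV3 as the trunk; the Squeeze-and-Excitation block is omitted and the loss weighting is an assumption, not taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import inception_v3

class DualOutputNet(nn.Module):
    """Shared InceptionV3 trunk with a regression head (bone age, months)
    and a classification head (gender). SE blocks omitted for brevity."""
    def __init__(self):
        super().__init__()
        trunk = inception_v3(weights=None, aux_logits=False)
        trunk.fc = nn.Identity()                 # expose 2048-d features
        self.trunk = trunk
        self.age_head = nn.Linear(2048, 1)       # bone age regression
        self.sex_head = nn.Linear(2048, 2)       # gender classification

    def forward(self, x):                        # x: (B, 3, 299, 299)
        feats = self.trunk(x)
        return self.age_head(feats).squeeze(-1), self.sex_head(feats)

model = DualOutputNet()
age_pred, sex_logits = model(torch.randn(2, 3, 299, 299))

# Joint objective: weighted sum of MSE (age) and cross-entropy (gender);
# the 10.0 weighting is an illustrative assumption.
age_target = torch.tensor([120.0, 96.0])
sex_target = torch.tensor([0, 1])
loss = nn.functional.mse_loss(age_pred, age_target) \
       + 10.0 * nn.functional.cross_entropy(sex_logits, sex_target)
```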

Development and Validation of an Explainable MRI-Based Habitat Radiomics Model for Predicting p53-Abnormal Endometrial Cancer: A Multicentre Feasibility Study.

Jin W, Zhang H, Ning Y, Chen X, Zhang G, Li H, Zhang H

PubMed · Aug 4, 2025
We developed an MRI-based habitat radiomics model (HRM) to predict the p53-abnormal (p53abn) molecular subtype of endometrial cancer (EC). Patients with pathologically confirmed EC were retrospectively enrolled from three hospitals and categorized into a training cohort (n = 270), test cohort 1 (n = 70), and test cohort 2 (n = 154). The tumour was divided into habitat sub-regions using diffusion-weighted imaging (DWI) and contrast-enhanced (CE) images with the K-means algorithm. Radiomics features were extracted from T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), DWI, and CE images. Three machine learning classifiers (logistic regression, support vector machines, and random forests) were applied to develop predictive models for p53abn EC. Model performance was validated using receiver operating characteristic (ROC) curves, and the model with the best predictive performance was selected as the HRM. A whole-region radiomics model (WRM) was also constructed, and a clinical model (CM) with five clinical features was developed. The SHapley Additive exPlanations (SHAP) method was used to explain the outputs of the models. DeLong's test was used to evaluate and compare performance across the cohorts. A total of 1920 habitat radiomics features were considered. Eight features were selected for the HRM, ten for the WRM, and three clinical features for the CM. The HRM achieved the highest AUCs: 0.855 (training), 0.769 (test 1), and 0.766 (test 2). The AUCs of the WRM were 0.707 (training), 0.703 (test 1), and 0.738 (test 2). The AUCs of the CM were 0.709 (training), 0.641 (test 1), and 0.665 (test 2). The MRI-based HRM successfully predicted p53abn EC. The results indicate that habitat analysis combined with machine learning, radiomics, and SHAP can effectively predict p53abn EC, providing clinicians with intuitive insights and interpretability regarding the impact of risk factors in the model.
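
Habitat sub-regions of this kind are typically obtained by clustering per-voxel multiparametric intensities inside the tumour mask. The sketch below applies K-means to paired DWI and CE intensities on toy volumes; the number of habitats, the z-scoring, and all names are assumptions rather than the study's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def habitat_labels(dwi, ce, tumour_mask, n_habitats=3, seed=0):
    """Cluster tumour voxels into habitats using per-voxel (DWI, CE) intensities.
    Returns a label volume: 0 = background, 1..n_habitats = habitat sub-regions."""
    idx = np.argwhere(tumour_mask)
    feats = np.stack([dwi[tumour_mask], ce[tumour_mask]], axis=1).astype(float)
    feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-8)    # z-score per channel
    clusters = KMeans(n_clusters=n_habitats, n_init=10,
                      random_state=seed).fit_predict(feats)
    labels = np.zeros(tumour_mask.shape, dtype=np.int16)
    labels[tuple(idx.T)] = clusters + 1
    return labels

# toy volumes
mask = np.zeros((16, 64, 64), dtype=bool)
mask[6:10, 20:40, 20:40] = True
habitats = habitat_labels(np.random.rand(16, 64, 64), np.random.rand(16, 64, 64), mask)
print(np.unique(habitats))   # [0 1 2 3]
```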

Scaling Artificial Intelligence for Prostate Cancer Detection on MRI towards Population-Based Screening and Primary Diagnosis in a Global, Multiethnic Population (Study Protocol)

Anindo Saha, Joeran S. Bosma, Jasper J. Twilt, Alexander B. C. D. Ng, Aqua Asif, Kirti Magudia, Peder Larson, Qinglin Xie, Xiaodong Zhang, Chi Pham Minh, Samuel N. Gitau, Ivo G. Schoots, Martijn F. Boomsma, Renato Cuocolo, Nikolaos Papanikolaou, Daniele Regge, Derya Yakar, Mattijs Elschot, Jeroen Veltman, Baris Turkbey, Nancy A. Obuchowski, Jurgen J. Fütterer, Anwar R. Padhani, Hashim U. Ahmed, Tobias Nordström, Martin Eklund, Veeru Kasivisvanathan, Maarten de Rooij, Henkjan Huisman

arXiv preprint · Aug 4, 2025
In this intercontinental, confirmatory study, we include a retrospective cohort of 22,481 MRI examinations (21,288 patients; 46 cities in 22 countries) to train and externally validate the PI-CAI-2B model, i.e., an efficient, next-generation iteration of the state-of-the-art AI system that was developed for detecting Gleason grade group ≥2 prostate cancer on MRI during the PI-CAI study. Of these examinations, 20,471 cases (19,278 patients; 26 cities in 14 countries) from two EU Horizon projects (ProCAncer-I, COMFORT) and 12 independent centers based in Europe, North America, Asia and Africa, are used for training and internal testing. Additionally, 2010 cases (2010 patients; 20 external cities in 12 countries) from population-based screening (STHLM3-MRI, IP1-PROSTAGRAM trials) and primary diagnostic settings (PRIME trial) based in Europe, North and South America, Asia and Australia, are used for external testing. The primary endpoint is the proportion of AI-based assessments in agreement with the standard of care diagnoses (i.e., clinical assessments made by expert uropathologists on histopathology, if available, or at least two expert urogenital radiologists in consensus; with access to patient history and peer consultation) in the detection of Gleason grade group ≥2 prostate cancer within the external testing cohorts. Our statistical analysis plan is prespecified with a hypothesis of diagnostic interchangeability to the standard of care at the PI-RADS ≥3 (primary diagnosis) or ≥4 (screening) cut-off, considering an absolute margin of 0.05 and reader estimates derived from the PI-CAI observer study (62 radiologists reading 400 cases). Secondary measures comprise the area under the receiver operating characteristic curve (AUROC) of the AI system stratified by imaging quality, patient age and patient ethnicity to identify underlying biases (if any).
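
The primary endpoint reduces to an agreement proportion compared against a reader-derived benchmark minus the prespecified 0.05 margin. The snippet below is only a rough illustration of that comparison, using a bootstrap confidence interval and fabricated binary calls; it is not the prespecified statistical analysis plan, and the benchmark value is invented for the example.

```python
import numpy as np

def agreement_with_ci(ai_positive, ref_positive, n_boot=2000, seed=0):
    """Proportion of cases where the AI call matches the reference standard,
    with a bootstrap 95% CI (a simple stand-in for the prespecified analysis)."""
    ai_positive = np.asarray(ai_positive)
    ref_positive = np.asarray(ref_positive)
    agree = (ai_positive == ref_positive).astype(float)
    rng = np.random.default_rng(seed)
    boots = [agree[rng.integers(0, len(agree), len(agree))].mean()
             for _ in range(n_boot)]
    return agree.mean(), np.percentile(boots, [2.5, 97.5])

# fabricated external-testing calls at the screening cut-off (PI-RADS >= 4)
ai = np.random.default_rng(1).integers(0, 2, 2010)
ref = ai.copy()
flip = np.random.default_rng(2).random(2010) < 0.1   # 10% disagreement for illustration
ref[flip] ^= 1

prop, ci = agreement_with_ci(ai, ref)
reader_benchmark, margin = 0.85, 0.05   # illustrative reader estimate and margin
print(prop, ci, prop >= reader_benchmark - margin)
```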