Latest Papers on Radiology AI. Tags: In Silico

Enhancing Spinal Cord and Canal Segmentation in Degenerative Cervical Myelopathy : The Role of Interactive Learning Models with manual Click.

Han S, Oh JK, Cho W, Kim TJ, Hong N, Park SB

•papers•Sep 29 2025

We aim to develop an interactive segmentation model that offers accurate and reliable segmentation of the irregularly shaped spinal cord and canal in degenerative cervical myelopathy (DCM) through manual clicks and model refinement. A dataset of 1444 frames from 294 magnetic resonance imaging records of DCM patients was used, and we developed two segmentation models for comparison: auto-segmentation and interactive segmentation. The former was based on U-Net and utilized a pretrained ConvNeXT-tiny as its encoder. For the latter, we employed an interactive segmentation model structured by SimpleClick, a large model that utilizes a vision transformer as its backbone, together with simple fine-tuning. The segmentation performance of the two models was compared in terms of Dice score, mean intersection over union (mIoU), average precision, and Hausdorff distance. The efficiency of the interactive segmentation model was evaluated by the number of clicks required to achieve a target mIoU. The interactive model achieved better scores across all four evaluation metrics for segmentation accuracy, showing improvements of +6.4%, +1.8%, +3.7%, and -53.0% for canal segmentation, and +11.7%, +6.0%, +18.2%, and -70.9% for cord segmentation with 15 clicks, respectively. The required clicks for the interactive segmentation model to achieve a 90% mIoU for spinal canal with cord cases and 80% mIoU for spinal cord cases were 11.71 and 11.99, respectively. We found that the interactive segmentation model significantly outperformed the auto-segmentation model. By incorporating simple manual inputs, the interactive model effectively identified regions of interest, particularly in the complex and irregular shapes of the spinal cord, demonstrating both enhanced accuracy and adaptability.
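The overlap metrics and the click-efficiency measure named in this abstract can be illustrated with a minimal sketch. It assumes binary 2D masks stored as nested lists; the function names and the toy masks are hypothetical and are not taken from the paper, which reports averages over many cases.

```python
from __future__ import annotations

from typing import Optional, Sequence

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = foreground (assumed layout)


def overlap_counts(pred: Mask, truth: Mask) -> tuple[int, int, int]:
    """Return (intersection, predicted foreground, true foreground) pixel counts."""
    inter = pred_fg = truth_fg = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            pred_fg += 1 if p else 0
            truth_fg += 1 if t else 0
    return inter, pred_fg, truth_fg


def dice_score(pred: Mask, truth: Mask) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); two empty masks count as a perfect match."""
    inter, pred_fg, truth_fg = overlap_counts(pred, truth)
    denom = pred_fg + truth_fg
    return 1.0 if denom == 0 else 2.0 * inter / denom


def iou_score(pred: Mask, truth: Mask) -> float:
    """IoU = |A ∩ B| / |A ∪ B|; two empty masks count as a perfect match."""
    inter, pred_fg, truth_fg = overlap_counts(pred, truth)
    union = pred_fg + truth_fg - inter
    return 1.0 if union == 0 else inter / union


def clicks_to_target(iou_per_click: Sequence[float], target: float) -> Optional[int]:
    """Number of clicks after which the IoU first reaches the target, or None."""
    for clicks, iou in enumerate(iou_per_click, start=1):
        if iou >= target:
            return clicks
    return None


# Toy example with made-up masks.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(dice_score(pred, truth), iou_score(pred, truth))
```

Averaging `clicks_to_target` over all cases would give a figure comparable in kind to the 11.71 and 11.99 clicks reported above, under the assumption that the target is checked after every click.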

MRI Segmentation Neurological Methodology In Silico

A radiomics-based machine learning model and SHAP for predicting spread through air spaces and its prognostic implications in stage I lung adenocarcinoma: a multicenter cohort study.

Wang Y, Liu X, Zhao X, Wang Z, Li X, Sun D

•papers•Sep 29 2025

Despite early detection via low-dose computed tomography and complete surgical resection for early-stage lung adenocarcinoma, postoperative recurrence remains high, particularly in patients with tumor spread through air spaces. A reliable preoperative prediction model is urgently needed to adjust the treatment modality. In this multicenter retrospective study, 609 patients with pathological stage I lung adenocarcinoma from 3 independent centers were enrolled. Regions of interest for the primary tumor and peritumoral areas (extended by three, six, and twelve voxel units) were manually delineated from preoperative CT imaging. Quantitative imaging features were extracted and filtered by correlation analysis and random forest ranking to yield 40 candidate features. Fifteen machine learning methods were evaluated, and a ten-fold cross-validated elastic net regression model was selected to construct the radiomics-based prediction model. A clinical model based on five key clinical variables and a combined model integrating imaging and clinical features were also developed. The radiomics model achieved accuracies of 0.801, 0.866, and 0.831 in the training set and two external test sets, with AUCs of 0.791, 0.829, and 0.807. In one external test set, the clinical model had an AUC of 0.689, significantly lower than the radiomics model (0.807, p < 0.05). The combined model achieved the highest performance, with AUCs of 0.834 in the training set and 0.894 in an external test set (p < 0.01 and p < 0.001, respectively). Interpretability analysis revealed that wavelet-transformed features dominated the model, with the highest contribution from a feature reflecting small high-intensity clusters within the tumor and the second highest from a feature representing low-intensity clusters in the six-voxel peritumoral region. 
Kaplan-Meier analysis demonstrated that patients with either pathologically confirmed or model-predicted spread had significantly shorter progression-free survival (p < 0.001). Our novel machine learning model, integrating imaging features from both tumor and peritumoral regions, preoperatively predicts tumor spread through air spaces in stage I lung adenocarcinoma. It outperforms traditional clinical models, highlighting the potential of quantitative imaging analysis in personalizing treatment. Future prospective studies and further optimization are warranted.
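As a reminder of what the AUC values in this abstract measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name and the toy data are hypothetical; the abstract does not state how the authors computed AUC.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: 1 = spread through air spaces present (made-up scores).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a rank-based formulation gives the same value more efficiently.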

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Convolutional neural network models of structural MRI for discriminating categories of cognitive impairment: a systematic review and meta-analysis.

Dong X, Li Y, Hao J, Zhou P, Yang C, Ai Y, He M, Zhang W, Hu H

•papers•Sep 29 2025

Alzheimer's disease (AD) and mild cognitive impairment (MCI) pose significant challenges to public health and underscore the need for accurate and early diagnostic tools. Structural magnetic resonance imaging (sMRI) combined with advanced analytical techniques like convolutional neural networks (CNNs) seemed to offer a promising avenue for the diagnosis of these conditions. This systematic review and meta-analysis aimed to evaluate the diagnostic performance of CNN algorithms applied to sMRI data in differentiating between AD, MCI, and normal cognition (NC). Following the PRISMA-DTA guidelines, a comprehensive literature search was carried out in PubMed and Web of Science databases for studies published between 2018 and 2024. Studies were included if they employed CNNs for the diagnostic classification of sMRI data from participants with AD, MCI, or NC. The methodological quality of the included studies was assessed using the QUADAS-2 and METRICS tools. Data extraction and statistical analysis were performed to calculate pooled diagnostic accuracy metrics. A total of 21 studies comprising 16,139 participants were included in the analysis. The pooled sensitivity and specificity of CNN algorithms for differentiating AD from NC were 0.92 and 0.91, respectively. For distinguishing MCI from NC, the pooled sensitivity and specificity were 0.74 and 0.79, respectively. The algorithms also showed a moderate ability to differentiate AD from MCI, with a pooled sensitivity and specificity of 0.73 and 0.79, respectively. In the progressive MCI (pMCI) versus stable MCI (sMCI) classification, the pooled sensitivity was 0.69 and the pooled specificity was 0.81. Heterogeneity across studies was significant, as indicated by meta-regression results. CNN algorithms demonstrated promising diagnostic performance in differentiating AD, MCI, and NC using sMRI data. The highest accuracy was observed in distinguishing AD from NC and the lowest in distinguishing pMCI from sMCI. 
These findings suggest that CNN-based radiomics has the potential to serve as a valuable tool in the diagnostic armamentarium for neurodegenerative diseases. However, the heterogeneity among studies indicates a need for further methodological refinement and validation. This systematic review was registered in PROSPERO (Registration ID: CRD42022295408).
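To make the idea of pooled sensitivity and specificity concrete, the sketch below sums confusion-matrix counts across studies. This is a deliberately naive pooling used only for illustration: the abstract does not say which statistical model was used, and diagnostic meta-analyses commonly rely on random-effects models that this sketch does not implement. The `Study` class and all counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Study:
    """Per-study 2x2 counts (hypothetical container, not from the review)."""
    tp: int  # diseased, classified as diseased
    fn: int  # diseased, classified as normal
    tn: int  # normal, classified as normal
    fp: int  # normal, classified as diseased


def naive_pooled_sens_spec(studies: Iterable[Study]) -> Tuple[float, float]:
    """Pool by summing counts across studies (ignores between-study variance)."""
    studies = list(studies)
    tp = sum(s.tp for s in studies)
    fn = sum(s.fn for s in studies)
    tn = sum(s.tn for s in studies)
    fp = sum(s.fp for s in studies)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both diseased and non-diseased participants")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up counts for two studies.
example = [Study(tp=90, fn=10, tn=80, fp=20), Study(tp=45, fn=5, tn=95, fp=5)]
sensitivity, specificity = naive_pooled_sens_spec(example)
print(sensitivity, specificity)
```

Because larger studies dominate the summed counts, this approach can hide exactly the heterogeneity that the review reports as significant.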

MRI Classification Neurological Meta Analysis In Silico Benchmark SOTA

Mixed prototype correction for causal inference in medical image classification.

Hong ZL, Yang JC, Peng XR, Wu SS

•papers•Sep 29 2025

The heterogeneity of medical images poses significant challenges to accurate disease diagnosis. To tackle this issue, the impact of such heterogeneity on the causal relationship between image features and diagnostic labels should be incorporated into model design, which, however, remains underexplored. In this paper, we propose a mixed prototype correction for causal inference (MPCCI) method, aimed at mitigating the impact of unseen confounding factors on the causal relationships between medical images and disease labels, so as to enhance the diagnostic accuracy of deep learning models. The MPCCI comprises a causal inference component based on front-door adjustment and an adaptive training strategy. The causal inference component employs a multi-view feature extraction (MVFE) module to establish mediators, and a mixed prototype correction (MPC) module to execute causal interventions. Moreover, the adaptive training strategy incorporates both information purity and maturity metrics to maintain stable model training. Experimental evaluations on four medical image datasets, encompassing CT and ultrasound modalities, demonstrate the superior diagnostic accuracy and reliability of the proposed MPCCI. The code will be available at https://github.com/Yajie-Zhang/MPCCI .
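The front-door adjustment mentioned in this abstract has a standard textbook form for discrete variables: P(y | do(x)) = Σ_m P(m | x) Σ_x' P(y | m, x') P(x'). The sketch below evaluates that formula on small probability tables. It is not the MPCCI implementation, which works on learned image features; the variable layout and all probabilities are made-up assumptions for illustration.

```python
from typing import Dict, Tuple


def front_door_effect(
    x: int,
    p_x: Dict[int, float],
    p_m_given_x: Dict[int, Dict[int, float]],
    p_y1_given_mx: Dict[Tuple[int, int], float],
) -> float:
    """P(Y=1 | do(X=x)) via the discrete front-door formula.

    p_x[x']               : marginal P(X = x')
    p_m_given_x[x][m]     : P(M = m | X = x)
    p_y1_given_mx[(m, x')] : P(Y = 1 | M = m, X = x')
    """
    total = 0.0
    for m, p_m in p_m_given_x[x].items():
        # Average the outcome over the observed distribution of X,
        # which blocks the back-door path through the hidden confounder.
        inner = sum(p_y1_given_mx[(m, x_prime)] * p for x_prime, p in p_x.items())
        total += p_m * inner
    return total


# Made-up tables for binary X (image factor), M (mediator), Y (label).
p_x = {0: 0.5, 1: 0.5}
p_m_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_y1_given_mx = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.6, (1, 1): 0.8}

print(front_door_effect(1, p_x, p_m_given_x, p_y1_given_mx))
print(front_door_effect(0, p_x, p_m_given_x, p_y1_given_mx))
```

In the abstract's terms, the mediator role is played by the features from the MVFE module, while the intervention step is handled by the MPC module; how those map onto the formula in detail is described in the paper, not here.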

Mixed Modality Classification Methodology In Silico Academic Lab Open Code

A machine learning approach for non-invasive PCOS diagnosis from ultrasound and clinical features.

Agirsoy M, Oehlschlaeger MA

•papers•Sep 29 2025

This study investigates the use of machine learning (ML) algorithms to support faster and more accurate diagnosis of polycystic ovary syndrome (PCOS), with a focus on both predictive performance and clinical applicability. Multiple algorithms were evaluated, including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost). XGBoost consistently outperformed the other models and was selected for final development and validation. To align with the Rotterdam criteria, the dataset was structured into three feature categories: clinical, biochemical, and ultrasound (USG) data. The study explored various combinations of these feature subsets to identify the most efficient diagnostic pathways. Feature selection using the chi-square-based SelectKBest method revealed the top 10 predictive features, which were further validated through XGBoost's internal feature importance, SHAP analysis, and expert clinical assessment. The final XGBoost model demonstrated robust performance across multiple feature combinations:
• Clinical + USG + AMH: AUC = 0.9947, Precision = 0.9553, F1 Score = 0.9553, Accuracy = 0.9553.
• Clinical + USG: AUC = 0.9852, Precision = 0.9583, F1 Score = 0.9388, Accuracy = 0.9384.
The most influential features included follicle count on both ovaries, weight gain, Anti-Müllerian Hormone (AMH), hair growth, menstrual irregularity, fast food consumption, pimples, and hair loss. External validation was performed using a publicly available dataset containing 320 instances and 18 diagnostic features. The XGBoost model trained on the top-ranked features achieved perfect performance on the test set (AUC = 1.0, Precision = 1.0, F1 Score = 1.0, Accuracy = 1.0), though further validation is necessary to rule out overfitting or data leakage. 
These findings suggest that combining clinical and ultrasound features enables highly accurate, non-invasive, and cost-effective PCOS diagnosis. This study demonstrates the potential of ML-driven tools to streamline clinical workflows, reduce reliance on invasive diagnostics, and support early intervention in women's health.
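The chi-square-based feature ranking mentioned in this abstract can be sketched in a simplified form. The version below scores binary features against binary labels with the Pearson chi-square statistic on a 2x2 contingency table and keeps the top k. It is a stand-in under stated assumptions, not the scikit-learn `SelectKBest` routine the authors used, which scores non-negative feature values somewhat differently; the feature names and data are hypothetical.

```python
from typing import Dict, List, Sequence


def chi_square_2x2(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Pearson chi-square statistic for a binary feature against binary labels."""
    n = len(labels)
    observed = {(f, y): 0 for f in (0, 1) for y in (0, 1)}
    for f, y in zip(feature, labels):
        observed[(f, y)] += 1
    stat = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = observed[(f, 0)] + observed[(f, 1)]
            col_total = observed[(0, y)] + observed[(1, y)]
            expected = row_total * col_total / n
            if expected > 0:
                stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


def select_k_best(features: Dict[str, Sequence[int]], labels: Sequence[int], k: int) -> List[str]:
    """Names of the k features with the largest chi-square statistic."""
    ranked = sorted(features, key=lambda name: chi_square_2x2(features[name], labels), reverse=True)
    return ranked[:k]


# Made-up data: 1 = PCOS in labels, 1 = symptom present in features.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "irregular_cycle": [1, 1, 1, 1, 0, 0, 0, 0],  # hypothetical feature name
    "fast_food": [1, 0, 1, 0, 1, 0, 1, 0],        # hypothetical feature name
}
print(select_k_best(features, labels, k=1))
```

A univariate ranking like this evaluates each feature in isolation, which is one reason the study cross-checks the selection against model-based importance and SHAP values.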

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Artificial Intelligence Deep Learning Ultrasound Discrimination of Cosmetic Fillers: A Multicenter Study.

Wortsman X, Lozano M, Rodriguez FJ, Valderrama Y, Ortiz-Orellana G, Zattar L, de Cabo F, Ducati E, Sigrist R, Fontan C, Rezende J, Gonzalez C, Schelke L, Zavariz J, Barrera P, Velthuis P

•papers•Sep 29 2025

Despite the growing use of artificial intelligence (AI) in medicine, imaging, and dermatology, to date, there is no information on the use of AI for discriminating cosmetic fillers on ultrasound (US). An international collaborative group working in dermatologic and esthetic US was formed and worked with the staff of the Department of Computer Science and AI of the Universidad de Granada to gather and process a relevant number of anonymized images. AI techniques based on deep learning (DL) with YOLO (you only look once) architecture, together with a bounding box annotation tool, allowed experts to manually delineate regions of interest for the discrimination of common cosmetic fillers under real-world conditions. A total of 14 physicians from 6 countries participated in the AI study and compiled a final dataset comprising 1432 US images, including HA (hyaluronic acid), PMMA (polymethylmethacrylate), CaHA (calcium hydroxyapatite), and SO (silicone oil) filler cases. The model exhibited robust and consistent classification performance, with an average accuracy of 0.92 ± 0.04 across the cross-validation folds. YOLOv11 demonstrated outstanding performance in the detection of HA and SO, yielding F1 scores of 0.96 ± 0.02 and 0.94 ± 0.04, respectively. On the other hand, CaHA and PMMA showed somewhat lower and less consistent performance in terms of precision and recall, with F1 scores around 0.83. AI using YOLOv11 allowed us to discriminate reliably between HA and SO across high-frequency US devices of differing complexity and across operators. Further AI DL-specific work is needed to identify CaHA and PMMA more accurately.
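The "mean ± spread across folds" figures in this abstract can be reproduced in form with a short sketch: compute F1 per fold from precision and recall, then summarize. The per-fold numbers are made up, and the use of the sample standard deviation is an assumption, since the abstract does not say which spread measure was reported.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def summarize_folds(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of a per-fold metric."""
    return mean(values), stdev(values)


# Made-up (precision, recall) pairs for one filler class over three folds.
fold_precision_recall = [(0.95, 0.97), (0.93, 0.96), (0.98, 0.95)]
fold_f1 = [f1_score(p, r) for p, r in fold_precision_recall]
average, spread = summarize_folds(fold_f1)
print(f"F1 = {average:.2f} ± {spread:.2f}")
```

A wider spread for a class, as reported for CaHA and PMMA, indicates that performance depends more strongly on which images land in the held-out fold.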

Ultrasound Detection Retrospective Clinical In Silico Consortium

Cross-regional radiomics: a novel framework for relationship-based feature extraction with validation in Parkinson's disease motor subtyping.

Hosseini MS, Aghamiri SMR, Panahi M

•papers•Sep 29 2025

Traditional radiomics approaches focus on single-region feature extraction, limiting their ability to capture complex inter-regional relationships crucial for understanding pathophysiological mechanisms in complex diseases. This study introduces a novel cross-regional radiomics framework that systematically extracts relationship-based features between anatomically and functionally connected brain regions. We analyzed T1-weighted magnetic resonance imaging (MRI) data from 140 early-stage Parkinson's disease patients (70 tremor-dominant, 70 postural instability gait difficulty) from the Parkinson's Progression Markers Initiative (PPMI) database across multiple imaging centers. Eight bilateral motor circuit regions (putamen, caudate nucleus, globus pallidus, substantia nigra) were segmented using standardized atlases. Two feature sets were developed: 48 traditional single-region of interest (ROI) features and 60 novel motor-circuit features capturing cross-regional ratios, asymmetry indices, volumetric relationships, and shape distributions. Six feature engineering scenarios were evaluated using center-based 5-fold cross-validation with six machine learning classifiers to ensure robust generalization across different imaging centers. Motor-circuit features demonstrated superior performance compared to single-ROI features across enhanced preprocessing scenarios. Peak performance was achieved with area under the curve (AUC) of 0.821 ± 0.117 versus 0.650 ± 0.220 for single-ROI features (p = 0.0012, Cohen's d = 0.665). Cross-regional ratios, particularly putamen-substantia nigra relationships, dominated the most discriminative features. Motor-circuit features showed superior generalization across multi-center data and better clinical utility through decision curve analysis and calibration curves. The proposed cross-regional radiomics framework significantly outperforms traditional single-region approaches for Parkinson's disease motor subtype classification. 
This methodology provides a foundation for advancing radiomics applications in complex diseases where inter-regional connectivity patterns are fundamental to pathophysiology.
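The contrast between single-region and relationship-based features can be illustrated with a minimal sketch that derives asymmetry indices and cross-regional ratios from regional volumes. The formulas are common conventions chosen for illustration (asymmetry as left-right difference over the bilateral mean, ratios over bilateral totals); the abstract does not give the authors' exact definitions, and the region volumes are made up.

```python
from itertools import combinations
from typing import Dict, Tuple


def asymmetry_index(left: float, right: float) -> float:
    """(left - right) divided by the bilateral mean; 0.0 means perfect symmetry."""
    bilateral_mean = (left + right) / 2
    if bilateral_mean == 0:
        raise ValueError("region has zero volume on both sides")
    return (left - right) / bilateral_mean


def motor_circuit_features(volumes: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Relationship-based features from {region: (left_volume, right_volume)}."""
    features: Dict[str, float] = {}
    for region, (left, right) in volumes.items():
        features[f"{region}_asymmetry"] = asymmetry_index(left, right)
    # Ratio of bilateral totals for every pair of regions.
    for region_a, region_b in combinations(volumes, 2):
        total_a = sum(volumes[region_a])
        total_b = sum(volumes[region_b])
        features[f"{region_a}/{region_b}_ratio"] = total_a / total_b
    return features


# Made-up volumes in cm^3 for two of the regions named in the abstract.
example = {"putamen": (4.2, 3.8), "substantia_nigra": (0.5, 0.5)}
for name, value in motor_circuit_features(example).items():
    print(name, round(value, 3))
```

With the four bilateral structures in the study, pairwise ratios alone already yield six features, which shows how quickly a relationship-based feature set grows relative to per-region measurements.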

MRI Classification Neurological Methodology In Silico Academic Lab

Development of a High-Performance Ultrasound Prediction Model for the Diagnosis of Endometrial Cancer: An Interpretable XGBoost Algorithm Utilizing SHAP Analysis.

Lai H, Wu Q, Weng Z, Lyu G, Yang W, Ye F

•papers•Sep 29 2025

To develop and validate an ultrasonography-based machine learning (ML) model for predicting malignant endometrial and cavitary lesions. This retrospective study was conducted on patients with pathologically confirmed results following transvaginal or transrectal ultrasound from 2021 to 2023. Endometrial ultrasound features were characterized using the International Endometrial Tumor Analysis (IETA) terminology. The dataset was randomly divided (7:3) into training and validation sets. LASSO (least absolute shrinkage and selection operator) regression was applied for feature selection, and an extreme gradient boosting (XGBoost) model was developed. Performance was assessed via receiver operating characteristic (ROC) analysis, calibration, decision curve analysis, sensitivity, specificity, and accuracy. Among 1080 patients, 6 had a non-measurable endometrium. Of the remaining 1074 cases, 641 were premenopausal and 433 postmenopausal. On the test set, the area under the curve (AUC) of the XGBoost model for the premenopausal group was 0.845 (0.781-0.909), with a relatively low sensitivity (0.588, 0.442-0.722) and a relatively high specificity (0.923, 0.863-0.959); the AUC for the postmenopausal group was 0.968 (0.944-0.992), with both sensitivity (0.895, 0.778-0.956) and specificity (0.931, 0.839-0.974) being relatively high. SHapley Additive exPlanations (SHAP) analysis identified key predictors: endometrial-myometrial junction, endometrial thickness, endometrial echogenicity, color Doppler flow score, and vascular pattern in premenopausal women; endometrial thickness, endometrial-myometrial junction, endometrial echogenicity, and color Doppler flow score in postmenopausal women. The XGBoost-based model exhibited excellent predictive performance, particularly in postmenopausal patients. SHAP analysis further enhances interpretability by identifying key ultrasonographic predictors of malignancy.
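A random 7:3 division like the one described in this abstract can be sketched as a seeded shuffle followed by a cut. The function name, the seed, and the use of plain (non-stratified) sampling are assumptions; the abstract does not say whether the split was stratified by outcome or menopausal status.

```python
import random
from typing import List, Sequence, Tuple


def split_train_validation(
    case_ids: Sequence[int], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle case IDs reproducibly and cut them into two disjoint sets."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 1074 cases with a measurable endometrium, as reported in the abstract.
train_ids, validation_ids = split_train_validation(range(1074))
print(len(train_ids), len(validation_ids))
```

Fixing the seed keeps the partition reproducible, so feature selection and model fitting can be rerun on the same training cases.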

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Novel multi-task learning for Alzheimer's stage classification using hippocampal MRI segmentation, feature fusion, and nomogram modeling.

Hu W, Du Q, Wei L, Wang D, Zhang G

•papers•Sep 29 2025

To develop and validate a comprehensive and interpretable framework for multi-class classification of Alzheimer's disease (AD) progression stages based on hippocampal MRI, integrating radiomic, deep, and clinical features. This retrospective multi-center study included 2956 patients across four AD stages (Non-Demented, Very Mild Demented, Mild Demented, Moderate Demented). T1-weighted MRI scans were processed through a standardized pipeline involving hippocampal segmentation using four models (U-Net, nnU-Net, Swin-UNet, MedT). Radiomic features (n = 215) were extracted using the SERA platform, and deep features (n = 256) were learned using an LSTM network with attention applied to hippocampal slices. Fused features were harmonized with ComBat and filtered by ICC (≥ 0.75), followed by LASSO-based feature selection. Classification was performed using five machine learning models, including Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptron (MLP), and eXtreme Gradient Boosting (XGBoost). Model interpretability was addressed using SHAP, and a nomogram and decision curve analysis (DCA) were developed. Additionally, an end-to-end 3D CNN-LSTM model and two transformer-based benchmarks (Vision Transformer, Swin Transformer) were trained for comparative evaluation. MedT achieved the best hippocampal segmentation (Dice = 92.03% external). Fused features yielded the highest classification performance with XGBoost (external accuracy = 92.8%, AUC = 94.2%). SHAP identified MMSE, hippocampal volume, and APOE ε4 as top contributors. The nomogram accurately predicted early-stage AD with clinical utility confirmed by DCA. The end-to-end model performed acceptably (AUC = 84.0%) but lagged behind the fused pipeline. Statistical tests confirmed significant performance advantages for feature fusion and MedT-based segmentation. 
This study demonstrates that integrating radiomics, deep learning, and clinical data from hippocampal MRI enables accurate and interpretable classification of AD stages. The proposed framework is robust, generalizable, and clinically actionable, representing a scalable solution for AD diagnostics.

MRI Classification Neurological Retrospective Clinical In Silico Benchmark SOTA

Evaluation of Context-Aware Prompting Techniques for Classification of Tumor Response Categories in Radiology Reports Using Large Language Model.

Park J, Sim WS, Yu JY, Park YR, Lee YH

•papers•Sep 29 2025

Radiology reports are essential for medical decision-making, providing crucial data for diagnosing diseases, devising treatment plans, and monitoring disease progression. While large language models (LLMs) have shown promise in processing free-text reports, research on effective prompting techniques for radiologic applications remains limited. This study aimed to evaluate the effectiveness of LLM-driven classification of tumor response category (TRC) from radiology reports, and to optimize the model by comparing four prompt engineering techniques for performing this classification task in clinical applications. We included 3062 whole-spine contrast-enhanced magnetic resonance imaging (MRI) radiology reports for prompt engineering and validation. TRCs were labeled by two radiologists based on criteria modified from the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines. The Llama3 instruct model was used to classify TRCs through four different prompts: General, In-Context Learning (ICL), Chain-of-Thought (CoT), and ICL with CoT. AUROC, accuracy, precision, recall, and F1-score were calculated for each prompt and model size (8B, 70B) on the test report dataset. The average AUROC for ICL (0.96 internally, 0.93 externally) and ICL with CoT prompts (0.97 internally, 0.94 externally) outperformed other prompts. Error increased with prompt complexity, including 0.8% incomplete sentence errors and 11.3% probability-classification inconsistencies. This study demonstrates that context-aware LLM prompts substantially improved the efficiency and effectiveness of classifying TRCs from radiology reports, despite potential intrinsic hallucinations. While further improvements are required for real-world application, our findings suggest that context-aware prompts have significant potential for segmenting complex radiology reports and enhancing oncology clinical workflows.
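The four prompt styles compared in this abstract differ in whether labeled examples and a reasoning instruction are included. The sketch below assembles all four variants from one template. Every string, the category names, and the example report are hypothetical placeholders; the study's actual prompts and TRC labels are not given in the abstract.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical instruction used for the Chain-of-Thought variants.
COT_INSTRUCTION = "Think step by step before giving the final category."


def build_prompt(
    report: str,
    categories: Sequence[str],
    examples: Iterable[Tuple[str, str]] = (),
    chain_of_thought: bool = False,
) -> str:
    """Assemble a classification prompt; examples enable ICL, the flag enables CoT."""
    lines = [
        "Classify the tumor response category of the radiology report.",
        "Allowed categories: " + ", ".join(categories) + ".",
    ]
    for example_report, example_label in examples:
        lines.append(f"Example report: {example_report}")
        lines.append(f"Example category: {example_label}")
    if chain_of_thought:
        lines.append(COT_INSTRUCTION)
    lines.append(f"Report: {report}")
    lines.append("Category:")
    return "\n".join(lines)


# Placeholder labels and texts, not the study's categories or data.
categories = ["progression", "stable", "response"]
shots = [("New enhancing lesion at T7.", "progression")]
report = "No interval change of the known L2 lesion."

prompts = {
    "General": build_prompt(report, categories),
    "ICL": build_prompt(report, categories, examples=shots),
    "CoT": build_prompt(report, categories, chain_of_thought=True),
    "ICL with CoT": build_prompt(report, categories, examples=shots, chain_of_thought=True),
}
print(prompts["ICL with CoT"])
```

Keeping one template for all variants makes the comparison cleaner, since the only differences between prompts are the components under study.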

MRI Classification Retrospective Clinical In Silico Academic Lab GenAI
