Latest Papers on Radiology AI. Tags: In Silico

Impact of large language models and vision deep learning models in predicting neoadjuvant rectal score for rectal cancer treated with neoadjuvant chemoradiation.

Kim HB, Tan HQ, Nei WL, Tan YCRS, Cai Y, Wang F

•papers•Jul 31 2025

This study aims to explore Deep Learning methods, namely Large Language Models (LLMs) and Computer Vision models, to accurately predict the neoadjuvant rectal (NAR) score for locally advanced rectal cancer (LARC) treated with neoadjuvant chemoradiation (NACRT). The NAR score is a validated surrogate endpoint for LARC. 160 CT scans of patients were used in this study, along with 4 different types of radiology reports, 2 generated from CT scans and the other 2 from MRI scans, both before and after NACRT. For CT scans, two different approaches with convolutional neural networks were utilized to tackle the 3D scan entirely or tackle it slice by slice. For radiology reports, an encoder architecture LLM was used. The performance of the approaches was quantified by the Area under the Receiver Operating Characteristic curve (AUC). The two different approaches for CT scans yielded [Formula: see text] and [Formula: see text] while the LLM trained on post NACRT MRI reports showed the most predictive potential at [Formula: see text] and a statistically significant improvement (p = 0.03) over the baseline clinical approach (from [Formula: see text] to [Formula: see text]). This study showcases the potential of Large Language Models and the inadequacies of CT scans in predicting NAR values. Clinical trial number: Not applicable.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Prognostication in patients with idiopathic pulmonary fibrosis using quantitative airway analysis from HRCT: a retrospective study.

Nan Y, Federico FN, Humphries S, Mackintosh JA, Grainge C, Jo HE, Goh N, Reynolds PN, Hopkins PMA, Navaratnam V, Moodley Y, Walters H, Ellis S, Keir G, Zappala C, Corte T, Glaspole I, Wells AU, Yang G, Walsh SL

•papers•Jul 31 2025

Predicting shorter life expectancy is crucial for prioritizing antifibrotic therapy in fibrotic lung diseases (FLD), where progression varies widely, from stability to rapid deterioration. This heterogeneity complicates treatment decisions, emphasizing the need for reliable baseline measures. This study leverages an artificial intelligence model to address heterogeneity in disease outcomes, focusing on mortality as the ultimate measure of disease trajectory. This retrospective study included 1744 anonymised patients who underwent high-resolution CT scanning. The AI model, SABRE (Smart Airway Biomarker Recognition Engine), was developed using data from patients with various lung diseases (n=460, including lung cancer, pneumonia, emphysema, and fibrosis). Then, 1284 high-resolution CT scans with evidence of diffuse FLD from the Australian IPF Registry and OSIC were used for clinical analyses. Airway branches were categorized and quantified by anatomic structures and volumes, followed by multivariable analysis to explore the associations between these categories and patients' progression and mortality, adjusting for disease severity or traditional measurements. Cox regression identified SABRE-based variables as independent predictors of mortality and progression, even after adjusting for disease severity (fibrosis extent, traction bronchiectasis extent, and ILD extent), traditional measures (FVC%, DLCO%, and CPI), and previously reported deep learning algorithms for fibrosis quantification and morphological analysis. Combining SABRE with DLCO significantly improved prognostic utility, yielding an AUC of 0.852 at the first year and a C-index of 0.752. SABRE-based variables capture prognostic signals beyond those provided by traditional measurements, disease severity scores, and established AI-based methods, reflecting the progressiveness and pathogenesis of the disease.

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Effectiveness of Radiomics-Based Machine Learning Models in Differentiating Pancreatitis and Pancreatic Ductal Adenocarcinoma: Systematic Review and Meta-Analysis.

Zhang L, Li D, Su T, Xiao T, Zhao S

•papers•Jul 31 2025

Pancreatic ductal adenocarcinoma (PDAC) and mass-forming pancreatitis (MFP) share similar clinical, laboratory, and imaging features, making accurate diagnosis challenging. Nevertheless, PDAC is highly malignant with a poor prognosis, whereas MFP is an inflammatory condition typically responding well to medical or interventional therapies. Some investigators have explored radiomics-based machine learning (ML) models for distinguishing PDAC from MFP. However, systematic evidence supporting the feasibility of these models is insufficient, presenting a notable challenge for clinical application. This study intended to review the diagnostic performance of radiomics-based ML models in differentiating PDAC from MFP, summarize the methodological quality of the included studies, and provide evidence-based guidance for optimizing radiomics-based ML models and advancing their clinical use. PubMed, Embase, Cochrane, and Web of Science were searched for relevant studies up to June 29, 2024. Eligible studies comprised English cohort, case-control, or cross-sectional designs that applied fully developed radiomics-based ML models (including traditional and deep radiomics) to differentiate PDAC from MFP, while also reporting their diagnostic performance. Studies without full text, limited to image segmentation, or with insufficient outcome metrics were excluded. Methodological quality was appraised by means of the radiomics quality score. Because of the limited applicability of QUADAS-2 in radiomics-based ML studies, the risk of bias was not formally assessed. Pooled sensitivity, specificity, area under the curve of summary receiver operating characteristics (SROC), likelihood ratios, and diagnostic odds ratio were estimated through a bivariate mixed-effects model. Results were presented with forest plots, SROC curves, and Fagan's nomogram. Subgroup analysis was performed to appraise the diagnostic performance of radiomics-based ML models across various imaging modalities, including computed tomography (CT), magnetic resonance imaging, positron emission tomography-CT, and endoscopic ultrasound. This meta-analysis included 24 studies with 14,406 cases, including 7635 PDAC cases. All studies adopted a case-control design, with 5 conducted across multiple centers. Most studies used CT as the primary imaging modality. The radiomics quality scores ranged from 5 points (14%) to 17 points (47%), with an average score of 9 (25%). The radiomics-based ML models demonstrated high diagnostic performance. Based on the independent validation sets, the pooled sensitivity, specificity, area under the curve of SROC, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio were 0.92 (95% CI 0.91-0.94), 0.90 (95% CI 0.85-0.94), 0.94 (95% CI 0.74-0.99), 9.3 (95% CI 6.0-14.2), 0.08 (95% CI 0.07-0.11), and 110 (95% CI 62-194), respectively. Radiomics-based ML models demonstrate high diagnostic accuracy in differentiating PDAC from MFP, underscoring their potential as noninvasive tools for clinical decision-making. Nonetheless, the overall methodological quality was moderate due to limitations in external validation, standardized protocols, and reproducibility. These findings support the promise of radiomics in clinical diagnostics while highlighting the need for more rigorous, multicenter research to enhance model generalizability and clinical applicability.
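The likelihood ratios and diagnostic odds ratio listed above are tied to sensitivity and specificity by standard textbook formulas, and Fagan's nomogram is a graphical form of the odds update shown in the second function. The following is a minimal sketch under that assumption; the function names are hypothetical, and the point-estimate arithmetic only approximates the pooled values, which the abstract says were estimated jointly by a bivariate mixed-effects model.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair."""
    lr_pos = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    lr_neg = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return lr_pos, lr_neg, dor


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Odds-form Bayes update, the calculation a Fagan nomogram draws."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled sensitivity and specificity quoted in the abstract.
lr_pos, lr_neg, dor = diagnostic_ratios(0.92, 0.90)
print(round(lr_pos, 2), round(lr_neg, 3), round(dor, 1))

# Illustrative pre-test probability of 50% (an assumption, not from the study).
print(round(post_test_probability(0.5, lr_pos), 3))
```

The direct calculation gives values close to, but not identical with, the reported 9.3, 0.08, and 110, which is expected because the pooled estimates are not derived from the pooled sensitivity and specificity alone.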

Mixed Modality Classification Abdominal Meta Analysis In Silico Academic Lab Benchmark SOTA

External Validation of a Winning Artificial Intelligence Algorithm from the RSNA 2022 Cervical Spine Fracture Detection Challenge.

Harper JP, Lee GR, Pan I, Nguyen XV, Quails N, Prevedello LM

•papers•Jul 31 2025

The Radiological Society of North America has actively promoted artificial intelligence (AI) challenges since 2017. Algorithms emerging from the recent RSNA 2022 Cervical Spine Fracture Detection Challenge demonstrated state-of-the-art performance in the competition's data set, surpassing results from prior publications. However, their performance in real-world clinical practice is not known. As an initial step toward the goal of assessing feasibility of these models in clinical practice, we conducted a generalizability test by using one of the leading algorithms of the competition. The deep learning algorithm was selected due to its performance, portability, and ease of use, and installed locally. One hundred examinations (50 consecutive cervical spine CT scans with at least 1 fracture present and 50 consecutive negative CT scans) from a level 1 trauma center not represented in the competition data set were processed at 6.4 seconds per examination. Ground truth was established based on the radiology report with retrospective confirmation of positive fracture cases. Sensitivity, specificity, F1 score, and area under the curve were calculated. The external validation data set comprised older patients in comparison to the competition set (53.5 ± 21.8 years versus 58 ± 22.0, respectively; P < .05). Sensitivity and specificity were 86% and 70% in the external validation group and 85% and 94% in the competition group, respectively. Fractures misclassified by the convolutional neural networks frequently had features of advanced degenerative disease, subtle nondisplaced fractures not easily identified on the axial plane, and malalignment. The model performed with a similar sensitivity on the test and external data set, suggesting that such a tool could be potentially generalizable as a triage tool in the emergency setting. Discordant factors such as age-associated comorbidities may affect accuracy and specificity of AI models when used in certain populations. Further research should be encouraged to help elucidate the potential contributions and pitfalls of these algorithms in supporting clinical care.
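To make the reported metrics concrete, the sketch below computes sensitivity, specificity, precision, and F1 from a 2x2 confusion table. The counts are an assumption: they are back-calculated from the stated 50 positive and 50 negative examinations and the 86% and 70% figures, and are not given in the abstract. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-table counts."""
    sensitivity = tp / (tp + fn)           # recall on fracture-positive scans
    specificity = tn / (tn + fp)           # recall on fracture-negative scans
    precision = tp / (tp + fp)             # positive predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)       # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Assumed counts: 86% of 50 positives -> 43 TP, 7 FN; 70% of 50 negatives -> 35 TN, 15 FP.
metrics = classification_metrics(tp=43, fn=7, tn=35, fp=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The precision and F1 values this prints follow from the assumed counts and a balanced 50/50 sample; they are not figures reported by the study, and precision in particular would differ at a realistic fracture prevalence.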

CT Detection Musculoskeletal Retrospective Clinical In Silico Consortium Benchmark SOTA

DiSC-Med: Diffusion-based Semantic Communications for Robust Medical Image Transmission

Fupei Guo, Hao Zheng, Xiang Zhang, Li Chen, Yue Wang, Songyang Zhang

•preprint•Jul 31 2025

The rapid development of artificial intelligence has driven smart health with next-generation wireless communication technologies, stimulating exciting applications in remote diagnosis and intervention. To enable a timely and effective response for remote healthcare, efficient transmission of medical data through noisy channels with limited bandwidth emerges as a critical challenge. In this work, we propose a novel diffusion-based semantic communication framework, namely DiSC-Med, for the medical image transmission, where medical-enhanced compression and denoising blocks are developed for bandwidth efficiency and robustness, respectively. Unlike conventional pixel-wise communication framework, our proposed DiSC-Med is able to capture the key semantic information and achieve superior reconstruction performance with ultra-high bandwidth efficiency against noisy channels. Extensive experiments on real-world medical datasets validate the effectiveness of our framework, demonstrating its potential for robust and efficient telehealth applications.

Mixed Modality Reconstruction Methodology In Silico Academic Lab

Application of Tuning-Ensemble N-Best in Auto-Sklearn for Mammographic Radiomic Analysis for Breast Cancer Prediction.

Ismail FA, Karim MKA, Zaidon SIA, Noor KA

•papers•Jul 31 2025

Breast cancer is a major cause of mortality among women globally. While mammography remains the gold standard for detection, its interpretation is often limited by radiologist variability and the challenge of differentiating benign and malignant lesions. The study explores the use of Auto-Sklearn, an automated machine learning (AutoML) framework, for breast tumor classification based on mammographic radiomic features. 244 mammographic images were enhanced using Contrast Limited Adaptive Histogram Equalization (CLAHE) and segmented with the Active Contour Method (ACM). Thirty-seven radiomic features, including first-order statistics, Gray-Level Co-occurrence Matrix (GLCM) texture, and shape features, were extracted and standardized. Auto-Sklearn was employed to automate model selection, hyperparameter tuning, and ensemble construction. The dataset was divided into 80% training and 20% testing sets. The initial Auto-Sklearn model achieved 88.71% accuracy on the training set and 55.10% on the testing set. After the resampling strategy was applied, the accuracy for the training set and testing set increased to 95.26% and 76.16%, respectively. The area under the receiver operating characteristic curve (ROC-AUC) for the standard and resampling strategies of Auto-Sklearn was 0.660 and 0.840, respectively, outperforming conventional models and demonstrating its efficiency in automating radiomic classification tasks. The findings underscore Auto-Sklearn's ability to automate and enhance tumor classification performance using handcrafted radiomic features. Limitations include dataset size and absence of clinical metadata. This study highlights the application of Auto-Sklearn as a scalable, automated, and clinically relevant tool for breast cancer classification using mammographic radiomics.

Mammography Classification Breast Retrospective Clinical In Silico Academic Lab

Cognitive profiles associated with faster thalamic atrophy in multiple sclerosis.

Amin M, Scullin K, Nakamura K, Ontaneda D, Galioto R

•papers•Jul 31 2025

Cognitive impairment (CI) in people with MS (pwMS) has complex pathophysiology. Neuropsychological testing (NPT) can be helpful, but interpretation may be challenging for clinicians. Thalamic atrophy (TA) has been shown to correlate with both neurodegeneration and CI. The objective was to leverage machine learning methods to link CI and longitudinal neuroimaging biomarkers. This was a retrospective review of adult pwMS with NPT and ≥2 brain MRIs. Quantitative MRI regional change rates were calculated using mixed effects models. Participants were divided into training and validation cohorts. K-means clustering was performed on the first and second NPT principal components (PC1 and PC2). MRI change rates were compared between clusters. 112 participants were included (mean age 48 years, 71% female, 80% relapsing remitting). Processing speed and memory were the major contributors to PC1. We identified two clusters based on PC1, one with significantly more TA in both training and validation cohorts (p = 0.035; p = 0.002) and similar rates of change in all other quantitative MRI measures. The most important contributors to PC1 included measures of processing speed (SDMT/WAIS Coding) and memory (List Learning/BVMT immediate and delayed recall). This clustering method identified a profile of NPT results strongly linked to and possibly driven by TA. These results confirm the validity of previously established findings using more advanced analyses, in addition to offering novel insights into NPT dimensionality reduction.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Topology Optimization in Medical Image Segmentation with Fast Euler Characteristic

Liu Li, Qiang Ma, Cheng Ouyang, Johannes C. Paetzold, Daniel Rueckert, Bernhard Kainz

•preprint•Jul 31 2025

Deep learning-based medical image segmentation techniques have shown promising results when evaluated based on conventional metrics such as the Dice score or Intersection-over-Union. However, these fully automatic methods often fail to meet clinically acceptable accuracy, especially when topological constraints should be observed, e.g., continuous boundaries or closed surfaces. In medical image segmentation, the correctness of a segmentation in terms of the required topological genus is sometimes even more important than the pixel-wise accuracy. Existing topology-aware approaches commonly estimate and constrain the topological structure via the concept of persistent homology (PH). However, these methods are difficult to implement for high-dimensional data due to their polynomial computational complexity. To overcome this problem, we propose a novel and fast approach for topology-aware segmentation based on the Euler Characteristic ($\chi$). First, we propose a fast formulation for $\chi$ computation in both 2D and 3D. The scalar $\chi$ error between the prediction and ground-truth serves as the topological evaluation metric. Then we estimate the spatial topology correctness of any segmentation network via a so-called topological violation map, i.e., a detailed map that highlights regions with $\chi$ errors. Finally, the segmentation results from the arbitrary network are refined based on the topological violation maps by a topology-aware correction network. Our experiments are conducted on both 2D and 3D datasets and show that our method can significantly improve topological correctness while preserving pixel-wise segmentation accuracy.
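For readers unfamiliar with the Euler characteristic, the sketch below computes $\chi$ for a 2D binary mask with the textbook cell-counting identity $\chi = V - E + F$, treating each foreground pixel as a closed unit square (which corresponds to 8-connectivity). This is an assumption made for illustration and is not the authors' fast formulation, which the abstract does not describe; the function name and the toy masks are hypothetical.

```python
def euler_characteristic_2d(mask):
    """chi = V - E + F for a binary mask given as a list of rows of 0/1."""
    vertices = set()   # pixel corners touched by at least one foreground pixel
    edges = set()      # pixel sides touched by at least one foreground pixel
    faces = 0          # foreground pixels themselves
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if not value:
                continue
            faces += 1
            for dr in (0, 1):
                for dc in (0, 1):
                    vertices.add((r + dr, c + dc))
            edges.add(("h", r, c))        # top side
            edges.add(("h", r + 1, c))    # bottom side
            edges.add(("v", r, c))        # left side
            edges.add(("v", r, c + 1))    # right side
    return len(vertices) - len(edges) + faces


# Ground truth: a closed ring (one component, one hole).
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
# Prediction: the ring is broken at the bottom, so the hole is lost.
broken_ring = [[1, 1, 1],
               [1, 0, 1],
               [1, 0, 1]]

chi_truth = euler_characteristic_2d(ring)
chi_pred = euler_characteristic_2d(broken_ring)
chi_error = abs(chi_pred - chi_truth)   # scalar topological error
print(chi_truth, chi_pred, chi_error)
```

In 2D, $\chi$ equals the number of connected components minus the number of holes, so a prediction can differ from the ground truth by a single pixel and still register a nonzero $\chi$ error, as the broken ring does here.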

Mixed Modality Segmentation Methodology In Silico Breakthrough

Interpreting convolutional neural network explainability for head-and-neck cancer radiotherapy organ-at-risk segmentation

Strijbis, V. I. J., Gurney-Champion, O. J., Grama, D. I., Slotman, B. J., Verbakel, W. F. A. R.

•preprint•Jul 31 2025

Background: Convolutional neural networks (CNNs) have emerged to reduce clinical resources and standardize auto-contouring of organs-at-risk (OARs). Although CNNs perform adequately for most patients, understanding when the CNN might fail is critical for effective and safe clinical deployment. However, the limitations of CNNs are poorly understood because of their black-box nature. Explainable artificial intelligence (XAI) can expose CNNs' inner mechanisms for classification. Here, we investigate the inner mechanisms of CNNs for segmentation and explore a novel, computational approach to a priori flag potentially insufficient parotid gland (PG) contours. Methods: First, 3D UNets were trained in three PG segmentation situations using (1) synthetic cases; (2) 1925 clinical computed tomography (CT) scans with typical contours; and (3) more consistent contours curated through a previously validated auto-curation step. Then, we generated attribution maps for seven XAI methods, and qualitatively assessed them for congruency between simulated and clinical contours, and for how much XAI agreed with expert reasoning. To objectify observations, we explored persistent homology intensity filtrations to capture essential topological characteristics of XAI attributions. Principal component (PC) eigenvalues of Euler characteristic profiles were correlated with spatial agreement (Dice-Sorensen similarity coefficient; DSC). Evaluation was done using sensitivity, specificity and the area under the receiver operating characteristic (AUROC) curve on an external AAPM dataset, where, as proof-of-principle, we regard the lowest 15% DSC as insufficient. Results: PatternNet attributions (PNet-A) focused on soft-tissue structures, whereas guided backpropagation (GBP) highlighted both soft-tissue and high-density structures (e.g. mandible bone), which was congruent with synthetic situations. Both methods typically had higher/denser activations in better auto-contoured medial and anterior lobes. Curated models produced "cleaner" gradient class-activation mapping (GCAM) attributions. Quantitative analysis showed that PC $\lambda_1$ of the guided GCAM (GGCAM) Euler characteristic (EC) profile had good predictive value (sensitivity>0.85, specificity>0.9) of DSC for AAPM cases, with AUROC=0.66, 0.74, 0.94, 0.83 for GBP, GCAM, GGCAM and PNet-A. For $\lambda_1 < -1.8 \times 10^{3}$ of the GGCAM EC profile, 87% of cases were insufficient. Conclusions: GBP and PNet-A qualitatively agreed most with expert reasoning on directly (structure borders) and indirectly (proxies used for identifying structure borders) important features for PG segmentation. Additionally, this work investigated as proof-of-principle how topological data analysis could possibly be used for quantitative XAI signal analysis to a priori mark potentially inadequate CNN segmentations, using only features from inside the predicted PG. This work used PG as a well-understood segmentation paradigm and may extend to target volumes and other organs-at-risk.
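The spatial-agreement measure used above, the Dice-Sorensen similarity coefficient, is defined as $2|A \cap B| / (|A| + |B|)$ for a predicted mask $A$ and a reference mask $B$. A minimal sketch on flattened binary masks follows; the function name and the toy masks are hypothetical, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two equal-length sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0   # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * intersection / total


# Toy flattened masks: 2 voxels each, 1 voxel in common.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
print(dice_coefficient(predicted, reference))
```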

CT Segmentation Neurological Methodology In Silico Academic Lab Ethics

IHE-Net: Hidden feature discrepancy fusion and triple consistency training for semi-supervised medical image segmentation.

Ju M, Wang B, Zhao Z, Zhang S, Yang S, Wei Z

•papers•Jul 31 2025

Teacher-Student (TS) networks have become the mainstream frameworks of semi-supervised deep learning, and are widely used in medical image segmentation. However, traditional TSs based on single or homogeneous encoders often struggle to capture the rich semantic details required for complex, fine-grained tasks. To address this, we propose a novel semi-supervised medical image segmentation framework (IHE-Net), which makes good use of the feature discrepancies of two heterogeneous encoders to improve segmentation performance. The two encoders are instantiated by different learning paradigm networks, namely CNN and Transformer/Mamba, respectively, to extract richer and more robust context representations from unlabeled data. On this basis, we propose a simple yet powerful multi-level feature discrepancy fusion module (MFDF), which effectively integrates different modal features and their discrepancies from two heterogeneous encoders. This design enhances the representational capacity of the model through efficient fusion without introducing additional computational overhead. Furthermore, we introduce a triple consistency learning strategy to improve predictive stability by setting dual decoders and adding mixed output consistency. Extensive experimental results on three skin lesion segmentation datasets, ISIC2017, ISIC2018, and PH2, demonstrate the superiority of our framework. Ablation studies further validate the rationale and effectiveness of the proposed method. Code is available at: https://github.com/joey-AI-medical-learning/IHE-Net.

OCT Segmentation Methodology In Silico Academic Lab Open Code
