Latest Papers on Radiology AI. Category: preprint, Tags: Retrospective Clinical, Order: Best Match, Limit: 10.

Pediatric Pancreas Segmentation from MRI Scans with Deep Learning

Elif Keles, Merve Yazol, Gorkem Durak, Ziliang Hong, Halil Ertugrul Aktas, Zheyuan Zhang, Linkai Peng, Onkar Susladkar, Necati Guzelyel, Oznur Leman Boyunaga, Cemal Yazici, Mark Lowe, Aliye Uc, Ulas Bagci

•preprint•Jun 18 2025

Objective: Our study aimed to evaluate and validate PanSegNet, a deep learning (DL) algorithm for pediatric pancreas segmentation on MRI in children with acute pancreatitis (AP), chronic pancreatitis (CP), and healthy controls. Methods: With IRB approval, we retrospectively collected 84 MRI scans (1.5T/3T Siemens Aera/Verio) from children aged 2-19 years at Gazi University (2015-2024). The dataset includes healthy children as well as patients diagnosed with AP or CP based on clinical criteria. Pediatric and general radiologists manually segmented the pancreas, and the segmentations were then confirmed by a senior pediatric radiologist. PanSegNet-generated segmentations were assessed using the Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). Cohen's kappa measured observer agreement. Results: Pancreas MRI T2W scans were obtained from 42 children with AP/CP (mean age: 11.73 ± 3.9 years) and 42 healthy children (mean age: 11.19 ± 4.88 years). PanSegNet achieved DSC scores of 88% (controls), 81% (AP), and 80% (CP), with HD95 values of 3.98 mm (controls), 9.85 mm (AP), and 15.67 mm (CP). Inter-observer kappa was 0.86 (controls) and 0.82 (pancreatitis), and intra-observer agreement reached 0.88 and 0.81. Strong agreement was observed between automated and manual volumes (R^2 = 0.85 in controls, 0.77 in diseased), demonstrating clinical reliability. Conclusion: PanSegNet represents the first validated deep learning solution for pancreatic MRI segmentation, achieving expert-level performance across healthy and diseased states. The algorithm, along with our annotated dataset, is freely available on GitHub and OSF, advancing accessible, radiation-free pediatric pancreatic imaging and fostering collaborative research in this underserved domain.

MRI Segmentation Abdominal Retrospective Clinical In Silico None Academic Lab Open Dataset Open Code
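
The two segmentation metrics named in the abstract above can be illustrated with a minimal sketch. This is not the PanSegNet evaluation code: it assumes each segmentation is given as a collection of voxel coordinates, it measures distances over all voxels rather than extracted surfaces, and it pools both directed distance sets before taking a nearest-rank 95th percentile. HD95 conventions differ between toolkits, so the function names and choices here are illustrative assumptions.

```python
import math


def dice(a, b):
    """Dice similarity coefficient between two collections of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))


def hd95(a, b):
    """95th-percentile symmetric Hausdorff distance between two point sets.

    Assumption: distances are pooled from both directions and the
    nearest-rank percentile is used; other conventions exist.
    """
    a, b = list(a), list(b)

    def directed(src, dst):
        # distance from every point in src to its nearest point in dst
        return [min(math.dist(p, q) for q in dst) for p in src]

    d = sorted(directed(a, b) + directed(b, a))
    k = max(0, math.ceil(0.95 * len(d)) - 1)
    return d[k]


# Hypothetical toy masks: two 2x2 squares offset by one voxel.
manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
auto = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(dice(manual, auto), hd95(manual, auto))
```

Multiplying voxel distances by the scan's voxel spacing would be needed to report HD95 in millimetres, as the abstract does.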

Multimodal MRI Marker of Cognition Explains the Association Between Cognition and Mental Health in UK Biobank

Buianova, I., Silvestrin, M., Deng, J., Pat, N.

•preprint•Jun 18 2025

Background: Cognitive dysfunction often co-occurs with psychopathology. Advances in neuroimaging and machine learning have led to neural indicators that predict individual differences in cognition with reasonable performance. We examined whether these neural indicators explain the relationship between cognition and mental health in the UK Biobank cohort (n > 14000). Methods: Using machine learning, we quantified the covariation between general cognition and 133 mental health indices and derived neural indicators of cognition from 72 neuroimaging phenotypes across diffusion-weighted MRI (dwMRI), resting-state functional MRI (rsMRI), and structural MRI (sMRI). With commonality analyses, we investigated how much of the cognition-mental health covariation is captured by each neural indicator and neural indicators combined within and across MRI modalities. Results: The predictive association between mental health and cognition was at out-of-sample r = 0.3. Neuroimaging phenotypes captured 2.1% to 25.8% of the cognition-mental health covariation. The highest proportion of variance explained by dwMRI was attributed to the number of streamlines connecting cortical regions (19.3%), by rsMRI through functional connectivity between 55 large-scale networks (25.8%), and by sMRI via the volumetric characteristics of subcortical structures (21.8%). Combining neuroimaging phenotypes within modalities improved the explanation to 25.5% for dwMRI, 29.8% for rsMRI, and 31.6% for sMRI, and combining them across all MRI modalities enhanced the explanation to 48%. Conclusions: We present an integrated approach to derive multimodal MRI markers of cognition that can be transdiagnostically linked to psychopathology. This demonstrates that the predictive ability of neural indicators extends beyond the prediction of cognition itself, enabling us to capture the cognition-mental health covariation.

MRI Classification Neurological Retrospective Clinical In Silico None Academic Lab
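
Commonality analysis, mentioned in the Methods above, partitions the variance explained in an outcome into a part unique to each predictor and a part the predictors share. The sketch below shows the generic two-predictor case using only Pearson correlations. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the toy data, and the restriction to two predictors are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def commonality(y, x1, x2):
    """Split R^2 of y on (x1, x2) into unique and common components."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # R^2 of the two-predictor linear regression, from correlations alone
    r2_full = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return {
        "unique_x1": r2_full - r2**2,  # gained by adding x1 to x2
        "unique_x2": r2_full - r1**2,  # gained by adding x2 to x1
        "common": r1**2 + r2**2 - r2_full,  # shared between x1 and x2
        "total": r2_full,
    }


# Hypothetical toy data: y stands in for cognition, x1 and x2 for two predictors.
y = [1.5, 1.8, 3.2, 3.9, 5.4, 5.6]
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
print(commonality(y, x1, x2))
```

The three components always sum to the full-model R^2, which is what lets a study report the share of a covariation "captured" by a given predictor set.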

A Deep Learning Lung Cancer Segmentation Pipeline to Facilitate CT-based Radiomics

So, A. C. P., Cheng, D., Aslani, S., Azimbagirad, M., Yamada, D., Dunn, R., Josephides, E., McDowall, E., Henry, A.-R., Bille, A., Sivarasan, N., Karapanagiotou, E., Jacob, J., Pennycuick, A.

•preprint•Jun 18 2025

Background: CT-based radio-biomarkers could provide non-invasive insights into tumour biology to risk-stratify patients. One of the limitations is laborious manual segmentation of regions-of-interest (ROI). We present a deep learning auto-segmentation pipeline for radiomic analysis. Patients and Methods: 153 patients with resected stage 2A-3B non-small cell lung cancer (NSCLC) had tumours segmented using nnU-Net with review by two clinicians. The nnU-Net was pretrained with anatomical priors in non-cancerous lungs and finetuned on NSCLCs. Three ROIs were segmented: intra-tumoural, peri-tumoural, and whole lung. 1967 features were extracted using PyRadiomics. Feature reproducibility was tested using segmentation perturbations. Features were selected using minimum-redundancy-maximum-relevance with Random Forest-recursive feature elimination nested in 500 bootstraps. Results: Auto-segmentation time was ~36 seconds/series. Mean volumetric and surface Dice-Sorensen coefficient (DSC) scores were 0.84 (±0.28) and 0.79 (±0.34), respectively. DSC scores were significantly correlated with tumour shape (sphericity, diameter) and location (worse with chest wall adherence), but not batch effects (e.g. contrast, reconstruction kernel). 6.5% of cases had missed segmentations; 6.5% required major changes. Pre-training on anatomical priors resulted in better segmentations compared to training on tumour-labels alone (p<0.001) and tumour with anatomical labels (p<0.001). Most radiomic features were not reproducible following perturbations and resampling. Adding radiomic features, however, did not significantly improve the clinical model in predicting 2-year disease-free survival: AUCs 0.67 (95%CI 0.59-0.75) vs 0.63 (95%CI 0.54-0.71) respectively (p=0.28). Conclusion: Our study demonstrates that integrating auto-segmentation into radio-biomarker discovery is feasible with high efficiency and accuracy. Whilst radiomic analysis shows limited reproducibility, our auto-segmentation may allow more robust radio-biomarker analysis using deep learning features.

CT Segmentation Chest Retrospective Clinical In Silico None Academic Lab Reproducibility

USING ARTIFICIAL INTELLIGENCE TO PREDICT TREATMENT OUTCOMES IN PATIENTS WITH NEUROGENIC OVERACTIVE BLADDER AND MULTIPLE SCLEROSIS

Chang, O., Lee, J., Lane, F., Demetriou, M., Chang, P.

•preprint•Jun 18 2025

Introduction and Objectives: Many women with multiple sclerosis (MS) experience neurogenic overactive bladder (NOAB) characterized by urinary frequency, urinary urgency and urgency incontinence. The objective of the study was to create machine learning (ML) models utilizing clinical and imaging data to predict NOAB treatment success stratified by treatment type. Methods: This was a retrospective cohort study of female patients with a diagnosis of NOAB and MS seen at a tertiary academic center from 2017-2022. Clinical and imaging data were extracted. Three types of NOAB treatment options were evaluated: behavioral therapy, medication therapy and minimally invasive therapies. The primary outcome, treatment success, was defined as > 50% reduction in urinary frequency, urinary urgency or a subjective perception of treatment success. For the construction of the logistic regression ML models, bivariate analyses were performed with backward selection of variables with p-values of < 0.10 and clinically relevant variables applied. For ML, the cohort was split into a training dataset (70%) and a test dataset (30%). Area under the curve (AUC) scores were calculated to evaluate model performance. Results: The 110 patients included had a mean age of 59 years (SD 14 years) and were predominantly White (91.8%) and post-menopausal (68.2%). Patients were stratified by the type of NOAB treatment received, with 70 patients (63.6%) receiving behavioral therapy, 58 (52.7%) medication therapy and 44 (40%) minimally invasive therapies. On MRI brain imaging, 63.6% of patients had > 20 lesions, though the majority were not active lesions. The lesions were mostly located within the supratentorial (94.5%), infratentorial (68.2%) and 58.2 infratentorial brain (63.8%) as well as in the deep white matter (53.4%). For MRI spine imaging, most of the lesions were in the cervical spine (71.8%) followed by thoracic spine (43.7%) and lumbar spine (6.4%).10.3%). After feature selection, the top 10 highest ranking features were used to train complementary LASSO-regularized logistic regression (LR) and extreme gradient-boosted tree (XGB) models. The top-performing LR models for predicting response to behavioral, medication, and minimally invasive therapies yielded AUC values of 0.74, 0.76, and 0.83, respectively. Conclusions: Using these top-ranked features, LR models achieved AUC values of 0.74-0.83 for prediction of treatment success based on individual factors. Further prospective evaluation is needed to better characterize and validate these identified associations.

MRI Classification Neurological Retrospective Clinical In Silico None Academic Lab
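
The evaluation protocol described in the abstract above (a 70%/30% train/test split scored by AUC) can be sketched in a few lines. This is a generic illustration, not the study's code: the helper names, the fixed seed, and the toy scores are assumptions. AUC is computed here directly from its definition as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half.

```python
import random


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle items reproducibly and split them into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# A cohort of 110 patient indices splits into 77 training and 33 test cases.
train, test = train_test_split(range(110))
print(len(train), len(test))

# Hypothetical predicted probabilities for four test patients.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

With a test set this small, an AUC point estimate carries wide uncertainty, which is consistent with the abstract's call for prospective validation.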

Integrating Radiomics with Deep Learning Enhances Multiple Sclerosis Lesion Delineation

Nadezhda Alsahanova, Pavel Bartenev, Maksim Sharaev, Milos Ljubisavljevic, Taleb Al. Mansoori, Yauhen Statsenko

•preprint•Jun 17 2025

Background: Accurate lesion segmentation is critical for multiple sclerosis (MS) diagnosis, yet current deep learning approaches face robustness challenges. Aim: This study improves MS lesion segmentation by combining data fusion and deep learning techniques. Materials and Methods: We proposed novel radiomic features (concentration rate and Rényi entropy) to characterize different MS lesion types and fused these with raw imaging data. The study integrated radiomic features with imaging data through a ResNeXt-UNet architecture and an attention-augmented U-Net architecture. Our approach was evaluated on scans from 46 patients (1102 slices), comparing performance before and after data fusion. Results: The radiomics-enhanced ResNeXt-UNet demonstrated high segmentation accuracy, achieving significant improvements in precision and sensitivity over the MRI-only baseline and a Dice score of 0.774 ± 0.05; p<0.001 according to Bonferroni-adjusted Wilcoxon signed-rank tests. The radiomics-enhanced attention-augmented U-Net model showed greater model stability, evidenced by reduced performance variability (SDD = 0.18 ± 0.09 vs. 0.21 ± 0.06; p=0.03) and smoother validation curves with radiomics integration. Conclusion: These results validate our hypothesis that fusing radiomics with raw imaging data boosts segmentation performance and stability in state-of-the-art models.

MRI Segmentation Neurological Retrospective Clinical In Silico None Academic Lab

Radiologist-AI workflow can be modified to reduce the risk of medical malpractice claims

Bernstein, M., Sheppard, B., Bruno, M. A., Lay, P. S., Baird, G. L.

•preprint•Jun 16 2025

Background. Artificial Intelligence (AI) is rapidly changing the legal landscape of radiology. Results from a previous experiment suggested that providing AI error rates can reduce perceived radiologist culpability, as judged by mock jury members (4). The current study advances this work by examining whether the radiologist's behavior also impacts perceptions of liability. Methods. Participants (n=282) read about a hypothetical malpractice case where a 50-year-old who visited the Emergency Department with acute neurological symptoms received a brain CT scan to determine if bleeding was present. An AI system was used by the radiologist who interpreted the imaging. The AI system correctly flagged the case as abnormal. Nonetheless, the radiologist concluded there was no evidence of bleeding, and the blood-thinner t-PA was administered. Participants were randomly assigned to either (1) a single-read condition, where the radiologist interpreted the CT once after seeing AI feedback, or (2) a double-read condition, where the radiologist interpreted the CT twice, first without AI and then with AI feedback. Participants were then told the patient suffered irreversible brain damage due to the missed brain bleed, resulting in the patient (plaintiff) suing the radiologist (defendant). Participants indicated whether the radiologist met their duty of care to the patient (yes/no). Results. Hypothetical jurors were more likely to side with the plaintiff in the single-read condition (106/142, 74.7%) than in the double-read condition (74/140, 52.9%), p=0.0002. Conclusion. This suggests that the penalty for disagreeing with correct AI can be mitigated when images are interpreted twice, or at least if a radiologist gives an interpretation before AI is used.

CT Detection Neurological Retrospective Clinical Post Market None Academic Lab Ethics Policy
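
The group comparison reported in the Results above (106/142 vs 74/140 jurors siding with the plaintiff) can be checked with a simple two-proportion test. The abstract does not say which test the authors used, so the pooled two-sided z-test below is an assumption, and the function name is hypothetical. It uses only the counts given in the abstract.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the abstract: single-read vs double-read condition.
z, p = two_proportion_z(106, 142, 74, 140)
print(f"z = {z:.2f}, p = {p:.5f}")
```

Under this assumed test the p-value falls well below 0.001, in line with the reported significance; a different test (for example Fisher's exact test or a continuity-corrected chi-square) would give a slightly different value.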

Default Mode Network Connectivity Predicts Individual Differences in Long-Term Forgetting: Evidence for Storage Degradation, not Retrieval Failure

Xu, Y., Prat, C. S., Sense, F., van Rijn, H., Stocco, A.

•preprint•Jun 16 2025

Despite the importance of memories in everyday life and the progress made in understanding how they are encoded and retrieved, the neural processes by which declarative memories are maintained or forgotten remain elusive. Part of the problem is that it is empirically difficult to measure the rate at which memories fade, even between repeated presentations of the source of the memory. Without such a ground-truth measure, it is hard to identify the corresponding neural correlates. This study addresses this problem by comparing individual patterns of functional connectivity against behavioral differences in forgetting speed derived from computational phenotyping. Specifically, the individual-specific values of the speed of forgetting in long-term memory (LTM) were estimated for 33 participants using a formal model fit to accuracy and response time data from an adaptive paired-associate learning task. Individual speeds of forgetting were then used to examine participant-specific patterns of resting-state fMRI connectivity, using machine learning techniques to identify the most predictive and generalizable features. Our results show that individual speeds of forgetting are associated with resting-state connectivity within the default mode network (DMN) as well as between the DMN and cortical sensory areas. Cross-validation showed that individual speeds of forgetting were predicted with high accuracy (r = .78) from these connectivity patterns alone. These results support the view that DMN activity and the associated sensory regions are actively involved in maintaining memories and preventing their decline, a view that can be seen as evidence for the hypothesis that forgetting is a result of storage degradation, rather than of retrieval failure.

MRI Classification Neurological Retrospective Clinical In Silico None Academic Lab

Improving Prostate Gland Segmenting Using Transformer based Architectures

Shatha Abudalou

•preprint•Jun 16 2025

Inter-reader variability and cross-site domain shift challenge the automatic segmentation of prostate anatomy using T2-weighted MRI images. This study investigates whether transformer models can retain precision amid such heterogeneity. We compare the performance of UNETR and SwinUNETR in prostate gland segmentation against our previous 3D UNet model [1], based on 546 T2-weighted MRI volumes annotated by two independent experts. Three training strategies were analyzed: single cohort dataset, 5-fold cross-validated mixed cohort, and gland size-based dataset. Hyperparameters were tuned with Optuna. The test set, from an independent population of readers, served as the evaluation endpoint (Dice Similarity Coefficient). In single-reader training, SwinUNETR achieved an average Dice score of 0.816 for Reader #1 and 0.860 for Reader #2, while UNETR scored 0.8 and 0.833 for Readers #1 and #2, respectively, compared to the baseline UNet's 0.825 for Reader #1 and 0.851 for Reader #2. SwinUNETR had an average Dice score of 0.8583 for Reader #1 and 0.867 for Reader #2 in cross-validated mixed training. For the gland size-based dataset, SwinUNETR achieved an average Dice score of 0.902 for the Reader #1 subset and 0.894 for Reader #2, using the five-fold mixed training strategy (Reader #1, n=53; Reader #2, n=87) at larger gland size-based subsets, where UNETR performed poorly. Our findings demonstrate that global and shifted-window self-attention effectively reduces label noise and class imbalance sensitivity, resulting in improvements in the Dice score over CNNs by up to five points while maintaining computational efficiency. This contributes to the high robustness of SwinUNETR for clinical deployment.

MRI Segmentation Abdominal Retrospective Clinical In Silico None Academic Lab

Predicting overall survival of NSCLC patients with clinical, radiomics and deep learning features

Kanakarajan, H., Zhou, J., Baene, W. D., Sitskoorn, M.

•preprint•Jun 16 2025

Background and purpose: Accurate estimation of Overall Survival (OS) in Non-Small Cell Lung Cancer (NSCLC) patients provides critical insights for treatment planning. While previous studies showed that radiomics and Deep Learning (DL) features increased prediction accuracy, this study aimed to examine whether a model that combines the radiomics and DL features with the clinical and dosimetric features outperformed other models. Materials and methods: We collected pre-treatment lung CT scans and clinical data for 225 NSCLC patients from the Maastro Clinic: 180 for training and 45 for testing. Radiomics features were extracted using the Python radiomics feature extractor, and DL features were obtained using a 3D ResNet model. An ensemble model comprising XGB and NN classifiers was developed using: (1) clinical features only; (2) clinical and radiomics features; (3) clinical and DL features; and (4) clinical, radiomics, and DL features. The performance metrics were evaluated for the test and K-fold cross-validation data sets. Results: The prediction model utilizing only clinical variables provided an Area Under the Receiver Operating Characteristic Curve (AUC) of 0.64 and a test accuracy of 77.55%. The best performance came from combining clinical, radiomics, and DL features (AUC: 0.84, accuracy: 85.71%). The prediction improvement of this model was statistically significant compared to models trained with clinical features alone or with a combination of clinical and radiomics features. Conclusion: Integrating radiomics and DL features with clinical characteristics improved the prediction of OS after radiotherapy for NSCLC patients. The increased accuracy of our integrated model enables personalized, risk-based treatment planning, guiding clinicians toward more effective interventions, improved patient outcomes and enhanced quality of life.

CT Classification Chest Retrospective Clinical In Silico None Academic Lab

Deep-Learning Based Contrast Boosting Improves Lesion Visualization and Image Quality: A Multi-Center Multi-Reader Study on Clinical Performance with Standard Contrast Enhanced MRI of Brain Tumors

Pasumarthi, S., Campbell Arnold, T., Colombo, S., Rudie, J. D., Andre, J. B., Elor, R., Gulaka, P., Shankaranarayanan, A., Erb, G., Zaharchuk, G.

•preprint•Jun 13 2025

Background: Gadolinium-based Contrast Agents (GBCAs) are used in brain MRI exams to improve the visualization of pathology and the delineation of lesions. Higher doses of GBCAs can improve lesion sensitivity but involve substantial deviation from standard-of-care procedures and may have safety implications, particularly in the light of recent findings on gadolinium retention and deposition. Purpose: To evaluate the clinical performance of an FDA-cleared deep-learning (DL) based contrast boosting algorithm in routine clinical brain MRI exams. Methods: A multi-center retrospective database of contrast-enhanced brain MRI images (obtained from April 2017 to December 2023) was used to evaluate a DL-based contrast boosting algorithm. Pre-contrast and standard post-contrast (SC) images were processed with the algorithm to obtain contrast boosted (CB) images. Quantitative performance of CB images was compared with that of SC images using contrast-to-noise ratio (CNR), lesion-to-brain ratio (LBR) and contrast enhancement percentage (CEP). Three board-certified radiologists reviewed CB and SC images side-by-side for qualitative evaluation and rated them on a 4-point Likert scale for lesion contrast enhancement, border delineation, internal morphology, overall image quality, presence of artefacts, and changes in vessel conspicuity. The presence, cause, and severity of any false lesions were recorded. CB results were compared to SC using the Wilcoxon signed-rank test for statistical significance. Results: Brain MRI images from 110 patients (47 ± 22 years; 52 Females, 47 Males, 11 N/A) were evaluated. CB images had superior quantitative performance to SC images in terms of CNR (+634%), LBR (+70%) and CEP (+150%). In the qualitative assessment CB images showed better lesion visualization (3.73 vs 3.16) and had better image quality (3.55 vs 3.07). Readers were able to rule out all false lesions on CB by using SC for comparison. Conclusions: Deep learning based contrast boosting improves lesion visualization and image quality without increasing contrast dosage. Key Results: (1) In a retrospective study of 110 patients, deep-learning based contrast boosted (CB) images showed better lesion visualization than standard post-contrast (SC) brain MRI images (3.73 vs 3.16; mean reader scores [4-point Likert scale]). (2) CB images had better overall image quality than SC images (3.55 vs 3.07). (3) Contrast-to-noise ratio, Lesion-to-brain Ratio and Contrast Enhancement Percentage for CB images were significantly higher than SC images (+729%, +88% and +165%; p < 0.001). Summary Statement: Deep-learning based contrast boosting achieves better lesion visualization and overall image quality and provides more contrast information, without increasing the contrast dosage in contrast-enhanced brain MR protocols.

MRI Image Synthesis Neurological Retrospective Clinical FDA Cleared FDA 510(k) Startup Benchmark SOTA
