MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss

Can Zhao, Pengfei Guo, Dong Yang, Yucheng Tang, Yufan He, Benjamin Simon, Mason Belue, Stephanie Harmon, Baris Turkbey, Daguang Xu

arXiv preprint · Aug 7 2025
Medical image synthesis is an important topic for both clinical and research applications. Recently, diffusion models have become a leading approach in this area. Despite their strengths, many existing methods struggle with (1) limited generalizability, working only for specific body regions or voxel spacings, (2) slow inference, a common issue for diffusion models, and (3) weak alignment with input conditions, a critical issue for medical imaging. MAISI, a previously proposed framework, addresses the generalizability issues but still suffers from slow inference and limited condition consistency. In this work, we present MAISI-v2, the first accelerated 3D medical image synthesis framework that integrates rectified flow to enable fast, high-quality generation. To further improve condition fidelity, we introduce a novel region-specific contrastive loss that increases sensitivity to the region of interest. Our experiments show that MAISI-v2 can achieve SOTA image quality with a $33 \times$ acceleration over the latent diffusion model. We also conducted a downstream segmentation experiment to show that the synthetic images can be used for data augmentation. We release our code, training details, model weights, and a GUI demo to facilitate reproducibility and promote further development within the community.
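
Rectified flow is what buys the speed here: it replaces the stochastic reverse diffusion process with a deterministic velocity-field ODE whose near-straight trajectories can be integrated in a handful of steps. A minimal Euler-sampler sketch, assuming a trained velocity network `v_theta` (a placeholder, not the released MAISI-v2 code):

```python
import torch

@torch.no_grad()
def rectified_flow_sample(v_theta, shape, num_steps=10, device="cpu"):
    """Euler integration of the rectified-flow ODE dx/dt = v_theta(x, t).

    Uses the convention x_t = (1 - t) * noise + t * data, so integrating
    from t=0 (pure noise) to t=1 yields a sample. Few steps suffice because
    the learned trajectories are near-straight.
    """
    x = torch.randn(shape, device=device)           # x_0 ~ N(0, I)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + v_theta(x, t) * dt                  # one Euler step
    return x                                        # approximate sample at t=1
```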

Predicting language outcome after stroke using machine learning: in search of the big data benefit.

Saranti M, Neville D, White A, Rotshtein P, Hope TMH, Price CJ, Bowman H

PubMed · Aug 6 2025
Accurate prediction of post-stroke language outcomes using machine learning offers the potential to enhance clinical treatment and rehabilitation for aphasic patients. This study of 758 English-speaking stroke patients from the PLORAS project explores the impact of sample size on the performance of logistic regression and a deep learning (ResNet-18) model in predicting language outcomes from neuroimaging and impairment-relevant tabular data. We assessed the performance of both models on two key language tasks from the Comprehensive Aphasia Test, Spoken Picture Description and Naming, using a learning-curve approach. Contrary to expectations, the simpler logistic regression model performed comparably to or better than the deep learning model (with overlapping confidence intervals), with both models showing an accuracy plateau around 80% for sample sizes larger than 300 patients. Principal Component Analysis revealed that the dimensionality of the neuroimaging data could be reduced to as few as 20 (or even 2) dominant components without significant loss in accuracy, suggesting that classification may be driven by simple patterns such as lesion size. The study highlights both the potential limitations of current dataset sizes in achieving further accuracy gains and the need for larger datasets to capture more complex patterns, as some of our results indicate that we may not have reached an absolute classification performance ceiling. Overall, these findings provide insights into the practical use of machine learning for predicting aphasia outcomes and the potential benefits of much larger datasets in enhancing model performance.
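
The learning-curve approach is straightforward to reproduce with scikit-learn. A sketch under stated assumptions (synthetic stand-in data, since PLORAS images are not publicly bundled) combining the paper's two key ingredients, PCA reduction to ~20 components and logistic regression, over nested training subsets:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(758, 5000))     # stand-in for flattened lesion images
y = rng.integers(0, 2, size=758)     # stand-in for task pass/fail outcome

X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Learning curve: fit on nested subsets of increasing size, evaluate on a
# fixed held-out set, and watch where test accuracy plateaus.
for n in (50, 100, 300, 600):
    model = make_pipeline(PCA(n_components=20),
                          LogisticRegression(max_iter=1000))
    model.fit(X_train_full[:n], y_train_full[:n])
    print(n, model.score(X_test, y_test))
```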

Machine Learning-Based Reconstruction of 2D MRI for Quantitative Morphometry in Epilepsy

Ratcliffe, C., Taylor, P. N., de Bezenac, C., Das, K., Biswas, S., Marson, A., Keller, S. S.

medRxiv preprint · Aug 6 2025
Introduction: Structural neuroimaging analyses require research-quality images acquired with costly MRI acquisitions. Isotropic (3D-T1) images are desirable for quantitative analyses; however, a routine compromise in the clinical setting is to acquire anisotropic (2D-T1) analogues for qualitative visual inspection. Machine learning-based (ML) software has shown promise in addressing some of the limitations of 2D-T1 scans in research applications, yet its efficacy in quantitative research is generally poorly understood. Pathology-related abnormalities of the subcortical structures that are overlooked on visual inspection have previously been identified in idiopathic generalised epilepsy (IGE) through quantitative morphometric analyses. As such, IGE biomarkers present a suitable model in which to evaluate the applicability of image preprocessing methods. This study therefore explores subcortical structural biomarkers of IGE, first in our silver-standard 3D-T1 scans, then in 2D-T1 scans that were either untransformed, resampled using a classical interpolation approach, or synthesised with a resolution- and contrast-agnostic ML model (the latter of which is compared to a separate model).

Methods: 2D-T1 and 3D-T1 MRI scans were acquired during the same scanning session for 33 individuals with drug-responsive IGE (age mean 32.16 ± SD 14.20, male n = 14) and 42 individuals with drug-resistant IGE (31.76 ± 11.12, 17), all diagnosed at the Walton Centre NHS Foundation Trust Liverpool, alongside 39 age- and sex-matched healthy controls (32.32 ± 8.65, 16). The untransformed 2D-T1 scans were resampled into isotropic images using NiBabel (res-T1) and preprocessed into synthetic isotropic images using SynthSR (syn-T1). For the 3D-T1, 2D-T1, res-T1, and syn-T1 images, the recon-all command from FreeSurfer 8.0.0 was used to create parcellations of 174 anatomical regions (equivalent to the 174 regional parcellations provided as part of the DL+DiReCT pipeline), defined by the aseg and Destrieux atlases, and FSL run_first_all was used to segment subcortical surface shapes. The new ML FreeSurfer pipeline, recon-all-clinical, was also tested on the 2D-T1, 3D-T1, and res-T1 images. As a model comparison for SynthSR, the DL+DiReCT pipeline was used to provide segmentations of the 2D-T1 and res-T1 images, including estimates of regional volume and thickness. Spatial overlap and intraclass correlations between the morphometrics of the eight resulting parcellations were first determined, then subcortical surface shape abnormalities associated with IGE were identified by comparing the FSL run_first_all outputs of patients with controls.

Results: When standardised to the metrics derived from the 3D-T1 scans, cortical volume and thickness estimates trended lower for the 2D-T1, res-T1, syn-T1, and DL+DiReCT outputs, whereas subcortical volume estimates were more coherent. Dice coefficients revealed an acceptable spatial similarity between the cortices of the 3D-T1 scans and the other images overall, and similarity was higher in the subcortical structures. Intraclass correlation coefficients were consistently lowest when metrics were computed for model-derived inputs, and estimates of thickness were less similar to the ground truth than those of volume. For the people with epilepsy, the 3D-T1 scans showed significant surface deflations across various subcortical structures when compared to healthy controls. Analysis of the 2D-T1 scans enabled the reliable detection of a subset of subcortical abnormalities, whereas analyses of the res-T1 and syn-T1 images were more prone to false-positive results.

Conclusions: Resampling and ML image-synthesis methods do not currently attenuate partial volume effects resulting from low through-plane resolution in anisotropic MRI scans; quantitative analyses using 2D-T1 scans should therefore be interpreted with caution, and researchers should consider the potential implications of preprocessing. The recon-all-clinical pipeline is promising, but requires further evaluation, especially when considered as an alternative to the classical pipeline.

Key Points:
- Surface deviations indicative of regional atrophy and hypertrophy were identified in people with idiopathic generalised epilepsy.
- Partial volume effects are likely to attenuate subtle morphometric abnormalities, increasing the likelihood of erroneous inference.
- Priors in synthetic image creation models may render them insensitive to subtle biomarkers.
- Resampling and machine-learning-based image synthesis are not currently replacements for research-quality acquisitions in quantitative MRI research.
- The results of studies using synthetic images should be interpreted in a separate context to those using untransformed data.
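
The res-T1 arm of the comparison (classical interpolation) can be illustrated with NiBabel, which the authors used for resampling. A minimal sketch with a hypothetical filename; note that interpolation regularises the voxel grid but, as the study concludes, cannot restore the through-plane information lost at acquisition:

```python
import nibabel as nib
from nibabel.processing import resample_to_output

# Load an anisotropic clinical T1 (e.g., 0.9 x 0.9 x 5 mm voxels).
img_2d_t1 = nib.load("sub-01_T1w_2d.nii.gz")     # hypothetical filename

# Resample onto a 1 mm isotropic grid with trilinear interpolation (order=1).
# This regularises the grid but cannot recover through-plane detail, which is
# why partial volume effects persist in the res-T1 images.
res_t1 = resample_to_output(img_2d_t1, voxel_sizes=(1.0, 1.0, 1.0), order=1)
nib.save(res_t1, "sub-01_T1w_res.nii.gz")
```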

Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Namdar K, Wagner MW, Ertl-Wagner BB, Khalvati F

PubMed · Aug 4 2025
As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol, to investigate the effects of radiomics feature extraction settings on the reproducibility of results. We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and a non-small cell lung cancer (NSCLC) survival analysis dataset from The Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol can be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets: the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. We used Random Forest classifiers, and for each radiomics dataset we repeated the training-validation-test (60%/20%/20%) experiment with different data splits and model random states 100 times (28,800 test results in total) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC). Unlike binWidth and image normalization, the tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, they were irreproducible. Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., the tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performances that are irreproducible.
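
One cell of this protocol, a fixed sequence, binWidth, normalization, and subregion, reduces to PyRadiomics feature extraction followed by repeated Random Forest runs. A simplified sketch under stated assumptions: a plain 80/20 split stands in for the paper's 60%/20%/20% protocol, and the feature filtering is illustrative:

```python
import numpy as np
from radiomics import featureextractor
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# One configuration "cell" of the protocol: a fixed MRI sequence, binWidth,
# normalization method, and tumor subregion.
extractor = featureextractor.RadiomicsFeatureExtractor(binWidth=25)

def case_features(image_path, mask_path):
    """Extract PyRadiomics features for one case, keeping numeric values only."""
    result = extractor.execute(image_path, mask_path)
    return [float(v) for k, v in result.items() if k.startswith("original_")]

def repeated_auroc(X, y, n_repeats=100):
    """Repeat the split with different random states, as the protocol
    prescribes, and report the spread of test AUROCs."""
    aurocs = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed, stratify=y)
        clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
        aurocs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    return np.mean(aurocs), np.std(aurocs)
```

Reporting the mean and spread over many splits, rather than a single lucky split, is exactly what exposes the irreproducible perfect-AUROC runs the paper warns about.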

Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application

Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz

arXiv preprint · Aug 4 2025
Trustworthy interpretation of deep learning models is critical for neuroimaging applications, yet commonly used Explainable AI (XAI) methods lack rigorous validation, risking misinterpretation. We performed the first large-scale, systematic comparison of XAI methods on ~45,000 structural brain MRIs using a novel XAI validation framework. This framework establishes verifiable ground truth by constructing prediction tasks with known signal sources - from localized anatomical features to subject-specific clinical lesions - without artificially altering input images. Our analysis reveals systematic failures in two of the most widely used methods: GradCAM consistently failed to localize predictive features, while Layer-wise Relevance Propagation generated extensive, artifactual explanations that suggest incompatibility with neuroimaging data characteristics. Our results indicate that these failures stem from a domain mismatch, where methods with design principles tailored to natural images require substantial adaptation for neuroimaging data. In contrast, the simpler, gradient-based method SmoothGrad, which makes fewer assumptions about data structure, proved consistently accurate, suggesting its conceptual simplicity makes it more robust to this domain shift. These findings highlight the need for domain-specific adaptation and validation of XAI methods, suggest that interpretations from prior neuroimaging studies using standard XAI methodology warrant re-evaluation, and provide urgent guidance for practical application of XAI in neuroimaging.
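
SmoothGrad, the method that proved robust here, is also the simplest to state: average input gradients over noisy copies of the input. A minimal PyTorch sketch, assuming a classifier `model` that returns class logits:

```python
import torch

def smoothgrad(model, x, target_class, n_samples=50, noise_std=0.1):
    """SmoothGrad saliency: average input gradients over noisy copies of x.

    Makes few assumptions about data structure, which the study links to its
    robustness on brain MRI relative to GradCAM and LRP.
    """
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x.detach() + noise_std * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[:, target_class].sum()
        score.backward()                 # populate noisy.grad
        grads += noisy.grad
    return grads / n_samples             # smoothed saliency map
```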

Medical Image De-Identification Resources: Synthetic DICOM Data and Tools for Validation

Michael W. Rutherford, Tracy Nolan, Linmin Pei, Ulrike Wagner, Qinyan Pan, Phillip Farmer, Kirk Smith, Benjamin Kopchick, Laura Opsahl-Ong, Granger Sutton, David Clunie, Keyvan Farahani, Fred Prior

arXiv preprint · Aug 3 2025
Medical imaging research increasingly depends on large-scale data sharing to promote reproducibility and train Artificial Intelligence (AI) models. Ensuring patient privacy remains a significant challenge for open-access data sharing. Digital Imaging and Communications in Medicine (DICOM), the global standard data format for medical imaging, encodes both essential clinical metadata and extensive protected health information (PHI) and personally identifiable information (PII). Effective de-identification must remove identifiers, preserve scientific utility, and maintain DICOM validity. Tools exist to perform de-identification, but few assess its effectiveness, and most rely on subjective reviews, limiting reproducibility and regulatory confidence. To address this gap, we developed an openly accessible DICOM dataset infused with synthetic PHI/PII and an evaluation framework for benchmarking image de-identification workflows. The Medical Image de-identification (MIDI) dataset was built using publicly available de-identified data from The Cancer Imaging Archive (TCIA). It includes 538 subjects (216 for validation, 322 for testing), 605 studies, 708 series, and 53,581 DICOM image instances. These span multiple vendors, imaging modalities, and cancer types. Synthetic PHI and PII were embedded into structured data elements, plain text data elements, and pixel data to simulate real-world identity leaks encountered by TCIA curation teams. Accompanying evaluation tools include a Python script, answer keys (known truth), and mapping files that enable automated comparison of curated data against expected transformations. The framework is aligned with the HIPAA Privacy Rule "Safe Harbor" method, DICOM PS3.15 Confidentiality Profiles, and TCIA best practices. It supports objective, standards-driven evaluation of de-identification workflows, promoting safer and more consistent medical image sharing.
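
The released evaluation tools include a Python script and answer keys; as an illustration of the general idea (not the MIDI scripts themselves), a hedged sketch of comparing a curated DICOM instance against expected tag transformations with pydicom:

```python
import pydicom

# Hypothetical answer key: DICOM keyword -> expected value after de-identification.
answer_key = {"PatientName": "", "PatientID": "MIDI-0001", "PatientBirthDate": ""}

def check_instance(path, key):
    """Compare a curated DICOM instance against expected transformations."""
    ds = pydicom.dcmread(path)
    failures = []
    for keyword, expected in key.items():
        actual = str(ds.get(keyword, ""))
        if actual != expected:
            failures.append((keyword, expected, actual))
    return failures

# Hypothetical curated file; a clean run prints nothing.
for kw, want, got in check_instance("curated/instance0001.dcm", answer_key):
    print(f"{kw}: expected {want!r}, found {got!r}")
```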

Medical Image De-Identification Benchmark Challenge

Linmin Pei, Granger Sutton, Michael Rutherford, Ulrike Wagner, Tracy Nolan, Kirk Smith, Phillip Farmer, Peter Gu, Ambar Rana, Kailing Chen, Thomas Ferleman, Brian Park, Ye Wu, Jordan Kojouharov, Gargi Singh, Jon Lemon, Tyler Willis, Milos Vukadinovic, Grant Duffy, Bryan He, David Ouyang, Marco Pereanez, Daniel Samber, Derek A. Smith, Christopher Cannistraci, Zahi Fayad, David S. Mendelson, Michele Bufano, Elmar Kotter, Hamideh Haghiri, Rajesh Baidya, Stefan Dvoretskii, Klaus H. Maier-Hein, Marco Nolden, Christopher Ablett, Silvia Siggillino, Sandeep Kaushik, Hongzhu Jiang, Sihan Xie, Zhiyu Wan, Alex Michie, Simon J Doran, Angeline Aurelia Waly, Felix A. Nathaniel Liang, Humam Arshad Mustagfirin, Michelle Grace Felicia, Kuo Po Chih, Rahul Krish, Ghulam Rasool, Nidhal Bouaynaya, Nikolas Koutsoubis, Kyle Naddeo, Kartik Pandit, Tony O'Sullivan, Raj Krish, Qinyan Pan, Scott Gustafson, Benjamin Kopchick, Laura Opsahl-Ong, Andrea Olvera-Morales, Jonathan Pinney, Kathryn Johnson, Theresa Do, Juergen Klenk, Maria Diaz, Arti Singh, Rong Chai, David A. Clunie, Fred Prior, Keyvan Farahani

arXiv preprint · Jul 31 2025
The de-identification (deID) of protected health information (PHI) and personally identifiable information (PII) is a fundamental requirement for sharing medical images, particularly through public repositories, to ensure compliance with patient privacy laws. In addition, preservation of non-PHI metadata to inform and enable downstream development of imaging artificial intelligence (AI) is an important consideration in biomedical research. The goal of the Medical Image De-Identification Benchmark (MIDI-B) challenge was to provide a standardized platform for benchmarking DICOM image deID tools based on a set of rules conformant to the HIPAA Safe Harbor regulation, the DICOM Attribute Confidentiality Profiles, and best practices in the preservation of research-critical metadata, as defined by The Cancer Imaging Archive (TCIA). The challenge employed a large, diverse, multi-center, multi-modality set of real de-identified radiology images with synthetic PHI/PII inserted. The MIDI-B Challenge consisted of three phases: training, validation, and test. Eighty individuals registered for the challenge. In the training phase, we encouraged participants to tune their algorithms using their in-house or public data. The validation and test phases utilized DICOM images containing synthetic identifiers (from 216 and 322 subjects, respectively). Ten teams successfully completed the test phase of the challenge. To measure the success of a rule-based approach to image deID, scores were computed as the percentage of correct actions out of the total number of required actions; scores ranged from 97.91% to 99.93%. Participants employed a variety of open-source and proprietary tools with customized configurations, large language models, and optical character recognition (OCR). In this paper we provide a comprehensive report on the MIDI-B Challenge's design, implementation, results, and lessons learned.
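
The scoring rule is simple enough to state in a few lines. A sketch with hypothetical action labels, where the score is the percentage of required de-identification actions performed correctly:

```python
def midi_b_score(required_actions, performed_actions):
    """Score = correct actions / required actions, as a percentage.

    `required_actions` maps (instance, tag) -> expected action, e.g. "remove"
    or "replace" (labels here are illustrative); `performed_actions` records
    what the submission actually did.
    """
    correct = sum(1 for key, expected in required_actions.items()
                  if performed_actions.get(key) == expected)
    return 100.0 * correct / len(required_actions)

# Example: 3 of 4 required actions done correctly -> 75.0
req = {("i1", "PatientName"): "remove", ("i1", "PatientID"): "replace",
       ("i2", "PatientName"): "remove", ("i2", "StudyDate"): "shift"}
done = {("i1", "PatientName"): "remove", ("i1", "PatientID"): "replace",
        ("i2", "PatientName"): "remove", ("i2", "StudyDate"): "remove"}
print(midi_b_score(req, done))  # 75.0
```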

Label-free estimation of clinically relevant performance metrics under distribution shifts

Tim Flühmann, Alceu Bissoto, Trung-Dung Hoang, Lisa M. Koch

arXiv preprint · Jul 30 2025
Performance monitoring is essential for the safe clinical deployment of image classification models. However, because ground-truth labels are typically unavailable in the target dataset, direct assessment of real-world model performance is infeasible. State-of-the-art performance estimation methods address this by leveraging confidence scores to estimate target accuracy. Although this is a promising direction, established methods mainly estimate a model's overall accuracy and are rarely evaluated in clinical domains, where strong class imbalances and dataset shifts are common. Our contributions are twofold: first, we introduce generalisations of existing performance prediction methods that directly estimate the full confusion matrix; second, we benchmark their performance on chest X-ray data under real-world distribution shifts as well as simulated covariate and prevalence shifts. The proposed confusion-matrix estimation methods reliably predicted clinically relevant counting metrics on medical images under distribution shifts. However, our simulated shift scenarios exposed important failure modes of current performance estimation techniques, calling for a better understanding of real-world deployment contexts when implementing these performance monitoring techniques for post-market surveillance of medical AI models.
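
One simple way to generalise confidence-based accuracy estimation to the full confusion matrix is to accumulate predicted probability mass per predicted class, which is exact when the probabilities are calibrated. This is a sketch of the idea, not necessarily the paper's estimators:

```python
import numpy as np

def estimate_confusion_matrix(probs):
    """Label-free confusion-matrix estimate from softmax outputs.

    probs: (n_samples, n_classes) predicted probabilities on unlabeled target
    data. Entry (i, j) accumulates the probability mass that the true class
    is i among samples predicted as class j; under calibration this matches
    the expected confusion matrix.
    """
    n_classes = probs.shape[1]
    predicted = probs.argmax(axis=1)
    cm = np.zeros((n_classes, n_classes))
    for j in range(n_classes):
        cm[:, j] = probs[predicted == j].sum(axis=0)  # expected true-class counts
    return cm

# Counting metrics follow directly, e.g. estimated sensitivity of class 1:
# cm[1, 1] / cm[1, :].sum()
```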

Deep sensorless tracking of ultrasound probe orientation during freehand transperineal biopsy with spatial context for symmetry disambiguation.

Soormally C, Beitone C, Troccaz J, Voros S

PubMed · Jul 29 2025
Diagnosis of prostate cancer requires histopathology of tissue samples. Following an MRI to identify suspicious areas, a biopsy is performed under ultrasound (US) guidance. In existing assistance systems, 3D US information is generally available (taken before the biopsy session and/or in between samplings), but without registration between 2D images and 3D volumes, the urologist must rely on cognitive navigation. This work introduces a deep learning model to track the orientation of real-time US slices relative to a reference 3D US volume using only image and volume data. The dataset comprises 515 3D US volumes collected from 51 patients during routine transperineal biopsy. To generate 2D image streams, volumes were resampled to simulate rotational movements with three degrees of freedom around the rectal entrance. The proposed model comprises two ResNet-based sub-modules that address the symmetry ambiguity arising from complex out-of-plane movement of the probe. The first sub-module predicts the unsigned relative orientation between consecutive slices, while the second leverages a custom similarity model and a spatial-context volume to determine the sign of this relative orientation. From the sub-modules' predictions, slice orientations along the navigated trajectory can then be derived in real time. Results demonstrate that the registration error remains below 2.5 mm in 92% of cases over a 5-second trajectory, and in 80% over a 25-second trajectory. These findings show that accurate, sensorless 2D/3D US registration given a spatial context is achievable with limited drift over extended navigation, highlighting the potential of AI-driven biopsy assistance to increase the accuracy of freehand biopsy.
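
The inference loop implied by the two sub-modules can be sketched compactly, collapsed here to a single rotational degree of freedom for brevity (the paper tracks three). Both network callables below are placeholders for the paper's ResNet-based modules:

```python
def track_orientation(slices, magnitude_net, sign_net, context_volume, theta0=0.0):
    """Integrate per-frame relative rotations into an absolute trajectory.

    `magnitude_net(prev, curr)` -> unsigned relative angle between consecutive
    slices; `sign_net(curr, context_volume)` -> +1 or -1, resolving the
    symmetry ambiguity of out-of-plane probe motion via spatial context.
    """
    theta = theta0
    trajectory = [theta]
    for prev, curr in zip(slices, slices[1:]):
        delta = magnitude_net(prev, curr)        # |relative orientation|
        sign = sign_net(curr, context_volume)    # disambiguate the sign
        theta += sign * delta                    # accumulate along trajectory
        trajectory.append(theta)
    return trajectory
```

Because orientations are accumulated incrementally, any per-step error compounds, which is why the paper reports drift over 5-second versus 25-second trajectories.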

AI generated annotations for Breast, Brain, Liver, Lungs, and Prostate cancer collections in the National Cancer Institute Imaging Data Commons.

Murugesan GK, McCrumb D, Soni R, Kumar J, Nuernberg L, Pei L, Wagner U, Granger S, Fedorov AY, Moore S, Van Oss J

PubMed · Jul 29 2025
The Artificial Intelligence in Medical Imaging (AIMI) initiative aims to enhance the National Cancer Institute's (NCI) Image Data Commons (IDC) by releasing fully reproducible nnU-Net models, along with AI-assisted segmentation for cancer radiology images. In this extension of our earlier work, we created high-quality, AI-annotated imaging datasets for 11 IDC collections, spanning computed tomography (CT) and magnetic resonance imaging (MRI) of the lungs, breast, brain, kidneys, prostate, and liver. Each nnU-Net model was trained on open-source datasets, and a portion of the AI-generated annotations was reviewed and corrected by board-certified radiologists. Both the AI and radiologist annotations were encoded in compliance with the Digital Imaging and Communications in Medicine (DICOM) standard, ensuring seamless integration into the IDC collections. By making these models, images, and annotations publicly accessible, we aim to facilitate further research and development in cancer imaging.
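
Because the annotations are DICOM-encoded, they can be inspected with standard tooling. A minimal pydicom sketch, with a hypothetical filename, for reading one AI-generated DICOM Segmentation (SEG) object:

```python
import pydicom

# Hypothetical path to one AI-generated DICOM Segmentation (SEG) object.
seg = pydicom.dcmread("idc_ai_annotation_seg.dcm")

# A DICOM SEG stores one binary frame per segment per slice; SegmentSequence
# carries the human-readable labels the annotations were encoded with.
frames = seg.pixel_array                 # shape: (n_frames, rows, cols)
for segment in seg.SegmentSequence:
    print(segment.SegmentNumber, segment.SegmentLabel)
```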