Latest Papers on Radiology AI. Tags: Reproducibility

Artificial intelligence vs human expertise: A comparison of plantar fascia thickness measurements through MRI imaging.

Alyanak B, Çakar İ, Dede BT, Yıldızgören MT, Bağcıer F

•papers•Jun 3 2025

This study aims to evaluate the reliability of plantar fascia thickness measurements performed by ChatGPT-4 using magnetic resonance imaging (MRI) compared to those obtained by an experienced clinician. In this retrospective, single-center study, foot MRI images from the hospital archive were analysed. Plantar fascia thickness was measured under both blinded and non-blinded conditions by an experienced clinician and ChatGPT-4 at two separate time points. Measurement reliability was assessed using the intraclass correlation coefficient (ICC), mean absolute error (MAE), and mean relative error (MRE). A total of 41 participants (32 females, 9 males) were included. The average plantar fascia thickness measured by the clinician was 4.20 ± 0.80 mm and 4.25 ± 0.92 mm under blinded and non-blinded conditions, respectively, while ChatGPT-4's measurements were 6.47 ± 1.30 mm and 6.46 ± 1.31 mm, respectively. Human evaluators demonstrated excellent agreement (ICC = 0.983-0.989), whereas ChatGPT-4 exhibited low reliability (ICC = 0.391-0.432). In thin plantar fascia cases, ChatGPT-4's error rate was higher, with MAE = 2.70 mm, MRE = 77.17 % under blinded conditions, and MAE = 2.91 mm, MRE = 87.02 % under non-blinded conditions. ChatGPT-4 demonstrated lower reliability in plantar fascia thickness measurements compared to an experienced clinician, with increased error rates in thin structures. These findings highlight the limitations of AI-based models in medical image analysis and emphasize the need for further refinement before clinical implementation.
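
The two error metrics named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes MRE is the per-case absolute error divided by the clinician's measurement, expressed as a percentage (the abstract does not give the exact formula), and the thickness values are made up for the example. ICC is omitted because it comes in several forms and the abstract does not say which was used.

```python
import numpy as np

def mean_absolute_error(reference, measured):
    """Mean absolute difference, in the same unit as the inputs (e.g. mm)."""
    reference = np.asarray(reference, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(measured - reference)))

def mean_relative_error(reference, measured):
    """Mean of |measured - reference| / reference, as a percentage.

    Assumed definition: the clinician's value is treated as the reference.
    """
    reference = np.asarray(reference, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(100.0 * np.mean(np.abs(measured - reference) / reference))

# Hypothetical thickness values in mm, not data from the study.
clinician_mm = [3.0, 4.0, 5.0]
model_mm = [4.5, 6.0, 6.0]

mae = mean_absolute_error(clinician_mm, model_mm)
mre = mean_relative_error(clinician_mm, model_mm)
print(f"MAE = {mae:.2f} mm, MRE = {mre:.2f} %")
```

Because the relative error divides by the reference thickness, the same absolute error weighs more heavily on a thin fascia, which is consistent with the higher MRE the abstract reports for thin cases.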

MRI Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab Reproducibility

Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning

Negin Baghbanzadeh, Sajad Ashkezari, Elham Dolatabadi, Arash Afkanpour

•preprint•Jun 3 2025

Compound figures, which are multi-panel composites containing diverse subfigures, are ubiquitous in biomedical literature, yet large-scale subfigure extraction remains largely unaddressed. Prior work on subfigure extraction has been limited in both dataset size and generalizability, leaving a critical open question: How does high-fidelity image-text alignment via large-scale subfigure extraction impact representation learning in vision-language models? We address this gap by introducing a scalable subfigure extraction pipeline based on transformer-based object detection, trained on a synthetic corpus of 500,000 compound figures, and achieving state-of-the-art performance on both ImageCLEF 2016 and synthetic benchmarks. Using this pipeline, we release OPEN-PMC-18M, a large-scale high quality biomedical vision-language dataset comprising 18 million clinically relevant subfigure-caption pairs spanning radiology, microscopy, and visible light photography. We train and evaluate vision-language models on our curated datasets and show improved performance across retrieval, zero-shot classification, and robustness benchmarks, outperforming existing baselines. We release our dataset, models, and code to support reproducible benchmarks and further study into biomedical vision-language modeling and representation learning.

Mixed Modality Image Synthesis Dataset Release In Silico Academic Lab Open Dataset Open Code Benchmark SOTA Reproducibility

SASWISE-UE: Segmentation and synthesis with interpretable scalable ensembles for uncertainty estimation.

Chen W, McMillan AB

•papers•Jun 2 2025

This paper introduces an efficient sub-model ensemble framework aimed at enhancing the interpretability of medical deep learning models, thus increasing their clinical applicability. By generating uncertainty maps, this framework enables end-users to evaluate the reliability of model outputs. We developed a strategy to generate diverse models from a single well-trained checkpoint, facilitating the training of a model family. This involves producing multiple outputs from a single input, fusing them into a final output, and estimating uncertainty based on output disagreements. Implemented using U-Net and UNETR models for segmentation and synthesis tasks, this approach was tested on CT body segmentation and MR-CT synthesis datasets. It achieved a mean Dice coefficient of 0.814 in segmentation and a Mean Absolute Error of 88.17 HU in synthesis, improved from 89.43 HU by pruning. Additionally, the framework was evaluated under image corruption and data undersampling, maintaining correlation between uncertainty and error, which highlights its robustness. These results suggest that the proposed approach not only maintains the performance of well-trained models but also enhances interpretability through effective uncertainty estimation, applicable to both convolutional and transformer models in a range of imaging tasks.
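
The fuse-then-measure-disagreement idea in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the SASWISE-UE implementation: the fusion rule (voxel-wise mean) and the disagreement measure (voxel-wise standard deviation) are choices made here for the example, and the function name and toy outputs are hypothetical.

```python
import numpy as np

def fuse_and_estimate_uncertainty(outputs):
    """Fuse several model outputs for one input and map their disagreement.

    outputs: list of arrays with identical shape, one per sub-model.
    Returns (fused, uncertainty), both with the shape of a single output.
    """
    stack = np.stack([np.asarray(o, dtype=float) for o in outputs], axis=0)
    fused = stack.mean(axis=0)        # assumed fusion rule: voxel-wise mean
    uncertainty = stack.std(axis=0)   # assumed disagreement: voxel-wise std
    return fused, uncertainty

# Three hypothetical sub-model outputs for a 1x2 "image" (e.g. synthetic HU values).
outputs = [
    np.array([[0.0, 10.0]]),
    np.array([[0.0, 20.0]]),
    np.array([[0.0, 30.0]]),
]
fused, uncertainty = fuse_and_estimate_uncertainty(outputs)
print(fused)        # the models agree on the first voxel and differ on the second
print(uncertainty)  # zero where they agree, positive where they disagree
```

For a segmentation task the same pattern would typically use a vote over labels rather than a mean, with the fraction of dissenting models as the uncertainty; that variant is likewise an assumption, not a detail taken from the abstract.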

Mixed Modality Segmentation Whole Body Methodology In Silico Academic Lab Reproducibility

Direct parametric reconstruction in dynamic PET using deep image prior and a novel parameter magnification strategy.

Hong X, Wang F, Sun H, Arabi H, Lu L

•papers•Jun 2 2025

Multiple parametric imaging in positron emission tomography (PET) is challenging due to the noisy dynamic data and the complex mapping to kinetic parameters. Although methods like direct parametric reconstruction have been proposed to improve the image quality, limitations persist, particularly for nonlinear and small-value micro-parameters (e.g., k2, k3). This study presents a novel unsupervised deep learning approach to reconstruct and improve the quality of these micro-parameters. We proposed a direct parametric image reconstruction model, DIP-PM, integrating deep image prior (DIP) with a parameter magnification (PM) strategy. The model employs a U-Net generator to predict multiple parametric images using a CT image prior, with each output channel subsequently magnified by a factor to adjust the intensity. The model was optimized with a log-likelihood loss computed between the measured projection data and forward projected data. Two tracer datasets were simulated for evaluation: 82Rb data using the 1-tissue compartment (1 TC) model and 18F-FDG data using the 2-tissue compartment (2 TC) model, with 10-fold magnification applied to the 1 TC k2 and the 2 TC k3, respectively. DIP-PM was compared to the indirect method, direct algorithm (OTEM) and the DIP method without parameter magnification (DIP-only). Performance was assessed on phantom data using peak signal-to-noise ratio (PSNR), normalized root mean square error (NRMSE) and structural similarity index (SSIM), as well as on real 18F-FDG scan from a male subject. For the 1 TC model, OTEM performed well in K1 reconstruction, but both indirect and OTEM methods showed high noise and poor performance in k2. The DIP-only method suppressed noise in k2, but failed to reconstruct fine structures in the myocardium. DIP-PM outperformed other methods with well-preserved detailed structures, particularly in k2, achieving the best metrics (PSNR: 19.00, NRMSE: 0.3002, SSIM: 0.9289). 
For the 2 TC model, traditional methods exhibited high noise and blurred structures in estimating all nonlinear parameters (K1, k2, k3), while DIP-based methods significantly improved image quality. DIP-PM outperformed all methods in k3 (PSNR: 21.89, NRMSE: 0.4054, SSIM: 0.8797), and consequently produced the most accurate 2 TC Ki images (PSNR: 22.74, NRMSE: 0.4897, SSIM: 0.8391). On real FDG data, DIP-PM also showed evident advantages in estimating K1, k2 and k3 while preserving myocardial structures. The results underscore the efficacy of the DIP-based direct parametric imaging in generating and improving quality of PET parametric images. This study suggests that the proposed DIP-PM method with the parameter magnification strategy can enhance the fidelity of nonlinear micro-parameter images.
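
The dependence of the 2 TC Ki image on the k3 estimate can be made concrete with the standard net influx expression. The sketch below assumes an irreversible two-tissue compartment model (k4 = 0), which is the usual assumption for 18F-FDG but is not stated in the abstract; the rate constants are illustrative values, not results from the study.

```python
import numpy as np

def net_influx_rate(K1, k2, k3):
    """Ki = K1 * k3 / (k2 + k3) for an irreversible 2-tissue model (k4 = 0).

    Accepts scalars or arrays (e.g. voxel-wise parametric images).
    """
    K1 = np.asarray(K1, dtype=float)
    k2 = np.asarray(k2, dtype=float)
    k3 = np.asarray(k3, dtype=float)
    return K1 * k3 / (k2 + k3)

# Hypothetical rate constants (units 1/min, with K1 in mL/min/mL).
ki = float(net_influx_rate(K1=0.1, k2=0.2, k3=0.05))
print(f"Ki = {ki:.3f}")

# The same call works voxel-wise on parametric images.
ki_image = net_influx_rate(
    K1=np.full((2, 2), 0.1),
    k2=np.full((2, 2), 0.2),
    k3=np.array([[0.05, 0.10], [0.05, 0.10]]),
)
```

Since k3 appears in both the numerator and the denominator, noise in a small-valued k3 image carries over into Ki, which fits the abstract's observation that better k3 estimates led to more accurate Ki images.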

PET Reconstruction Cardiac Methodology In Silico Academic Lab Reproducibility

Robust multi-coil MRI reconstruction via self-supervised denoising.

Aali A, Arvinte M, Kumar S, Arefeen YI, Tamir JI

•papers•Jun 2 2025

To examine the effect of incorporating self-supervised denoising as a pre-processing step for training deep learning (DL) based reconstruction methods on data corrupted by Gaussian noise. K-space data employed for training are typically multi-coil and inherently noisy. Although DL-based reconstruction methods trained on fully sampled data can enable high reconstruction quality, obtaining large, noise-free datasets is impractical. We leverage Generalized Stein's Unbiased Risk Estimate (GSURE) for denoising. We evaluate two DL-based reconstruction methods: Diffusion Probabilistic Models (DPMs) and Model-Based Deep Learning (MoDL). We evaluate the impact of denoising on the performance of these DL-based methods in solving accelerated multi-coil magnetic resonance imaging (MRI) reconstruction. The experiments were carried out on T2-weighted brain and fat-suppressed proton-density knee scans. We observed that self-supervised denoising enhances the quality and efficiency of MRI reconstructions across various scenarios. Specifically, employing denoised images rather than noisy counterparts when training DL networks results in lower normalized root mean squared error (NRMSE), higher structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) across different SNR levels, including 32, 22, and 12 dB for T2-weighted brain data, and 24, 14, and 4 dB for fat-suppressed knee data. We showed that denoising is an essential pre-processing technique capable of improving the efficacy of DL-based MRI reconstruction methods under diverse conditions. By refining the quality of input data, denoising enables training more effective DL networks, potentially bypassing the need for noise-free reference MRI scans.
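
Two of the image-quality metrics reported above can be sketched as follows. Definitions vary between papers, so the forms used here are assumptions: NRMSE is taken as the L2 norm of the error divided by the L2 norm of the reference, and PSNR uses the maximum magnitude of the reference as the peak value. The arrays are toy data, and SSIM is left out because it needs windowed local statistics.

```python
import numpy as np

def nrmse(reference, reconstruction):
    """Assumed form: ||reconstruction - reference||_2 / ||reference||_2."""
    reference = np.asarray(reference)
    reconstruction = np.asarray(reconstruction)
    return float(np.linalg.norm(reconstruction - reference) / np.linalg.norm(reference))

def psnr(reference, reconstruction):
    """Assumed form: 20 * log10(max|reference| / RMSE), in dB."""
    reference = np.asarray(reference)
    reconstruction = np.asarray(reconstruction)
    rmse = np.sqrt(np.mean(np.abs(reconstruction - reference) ** 2))
    return float(20.0 * np.log10(np.max(np.abs(reference)) / rmse))

# Toy magnitude images; a real evaluation would use reconstructed MRI slices.
reference = np.array([[1.0, 2.0], [3.0, 4.0]])
reconstruction = reference + 0.1

nrmse_value = nrmse(reference, reconstruction)
psnr_value = psnr(reference, reconstruction)
print(f"NRMSE = {nrmse_value:.4f}, PSNR = {psnr_value:.2f} dB")
```

Both functions take the magnitude of the error, so they also accept complex-valued images, which is how multi-coil reconstructions are often stored before the final magnitude step.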

MRI Reconstruction Methodology In Silico Academic Lab Reproducibility

Accelerating 3D radial MPnRAGE using a self-supervised deep factor model.

Chen Y, Kecskemeti SR, Holmes JH, Corum CA, Yaghoobi N, Magnotta VA, Jacob M

•papers•Jun 2 2025

To develop a self-supervised and memory-efficient deep learning image reconstruction method for 4D non-Cartesian MRI with high resolution and a large parametric dimension. The deep factor model (DFM) represents a parametric series of 3D multicontrast images using a neural network conditioned by the inversion time using efficient zero-filled reconstructions as input estimates. The model parameters are learned in a single-shot learning (SSL) fashion from the k-space data of each acquisition. A compatible transfer learning (TL) approach using previously acquired data is also developed to reduce reconstruction time. The DFM is compared to subspace methods with different regularization strategies in a series of phantom and in vivo experiments using the MPnRAGE acquisition for multicontrast T1 imaging and quantitative T1 estimation. DFM-SSL improved the image quality and reduced bias and variance in quantitative T1 estimates in both phantom and in vivo studies, outperforming all other tested methods. DFM-TL reduced the inference time while maintaining a performance comparable to DFM-SSL and outperforming subspace methods with multiple regularization techniques. The proposed DFM offers a superior representation of the multicontrast images compared to subspace models, especially in the highly accelerated MPnRAGE setting.
The self-supervised training is ideal for methods with both high resolution and a large parametric dimension, where training neural networks can become computationally demanding without a dedicated high-end GPU array.

MRI Reconstruction Neurological Methodology Phantom/Animal Academic Lab Reproducibility

Inferring single-cell spatial gene expression with tissue morphology via explainable deep learning

Zhao, Y., Alizadeh, E., Taha, H. B., Liu, Y., Xu, M., Mahoney, J. M., Li, S.

•preprint•Jun 2 2025

Deep learning models trained with spatial omics data uncover complex patterns and relationships among cells, genes, and proteins in a high-dimensional space. State-of-the-art in silico spatial multi-cell gene expression methods using histological images of tissue stained with hematoxylin and eosin (H&E) allow us to characterize cellular heterogeneity. We developed a vision transformer (ViT) framework, named SPiRiT, to map histological signatures to spatial single-cell transcriptomic signatures. SPiRiT predicts single-cell spatial gene expression using the matched H&E image tiles of human breast cancer and whole mouse pup, evaluated on Xenium (10x Genomics) datasets. Importantly, SPiRiT incorporates rigorous strategies to ensure reproducibility and robustness of predictions and provides trustworthy interpretation through attention-based model explainability. SPiRiT model interpretation revealed the tissue areas and attention details it uses to predict gene expression, such as that of marker genes in invasive cancer cells. In an apples-to-apples comparison with ST-Net, SPiRiT improved predictive accuracy by 40%. These gene predictions and expression levels were highly consistent with the tumor region annotation. In summary, SPiRiT highlights the feasibility of inferring spatial single-cell gene expression from tissue morphology across multiple species.

X-Ray Image Synthesis Breast Methodology In Silico Academic Lab Reproducibility

Eliminating the second CT scan of dual-tracer total-body PET/CT via deep learning-based image synthesis and registration.

Lin Y, Wang K, Zheng Z, Yu H, Chen S, Tang W, He Y, Gao H, Yang R, Xie Y, Yang J, Hou X, Wang S, Shi H

•papers•Jun 1 2025

This study aims to develop and validate a deep learning framework designed to eliminate the second CT scan of dual-tracer total-body PET/CT imaging. We retrospectively included three cohorts of 247 patients who underwent dual-tracer total-body PET/CT imaging on two separate days (time interval:1-11 days). Out of these, 167 underwent [68Ga]Ga-DOTATATE/[18F]FDG, 50 underwent [68Ga]Ga-PSMA-11/[18F]FDG, and 30 underwent [68Ga]Ga-FAPI-04/[18F]FDG. A deep learning framework was developed that integrates a registration generative adversarial network (RegGAN) with non-rigid registration techniques. This approach allows for the transformation of attenuation-correction CT (ACCT) images from the first scan into pseudo-ACCT images for the second scan, which are then used for attenuation and scatter correction (ASC) of the second tracer PET images. Additionally, the derived registration transform facilitates dual-tracer image fusion and analysis. The deep learning-based ASC PET images were evaluated using quantitative metrics, including mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) across the whole body and specific regions. Furthermore, the quantitative accuracy of PET images was assessed by calculating standardized uptake value (SUV) bias in normal organs and lesions. The MAE for whole-body pseudo-ACCT images ranged from 97.64 to 112.59 HU across four tracers. The deep learning-based ASC PET images demonstrated high similarity to the ground-truth PET images. The MAE of SUV for whole-body PET images was 0.06 for [68Ga]Ga-DOTATATE, 0.08 for [68Ga]Ga-PSMA-11, 0.06 for [68Ga]Ga-FAPI-04, and 0.05 for [18F]FDG, respectively. Additionally, the median absolute percent deviation of SUV was less than 2.6% for all normal organs, while the mean absolute percent deviation of SUV was less than 3.6% for lesions across four tracers. 
The proposed deep learning framework, combining RegGAN and non-rigid registration, shows promise in reducing CT radiation dose for dual-tracer total-body PET/CT imaging, with successful validation across multiple tracers.

PET Image Synthesis Whole Body Retrospective Clinical In Silico Academic Lab Reproducibility

Significant reduction in manual annotation costs in ultrasound medical image database construction through step by step artificial intelligence pre-annotation.

Zheng F, XingMing L, JuYing X, MengYing T, BaoJian Y, Yan S, KeWei Y, ZhiKai L, Cheng H, KeLan Q, XiHao C, WenFei D, Ping H, RunYu W, Ying Y, XiaoHui B

•papers•Jun 1 2025

This study investigates the feasibility of reducing manual image annotation costs in medical image database construction by using a step-by-step approach in which an artificial intelligence (AI) model trained on a previous batch of data automatically pre-annotates the next batch of image data, taking the annotation of thyroid nodule ultrasound images as an example. The study used YOLOv8 as the AI model. During AI model training, in addition to conventional image augmentation techniques, augmentation methods specifically tailored for ultrasound images were employed to balance the quantity differences between thyroid nodule classes and enhance model training effectiveness. The study found that training the model with augmented data significantly outperformed training with raw image data. When the number of original images was only 1,360, with 7 thyroid nodule classes, pre-annotation using the AI model trained on augmented data could save at least 30% of the manual annotation workload for junior physicians. When the number of original images reached 6,800, the classification accuracy of the AI model trained on augmented data was very close to that of junior physicians, eliminating the need for manual preliminary annotation.

Ultrasound Detection Abdominal Methodology In Silico Academic Lab Reproducibility

Radiomics across modalities: a comprehensive review of neurodegenerative diseases.

Inglese M, Conti A, Toschi N

•papers•Jun 1 2025

Radiomics allows extraction from medical images of quantitative features that are able to reveal tissue patterns that are generally invisible to human observers. Despite the challenges in visually interpreting radiomic features and the computational resources required to generate them, they hold significant value in downstream automated processing. For instance, in statistical or machine learning frameworks, radiomic features enhance sensitivity and specificity, making them indispensable for tasks such as diagnosis, prognosis, prediction, monitoring, image-guided interventions, and evaluating therapeutic responses. This review explores the application of radiomics in neurodegenerative diseases, with a focus on Alzheimer's disease, Parkinson's disease, Huntington's disease, and multiple sclerosis. While radiomics literature often focuses on magnetic resonance imaging (MRI) and computed tomography (CT), this review also covers its broader application in nuclear medicine, with use cases of positron emission tomography (PET) and single-photon emission computed tomography (SPECT) radiomics. Additionally, we review integrated radiomics, where features from multiple imaging modalities are fused to improve model performance. This review also highlights the growing integration of radiomics with artificial intelligence and the need for feature standardisation and reproducibility to facilitate its translation into clinical practice.

Mixed Modality Classification Neurological Review Concept Academic Lab Reproducibility GenAI
