MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss

Can Zhao, Pengfei Guo, Dong Yang, Yucheng Tang, Yufan He, Benjamin Simon, Mason Belue, Stephanie Harmon, Baris Turkbey, Daguang Xu

arXiv preprint · Aug 7 2025
Medical image synthesis is an important topic for both clinical and research applications. Recently, diffusion models have become a leading approach in this area. Despite their strengths, many existing methods struggle with (1) limited generalizability, working only for specific body regions or voxel spacings, (2) slow inference, a common issue for diffusion models, and (3) weak alignment with input conditions, a critical issue for medical imaging. MAISI, a previously proposed framework, addresses the generalizability issues but still suffers from slow inference and limited condition consistency. In this work, we present MAISI-v2, the first accelerated 3D medical image synthesis framework that integrates rectified flow to enable fast, high-quality generation. To further improve condition fidelity, we introduce a novel region-specific contrastive loss that increases sensitivity to the region of interest. Our experiments show that MAISI-v2 can achieve SOTA image quality with a $33 \times$ acceleration over the latent diffusion model. We also conducted a downstream segmentation experiment to show that the synthetic images can be used for data augmentation. We release our code, training details, model weights, and a GUI demo to facilitate reproducibility and promote further development within the community.
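
Rectified flow is what buys the speed here: it replaces the stochastic reverse diffusion process with a deterministic velocity-field ODE whose near-straight trajectories can be integrated in a handful of steps. A minimal Euler-sampler sketch, assuming a trained velocity network `v_theta` (a placeholder, not the released MAISI-v2 code):

```python
import torch

@torch.no_grad()
def rectified_flow_sample(v_theta, shape, num_steps=10, device="cpu"):
    """Euler integration of the rectified-flow ODE dx/dt = v_theta(x, t).

    Uses the convention x_t = (1 - t) * noise + t * data, so integrating
    from t=0 (pure noise) to t=1 yields a sample. Few steps suffice because
    the learned trajectories are near-straight.
    """
    x = torch.randn(shape, device=device)           # x_0 ~ N(0, I)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + v_theta(x, t) * dt                  # one Euler step
    return x                                        # approximate sample at t=1
```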

Predicting language outcome after stroke using machine learning: in search of the big data benefit.

Saranti M, Neville D, White A, Rotshtein P, Hope TMH, Price CJ, Bowman H

PubMed · Aug 6 2025
Accurate prediction of post-stroke language outcomes using machine learning offers the potential to enhance clinical treatment and rehabilitation for aphasic patients. This study of 758 English-speaking stroke patients from the PLORAS project explores the impact of sample size on the performance of logistic regression and a deep learning (ResNet-18) model in predicting language outcomes from neuroimaging and impairment-relevant tabular data. We assessed the performance of both models on two key language tasks from the Comprehensive Aphasia Test, Spoken Picture Description and Naming, using a learning-curve approach. Contrary to expectations, the simpler logistic regression model performed comparably to or better than the deep learning model (with overlapping confidence intervals), with both models showing an accuracy plateau around 80% for sample sizes larger than 300 patients. Principal Component Analysis revealed that the dimensionality of the neuroimaging data could be reduced to as few as 20 (or even 2) dominant components without significant loss in accuracy, suggesting that classification may be driven by simple patterns such as lesion size. The study highlights both the potential limitations of current dataset sizes in achieving further accuracy gains and the need for larger datasets to capture more complex patterns, as some of our results indicate that we may not have reached an absolute classification performance ceiling. Overall, these findings provide insights into the practical use of machine learning for predicting aphasia outcomes and the potential benefits of much larger datasets in enhancing model performance.
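
The learning-curve approach is straightforward to reproduce with scikit-learn. A sketch under stated assumptions (synthetic stand-in data, since PLORAS images are not publicly bundled) combining the paper's two key ingredients, PCA reduction to ~20 components and logistic regression, over nested training subsets:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(758, 5000))     # stand-in for flattened lesion images
y = rng.integers(0, 2, size=758)     # stand-in for task pass/fail outcome

X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Learning curve: fit on nested subsets of increasing size, evaluate on a
# fixed held-out set, and watch where test accuracy plateaus.
for n in (50, 100, 300, 600):
    model = make_pipeline(PCA(n_components=20),
                          LogisticRegression(max_iter=1000))
    model.fit(X_train_full[:n], y_train_full[:n])
    print(n, model.score(X_test, y_test))
```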

Machine Learning-Based Reconstruction of 2D MRI for Quantitative Morphometry in Epilepsy

Ratcliffe, C., Taylor, P. N., de Bezenac, C., Das, K., Biswas, S., Marson, A., Keller, S. S.

medRxiv preprint · Aug 6 2025
Introduction: Structural neuroimaging analyses require research-quality images acquired with costly MRI acquisitions. Isotropic (3D-T1) images are desirable for quantitative analyses; however, a routine compromise in the clinical setting is to acquire anisotropic (2D-T1) analogues for qualitative visual inspection. Machine learning-based (ML) software has shown promise in addressing some of the limitations of 2D-T1 scans in research applications, yet its efficacy in quantitative research is generally poorly understood. Pathology-related abnormalities of the subcortical structures that are overlooked on visual inspection have previously been identified in idiopathic generalised epilepsy (IGE) through quantitative morphometric analyses. As such, IGE biomarkers present a suitable model in which to evaluate the applicability of image preprocessing methods. This study therefore explores subcortical structural biomarkers of IGE, first in our silver-standard 3D-T1 scans, then in 2D-T1 scans that were either untransformed, resampled using a classical interpolation approach, or synthesised with a resolution- and contrast-agnostic ML model (the latter of which is compared to a separate model).

Methods: 2D-T1 and 3D-T1 MRI scans were acquired during the same scanning session for 33 individuals with drug-responsive IGE (age mean 32.16 ± SD 14.20, male n = 14) and 42 individuals with drug-resistant IGE (31.76 ± 11.12, 17), all diagnosed at the Walton Centre NHS Foundation Trust Liverpool, alongside 39 age- and sex-matched healthy controls (32.32 ± 8.65, 16). The untransformed 2D-T1 scans were resampled into isotropic images using NiBabel (res-T1) and preprocessed into synthetic isotropic images using SynthSR (syn-T1). For the 3D-T1, 2D-T1, res-T1, and syn-T1 images, the recon-all command from FreeSurfer 8.0.0 was used to create parcellations of 174 anatomical regions (equivalent to the 174 regional parcellations provided as part of the DL+DiReCT pipeline), defined by the aseg and Destrieux atlases, and FSL run_first_all was used to segment subcortical surface shapes. The new ML FreeSurfer pipeline, recon-all-clinical, was also tested on the 2D-T1, 3D-T1, and res-T1 images. As a model comparison for SynthSR, the DL+DiReCT pipeline was used to provide segmentations of the 2D-T1 and res-T1 images, including estimates of regional volume and thickness. Spatial overlap and intraclass correlations between the morphometrics of the eight resulting parcellations were first determined, then subcortical surface shape abnormalities associated with IGE were identified by comparing the FSL run_first_all outputs of patients with controls.

Results: When standardised to the metrics derived from the 3D-T1 scans, cortical volume and thickness estimates trended lower for the 2D-T1, res-T1, syn-T1, and DL+DiReCT outputs, whereas subcortical volume estimates were more coherent. Dice coefficients revealed an acceptable spatial similarity between the cortices of the 3D-T1 scans and the other images overall, and similarity was higher in the subcortical structures. Intraclass correlation coefficients were consistently lowest when metrics were computed for model-derived inputs, and estimates of thickness were less similar to the ground truth than those of volume. For the people with epilepsy, the 3D-T1 scans showed significant surface deflations across various subcortical structures when compared to healthy controls. Analysis of the 2D-T1 scans enabled the reliable detection of a subset of subcortical abnormalities, whereas analyses of the res-T1 and syn-T1 images were more prone to false-positive results.

Conclusions: Resampling and ML image-synthesis methods do not currently attenuate partial volume effects resulting from low through-plane resolution in anisotropic MRI scans; quantitative analyses using 2D-T1 scans should therefore be interpreted with caution, and researchers should consider the potential implications of preprocessing. The recon-all-clinical pipeline is promising, but requires further evaluation, especially when considered as an alternative to the classical pipeline.

Key Points:
- Surface deviations indicative of regional atrophy and hypertrophy were identified in people with idiopathic generalised epilepsy.
- Partial volume effects are likely to attenuate subtle morphometric abnormalities, increasing the likelihood of erroneous inference.
- Priors in synthetic image creation models may render them insensitive to subtle biomarkers.
- Resampling and machine-learning-based image synthesis are not currently replacements for research-quality acquisitions in quantitative MRI research.
- The results of studies using synthetic images should be interpreted in a separate context to those using untransformed data.
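
The res-T1 arm of the comparison (classical interpolation) can be illustrated with NiBabel, which the authors used for resampling. A minimal sketch with a hypothetical filename; note that interpolation regularises the voxel grid but, as the study concludes, cannot restore the through-plane information lost at acquisition:

```python
import nibabel as nib
from nibabel.processing import resample_to_output

# Load an anisotropic clinical T1 (e.g., 0.9 x 0.9 x 5 mm voxels).
img_2d_t1 = nib.load("sub-01_T1w_2d.nii.gz")     # hypothetical filename

# Resample onto a 1 mm isotropic grid with trilinear interpolation (order=1).
# This regularises the grid but cannot recover through-plane detail, which is
# why partial volume effects persist in the res-T1 images.
res_t1 = resample_to_output(img_2d_t1, voxel_sizes=(1.0, 1.0, 1.0), order=1)
nib.save(res_t1, "sub-01_T1w_res.nii.gz")
```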

Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Namdar K, Wagner MW, Ertl-Wagner BB, Khalvati F

PubMed · Aug 4 2025
As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol, to investigate the effects of radiomics feature extraction settings on the reproducibility of results. We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and a non-small cell lung cancer (NSCLC) survival analysis dataset from The Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol can be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets: the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. We used Random Forest classifiers, and for each radiomics dataset we repeated the training-validation-test (60%/20%/20%) experiment with different data splits and model random states 100 times (28,800 test results in total) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC). Unlike binWidth and image normalization, the tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, they were irreproducible. Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., the tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performances that are irreproducible.
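
One cell of this protocol, a fixed sequence, binWidth, normalization, and subregion, reduces to PyRadiomics feature extraction followed by repeated Random Forest runs. A simplified sketch under stated assumptions: a plain 80/20 split stands in for the paper's 60%/20%/20% protocol, and the feature filtering is illustrative:

```python
import numpy as np
from radiomics import featureextractor
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# One configuration "cell" of the protocol: a fixed MRI sequence, binWidth,
# normalization method, and tumor subregion.
extractor = featureextractor.RadiomicsFeatureExtractor(binWidth=25)

def case_features(image_path, mask_path):
    """Extract PyRadiomics features for one case, keeping numeric values only."""
    result = extractor.execute(image_path, mask_path)
    return [float(v) for k, v in result.items() if k.startswith("original_")]

def repeated_auroc(X, y, n_repeats=100):
    """Repeat the split with different random states, as the protocol
    prescribes, and report the spread of test AUROCs."""
    aurocs = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed, stratify=y)
        clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
        aurocs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    return np.mean(aurocs), np.std(aurocs)
```

Reporting the mean and spread over many splits, rather than a single lucky split, is exactly what exposes the irreproducible perfect-AUROC runs the paper warns about.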

Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application

Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz

arXiv preprint · Aug 4 2025
Trustworthy interpretation of deep learning models is critical for neuroimaging applications, yet commonly used Explainable AI (XAI) methods lack rigorous validation, risking misinterpretation. We performed the first large-scale, systematic comparison of XAI methods on ~45,000 structural brain MRIs using a novel XAI validation framework. This framework establishes verifiable ground truth by constructing prediction tasks with known signal sources - from localized anatomical features to subject-specific clinical lesions - without artificially altering input images. Our analysis reveals systematic failures in two of the most widely used methods: GradCAM consistently failed to localize predictive features, while Layer-wise Relevance Propagation generated extensive, artifactual explanations that suggest incompatibility with neuroimaging data characteristics. Our results indicate that these failures stem from a domain mismatch, where methods with design principles tailored to natural images require substantial adaptation for neuroimaging data. In contrast, the simpler, gradient-based method SmoothGrad, which makes fewer assumptions about data structure, proved consistently accurate, suggesting its conceptual simplicity makes it more robust to this domain shift. These findings highlight the need for domain-specific adaptation and validation of XAI methods, suggest that interpretations from prior neuroimaging studies using standard XAI methodology warrant re-evaluation, and provide urgent guidance for practical application of XAI in neuroimaging.
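
SmoothGrad, the method that proved robust here, is also the simplest to state: average input gradients over noisy copies of the input. A minimal PyTorch sketch, assuming a classifier `model` that returns class logits:

```python
import torch

def smoothgrad(model, x, target_class, n_samples=50, noise_std=0.1):
    """SmoothGrad saliency: average input gradients over noisy copies of x.

    Makes few assumptions about data structure, which the study links to its
    robustness on brain MRI relative to GradCAM and LRP.
    """
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x.detach() + noise_std * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[:, target_class].sum()
        score.backward()                 # populate noisy.grad
        grads += noisy.grad
    return grads / n_samples             # smoothed saliency map
```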

Medical Image De-Identification Resources: Synthetic DICOM Data and Tools for Validation

Michael W. Rutherford, Tracy Nolan, Linmin Pei, Ulrike Wagner, Qinyan Pan, Phillip Farmer, Kirk Smith, Benjamin Kopchick, Laura Opsahl-Ong, Granger Sutton, David Clunie, Keyvan Farahani, Fred Prior

arXiv preprint · Aug 3 2025
Medical imaging research increasingly depends on large-scale data sharing to promote reproducibility and train Artificial Intelligence (AI) models. Ensuring patient privacy remains a significant challenge for open-access data sharing. Digital Imaging and Communications in Medicine (DICOM), the global standard data format for medical imaging, encodes both essential clinical metadata and extensive protected health information (PHI) and personally identifiable information (PII). Effective de-identification must remove identifiers, preserve scientific utility, and maintain DICOM validity. Tools exist to perform de-identification, but few assess its effectiveness, and most rely on subjective reviews, limiting reproducibility and regulatory confidence. To address this gap, we developed an openly accessible DICOM dataset infused with synthetic PHI/PII and an evaluation framework for benchmarking image de-identification workflows. The Medical Image de-identification (MIDI) dataset was built using publicly available de-identified data from The Cancer Imaging Archive (TCIA). It includes 538 subjects (216 for validation, 322 for testing), 605 studies, 708 series, and 53,581 DICOM image instances. These span multiple vendors, imaging modalities, and cancer types. Synthetic PHI and PII were embedded into structured data elements, plain text data elements, and pixel data to simulate real-world identity leaks encountered by TCIA curation teams. Accompanying evaluation tools include a Python script, answer keys (known truth), and mapping files that enable automated comparison of curated data against expected transformations. The framework is aligned with the HIPAA Privacy Rule "Safe Harbor" method, DICOM PS3.15 Confidentiality Profiles, and TCIA best practices. It supports objective, standards-driven evaluation of de-identification workflows, promoting safer and more consistent medical image sharing.
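
The released evaluation tools include a Python script and answer keys; as an illustration of the general idea (not the MIDI scripts themselves), a hedged sketch of comparing a curated DICOM instance against expected tag transformations with pydicom:

```python
import pydicom

# Hypothetical answer key: DICOM keyword -> expected value after de-identification.
answer_key = {"PatientName": "", "PatientID": "MIDI-0001", "PatientBirthDate": ""}

def check_instance(path, key):
    """Compare a curated DICOM instance against expected transformations."""
    ds = pydicom.dcmread(path)
    failures = []
    for keyword, expected in key.items():
        actual = str(ds.get(keyword, ""))
        if actual != expected:
            failures.append((keyword, expected, actual))
    return failures

# Hypothetical curated file; a clean run prints nothing.
for kw, want, got in check_instance("curated/instance0001.dcm", answer_key):
    print(f"{kw}: expected {want!r}, found {got!r}")
```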

Medical Image De-Identification Benchmark Challenge

Linmin Pei, Granger Sutton, Michael Rutherford, Ulrike Wagner, Tracy Nolan, Kirk Smith, Phillip Farmer, Peter Gu, Ambar Rana, Kailing Chen, Thomas Ferleman, Brian Park, Ye Wu, Jordan Kojouharov, Gargi Singh, Jon Lemon, Tyler Willis, Milos Vukadinovic, Grant Duffy, Bryan He, David Ouyang, Marco Pereanez, Daniel Samber, Derek A. Smith, Christopher Cannistraci, Zahi Fayad, David S. Mendelson, Michele Bufano, Elmar Kotter, Hamideh Haghiri, Rajesh Baidya, Stefan Dvoretskii, Klaus H. Maier-Hein, Marco Nolden, Christopher Ablett, Silvia Siggillino, Sandeep Kaushik, Hongzhu Jiang, Sihan Xie, Zhiyu Wan, Alex Michie, Simon J Doran, Angeline Aurelia Waly, Felix A. Nathaniel Liang, Humam Arshad Mustagfirin, Michelle Grace Felicia, Kuo Po Chih, Rahul Krish, Ghulam Rasool, Nidhal Bouaynaya, Nikolas Koutsoubis, Kyle Naddeo, Kartik Pandit, Tony O'Sullivan, Raj Krish, Qinyan Pan, Scott Gustafson, Benjamin Kopchick, Laura Opsahl-Ong, Andrea Olvera-Morales, Jonathan Pinney, Kathryn Johnson, Theresa Do, Juergen Klenk, Maria Diaz, Arti Singh, Rong Chai, David A. Clunie, Fred Prior, Keyvan Farahani

arXiv preprint · Jul 31 2025
The de-identification (deID) of protected health information (PHI) and personally identifiable information (PII) is a fundamental requirement for sharing medical images, particularly through public repositories, to ensure compliance with patient privacy laws. In addition, preservation of non-PHI metadata to inform and enable downstream development of imaging artificial intelligence (AI) is an important consideration in biomedical research. The goal of the Medical Image De-Identification Benchmark (MIDI-B) challenge was to provide a standardized platform for benchmarking DICOM image deID tools based on a set of rules conformant to the HIPAA Safe Harbor regulation, the DICOM Attribute Confidentiality Profiles, and best practices in the preservation of research-critical metadata, as defined by The Cancer Imaging Archive (TCIA). The challenge employed a large, diverse, multi-center, multi-modality set of real de-identified radiology images with synthetic PHI/PII inserted. The MIDI-B Challenge consisted of three phases: training, validation, and test. Eighty individuals registered for the challenge. In the training phase, we encouraged participants to tune their algorithms using their in-house or public data. The validation and test phases utilized DICOM images containing synthetic identifiers (from 216 and 322 subjects, respectively). Ten teams successfully completed the test phase of the challenge. To measure the success of a rule-based approach to image deID, scores were computed as the percentage of correct actions out of the total number of required actions; scores ranged from 97.91% to 99.93%. Participants employed a variety of open-source and proprietary tools with customized configurations, large language models, and optical character recognition (OCR). In this paper we provide a comprehensive report on the MIDI-B Challenge's design, implementation, results, and lessons learned.
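
The scoring rule is simple enough to state in a few lines. A sketch with hypothetical action labels, where the score is the percentage of required de-identification actions performed correctly:

```python
def midi_b_score(required_actions, performed_actions):
    """Score = correct actions / required actions, as a percentage.

    `required_actions` maps (instance, tag) -> expected action, e.g. "remove"
    or "replace" (labels here are illustrative); `performed_actions` records
    what the submission actually did.
    """
    correct = sum(1 for key, expected in required_actions.items()
                  if performed_actions.get(key) == expected)
    return 100.0 * correct / len(required_actions)

# Example: 3 of 4 required actions done correctly -> 75.0
req = {("i1", "PatientName"): "remove", ("i1", "PatientID"): "replace",
       ("i2", "PatientName"): "remove", ("i2", "StudyDate"): "shift"}
done = {("i1", "PatientName"): "remove", ("i1", "PatientID"): "replace",
        ("i2", "PatientName"): "remove", ("i2", "StudyDate"): "remove"}
print(midi_b_score(req, done))  # 75.0
```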

Label-free estimation of clinically relevant performance metrics under distribution shifts

Tim Flühmann, Alceu Bissoto, Trung-Dung Hoang, Lisa M. Koch

arXiv preprint · Jul 30 2025
Performance monitoring is essential for the safe clinical deployment of image classification models. However, because ground-truth labels are typically unavailable in the target dataset, direct assessment of real-world model performance is infeasible. State-of-the-art performance estimation methods address this by leveraging confidence scores to estimate target accuracy. Although this is a promising direction, established methods mainly estimate a model's overall accuracy and are rarely evaluated in clinical domains, where strong class imbalances and dataset shifts are common. Our contributions are twofold: first, we introduce generalisations of existing performance prediction methods that directly estimate the full confusion matrix; second, we benchmark their performance on chest X-ray data under real-world distribution shifts as well as simulated covariate and prevalence shifts. The proposed confusion-matrix estimation methods reliably predicted clinically relevant counting metrics on medical images under distribution shifts. However, our simulated shift scenarios exposed important failure modes of current performance estimation techniques, calling for a better understanding of real-world deployment contexts when implementing these performance monitoring techniques for post-market surveillance of medical AI models.
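
One simple way to generalise confidence-based accuracy estimation to the full confusion matrix is to accumulate predicted probability mass per predicted class, which is exact when the probabilities are calibrated. This is a sketch of the idea, not necessarily the paper's estimators:

```python
import numpy as np

def estimate_confusion_matrix(probs):
    """Label-free confusion-matrix estimate from softmax outputs.

    probs: (n_samples, n_classes) predicted probabilities on unlabeled target
    data. Entry (i, j) accumulates the probability mass that the true class
    is i among samples predicted as class j; under calibration this matches
    the expected confusion matrix.
    """
    n_classes = probs.shape[1]
    predicted = probs.argmax(axis=1)
    cm = np.zeros((n_classes, n_classes))
    for j in range(n_classes):
        cm[:, j] = probs[predicted == j].sum(axis=0)  # expected true-class counts
    return cm

# Counting metrics follow directly, e.g. estimated sensitivity of class 1:
# cm[1, 1] / cm[1, :].sum()
```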

Deep sensorless tracking of ultrasound probe orientation during freehand transperineal biopsy with spatial context for symmetry disambiguation.

Soormally C, Beitone C, Troccaz J, Voros S

PubMed · Jul 29 2025
Diagnosis of prostate cancer requires histopathology of tissue samples. Following an MRI to identify suspicious areas, a biopsy is performed under ultrasound (US) guidance. In existing assistance systems, 3D US information is generally available (taken before the biopsy session and/or in between samplings), but without registration between 2D images and 3D volumes, the urologist must rely on cognitive navigation. This work introduces a deep learning model to track the orientation of real-time US slices relative to a reference 3D US volume using only image and volume data. The dataset comprises 515 3D US volumes collected from 51 patients during routine transperineal biopsy. To generate 2D image streams, volumes were resampled to simulate rotational movements with three degrees of freedom around the rectal entrance. The proposed model comprises two ResNet-based sub-modules that address the symmetry ambiguity arising from complex out-of-plane movement of the probe. The first sub-module predicts the unsigned relative orientation between consecutive slices, while the second leverages a custom similarity model and a spatial-context volume to determine the sign of this relative orientation. From the sub-modules' predictions, slice orientations along the navigated trajectory can then be derived in real time. Results demonstrate that the registration error remains below 2.5 mm in 92% of cases over a 5-second trajectory, and in 80% over a 25-second trajectory. These findings show that accurate, sensorless 2D/3D US registration given a spatial context is achievable with limited drift over extended navigation, highlighting the potential of AI-driven biopsy assistance to increase the accuracy of freehand biopsy.
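
The inference loop implied by the two sub-modules can be sketched compactly, collapsed here to a single rotational degree of freedom for brevity (the paper tracks three). Both network callables below are placeholders for the paper's ResNet-based modules:

```python
def track_orientation(slices, magnitude_net, sign_net, context_volume, theta0=0.0):
    """Integrate per-frame relative rotations into an absolute trajectory.

    `magnitude_net(prev, curr)` -> unsigned relative angle between consecutive
    slices; `sign_net(curr, context_volume)` -> +1 or -1, resolving the
    symmetry ambiguity of out-of-plane probe motion via spatial context.
    """
    theta = theta0
    trajectory = [theta]
    for prev, curr in zip(slices, slices[1:]):
        delta = magnitude_net(prev, curr)        # |relative orientation|
        sign = sign_net(curr, context_volume)    # disambiguate the sign
        theta += sign * delta                    # accumulate along trajectory
        trajectory.append(theta)
    return trajectory
```

Because orientations are accumulated incrementally, any per-step error compounds, which is why the paper reports drift over 5-second versus 25-second trajectories.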

AI generated annotations for Breast, Brain, Liver, Lungs, and Prostate cancer collections in the National Cancer Institute Imaging Data Commons.

Murugesan GK, McCrumb D, Soni R, Kumar J, Nuernberg L, Pei L, Wagner U, Granger S, Fedorov AY, Moore S, Van Oss J

PubMed · Jul 29 2025
The Artificial Intelligence in Medical Imaging (AIMI) initiative aims to enhance the National Cancer Institute's (NCI) Image Data Commons (IDC) by releasing fully reproducible nnU-Net models, along with AI-assisted segmentation for cancer radiology images. In this extension of our earlier work, we created high-quality, AI-annotated imaging datasets for 11 IDC collections, spanning computed tomography (CT) and magnetic resonance imaging (MRI) of the lungs, breast, brain, kidneys, prostate, and liver. Each nnU-Net model was trained on open-source datasets, and a portion of the AI-generated annotations was reviewed and corrected by board-certified radiologists. Both the AI and radiologist annotations were encoded in compliance with the Digital Imaging and Communications in Medicine (DICOM) standard, ensuring seamless integration into the IDC collections. By making these models, images, and annotations publicly accessible, we aim to facilitate further research and development in cancer imaging.
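
Because the annotations are DICOM-encoded, they can be inspected with standard tooling. A minimal pydicom sketch, with a hypothetical filename, for reading one AI-generated DICOM Segmentation (SEG) object:

```python
import pydicom

# Hypothetical path to one AI-generated DICOM Segmentation (SEG) object.
seg = pydicom.dcmread("idc_ai_annotation_seg.dcm")

# A DICOM SEG stores one binary frame per segment per slice; SegmentSequence
# carries the human-readable labels the annotations were encoded with.
frames = seg.pixel_array                 # shape: (n_frames, rows, cols)
for segment in seg.SegmentSequence:
    print(segment.SegmentNumber, segment.SegmentLabel)
```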