
Impact of a computed tomography-based artificial intelligence software on radiologists' workflow for detecting acute intracranial hemorrhage.

Kim J, Jang J, Oh SW, Lee HY, Min EJ, Choi JW, Ahn KJ

PubMed · Jul 7, 2025
To assess the impact of a commercially available computed tomography (CT)-based artificial intelligence (AI) software for detecting acute intracranial hemorrhage (AIH) on radiologists' diagnostic performance and workflow in a real-world clinical setting. This retrospective study included a total of 956 non-contrast brain CT scans obtained over a 70-day period, interpreted independently by 2 board-certified general radiologists. Of these, 541 scans were interpreted during the initial 35 days before the implementation of AI software, and the remaining 415 scans were interpreted during the subsequent 35 days, with reference to AIH probability scores generated by the software. To assess the software's impact on radiologists' performance in detecting AIH, performance before and after implementation was compared. Additionally, to evaluate the software's effect on radiologists' workflow, Kendall's Tau was used to assess the correlation between the daily chronological order of CT scans and the radiologists' reading order before and after implementation. The early diagnosis rate for AIH (defined as the proportion of AIH cases read within the first quartile by radiologists) and the median reading order of AIH cases were also compared before and after implementation. A total of 956 initial CT scans from 956 patients [mean age: 63.14 ± 18.41 years; male patients: 447 (47%)] were included. There were no significant differences in accuracy [from 0.99 (95% confidence interval: 0.99-1.00) to 0.99 (0.98-1.00), P = 0.343], sensitivity [from 1.00 (0.99-1.00) to 1.00 (0.99-1.00), P = 0.859], or specificity [from 1.00 (0.99-1.00) to 0.99 (0.97-1.00), P = 0.252] following the implementation of the AI software. However, the daily correlation between the chronological order of CT scans and the radiologists' reading order significantly decreased [Kendall's Tau, from 0.61 (0.48-0.73) to 0.01 (0.00-0.26), P < 0.001]. Additionally, the early diagnosis rate significantly increased [from 0.49 (0.34-0.63) to 0.76 (0.60-0.93), P = 0.013], and the daily median reading order of AIH cases significantly decreased [from 7.25 (Q1-Q3: 3-10.75) to 1.5 (1-3), P < 0.001] after the implementation. After the implementation of CT-based AI software for detecting AIH, the radiologists' daily reading order was considerably reprioritized to allow more rapid interpretation of AIH cases without compromising diagnostic performance in a real-world clinical setting. With the increasing number of CT scans and the growing burden on radiologists, optimizing the workflow for diagnosing AIH through CT-based AI software integration may enhance the prompt and efficient treatment of patients with AIH.
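
A minimal sketch of the workflow statistic at the heart of this study: Kendall's Tau between the chronological arrival order of scans and the order in which they were actually read, computed here with SciPy. The one-day worklists below are invented for illustration, not study data.

```python
# Quantify how closely a radiologist's reading order follows the
# chronological arrival order of CT scans (tau near 1 = first-in,
# first-out; tau near 0 = heavily reprioritized, e.g. by AI flags).
from scipy.stats import kendalltau

arrival_order = [1, 2, 3, 4, 5, 6, 7, 8]           # chronological order of scans
reading_pre_ai = [1, 2, 3, 4, 5, 6, 7, 8]           # hypothetical: strict FIFO
reading_post_ai = [3, 5, 1, 7, 2, 8, 4, 6]          # hypothetical: flagged cases pulled forward

tau_pre, p_pre = kendalltau(arrival_order, reading_pre_ai)
tau_post, p_post = kendalltau(arrival_order, reading_post_ai)
print(f"pre-AI tau={tau_pre:.2f} (p={p_pre:.3f}), post-AI tau={tau_post:.2f} (p={p_post:.3f})")
```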

External Validation on a Japanese Cohort of a Computer-Aided Diagnosis System Aimed at Characterizing ISUP ≥ 2 Prostate Cancers at Multiparametric MRI.

Escande R, Jaouen T, Gonindard-Melodelima C, Crouzet S, Kuroda S, Souchon R, Rouvière O, Shoji S

PubMed · Jul 7, 2025
To evaluate the generalizability of a computer-aided diagnosis (CADx) system based on the apparent diffusion coefficient (ADC) and wash-in rate, and trained on a French population, to diagnose International Society of Urological Pathology ≥ 2 prostate cancer on multiparametric MRI. Sixty-eight consecutive patients who underwent radical prostatectomy at a single Japanese institution were retrospectively included. Pre-prostatectomy MRIs were reviewed by an experienced radiologist who assigned a Prostate Imaging-Reporting and Data System version 2.1 (PI-RADSv2.1) score to suspicious lesions and delineated them. The CADx score was computed from these regions of interest. Using prostatectomy whole-mounts as reference, the CADx and PI-RADSv2.1 scores were compared at the lesion level using areas under the receiver operating characteristic curve (AUC), and sensitivities and specificities obtained with predefined thresholds. In PZ, AUCs were 80% (95% confidence interval [95% CI]: 71-90) for the CADx score and 80% (95% CI: 71-89; p = 0.886) for the PI-RADSv2.1 score; in TZ, AUCs were 79% (95% CI: 66-90) for the CADx score and 93% (95% CI: 82-96; p = 0.051) for the PI-RADSv2.1 score. The CADx diagnostic thresholds that provided sensitivities of 86%-91% and specificities of 64%-75% in French test cohorts yielded sensitivities of 60% (95% CI: 38-83) in PZ and 42% (95% CI: 20-71) in TZ, with specificities of 95% (95% CI: 86-100) and 92% (95% CI: 73-100), respectively. This shift may be attributed to higher ADC values and lower dynamic contrast-enhanced temporal resolution in the test cohort. The CADx obtained good overall results in this external cohort. However, predefined diagnostic thresholds provided lower sensitivities and higher specificities than expected.
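
A hedged sketch of the two analyses reported above: lesion-level AUC for a CADx score, and the sensitivity/specificity obtained when a threshold pre-defined on a training cohort is applied to an external one. Scores, labels, and the threshold below are synthetic placeholders.

```python
# Lesion-level external validation of a continuous CADx score.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 60)                 # 1 = ISUP >= 2 cancer on whole-mount (toy labels)
cadx = rng.uniform(0, 1, 60) + 0.3 * y     # hypothetical CADx scores, higher for cancers

auc = roc_auc_score(y, cadx)
threshold = 0.6                            # assumed pre-defined on the training cohort
pred = cadx >= threshold
sensitivity = (pred & (y == 1)).sum() / (y == 1).sum()
specificity = (~pred & (y == 0)).sum() / (y == 0).sum()
print(f"AUC={auc:.2f}, Se={sensitivity:.2f}, Sp={specificity:.2f}")
```

A distribution shift like the higher ADC values noted above moves scores relative to the fixed threshold, which is how an operating point can drift toward lower sensitivity and higher specificity while the AUC stays good.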

Performance of GPT-4 for automated prostate biopsy decision-making based on mpMRI: a multi-center evidence study.

Shi MJ, Wang ZX, Wang SK, Li XH, Zhang YL, Yan Y, An R, Dong LN, Qiu L, Tian T, Liu JX, Song HC, Wang YF, Deng C, Cao ZB, Wang HY, Wang Z, Wei W, Song J, Lu J, Wei X, Wang ZC

PubMed · Jul 7, 2025
Multiparametric magnetic resonance imaging (mpMRI) has significantly advanced prostate cancer (PCa) detection, yet decisions on invasive biopsy for moderate prostate imaging reporting and data system (PI-RADS) scores remain ambiguous. To explore the decision-making capacity of Generative Pretrained Transformer-4 (GPT-4) for automated prostate biopsy recommendations, we included 2299 individuals who underwent prostate biopsy from 2018 to 2023 in 3 large medical centers, with available mpMRI before biopsy and documented clinical-histopathological records. GPT-4 generated structured reports with given prompts. The performance of GPT-4 was quantified using confusion matrices, and sensitivity, specificity, and area under the curve were calculated. Multiple artificial evaluation procedures were conducted. Wilcoxon's rank sum test, Fisher's exact test, and Kruskal-Wallis tests were used for comparisons. In this cohort, the largest to date in a Chinese population, patients with moderate PI-RADS scores (3 and 4) accounted for 39.7% (912/2299) and were defined as the subset-of-interest (SOI). The detection rates of clinically significant PCa for PI-RADS scores 2-5 were 9.4%, 27.3%, 49.2%, and 80.1%, respectively. Nearly 47.5% (433/912) of SOI patients were shown by histopathology to have undergone unnecessary prostate biopsies. With the assistance of GPT-4, 20.8% (190/912) of the SOI population could have avoided unnecessary biopsies, and the benefit was even larger [28.8% (118/410)] in the most heterogeneous subgroup, PI-RADS score 3. More than 90.0% of GPT-4-generated reports were rated comprehensive and easy to understand, although satisfaction with accuracy was lower (82.8%). GPT-4 also demonstrated cognitive potential for handling complex problems, and the Chain of Thought method enabled us to better understand the decision-making logic behind GPT-4. Eventually, we developed a ProstAIGuide platform to facilitate accessibility for both doctors and patients. This multi-center study highlights the clinical utility of GPT-4 for prostate biopsy decision-making and advances our understanding of the latest artificial intelligence implementation in various medical scenarios.
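
An illustrative sketch of the confusion-matrix arithmetic behind the reported sensitivity, specificity, and avoidable-biopsy rate. The labels and recommendations below are invented placeholders, not study data.

```python
# Evaluate biopsy recommendations against histopathology ground truth.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # 1 = clinically significant PCa on biopsy
y_rec  = np.array([1, 0, 1, 0, 0, 1, 1, 0])  # 1 = model recommends biopsy

tn, fp, fn, tp = confusion_matrix(y_true, y_rec).ravel()
sensitivity = tp / (tp + fn)                  # cancers the model would still send to biopsy
specificity = tn / (tn + fp)                  # benign cases correctly spared
avoidable = tn / len(y_true)                  # fraction of the cohort spared a biopsy
print(f"Se={sensitivity:.2f}, Sp={specificity:.2f}, avoidable biopsies={avoidable:.0%}")
```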

Prediction of tissue and clinical thrombectomy outcome in acute ischaemic stroke using deep learning.

von Braun MS, Starke K, Peter L, Kürsten D, Welle F, Schneider HR, Wawrzyniak M, Kaiser DPO, Prasse G, Richter C, Kellner E, Reisert M, Klingbeil J, Stockert A, Hoffmann KT, Scheuermann G, Gillmann C, Saur D

PubMed · Jul 7, 2025
The advent of endovascular thrombectomy has significantly improved outcomes for stroke patients with intracranial large vessel occlusion, yet individual benefits can vary widely. As demand for thrombectomy rises and geographical disparities in stroke care access persist, there is a growing need for predictive models that quantify individual benefits. However, current imaging methods for estimating outcomes may not fully capture the dynamic nature of cerebral ischaemia and lack a patient-specific assessment of thrombectomy benefits. Our study introduces a deep learning approach to predict individual responses to thrombectomy in acute ischaemic stroke patients. The proposed models provide predictions for both tissue and clinical outcomes under two scenarios: one assuming successful reperfusion and another assuming unsuccessful reperfusion. The resulting simulations of penumbral salvage and difference in National Institutes of Health Stroke Scale (NIHSS) at discharge quantify the potential individual benefits of the intervention. Our models were developed on an extensive dataset from routine stroke care, which included 405 ischaemic stroke patients who underwent thrombectomy. We used acute data for training (n = 304), including multimodal CT imaging and clinical characteristics, along with post hoc markers such as thrombectomy success, final infarct localization and NIHSS at discharge. We benchmarked our tissue outcome predictions under the observed reperfusion scenario against a thresholding-based clinical method and a generalized linear model. Our deep learning model showed significant superiority, with a mean Dice score of 0.48 on internal test data (n = 50) and 0.52 on external test data (n = 51), versus 0.26/0.36 and 0.34/0.35 for the baselines, respectively. The NIHSS sum score prediction achieved median absolute errors of 1.5 NIHSS points on the internal test dataset and 3.0 NIHSS points on the external test dataset, outperforming other machine learning models. By predicting the patient-specific response to thrombectomy for both tissue and clinical outcomes, our approach offers an innovative biomarker that captures the dynamics of cerebral ischaemia. We believe this method holds significant potential to enhance personalized therapeutic strategies and to facilitate efficient resource allocation in acute stroke care.
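
The tissue-outcome benchmark above rests on the Dice overlap between predicted and observed final infarct masks. A minimal sketch of that metric on random placeholder masks, not study data:

```python
# Dice similarity coefficient between two binary 3D masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

rng = np.random.default_rng(0)
pred_mask = rng.random((64, 64, 32)) > 0.7   # hypothetical predicted infarct volume
true_mask = rng.random((64, 64, 32)) > 0.7   # hypothetical observed final infarct
print(f"Dice = {dice(pred_mask, true_mask):.2f}")
```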

MedGemma Technical Report

Andrew Sellergren, Sahar Kazemzadeh, Tiam Jaroensri, Atilla Kiraly, Madeleine Traverse, Timo Kohlberger, Shawn Xu, Fayaz Jamil, Cían Hughes, Charles Lau, Justin Chen, Fereshteh Mahvar, Liron Yatziv, Tiffany Chen, Bram Sterling, Stefanie Anna Baby, Susanna Maria Baby, Jeremy Lai, Samuel Schmidgall, Lu Yang, Kejia Chen, Per Bjornsson, Shashir Reddy, Ryan Brush, Kenneth Philbrick, Howard Hu, Howard Yang, Richa Tiwari, Sunny Jansen, Preeti Singh, Yun Liu, Shekoofeh Azizi, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Riviere, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-Bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Elena Buchatskaya, Jean-Baptiste Alayrac, Dmitry Lepikhin, Vlad Feinberg, Sebastian Borgeaud, Alek Andreev, Cassidy Hardin, Robert Dadashi, Léonard Hussenot, Armand Joulin, Olivier Bachem, Yossi Matias, Katherine Chou, Avinatan Hassidim, Kavi Goel, Clement Farabet, Joelle Barral, Tris Warkentin, Jonathon Shlens, David Fleet, Victor Cotruta, Omar Sanseviero, Gus Martins, Phoebe Kirk, Anand Rao, Shravya Shetty, David F. Steiner, Can Kirmizibayrak, Rory Pilgrim, Daniel Golden, Lin Yang

arXiv preprint · Jul 7, 2025
Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment face challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy. Foundation models that perform well on medical tasks and require less task-specific tuning data are critical to accelerate the development of healthcare AI applications. We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text, significantly exceeding the performance of similar-sized generative models and approaching the performance of task-specific models, while maintaining the general capabilities of the Gemma 3 base models. For out-of-distribution tasks, MedGemma achieves 2.6-10% improvement on medical multimodal question answering, 15.5-18.1% improvement on chest X-ray finding classification, and 10.8% improvement on agentic evaluations compared to the base models. Fine-tuning MedGemma further improves performance in subdomains, reducing errors in electronic health record information retrieval by 50% and reaching comparable performance to existing specialized state-of-the-art methods for pneumothorax classification and histopathology patch classification. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP. MedSigLIP powers the visual understanding capabilities of MedGemma and as an encoder achieves comparable or better performance than specialized medical image encoders. Taken together, the MedGemma collection provides a strong foundation of medical image and text capabilities, with potential to significantly accelerate medical research and development of downstream applications. The MedGemma collection, including tutorials and model weights, can be found at https://goo.gle/medgemma.
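
A hedged sketch of loading a MedGemma checkpoint through the Hugging Face transformers image-text-to-text pipeline. The hub id google/medgemma-4b-it, the prompt, and the image path are assumptions based on the report's pointer to https://goo.gle/medgemma, not verified here; access to the weights may additionally require accepting terms on the model page.

```python
# Hedged sketch: querying a MedGemma vision-language model about an image.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",           # multimodal chat-style generation task
    model="google/medgemma-4b-it",  # assumed hub id for the 4B instruction-tuned variant
)
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "chest_xray.png"},          # placeholder image path
        {"type": "text", "text": "Describe the key findings."},
    ],
}]
out = pipe(text=messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])               # model's reply message
```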

Uncovering Neuroimaging Biomarkers of Brain Tumor Surgery with AI-Driven Methods

Carmen Jimenez-Mesa, Yizhou Wan, Guilio Sansone, Francisco J. Martinez-Murcia, Javier Ramirez, Pietro Lio, Juan M. Gorriz, Stephen J. Price, John Suckling, Michail Mamalakis

arXiv preprint · Jul 7, 2025
Brain tumor resection is a complex procedure with significant implications for patient survival and quality of life. Predictions of patient outcomes give clinicians and patients the opportunity to select the most suitable onco-functional balance. In this study, global features derived from structural magnetic resonance imaging in a clinical dataset of 49 pre- and post-surgery patients identified potential biomarkers associated with survival outcomes. We propose a framework that integrates Explainable AI (XAI) with neuroimaging-based feature engineering for survival assessment, offering guidance for surgical decision-making. We also introduce a global explanation optimizer that refines survival-related feature attribution in deep learning models, enhancing interpretability and reliability. Our findings suggest that survival is influenced by alterations in regions associated with cognitive and sensory functions, indicating the importance of preserving areas involved in decision-making and emotional regulation during surgery to improve outcomes. The global explanation optimizer improves both fidelity and comprehensibility of explanations compared to state-of-the-art XAI methods. It effectively identifies survival-related variability, underscoring its relevance in precision medicine for brain tumor treatment.
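
The paper's explanation optimizer is not reproduced here; as a generic stand-in, a hedged sketch of ranking global imaging features by permutation importance against a survival label, the kind of feature attribution such an XAI pipeline refines. All data below are synthetic.

```python
# Rank global MRI-derived features by how much shuffling each one
# degrades a survival classifier (permutation importance).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(49, 8))        # 49 patients x 8 hypothetical global MRI features
y = (X[:, 2] + 0.5 * X[:, 5] + rng.normal(scale=0.5, size=49)) > 0  # toy survival label

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = permutation_importance(clf, X, y, n_repeats=20, random_state=0)
for i in np.argsort(imp.importances_mean)[::-1][:3]:
    print(f"feature {i}: importance {imp.importances_mean[i]:.3f}")
```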

Efficacy of Image Similarity as a Metric for Augmenting Small Dataset Retinal Image Segmentation

Thomas Wallace, Ik Siong Heng, Senad Subasic, Chris Messenger

arXiv preprint · Jul 7, 2025
Synthetic images are an option for augmenting limited medical imaging datasets to improve the performance of various machine learning models. A common metric for evaluating synthetic image quality is the Fréchet Inception Distance (FID), which measures the similarity of two image datasets. In this study we evaluate the relationship between this metric and the improvement which synthetic images, generated by a Progressively Growing Generative Adversarial Network (PGGAN), grant when augmenting Diabetes-related Macular Edema (DME) intraretinal fluid segmentation performed by a U-Net model with limited amounts of training data. We find that the behaviour of augmenting with standard and synthetic images agrees with previously conducted experiments. Additionally, we show that dissimilar (high FID) datasets do not improve segmentation significantly. As FID between the training and augmenting datasets decreases, the augmentation datasets are shown to contribute to significant and robust improvements in image segmentation. Finally, we find that there is significant evidence to suggest that synthetic and standard augmentations follow separate log-normal trends between FID and improvements in model performance, with synthetic data proving more effective than standard augmentation techniques. Our findings show that more similar datasets (lower FID) will be more effective at improving U-Net performance; however, the results also suggest that this improvement may only occur when images are sufficiently dissimilar.
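
For reference, a minimal sketch of the FID computation the study leans on: FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2(C1 C2)^(1/2)), evaluated on the means and covariances of two feature sets. Random vectors stand in for real Inception embeddings here.

```python
# Fréchet Inception Distance between two sets of feature vectors.
import numpy as np
from scipy.linalg import sqrtm

def fid(feats1: np.ndarray, feats2: np.ndarray) -> float:
    mu1, mu2 = feats1.mean(0), feats2.mean(0)
    c1 = np.cov(feats1, rowvar=False)
    c2 = np.cov(feats2, rowvar=False)
    covmean = sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):      # discard tiny imaginary numerical residue
        covmean = covmean.real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(c1 + c2 - 2 * covmean))

rng = np.random.default_rng(0)
feats_real = rng.normal(size=(500, 64))        # stand-in for real-image embeddings
feats_synth = rng.normal(0.1, 1.0, (500, 64))  # stand-in for synthetic-image embeddings
print(f"FID = {fid(feats_real, feats_synth):.2f}")
```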

SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

Chun Xie, Yuichi Yoshii, Itaru Kitahara

arXiv preprint · Jul 7, 2025
X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available on GitHub.
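
A hedged PyTorch sketch of one ingredient described above: conditioning a diffusion backbone on the target viewing angle via a sinusoidal embedding fused with the timestep embedding. The module shape and fusion scheme are illustrative assumptions, not the authors' architecture.

```python
# View-angle conditioning for a diffusion model, as an illustrative module.
import math
import torch
import torch.nn as nn

def sinusoidal_embedding(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Standard sinusoidal embedding of a scalar batch (timestep or angle)."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    args = x[:, None] * freqs[None, :]
    return torch.cat([args.sin(), args.cos()], dim=-1)

class ViewConditioning(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.dim = dim

    def forward(self, t: torch.Tensor, angle: torch.Tensor) -> torch.Tensor:
        emb = torch.cat([sinusoidal_embedding(t, self.dim),
                         sinusoidal_embedding(angle, self.dim)], dim=-1)
        return self.mlp(emb)  # conditioning vector injected into each backbone block

cond = ViewConditioning()(torch.tensor([10.0]), torch.tensor([0.5]))  # timestep, angle (rad)
print(cond.shape)  # torch.Size([1, 256])
```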

Sequential Attention-based Sampling for Histopathological Analysis

Tarun G, Naman Malpani, Gugan Thoppe, Sridharan Devarajan

arXiv preprint · Jul 7, 2025
Deep neural networks are increasingly applied for automated histopathology. Yet, whole-slide images (WSIs) are often acquired at gigapixel sizes, rendering it computationally infeasible to analyze them entirely at high resolution. Diagnostic labels are largely available only at the slide level, because expert annotation of images at a finer (patch) level is both laborious and expensive. Moreover, regions with diagnostic information typically occupy only a small fraction of the WSI, making it inefficient to examine the entire slide at full resolution. Here, we propose SASHA -- Sequential Attention-based Sampling for Histopathological Analysis -- a deep reinforcement learning approach for efficient analysis of histopathological images. First, SASHA learns informative features with a lightweight hierarchical, attention-based multiple instance learning (MIL) model. Second, SASHA samples intelligently and zooms selectively into a small fraction (10-20%) of high-resolution patches to achieve reliable diagnosis. We show that SASHA matches state-of-the-art methods that analyze the WSI fully at high resolution, albeit at a fraction of their computational and memory costs. In addition, it significantly outperforms competing sparse sampling methods. We propose SASHA as an intelligent sampling model for medical imaging challenges that involve automated diagnosis with exceptionally large images containing sparsely informative features.
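
A hedged PyTorch sketch of the gated attention-based MIL pooling (in the style of Ilse et al., 2018) that SASHA's first stage builds on; the feature dimensions and two-class head are illustrative choices, not the paper's exact design.

```python
# Gated attention MIL: score each patch, pool into a slide embedding, classify.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 512, attn_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.attn_V = nn.Linear(feat_dim, attn_dim)
        self.attn_U = nn.Linear(feat_dim, attn_dim)
        self.attn_w = nn.Linear(attn_dim, 1)
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, bag: torch.Tensor):
        # bag: (n_patches, feat_dim) patch features from one slide
        a = self.attn_w(torch.tanh(self.attn_V(bag)) * torch.sigmoid(self.attn_U(bag)))
        weights = torch.softmax(a, dim=0)        # (n_patches, 1) patch importance
        slide_feat = (weights * bag).sum(dim=0)  # attention-weighted slide embedding
        return self.head(slide_feat), weights    # slide logits + patch attention map

logits, attn = AttentionMIL()(torch.randn(1000, 512))  # 1000 hypothetical patch features
print(logits.shape, attn.shape)                        # torch.Size([2]) torch.Size([1000, 1])
```

The attention map doubles as a cheap saliency signal over patches, which is what makes this family of models a natural front end for deciding where to zoom in.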

HGNet: High-Order Spatial Awareness Hypergraph and Multi-Scale Context Attention Network for Colorectal Polyp Detection

Xiaofang Liu, Lingling Sun, Xuqing Zhang, Yuannong Ye, Bin Zhao

arXiv preprint · Jul 7, 2025
Colorectal cancer (CRC) is closely linked to the malignant transformation of colorectal polyps, making early detection essential. However, current models struggle with detecting small lesions, accurately localizing boundaries, and providing interpretable decisions. To address these issues, we propose HGNet, which integrates High-Order Spatial Awareness Hypergraph and Multi-Scale Context Attention. Key innovations include: (1) an Efficient Multi-Scale Context Attention (EMCA) module to enhance lesion feature representation and boundary modeling; (2) the deployment of a spatial hypergraph convolution module before the detection head to capture higher-order spatial relationships between nodes; (3) the application of transfer learning to address the scarcity of medical image data; and (4) Eigen Class Activation Map (Eigen-CAM) for decision visualization. Experimental results show that HGNet achieves 94% accuracy, 90.6% recall, and 90% mAP@0.5, significantly improving small lesion differentiation and clinical interpretability. The source code will be made publicly available upon publication of this paper.
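
For context, a hedged NumPy sketch of a single hypergraph convolution step in the standard HGNN formulation, X' = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X Θ, which is the kind of operation a spatial hypergraph module applies; the toy incidence matrix and sizes below are illustrative, not HGNet's actual module.

```python
# One hypergraph convolution layer over toy node features.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_edges, d_in, d_out = 6, 3, 8, 4
H = (rng.random((n_nodes, n_edges)) > 0.5).astype(float)  # node-hyperedge incidence
W = np.eye(n_edges)                                       # hyperedge weights
X = rng.normal(size=(n_nodes, d_in))                      # input node features
Theta = rng.normal(size=(d_in, d_out))                    # learnable projection (random here)

Dv = np.diag(1.0 / np.sqrt(H @ W @ np.ones(n_edges) + 1e-8))  # node degrees ^ -1/2
De = np.diag(1.0 / (H.sum(axis=0) + 1e-8))                    # hyperedge degrees ^ -1
X_out = Dv @ H @ W @ De @ H.T @ Dv @ X @ Theta                # propagated node features
print(X_out.shape)  # (6, 4)
```

Because each hyperedge connects many nodes at once, one such step mixes information across whole groups of spatial regions rather than only pairwise neighbors, which is the "higher-order" relationship the abstract refers to.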