Latest Papers on Radiology AI. Tags: GenAI

Medical image translation with deep learning: Advances, datasets and perspectives.

Chen J, Ye Z, Zhang R, Li H, Fang B, Zhang LB, Wang W

•papers•Jul 1 2025

Traditional medical image generation often lacks patient-specific clinical information, limiting its clinical utility despite enhancing downstream task performance. In contrast, medical image translation precisely converts images from one modality to another, preserving both anatomical structures and cross-modal features, thus enabling efficient and accurate modality transfer and offering unique advantages for model development and clinical practice. This paper reviews the latest advancements in deep learning (DL)-based medical image translation. Initially, it elaborates on the diverse tasks and practical applications of medical image translation. Subsequently, it provides an overview of fundamental models, including convolutional neural networks (CNNs), transformers, and state space models (SSMs). Additionally, it delves into generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), autoregressive models (ARs), diffusion models, and flow models. Evaluation metrics for assessing translation quality are discussed, emphasizing their importance. Commonly used datasets in this field are also analyzed, highlighting their unique characteristics and applications. Looking ahead, the paper identifies future trends and challenges and proposes research directions and solutions in medical image translation. It aims to serve as a valuable reference and inspiration for researchers, driving continued progress and innovation in this area.

Mixed Modality Image Synthesis Review Concept Academic Lab GenAI

Evaluating a large language model's accuracy in chest X-ray interpretation for acute thoracic conditions.

Ostrovsky AM

•papers•Jul 1 2025

The rapid advancement of artificial intelligence (AI) has great potential to impact healthcare. Chest X-rays are essential for diagnosing acute thoracic conditions in the emergency department (ED), but interpretation delays due to radiologist availability can impact clinical decision-making. AI models, including deep learning algorithms, have been explored for diagnostic support, but the potential of large language models (LLMs) in emergency radiology remains largely unexamined. This study assessed ChatGPT's feasibility in interpreting chest X-rays for acute thoracic conditions commonly encountered in the ED. A subset of 1400 images from the NIH Chest X-ray dataset was analyzed, representing seven pathology categories: Atelectasis, Effusion, Emphysema, Pneumothorax, Pneumonia, Mass, and No Finding. ChatGPT 4.0, utilizing the "X-Ray Interpreter" add-on, was evaluated for its diagnostic performance across these categories. ChatGPT demonstrated high performance in identifying normal chest X-rays, with a sensitivity of 98.9%, specificity of 93.9%, and accuracy of 94.7%. However, the model's performance varied across pathologies. The best results were observed in diagnosing pneumonia (sensitivity 76.2%, specificity 93.7%) and pneumothorax (sensitivity 77.4%, specificity 89.1%), while performance for atelectasis and emphysema was lower. ChatGPT demonstrates potential as a supplementary tool for differentiating normal from abnormal chest X-rays, with promising results for certain pathologies like pneumonia. However, its diagnostic accuracy for more subtle conditions requires improvement. Further research integrating ChatGPT with specialized image recognition models could enhance its performance, offering new possibilities in medical imaging and education.
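The sensitivity, specificity, and accuracy figures quoted in this abstract are the kind of values derived from a per-category confusion matrix. A minimal sketch of that arithmetic follows. The helper name `binary_metrics` and the counts are hypothetical and chosen only for illustration; the abstract does not report raw counts.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for one category versus all others (not study data).
sens, spec, acc = binary_metrics(tp=90, fn=10, tn=180, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

In a multi-category setting such as the one described, each category would be scored one-versus-rest in this way, which is why sensitivity and specificity can differ widely between pathologies.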

X-Ray Classification Chest Retrospective Clinical In Silico Academic Lab GenAI

Dynamic glucose enhanced imaging using direct water saturation.

Knutsson L, Yadav NN, Mohammed Ali S, Kamson DO, Demetriou E, Seidemo A, Blair L, Lin DD, Laterra J, van Zijl PCM

•papers•Jul 1 2025

Dynamic glucose enhanced (DGE) MRI studies employ CEST or spin lock (CESL) to study glucose uptake. Currently, these methods are hampered by low effect size and sensitivity to motion. To overcome this, we propose to utilize exchange-based linewidth (LW) broadening of the direct water saturation (DS) curve of the water saturation spectrum (Z-spectrum) during and after glucose infusion (DS-DGE MRI). To estimate the glucose-infusion-induced LW changes (ΔLW), Bloch-McConnell simulations were performed for normoglycemia and hyperglycemia in blood, gray matter (GM), white matter (WM), CSF, and malignant tumor tissue. Whole-brain DS-DGE imaging was implemented at 3 T using dynamic Z-spectral acquisitions (1.2 s per offset frequency, 38 s per spectrum) and assessed on four brain tumor patients using infusion of 35 g of D-glucose. To assess ΔLW, a deep learning-based Lorentzian fitting approach was used on voxel-based DS spectra acquired before, during, and post-infusion. Area-under-the-curve (AUC) images, obtained from the dynamic ΔLW time curves, were compared qualitatively to perfusion-weighted imaging parametric maps. In simulations, ΔLW was 1.3%, 0.30%, 0.29/0.34%, 7.5%, and 13% in arterial blood, venous blood, GM/WM, malignant tumor tissue, and CSF, respectively. In vivo, ΔLW was approximately 1% in GM/WM, 5% to 20% for different tumor types, and 40% in CSF. The resulting DS-DGE AUC maps clearly outlined lesion areas. DS-DGE MRI is highly promising for assessing D-glucose uptake. Initial results in brain tumor patients show high-quality AUC maps of glucose-induced line broadening and DGE-based lesion enhancement similar and/or complementary to perfusion-weighted imaging.

MRI Segmentation Neurological Retrospective Clinical Clinical Pilot Academic Lab GenAI

The Chest X-Ray: The Ship has Sailed, But Has It?

Iacovino JR

•papers•Jul 1 2025

In the past, the chest X-ray (CXR) was a traditional age and amount requirement used to assess potential mortality risk in life insurance applicants. It fell out of favor due to inconvenience to the applicant, cost, and lack of protective value. With the advent of deep learning techniques, can the results of the CXR, as a requirement, now add additional value to underwriting risk analysis?

X-Ray Classification Chest GenAI

Evaluation of radiology residents' reporting skills using large language models: an observational study.

Atsukawa N, Tatekawa H, Oura T, Matsushita S, Horiuchi D, Takita H, Mitsuyama Y, Omori A, Shimono T, Miki Y, Ueda D

•papers•Jul 1 2025

Large language models (LLMs) have the potential to objectively evaluate radiology resident reports; however, research on their use for feedback in radiology training and assessment of resident skill development remains limited. This study aimed to assess the effectiveness of LLMs in revising radiology reports by comparing them with reports verified by board-certified radiologists and to analyze the progression of residents' reporting skills over time. To identify the LLM that best aligned with human radiologists, 100 reports were randomly selected from 7376 reports authored by nine first-year radiology residents. The reports were evaluated based on six criteria: (1) addition of missing positive findings, (2) deletion of findings, (3) addition of negative findings, (4) correction of the expression of findings, (5) correction of the diagnosis, and (6) proposal of additional examinations or treatments. Reports were segmented into four time-based terms, and 900 reports (450 CT and 450 MRI) were randomly chosen from the initial and final terms of the residents' first year. The revised rates for each criterion were compared between the first and last terms using the Wilcoxon signed-rank test. Among the three LLMs (ChatGPT-4 Omni (GPT-4o), Claude-3.5 Sonnet, and Claude-3 Opus), GPT-4o demonstrated the highest level of agreement with board-certified radiologists. Significant improvements were noted in Criteria 1-3 when comparing reports from the first and last terms (Criteria 1, 2, and 3; P < 0.001, P = 0.023, and P = 0.004, respectively) using GPT-4o. No significant changes were observed for Criteria 4-6. Despite this, all criteria except for Criterion 6 showed progressive enhancement over time. LLMs can effectively provide feedback on commonly corrected areas in radiology reports, enabling residents to objectively identify and improve their weaknesses and monitor their progress. Additionally, LLMs may help reduce the workload of radiologist mentors.

Mixed Modality LLM Radiology Report Retrospective Clinical In Silico Academic Lab GenAI

Denoising Diffusion Probabilistic Model to Simulate Contrast-enhanced spinal MRI of Spinal Tumors: A Multi-Center Study.

Wang C, Zhang S, Xu J, Wang H, Wang Q, Zhu Y, Xing X, Hao D, Lang N

•papers•Jul 1 2025

To generate virtual T1 contrast-enhanced (T1CE) sequences from plain spinal MRI sequences using the denoising diffusion probabilistic model (DDPM) and to compare its performance against one baseline model pix2pix and three advanced models. A total of 1195 consecutive spinal tumor patients who underwent contrast-enhanced MRI at two hospitals were divided into a training set (n = 809, 49 ± 17 years, 437 men), an internal test set (n = 203, 50 ± 16 years, 105 men), and an external test set (n = 183, 52 ± 16 years, 94 men). Input sequences were T1- and T2-weighted images, and T2 fat-saturation images. The output was T1CE images. In the test set, one radiologist read the virtual images and marked all visible enhancing lesions. Results were evaluated using sensitivity (SE) and false discovery rate (FDR). We compared differences in lesion size and enhancement degree between reference and virtual images, and calculated signal-to-noise (SNR) and contrast-to-noise ratios (CNR) for image quality assessment. In the external test set, the mean squared error was 0.0038 ± 0.0065, and structural similarity index 0.78 ± 0.10. Upon evaluation by the reader, the overall SE of the generated T1CE images was 94% with FDR 2%. There was no difference in lesion size or signal intensity ratio between the reference and generated images. The CNR was higher in the generated images than the reference images (9.241 vs. 4.021; P < 0.001). The proposed DDPM demonstrates potential as an alternative to gadolinium contrast in spinal MRI examinations of oncologic patients.
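The reader-study and image-similarity metrics named in this abstract (sensitivity, false discovery rate, mean squared error) follow standard definitions. The sketch below illustrates them under stated assumptions: the function names, lesion counts, and pixel values are hypothetical, and the study's actual evaluation pipeline is not described in the abstract.

```python
def detection_metrics(tp, fn, fp):
    """Return (sensitivity, false discovery rate) for lesion detection."""
    sensitivity = tp / (tp + fn)   # fraction of true lesions that were marked
    fdr = fp / (fp + tp)           # fraction of marked lesions that were false
    return sensitivity, fdr


def mean_squared_error(reference, generated):
    """Mean squared error between two equally sized flat intensity sequences."""
    if len(reference) != len(generated):
        raise ValueError("images must have the same number of pixels")
    return sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)


# Hypothetical lesion counts and normalized pixel intensities (not study data).
se, fdr = detection_metrics(tp=45, fn=5, fp=5)
mse = mean_squared_error([0.0, 0.5, 1.0], [0.0, 0.4, 0.8])
print(f"SE={se:.2f} FDR={fdr:.2f} MSE={mse:.4f}")
```

Note that FDR is computed over the lesions the reader marked, not over all negatives, so it can be reported even when true negatives are undefined, as is typical for lesion-level reading.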

MRI Image Synthesis Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI

The Evolution of Radiology Image Annotation in the Era of Large Language Models.

Flanders AE, Wang X, Wu CC, Kitamura FC, Shih G, Mongan J, Peng Y

•papers•Jul 1 2025

Although there are relatively few diverse, high-quality medical imaging datasets on which to train computer vision artificial intelligence models, even fewer datasets contain expertly classified observations that can be repurposed to train or test such models. The traditional annotation process is laborious and time-consuming. Repurposing annotations and consolidating similar types of annotations from disparate sources has never been practical. Until recently, the use of natural language processing to convert a clinical radiology report into labels required custom training of a language model for each use case. Newer technologies such as large language models have made it possible to generate accurate and normalized labels at scale, using only clinical reports and specific prompt engineering. The combination of automatically generated labels extracted and normalized from reports in conjunction with foundational image models provides a means to create labels for model training. This article provides a short history and review of the annotation and labeling process of medical images, from the traditional manual methods to the newest semiautomated methods that provide a more scalable solution for creating useful models more efficiently. Keywords: Feature Detection, Diagnosis, Semi-supervised Learning © RSNA, 2025.

Mixed Modality Classification Review Concept Academic Lab GenAI Open Dataset

Generative Artificial Intelligence in Prostate Cancer Imaging.

Haque F, Simon BD, Özyörük KB, Harmon SA, Türkbey B

•papers•Jul 1 2025

Prostate cancer (PCa) is the second most common cancer in men and has a significant health and social burden, necessitating advances in early detection, prognosis, and treatment strategies. Improvement in medical imaging has significantly impacted early PCa detection, characterization, and treatment planning. However, with an increasing number of patients with PCa and comparatively fewer PCa imaging experts, interpreting large volumes of imaging data is burdensome, time-consuming, and prone to variability among experts. With the revolutionary advances of artificial intelligence (AI) in medical imaging, image interpretation tasks are becoming easier and exhibit the potential to reduce the workload on physicians. Generative AI (GenAI) is a recently popular sub-domain of AI that creates new data instances, often to resemble patterns and characteristics of the real data. This new field of AI has shown significant potential for generating synthetic medical images with diverse and clinically relevant information. In this narrative review, we discuss the basic concepts of GenAI and cover the recent application of GenAI in the PCa imaging domain. This review will help readers understand where the PCa research community stands in terms of various medical image applications like generating multi-modal synthetic images, image quality improvement, PCa detection, classification, and digital pathology image generation. We also address the current safety concerns, limitations, and challenges of GenAI for technical and clinical adaptation, as well as the limitations of current literature, potential solutions, and future directions with GenAI for the PCa community.

Mixed Modality Image Synthesis Abdominal Review Concept Academic Lab GenAI

Enhancing Magnetic Resonance Imaging (MRI) Report Comprehension in Spinal Trauma: Readability Analysis of AI-Generated Explanations for Thoracolumbar Fractures.

Sing DC, Shah KS, Pompliano M, Yi PH, Velluto C, Bagheri A, Eastlack RK, Stephan SR, Mundis GM

•papers•Jul 1 2025

Magnetic resonance imaging (MRI) reports are challenging for patients to interpret and may subject patients to unnecessary anxiety. The advent of advanced artificial intelligence (AI) large language models (LLMs), such as GPT-4o, holds promise for translating complex medical information into layman's terms. This paper aims to evaluate the accuracy, helpfulness, and readability of GPT-4o in explaining MRI reports of patients with thoracolumbar fractures. MRI reports of 20 patients presenting with thoracic or lumbar vertebral body fractures were obtained. GPT-4o was prompted to explain the MRI report in layman's terms. The generated explanations were then presented to 7 board-certified spine surgeons for evaluation of their helpfulness and accuracy. The MRI report text and GPT-4o explanations were then analyzed to grade the readability of the texts using the Flesch Reading Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL) Scale. The layman explanations provided by GPT-4o were found to be helpful by all surgeons in 17 cases, with 6 of 7 surgeons finding the information helpful in the remaining 3 cases. ChatGPT-generated layman reports were rated as "accurate" by all 7 surgeons in 11/20 cases (55%). In an additional 5/20 cases (25%), 6 out of 7 surgeons agreed on their accuracy. In the remaining 4/20 cases (20%), accuracy ratings varied, with 4 or 5 surgeons considering them accurate. Review of surgeon feedback on inaccuracies revealed that the radiology reports were often insufficiently detailed. The mean FRES score of the MRI reports was significantly lower than that of the GPT-4o explanations (32.15, SD 15.89 vs 53.9, SD 7.86; P<.001). The mean FKGL score of the MRI reports trended higher compared to the GPT-4o explanations (11th-12th grade vs 10th-11th grade level; P=.11). Overall helpfulness and readability ratings for AI-generated summaries of MRI reports were high, with few inaccuracies recorded. This study demonstrates the potential of GPT-4o to serve as a valuable tool for enhancing patient comprehension of MRI report findings.
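FRES and FKGL are both computed from average sentence length and average syllables per word, using the standard published coefficients. The sketch below shows the two formulas under stated assumptions: the function names are hypothetical, the counts are illustrative, and word, sentence, and syllable counting (which real tools estimate heuristically) is taken as given. The abstract does not say which software the authors used.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not taken from the study).
fres = flesch_reading_ease(words=100, sentences=5, syllables=150)
fkgl = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"FRES={fres:.2f} FKGL={fkgl:.2f}")
```

Because the two scores move in opposite directions, a text that becomes easier to read shows a higher FRES and a lower FKGL, which matches the direction of the differences reported in the abstract.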

MRI LLM Radiology Report Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI

A Review of the Opportunities and Challenges with Large Language Models in Radiology: The Road Ahead.

Soni N, Ora M, Agarwal A, Yang T, Bathla G

•papers•Jul 1 2025

In recent years, generative artificial intelligence (AI), particularly large language models (LLMs) and their multimodal counterparts, multimodal large language models, including vision language models, have generated considerable interest in the global AI discourse. LLMs, or pre-trained language models (such as ChatGPT, Med-PaLM, LLaMA), are neural network architectures trained on extensive text data, excelling in language comprehension and generation. Multimodal LLMs, a subset of foundation models, are trained on multimodal data sets, integrating text with another modality, such as images, to better learn universal representations akin to human cognition. This versatility enables them to excel in tasks like chatbots, translation, and creative writing while facilitating knowledge sharing through transfer learning, federated learning, and synthetic data creation. Several of these models can have potentially appealing applications in the medical domain, including, but not limited to, enhancing patient care by processing patient data; summarizing reports and relevant literature; providing diagnostic, treatment, and follow-up recommendations; and ancillary tasks like coding and billing. As radiologists enter this promising but uncharted territory, it is imperative for them to be familiar with the basic terminology and processes of LLMs. Herein, we present an overview of the LLMs and their potential applications and challenges in the imaging domain.

Mixed Modality LLM Radiology Report Review Concept Academic Lab GenAI Ethics
