Page 3 of 658 results

Diagnosis of early idiopathic pulmonary fibrosis: current status and future perspective.

Wang X, Xia X, Hou Y, Zhang H, Han W, Sun J, Li F

PubMed · May 19 2025
The standard approach to diagnosing idiopathic pulmonary fibrosis (IPF) includes identifying the usual interstitial pneumonia (UIP) pattern via high-resolution computed tomography (HRCT) or lung biopsy and excluding known causes of interstitial lung disease (ILD). However, limitations of manual interpretation of lung imaging, along with other factors such as a lack of relevant knowledge and non-specific symptoms, have hindered the timely diagnosis of IPF. This review proposes a definition of early IPF, emphasizes the urgency of diagnosing it, and highlights current diagnostic strategies and future prospects. The integration of artificial intelligence (AI), specifically machine learning (ML) and deep learning (DL), is revolutionizing the diagnosis of early IPF by standardizing and accelerating the interpretation of thoracic images. Innovative bronchoscopic techniques such as transbronchial lung cryobiopsy (TBLC), genomic classifiers, and endobronchial optical coherence tomography (EB-OCT) provide less invasive diagnostic alternatives. In addition, chest auscultation, serum biomarkers, and susceptibility genes are pivotal indicators for early diagnosis. Ongoing research is essential for refining diagnostic methods and treatment strategies for early IPF.

ChatGPT-4-Driven Liver Ultrasound Radiomics Analysis: Advantages and Drawbacks Compared to Traditional Techniques.

Sultan L, Venkatakrishna SSB, Anupindi S, Andronikou S, Acord M, Otero H, Darge K, Sehgal C, Holmes J

PubMed · May 18 2025
Artificial intelligence (AI) is transforming medical imaging, with large language models such as ChatGPT-4 emerging as potential tools for automated image interpretation. While AI-driven radiomics has shown promise in diagnostic imaging, the efficacy of ChatGPT-4 in liver ultrasound analysis remains largely unexamined. This study evaluates the capability of ChatGPT-4 in liver ultrasound radiomics, specifically its ability to differentiate fibrosis, steatosis, and normal liver tissue, compared to conventional image analysis software. Seventy grayscale ultrasound images from a preclinical liver disease model, including fibrosis (n=31), fatty liver (n=18), and normal liver (n=21), were analyzed. ChatGPT-4 extracted texture features, which were compared to those obtained using Interactive Data Language (IDL), a traditional image analysis software. One-way ANOVA was used to identify statistically significant features differentiating liver conditions, and logistic regression models were employed to assess diagnostic performance. ChatGPT-4 extracted nine key textural features (echo intensity, heterogeneity, skewness, kurtosis, contrast, homogeneity, dissimilarity, angular second moment, and entropy), all of which differed significantly across liver conditions (p < 0.05). Among individual features, echo intensity achieved the highest F1-score (0.85). When features were combined, ChatGPT-4 attained 76% accuracy and 83% sensitivity in classifying liver disease. ROC analysis demonstrated strong discriminatory performance, with AUC values of 0.75 for fibrosis, 0.87 for normal liver, and 0.97 for steatosis. Compared to IDL, ChatGPT-4 exhibited slightly lower sensitivity (0.83 vs. 0.89) but showed moderate correlation (R = 0.68, p < 0.0001) with IDL-derived features. However, it significantly outperformed IDL in processing efficiency, reducing analysis time by 40% and highlighting its potential for high-throughput radiomic analysis.
Despite slightly lower sensitivity than IDL, ChatGPT-4 proved highly feasible for ultrasound radiomics, offering faster processing, high-throughput analysis, and automated multi-image evaluation. These findings support its potential integration into AI-driven imaging workflows, with further refinements needed to enhance feature reproducibility and diagnostic accuracy.
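Several of the texture features named above (echo intensity, skewness, kurtosis, entropy) are first-order statistics of the grey-level distribution; a minimal numpy sketch of how such features could be computed is below. The synthetic patch and the feature subset are illustrative assumptions, not the study's data or its actual extraction pipeline.

```python
import numpy as np

def first_order_texture_features(img):
    """First-order texture statistics of a grayscale image (uint8, values 0-255)."""
    x = img.astype(np.float64).ravel()
    mean = x.mean()                                  # echo intensity (mean brightness)
    std = x.std()
    skew = ((x - mean) ** 3).mean() / std ** 3       # asymmetry of the intensity distribution
    kurt = ((x - mean) ** 4).mean() / std ** 4 - 3   # excess kurtosis (peakedness)
    # Shannon entropy of the grey-level histogram
    p = np.bincount(img.ravel(), minlength=256) / x.size
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    return {"echo_intensity": mean, "skewness": skew,
            "kurtosis": kurt, "entropy": entropy}

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in for an ultrasound patch
feats = first_order_texture_features(img)
```

Second-order features such as contrast, homogeneity, dissimilarity, and angular second moment are conventionally derived from a grey-level co-occurrence matrix rather than the raw histogram.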

Evaluating the Performance of Reasoning Large Language Models on Japanese Radiology Board Examination Questions.

Nakaura T, Takamure H, Kobayashi N, Shiraishi K, Yoshida N, Nagayama Y, Uetani H, Kidoh M, Funama Y, Hirai T

PubMed · May 17 2025
This study evaluates the performance, cost, and processing time of OpenAI's reasoning large language models (LLMs) (o1-preview, o1-mini) and their base models (GPT-4o, GPT-4o-mini) on Japanese radiology board examination questions. A total of 210 questions from the 2022-2023 official board examinations of the Japan Radiological Society were presented to each of the four LLMs. Performance was evaluated by calculating the percentage of correctly answered questions within six predefined radiology subspecialties. The total cost and processing time for each model were also recorded. The McNemar test was used to assess the statistical significance of differences in accuracy between paired model responses. The o1-preview achieved the highest accuracy (85.7%), significantly outperforming GPT-4o (73.3%, P<.001). Similarly, o1-mini (69.5%) performed significantly better than GPT-4o-mini (46.7%, P<.001). Across all radiology subspecialties, o1-preview consistently ranked highest. However, reasoning models incurred substantially higher costs (o1-preview: $17.10, o1-mini: $2.58) compared to their base counterparts (GPT-4o: $0.496, GPT-4o-mini: $0.04), and their processing times were approximately 3.7 and 1.2 times longer, respectively. Reasoning LLMs demonstrated markedly superior performance in answering radiology board exam questions compared to their base models, albeit at a substantially higher cost and increased processing time.
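The McNemar test used above compares paired model answers through their discordant pairs (questions one model answered correctly and the other did not). A minimal sketch of the statistic with a continuity correction follows; the discordant counts are invented for illustration, not taken from the study.

```python
def mcnemar(b, c):
    """McNemar chi-square statistic with continuity correction for paired
    binary outcomes: b = items model A got right and model B got wrong,
    c = the reverse. Compared against a chi-square(1) distribution."""
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical discordant counts for two models on 210 questions (illustrative only)
stat = mcnemar(b=30, c=4)
# chi-square(1) critical value at alpha = 0.05 is 3.841
significant = stat > 3.841
```

When b + c is small, an exact binomial test on the discordant pairs is usually preferred over the chi-square approximation.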

Exploring interpretable echo analysis using self-supervised parcels.

Majchrowska S, Hildeman A, Mokhtari R, Diethe T, Teare P

PubMed · May 17 2025
The application of AI for predicting critical heart failure endpoints using echocardiography is a promising avenue to improve patient care and treatment planning. However, fully supervised training of deep learning models in medical imaging requires a substantial amount of labelled data, posing significant challenges due to the need for skilled medical professionals to annotate image sequences. Our study addresses this limitation by exploring the potential of self-supervised learning, emphasising interpretability, robustness, and safety as crucial factors in cardiac imaging analysis. We leverage self-supervised learning on a large unlabelled dataset, facilitating the discovery of features applicable to various downstream tasks. The backbone model not only generates informative features for training smaller models using simple techniques but also produces features that are interpretable by humans. The study employs a modified Self-supervised Transformer with Energy-based Graph Optimisation (STEGO) network on top of self-DIstillation with NO labels (DINO) as a backbone model, pre-trained on diverse medical and non-medical data. This approach facilitates the generation of self-segmented outputs, termed "parcels", which identify distinct anatomical sub-regions of the heart. Our findings highlight the robustness of these self-learned parcels across diverse patient profiles and phases of the cardiac cycle. Moreover, these parcels offer high interpretability and effectively encapsulate clinically relevant cardiac substructures. We conduct a comprehensive evaluation of the proposed self-supervised approach on publicly available datasets, demonstrating its adaptability to a wide range of requirements.
Our results underscore the potential of self-supervised learning to address labelled data scarcity in medical imaging, offering a path to improve cardiac imaging analysis and enhance the efficiency and interpretability of diagnostic procedures, thus positively impacting patient care and clinical decision-making.

Feasibility of improving vocal fold pathology image classification with synthetic images generated by DDPM-based GenAI: a pilot study.

Khazrak I, Zainaee S, M Rezaee M, Ghasemi M, C Green R

PubMed · May 17 2025
Voice disorders (VD) are often linked to vocal fold structural pathologies (VFSP). Laryngeal imaging plays a vital role in assessing VFSPs and VD in clinical and research settings, but challenges like scarce and imbalanced datasets can limit the generalizability of findings. Denoising Diffusion Probabilistic Models (DDPMs), a subtype of generative AI, have gained attention for their ability to generate high-quality and realistic synthetic images to address these challenges. This study explores the feasibility of improving VFSP image classification by generating synthetic images using DDPMs. 404 laryngoscopic images depicting VF without and with VFSP were included. DDPMs were used to generate synthetic images to augment the original dataset. Two convolutional neural network architectures, VGG16 and ResNet50, were applied for model training. The models were initially trained only on the original dataset and then on the augmented datasets. Evaluation metrics were analyzed to assess the performance of the models for both binary classification (with/without VFSPs) and multi-class classification (seven specific VFSPs). Realistic and high-quality synthetic images were generated for dataset augmentation. The models failed to converge when trained only on the original dataset but converged successfully, with low loss and high accuracy, when trained on the augmented datasets. The best performance for both binary and multi-class classification was achieved when the models were trained on an augmented dataset. Generating realistic images of VFSPs using DDPMs is feasible, can enhance the classification of VFSPs by an AI model, and may support VD screening and diagnosis.
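DDPMs of the kind used here synthesize images by learning to reverse a fixed Gaussian noising process. The closed-form forward step, sampling x_t directly from x_0, can be sketched as below; the linear beta schedule and the random stand-in image are assumptions for illustration, not the study's configuration.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I),
    where abar_t is the cumulative product of (1 - beta) up to step t."""
    abar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * noise, abar

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # a common linear schedule (an assumption)
x0 = rng.standard_normal((32, 32))     # stand-in for a normalized laryngoscopic image
xt, abar = forward_diffuse(x0, 999, betas, rng)
```

At the final step abar is close to zero, so x_t is nearly pure noise; generation runs this process in reverse, denoising step by step with a trained network.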

EScarcityS: A framework for enhancing medical image classification performance in scenarios with scarce trainable samples.

Wang T, Dai Q, Xiong W

PubMed · May 16 2025
In the field of healthcare, the acquisition and annotation of medical images present significant challenges, resulting in a scarcity of trainable samples. This data limitation hinders the performance of deep learning models, creating bottlenecks in clinical applications. To address this issue, we construct a framework (EScarcityS) aimed at enhancing the success rate of disease diagnosis in scenarios where trainable medical images are scarce. Firstly, considering that Transformer-based deep learning networks rely on a large amount of trainable data, this study takes into account the unique characteristics of pathological regions. By extracting the feature representations of all particles in medical images at different granularities, a multi-granularity Transformer network (MGVit) is designed. This network leverages additional prior knowledge to assist the Transformer network during training, thereby reducing the data requirement to some extent. Next, the importance maps of particles at different granularities, generated by MGVit, are fused to construct disease probability maps corresponding to the images. Based on these maps, a disease-probability-map-guided diffusion generation model is designed to generate more realistic and interpretable synthetic data. Subsequently, authentic and synthetic data are mixed and used to retrain MGVit, aiming to enhance the accuracy of medical image classification when trainable medical images are scarce. Finally, we conducted detailed experiments on four real medical image datasets to validate the effectiveness of EScarcityS and its specific modules.

High-Performance Prompting for LLM Extraction of Compression Fracture Findings from Radiology Reports.

Kanani MM, Monawer A, Brown L, King WE, Miller ZD, Venugopal N, Heagerty PJ, Jarvik JG, Cohen T, Cross NM

PubMed · May 16 2025
Extracting information from radiology reports can provide critical data to empower many radiology workflows. For spinal compression fractures, these data can facilitate evidence-based care for at-risk populations. Manual extraction from free-text reports is laborious and error-prone. Large language models (LLMs) have shown promise; however, fine-tuning strategies to optimize performance on specific tasks can be resource-intensive, and a variety of prompting strategies have achieved similar results with fewer demands. Our study pioneers the use of Meta's Llama 3.1, together with prompt-based strategies, for automated extraction of compression fracture findings from free-text radiology reports, outputting structured data without model training. We tested performance on a time-based sample of CT exams covering the spine from 2/20/2024 to 2/22/2024 acquired across our healthcare enterprise (637 anonymized reports, age 18-102, 47% female). Ground-truth annotations were manually generated and compared against the performance of three models (Llama 3.1 70B, Llama 3.1 8B, and Vicuna 13B) with nine different prompting configurations, for a total of 27 model/prompt experiments. The highest F1 score (0.91) was achieved by the 70B Llama 3.1 model when provided with a radiologist-written background, with similar results when the background was written by a separate LLM (0.86). The addition of few-shot examples to these prompts had variable impact on F1 measurements (0.89 and 0.84, respectively). Comparable ROC-AUC and PR-AUC performance was observed. Our work demonstrates that an open-weights LLM can excel at extracting compression fracture findings from free-text radiology reports using prompt-based techniques, without requiring extensive manually labeled examples for model training.
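The F1 scores reported above combine precision and recall over the extracted findings; a minimal sketch of the computation from raw counts is below. The counts are invented for illustration and do not come from the study.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts of
    true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical extraction tally: 100 findings correctly extracted,
# 10 spurious extractions, 10 missed findings
score = f1_score(tp=100, fp=10, fn=10)
```

Because F1 ignores true negatives, it suits extraction tasks where most report sentences contain no finding at all.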

Automated high precision PCOS detection through a segment anything model on super resolution ultrasound ovary images.

Reka S, Praba TS, Prasanna M, Reddy VNN, Amirtharajan R

PubMed · May 15 2025
Polycystic ovary syndrome (PCOS) is a multifaceted disorder that often affects the ovarian morphology of women of reproductive age, resulting in the development of numerous cysts on the ovaries. PCOS is typically diagnosed with ultrasound imaging, which helps clinicians assess the size, shape, and presence of cysts in the ovaries. Nevertheless, manual ultrasound image analysis is often challenging and time-consuming, resulting in inter-observer variability. To effectively treat PCOS and prevent its long-term effects, prompt and accurate diagnosis is crucial. In such cases, a prediction model based on deep learning can help physicians by streamlining the diagnosis procedure, reducing time and potential errors. This article proposes a novel integrated approach, QEI-SAM (Quality Enhanced Image - Segment Anything Model), for enhancing image quality and ovarian cyst segmentation for accurate prediction. Generative adversarial networks (GANs) and convolutional neural networks (CNNs) are the cutting-edge innovations that have supported the system in attaining the expected result. The proposed QEI-SAM model uses Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) for image enhancement, increasing the resolution, sharpening the edges, and restoring the finer structure of the ultrasound ovary images; it achieved an SSIM of 0.938, a PSNR of 38.60, and an LPIPS of 0.0859. It then incorporates the Segment Anything Model (SAM) to segment ovarian cysts, achieving a Dice coefficient of 0.9501 and an IoU score of 0.9050. Furthermore, the convolutional neural networks ResNet 50, ResNet 101, VGG 16, VGG 19, AlexNet, and Inception v3 were implemented to diagnose PCOS promptly. Finally, VGG 19 achieved the highest accuracy of 99.31%.
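The Dice coefficient and IoU quoted for the segmentation step are both overlap measures between a predicted mask and a ground-truth mask, related by Dice = 2·IoU / (1 + IoU). A minimal numpy sketch with toy masks (purely illustrative, not the study's data):

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and IoU for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    iou = inter / union
    return dice, iou

# Two overlapping 4x4 squares standing in for predicted and true cyst masks
pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
gt = np.zeros((8, 8), dtype=bool);   gt[3:7, 3:7] = True
dice, iou = dice_iou(pred, gt)
```

Dice weights the overlap against the average mask size, IoU against the union, so Dice is always the larger of the two for partial overlaps.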

MR Fingerprinting for Imaging Brain Hemodynamics and Oxygenation.

Coudert T, Delphin A, Barrier A, Barbier EL, Lemasson B, Warnking JM, Christen T

PubMed · May 15 2025
Over the past decade, several studies have explored the potential of magnetic resonance fingerprinting (MRF) for the quantification of brain hemodynamics, oxygenation, and perfusion. Recent advances in simulation models and reconstruction frameworks have also significantly enhanced the accuracy of vascular parameter estimation. This review provides an overview of key vascular MRF studies, emphasizing advancements in geometrical models for vascular simulations, novel sequences, and state-of-the-art reconstruction techniques incorporating machine learning and deep learning algorithms. Both pre-clinical and clinical applications are discussed. Based on these findings, we outline future directions and development areas that need to be addressed to facilitate their clinical translation. EVIDENCE LEVEL: N/A. TECHNICAL EFFICACY: Stage 1.
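Classical MRF reconstruction matches each measured signal evolution to a precomputed dictionary of simulated fingerprints by maximum normalized inner product, then reads the tissue parameters off the best-matching entry. A toy sketch of the matching step is below; the random dictionary and signal are purely illustrative.

```python
import numpy as np

def match_fingerprint(signal, dictionary):
    """Return the index of the dictionary entry with the highest
    cosine similarity (normalized inner product) to the signal."""
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    s = signal / np.linalg.norm(signal)
    return int(np.argmax(d @ s))

rng = np.random.default_rng(0)
# 500 simulated fingerprints, 100 time points each (stand-ins for Bloch simulations)
dictionary = rng.standard_normal((500, 100))
true_idx = 123
signal = 0.7 * dictionary[true_idx] + 0.1 * rng.standard_normal(100)  # noisy "acquisition"
best = match_fingerprint(signal, dictionary)
```

The deep learning reconstructions discussed in the review replace or accelerate this exhaustive dictionary search with a learned mapping from signal to parameters.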

Leveraging Vision Transformers in Multimodal Models for Retinal OCT Analysis.

Feretzakis G, Karakosta C, Gkoulalas-Divanis A, Bisoukis A, Boufeas IZ, Bazakidou E, Sakagianni A, Kalles D, Verykios VS

PubMed · May 15 2025
Optical Coherence Tomography (OCT) has become an indispensable imaging modality in ophthalmology, providing high-resolution cross-sectional images of the retina. Accurate classification of OCT images is crucial for diagnosing retinal diseases such as Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME). This study explores the efficacy of various deep learning models, including convolutional neural networks (CNNs) and Vision Transformers (ViTs), in classifying OCT images. We also investigate the impact of integrating metadata (patient age, sex, eye laterality, and year) into the classification process, even when a significant portion of the metadata is missing. Our results demonstrate that multimodal models leveraging both image and metadata inputs, such as the Multimodal ResNet18, can achieve performance competitive with image-only models, such as DenseNet121. Notably, DenseNet121 and Multimodal ResNet18 achieved the highest accuracy of 95.16%, with DenseNet121 showing a slightly higher F1-score of 0.9313. The multimodal ViT-based model also demonstrated promising results, achieving an accuracy of 93.22% and indicating the potential of ViTs in medical image analysis, especially for handling complex multimodal data.
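One common way to handle partially missing metadata of the kind described is zero-filling plus a per-field missingness indicator, concatenated with the image embedding before the classification head. A minimal numpy sketch; the embedding size and field layout are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def fuse(image_feats, metadata, missing_mask):
    """Concatenate image features with metadata, zero-filling missing
    fields and appending a per-field missingness indicator."""
    meta = np.where(missing_mask, 0.0, metadata)  # zero out missing fields
    return np.concatenate([image_feats, meta, missing_mask.astype(float)], axis=-1)

img = np.random.default_rng(0).standard_normal(512)  # e.g. a ResNet18 embedding
meta = np.array([63.0, 1.0, 0.0, 2021.0])            # age, sex, laterality, year
mask = np.array([False, False, True, False])         # laterality unknown here
fused = fuse(img, meta, mask)                        # length 512 + 4 + 4 = 520
```

The indicator lets the downstream classifier distinguish a genuinely zero-valued field from an absent one, rather than silently conflating the two.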