Latest Papers on Radiology AI. Tags: GenAI.

Multimodal Foundation Models for Early Disease Detection

Md Talha Mohsin, Ismail Abdulrashid

•preprint•Oct 2 2025

Healthcare generates diverse streams of data, including electronic health records (EHR), medical imaging, genetics, and ongoing monitoring from wearable devices. Traditional diagnostic models frequently analyze these sources in isolation, which constrains their capacity to identify cross-modal correlations essential for early disease diagnosis. Our research presents a multimodal foundation model that consolidates diverse patient data through an attention-based transformer framework. Dedicated encoders first project each modality into a shared latent space; the resulting embeddings are then combined using multi-head attention and residual normalization. The architecture is designed for pretraining on many tasks, which makes it easy to adapt to new diseases and datasets with little extra work. We provide an experimental strategy that uses benchmark datasets in oncology, cardiology, and neurology, with the goal of testing early detection tasks. Beyond technical performance, the framework includes data governance and model management tools to improve transparency, reliability, and clinical interpretability. The proposed method works toward a single foundation model for precision diagnostics, which could improve the accuracy of predictions and help doctors make decisions.
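The fusion step described in this abstract (per-modality encoders, a shared latent space, multi-head attention, then a residual connection with normalization) can be illustrated with a small stdlib-only sketch. This is not the authors' implementation: the weights are random and untrained, the feature vectors are made up, the output projection that a full transformer layer would have is omitted, and "residual normalization" is assumed here to mean a residual add followed by layer normalization. All function names are hypothetical.

```python
import math
import random

def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def random_matrix(rows, cols, rng):
    """Untrained stand-in for learned weights (assumption for illustration)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def encode(features_by_modality, latent_dim, rng):
    """One linear encoder per modality, all mapping into the same latent size."""
    tokens = {}
    for name, features in features_by_modality.items():
        encoder = random_matrix(latent_dim, len(features), rng)
        tokens[name] = matvec(encoder, features)
    return tokens

def fuse(tokens, num_heads, rng):
    """Multi-head attention across modality tokens, then residual + layer norm."""
    names = list(tokens)
    latent_dim = len(tokens[names[0]])
    head_dim = latent_dim // num_heads
    attended = {name: [] for name in names}
    for _ in range(num_heads):
        w_q, w_k, w_v = (random_matrix(head_dim, latent_dim, rng) for _ in range(3))
        queries = {n: matvec(w_q, tokens[n]) for n in names}
        keys = {n: matvec(w_k, tokens[n]) for n in names}
        values = {n: matvec(w_v, tokens[n]) for n in names}
        for n in names:
            # Scaled dot-product scores of modality n against every modality.
            scores = [
                sum(q * k for q, k in zip(queries[n], keys[m])) / math.sqrt(head_dim)
                for m in names
            ]
            weights = softmax(scores)
            head = [
                sum(w * values[m][j] for w, m in zip(weights, names))
                for j in range(head_dim)
            ]
            attended[n].extend(head)  # concatenate heads
    return {
        n: layer_norm([t + a for t, a in zip(tokens[n], attended[n])])
        for n in names
    }

rng = random.Random(0)
# Hypothetical per-patient feature vectors of different lengths per modality.
patient = {
    "ehr": [0.2, 1.0, 0.0, 0.5, 0.7],
    "imaging": [0.1, 0.9, 0.3, 0.4, 0.8, 0.6, 0.2, 0.5],
    "genetics": [1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
    "wearable": [0.6, 0.7, 0.65, 0.72],
}
fused = fuse(encode(patient, latent_dim=8, rng=rng), num_heads=2, rng=rng)
```

Each modality ends up as an 8-dimensional vector that has attended to all the others, which is the property that lets a downstream head pick up cross-modal correlations.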

Mixed Modality Classification Methodology In Silico GenAI

Generating Findings for Jaw Cysts in Dental Panoramic Radiographs Using GPT-4o: Building a Two-Stage Self-Correction Loop with Structured Output (SLSO) Framework

Nanaka Hosokawa, Ryo Takahashi, Tomoya Kitano, Yukihiro Iida, Chisako Muramatsu, Tatsuro Hayashi, Yuta Seino, Xiangrong Zhou, Takeshi Hara, Akitoshi Katsumata, Hiroshi Fujita

•preprint•Oct 2 2025

In this study, we utilized the multimodal capabilities of OpenAI GPT-4o to automatically generate jaw cyst findings on dental panoramic radiographs. To improve accuracy, we constructed a Self-correction Loop with Structured Output (SLSO) framework and verified its effectiveness. A 10-step process was implemented for 22 cases of jaw cysts, including image input and analysis, structured data generation, tooth number extraction and consistency checking, iterative regeneration when inconsistencies were detected, and finding generation with subsequent restructuring and consistency verification. A comparative experiment was conducted using the conventional Chain-of-Thought (CoT) method across seven evaluation items: transparency, internal structure, borders, root resorption, tooth movement, relationships with other structures, and tooth number. The results showed that the proposed SLSO framework improved output accuracy for many items, with 66.9%, 33.3%, and 28.6% improvement rates for tooth number, tooth movement, and root resorption, respectively. In the successful cases, a consistently structured output was achieved after up to five regenerations. Although statistical significance was not reached because of the small size of the dataset, the overall SLSO framework enforced negative finding descriptions, suppressed hallucinations, and improved tooth number identification accuracy. However, accurate identification of extensive lesions spanning multiple teeth remains limited, and further refinement is required to enhance overall performance and move toward a practical finding generation system.
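The regenerate-until-consistent idea in this abstract can be sketched as a short control loop. This is a minimal illustration only, not the authors' 10-step pipeline: the model call is replaced by a stub, the function names are hypothetical, and tooth numbers are assumed to be written in two-digit FDI notation (11 to 48), which the abstract does not state.

```python
import re

MAX_REGENERATIONS = 5  # the abstract reports success after up to five regenerations

def extract_tooth_numbers(text):
    """Pull two-digit tooth numbers out of free text (FDI notation assumed)."""
    return sorted({int(n) for n in re.findall(r"\b([1-4][1-8])\b", text)})

def is_consistent(structured, finding_text):
    """The finding text must mention exactly the teeth listed in the structured data."""
    return extract_tooth_numbers(finding_text) == sorted(structured["tooth_numbers"])

def self_correction_loop(generate, image, max_regenerations=MAX_REGENERATIONS):
    """Call `generate` until its two outputs agree, or give up.

    `generate(image, feedback)` is a hypothetical stand-in for a multimodal
    model call; it returns (structured_dict, finding_text).
    """
    feedback = None
    for attempt in range(max_regenerations + 1):
        structured, finding_text = generate(image, feedback)
        if is_consistent(structured, finding_text):
            return {
                "structured": structured,
                "finding": finding_text,
                "regenerations": attempt,
            }
        feedback = "Tooth numbers in the finding do not match the structured output."
    return None  # no consistent output within the regeneration budget

def stub_model(image, feedback):
    """Fake model: inconsistent on the first call, consistent once given feedback."""
    structured = {"tooth_numbers": [36, 37], "root_resorption": "absent"}
    if feedback is None:
        text = "Radiolucent lesion near tooth 36; no root resorption."
    else:
        text = "Radiolucent lesion spanning teeth 36 and 37; no root resorption."
    return structured, text

result = self_correction_loop(stub_model, image=b"")
print(result["regenerations"])  # 1
```

Checking a machine-readable field against the free-text finding gives the loop a concrete stopping condition instead of relying on the model to judge its own output.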

X-Ray Report Generation Methodology Prototype Academic Lab GenAI

Evaluating GPT-4o for emergency disposition of complex respiratory cases with pulmonology consultation: a diagnostic accuracy study.

Yıldırım C, Aykut A, Günsoy E, Öncül MV

•papers•Oct 2 2025

Large Language Models (LLMs), such as GPT-4o, are increasingly investigated for clinical decision support in emergency medicine. However, their real-world performance in disposition prediction remains insufficiently studied. This study evaluated the diagnostic accuracy of GPT-4o in predicting emergency department (ED) disposition (discharge, ward admission, or ICU admission) in complex emergency respiratory cases requiring pulmonology consultation and chest CT. We conducted a retrospective observational study in a tertiary ED between November 2024 and February 2025. We included ED patients with complex respiratory presentations who underwent pulmonology consultation and chest CT, representing a selective high-acuity subgroup rather than the general ED respiratory population. GPT-4o was prompted to predict the most appropriate ED disposition using three progressively enriched input models: Model 1 (age, sex, oxygen saturation, home oxygen therapy, and venous blood gas parameters); Model 2 (Model 1 plus laboratory data); and Model 3 (Model 2 plus chest CT findings). Model performance was assessed using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Among the 221 patients included, 69.2% were admitted to the ward, 9.0% to the intensive care unit (ICU), and 21.7% were discharged. For hospital admission prediction, Model 3 demonstrated the highest sensitivity (91.9%) and overall accuracy (76.5%), but the lowest specificity (20.8%). In contrast, for discharge prediction, Model 3 achieved the highest specificity (91.9%) but the lowest sensitivity (20.8%). Numerical improvements were observed across models, but none reached statistical significance (all p > 0.22). Model 1 therefore performed comparably to Models 2-3 while being less complex.
Among patients who were discharged despite GPT-4o predicting admission, the 14-day ED re-presentation rates were 23.8% (5/21) for Model 1, 30.0% (9/30) for Model 2, and 28.9% (11/38) for Model 3. GPT-4o demonstrated high sensitivity in identifying ED patients requiring hospital admission, particularly those needing intensive care, when provided with progressively enriched clinical input. However, its low sensitivity for discharge prediction resulted in frequent overtriage, limiting its utility for autonomous decision-making. This proof-of-concept study demonstrates GPT-4o's capacity to stratify disposition decisions in complex respiratory cases under varying levels of limited input data. However, these findings should be interpreted in light of key limitations, including the selective high-acuity cohort and the absence of vital signs, and require prospective validation before clinical implementation.
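The metrics named in this abstract all derive from one 2x2 confusion matrix, which also explains why the admission and discharge figures mirror each other. The sketch below uses counts that are not stated in the abstract; they are back-calculated here from the reported percentages for Model 3 (221 patients, 21.7% discharged, 38 patients discharged despite a predicted admission) and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Reconstructed counts (assumption): 173 admitted and 48 discharged patients.
# "Positive" = admission: 159 admitted patients predicted as admitted,
# 14 missed, 38 discharged patients predicted as admitted, 10 correct discharges.
admission = diagnostic_metrics(tp=159, fn=14, fp=38, tn=10)

# Treating discharge as the positive class swaps the roles of the four cells.
discharge = diagnostic_metrics(tp=10, fn=38, fp=14, tn=159)

print(round(admission["sensitivity"] * 100, 1))  # 91.9
print(round(admission["specificity"] * 100, 1))  # 20.8
print(round(admission["accuracy"] * 100, 1))     # 76.5
print(round(discharge["sensitivity"] * 100, 1))  # 20.8
print(round(discharge["specificity"] * 100, 1))  # 91.9
```

Because the task is binary, sensitivity for admission equals specificity for discharge and vice versa, so the two result sentences in the abstract describe the same confusion matrix from opposite sides.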

CT Classification Chest Retrospective Clinical In Silico GenAI

Current and novel approaches for critical care management of aneurysmal subarachnoid hemorrhage in critical care.

Zoumprouli A, Carden R, Bilotta F

•papers•Oct 1 2025

This review highlights recent advancements and evidence-based approaches in the critical care management of aneurysmal subarachnoid hemorrhage (aSAH), focusing on developments from the past 18 months. It addresses key challenges (rebleeding prevention, delayed cerebral ischemia (DCI), hydrocephalus, transfusion strategies, and temperature management), emphasizing multidisciplinary care and personalized treatment. Recent studies underscore the importance of systolic blood pressure control (<160 mmHg) to reduce rebleeding risk before aneurysm securing. Novel prognostic tools, including the modified 5-item frailty index and quantitative imaging software, show promise in improving outcome prediction. Prophylactic lumbar drainage may reduce DCI and improve neurological outcomes, while milrinone and computed tomography perfusion-guided therapies are being explored for vasospasm management. Transfusion strategies suggest a hemoglobin threshold of 9 g/dl may optimize outcomes. Temperature management remains contentious, but consensus recommends maintaining normothermia (36.0-37.5 °C) with continuous monitoring. Advances in aSAH care emphasize precision medicine, use of technology (e.g., artificial intelligence (AI) and quantitative imaging), and multidisciplinary collaboration. Key unresolved questions warrant multicenter trials to validate optimal blood pressure, transfusion, and temperature targets alongside emerging therapies for DCI.

CT Classification Neurological Review In Silico GenAI

Design of AI-driven microwave imaging for lung tumor monitoring.

Singh A, Paul S, Gayen S, Mandal B, Mitra D, Augustine R

•papers•Oct 1 2025

The global incidence of lung diseases, particularly lung cancer, is increasing at an alarming rate, underscoring the urgent need for early detection, robust monitoring, and timely intervention. This study presents design aspects of an artificial intelligence (AI)-integrated microwave-based diagnostic tool for the early detection of lung tumors. The proposed method combines machine learning (ML) tools with microwave imaging (MWI). A microwave unit containing eight antennas in the form of a wearable belt is employed for data collection from the CST body models. The data, collected in the form of scattering parameters (S-parameters), are reconstructed as 2D images. Two different ML approaches have been investigated for tumor detection and prediction of the size of the detected tumor. The first approach employs XGBoost models on raw S-parameters and the second approach uses convolutional neural networks (CNN) on the reconstructed 2D microwave images. It is found that the XGBoost-based classifier with S-parameters outperforms the CNN-based classifier on reconstructed microwave images for tumor detection. In contrast, a CNN-based model on reconstructed microwave images performs much better than an XGBoost-based regression model designed on the raw S-parameters for tumor size prediction. The performance of both models is evaluated on other body models to examine their generalization capacity over unknown data. This work explores the feasibility of a low-cost portable AI-integrated microwave diagnostic device for lung tumor detection, which eliminates the risk of exposure to the harmful ionizing radiation of X-ray and CT scans.

OCT Detection Chest Methodology Concept Academic Lab GenAI

Artificial intelligence in regional anesthesia.

Harris J, Kamming D, Bowness JS

•papers•Oct 1 2025

Artificial intelligence (AI) is having an increasing impact on healthcare. In ultrasound-guided regional anesthesia (UGRA), commercially available devices exist that augment traditional grayscale ultrasound imaging by highlighting key sono-anatomical structures in real time. We review the latest evidence supporting this emerging technology and consider the opportunities and challenges to its widespread deployment. The existing literature is limited and heterogeneous, which impedes full appraisal of systems, comparison between devices, and informed adoption. AI-based devices promise to improve clinical practice and training in UGRA, though their impact on patient outcomes and provision of UGRA techniques is unclear at this early stage. Calls for standardization across both UGRA and AI are increasing, with greater clinical leadership required. Emerging AI applications in UGRA warrant further study due to an opaque and fragmented evidence base. Robust and consistent evaluation and reporting of algorithm performance, in a representative clinical context, will expedite discovery and appropriate deployment of AI in UGRA. A clinician-focused approach to the development, evaluation, and implementation of this exciting branch of AI has huge potential to advance the human art of regional anesthesia.

Ultrasound Segmentation Review Clinical Pilot Academic Lab GenAI

Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation

Longzhen Yang, Zhangkai Ni, Ying Wen, Yihang Liu, Lianghua He, Heng Tao Shen

•preprint•Sep 30 2025

Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images, anchored in explicit visual evidence to improve interpretability and facilitate integration into clinical workflows. However, existing methods often rely on separately trained detection modules that require extensive expert annotations, introducing high labeling costs and limiting generalizability due to pathology distribution bias across datasets. To address these challenges, we propose Self-Supervised Anatomical Consistency Learning (SS-ACL), a novel and annotation-free framework that aligns generated reports with corresponding anatomical regions using simple textual prompts. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy, organizing entities by spatial location. It recursively reconstructs fine-grained anatomical regions to enforce intra-sample spatial alignment, inherently guiding attention maps toward visually relevant areas prompted by text. To further enhance inter-sample semantic alignment for abnormality recognition, SS-ACL introduces region-level contrastive learning based on anatomical consistency. These aligned embeddings serve as priors for report generation, enabling attention maps to provide interpretable visual evidence. Extensive experiments demonstrate that SS-ACL, without relying on expert annotations, (i) generates accurate and visually grounded reports, outperforming state-of-the-art methods by 10% in lexical accuracy and 25% in clinical efficacy, and (ii) achieves competitive performance on various downstream visual tasks, surpassing current leading visual foundation models by 8% in zero-shot visual grounding.

X-Ray Report Generation Chest Methodology In Silico Academic Lab GenAI Benchmark SOTA

Dolphin v1.0 Technical Report

Taohan Weng, Chi Zhang, Chaoran Yan, Siya Liu, Xiaoyang Liu, Yalun Wu, Boyang Wang, Boyan Wang, Jiren Ren, Kaiwen Yan, Jinze Yu, Kaibing Hu, Henan Liu, Haoyun Zheng, Anjie Le, Hongcheng Guo

•preprint•Sep 30 2025

Ultrasound is crucial in modern medicine but faces challenges like operator dependence, image noise, and real-time scanning, hindering AI integration. While large multimodal models excel in other medical imaging areas, they struggle with ultrasound's complexities. To address this, we introduce Dolphin v1.0 (V1) and its reasoning-augmented version, Dolphin R1, the first large-scale multimodal ultrasound foundation models unifying diverse clinical tasks in a single vision-language framework. To tackle ultrasound variability and noise, we curated a 2-million-scale multimodal dataset, combining textbook knowledge, public data, synthetic samples, and general corpora. This ensures robust perception, generalization, and clinical adaptability. The Dolphin series employs a three-stage training strategy: domain-specialized pretraining, instruction-driven alignment, and reinforcement-based refinement. Dolphin v1.0 delivers reliable performance in classification, detection, regression, and report generation. Dolphin R1 enhances diagnostic inference, reasoning transparency, and interpretability through reinforcement learning with ultrasound-specific rewards. Evaluated on U2-Bench across eight ultrasound tasks, Dolphin R1 achieves a U2-score of 0.5835, nearly twice that of the second-best model (0.2968), setting a new state of the art. Dolphin v1.0 also performs competitively, validating the unified framework. Comparisons show reasoning-enhanced training significantly improves diagnostic accuracy, consistency, and interpretability, highlighting its importance for high-stakes medical AI.

Ultrasound LLM Radiology Report Methodology In Silico Academic Lab Benchmark SOTA Open Dataset GenAI

Automating prostate volume acquisition using abdominal ultrasound scans for prostate-specific antigen density calculations.

Bennett RD, Barrett T, Sushentsev N, Sanmugalingam N, Lee KL, Gnanapragasam VJ, Tse ZTH

•papers•Sep 30 2025

Proposed methods for prostate cancer screening are currently prohibitively expensive (due to the high costs of imaging equipment such as magnetic resonance imaging and traditional ultrasound systems), inadequate in their detection rates, require highly trained specialists, and/or are invasive, resulting in patient discomfort. These limitations make population-wide screening for prostate cancer challenging. Machine learning techniques applied to abdominal ultrasound scanning may help alleviate some of these disadvantages. Abdominal ultrasound scans are comparatively low cost and cause minimal patient discomfort, and machine learning can be applied to mitigate the high operator-dependent variability of ultrasound scanning. In this study, a state-of-the-art machine learning model was compared to an expert radiologist and trainee radiologist registrars of varying experience when estimating prostate volume from abdominal ultrasound images, a crucial step in detecting prostate cancer using prostate-specific antigen density. The machine learning model calculated prostatic volume by marking out the dimensions required by the prolate ellipsoid formula on two orthogonal images of the prostate acquired with abdominal ultrasound scans (which could be conducted by operators with minimal experience in a primary care setting). While both the algorithm and the registrars showed high correlation with the expert ([Formula: see text]), it was found that the model outperformed the trainees in both accuracy (lowest average volume error of [Formula: see text]) and consistency (lowest IQR of [Formula: see text] and lowest average volume standard deviation of [Formula: see text]). The results are promising for the future development of an automated prostate cancer screening workflow using machine learning and abdominal ultrasound scans.
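For readers unfamiliar with the calculation behind this abstract, the prolate ellipsoid formula estimates volume as π/6 × length × width × height, and prostate-specific antigen (PSA) density is serum PSA divided by that volume. The sketch below shows only this arithmetic; the measurements and PSA value are hypothetical, and nothing here reproduces the paper's model, which locates the dimensions automatically.

```python
import math

def prolate_ellipsoid_volume_ml(length_cm, width_cm, height_cm):
    """Ellipsoid volume in mL (1 cm^3 = 1 mL) from three orthogonal diameters."""
    return math.pi / 6 * length_cm * width_cm * height_cm

def psa_density(psa_ng_per_ml, volume_ml):
    """PSA density in ng/mL per mL of prostate volume."""
    return psa_ng_per_ml / volume_ml

# Hypothetical measurements: width and height from a transverse image,
# length from the orthogonal sagittal image.
volume = prolate_ellipsoid_volume_ml(length_cm=4.0, width_cm=4.5, height_cm=3.5)
density = psa_density(psa_ng_per_ml=5.0, volume_ml=volume)

print(f"volume = {volume:.1f} mL, PSA density = {density:.3f}")
```

Because the three diameters are multiplied together, a small measurement error on any one axis scales the volume estimate proportionally, which is why operator variability matters for this step.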

Ultrasound Segmentation Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Leveraging ChatGPT for Report Error Audit: An Accuracy-Driven and Cost-Efficient Solution for Ophthalmic Imaging Reports.

Xu Y, Kang D, Shi D, Tham YC, Grzybowski A, Jin K

•papers•Sep 30 2025

Accurate ophthalmic imaging reports, including fundus fluorescein angiography (FFA) and ocular B-scan ultrasound, are essential for effective clinical decision-making. The current process, involving drafting by residents followed by review by ophthalmic technicians and ophthalmologists, is time-consuming and prone to errors. This study evaluates the effectiveness of ChatGPT-4o in auditing errors in FFA and ocular B-scan reports and assesses its potential to reduce time and costs within the reporting workflow. A preliminary set of 100 FFA and 80 ocular B-scan reports drafted by residents was analyzed using GPT-4o to identify laterality errors (left versus right eye) and incorrect anatomical descriptions. The accuracy of GPT-4o was compared to that of retinal specialists, general ophthalmologists, and ophthalmic technicians. Additionally, a cost-effectiveness analysis was conducted to estimate time and cost savings from integrating GPT-4o into the reporting process. A pilot real-world validation with 20 erroneous reports was also performed, comparing GPT-4o with human reviewers. GPT-4o demonstrated a detection rate of 79.0% (158 of 200; 95% CI 73.0-85.0) across all examinations, which was comparable to the average detection performance of general ophthalmologists (78.0% [155 of 200; 95% CI 72.0-83.0]; P ≥ 0.09). Integration of GPT-4o reduced the average report review time by 86%, completing 180 ophthalmic reports in approximately 0.27 h compared to 2.17-3.19 h by human ophthalmologists. Additionally, compared to human reviewers, GPT-4o lowered the cost from $0.21 to $0.03 per report (savings of $0.18). In the real-world evaluation, GPT-4o detected 18 of 20 errors (90%) with no false positives, compared to 95-100% by human reviewers. GPT-4o effectively enhances the accuracy of ophthalmic imaging reports by identifying and correcting common errors.
Its implementation can potentially alleviate the workload of ophthalmologists, streamline the reporting process, and reduce associated costs, thereby improving overall clinical workflow and patient outcomes.

Ultrasound LLM Radiology Report Retrospective Clinical Clinical Pilot Academic Lab GenAI
