Sort by:
Page 13 of 35341 results

A Unified Platform for Radiology Report Generation and Clinician-Centered AI Evaluation

Ma, Z., Yang, X., Atalay, Z., Yang, A., Collins, S., Bai, H., Bernstein, M., Baird, G., Jiao, Z.

medrxiv logopreprintJul 8 2025
Generative AI models have demonstrated strong potential in radiology report generation, but their clinical adoption depends on physician trust. In this study, we conducted a radiology-focused Turing test to evaluate how well attendings and residents distinguish AI-generated reports from those written by radiologists, and how their confidence and decision time reflect trust. we developed an integrated web-based platform comprising two core modules: Report Generation and Report Evaluation. Using the web-based platform, eight participants evaluated 48 anonymized X-ray cases, each paired with two reports from three comparison groups: radiologist vs. AI model 1, radiologist vs. AI model 2, and AI model 1 vs. AI model 2. Participants selected the AI-generated report, rated their confidence, and indicated report preference. Attendings outperformed residents in identifying AI-generated reports (49.9% vs. 41.1%) and exhibited longer decision times, suggesting more deliberate judgment. Both groups took more time when both reports were AI-generated. Our findings highlight the role of clinical experience in AI acceptance and the need for design strategies that foster trust in clinical applications. The project page of the evaluation platform is available at: https://zachatalay89.github.io/Labsite.

External Validation of an Upgraded AI Model for Screening Ileocolic Intussusception Using Pediatric Abdominal Radiographs: Multicenter Retrospective Study.

Lee JH, Kim PH, Son NH, Han K, Kang Y, Jeong S, Kim EK, Yoon H, Gatidis S, Vasanawala S, Yoon HM, Shin HJ

pubmed logopapersJul 8 2025
Artificial intelligence (AI) is increasingly used in radiology, but its development in pediatric imaging remains limited, particularly for emergent conditions. Ileocolic intussusception is an important cause of acute abdominal pain in infants and toddlers and requires timely diagnosis to prevent complications such as bowel ischemia or perforation. While ultrasonography is the diagnostic standard due to its high sensitivity and specificity, its accessibility may be limited, especially outside tertiary centers. Abdominal radiographs (AXRs), despite their limited sensitivity, are often the first-line imaging modality in clinical practice. In this context, AI could support early screening and triage by analyzing AXRs and identifying patients who require further ultrasonography evaluation. This study aimed to upgrade and externally validate an AI model for screening ileocolic intussusception using pediatric AXRs with multicenter data and to assess the diagnostic performance of the model in comparison with radiologists of varying experience levels with and without AI assistance. This retrospective study included pediatric patients (≤5 years) who underwent both AXRs and ultrasonography for suspected intussusception. Based on the preliminary study from hospital A, the AI model was retrained using data from hospital B and validated with external datasets from hospitals C and D. Diagnostic performance of the upgraded AI model was evaluated using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). A reader study was conducted with 3 radiologists, including 2 trainees and 1 pediatric radiologist, to evaluate diagnostic performance with and without AI assistance. Based on the previously developed AI model trained on 746 patients from hospital A, an additional 431 patients from hospital B (including 143 intussusception cases) were used for further training to develop an upgraded AI model. External validation was conducted using data from hospital C (n=68; 19 intussusception cases) and hospital D (n=90; 30 intussusception cases). The upgraded AI model achieved a sensitivity of 81.7% (95% CI 68.6%-90%) and a specificity of 81.7% (95% CI 73.3%-87.8%), with an AUC of 86.2% (95% CI 79.2%-92.1%) in the external validation set. Without AI assistance, radiologists showed lower performance (overall AUC 64%; sensitivity 49.7%; specificity 77.1%). With AI assistance, radiologists' specificity improved to 93% (difference +15.9%; P<.001), and AUC increased to 79.2% (difference +15.2%; P=.05). The least experienced reader showed the largest improvement in specificity (+37.6%; P<.001) and AUC (+14.7%; P=.08). The upgraded AI model improved diagnostic performance for screening ileocolic intussusception on pediatric AXRs. It effectively enhanced the specificity and overall accuracy of radiologists, particularly those with less experience in pediatric radiology. A user-friendly software platform was introduced to support broader clinical validation and underscores the potential of AI as a screening and triage tool in pediatric emergency settings.

Wrist bone segmentation in X-ray images using CT-based simulations

Youssef ElTantawy, Alexia Karantana, Xin Chen

arxiv logopreprintJul 8 2025
Plain X-ray is one of the most common image modalities for clinical diagnosis (e.g. bone fracture, pneumonia, cancer screening, etc.). X-ray image segmentation is an essential step for many computer-aided diagnostic systems, yet it remains challenging. Deep-learning-based methods have achieved superior performance in medical image segmentation tasks but often require a large amount of high-quality annotated data for model training. Providing such an annotated dataset is not only time-consuming but also requires a high level of expertise. This is particularly challenging in wrist bone segmentation in X-rays, due to the interposition of multiple small carpal bones in the image. To overcome the data annotation issue, this work utilizes a large number of simulated X-ray images generated from Computed Tomography (CT) volumes with their corresponding 10 bone labels to train a deep learning-based model for wrist bone segmentation in real X-ray images. The proposed method was evaluated using both simulated images and real images. The method achieved Dice scores ranging from 0.80 to 0.92 for the simulated dataset generated from different view angles. Qualitative analysis of the segmentation results of the real X-ray images also demonstrated the superior performance of the trained model. The trained model and X-ray simulation code are freely available for research purposes: the link will be provided upon acceptance.

An autonomous agent for auditing and improving the reliability of clinical AI models

Lukas Kuhn, Florian Buettner

arxiv logopreprintJul 8 2025
The deployment of AI models in clinical practice faces a critical challenge: models achieving expert-level performance on benchmarks can fail catastrophically when confronted with real-world variations in medical imaging. Minor shifts in scanner hardware, lighting or demographics can erode accuracy, but currently reliability auditing to identify such catastrophic failure cases before deployment is a bespoke and time-consuming process. Practitioners lack accessible and interpretable tools to expose and repair hidden failure modes. Here we introduce ModelAuditor, a self-reflective agent that converses with users, selects task-specific metrics, and simulates context-dependent, clinically relevant distribution shifts. ModelAuditor then generates interpretable reports explaining how much performance likely degrades during deployment, discussing specific likely failure modes and identifying root causes and mitigation strategies. Our comprehensive evaluation across three real-world clinical scenarios - inter-institutional variation in histopathology, demographic shifts in dermatology, and equipment heterogeneity in chest radiography - demonstrates that ModelAuditor is able correctly identify context-specific failure modes of state-of-the-art models such as the established SIIM-ISIC melanoma classifier. Its targeted recommendations recover 15-25% of performance lost under real-world distribution shift, substantially outperforming both baseline models and state-of-the-art augmentation methods. These improvements are achieved through a multi-agent architecture and execute on consumer hardware in under 10 minutes, costing less than US$0.50 per audit.

Investigating the Potential of Generative AI Clinical Case-Based Simulations on Radiography Education: A Pilot Study.

Zhong D, Chow SKK

pubmed logopapersJul 8 2025
Education for medical imaging technologists or radiographers in regional and rural areas often faces significant challenges due to limited financial, technological, and teaching resources. Generative AI presents a promising solution to overcome these barriers and support the professional development of radiographers. This pilot study aimed to evaluate the educational value of an in-house AI-based imaging simulation tool designed to generate clinically relevant medical images for professional training purposes. In July 2023, a professional development lecture featuring AI-generated clinical imaging content was delivered to students (N = 122/130) and recent graduates (N = 155/532), alongside a pre-lecture survey. Following the session, participants completed a questionnaire comprising structured and open-ended items to assess their understanding, perceptions, and interest in AI within medical imaging education. Survey results indicated that both students and graduates possessed a foundational awareness of AI applications in medical imaging. Graduates demonstrated significantly higher expectations for clinical realism in AI-generated simulations, likely reflecting their clinical experience. Although the simulator's current capabilities are limited in replicating complex diagnostic imaging, participants acknowledged its pedagogical value, particularly in supporting basic anatomical education. Approximately 50% of respondents expressed interest in further developing their AI knowledge and contributing to the research and development of AI-based educational tools. AI-driven imaging simulation tools have the potential to enhance radiography education and reduce teaching barriers. While further development is needed to improve clinical fidelity, such tools can play a valuable role in foundational training and foster learner engagement in AI innovation.

A fully automated deep learning framework for age estimation in adults using periapical radiographs of canine teeth.

Upalananda W, Phisutphithayakun C, Assawasuksant P, Tanwattana P, Prasatkaew P

pubmed logopapersJul 8 2025
Determining age from dental remains is vital in forensic investigations, aiding in victim identification and anthropological research. Our framework uses a two-step pipeline: tooth detection followed by age estimation, based on either canine tooth images alone or combined with sex information. The dataset included 2,587 radiographs from 1,004 patients (691 females, 313 males) aged 13.42-85.45 years. The YOLOv8-Nano model achieved exceptional performance in detecting canine teeth, with an F1 score of 0.994, a 98.94% detection success rate, and accurate numbering of all detected teeth. For age estimation, we implemented four convolutional neural network architectures: ResNet-18, DenseNet-121, EfficientNet-B0, and MobileNetV3. Each model was trained to estimate age based on one of the four individual canine teeth (13, 23, 33, and 43). The models achieved median absolute errors ranging from 3.55 to 5.18 years. Incorporating sex as an additional input feature did not improve performance. Moreover, no significant differences in predictive accuracy were observed among the individual teeth. In conclusion, the proposed framework demonstrates potential as a robust and practical tool for forensic age estimation across diverse forensic contexts.

RADAI: A Deep Learning-Based Classification of Lung Abnormalities in Chest X-Rays.

Aljuaid H, Albalahad H, Alshuaibi W, Almutairi S, Aljohani TH, Hussain N, Mohammad F

pubmed logopapersJul 7 2025
<b>Background:</b> Chest X-rays are rapidly gaining prominence as a prevalent diagnostic tool, as recognized by the World Health Organization (WHO). However, interpreting chest X-rays can be demanding and time-consuming, even for experienced radiologists, leading to potential misinterpretations and delays in treatment. <b>Method:</b> The purpose of this research is the development of a RadAI model. The RadAI model can accurately detect four types of lung abnormalities in chest X-rays and generate a report on each identified abnormality. Moreover, deep learning algorithms, particularly convolutional neural networks (CNNs), have demonstrated remarkable potential in automating medical image analysis, including chest X-rays. This work addresses the challenge of chest X-ray interpretation by fine tuning the following three advanced deep learning models: Feature-selective and Spatial Receptive Fields Network (FSRFNet50), ResNext50, and ResNet50. These models are compared based on accuracy, precision, recall, and F1-score. <b>Results:</b> The outstanding performance of RadAI shows its potential to assist radiologists to interpret the detected chest abnormalities accurately. <b>Conclusions:</b> RadAI is beneficial in enhancing the accuracy and efficiency of chest X-ray interpretation, ultimately supporting the timely and reliable diagnosis of lung abnormalities.

X-ray transferable polyrepresentation learning

Weronika Hryniewska-Guzik, Przemyslaw Biecek

arxiv logopreprintJul 7 2025
The success of machine learning algorithms is inherently related to the extraction of meaningful features, as they play a pivotal role in the performance of these algorithms. Central to this challenge is the quality of data representation. However, the ability to generalize and extract these features effectively from unseen datasets is also crucial. In light of this, we introduce a novel concept: the polyrepresentation. Polyrepresentation integrates multiple representations of the same modality extracted from distinct sources, for example, vector embeddings from the Siamese Network, self-supervised models, and interpretable radiomic features. This approach yields better performance metrics compared to relying on a single representation. Additionally, in the context of X-ray images, we demonstrate the transferability of the created polyrepresentation to a smaller dataset, underscoring its potential as a pragmatic and resource-efficient approach in various image-related solutions. It is worth noting that the concept of polyprepresentation on the example of medical data can also be applied to other domains, showcasing its versatility and broad potential impact.

MedGemma Technical Report

Andrew Sellergren, Sahar Kazemzadeh, Tiam Jaroensri, Atilla Kiraly, Madeleine Traverse, Timo Kohlberger, Shawn Xu, Fayaz Jamil, Cían Hughes, Charles Lau, Justin Chen, Fereshteh Mahvar, Liron Yatziv, Tiffany Chen, Bram Sterling, Stefanie Anna Baby, Susanna Maria Baby, Jeremy Lai, Samuel Schmidgall, Lu Yang, Kejia Chen, Per Bjornsson, Shashir Reddy, Ryan Brush, Kenneth Philbrick, Howard Hu, Howard Yang, Richa Tiwari, Sunny Jansen, Preeti Singh, Yun Liu, Shekoofeh Azizi, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Riviere, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Elena Buchatskaya, Jean-Baptiste Alayrac, Dmitry Lepikhin, Vlad Feinberg, Sebastian Borgeaud, Alek Andreev, Cassidy Hardin, Robert Dadashi, Léonard Hussenot, Armand Joulin, Olivier Bachem, Yossi Matias, Katherine Chou, Avinatan Hassidim, Kavi Goel, Clement Farabet, Joelle Barral, Tris Warkentin, Jonathon Shlens, David Fleet, Victor Cotruta, Omar Sanseviero, Gus Martins, Phoebe Kirk, Anand Rao, Shravya Shetty, David F. Steiner, Can Kirmizibayrak, Rory Pilgrim, Daniel Golden, Lin Yang

arxiv logopreprintJul 7 2025
Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment faces challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy. Foundation models that perform well on medical tasks and require less task-specific tuning data are critical to accelerate the development of healthcare AI applications. We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text, significantly exceeding the performance of similar-sized generative models and approaching the performance of task-specific models, while maintaining the general capabilities of the Gemma 3 base models. For out-of-distribution tasks, MedGemma achieves 2.6-10% improvement on medical multimodal question answering, 15.5-18.1% improvement on chest X-ray finding classification, and 10.8% improvement on agentic evaluations compared to the base models. Fine-tuning MedGemma further improves performance in subdomains, reducing errors in electronic health record information retrieval by 50% and reaching comparable performance to existing specialized state-of-the-art methods for pneumothorax classification and histopathology patch classification. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP. MedSigLIP powers the visual understanding capabilities of MedGemma and as an encoder achieves comparable or better performance than specialized medical image encoders. Taken together, the MedGemma collection provides a strong foundation of medical image and text capabilities, with potential to significantly accelerate medical research and development of downstream applications. The MedGemma collection, including tutorials and model weights, can be found at https://goo.gle/medgemma.

SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

Chun Xie, Yuichi Yoshii, Itaru Kitahara

arxiv logopreprintJul 7 2025
X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available at GitHub.
Page 13 of 35341 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.