
Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation

Heet Nitinkumar Dalsania

arXiv preprint · Jul 9, 2025
Modern deep learning implementations for medical imaging usually rely on large labeled datasets, which are often difficult to obtain due to privacy concerns, high costs, and the scarcity of cases. In this paper, a label-efficient strategy is proposed for chest X-ray diagnosis that seeks to reflect real-world hospital scenarios. The experiments use the NIH ChestX-ray14 dataset and a pre-trained CLIP ViT-B/32 model. The model is adapted via partial fine-tuning of its visual encoder and then evaluated using zero-shot and few-shot learning with 1-16 labeled examples per disease class. The tests demonstrate that CLIP's pre-trained vision-language features can be effectively adapted to few-shot medical imaging tasks, achieving over 20% improvement in mean AUC compared to the zero-shot baseline. A key aspect of this work is its attempt to simulate internal hospital workflows, where image archives exist but annotations are sparse. This work evaluates a practical and scalable solution for both common and rare disease diagnosis. This research is intended for academic and experimental purposes only and has not yet been peer reviewed. All code is available at https://github.com/heet007-code/CLIP-disease-xray.
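
A minimal sketch of the partial-adaptation idea described in the abstract, using the Hugging Face CLIP implementation: freeze the pre-trained ViT-B/32, unfreeze only the last vision-encoder blocks, and fine-tune against text prompts for the disease classes. The number of unfrozen blocks, the prompt wording, and the training-loop details are illustrative assumptions, not the authors' exact recipe.

```python
# Hedged sketch: partial fine-tuning of CLIP ViT-B/32 for few-shot chest X-ray
# classification. Layer names follow the Hugging Face CLIP implementation;
# the unfrozen-block count, prompts, and data handling are assumptions.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Freeze everything, then unfreeze only the last two vision-encoder blocks
# ("partial" adaptation of the visual encoder).
for p in model.parameters():
    p.requires_grad = False
for block in model.vision_model.encoder.layers[-2:]:
    for p in block.parameters():
        p.requires_grad = True

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)

# Hypothetical prompts for two of the fourteen ChestX-ray14 findings.
prompts = [f"a chest x-ray showing {d}" for d in ("atelectasis", "cardiomegaly")]

def few_shot_step(images, labels):
    """One training step on a few-shot batch (PIL images, class-index labels)."""
    inputs = processor(text=prompts, images=images, return_tensors="pt", padding=True)
    logits = model(**inputs).logits_per_image          # (n_images, n_prompts)
    loss = torch.nn.functional.cross_entropy(logits, labels)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```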

Impact of polymer source variations on hydrogel structure and product performance in dexamethasone-loaded ophthalmic inserts.

VandenBerg MA, Zaman RU, Plavchak CL, Smith WC, Nejad HB, Beringhs AO, Wang Y, Xu X

PubMed · Jul 9, 2025
Localized drug delivery can enhance therapeutic efficacy while minimizing systemic side effects, making sustained-release ophthalmic inserts an attractive alternative to traditional eye drops. Such inserts offer improved patient compliance through prolonged therapeutic effects and a reduced need for frequent administration. This study focuses on dexamethasone-containing ophthalmic inserts. These inserts utilize a key excipient, polyethylene glycol (PEG), which forms a hydrogel upon contact with tear fluid. Developing generic equivalents of PEG-based inserts is challenging due to difficulties in characterizing inactive ingredients and the absence of standardized physicochemical characterization methods to demonstrate similarity. To address this gap, a suite of analytical approaches was applied to both PEG precursor materials sourced from different vendors and manufactured inserts. ¹H NMR, FTIR, MALDI, and SEC revealed variations in end-group functionalization, impurity content, and molecular weight distribution of the excipient. These differences led to changes in the finished insert network properties such as porosity, pore size and structure, gel mechanical strength, and crystallinity, which were corroborated by X-ray microscopy, AI-based image analysis, thermal, mechanical, and density measurements. In vitro release testing revealed distinct drug release profiles across formulations, with swelling rate correlated to release rate (i.e., faster release with rapid swelling). The use of non-micronized and micronized dexamethasone also contributed to release profile differences. Through comprehensive characterization of these PEG-based dexamethasone inserts, correlations between polymer quality, hydrogel microstructure, and release kinetics were established. The study highlights how excipient differences can alter product performance, emphasizing the importance of thorough analysis in developing generic equivalents of complex drug products.

An autonomous agent for auditing and improving the reliability of clinical AI models

Lukas Kuhn, Florian Buettner

arXiv preprint · Jul 8, 2025
The deployment of AI models in clinical practice faces a critical challenge: models achieving expert-level performance on benchmarks can fail catastrophically when confronted with real-world variations in medical imaging. Minor shifts in scanner hardware, lighting, or demographics can erode accuracy, yet reliability auditing to identify such catastrophic failure cases before deployment is currently a bespoke and time-consuming process. Practitioners lack accessible and interpretable tools to expose and repair hidden failure modes. Here we introduce ModelAuditor, a self-reflective agent that converses with users, selects task-specific metrics, and simulates context-dependent, clinically relevant distribution shifts. ModelAuditor then generates interpretable reports explaining how much performance is likely to degrade during deployment, discussing specific likely failure modes and identifying root causes and mitigation strategies. Our comprehensive evaluation across three real-world clinical scenarios - inter-institutional variation in histopathology, demographic shifts in dermatology, and equipment heterogeneity in chest radiography - demonstrates that ModelAuditor is able to correctly identify context-specific failure modes of state-of-the-art models such as the established SIIM-ISIC melanoma classifier. Its targeted recommendations recover 15-25% of the performance lost under real-world distribution shift, substantially outperforming both baseline models and state-of-the-art augmentation methods. These improvements are achieved through a multi-agent architecture; an audit executes on consumer hardware in under 10 minutes and costs less than US$0.50.
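
As a rough illustration of the auditing loop the abstract describes, the hedged sketch below applies simulated, deployment-like shifts (lighting change, scanner noise) to a validation set and reports how much a fixed classifier's AUC degrades under each one. The shift functions and the `predict_scores` interface (images in, positive-class probabilities out) are assumptions for illustration, not ModelAuditor's actual multi-agent implementation.

```python
# Hedged sketch of a reliability audit: apply simulated distribution shifts to
# a validation set and measure the AUC drop of a fixed classifier under each.
import numpy as np
from sklearn.metrics import roc_auc_score

def lighting_shift(x: np.ndarray, gamma: float = 1.8) -> np.ndarray:
    """Simulate an exposure/lighting change via gamma correction on [0, 1] images."""
    return np.clip(x, 0.0, 1.0) ** gamma

def scanner_noise(x: np.ndarray, sigma: float = 0.05) -> np.ndarray:
    """Simulate scanner noise with additive Gaussian noise."""
    return np.clip(x + np.random.normal(0.0, sigma, x.shape), 0.0, 1.0)

def audit(predict_scores, images: np.ndarray, labels: np.ndarray, shifts: dict) -> dict:
    """Return AUC on clean data and under each simulated shift.

    `predict_scores` is an assumed callable mapping an image batch to
    positive-class probabilities (one score per image).
    """
    report = {"clean": roc_auc_score(labels, predict_scores(images))}
    for name, shift in shifts.items():
        report[name] = roc_auc_score(labels, predict_scores(shift(images)))
    return report

# Example (assumed variables):
# report = audit(model_scores, val_images, val_labels,
#                {"lighting": lighting_shift, "noise": scanner_noise})
```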

A fully automated deep learning framework for age estimation in adults using periapical radiographs of canine teeth.

Upalananda W, Phisutphithayakun C, Assawasuksant P, Tanwattana P, Prasatkaew P

PubMed · Jul 8, 2025
Determining age from dental remains is vital in forensic investigations, aiding victim identification and anthropological research. We propose a fully automated deep learning framework that estimates adult age from periapical radiographs using a two-step pipeline: canine tooth detection followed by age estimation, based on either the canine tooth image alone or combined with sex information. The dataset included 2,587 radiographs from 1,004 patients (691 females, 313 males) aged 13.42-85.45 years. The YOLOv8-Nano model achieved exceptional performance in detecting canine teeth, with an F1 score of 0.994, a 98.94% detection success rate, and accurate numbering of all detected teeth. For age estimation, we implemented four convolutional neural network architectures: ResNet-18, DenseNet-121, EfficientNet-B0, and MobileNetV3. Each model was trained to estimate age from one of the four individual canine teeth (FDI 13, 23, 33, and 43). The models achieved median absolute errors ranging from 3.55 to 5.18 years. Incorporating sex as an additional input feature did not improve performance, and no significant differences in predictive accuracy were observed among the individual teeth. In conclusion, the proposed framework demonstrates potential as a robust and practical tool for age estimation across diverse forensic contexts.
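
A hedged sketch of the two-step pipeline described above: a YOLOv8-Nano detector localizes a canine tooth on the radiograph, and the crop is passed to a ResNet-18 head that regresses age in years. The checkpoint paths, preprocessing, and crop selection are placeholders, not the authors' released artifacts.

```python
# Hedged sketch of the detection-then-regression pipeline: YOLOv8-Nano finds a
# canine tooth, a ResNet-18 regressor estimates age from the crop. Checkpoint
# paths and preprocessing are placeholders, not the authors' released models.
import torch
from PIL import Image
from torchvision import models, transforms
from ultralytics import YOLO

detector = YOLO("canine_detector.pt")                          # assumed fine-tuned weights
regressor = models.resnet18(weights=None)
regressor.fc = torch.nn.Linear(regressor.fc.in_features, 1)    # single age output (years)
regressor.load_state_dict(torch.load("age_resnet18.pt"))       # assumed checkpoint
regressor.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def estimate_age(radiograph_path: str) -> float:
    """Detect the highest-confidence canine tooth and regress age from the crop."""
    image = Image.open(radiograph_path).convert("RGB")
    box = detector(image)[0].boxes.xyxy[0].tolist()            # [x1, y1, x2, y2]
    crop = image.crop(tuple(int(v) for v in box))
    with torch.no_grad():
        return regressor(preprocess(crop).unsqueeze(0)).item()
```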

Investigating the Potential of Generative AI Clinical Case-Based Simulations on Radiography Education: A Pilot Study.

Zhong D, Chow SKK

PubMed · Jul 8, 2025
Education for medical imaging technologists or radiographers in regional and rural areas often faces significant challenges due to limited financial, technological, and teaching resources. Generative AI presents a promising solution to overcome these barriers and support the professional development of radiographers. This pilot study aimed to evaluate the educational value of an in-house AI-based imaging simulation tool designed to generate clinically relevant medical images for professional training purposes. In July 2023, a professional development lecture featuring AI-generated clinical imaging content was delivered to students (N = 122/130) and recent graduates (N = 155/532), alongside a pre-lecture survey. Following the session, participants completed a questionnaire comprising structured and open-ended items to assess their understanding, perceptions, and interest in AI within medical imaging education. Survey results indicated that both students and graduates possessed a foundational awareness of AI applications in medical imaging. Graduates demonstrated significantly higher expectations for clinical realism in AI-generated simulations, likely reflecting their clinical experience. Although the simulator's current capabilities are limited in replicating complex diagnostic imaging, participants acknowledged its pedagogical value, particularly in supporting basic anatomical education. Approximately 50% of respondents expressed interest in further developing their AI knowledge and contributing to the research and development of AI-based educational tools. AI-driven imaging simulation tools have the potential to enhance radiography education and reduce teaching barriers. While further development is needed to improve clinical fidelity, such tools can play a valuable role in foundational training and foster learner engagement in AI innovation.

External Validation of an Upgraded AI Model for Screening Ileocolic Intussusception Using Pediatric Abdominal Radiographs: Multicenter Retrospective Study.

Lee JH, Kim PH, Son NH, Han K, Kang Y, Jeong S, Kim EK, Yoon H, Gatidis S, Vasanawala S, Yoon HM, Shin HJ

PubMed · Jul 8, 2025
Artificial intelligence (AI) is increasingly used in radiology, but its development in pediatric imaging remains limited, particularly for emergent conditions. Ileocolic intussusception is an important cause of acute abdominal pain in infants and toddlers and requires timely diagnosis to prevent complications such as bowel ischemia or perforation. While ultrasonography is the diagnostic standard due to its high sensitivity and specificity, its accessibility may be limited, especially outside tertiary centers. Abdominal radiographs (AXRs), despite their limited sensitivity, are often the first-line imaging modality in clinical practice. In this context, AI could support early screening and triage by analyzing AXRs and identifying patients who require further ultrasonography evaluation. This study aimed to upgrade and externally validate an AI model for screening ileocolic intussusception using pediatric AXRs with multicenter data and to assess the diagnostic performance of the model in comparison with radiologists of varying experience levels with and without AI assistance. This retrospective study included pediatric patients (≤5 years) who underwent both AXRs and ultrasonography for suspected intussusception. Based on the preliminary study from hospital A, the AI model was retrained using data from hospital B and validated with external datasets from hospitals C and D. Diagnostic performance of the upgraded AI model was evaluated using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). A reader study was conducted with 3 radiologists, including 2 trainees and 1 pediatric radiologist, to evaluate diagnostic performance with and without AI assistance. Building on the previously developed AI model trained on 746 patients from hospital A, an additional 431 patients from hospital B (including 143 intussusception cases) were used for further training to develop an upgraded AI model. External validation was conducted using data from hospital C (n=68; 19 intussusception cases) and hospital D (n=90; 30 intussusception cases). The upgraded AI model achieved a sensitivity of 81.7% (95% CI 68.6%-90%) and a specificity of 81.7% (95% CI 73.3%-87.8%), with an AUC of 86.2% (95% CI 79.2%-92.1%) in the external validation set. Without AI assistance, radiologists showed lower performance (overall AUC 64%; sensitivity 49.7%; specificity 77.1%). With AI assistance, radiologists' specificity improved to 93% (difference +15.9%; P<.001), and AUC increased to 79.2% (difference +15.2%; P=.05). The least experienced reader showed the largest improvement in specificity (+37.6%; P<.001) and AUC (+14.7%; P=.08). The upgraded AI model improved diagnostic performance for screening ileocolic intussusception on pediatric AXRs. It effectively enhanced the specificity and overall accuracy of radiologists, particularly those with less experience in pediatric radiology. A user-friendly software platform was introduced to support broader clinical validation, underscoring the potential of AI as a screening and triage tool in pediatric emergency settings.
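
For reference, the external-validation metrics quoted above (sensitivity, specificity, AUC) can be computed from per-patient AI scores as in the short sketch below; the variable names and the decision threshold are illustrative assumptions rather than details from the study.

```python
# Hedged sketch of the reported external-validation metrics computed from
# per-patient AI scores; the threshold and variable names are illustrative.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true: np.ndarray, y_score: np.ndarray, threshold: float = 0.5) -> dict:
    """Sensitivity, specificity, and AUC for binary intussusception labels (1 = positive)."""
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": roc_auc_score(y_true, y_score),
    }

# Example (assumed variables): evaluate(labels_hospital_C, ai_scores_hospital_C)
```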

A Unified Platform for Radiology Report Generation and Clinician-Centered AI Evaluation

Ma, Z., Yang, X., Atalay, Z., Yang, A., Collins, S., Bai, H., Bernstein, M., Baird, G., Jiao, Z.

medRxiv preprint · Jul 8, 2025
Generative AI models have demonstrated strong potential in radiology report generation, but their clinical adoption depends on physician trust. In this study, we conducted a radiology-focused Turing test to evaluate how well attendings and residents distinguish AI-generated reports from those written by radiologists, and how their confidence and decision time reflect trust. We developed an integrated web-based platform comprising two core modules: Report Generation and Report Evaluation. Using this platform, eight participants evaluated 48 anonymized X-ray cases, each paired with two reports from three comparison groups: radiologist vs. AI model 1, radiologist vs. AI model 2, and AI model 1 vs. AI model 2. Participants selected the report they believed was AI-generated, rated their confidence, and indicated their report preference. Attendings outperformed residents in identifying AI-generated reports (49.9% vs. 41.1%) and exhibited longer decision times, suggesting more deliberate judgment. Both groups took more time when both reports were AI-generated. Our findings highlight the role of clinical experience in AI acceptance and the need for design strategies that foster trust in clinical applications. The project page of the evaluation platform is available at: https://zachatalay89.github.io/Labsite.

Wrist bone segmentation in X-ray images using CT-based simulations

Youssef ElTantawy, Alexia Karantana, Xin Chen

arXiv preprint · Jul 8, 2025
Plain X-ray is one of the most common imaging modalities for clinical diagnosis (e.g., bone fracture, pneumonia, cancer screening). X-ray image segmentation is an essential step for many computer-aided diagnostic systems, yet it remains challenging. Deep-learning-based methods have achieved superior performance in medical image segmentation tasks but often require a large amount of high-quality annotated data for model training. Producing such an annotated dataset is not only time-consuming but also requires a high level of expertise. This is particularly challenging for wrist bone segmentation in X-rays, due to the interposition of multiple small carpal bones in the image. To overcome the data annotation issue, this work uses a large number of simulated X-ray images generated from Computed Tomography (CT) volumes, together with their corresponding 10 bone labels, to train a deep-learning-based model for wrist bone segmentation in real X-ray images. The proposed method was evaluated using both simulated and real images. It achieved Dice scores ranging from 0.80 to 0.92 on the simulated dataset generated from different view angles. Qualitative analysis of the segmentation results on real X-ray images also demonstrated the superior performance of the trained model. The trained model and X-ray simulation code are freely available for research purposes: the link will be provided upon acceptance.
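
The Dice scores reported above can be computed per bone from predicted and ground-truth label maps; the sketch below assumes integer label maps where 0 is background and 1-10 index the ten wrist bones, which is an assumed data layout rather than the authors' code.

```python
# Hedged sketch of per-bone Dice evaluation; assumes integer label maps with
# 0 = background and 1..10 = wrist bone labels (an assumed convention).
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks of the same shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def per_bone_dice(pred_labels: np.ndarray, gt_labels: np.ndarray, n_bones: int = 10) -> dict:
    """Dice for each bone label in a multi-class wrist segmentation."""
    return {b: dice_score(pred_labels == b, gt_labels == b) for b in range(1, n_bones + 1)}
```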

X-ray transferable polyrepresentation learning

Weronika Hryniewska-Guzik, Przemyslaw Biecek

arXiv preprint · Jul 7, 2025
The success of machine learning algorithms is inherently tied to the extraction of meaningful features, which play a pivotal role in the performance of these algorithms. Central to this challenge is the quality of data representation; however, the ability to generalize and extract these features effectively from unseen datasets is also crucial. In light of this, we introduce a novel concept: the polyrepresentation. A polyrepresentation integrates multiple representations of the same modality extracted from distinct sources, for example, vector embeddings from a Siamese network, self-supervised models, and interpretable radiomic features. This approach yields better performance metrics compared to relying on a single representation. Additionally, in the context of X-ray images, we demonstrate the transferability of the created polyrepresentation to a smaller dataset, underscoring its potential as a pragmatic and resource-efficient approach in various image-related solutions. The concept of polyrepresentation, demonstrated here on medical data, can also be applied to other domains, showcasing its versatility and broad potential impact.
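
A minimal sketch of the polyrepresentation idea: feature vectors of the same X-ray from several sources (e.g., a Siamese network, a self-supervised encoder, radiomic features) are concatenated into one representation and fed to a simple downstream classifier. The extractor callables and the choice of logistic regression are placeholder assumptions.

```python
# Hedged sketch of building a polyrepresentation by concatenating per-source
# feature vectors and training a downstream classifier on the result.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def build_polyrepresentation(image, extractors) -> np.ndarray:
    """Concatenate feature vectors from each extractor into one representation."""
    return np.concatenate([extract(image) for extract in extractors])

def fit_downstream(images, labels, extractors):
    """Train a simple classifier on the concatenated representations."""
    X = np.stack([build_polyrepresentation(img, extractors) for img in images])
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X, labels)
    return clf

# Example (assumed callables returning 1-D feature vectors):
# extractors = [siamese_embed, ssl_embed, radiomics_features]
# clf = fit_downstream(train_images, train_labels, extractors)
```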

MedGemma Technical Report

Andrew Sellergren, Sahar Kazemzadeh, Tiam Jaroensri, Atilla Kiraly, Madeleine Traverse, Timo Kohlberger, Shawn Xu, Fayaz Jamil, Cían Hughes, Charles Lau, Justin Chen, Fereshteh Mahvar, Liron Yatziv, Tiffany Chen, Bram Sterling, Stefanie Anna Baby, Susanna Maria Baby, Jeremy Lai, Samuel Schmidgall, Lu Yang, Kejia Chen, Per Bjornsson, Shashir Reddy, Ryan Brush, Kenneth Philbrick, Howard Hu, Howard Yang, Richa Tiwari, Sunny Jansen, Preeti Singh, Yun Liu, Shekoofeh Azizi, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Riviere, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Elena Buchatskaya, Jean-Baptiste Alayrac, Dmitry Lepikhin, Vlad Feinberg, Sebastian Borgeaud, Alek Andreev, Cassidy Hardin, Robert Dadashi, Léonard Hussenot, Armand Joulin, Olivier Bachem, Yossi Matias, Katherine Chou, Avinatan Hassidim, Kavi Goel, Clement Farabet, Joelle Barral, Tris Warkentin, Jonathon Shlens, David Fleet, Victor Cotruta, Omar Sanseviero, Gus Martins, Phoebe Kirk, Anand Rao, Shravya Shetty, David F. Steiner, Can Kirmizibayrak, Rory Pilgrim, Daniel Golden, Lin Yang

arXiv preprint · Jul 7, 2025
Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment face challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy. Foundation models that perform well on medical tasks and require less task-specific tuning data are critical to accelerating the development of healthcare AI applications. We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text, significantly exceeding the performance of similar-sized generative models and approaching the performance of task-specific models, while maintaining the general capabilities of the Gemma 3 base models. For out-of-distribution tasks, MedGemma achieves a 2.6-10% improvement on medical multimodal question answering, a 15.5-18.1% improvement on chest X-ray finding classification, and a 10.8% improvement on agentic evaluations compared to the base models. Fine-tuning MedGemma further improves performance in subdomains, reducing errors in electronic health record information retrieval by 50% and reaching performance comparable to existing specialized state-of-the-art methods for pneumothorax classification and histopathology patch classification. We additionally introduce MedSigLIP, a medically tuned vision encoder derived from SigLIP. MedSigLIP powers the visual understanding capabilities of MedGemma and, as an encoder, achieves performance comparable to or better than that of specialized medical image encoders. Taken together, the MedGemma collection provides a strong foundation of medical image and text capabilities, with potential to significantly accelerate medical research and the development of downstream applications. The MedGemma collection, including tutorials and model weights, can be found at https://goo.gle/medgemma.
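
A hedged sketch of querying a MedGemma-style vision-language model for a chest X-ray finding via the Hugging Face image-text-to-text pipeline. The model identifier, chat message format, and example question are assumptions based on common Transformers conventions, not the official quickstart; the tutorials at https://goo.gle/medgemma should be treated as authoritative, and the weights may require accepting usage terms on Hugging Face.

```python
# Hedged sketch: the model id, chat format, and prompt below are assumptions,
# not the released MedGemma documentation.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/medgemma-4b-it")  # assumed model id

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/chest_xray.png"},  # placeholder image
        {"type": "text", "text": "Is there evidence of pneumothorax on this chest X-ray?"},
    ],
}]

output = pipe(text=messages, max_new_tokens=128)
print(output[0]["generated_text"])
```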