Latest Papers on Radiology AI. Tags: Mixed Modality

RadGPT: A system based on a large language model that generates sets of patient-centered materials to explain radiology report information.

Herwald SE, Shah P, Johnston A, Olsen C, Delbrouck JB, Langlotz CP

•papers•Jun 10 2025

The Cures Act Final Rule requires that patients have real-time access to their radiology reports, which contain technical language. Our objective to was to use a novel system called RadGPT, which integrates concept extraction and a large language model (LLM), to help patients understand their radiology reports. RadGPT generated 150 concept explanations and 390 question-and-answer pairs from 30 radiology report impressions from between 2012 and 2020. The extracted concepts were used to create concept-based explanations, as well as concept-based question-and-answer pairs where questions were generated using either a fixed template or an LLM. Additionally, report-based question-and-answer pairs were generated directly from the impression using an LLM without concept extraction. One board-certified radiologist and 4 radiology residents rated the material quality using a standardized rubric. Concept-based LLM-generated questions were significantly higher quality than concept-based template-generated questions (p < 0.001). Excluding those template-based question-and-answer pairs from further analysis, nearly all (> 95%) of RadGPT-generated materials were rated highly, with at least 50% receiving the highest possible ranking from all 5 raters. No answers or explanations were rated as likely to affect the safety or effectiveness of patient care. Report-level LLM-based questions and answers were rated particularly highly, with 92% of report-level LLM-based questions and 61% of the corresponding report-level answers receiving the highest rating from all raters. The educational tool RadGPT generated high-quality explanations and question-and-answer pairs that were personalized for each radiology report, unlikely to produce harmful explanations and likely to enhance patient understanding of radiology information.

Mixed Modality LLM Radiology Report Methodology Prototype Academic Lab GenAI

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

Shivang Chopra, Gabriela Sanchez-Rodriguez, Lingchao Mao, Andrew J Feola, Jing Li, Zsolt Kira

•preprint•Jun 10 2025

Different medical imaging modalities capture diagnostic information at varying spatial resolutions, from coarse global patterns to fine-grained localized structures. However, most existing vision-language frameworks in the medical domain apply a uniform strategy for local feature extraction, overlooking the modality-specific demands. In this work, we present MedMoE, a modular and extensible vision-language processing framework that dynamically adapts visual representation based on the diagnostic context. MedMoE incorporates a Mixture-of-Experts (MoE) module conditioned on the report type, which routes multi-scale image features through specialized expert branches trained to capture modality-specific visual semantics. These experts operate over feature pyramids derived from a Swin Transformer backbone, enabling spatially adaptive attention to clinically relevant regions. This framework produces localized visual representations aligned with textual descriptions, without requiring modality-specific supervision at inference. Empirical results on diverse medical benchmarks demonstrate that MedMoE improves alignment and retrieval performance across imaging modalities, underscoring the value of modality-specialized visual representations in clinical vision-language systems.

Mixed Modality LLM Radiology Report Methodology In Silico Academic Lab GenAI

Foundation Models in Medical Imaging -- A Review and Outlook

Vivien van Veldhuizen, Vanessa Botha, Chunyao Lu, Melis Erdal Cesur, Kevin Groot Lipman, Edwin D. de Jong, Hugo Horlings, Clárisa Sanchez, Cees Snoek, Ritse Mann, Eric Marcus, Jonas Teuwen

•preprint•Jun 10 2025

Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.

Mixed Modality Review Concept Academic Lab GenAI

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Mojtaba Nafez, Amirhossein Koochakian, Arad Maleki, Jafar Habibi, Mohammad Hossein Rohban

•preprint•Jun 10 2025

Anomaly Detection (AD) and Anomaly Localization (AL) are crucial in fields that demand high reliability, such as medical imaging and industrial monitoring. However, current AD and AL approaches are often susceptible to adversarial attacks due to limitations in training data, which typically include only normal, unlabeled samples. This study introduces PatchGuard, an adversarially robust AD and AL method that incorporates pseudo anomalies with localization masks within a Vision Transformer (ViT)-based architecture to address these vulnerabilities. We begin by examining the essential properties of pseudo anomalies, and follow it by providing theoretical insights into the attention mechanisms required to enhance the adversarial robustness of AD and AL systems. We then present our approach, which leverages Foreground-Aware Pseudo-Anomalies to overcome the deficiencies of previous anomaly-aware methods. Our method incorporates these crafted pseudo-anomaly samples into a ViT-based framework, with adversarial training guided by a novel loss function designed to improve model robustness, as supported by our theoretical analysis. Experimental results on well-established industrial and medical datasets demonstrate that PatchGuard significantly outperforms previous methods in adversarial settings, achieving performance gains of $53.2\%$ in AD and $68.5\%$ in AL, while also maintaining competitive accuracy in non-adversarial settings. The code repository is available at https://github.com/rohban-lab/PatchGuard .

Mixed Modality Detection Methodology In Silico Academic Lab Open Code

Empirical evaluation of artificial intelligence distillation techniques for ascertaining cancer outcomes from electronic health records.

Riaz IB, Naqvi SAA, Ashraf N, Harris GJ, Kehl KL

•papers•Jun 10 2025

Phenotypic information for cancer research is embedded in unstructured electronic health records (EHR), requiring effort to extract. Deep learning models can automate this but face scalability issues due to privacy concerns. We evaluated techniques for applying a teacher-student framework to extract longitudinal clinical outcomes from EHRs. We focused on the challenging task of ascertaining two cancer outcomes-overall response and progression according to Response Evaluation Criteria in Solid Tumors (RECIST)-from free-text radiology reports. Teacher models with hierarchical Transformer architecture were trained on data from Dana-Farber Cancer Institute (DFCI). These models labeled public datasets (MIMIC-IV, Wiki-text) and GPT-4-generated synthetic data. "Student" models were then trained to mimic the teachers' predictions. DFCI "teacher" models achieved high performance, and student models trained on MIMIC-IV data showed comparable results, demonstrating effective knowledge transfer. However, student models trained on Wiki-text and synthetic data performed worse, emphasizing the need for in-domain public datasets for model distillation.

Mixed Modality LLM Radiology Report Methodology In Silico Academic Lab GenAI Reproducibility

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

Shivang Chopra, Lingchao Mao, Gabriela Sanchez-Rodriguez, Andrew J Feola, Jing Li, Zsolt Kira

•preprint•Jun 10 2025

Different medical imaging modalities capture diagnostic information at varying spatial resolutions, from coarse global patterns to fine-grained localized structures. However, most existing vision-language frameworks in the medical domain apply a uniform strategy for local feature extraction, overlooking the modality-specific demands. In this work, we present MedMoE, a modular and extensible vision-language processing framework that dynamically adapts visual representation based on the diagnostic context. MedMoE incorporates a Mixture-of-Experts (MoE) module conditioned on the report type, which routes multi-scale image features through specialized expert branches trained to capture modality-specific visual semantics. These experts operate over feature pyramids derived from a Swin Transformer backbone, enabling spatially adaptive attention to clinically relevant regions. This framework produces localized visual representations aligned with textual descriptions, without requiring modality-specific supervision at inference. Empirical results on diverse medical benchmarks demonstrate that MedMoE improves alignment and retrieval performance across imaging modalities, underscoring the value of modality-specialized visual representations in clinical vision-language systems.

Mixed Modality Report Generation Methodology In Silico Academic Lab GenAI

DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging

Felix Wagner, Pramit Saha, Harry Anthony, J. Alison Noble, Konstantinos Kamnitsas

•preprint•Jun 10 2025

Safe deployment of machine learning (ML) models in safety-critical domains such as medical imaging requires detecting inputs with characteristics not seen during training, known as out-of-distribution (OOD) detection, to prevent unreliable predictions. Effective OOD detection after deployment could benefit from access to the training data, enabling direct comparison between test samples and the training data distribution to identify differences. State-of-the-art OOD detection methods, however, either discard training data after deployment or assume that test samples and training data are centrally stored together, an assumption that rarely holds in real-world settings. This is because shipping training data with the deployed model is usually impossible due to the size of training databases, as well as proprietary or privacy constraints. We introduce the Isolation Network, an OOD detection framework that quantifies the difficulty of separating a target test sample from the training data by solving a binary classification task. We then propose Decentralized Isolation Networks (DIsoN), which enables the comparison of training and test data when data-sharing is impossible, by exchanging only model parameters between the remote computational nodes of training and deployment. We further extend DIsoN with class-conditioning, comparing a target sample solely with training data of its predicted class. We evaluate DIsoN on four medical imaging datasets (dermatology, chest X-ray, breast ultrasound, histopathology) across 12 OOD detection tasks. DIsoN performs favorably against existing methods while respecting data-privacy. This decentralized OOD detection framework opens the way for a new type of service that ML developers could provide along with their models: providing remote, secure utilization of their training data for OOD detection services. Code will be available upon acceptance at: *****

Mixed Modality Classification Methodology In Silico Academic Lab Open Code

SSS: Semi-Supervised SAM-2 with Efficient Prompting for Medical Imaging Segmentation

Hongjie Zhu, Xiwei Liu, Rundong Xue, Zeyu Zhang, Yong Xu, Daji Ergu, Ying Cai, Yang Zhao

•preprint•Jun 10 2025

In the era of information explosion, efficiently leveraging large-scale unlabeled data while minimizing the reliance on high-quality pixel-level annotations remains a critical challenge in the field of medical imaging. Semi-supervised learning (SSL) enhances the utilization of unlabeled data by facilitating knowledge transfer, significantly improving the performance of fully supervised models and emerging as a highly promising research direction in medical image analysis. Inspired by the ability of Vision Foundation Models (e.g., SAM-2) to provide rich prior knowledge, we propose SSS (Semi-Supervised SAM-2), a novel approach that leverages SAM-2's robust feature extraction capabilities to uncover latent knowledge in unlabeled medical images, thus effectively enhancing feature support for fully supervised medical image segmentation. Specifically, building upon the single-stream "weak-to-strong" consistency regularization framework, this paper introduces a Discriminative Feature Enhancement (DFE) mechanism to further explore the feature discrepancies introduced by various data augmentation strategies across multiple views. By leveraging feature similarity and dissimilarity across multi-scale augmentation techniques, the method reconstructs and models the features, thereby effectively optimizing the salient regions. Furthermore, a prompt generator is developed that integrates Physical Constraints with a Sliding Window (PCSW) mechanism to generate input prompts for unlabeled data, fulfilling SAM-2's requirement for additional prompts. Extensive experiments demonstrate the superiority of the proposed method for semi-supervised medical image segmentation on two multi-label datasets, i.e., ACDC and BHSD. Notably, SSS achieves an average Dice score of 53.15 on BHSD, surpassing the previous state-of-the-art method by +3.65 Dice. Code will be available at https://github.com/AIGeeksGroup/SSS.

Mixed Modality Segmentation Cardiac Methodology In Silico Academic Lab Open Code

Machine learning is changing osteoporosis detection: an integrative review.

Zhang Y, Ma M, Huang X, Liu J, Tian C, Duan Z, Fu H, Huang L, Geng B

•papers•Jun 10 2025

Machine learning drives osteoporosis detection and screening with higher clinical accuracy and accessibility than traditional osteoporosis screening tools. This review takes a step-by-step view of machine learning for osteoporosis detection, providing insights into today's osteoporosis detection and the outlook for the future. The early diagnosis and risk detection of osteoporosis have always been crucial and challenging issues in the medical field. With the in-depth application of artificial intelligence technology, especially machine learning technology in the medical field, significant breakthroughs have been made in the application of early diagnosis and risk detection of osteoporosis. Machine learning is a multidimensional technical system that encompasses a wide variety of algorithm types. Machine learning algorithms have become relatively mature and developed over many years in medical data processing. They possess stable and accurate detection performance, laying a solid foundation for the detection and diagnosis of osteoporosis. As an essential part of the machine learning technical system, deep-learning algorithms are complex algorithm models based on artificial neural networks. Due to their robust image recognition and feature extraction capabilities, deep learning algorithms have become increasingly mature in the early diagnosis and risk assessment of osteoporosis in recent years, opening new ideas and approaches for the early and accurate diagnosis and risk detection of osteoporosis. This paper reviewed the latest research over the past decade, ranging from relatively basic and widely adopted machine learning algorithms combined with clinical data to more advanced deep learning techniques integrated with imaging data such as X-ray, CT, and MRI. By analyzing the application of algorithms at different stages, we found that these basic machine learning algorithms performed well when dealing with single structured data but encountered limitations when handling high-dimensional and unstructured imaging data. On the other hand, deep learning can significantly improve detection accuracy. It does this by automatically extracting image features, especially in image histological analysis. However, it faces challenges. These include the "black-box" problem, heavy reliance on large amounts of labeled data, and difficulties in clinical interpretability. These issues highlighted the importance of model interpretability in future machine learning research. Finally, we expect to develop a predictive model in the future that combines multimodal data (such as clinical indicators, blood biochemical indicators, imaging data, and genetic data) integrated with electronic health records and machine learning techniques. This model aims to present a skeletal health monitoring system that is highly accessible, personalized, convenient, and efficient, furthering the early detection and prevention of osteoporosis.

Mixed Modality Classification Musculoskeletal Review Concept Academic Lab Ethics

Foundation Models in Medical Imaging -- A Review and Outlook

Vivien van Veldhuizen, Vanessa Botha, Chunyao Lu, Melis Erdal Cesur, Kevin Groot Lipman, Edwin D. de Jong, Hugo Horlings, Clárisa I. Sanchez, Cees G. M. Snoek, Lodewyk Wessels, Ritse Mann, Eric Marcus, Jonas Teuwen

•preprint•Jun 10 2025

Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.

Mixed Modality Review GenAI

Filter Papers

Tags

RadGPT: A system based on a large language model that generates sets of patient-centered materials to explain radiology report information.

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

Foundation Models in Medical Imaging -- A Review and Outlook

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Empirical evaluation of artificial intelligence distillation techniques for ascertaining cancer outcomes from electronic health records.

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging

SSS: Semi-Supervised SAM-2 with Efficient Prompting for Medical Imaging Segmentation

Machine learning is changing osteoporosis detection: an integrative review.

Foundation Models in Medical Imaging -- A Review and Outlook

Ready to Sharpen Your Edge?