Sort by:
Page 35 of 58575 results

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

Shivang Chopra, Gabriela Sanchez-Rodriguez, Lingchao Mao, Andrew J Feola, Jing Li, Zsolt Kira

arxiv logopreprintJun 10 2025
Different medical imaging modalities capture diagnostic information at varying spatial resolutions, from coarse global patterns to fine-grained localized structures. However, most existing vision-language frameworks in the medical domain apply a uniform strategy for local feature extraction, overlooking the modality-specific demands. In this work, we present MedMoE, a modular and extensible vision-language processing framework that dynamically adapts visual representation based on the diagnostic context. MedMoE incorporates a Mixture-of-Experts (MoE) module conditioned on the report type, which routes multi-scale image features through specialized expert branches trained to capture modality-specific visual semantics. These experts operate over feature pyramids derived from a Swin Transformer backbone, enabling spatially adaptive attention to clinically relevant regions. This framework produces localized visual representations aligned with textual descriptions, without requiring modality-specific supervision at inference. Empirical results on diverse medical benchmarks demonstrate that MedMoE improves alignment and retrieval performance across imaging modalities, underscoring the value of modality-specialized visual representations in clinical vision-language systems.

Foundation Models in Medical Imaging -- A Review and Outlook

Vivien van Veldhuizen, Vanessa Botha, Chunyao Lu, Melis Erdal Cesur, Kevin Groot Lipman, Edwin D. de Jong, Hugo Horlings, Clárisa I. Sanchez, Cees G. M. Snoek, Lodewyk Wessels, Ritse Mann, Eric Marcus, Jonas Teuwen

arxiv logopreprintJun 10 2025
Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.

Advancements and Applications of Hyperpolarized Xenon MRI for COPD Assessment in China.

Li H, Li H, Zhang M, Fang Y, Shen L, Liu X, Xiao S, Zeng Q, Zhou Q, Zhao X, Shi L, Han Y, Zhou X

pubmed logopapersJun 10 2025
Chronic obstructive pulmonary disease (COPD) is one of the leading causes of morbidity and mortality in China, highlighting the importance of early diagnosis and ongoing monitoring for effective management. In recent years, hyperpolarized 129Xe MRI technology has gained significant clinical attention due to its ability to non-invasively and visually assess lung ventilation, microstructure, and gas exchange function. Its recent clinical approval in China, the United States and several European countries, represents a significant advancement in pulmonary imaging. This review provides an overview of the latest developments in hyperpolarized 129Xe MRI technology for COPD assessment in China. It covers the progress in instrument development, advanced imaging techniques, artificial intelligence-driven reconstruction methods, molecular imaging, and the application of this technology in both COPD patients and animal models. Furthermore, the review explores potential technical innovations in 129Xe MRI and discusses future directions for its clinical applications, aiming to address existing challenges and expand the technology's impact in clinical practice.

MAMBO: High-Resolution Generative Approach for Mammography Images

Milica Škipina, Nikola Jovišić, Nicola Dall'Asen, Vanja Švenda, Anil Osman Tur, Slobodan Ilić, Elisa Ricci, Dubravko Ćulibrk

arxiv logopreprintJun 10 2025
Mammography is the gold standard for the detection and diagnosis of breast cancer. This procedure can be significantly enhanced with Artificial Intelligence (AI)-based software, which assists radiologists in identifying abnormalities. However, training AI systems requires large and diverse datasets, which are often difficult to obtain due to privacy and ethical constraints. To address this issue, the paper introduces MAMmography ensemBle mOdel (MAMBO), a novel patch-based diffusion approach designed to generate full-resolution mammograms. Diffusion models have shown breakthrough results in realistic image generation, yet few studies have focused on mammograms, and none have successfully generated high-resolution outputs required to capture fine-grained features of small lesions. To achieve this, MAMBO integrates separate diffusion models to capture both local and global (image-level) contexts. The contextual information is then fed into the final patch-based model, significantly aiding the noise removal process. This thoughtful design enables MAMBO to generate highly realistic mammograms of up to 3840x3840 pixels. Importantly, this approach can be used to enhance the training of classification models and extended to anomaly detection. Experiments, both numerical and radiologist validation, assess MAMBO's capabilities in image generation, super-resolution, and anomaly detection, highlighting its potential to enhance mammography analysis for more accurate diagnoses and earlier lesion detection.

RadGPT: A system based on a large language model that generates sets of patient-centered materials to explain radiology report information.

Herwald SE, Shah P, Johnston A, Olsen C, Delbrouck JB, Langlotz CP

pubmed logopapersJun 10 2025
The Cures Act Final Rule requires that patients have real-time access to their radiology reports, which contain technical language. Our objective to was to use a novel system called RadGPT, which integrates concept extraction and a large language model (LLM), to help patients understand their radiology reports. RadGPT generated 150 concept explanations and 390 question-and-answer pairs from 30 radiology report impressions from between 2012 and 2020. The extracted concepts were used to create concept-based explanations, as well as concept-based question-and-answer pairs where questions were generated using either a fixed template or an LLM. Additionally, report-based question-and-answer pairs were generated directly from the impression using an LLM without concept extraction. One board-certified radiologist and 4 radiology residents rated the material quality using a standardized rubric. Concept-based LLM-generated questions were significantly higher quality than concept-based template-generated questions (p < 0.001). Excluding those template-based question-and-answer pairs from further analysis, nearly all (> 95%) of RadGPT-generated materials were rated highly, with at least 50% receiving the highest possible ranking from all 5 raters. No answers or explanations were rated as likely to affect the safety or effectiveness of patient care. Report-level LLM-based questions and answers were rated particularly highly, with 92% of report-level LLM-based questions and 61% of the corresponding report-level answers receiving the highest rating from all raters. The educational tool RadGPT generated high-quality explanations and question-and-answer pairs that were personalized for each radiology report, unlikely to produce harmful explanations and likely to enhance patient understanding of radiology information.

Foundation Models in Medical Imaging -- A Review and Outlook

Vivien van Veldhuizen, Vanessa Botha, Chunyao Lu, Melis Erdal Cesur, Kevin Groot Lipman, Edwin D. de Jong, Hugo Horlings, Clárisa I. Sanchez, Cees G. M. Snoek, Lodewyk Wessels, Ritse Mann, Eric Marcus, Jonas Teuwen

arxiv logopreprintJun 10 2025
Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.

Diagnostic and Technological Advances in Magnetic Resonance (Focusing on Imaging Technique and the Gadolinium-Based Contrast Media), Computed Tomography (Focusing on Photon Counting CT), and Ultrasound-State of the Art.

Runge VM, Heverhagen JT

pubmed logopapersJun 9 2025
Magnetic resonance continues to evolve and advance as a critical imaging modality for disease diagnosis and monitoring. Hardware and software advances continue to propel this modality to the forefront of the field of diagnostic imaging. Next generation MR contrast media, specifically gadolinium chelates with improved relaxivity and stability (relative to the provided contrast effect), have emerged providing a further boost to the field. Concern regarding gadolinium deposition in the body with primarily the weaker gadolinium chelates (which have been now removed from the market, at least in Europe) continues to be at the forefront of clinicians' minds. This has driven renewed interest in possible development of manganese-based contrast media. The development of photon counting CT and its clinical introduction have made possible a further major advance in CT image quality, along with the potential for decreasing radiation dose. The possibility of major clinical advances in thoracic, cardiac, and musculoskeletal imaging were first recognized, with its broader impact - across all organ systems - now also recognized. The utility of routine acquisition (without penalty in time or radiation dose) of full spectral multi-energy data is now also being recognized as an additional major advance made possible by photon counting CT. Artificial intelligence is now being used in the background across most imaging platforms and modalities, making possible further advances in imaging technique and image quality, although this field is nowhere yet near to realizing its full potential. And last, but not least, the field of ultrasound is on the cusp of further major advances in availability (with development of very low-cost systems) and a possible new generation of microbubble contrast media.

Brain tau PET-based identification and characterization of subpopulations in patients with Alzheimer's disease using deep learning-derived saliency maps.

Li Y, Wang X, Ge Q, Graeber MB, Yan S, Li J, Li S, Gu W, Hu S, Benzinger TLS, Lu J, Zhou Y

pubmed logopapersJun 9 2025
Alzheimer's disease (AD) is a heterogeneous neurodegenerative disorder in which tau neurofibrillary tangles are a pathological hallmark closely associated with cognitive dysfunction and neurodegeneration. In this study, we used brain tau data to investigate AD heterogeneity by identifying and characterizing the subpopulations among patients. We included 615 cognitively normal and 159 AD brain <sup>18</sup>F-flortaucipr PET scans, along with T1-weighted MRI from the Alzheimer Disease Neuroimaging Initiative database. A three dimensional-convolutional neural network model was employed for AD detection using standardized uptake value ratio (SUVR) images. The model-derived saliency maps were generated and employed as informative image features for clustering AD participants. Among the identified subpopulations, statistical analysis of demographics, neuropsychological measures, and SUVR were compared. Correlations between neuropsychological measures and regional SUVRs were assessed. A generalized linear model was utilized to investigate the sex and APOE ε4 interaction effect on regional SUVRs. Two distinct subpopulations of AD patients were revealed, denoted as S<sub>Hi</sub> and S<sub>Lo</sub>. Compared to the S<sub>Lo</sub> group, the S<sub>Hi</sub> group exhibited a significantly higher global tau burden in the brain, but both groups showed similar cognition distribution levels. In the S<sub>Hi</sub> group, the associations between the neuropsychological measurements and regional tau deposition were weakened. Moreover, a significant interaction effect of sex and APOE ε4 on tau deposition was observed in the S<sub>Lo</sub> group, but no such effect was found in the S<sub>Hi</sub> group. Our results suggest that tau tangles, as shown by SUVR, continue to accumulate even when cognitive function plateaus in AD patients, highlighting the advantages of PET in later disease stages. The differing relationships between cognition and tau deposition, and between gender, APOE4, and tau deposition, provide potential for subtype-specific treatments. Targeting gender-specific and genetic factors influencing tau deposition, as well as interventions aimed at tau's impact on cognition, may be effective.

Large Language Models in Medical Diagnostics: Scoping Review With Bibliometric Analysis.

Su H, Sun Y, Li R, Zhang A, Yang Y, Xiao F, Duan Z, Chen J, Hu Q, Yang T, Xu B, Zhang Q, Zhao J, Li Y, Li H

pubmed logopapersJun 9 2025
The integration of large language models (LLMs) into medical diagnostics has garnered substantial attention due to their potential to enhance diagnostic accuracy, streamline clinical workflows, and address health care disparities. However, the rapid evolution of LLM research necessitates a comprehensive synthesis of their applications, challenges, and future directions. This scoping review aimed to provide an overview of the current state of research regarding the use of LLMs in medical diagnostics. The study sought to answer four primary subquestions, as follows: (1) Which LLMs are commonly used? (2) How are LLMs assessed in diagnosis? (3) What is the current performance of LLMs in diagnosing diseases? (4) Which medical domains are investigating the application of LLMs? This scoping review was conducted according to the Joanna Briggs Institute Manual for Evidence Synthesis and adheres to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). Relevant literature was searched from the Web of Science, PubMed, Embase, IEEE Xplore, and ACM Digital Library databases from 2022 to 2025. Articles were screened and selected based on predefined inclusion and exclusion criteria. Bibliometric analysis was performed using VOSviewer to identify major research clusters and trends. Data extraction included details on LLM types, application domains, and performance metrics. The field is rapidly expanding, with a surge in publications after 2023. GPT-4 and its variants dominated research (70/95, 74% of studies), followed by GPT-3.5 (34/95, 36%). Key applications included disease classification (text or image-based), medical question answering, and diagnostic content generation. LLMs demonstrated high accuracy in specialties like radiology, psychiatry, and neurology but exhibited biases in race, gender, and cost predictions. Ethical concerns, including privacy risks and model hallucination, alongside regulatory fragmentation, were critical barriers to clinical adoption. LLMs hold transformative potential for medical diagnostics but require rigorous validation, bias mitigation, and multimodal integration to address real-world complexities. Future research should prioritize explainable artificial intelligence frameworks, specialty-specific optimization, and international regulatory harmonization to ensure equitable and safe clinical deployment.

HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

Shijie Wang, Yilun Zhang, Zeyu Lai, Dexing Kong

arxiv logopreprintJun 9 2025
Multimodal large language models (MLLMs) have shown great potential in general domains but perform poorly in some specific domains due to a lack of domain-specific data, such as image-text data or vedio-text data. In some specific domains, there is abundant graphic and textual data scattered around, but lacks standardized arrangement. In the field of medical ultrasound, there are ultrasonic diagnostic books, ultrasonic clinical guidelines, ultrasonic diagnostic reports, and so on. However, these ultrasonic materials are often saved in the forms of PDF, images, etc., and cannot be directly used for the training of MLLMs. This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline to create specific domain quadruplets (image, question, thinking trace, and answer) from domain-specific materials. A medical ultrasound domain dataset ReMUD is established, containing over 45,000 reasoning and non-reasoning supervised fine-tuning Question Answering (QA) and Visual Question Answering (VQA) data. The ReMUD-7B model, fine-tuned on Qwen2.5-VL-7B-Instruct, outperforms general-domain MLLMs in medical ultrasound field. To facilitate research, the ReMUD dataset, data generation codebase, and ReMUD-7B parameters will be released at https://github.com/ShiDaizi/ReMUD, addressing the data shortage issue in specific domain MLLMs.
Page 35 of 58575 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.