Comparative Analysis of Multimodal Large Language Models GPT-4o and o1 vs Clinicians in Clinical Case Challenge Questions

Jung, J., Kim, H., Bae, S., Park, J. Y.

medRxiv preprint · Jun 23, 2025
Background: Generative Pre-trained Transformer 4 (GPT-4) has demonstrated strong performance in standardized medical examinations but has limitations in real-world clinical settings. The newly released multimodal GPT-4o model, which integrates text and image inputs to enhance diagnostic capabilities, and the multimodal o1 model, which incorporates advanced reasoning, may address these limitations. Objective: This study aimed to compare the performance of GPT-4o and o1 against clinicians in real-world clinical case challenges. Methods: This retrospective, cross-sectional study used Medscape case challenge questions from May 2011 to June 2024 (n = 1,426). Each case included text and images of patient history, physical examination findings, diagnostic test results, and imaging studies. Clinicians were required to choose one answer from among multiple options, with the most frequent response defined as the clinicians' decision. Model decisions were obtained from GPT models (3.5 Turbo, 4 Turbo, 4 Omni, and o1), which interpreted the text and images and then returned a formatted answer. We compared the performance of the clinicians and GPT models using mixed-effects logistic regression analysis. Results: Of the 1,426 questions, clinicians achieved an overall accuracy of 85.0%, whereas GPT-4o and o1 demonstrated higher accuracies of 88.4% and 94.3% (mean difference 3.4%; P = .005 and mean difference 9.3%; P < .001), respectively. In the multimodal performance analysis, which included cases involving images (n = 917), GPT-4o achieved an accuracy of 88.3% and o1 achieved 93.9%, both significantly outperforming clinicians (mean difference 4.2%; P = .005 and mean difference 9.8%; P < .001). o1 showed the highest accuracy across all question categories, achieving 92.6% in diagnosis (mean difference 14.5%; P < .001), 97.0% in disease characteristics (mean difference 7.2%; P < .001), 92.6% in examination (mean difference 7.3%; P = .002), and 94.8% in treatment (mean difference 4.3%; P = .005), consistently outperforming clinicians. By medical specialty, o1 achieved 93.6% accuracy in internal medicine (mean difference 10.3%; P < .001), 96.6% in major surgery (mean difference 9.2%; P = .030), 97.3% in psychiatry (mean difference 10.6%; P = .030), and 95.4% in minor specialties (mean difference 10.0%; P < .001), significantly surpassing clinicians. Across five trials, GPT-4o and o1 provided the correct answer 5/5 times in 86.2% and 90.7% of cases, respectively. Conclusions: GPT-4o and o1 achieved higher accuracy than clinicians in clinical case challenge questions, particularly in disease diagnosis, and could serve as valuable tools to assist healthcare professionals in clinical settings.
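
For readers unfamiliar with how such cases can be fed to a multimodal model, the sketch below shows one way a case's vignette text and image might be sent to a chat-completions endpoint and constrained to a single-letter answer. The model name, prompt wording, and answer parsing are illustrative assumptions, not the study's exact protocol.

```python
# Illustrative only: submit one case (vignette text + image) to a multimodal
# chat model and request a single-letter choice. Model name, prompt, and
# parsing are assumptions, not the study's protocol.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def answer_case(vignette: str, options: list[str], image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    prompt = (
        f"{vignette}\n\nOptions:\n"
        + "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(options))
        + "\n\nReply with the single letter of the best answer."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip()[0]  # e.g. "B"
```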

Open Set Recognition for Endoscopic Image Classification: A Deep Learning Approach on the Kvasir Dataset

Kasra Moazzami, Seoyoun Son, John Lin, Sun Min Lee, Daniel Son, Hayeon Lee, Jeongho Lee, Seongji Lee

arXiv preprint · Jun 23, 2025
Endoscopic image classification plays a pivotal role in medical diagnostics by identifying anatomical landmarks and pathological findings. However, conventional closed-set classification frameworks are inherently limited in open-world clinical settings, where previously unseen conditions can arise and compromise model reliability. To address this, we explore the application of Open Set Recognition (OSR) techniques on the Kvasir dataset, a publicly available and diverse endoscopic image collection. In this study, we evaluate and compare the OSR capabilities of several representative deep learning architectures, including ResNet-50, Swin Transformer, and a hybrid ResNet-Transformer model, under both closed-set and open-set conditions. OpenMax is adopted as a baseline OSR method to assess the ability of these models to distinguish known classes from previously unseen categories. This work represents one of the first efforts to apply open set recognition to the Kvasir dataset and provides a foundational benchmark for evaluating OSR performance in medical image analysis. Our results offer practical insights into model behavior in clinically realistic settings and highlight the importance of OSR techniques for the safe deployment of AI systems in endoscopy.
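
As a rough illustration of the OpenMax baseline the authors adopt, the sketch below fits per-class Weibull tails on activation distances from training data and redistributes probability mass to an "unknown" class at test time. The tail size, distance metric, and alpha value are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of the OpenMax idea (Bendale & Boult, 2016) layered on top of
# a trained closed-set classifier. Tail size and alpha are assumptions.
import numpy as np
from scipy.stats import weibull_min

def fit_openmax(train_logits, train_labels, num_classes, tail_size=20):
    """Fit per-class mean activation vectors (MAVs) and Weibull tails on the
    distances of correctly classified training samples."""
    mavs, weibulls = [], []
    for c in range(num_classes):
        correct = train_logits[(train_labels == c) &
                               (train_logits.argmax(1) == c)]
        mav = correct.mean(axis=0)
        dists = np.linalg.norm(correct - mav, axis=1)
        tail = np.sort(dists)[-tail_size:]                 # largest distances
        shape, loc, scale = weibull_min.fit(tail, floc=0.0)
        mavs.append(mav)
        weibulls.append((shape, loc, scale))
    return np.stack(mavs), weibulls

def openmax_probs(logits, mavs, weibulls, alpha=3):
    """Recalibrate one logit vector; the last entry is the 'unknown' mass."""
    ranked = logits.argsort()[::-1]
    revised = logits.copy()
    unknown = 0.0
    for rank, c in enumerate(ranked[:alpha]):
        shape, loc, scale = weibulls[c]
        dist = np.linalg.norm(logits - mavs[c])
        w = weibull_min.cdf(dist, shape, loc, scale)   # outlier probability
        damp = 1.0 - w * (alpha - rank) / alpha
        unknown += logits[c] * (1.0 - damp)
        revised[c] = logits[c] * damp
    scores = np.append(revised, unknown)
    e = np.exp(scores - scores.max())
    return e / e.sum()
```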

BrainSymphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity

Moein Khajehnejad, Forough Habibollahi, Adeel Razi

arXiv preprint · Jun 23, 2025
Existing foundation models for neuroimaging are often prohibitively large and data-intensive. We introduce BrainSymphony, a lightweight, parameter-efficient foundation model that achieves state-of-the-art performance while being pre-trained on significantly smaller public datasets. BrainSymphony's strong multimodal architecture processes functional MRI data through parallel spatial and temporal transformer streams, which are then efficiently distilled into a unified representation by a Perceiver module. Concurrently, it models structural connectivity from diffusion MRI using a novel signed graph transformer to encode the brain's anatomical structure. These powerful, modality-specific representations are then integrated via an adaptive fusion gate. Despite its compact design, our model consistently outperforms larger models on a diverse range of downstream benchmarks, including classification, prediction, and unsupervised network identification tasks. Furthermore, our model revealed novel insights into brain dynamics using attention maps on a unique external psilocybin neuroimaging dataset (pre- and post-administration). BrainSymphony establishes that architecturally aware, multimodal models can surpass their larger counterparts, paving the way for more accessible and powerful research in computational neuroscience.
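
The adaptive fusion gate mentioned above can be pictured as a learned, per-feature blend of the functional and structural embeddings. The sketch below is a minimal illustration under assumed layer sizes and gating form, not BrainSymphony's actual module.

```python
# Minimal sketch of an adaptive fusion gate blending a functional (fMRI)
# embedding with a structural (dMRI) embedding. Sizes are assumptions.
import torch
import torch.nn as nn

class AdaptiveFusionGate(nn.Module):
    def __init__(self, dim_func: int, dim_struct: int, dim_out: int):
        super().__init__()
        self.proj_func = nn.Linear(dim_func, dim_out)
        self.proj_struct = nn.Linear(dim_struct, dim_out)
        # The gate decides, per feature, how much to trust each modality.
        self.gate = nn.Sequential(nn.Linear(2 * dim_out, dim_out), nn.Sigmoid())

    def forward(self, z_func: torch.Tensor, z_struct: torch.Tensor) -> torch.Tensor:
        f = self.proj_func(z_func)        # (batch, dim_out)
        s = self.proj_struct(z_struct)    # (batch, dim_out)
        g = self.gate(torch.cat([f, s], dim=-1))
        return g * f + (1.0 - g) * s      # convex per-feature blend

# Example: fuse a 512-d fMRI summary with a 256-d structural embedding.
fuse = AdaptiveFusionGate(512, 256, 384)
fused = fuse(torch.randn(8, 512), torch.randn(8, 256))   # -> (8, 384)
```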

SafeClick: Error-Tolerant Interactive Segmentation of Any Medical Volumes via Hierarchical Expert Consensus

Yifan Gao, Jiaxi Sheng, Wenbin Wu, Haoyue Li, Yaoxian Dong, Chaoyang Ge, Feng Yuan, Xin Gao

arXiv preprint · Jun 23, 2025
Foundation models for volumetric medical image segmentation have emerged as powerful tools in clinical workflows, enabling radiologists to delineate regions of interest through intuitive clicks. While these models demonstrate promising capabilities in segmenting previously unseen anatomical structures, their performance is strongly influenced by prompt quality. In clinical settings, radiologists often provide suboptimal prompts, which affects segmentation reliability and accuracy. To address this limitation, we present SafeClick, an error-tolerant interactive segmentation approach for medical volumes based on hierarchical expert consensus. SafeClick operates as a plug-and-play module compatible with foundation models including SAM 2 and MedSAM 2. The framework consists of two key components: a collaborative expert layer (CEL) that generates diverse feature representations through specialized transformer modules, and a consensus reasoning layer (CRL) that performs cross-referencing and adaptive integration of these features. This architecture transforms the segmentation process from a prompt-dependent operation to a robust framework capable of producing accurate results despite imperfect user inputs. Extensive experiments across 15 public datasets demonstrate that our plug-and-play approach consistently improves the performance of base foundation models, with particularly significant gains when working with imperfect prompts. The source code is available at https://github.com/yifangao112/SafeClick.
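 
As a rough picture of the hierarchical expert-consensus idea, the sketch below has several expert heads vote on segmentation logits while a small consensus layer adaptively weights those votes. The module names, head design, and weighting scheme are assumptions, not SafeClick's actual CEL/CRL implementation.

```python
# Minimal sketch: multiple expert heads propose volumetric segmentation logits
# and a consensus layer weights them. Illustrative assumption, not the paper's design.
import torch
import torch.nn as nn

class ExpertConsensusHead(nn.Module):
    def __init__(self, in_ch: int, num_experts: int = 4):
        super().__init__()
        # "Collaborative experts": independent 3D heads over shared features.
        self.experts = nn.ModuleList(
            [nn.Conv3d(in_ch, 1, kernel_size=1) for _ in range(num_experts)]
        )
        # "Consensus reasoning": predict a reliability weight per expert.
        self.weigher = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(in_ch, num_experts), nn.Softmax(dim=-1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, in_ch, D, H, W) features from the frozen foundation model.
        votes = torch.stack([e(feats) for e in self.experts], dim=1)  # (B, E, 1, D, H, W)
        w = self.weigher(feats).view(feats.size(0), -1, 1, 1, 1, 1)   # (B, E, 1, 1, 1, 1)
        return (w * votes).sum(dim=1)   # consensus logits (B, 1, D, H, W)

head = ExpertConsensusHead(in_ch=64)
logits = head(torch.randn(2, 64, 16, 32, 32))   # -> (2, 1, 16, 32, 32)
```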

Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster

Fenghe Tang, Wenxin Ma, Zhiyang He, Xiaodong Tao, Zihang Jiang, S. Kevin Zhou

arXiv preprint · Jun 22, 2025
With the advancement of Large Language Model (LLM) for natural language processing, this paper presents an intriguing finding: a frozen pre-trained LLM layer can process visual tokens for medical image segmentation tasks. Specifically, we propose a simple hybrid structure that integrates a pre-trained, frozen LLM layer within the CNN encoder-decoder segmentation framework (LLM4Seg). Surprisingly, this design improves segmentation performance with a minimal increase in trainable parameters across various modalities, including ultrasound, dermoscopy, polypscopy, and CT scans. Our in-depth analysis reveals the potential of transferring LLM's semantic awareness to enhance segmentation tasks, offering both improved global understanding and better local modeling capabilities. The improvement proves robust across different LLMs, validated using LLaMA and DeepSeek.
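
The hybrid design can be sketched as a thin trainable bridge around a frozen transformer block: flatten the CNN bottleneck into tokens, pass them through the frozen layer, and project back before the decoder. The stand-in transformer layer and projection sizes below are assumptions; the paper uses actual pre-trained LLaMA and DeepSeek layers.

```python
# Minimal sketch of the frozen-LLM-layer idea: trainable projections around a
# frozen transformer block inside a CNN encoder-decoder. Sizes are assumptions.
import torch
import torch.nn as nn

class FrozenLLMBridge(nn.Module):
    def __init__(self, cnn_channels: int, llm_dim: int, llm_block: nn.Module):
        super().__init__()
        self.to_tokens = nn.Linear(cnn_channels, llm_dim)      # trainable
        self.from_tokens = nn.Linear(llm_dim, cnn_channels)    # trainable
        self.llm_block = llm_block.requires_grad_(False)       # frozen block

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) bottleneck features from the CNN encoder.
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
        tokens = self.to_tokens(tokens)                # (B, H*W, llm_dim)
        tokens = self.llm_block(tokens)                # frozen transformer pass
        tokens = self.from_tokens(tokens)              # (B, H*W, C)
        return tokens.transpose(1, 2).reshape(b, c, h, w) + feats  # residual

# Stand-in block; in the paper this would be a frozen layer taken from LLaMA
# or DeepSeek rather than a randomly initialized one.
block = nn.TransformerEncoderLayer(d_model=1024, nhead=8, batch_first=True)
bridge = FrozenLLMBridge(cnn_channels=256, llm_dim=1024, llm_block=block)
out = bridge(torch.randn(2, 256, 16, 16))   # -> (2, 256, 16, 16)
```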

From "time is brain" to "time is collaterals": updates on the role of cerebral collateral circulation in stroke.

Marilena M, Romana PF, Guido A, Gianluca R, Sebastiano F, Enrico P, Sabrina A

PubMed · Jun 22, 2025
Acute ischemic stroke (AIS) remains a leading cause of mortality and disability worldwide. While revascularization therapies such as intravenous thrombolysis (IVT) and endovascular thrombectomy (EVT) have significantly improved outcomes, their success is strongly influenced by the status of cerebral collateral circulation. Collateral vessels sustain cerebral perfusion during vascular occlusion, limiting infarct growth and extending therapeutic windows. Despite this recognized importance, standardized methods for assessing collateral status and integrating it into treatment strategies are still evolving. This narrative review synthesizes current evidence on the role of collateral circulation in AIS, focusing on its impact on infarct dynamics, treatment efficacy, and functional recovery. We highlight findings from major clinical trials, including MR CLEAN, DAWN, DEFUSE-3, and SWIFT PRIME, which consistently demonstrate that robust collateral networks are associated with improved outcomes and expanded eligibility for reperfusion therapies. Advances in neuroimaging, such as multiphase CTA and perfusion MRI, alongside emerging AI-driven automated collateral grading, are reshaping patient selection and clinical decision-making. We also discuss novel therapeutic strategies aimed at enhancing collateral flow, such as vasodilators, neuroprotective agents, statins, and stem cell therapies. Despite growing evidence supporting collateral-based treatment approaches, real-time clinical implementation remains limited by challenges in standardization and access. Cerebral collateral circulation is a critical determinant of stroke prognosis and treatment response. Incorporating collateral assessment into acute stroke workflows, supported by advanced imaging, artificial intelligence, and personalized medicine, offers a promising pathway to optimize outcomes. As the field moves beyond a strict "time is brain" model, the emerging paradigm of "time is collaterals" may better reflect the dynamic interplay between perfusion, tissue viability, and therapeutic opportunity in AIS management.

Training-free Test-time Improvement for Explainable Medical Image Classification

Hangzhou He, Jiachen Tang, Lei Zhu, Kaiwen Li, Yanye Lu

arXiv preprint · Jun 22, 2025
Deep learning-based medical image classification techniques are rapidly advancing in medical image analysis, making it crucial to develop accurate and trustworthy models that can be efficiently deployed across diverse clinical scenarios. Concept Bottleneck Models (CBMs), which first predict a set of explainable concepts from images and then perform classification based on these concepts, are increasingly being adopted for explainable medical image classification. However, the inherent explainability of CBMs introduces new challenges when deploying trained models to new environments. Variations in imaging protocols and staining methods may induce concept-level shifts, such as alterations in color distribution and scale. Furthermore, since CBM training requires explicit concept annotations, fine-tuning models solely with image-level labels could compromise concept prediction accuracy and faithfulness - a critical limitation given the high cost of acquiring expert-annotated concept labels in medical domains. To address these challenges, we propose a training-free confusion concept identification strategy. By leveraging minimal new data (e.g., 4 images per class) with only image-level labels, our approach enhances out-of-domain performance without sacrificing source domain accuracy through two key operations: masking misactivated confounding concepts and amplifying under-activated discriminative concepts. The efficacy of our method is validated on both skin and white blood cell images. Our code is available at: https://github.com/riverback/TF-TTI-XMed.
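
The masking-and-amplification step can be illustrated as a per-concept scaling vector estimated from the handful of labelled target-domain images. The thresholds and the drift-based identification rule below are assumptions, not the paper's exact strategy.

```python
# Minimal sketch of the training-free idea: estimate concept drift on a few
# labelled target images, then mask or amplify concepts at test time before
# the CBM's unchanged label head. Thresholds are assumptions.
import numpy as np

def build_concept_scale(src_mean_act, tgt_acts, tgt_labels,
                        mask_thresh=0.3, amp_factor=1.5):
    """src_mean_act: (num_classes, num_concepts) mean concept activations on
    the source domain. tgt_acts, tgt_labels: a few labelled target samples."""
    num_classes, num_concepts = src_mean_act.shape
    drift = np.zeros(num_concepts)
    for c in range(num_classes):
        tgt_mean = tgt_acts[tgt_labels == c].mean(axis=0)
        drift += (tgt_mean - src_mean_act[c]) / num_classes
    scale = np.ones(num_concepts)
    scale[drift > mask_thresh] = 0.0          # mask misactivated confounding concepts
    scale[drift < -mask_thresh] = amp_factor  # amplify under-activated discriminative concepts
    return scale

def predict(concept_acts, scale, head_weights, head_bias):
    """Rescale concept activations, then apply the CBM's unchanged linear head."""
    return (concept_acts * scale) @ head_weights.T + head_bias

# Example: 3 classes, 20 concepts, 4 labelled target images per class.
rng = np.random.default_rng(0)
scale = build_concept_scale(rng.random((3, 20)), rng.random((12, 20)),
                            np.repeat(np.arange(3), 4))
```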

LLM-driven Medical Report Generation via Communication-efficient Heterogeneous Federated Learning

Haoxuan Che, Haibo Jin, Zhengrui Guo, Yi Lin, Cheng Jin, Hao Chen

arXiv preprint · Jun 21, 2025
LLMs have demonstrated significant potential in Medical Report Generation (MRG), yet their development requires large amounts of medical image-report pairs, which are commonly scattered across multiple centers. Centralizing these data is exceptionally challenging due to privacy regulations, thereby impeding model development and broader adoption of LLM-driven MRG models. To address this challenge, we present FedMRG, the first framework that leverages Federated Learning (FL) to enable privacy-preserving, multi-center development of LLM-driven MRG models, specifically designed to overcome the critical challenge of communication-efficient LLM training under multi-modal data heterogeneity. To start with, our framework tackles the fundamental challenge of communication overhead in FL-LLM tuning by employing low-rank factorization to efficiently decompose parameter updates, significantly reducing gradient transmission costs and making LLM-driven MRG feasible in bandwidth-constrained FL settings. Furthermore, we observed the dual heterogeneity in MRG under the FL scenario: varying image characteristics across medical centers, as well as diverse reporting styles and terminology preferences. To address this, we further enhance FedMRG with (1) client-aware contrastive learning in the MRG encoder, coupled with diagnosis-driven prompts, which capture both globally generalizable and locally distinctive features while maintaining diagnostic accuracy; and (2) a dual-adapter mutual boosting mechanism in the MRG decoder that harmonizes generic and specialized adapters to address variations in reporting styles and terminology. Through extensive evaluation of our established FL-MRG benchmark, we demonstrate the generalizability and adaptability of FedMRG, underscoring its potential in harnessing multi-center data and generating clinically accurate reports while maintaining communication efficiency.
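
The communication saving comes from exchanging only low-rank factors of each weight update, in the spirit of LoRA. The sketch below shows a minimal version with plain FedAvg aggregation of the factors; the rank, dimensions, and aggregation rule are assumptions rather than FedMRG's exact scheme.

```python
# Minimal sketch: clients tune only a low-rank factorization (A, B) of the
# weight update; the server averages just those small matrices. Assumptions only.
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)            # frozen LLM weight
        self.A = nn.Parameter(torch.zeros(rank, base.in_features))
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        nn.init.normal_(self.A, std=0.02)                 # B stays zero at init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.A.t() @ self.B.t()

def aggregate(client_adapters):
    """Server-side FedAvg over the low-rank factors only (a tiny payload
    compared with the full out_features x in_features update)."""
    with torch.no_grad():
        A = torch.stack([c.A for c in client_adapters]).mean(0)
        B = torch.stack([c.B for c in client_adapters]).mean(0)
        for c in client_adapters:
            c.A.copy_(A)
            c.B.copy_(B)

clients = [LowRankAdapter(nn.Linear(1024, 1024)) for _ in range(3)]
aggregate(clients)
```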

Emergency radiology: roadmap for radiology departments.

Aydin S, Ece B, Cakmak V, Kocak B, Onur MR

PubMed · Jun 20, 2025
Emergency radiology has evolved into a significant subspecialty over the past 2 decades, facing unique challenges including escalating imaging volumes, increasing study complexity, and heightened expectations from clinicians and patients. This review provides a comprehensive overview of the key requirements for an effective emergency radiology unit. Emergency radiologists play a crucial role in real-time decision-making by providing continuous 24/7 support, requiring expertise across various organ systems and close collaboration with emergency physicians and specialists. Beyond image interpretation, emergency radiologists are responsible for organizing staff schedules, planning equipment, determining imaging protocols, and establishing standardized reporting systems. Operational considerations in emergency radiology departments include efficient scheduling models such as circadian-based scheduling, strategic equipment organization with primary imaging modalities positioned near emergency departments, and effective imaging management through structured ordering systems and standardized protocols. Preparedness for mass casualty incidents requires a well-organized workflow process map detailing steps from patient transfer to image acquisition and interpretation, with clear task allocation and imaging pathways. Collaboration between emergency radiologists and physicians is essential, with accurate communication facilitated through various channels and structured reporting templates. Artificial intelligence has emerged as a transformative tool in emergency radiology, offering potential benefits in both interpretative domains (detecting intracranial hemorrhage, pulmonary embolism, acute ischemic stroke) and non-interpretative applications (triage systems, protocol assistance, quality control). Despite implementation challenges including clinician skepticism, financial considerations, and ethical issues, AI can enhance diagnostic accuracy and workflow optimization. Teleradiology provides solutions for staff shortages, particularly during off-hours, with hybrid models allowing radiologists to work both on-site and remotely. This review aims to guide stakeholders in establishing and maintaining efficient emergency radiology services to improve patient outcomes.

Research hotspots and development trends in molecular imaging of glioma (2014-2024): A bibliometric review.

Zhou H, Luo Y, Li S, Zhang G, Zeng X

PubMed · Jun 20, 2025
This study aims to explore research hotspots and development trends in molecular imaging of glioma from 2014 to 2024. A total of 2,957 publications indexed in the Web of Science Core Collection (WoSCC) were analyzed using bibliometric techniques. To visualize the research landscape, co-citation clustering, keyword analysis, and technological trend mapping were performed using CiteSpace and Excel. Publication output peaked in 2021. Emerging research trends included the integration of radiomics and artificial intelligence and the application of novel imaging modalities such as positron emission tomography and magnetic resonance spectroscopy. Significant progress was observed in blood-brain barrier disruption techniques and the development of molecular probes, especially those targeting IDH and MGMT mutations. Molecular imaging has been pivotal in advancing glioma research, contributing to improved diagnostic accuracy and personalized treatment strategies. However, challenges such as clinical translation and standardization remain. Future studies should focus on integrating advanced technologies into routine clinical practice to enhance patient care.