Integrating Large language models into radiology workflow: Impact of generating personalized report templates from summary.

Gupta A, Hussain M, Nikhileshwar K, Rastogi A, Rangarajan K

PubMed · May 25, 2025
To evaluate the feasibility of large language models (LLMs) in converting radiologist-generated report summaries into personalized report templates, and to assess the impact on scan reporting time and quality. In this retrospective study, 100 CT scans from oncology patients were randomly divided into two equal sets. Two radiologists generated conventional reports for one set and summary reports for the other, and vice versa. Three LLMs - GPT-4, Google Gemini, and Claude Opus - generated complete reports from the summaries using institution-specific generic templates. Two expert radiologists qualitatively evaluated the radiologist summaries and LLM-generated reports using the ACR RADPEER scoring system, with the conventional radiologist reports as reference. Reporting time for conventional versus summary-based reports was compared, and LLM-generated reports were analyzed for errors. Quantitative similarity and linguistic metrics were computed to assess report alignment across models with the original radiologist-generated report summaries. Statistical analyses were performed using Python 3.0 to identify significant differences in reporting times, error rates, and quantitative metrics. The average reporting time was significantly shorter for the summary method (6.76 min) than for the conventional method (8.95 min) (p < 0.005). Among the 100 radiologist summaries, 10 received RADPEER scores worse than 1, with three deemed to have clinically significant discrepancies. Only one LLM-generated report received a worse RADPEER score than its corresponding summary. Error frequencies among LLM-generated reports showed no significant differences across models, with template-related errors being the most common (χ² = 1.146, p = 0.564). Quantitative analysis indicated significant differences in similarity and linguistic metrics among the three LLMs (p < 0.05), reflecting distinct generation patterns. Summary-based scan reporting combined with LLM generation of complete personalized report templates can shorten reporting time while maintaining report quality; however, human oversight remains necessary to address errors in the generated reports. Summary-based reporting of radiological studies, together with the use of large language models to generate tailored reports from generic templates, has the potential to make the workflow more efficient by shortening reporting time while maintaining reporting quality.
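
To make the statistical comparisons concrete, here is a minimal sketch (not the authors' code) of how reporting times and error frequencies could be compared in Python with SciPy. The paper does not name the test used for reporting times, so the t-test below is an assumption, and all numbers are made-up placeholders; only the chi-square test on error-category counts mirrors the statistic reported in the abstract.

```python
import numpy as np
from scipy import stats

# Hypothetical per-scan reporting times in minutes (assumed data, not from the study).
conventional_times = np.array([9.1, 8.7, 9.4, 8.2, 9.3])
summary_times      = np.array([6.9, 6.5, 7.1, 6.4, 6.9])

# Two-sample comparison of mean reporting time (test choice is an assumption).
t_stat, p_time = stats.ttest_ind(conventional_times, summary_times)
print(f"reporting time: t = {t_stat:.2f}, p = {p_time:.4f}")

# Chi-square test of error-category frequencies across the three LLMs
# (rows = error categories, columns = GPT-4 / Gemini / Claude; counts are made up).
error_table = np.array([
    [12, 10, 11],   # template-related errors
    [ 4,  5,  3],   # content omissions
    [ 2,  3,  2],   # other errors
])
chi2, p_err, dof, _ = stats.chi2_contingency(error_table)
print(f"error frequencies: chi2 = {chi2:.3f}, dof = {dof}, p = {p_err:.3f}")
```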

Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning

Shaohao Rui, Kaitao Chen, Weijie Ma, Xiaosong Wang

arXiv preprint · May 25, 2025
Recent advances in reinforcement learning with verifiable, rule-based rewards have greatly enhanced the reasoning capabilities and out-of-distribution generalization of VLMs/LLMs, obviating the need for manually crafted reasoning chains. Despite these promising developments in the general domain, their translation to medical imaging remains limited. Current medical reinforcement fine-tuning (RFT) methods predominantly focus on close-ended VQA, thereby restricting the model's ability to engage in world knowledge retrieval and flexible task adaptation. More critically, these methods fall short of addressing the critical clinical demand for open-ended, reasoning-intensive decision-making. To bridge this gap, we introduce MedCCO, the first multimodal reinforcement learning framework tailored for medical VQA that unifies close-ended and open-ended data within a curriculum-driven RFT paradigm. Specifically, MedCCO is initially fine-tuned on a diverse set of close-ended medical VQA tasks to establish domain-grounded reasoning capabilities, and is then progressively adapted to open-ended tasks to foster deeper knowledge enhancement and clinical interpretability. We validate MedCCO across eight challenging medical VQA benchmarks, spanning both close-ended and open-ended settings. Experimental results show that MedCCO consistently enhances performance and generalization, achieving an 11.4% accuracy gain across three in-domain tasks and a 5.7% improvement on five out-of-domain benchmarks. These findings highlight the promise of curriculum-guided RL in advancing robust, clinically relevant reasoning in medical multimodal language models.
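
The curriculum idea described above (fine-tune on close-ended VQA first, then progressively shift toward open-ended tasks) can be illustrated with a simple data scheduler. This is a toy sketch under assumed names and a made-up ramp schedule, not the MedCCO implementation.

```python
import random

def curriculum_batches(closed_qa, open_qa, total_steps, batch_size=8, warmup_frac=0.4):
    """Yield training batches that start with close-ended VQA only and
    progressively blend in open-ended VQA (a toy curriculum schedule)."""
    warmup_steps = int(total_steps * warmup_frac)
    for step in range(total_steps):
        if step < warmup_steps:
            open_ratio = 0.0                      # stage 1: close-ended only
        else:                                     # stage 2: ramp up the open-ended share
            open_ratio = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        batch = [
            random.choice(open_qa) if random.random() < open_ratio else random.choice(closed_qa)
            for _ in range(batch_size)
        ]
        yield step, open_ratio, batch

# Toy usage with placeholder QA items.
closed = [{"q": "Is there a fracture?", "a": "yes"}]
opened = [{"q": "Describe the key findings.", "a": "..."}]
for step, ratio, batch in curriculum_batches(closed, opened, total_steps=5):
    print(step, f"open-ended ratio = {ratio:.2f}", len(batch))
```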

Explainable deep learning for age and gender estimation in dental CBCT scans using attention mechanisms and multi-task learning.

Pishghadam N, Esmaeilyfard R, Paknahad M

PubMed · May 24, 2025
Accurate and interpretable age estimation and gender classification are essential in forensic and clinical diagnostics, particularly when using high-dimensional medical imaging data such as Cone Beam Computed Tomography (CBCT). Traditional CBCT-based approaches often suffer from high computational costs and limited interpretability, reducing their applicability in forensic investigations. This study aims to develop a multi-task deep learning framework that enhances both accuracy and explainability in CBCT-based age estimation and gender classification using attention mechanisms. We propose a multi-task learning (MTL) model that simultaneously estimates age and classifies gender using panoramic slices extracted from CBCT scans. To improve interpretability, we integrate the Convolutional Block Attention Module (CBAM) and Grad-CAM visualization, highlighting relevant craniofacial regions. The dataset includes 2,426 CBCT images from individuals aged 7 to 23 years, and performance is assessed using Mean Absolute Error (MAE) for age estimation and accuracy for gender classification. The proposed model achieves an MAE of 1.08 years for age estimation and 95.3% accuracy in gender classification, significantly outperforming conventional CBCT-based methods. CBAM enhances the model's ability to focus on clinically relevant anatomical features, while Grad-CAM provides visual explanations, improving interpretability. Additionally, using panoramic slices instead of full 3D CBCT volumes reduces computational costs without sacrificing accuracy. Our framework improves both accuracy and interpretability in forensic age estimation and gender classification from CBCT images. By incorporating explainable AI techniques, this model provides a computationally efficient and clinically interpretable tool for forensic and medical applications.
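
As an illustration of the multi-task setup (a shared backbone feeding one regression head for age and one classification head for gender), the sketch below combines an L1 age loss, which corresponds to optimizing MAE, with a cross-entropy gender loss in PyTorch. The head architecture, feature size, and loss weights are assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class AgeGenderHead(nn.Module):
    """Toy multi-task head: shared features -> age regression + gender classification."""
    def __init__(self, in_features=512):
        super().__init__()
        self.age_head = nn.Linear(in_features, 1)       # predicts age in years
        self.gender_head = nn.Linear(in_features, 2)    # two-class logits

    def forward(self, features):
        return self.age_head(features).squeeze(-1), self.gender_head(features)

def multitask_loss(age_pred, gender_logits, age_true, gender_true, w_age=1.0, w_gender=1.0):
    # L1 loss optimizes MAE for age; cross-entropy handles gender classification.
    age_loss = nn.functional.l1_loss(age_pred, age_true)
    gender_loss = nn.functional.cross_entropy(gender_logits, gender_true)
    return w_age * age_loss + w_gender * gender_loss

# Toy pass with random features standing in for the attention-weighted backbone output.
features = torch.randn(4, 512)
age_true = torch.tensor([12.0, 18.5, 9.0, 21.0])
gender_true = torch.tensor([0, 1, 1, 0])
head = AgeGenderHead()
age_pred, gender_logits = head(features)
loss = multitask_loss(age_pred, gender_logits, age_true, gender_true)
loss.backward()
print(float(loss))
```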

Evaluation of synthetic training data for 3D intraoral reconstruction of cleft patients from single images.

Lingens L, Lill Y, Nalabothu P, Benitez BK, Mueller AA, Gross M, Solenthaler B

PubMed · May 24, 2025
This study investigates the effectiveness of synthetic training data in predicting 2D landmarks for 3D intraoral reconstruction in cleft lip and palate patients. We take inspiration from existing landmark prediction and 3D reconstruction techniques for faces and demonstrate their potential in medical applications. We generated both real and synthetic datasets from intraoral scans and videos. A convolutional neural network was trained using a Gaussian negative log-likelihood loss function to predict 2D landmarks and their corresponding confidence scores. The predicted landmarks were then used to fit a statistical shape model to generate 3D reconstructions from individual images. We analyzed the model's performance on real patient data and explored the dataset size required to overcome the domain gap between synthetic and real images. Our approach generates satisfactory results on synthetic data and shows promise when tested on real data. The method achieves rapid 3D reconstruction from single images and can therefore provide significant value in day-to-day medical work. Our results demonstrate that synthetic training data are viable for training models to predict 2D landmarks and reconstruct 3D meshes in patients with cleft lip and palate. This approach offers an accessible, low-cost alternative to traditional methods, using smartphone technology for noninvasive, rapid, and accurate 3D reconstructions in clinical settings.
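
The network described above predicts both 2D coordinates and confidence scores; training with a Gaussian negative log-likelihood lets the predicted variance serve as that confidence. A minimal PyTorch sketch is given below, with assumed tensor shapes and an isotropic-variance parameterization that may differ from the authors' exact formulation.

```python
import torch

def gaussian_nll_landmark_loss(pred_xy, pred_log_var, target_xy):
    """Negative log-likelihood of target landmarks under an isotropic Gaussian whose
    mean and log-variance are predicted per landmark.
    pred_xy, target_xy: (batch, n_landmarks, 2); pred_log_var: (batch, n_landmarks, 1)."""
    var = torch.exp(pred_log_var)                      # ensures positive variance
    sq_err = (pred_xy - target_xy).pow(2).sum(dim=-1, keepdim=True)
    nll = 0.5 * (sq_err / var + 2.0 * pred_log_var)    # 2D Gaussian NLL, constant dropped
    return nll.mean()

# Toy usage: lower predicted variance means higher confidence, so errors there cost more.
pred_xy = torch.randn(2, 10, 2, requires_grad=True)
pred_log_var = torch.zeros(2, 10, 1, requires_grad=True)
target_xy = torch.randn(2, 10, 2)
loss = gaussian_nll_landmark_loss(pred_xy, pred_log_var, target_xy)
loss.backward()
print(float(loss))
```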

Multi-view contrastive learning and symptom extraction insights for medical report generation.

Bai Q, Zou X, Alhaskawi A, Dong Y, Zhou H, Ezzi SHA, Kota VG, AbdullaAbdulla MHH, Abdalbary SA, Hu X, Lu H

PubMed · May 23, 2025
The task of generating medical reports automatically is of paramount importance in modern healthcare, offering a substantial reduction in the workload of radiologists and accelerating the processes of clinical diagnosis and treatment. Current challenges include handling limited sample sizes and interpreting intricate multi-modal and multi-view medical data. This investigation was conducted to improve accuracy and efficiency for radiologists. The study presents a novel methodology for medical report generation that leverages Multi-View Contrastive Learning (MVCL) applied to MRI data, combined with a Symptom Consultant (SC) for extracting medical insights, to improve the quality and efficiency of automated medical report generation. We introduce an advanced MVCL framework that maximizes the potential of multi-view MRI data to enhance visual feature extraction. In parallel, the SC component is employed to distill critical medical insights from symptom descriptions. These components are integrated within a transformer decoder architecture, which is then applied to the Deep Wrist dataset for model training and evaluation. Our experimental analysis on the Deep Wrist dataset reveals that the proposed integration of MVCL and SC significantly outperforms the baseline model in terms of the accuracy and relevance of the generated medical reports. The results indicate that our approach is particularly effective in capturing and utilizing the complex information inherent in multi-modal and multi-view medical datasets. The combination of MVCL and SC constitutes a powerful approach to medical report generation, addressing existing challenges in the field. The demonstrated superiority of our model over traditional methods holds promise for substantial improvements in clinical diagnosis and automated report generation, marking a significant stride forward in medical technology.
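
Multi-view contrastive learning typically pulls embeddings of different views of the same study together while pushing other studies apart. The InfoNCE-style sketch below illustrates this in PyTorch; the two-view setup, embedding sizes, and temperature are assumptions and not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def multiview_info_nce(z_view1, z_view2, temperature=0.1):
    """Contrastive loss between two MRI views of the same studies.
    z_view1, z_view2: (batch, dim) embeddings; matching rows are positive pairs."""
    z1 = F.normalize(z_view1, dim=-1)
    z2 = F.normalize(z_view2, dim=-1)
    logits = z1 @ z2.t() / temperature              # cosine-similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    # Symmetrized cross-entropy: view1 -> view2 and view2 -> view1.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings standing in for encoder outputs of two MRI views.
z_a, z_b = torch.randn(8, 128), torch.randn(8, 128)
print(float(multiview_info_nce(z_a, z_b)))
```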

Integrating multi-omics data with artificial intelligence to decipher the role of tumor-infiltrating lymphocytes in tumor immunotherapy.

Xie T, Xue H, Huang A, Yan H, Yuan J

PubMed · May 23, 2025
Tumor-infiltrating lymphocytes (TILs) recognize tumor antigens and influence tumor prognosis; their assessment can help predict the efficacy of neoadjuvant therapies, guide the development of new cell-based immunotherapies, characterize the tumor immune microenvironment, and identify novel biomarkers. Traditional methods for evaluating TILs primarily rely on histopathological examination using standard hematoxylin and eosin staining or immunohistochemical staining, with manual cell counting under a microscope. These methods are time-consuming and subject to significant observer variability and error. Recently, artificial intelligence (AI) has advanced rapidly in the field of medical imaging, particularly with deep learning algorithms based on convolutional neural networks, and has shown promise as a powerful tool for the quantitative evaluation of tumor biomarkers. The advent of AI offers new opportunities for the automated and standardized assessment of TILs. This review provides an overview of advances in the application of AI for assessing TILs from multiple perspectives. It focuses specifically on AI-driven approaches for identifying TILs in tumor tissue images, automating TIL quantification, recognizing TIL subpopulations, and analyzing the spatial distribution patterns of TILs. The review aims to elucidate the prognostic value of TILs in various cancers, as well as their predictive capacity for responses to immunotherapy and neoadjuvant therapy. Furthermore, it explores the integration of AI with other emerging technologies, such as single-cell sequencing, multiplex immunofluorescence, spatial transcriptomics, and multimodal approaches, to enable a more comprehensive study of TILs and further elucidate their clinical utility in tumor treatment and prognosis.

Brightness-Invariant Tracking Estimation in Tagged MRI

Zhangxing Bian, Shuwen Wei, Xiao Liang, Yuan-Chiao Lu, Samuel W. Remedios, Fangxu Xing, Jonghye Woo, Dzung L. Pham, Aaron Carass, Philip V. Bayly, Jiachen Zhuo, Ahmed Alshareef, Jerry L. Prince

arXiv preprint · May 23, 2025
Magnetic resonance (MR) tagging is an imaging technique for noninvasively tracking tissue motion in vivo by creating a visible pattern of magnetization saturation (tags) that deforms with the tissue. Due to longitudinal relaxation and progression to steady state, the tag and tissue brightness change over time, which makes tracking with optical flow methods error-prone. Although Fourier methods can alleviate these problems, they are also sensitive to brightness changes as well as spectral spreading due to motion. To address these problems, we introduce the brightness-invariant tracking estimation (BRITE) technique for tagged MRI. BRITE disentangles the anatomy from the tag pattern in the observed tagged image sequence and simultaneously estimates the Lagrangian motion. The inherent ill-posedness of this problem is addressed by leveraging the expressive power of denoising diffusion probabilistic models to represent the probabilistic distribution of the underlying anatomy and the flexibility of physics-informed neural networks to estimate biologically plausible motion. A set of tagged MR images of a gel phantom was acquired with various tag periods and imaging flip angles to demonstrate the impact of brightness variations and to validate our method. The results show that BRITE achieves more accurate motion and strain estimates than other state-of-the-art methods, while also being resistant to tag fading.
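
To see why tag brightness changes over time (the motivation for a brightness-invariant formulation), a heavily simplified longitudinal-relaxation model of a 1D cosine tag pattern can be simulated. This back-of-the-envelope illustration ignores imaging flip angles and steady-state effects and is not part of BRITE.

```python
import numpy as np

def tagged_signal(x, t, t1=1000.0, m0=1.0, tag_freq=0.05):
    """Simplified 1-D tag profile: a cosine modulation of longitudinal magnetization
    at t = 0 that relaxes back toward M0 with time constant T1 (in ms)."""
    mz0 = m0 * np.cos(2 * np.pi * tag_freq * x)      # tagged magnetization right after tagging
    return m0 + (mz0 - m0) * np.exp(-t / t1)         # T1 relaxation toward M0

x = np.linspace(0, 100, 500)                         # position in mm
for t in (0.0, 500.0, 1500.0):                       # ms after tagging
    s = tagged_signal(x, t)
    print(f"t = {t:5.0f} ms  tag contrast (max - min) = {s.max() - s.min():.3f}")
# The contrast shrinks as t grows, so brightness-constancy assumptions in optical flow break down.
```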

Highlights of the Society for Cardiovascular Magnetic Resonance (SCMR) 2025 Conference: leading the way to accessible, efficient and sustainable CMR.

Prieto C, Allen BD, Azevedo CF, Lima BB, Lam CZ, Mills R, Huisman M, Gonzales RA, Weingärtner S, Christodoulou AG, Rochitte C, Markl M

PubMed · May 23, 2025
The 28th Annual Scientific Sessions of the Society for Cardiovascular Magnetic Resonance (SCMR) took place from January 29 to February 1, 2025, in Washington, D.C. SCMR 2025 brought together a diverse group of 1714 cardiologists, radiologists, scientists, and technologists from more than 80 countries to discuss emerging trends and the latest developments in cardiovascular magnetic resonance (CMR). The conference centered on the theme "Leading the Way to Accessible, Sustainable, and Efficient CMR," highlighting innovations aimed at making CMR more clinically efficient, widely accessible, and environmentally sustainable. The program featured 728 abstracts and case presentations with an acceptance rate of 86% (728/849), including Early Career Award abstracts, oral abstracts, oral cases and rapid-fire sessions, covering a broad range of CMR topics. It also offered engaging invited lectures across eight main parallel tracks and included four plenary sessions, two gold medalists, and one keynote speaker, with a total of 826 faculty participating. Focused sessions on accessibility, efficiency, and sustainability provided a platform for discussing current challenges and exploring future directions, while the newly introduced CMR Innovations Track showcased innovative session formats and fostered greater collaboration between researchers, clinicians, and industry. For the first time, SCMR 2025 also offered the opportunity for attendees to obtain CMR Level 1 Training Verification, integrated into the program. Additionally, expert case reading sessions and hands-on interactive workshops allowed participants to engage with real-world clinical scenarios and deepen their understanding through practical experience. Key highlights included plenary sessions on a variety of important topics, such as expanding boundaries, health equity, women's cardiovascular disease and a patient-clinician testimonial that emphasized the profound value of patient-centered research and collaboration. The scientific sessions covered a wide range of topics, from clinical applications in cardiomyopathies, congenital heart disease, and vascular imaging to women's heart health and environmental sustainability. Technical topics included novel reconstruction, motion correction, quantitative CMR, contrast agents, novel field strengths, and artificial intelligence applications, among many others. This paper summarizes the key themes and discussions from SCMR 2025, highlighting the collaborative efforts that are driving the future of CMR and underscoring the Society's unwavering commitment to research, education, and clinical excellence.

AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models

Xingjian Li, Qifeng Wu, Colleen Que, Yiran Ding, Adithya S. Ubaradka, Jianhua Xing, Tianyang Wang, Min Xu

arXiv preprint · May 23, 2025
Medical image segmentation is vital for clinical diagnosis, yet current deep learning methods often demand extensive expert effort, i.e., either through annotating large training datasets or providing prompts at inference time for each new case. This paper introduces a zero-shot and automatic segmentation pipeline that combines off-the-shelf vision-language and segmentation foundation models. Given a medical image and a task definition (e.g., "segment the optic disc in an eye fundus image"), our method uses a grounding model to generate an initial bounding box, followed by a visual prompt boosting module that enhances the prompts, which are then processed by a promptable segmentation model to produce the final mask. To address the challenges of domain gap and result verification, we introduce a test-time adaptation framework featuring a set of learnable adaptors that align the medical inputs with foundation model representations. Its hyperparameters are optimized via Bayesian Optimization, guided by a proxy validation model without requiring ground-truth labels. Our pipeline offers an annotation-efficient and scalable solution for zero-shot medical image segmentation across diverse tasks. It is evaluated on seven diverse medical imaging datasets and shows promising results. By proper decomposition and test-time adaptation, our fully automatic pipeline performs competitively with weakly-prompted interactive foundation models.
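
The test-time adaptation step tunes adaptor hyperparameters against a label-free proxy validation score. The sketch below uses Optuna's default TPE sampler as a stand-in for the paper's Bayesian Optimization; the hyperparameter names, ranges, and the toy proxy score are placeholders invented for illustration.

```python
import optuna

def proxy_validation_score(strength, temperature):
    """Placeholder for the label-free proxy validation model: in practice this would
    run the adapted pipeline and score mask plausibility without ground truth."""
    return (strength - 0.3) ** 2 + (temperature - 1.5) ** 2   # toy surrogate, lower = better

def objective(trial):
    strength = trial.suggest_float("adaptor_strength", 0.0, 1.0)
    temperature = trial.suggest_float("prompt_temperature", 0.5, 3.0)
    return proxy_validation_score(strength, temperature)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30)
print("best adaptor hyperparameters:", study.best_params)
```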

Explainable Anatomy-Guided AI for Prostate MRI: Foundation Models and In Silico Clinical Trials for Virtual Biopsy-based Risk Assessment

Danial Khan, Zohaib Salahuddin, Yumeng Zhang, Sheng Kuang, Shruti Atul Mali, Henry C. Woodruff, Sina Amirrajab, Rachel Cavill, Eduardo Ibor-Crespo, Ana Jimenez-Pastor, Adrian Galiana-Bordera, Paula Jimenez Gomez, Luis Marti-Bonmati, Philippe Lambin

arXiv preprint · May 23, 2025
We present a fully automated, anatomically guided deep learning pipeline for prostate cancer (PCa) risk stratification using routine MRI. The pipeline integrates three key components: an nnU-Net module for segmenting the prostate gland and its zones on axial T2-weighted MRI; a classification module based on the UMedPT Swin Transformer foundation model, fine-tuned on 3D patches with optional anatomical priors and clinical data; and a VAE-GAN framework for generating counterfactual heatmaps that localize decision-driving image regions. The system was developed using 1,500 PI-CAI cases for segmentation and 617 biparametric MRIs with metadata from the CHAIMELEON challenge for classification (split into 70% training, 10% validation, and 20% testing). Segmentation achieved mean Dice scores of 0.95 (gland), 0.94 (peripheral zone), and 0.92 (transition zone). Incorporating gland priors improved AUC from 0.69 to 0.72, with a three-scale ensemble achieving top performance (AUC = 0.79, composite score = 0.76), outperforming the 2024 CHAIMELEON challenge winners. Counterfactual heatmaps reliably highlighted lesions within segmented regions, enhancing model interpretability. In a prospective multi-center in-silico trial with 20 clinicians, AI assistance increased diagnostic accuracy from 0.72 to 0.77 and Cohen's kappa from 0.43 to 0.53, while reducing review time per case by 40%. These results demonstrate that anatomy-aware foundation models with counterfactual explainability can enable accurate, interpretable, and efficient PCa risk assessment, supporting their potential use as virtual biopsies in clinical practice.
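
The reported segmentation and reader-study metrics (Dice, AUC, Cohen's kappa) can be computed with standard tooling. The sketch below uses toy arrays as stand-ins for masks, risk scores, and reader labels; it is illustrative only and uses none of the study's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice coefficient between two binary masks."""
    pred, true = pred_mask.astype(bool), true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

# Toy gland masks (assumed data).
pred = np.zeros((64, 64), dtype=np.uint8); pred[20:50, 20:50] = 1
true = np.zeros((64, 64), dtype=np.uint8); true[22:52, 22:52] = 1
print("Dice:", round(dice_score(pred, true), 3))

# Toy case-level labels, model risk scores, and reader calls (assumed data).
y_true  = np.array([0, 1, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.2, 0.8, 0.6, 0.3, 0.9, 0.4, 0.7, 0.1])
print("AUC:", round(roc_auc_score(y_true, y_score), 3))

reader_calls = np.array([0, 1, 0, 0, 1, 0, 1, 1])
print("Cohen's kappa vs. ground truth:", round(cohen_kappa_score(y_true, reader_calls), 3))
```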