Page 38 of 79781 results

Automated MRI protocoling in neuroradiology in the era of large language models.

Reiner LN, Chelbi M, Fetscher L, Stöckel JC, Csapó-Schmidt C, Guseynova S, Al Mohamad F, Bressem KK, Nawabi J, Siebert E, Wattjes MP, Scheel M, Meddeb A

pubmed | papers | Jul 11 2025
This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs), comparing an open-source (Llama 3.1 405B) and a proprietary model (GPT-4o) with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge. This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. LLMs were tasked with assigning MRI protocols and contrast medium administration with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation. Data from 100 neuroradiology reports (mean age = 54.2 years ± 18.41, women 50%) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for Llama 3.1 (Sequences: 38% vs. 70%, P < .001, Contrast Media: 77% vs. 94%, P < .001), and GPT-4o (Sequences: 43% vs. 81%, P < .001, Contrast Media: 79% vs. 92%, P = .006). GPT-4o outperformed Llama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with comparable accuracies to the radiologists (81% ± 0.21, P = .43). Both models equaled radiologists in predicting contrast media administration (Llama 3.1 RAG: 94% vs. 91% ± 0.2, P = .37, GPT-4o RAG: 92% vs. 91% ± 0.24, P = .48). Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.
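
The abstract names the McNemar test for comparing paired predictions. As a rough illustration only, the sketch below computes the exact (binomial) two-sided McNemar p-value from the discordant pairs of two classifiers scored on the same cases. The function names and the choice of the exact variant are assumptions; the abstract does not say which variant the authors used.

```python
from math import comb


def discordant_counts(correct_a, correct_b):
    """Count paired cases where exactly one of the two classifiers is correct.

    Returns (b, c): b = A correct and B wrong, c = B correct and A wrong.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("paired inputs must have the same length")
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis each discordant pair is equally likely to
    favour either classifier, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical per-case correctness flags, not data from the study.
    with_rag = [True, True, True, False, True, True]
    without_rag = [False, True, False, False, False, True]
    b, c = discordant_counts(with_rag, without_rag)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs enter the test, which is why two models with similar overall accuracy can still differ significantly if they disagree on many cases in one direction.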

Multivariate whole brain neurodegenerative-cognitive-clinical severity mapping in the Alzheimer's disease continuum using explainable AI

Murad, T., Miao, H., Thakuri, D. S., Darekar, G., Chand, G.

medrxiv | preprint | Jul 11 2025
Neurodegeneration and cognitive impairment are commonly reported in Alzheimer's disease (AD); however, their multivariate links are not well understood. To map the multivariate relationships between whole brain neurodegenerative (WBN) markers, global cognition, and clinical severity in the AD continuum, we developed explainable artificial intelligence (AI) methods, validated them on semi-simulated data, and applied the best-performing method systematically to large-scale experimental data (N=1,756). The best-performing explainable AI method showed robust performance in predicting cognition from regional WBN markers and identified the ground-truth simulated dominant brain regions contributing to cognition. This method also showed excellent performance on experimental data and identified several prominent WBN regions hierarchically and simultaneously associated with cognitive declines across the AD continuum. These multivariate regional features also correlated with clinical severity, suggesting their clinical relevance. Overall, this study innovatively mapped the multivariate regional WBN-cognitive-clinical severity relationships in the AD continuum, thereby significantly advancing AD-relevant neurobiological pathways.

An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis

Ming Wang, Zhaoyang Duan, Dong Xue, Fangzhou Liu, Zhongheng Zhang

arxiv | preprint | Jul 10 2025
The labor-intensive nature of medical data annotation presents a significant challenge for respiratory disease diagnosis, resulting in a scarcity of high-quality labeled datasets in resource-constrained settings. Moreover, patient privacy concerns complicate the direct sharing of local medical data across institutions, and existing centralized data-driven approaches, which rely on large amounts of available data, often compromise data privacy. This study proposes a federated few-shot learning framework with privacy-preserving mechanisms to address the issues of limited labeled data and privacy protection in diagnosing respiratory diseases. In particular, a meta-stochastic gradient descent algorithm is proposed to mitigate the overfitting problem that arises from insufficient data when employing traditional gradient descent methods for neural network training. Furthermore, to ensure data privacy against gradient leakage, differential privacy noise from a standard Gaussian distribution is integrated into the gradients during the training of private models with local data, thereby preventing the reconstruction of medical images. Given the impracticality of centralizing respiratory disease data dispersed across various medical institutions, a weighted average algorithm is employed to aggregate local diagnostic models from different clients, enhancing the adaptability of a model across diverse scenarios. Experimental results show that the proposed method yields compelling results with the implementation of differential privacy, while effectively diagnosing respiratory diseases using data from different structures, categories, and distributions.
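
The two mechanisms described in this abstract, Gaussian noise added to local gradients and a weighted average of client models, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the gradient clipping step, the weighting by client sample count, and all function and parameter names are hypothetical, since the abstract does not specify them.

```python
import math
import random


def clip_and_add_noise(gradient, clip_norm, sigma, rng):
    """Clip a gradient to an L2 norm bound, then add Gaussian noise.

    Clipping before adding noise is a common differential-privacy recipe
    and is an assumption here; the abstract only mentions Gaussian noise.
    """
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    # Noise standard deviation is expressed relative to the clipping bound.
    return [g * scale + rng.gauss(0.0, sigma * clip_norm) for g in gradient]


def weighted_average(client_params, client_sizes):
    """Aggregate client parameter vectors, weighting each by its sample count."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    noisy = clip_and_add_noise([3.0, 4.0], clip_norm=1.0, sigma=1.0, rng=rng)
    print("noisy gradient:", noisy)
    print("aggregate:", weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Setting `sigma=1.0` with `clip_norm=1.0` corresponds to standard Gaussian noise; larger values trade accuracy for stronger privacy.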

MeD-3D: A Multimodal Deep Learning Framework for Precise Recurrence Prediction in Clear Cell Renal Cell Carcinoma (ccRCC)

Hasaan Maqsood, Saif Ur Rehman Khan

arxiv | preprint | Jul 10 2025
Accurate prediction of recurrence in clear cell renal cell carcinoma (ccRCC) remains a major clinical challenge due to the disease's complex molecular, pathological, and clinical heterogeneity. Traditional prognostic models, which rely on single data modalities such as radiology, histopathology, or genomics, often fail to capture the full spectrum of disease complexity, resulting in suboptimal predictive accuracy. This study aims to overcome these limitations by proposing a deep learning (DL) framework that integrates multimodal data, including CT, MRI, histopathology whole slide images (WSI), clinical data, and genomic profiles, to improve the prediction of ccRCC recurrence and enhance clinical decision-making. The proposed framework utilizes a comprehensive dataset curated from multiple publicly available sources, including TCGA, TCIA, and CPTAC. To process the diverse modalities, domain-specific models are employed: CLAM, a ResNet50-based model, is used for histopathology WSIs, while MeD-3D, a pre-trained 3D-ResNet18 model, processes CT and MRI images. For structured clinical and genomic data, a multi-layer perceptron (MLP) is used. These models are designed to extract deep feature embeddings from each modality, which are then fused through an early and late integration architecture. This fusion strategy enables the model to combine complementary information from multiple sources. Additionally, the framework is designed to handle incomplete data, a common challenge in clinical settings, by enabling inference even when certain modalities are missing.
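
The abstract does not say how inference proceeds when a modality is missing. One generic approach, shown here purely as an assumption and not as the MeD-3D method, is to average only the embeddings that are present. The function name and the dictionary keys are hypothetical.

```python
def fuse_available(embeddings):
    """Average the per-modality embeddings that are present (not None).

    `embeddings` maps a modality name to a feature vector or None.
    All present vectors must share the same length.
    """
    present = [v for v in embeddings.values() if v is not None]
    if not present:
        raise ValueError("at least one modality must be available")
    dim = len(present[0])
    if any(len(v) != dim for v in present):
        raise ValueError("all embeddings must have the same dimension")
    return [sum(v[i] for v in present) / len(present) for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical patient with no whole slide image available.
    patient = {"ct": [1.0, 2.0], "wsi": None, "clinical": [3.0, 4.0]}
    print(fuse_available(patient))
```

Averaging keeps the fused vector on the same scale regardless of how many modalities are present, which is one reason it is often preferred over summation in this setting.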

Breast Ultrasound Tumor Generation via Mask Generator and Text-Guided Network: A Clinically Controllable Framework with Downstream Evaluation

Haoyu Pan, Hongxin Lin, Zetian Feng, Chuxuan Lin, Junyang Mo, Chu Zhang, Zijian Wu, Yi Wang, Qingqing Zheng

arxiv | preprint | Jul 10 2025
The development of robust deep learning models for breast ultrasound (BUS) image analysis is significantly constrained by the scarcity of expert-annotated data. To address this limitation, we propose a clinically controllable generative framework for synthesizing BUS images. This framework integrates clinical descriptions with structural masks to generate tumors, enabling fine-grained control over tumor characteristics such as morphology, echogenicity, and shape. Furthermore, we design a semantic-curvature mask generator, which synthesizes structurally diverse tumor masks guided by clinical priors. During inference, synthetic tumor masks serve as input to the generative framework, producing highly personalized synthetic BUS images with tumors that reflect real-world morphological diversity. Quantitative evaluations on six public BUS datasets demonstrate the significant clinical utility of our synthetic images, showing their effectiveness in enhancing downstream breast cancer diagnosis tasks. Furthermore, visual Turing tests conducted by experienced sonographers confirm the realism of the generated images, indicating the framework's potential to support broader clinical applications.

Data Extraction and Curation from Radiology Reports for Pancreatic Cyst Surveillance Using Large Language Models.

Choubey AP, Eguia E, Hollingsworth A, Chatterjee S, D'Angelica MI, Jarnagin WR, Wei AC, Schattner MA, Do RKG, Soares KC

pubmed | papers | Jul 10 2025
Manual curation of radiographic features in pancreatic cyst registries for data abstraction and longitudinal evaluation is time-consuming and limits widespread implementation. We examined the feasibility and accuracy of using large language models (LLMs) to extract clinical variables from radiology reports. A single-center retrospective study included patients under surveillance for pancreatic cysts. Nine radiographic elements used to monitor cyst progression were included: cyst size, main pancreatic duct (MPD) size (continuous variable), number of lesions, MPD dilation ≥5mm (categorical), branch duct dilation, presence of solid component, calcific lesion, pancreatic atrophy, and pancreatitis. LLMs (GPT) on the OpenAI GPT-4 platform were employed to extract elements of interest with a zero-shot learning approach using prompting to facilitate annotation without any training data. A manually annotated institutional cyst database was used as the ground truth (GT) for comparison. Overall, 3198 longitudinal scans from 991 patients were included. GPT successfully extracted the selected radiographic elements with high accuracy. Among categorical variables, accuracy ranged from 97% for solid component to 99% for calcific lesions. In the continuous variables, accuracy varied from 92% for cyst size to 97% for MPD size. However, Cohen's Kappa was higher for cyst size (0.92) compared to MPD size (0.82). Lowest accuracy (81%) was noted in the multi-class variable for number of cysts. LLMs can accurately extract and curate data from radiology reports for pancreatic cyst surveillance and can be reliably used to assemble longitudinal databases. Future application of this work may potentiate the development of artificial intelligence-based surveillance models.
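
The abstract reports that MPD size had higher accuracy but lower Cohen's kappa than cyst size. The sketch below shows how the two metrics are computed and why they can diverge: kappa discounts the agreement expected by chance, which is large when one label dominates. The function names and the toy labels are illustrative assumptions, not study data, and the abstract does not say how the continuous variables were prepared for kappa.

```python
from collections import Counter


def accuracy(pred, truth):
    """Fraction of items where the extracted label matches the ground truth."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(pred, truth)) / len(pred)


def cohen_kappa(pred, truth):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two label sequences."""
    n = len(pred)
    p_o = accuracy(pred, truth)
    count_pred, count_truth = Counter(pred), Counter(truth)
    labels = count_pred.keys() | count_truth.keys()
    # Chance agreement: product of the two marginal label frequencies.
    p_e = sum(count_pred[k] * count_truth[k] for k in labels) / n ** 2
    if p_e == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Toy example: a rare positive class that the extractor never flags.
    truth = [0] * 9 + [1]
    pred = [0] * 10
    print(accuracy(pred, truth), cohen_kappa(pred, truth))
```

In the toy example the accuracy is high because the negative class dominates, while kappa shows that the extractor does no better than chance.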

MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation

Qilong Xing, Zikai Song, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang

arxiv | preprint | Jul 9 2025
Despite significant advancements in adapting Large Language Models (LLMs) for radiology report generation (RRG), clinical adoption remains challenging due to difficulties in accurately mapping pathological and anatomical features to their corresponding text descriptions. Additionally, semantic-agnostic feature extraction further hampers the generation of accurate diagnostic reports. To address these challenges, we introduce Medical Concept Aligned Radiology Report Generation (MCA-RG), a knowledge-driven framework that explicitly aligns visual features with distinct medical concepts to enhance the report generation process. MCA-RG utilizes two curated concept banks: a pathology bank containing lesion-related knowledge, and an anatomy bank with anatomical descriptions. The visual features are aligned with these medical concepts and undergo tailored enhancement. We further propose an anatomy-based contrastive learning procedure to improve the generalization of anatomical features, coupled with a matching loss for pathological features to prioritize clinically relevant regions. Additionally, a feature gating mechanism is employed to filter out low-quality concept features. Finally, the visual features correspond to individual medical concepts and are leveraged to guide the report generation process. Experiments on two public benchmarks (MIMIC-CXR and CheXpert Plus) demonstrate that MCA-RG achieves superior performance, highlighting its effectiveness in radiology report generation.

A Unified Platform for Radiology Report Generation and Clinician-Centered AI Evaluation

Ma, Z., Yang, X., Atalay, Z., Yang, A., Collins, S., Bai, H., Bernstein, M., Baird, G., Jiao, Z.

medrxiv | preprint | Jul 8 2025
Generative AI models have demonstrated strong potential in radiology report generation, but their clinical adoption depends on physician trust. In this study, we conducted a radiology-focused Turing test to evaluate how well attendings and residents distinguish AI-generated reports from those written by radiologists, and how their confidence and decision time reflect trust. We developed an integrated web-based platform comprising two core modules: Report Generation and Report Evaluation. Using the web-based platform, eight participants evaluated 48 anonymized X-ray cases, each paired with two reports from three comparison groups: radiologist vs. AI model 1, radiologist vs. AI model 2, and AI model 1 vs. AI model 2. Participants selected the AI-generated report, rated their confidence, and indicated report preference. Attendings outperformed residents in identifying AI-generated reports (49.9% vs. 41.1%) and exhibited longer decision times, suggesting more deliberate judgment. Both groups took more time when both reports were AI-generated. Our findings highlight the role of clinical experience in AI acceptance and the need for design strategies that foster trust in clinical applications. The project page of the evaluation platform is available at: https://zachatalay89.github.io/Labsite.

An Institutional Large Language Model for Musculoskeletal MRI Improves Protocol Adherence and Accuracy.

Patrick Decourcy Hallinan JT, Leow NW, Low YX, Lee A, Ong W, Zhou Chan MD, Devi GK, He SS, De-Liang Loh D, Wei Lim DS, Low XZ, Teo EC, Furqan SM, Yang Tham WW, Tan JH, Kumar N, Makmur A, Yonghan T

pubmed | papers | Jul 8 2025
Privacy-preserving large language models (PP-LLMs) hold potential for assisting clinicians with documentation. We evaluated a PP-LLM to improve the clinical information on radiology request forms for musculoskeletal magnetic resonance imaging (MRI) and to automate protocoling, which ensures that the most appropriate imaging is performed. The present retrospective study included musculoskeletal MRI radiology request forms that had been randomly collected from June to December 2023. Studies without electronic medical record (EMR) entries were excluded. An institutional PP-LLM (Claude Sonnet 3.5) augmented the original radiology request forms by mining EMRs, and, in combination with rule-based processing of the LLM outputs, suggested appropriate protocols using institutional guidelines. Clinical information on the original and PP-LLM radiology request forms was compared with use of the RI-RADS (Reason for exam Imaging Reporting and Data System) grading by 2 musculoskeletal (MSK) radiologists independently (MSK1, with 13 years of experience, and MSK2, with 11 years of experience). These radiologists established a consensus reference standard for protocoling, against which the PP-LLM and 2 second-year board-certified radiologists (RAD1 and RAD2) were compared. Inter-rater reliability was assessed with use of the Gwet AC1, and the percentage agreement with the reference standard was calculated. Overall, 500 musculoskeletal MRI radiology request forms were analyzed for 407 patients (202 women and 205 men with a mean age [and standard deviation] of 50.3 ± 19.5 years) across a range of anatomical regions, including the spine/pelvis (143 MRI scans; 28.6%), upper extremity (169 scans; 33.8%) and lower extremity (188 scans; 37.6%). Two hundred and twenty-two (44.4%) of the 500 MRI scans required contrast. The clinical information provided in the PP-LLM-augmented radiology request forms was rated as superior to that in the original requests. Only 0.4% to 0.6% of PP-LLM radiology request forms were rated as limited/deficient, compared with 12.4% to 22.6% of the original requests (p < 0.001). Almost-perfect inter-rater reliability was observed for LLM-enhanced requests (AC1 = 0.99; 95% confidence interval [CI], 0.99 to 1.0), compared with substantial agreement for the original forms (AC1 = 0.62; 95% CI, 0.56 to 0.67). For protocoling, MSK1 and MSK2 showed almost-perfect agreement on the region/coverage (AC1 = 0.96; 95% CI, 0.95 to 0.98) and contrast requirement (AC1 = 0.98; 95% CI, 0.97 to 0.99). Compared with the consensus reference standard, protocoling accuracy for the PP-LLM was 95.8% (95% CI, 94.0% to 97.6%), which was significantly higher than that for both RAD1 (88.6%; 95% CI, 85.8% to 91.4%) and RAD2 (88.2%; 95% CI, 85.4% to 91.0%) (p < 0.001 for both). Musculoskeletal MRI request form augmentation with an institutional LLM provided superior clinical information and improved protocoling accuracy compared with clinician requests and non-MSK-trained radiologists. Institutional adoption of such LLMs could enhance the appropriateness of MRI utilization and patient care. Diagnostic Level III. See Instructions for Authors for a complete description of levels of evidence.
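
Inter-rater reliability in this abstract is reported as Gwet's AC1. For readers unfamiliar with it, the sketch below computes the two-rater, unweighted AC1 from its standard definition: observed agreement corrected by a chance term built from the average category prevalence. The function name and the toy ratings are illustrative assumptions, and the confidence intervals reported in the abstract are not reproduced here.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories=None):
    """Gwet's AC1 for two raters assigning nominal categories.

    AC1 = (p_a - p_e) / (1 - p_e), where p_a is the observed agreement and
    p_e = sum_q pi_q * (1 - pi_q) / (Q - 1), with pi_q the mean of the two
    raters' proportions for category q and Q the number of categories.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    if categories is None:
        # Fall back to the observed labels; pass the full scale if known.
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    if q < 2:
        return 1.0  # a single category leaves no room for disagreement
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_e = 0.0
    for cat in categories:
        pi = (count_a[cat] + count_b[cat]) / (2 * n)
        p_e += pi * (1.0 - pi)
    p_e /= q - 1
    return (p_a - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Toy ratings (1 = adequate request, 0 = limited/deficient).
    rater_1 = [1, 1, 0, 0]
    rater_2 = [1, 0, 0, 0]
    print(gwet_ac1(rater_1, rater_2))
```

Unlike Cohen's kappa, the chance term in AC1 shrinks when one category dominates, so AC1 tends to stay high when raters agree on a skewed distribution.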

Modeling and Reversing Brain Lesions Using Diffusion Models

Omar Zamzam, Haleh Akrami, Anand Joshi, Richard Leahy

arxiv | preprint | Jul 8 2025
Brain lesions are abnormalities or injuries in brain tissue that are often detectable using magnetic resonance imaging (MRI), which reveals structural changes in the affected areas. This broad definition of brain lesions includes areas of the brain that are irreversibly damaged, as well as areas of brain tissue that are deformed as a result of lesion growth or swelling. Despite the importance of differentiating between damaged and deformed tissue, existing lesion segmentation methods overlook this distinction, labeling both of them as a single anomaly. In this work, we introduce a diffusion model-based framework for analyzing and reversing the brain lesion process. Our pipeline first segments abnormal regions in the brain, then estimates and reverses tissue deformations by restoring displaced tissue to its original position, isolating the core lesion area representing the initial damage. Finally, we inpaint the core lesion area to arrive at an estimation of the pre-lesion healthy brain. This proposed framework reverses a forward lesion growth process model that is well-established in biomechanical studies that model brain lesions. Our results demonstrate improved accuracy in lesion segmentation, characterization, and brain labeling compared to traditional methods, offering a robust tool for clinical and research applications in brain lesion analysis. Since pre-lesion healthy versions of abnormal brains are not available in any public dataset for validation of the reverse process, we simulate a forward model to synthesize multiple lesioned brain images.