Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning

Chunlei Li, Jingyang Hou, Yilei Shi, Jingliang Hu, Xiao Xiang Zhu, Lichao Mou

arxiv logopreprintJun 18 2025
Medical report generation from imaging data remains a challenging task in clinical practice. While large language models (LLMs) show great promise in addressing this challenge, their effective integration with medical imaging data still deserves in-depth exploration. In this paper, we present MRG-LLM, a novel multimodal large language model (MLLM) that combines a frozen LLM with a learnable visual encoder and introduces a dynamic prompt customization mechanism. Our key innovation lies in generating instance-specific prompts tailored to individual medical images through conditional affine transformations derived from visual features. We propose two implementations: prompt-wise and promptbook-wise customization, enabling precise and targeted report generation. Extensive experiments on IU X-ray and MIMIC-CXR datasets demonstrate that MRG-LLM achieves state-of-the-art performance in medical report generation. Our code will be made publicly available.
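As a rough illustration of the prompt-customization idea described above, the sketch below derives instance-specific prompts by predicting a scale and shift from pooled visual features and applying a conditional affine transformation to a bank of learnable prompt embeddings; the module names and dimensions are assumptions, not the authors' released code.

```python
# Illustrative sketch (not the authors' code) of prompt-wise customization: a visual
# feature vector predicts a per-dimension scale (gamma) and shift (beta) that modulate
# shared learnable prompt embeddings, yielding instance-specific prompts for a frozen LLM.
import torch
import torch.nn as nn

class PromptCustomizer(nn.Module):
    def __init__(self, visual_dim: int = 768, prompt_len: int = 16, prompt_dim: int = 4096):
        super().__init__()
        # Shared learnable prompts, customized per image at inference time.
        self.prompts = nn.Parameter(torch.randn(prompt_len, prompt_dim) * 0.02)
        self.to_gamma = nn.Linear(visual_dim, prompt_dim)
        self.to_beta = nn.Linear(visual_dim, prompt_dim)

    def forward(self, visual_feat: torch.Tensor) -> torch.Tensor:
        # visual_feat: (batch, visual_dim) pooled feature from the visual encoder.
        gamma = self.to_gamma(visual_feat).unsqueeze(1)   # (batch, 1, prompt_dim)
        beta = self.to_beta(visual_feat).unsqueeze(1)     # (batch, 1, prompt_dim)
        # Instance-specific prompts fed to the frozen LLM alongside image tokens.
        return (1 + gamma) * self.prompts.unsqueeze(0) + beta

customizer = PromptCustomizer()
prompts = customizer(torch.randn(2, 768))   # (2, 16, 4096)
```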

Quality control system for patient positioning and filling in meta-information for chest X-ray examinations.

Borisov AA, Semenov SS, Kirpichev YS, Arzamasov KM, Omelyanskaya OV, Vladzymyrskyy AV, Vasilev YA

pubmed logopapersJun 18 2025
During radiography, irregularities occur that decrease the diagnostic value of the images obtained. The purpose of this work was to develop a system for automated quality assurance of patient positioning in chest radiographs, with detection of suboptimal contrast, brightness, and metadata errors. The quality assurance system was trained and tested using more than 69,000 X-rays of the chest and other anatomical areas from the Unified Radiological Information Service (URIS) and several open datasets. Our dataset included studies regardless of patient gender and race, with age below 18 years as the sole exclusion criterion. A training dataset of radiographs labeled by expert radiologists was used to train an ensemble of modified deep convolutional neural network architectures, ResNet152V2 and VGG19, to identify various quality deficiencies. Model performance was assessed using area under the receiver operating characteristic curve (ROC-AUC), precision, recall, F1-score, and accuracy. Seven neural network models were trained to classify radiographs by the following quality deficiencies: failure to capture the target anatomic region, chest rotation, suboptimal brightness, incorrect anatomical area, projection errors, and improper photometric interpretation. All metrics for each model exceeded 95%, indicating high predictive value. All models were combined into a unified system for evaluating radiograph quality. The processing time per image is approximately 3 s. The system supports multiple use cases: integration into automated radiographic workstations, external quality assurance for radiology departments, acquisition quality audits for municipal health systems, and routing of studies to diagnostic AI models.
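For orientation, here is a hedged sketch of how a two-backbone ensemble for one quality-deficiency label (e.g. chest rotation) could be wired up in Keras; the input size, pooling, weight initialization, and classification head are assumptions rather than the published configuration.

```python
# Hedged sketch of an ensemble of ResNet152V2 and VGG19 binary classifiers for a single
# quality-deficiency label; architectural details are assumptions, not the study's setup.
import tensorflow as tf

def build_branch(backbone_cls, name: str) -> tf.keras.Model:
    # weights=None keeps the sketch lightweight; pretrained weights could be used instead.
    backbone = backbone_cls(include_top=False, weights=None,
                            input_shape=(224, 224, 3), pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(backbone.output)
    return tf.keras.Model(backbone.input, out, name=name)

resnet_branch = build_branch(tf.keras.applications.ResNet152V2, "resnet152v2")
vgg_branch = build_branch(tf.keras.applications.VGG19, "vgg19")

inputs = tf.keras.Input(shape=(224, 224, 3))
# Average the two branch probabilities to form the ensemble prediction.
ensemble_out = tf.keras.layers.Average()([resnet_branch(inputs), vgg_branch(inputs)])
ensemble = tf.keras.Model(inputs, ensemble_out)
ensemble.compile(optimizer="adam", loss="binary_crossentropy",
                 metrics=[tf.keras.metrics.AUC(name="roc_auc")])
```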

Innovative technologies and their clinical prospects for early lung cancer screening.

Deng Z, Ma X, Zou S, Tan L, Miao T

pubmed logopapersJun 18 2025
Lung cancer remains the leading cause of cancer-related mortality worldwide, due to the lack of effective early-stage screening approaches. Imaging such as low-dose CT carries radiation risk, and biopsies can cause complications. Additionally, traditional serum tumor markers lack diagnostic specificity. This highlights the urgent need for precise and non-invasive early detection techniques. This systematic review aims to evaluate the limitations of conventional screening methods (imaging/biopsy/tumor markers), seek breakthroughs in liquid biopsy for early lung cancer detection, and assess the potential value of Artificial Intelligence (AI), thereby providing evidence-based insights for establishing an optimal screening framework. We systematically searched the PubMed database for literature published up to May 2025. Key words included "Artificial Intelligence", "Early Lung cancer screening", "Imaging examination", "Innovative technologies", "Liquid biopsy", and "Puncture biopsy". Our inclusion criteria focused on studies of traditional and innovative screening methods, with an emphasis on original research concerning diagnostic performance or high-quality reviews. This approach helps identify critical studies in early lung cancer screening. Novel liquid biopsy techniques are non-invasive and have superior diagnostic efficacy. AI-assisted diagnostics further enhance accuracy. We propose three development directions: establishing risk-based liquid biopsy screening protocols, developing a stepwise "imaging-AI-liquid biopsy" diagnostic workflow, and creating standardized biomarker panel testing solutions. Integrating traditional methodologies, novel liquid biopsies, and AI to establish a comprehensive early lung cancer screening model is important. These innovative strategies aim to significantly increase early detection rates and substantially enhance lung cancer control. This review provides theoretical guidance for both clinical practice and future research.

Deep learning model using CT images for longitudinal prediction of benign and malignant ground-glass nodules.

Yang X, Wang J, Wang P, Li Y, Wen Z, Shang J, Chen K, Tang C, Liang S, Meng W

pubmed logopapersJun 18 2025
To develop and validate a CT image-based multiple time-series deep learning model for the longitudinal prediction of benign and malignant pulmonary ground-glass nodules (GGNs). A total of 486 GGNs from an equal number of patients were included in this research, which took place at two medical centers. Each nodule underwent surgical removal and was confirmed pathologically. The patients were randomly assigned to a training set, validation set, and test set, following a distribution ratio of 7:2:1. We established a transformer-based deep learning framework that leverages multi-temporal CT images for the longitudinal prediction of GGNs, focusing on distinguishing between benign and malignant types. Additionally, we utilized 13 different machine learning algorithms to formulate clinical models, delta-radiomics models, and combined models that merge deep learning with CT semantic features. The predictive capabilities of the models were assessed using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The multiple time-series deep learning model based on CT images surpassed both the clinical model and the delta-radiomics model, showcasing strong predictive capabilities for GGNs across the training, validation, and test sets, with AUCs of 0.911 (95% CI, 0.879-0.939), 0.809 (95% CI, 0.715-0.908), and 0.817 (95% CI, 0.680-0.937), respectively. Furthermore, the models that integrated deep learning with CT semantic features achieved the highest performance, resulting in AUCs of 0.960 (95% CI, 0.912-0.977), 0.878 (95% CI, 0.801-0.942), and 0.890 (95% CI, 0.790-0.968). The multiple time-series deep learning model utilizing CT images was effective in predicting benign and malignant GGNs.
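A minimal sketch of the multi-time-series idea is shown below: per-timepoint CT embeddings (produced by an image encoder that is not shown) are fused by a transformer encoder and mapped to a benign-vs-malignant logit; the dimensions and layer counts are illustrative assumptions, not the published architecture.

```python
# Hypothetical sketch of a longitudinal GGN classifier over serial CT embeddings.
import torch
import torch.nn as nn

class LongitudinalGGNClassifier(nn.Module):
    def __init__(self, feat_dim: int = 512, num_timepoints: int = 3):
        super().__init__()
        # Learnable position embeddings encode scan order (baseline, follow-ups).
        self.time_embed = nn.Parameter(torch.zeros(num_timepoints, feat_dim))
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(feat_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_timepoints, feat_dim) image features from serial CTs.
        h = self.encoder(x + self.time_embed)
        return self.head(h.mean(dim=1)).squeeze(-1)   # malignancy logit

model = LongitudinalGGNClassifier()
logits = model(torch.randn(4, 3, 512))
```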

A Deep Learning Lung Cancer Segmentation Pipeline to Facilitate CT-based Radiomics

So, A. C. P., Cheng, D., Aslani, S., Azimbagirad, M., Yamada, D., Dunn, R., Josephides, E., McDowall, E., Henry, A.-R., Bille, A., Sivarasan, N., Karapanagiotou, E., Jacob, J., Pennycuick, A.

medrxiv logopreprintJun 18 2025
Background: CT-based radio-biomarkers could provide non-invasive insights into tumour biology to risk-stratify patients. One limitation is the laborious manual segmentation of regions-of-interest (ROIs). We present a deep learning auto-segmentation pipeline for radiomic analysis. Patients and Methods: 153 patients with resected stage 2A-3B non-small cell lung cancers (NSCLCs) had tumours segmented using nnU-Net with review by two clinicians. The nnU-Net was pretrained with anatomical priors in non-cancerous lungs and finetuned on NSCLCs. Three ROIs were segmented: intra-tumoural, peri-tumoural, and whole lung. 1967 features were extracted using PyRadiomics. Feature reproducibility was tested using segmentation perturbations. Features were selected using minimum-redundancy-maximum-relevance with Random Forest-recursive feature elimination nested in 500 bootstraps. Results: Auto-segmentation time was ~36 seconds/series. Mean volumetric and surface Dice-Sorensen coefficient (DSC) scores were 0.84 (±0.28) and 0.79 (±0.34), respectively. DSC was significantly correlated with tumour shape (sphericity, diameter) and location (worse with chest wall adherence), but not with batch effects (e.g. contrast, reconstruction kernel). 6.5% of cases had missed segmentations; 6.5% required major changes. Pre-training on anatomical priors resulted in better segmentations than training on tumour labels alone (p<0.001) or tumour with anatomical labels (p<0.001). Most radiomic features were not reproducible following perturbations and resampling. Adding radiomic features, however, did not significantly improve the clinical model in predicting 2-year disease-free survival: AUCs 0.67 (95% CI 0.59-0.75) vs 0.63 (95% CI 0.54-0.71), respectively (p=0.28). Conclusion: Our study demonstrates that integrating auto-segmentation into radio-biomarker discovery is feasible with high efficiency and accuracy. Whilst the radiomic analysis showed limited reproducibility, our auto-segmentation may allow more robust radio-biomarker analysis using deep learning features.
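To make the feature-extraction step concrete, the following hedged sketch runs PyRadiomics on a CT volume and an ROI mask; the synthetic data and extractor settings are placeholders, not the study's configuration.

```python
# Hedged sketch of radiomic feature extraction with PyRadiomics from a CT volume and an
# ROI mask (in the study, the mask would come from the nnU-Net auto-segmentation).
import numpy as np
import SimpleITK as sitk
from radiomics import featureextractor

# Synthetic stand-ins for a CT series and its segmented ROI.
arr = np.random.default_rng(0).integers(-1000, 400, size=(64, 64, 64)).astype(np.int16)
image = sitk.GetImageFromArray(arr)
mask_arr = np.zeros_like(arr, dtype=np.uint8)
mask_arr[20:40, 20:40, 20:40] = 1                      # cubic "tumour" ROI
mask = sitk.GetImageFromArray(mask_arr)

settings = {"binWidth": 25}                            # intensity discretisation bin width (assumption)
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
extractor.enableAllFeatures()

features = extractor.execute(image, mask)
radiomic = {k: v for k, v in features.items() if not k.startswith("diagnostics_")}
print(f"{len(radiomic)} radiomic features extracted")
```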

SCISSOR: Mitigating Semantic Bias through Cluster-Aware Siamese Networks for Robust Classification

Shuo Yang, Bardh Prenkaj, Gjergji Kasneci

arxiv logopreprintJun 17 2025
Shortcut learning undermines model generalization to out-of-distribution data. While the literature attributes shortcuts to biases in superficial features, we show that imbalances in the semantic distribution of sample embeddings induce spurious semantic correlations, compromising model robustness. To address this issue, we propose SCISSOR (Semantic Cluster Intervention for Suppressing ShORtcut), a Siamese network-based debiasing approach that remaps the semantic space by discouraging latent clusters exploited as shortcuts. Unlike prior data-debiasing approaches, SCISSOR eliminates the need for data augmentation and rewriting. We evaluate SCISSOR on 6 models across 4 benchmarks: Chest-XRay and Not-MNIST in computer vision, and GYAFC and Yelp in NLP tasks. Compared to several baselines, SCISSOR reports +5.3 absolute points in F1 score on GYAFC, +7.3 on Yelp, +7.7 on Chest-XRay, and +1 on Not-MNIST. SCISSOR is also highly advantageous for lightweight models with ~9.5% improvement on F1 for ViT on computer vision datasets and ~11.9% for BERT on NLP. Our study redefines the landscape of model generalization by addressing overlooked semantic biases, establishing SCISSOR as a foundational framework for mitigating shortcut learning and fostering more robust, bias-resistant AI systems.
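As a purely illustrative sketch of a Siamese debiasing objective in this spirit (not the SCISSOR authors' published loss), the code below adds a term that pushes apart embeddings of pairs sharing a latent semantic cluster but not a label, so cluster membership cannot act as a shortcut; the pairing rule and margin are assumptions.

```python
# Illustrative Siamese debiasing sketch; loss weighting and pair construction are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseDebiaser(nn.Module):
    def __init__(self, in_dim: int = 768, emb_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))

    def forward(self, a, b):
        # Shared encoder applied to both sides of the pair, L2-normalised embeddings.
        return F.normalize(self.encoder(a), dim=-1), F.normalize(self.encoder(b), dim=-1)

def debias_contrastive_loss(za, zb, same_label, same_cluster, margin: float = 0.5):
    sim = (za * zb).sum(dim=-1)                        # cosine similarity of each pair
    # Standard Siamese term: same-label pairs attract, different-label pairs repel.
    base = torch.where(same_label, 1.0 - sim, F.relu(sim - margin))
    # Debiasing term: same-cluster, different-label pairs are pushed apart so the
    # latent cluster cannot stand in for the label.
    spurious = same_cluster & ~same_label
    return base.mean() + (F.relu(sim - margin) * spurious.float()).mean()

model = SiameseDebiaser()
za, zb = model(torch.randn(8, 768), torch.randn(8, 768))
loss = debias_contrastive_loss(za, zb,
                               same_label=torch.randint(0, 2, (8,)).bool(),
                               same_cluster=torch.randint(0, 2, (8,)).bool())
```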

Development and interpretation of machine learning-based prognostic models for predicting high-risk prognostic pathological components in pulmonary nodules: integrating clinical features, serum tumor marker and imaging features.

Wang D, Qiu J, Li R, Tian H

pubmed logopapersJun 17 2025
With improvements in imaging, the screening rate of pulmonary nodules (PNs) has further increased, but identifying their High-Risk Prognostic Pathological Components (HRPPC) remains a major challenge. In this study, we aimed to build a multi-parameter machine learning predictive model to improve the discrimination accuracy for HRPPC. This study included 816 patients with pathologically confirmed pulmonary nodules ≤ 3 cm who underwent pulmonary resection. High-resolution chest CT images and clinicopathological characteristics were collected from the patients. Lasso regression was used to identify key features, and a machine learning prediction model was constructed based on the selected features. The discrimination ability of the prediction model was evaluated using receiver operating characteristic (ROC) curves and confusion matrices. Model calibration was evaluated using calibration curves. Decision curve analysis (DCA) was used to evaluate the clinical value of the model. SHAP values were used to interpret the predictive model. A total of 816 patients were included in this study, of whom 112 (13.79%) had HRPPC of pulmonary nodules. By selecting key variables through Lasso recursive feature elimination, we identified 13 key features. The XGB model performed the best, with an area under the ROC curve (AUC) of 0.930 (95% CI: 0.906-0.954) in the training cohort and 0.835 (95% CI: 0.774-0.895) in the validation cohort, indicating excellent predictive performance. In addition, the calibration curves of the XGB model showed good calibration in both cohorts. DCA demonstrated that the predictive model had a positive benefit for general clinical decision-making. SHAP values identified the top 3 predictors of HRPPC in PNs as CT value, nodule long diameter, and PRO-GRP. Our prediction model for identifying HRPPC in PNs has excellent discrimination, calibration, and clinical utility. Thoracic surgeons could make relatively reliable predictions of HRPPC in PNs without invasive testing.
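A hedged sketch of the modelling steps described above follows: an XGBoost classifier fit on the selected features, ROC-AUC evaluation, and SHAP-based interpretation; the synthetic data, feature count, and hyperparameters are placeholders.

```python
# Hedged sketch of XGBoost training with SHAP interpretation on synthetic stand-in data.
import numpy as np
import shap
import xgboost as xgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(816, 13))          # 13 selected features (e.g. CT value, nodule long diameter, PRO-GRP)
y = rng.binomial(1, 0.14, size=816)     # ~13.79% HRPPC prevalence, synthetic labels

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05, eval_metric="auc")
model.fit(X_train, y_train)
print("validation AUC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))

# SHAP values rank feature contributions to each prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_val)
```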

Risk factors and prognostic indicators for progressive fibrosing interstitial lung disease: a deep learning-based CT quantification approach.

Lee K, Lee JH, Koh SY, Park H, Goo JM

pubmed logopapersJun 17 2025
To investigate the value of deep learning-based quantitative CT (QCT) in predicting progressive fibrosing interstitial lung disease (PF-ILD) and assessing prognosis. This single-center retrospective study included ILD patients with CT examinations between January 2015 and June 2021. Each ILD finding (ground-glass opacity (GGO), reticular opacity (RO), honeycombing) and fibrosis (the sum of RO and honeycombing) was quantified from baseline and follow-up CTs. Logistic regression was performed to identify predictors of PF-ILD, defined as radiologic progression along with a forced vital capacity (FVC) decline ≥ 5% predicted. Cox proportional hazards regression was used to assess mortality. The added value of incorporating QCT into FVC was evaluated using the C-index. Among 465 ILD patients (median age [IQR], 65 [58-71] years; 238 men), 148 had PF-ILD. After adjusting for clinico-radiological variables, baseline RO (OR: 1.096, 95% CI: 1.042, 1.152, p < 0.001) and fibrosis extent (OR: 1.035, 95% CI: 1.004, 1.067, p = 0.025) were predictors of PF-ILD. Baseline RO (HR: 1.063, 95% CI: 1.013, 1.115, p = 0.013), honeycombing (HR: 1.074, 95% CI: 1.034, 1.116, p < 0.001), and fibrosis extent (HR: 1.067, 95% CI: 1.043, 1.093, p < 0.001) predicted poor prognosis. The Cox models combining baseline percent predicted FVC with QCT (each ILD finding, C-index: 0.714, 95% CI: 0.660, 0.764; fibrosis, C-index: 0.703, 95% CI: 0.649, 0.752; both p-values < 0.001) outperformed the model without QCT (C-index: 0.545, 95% CI: 0.500, 0.599). Deep learning-based QCT of ILD findings is useful for predicting PF-ILD and its prognosis. Question: Does deep learning-based CT quantification of interstitial lung disease (ILD) findings have value in predicting progressive fibrosing ILD (PF-ILD) and improving prognostication? Findings: Deep learning-based CT quantification of baseline reticular opacity and fibrosis predicted the development of PF-ILD. In addition, CT quantification demonstrated value in predicting all-cause mortality. Clinical relevance: Deep learning-based CT quantification of ILD findings is useful for predicting PF-ILD and its prognosis. Identifying patients at high risk of PF-ILD through CT quantification enables closer monitoring and earlier treatment initiation, which may lead to improved clinical outcomes.
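For illustration, the following sketch fits a Cox proportional hazards model combining percent-predicted FVC with QCT extents using lifelines; the data frame is synthetic and the column names are assumptions.

```python
# Hedged sketch of the survival-modelling step with lifelines on synthetic data.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 465
df = pd.DataFrame({
    "time_months": rng.exponential(36, n),           # follow-up time
    "death": rng.binomial(1, 0.3, n),                # event indicator
    "fvc_pct_pred": rng.normal(75, 15, n),           # baseline % predicted FVC
    "qct_reticular_pct": rng.uniform(0, 30, n),      # QCT reticular opacity extent
    "qct_honeycomb_pct": rng.uniform(0, 15, n),      # QCT honeycombing extent
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time_months", event_col="death")
print(cph.summary[["exp(coef)", "p"]])               # hazard ratios per covariate
print("C-index:", cph.concordance_index_)
```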

Enhancing Ultrasound-Based Diagnosis of Unilateral Diaphragmatic Paralysis with a Visual Transformer-Based Model.

Kalkanis A, Bakalis D, Testelmans D, Buyse B, Simos YV, Tsamis KI, Manis G

pubmed logopapersJun 17 2025
This paper presents a novel methodology that combines a pre-trained Visual Transformer-Based Deep Model (ViT) with a custom denoising image filter for the diagnosis of Unilateral Diaphragmatic Paralysis (UDP) using Ultrasound (US) images. The ViT is employed to extract complex features from US images of 17 volunteers, capturing intricate patterns and details that are critical for accurate diagnosis. The extracted features are then fed into an ensemble learning model to determine the presence of UDP. The proposed framework achieves an average accuracy of 93.8% on a stratified 5-fold cross-validation, surpassing relevant state-of-the-art (SOTA) image classifiers. This high level of performance underscores the robustness and effectiveness of the framework, highlighting its potential as a prominent diagnostic tool in medical imaging.
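A minimal sketch of such a pipeline is given below: a pre-trained ViT used as a frozen feature extractor, followed by an ensemble classifier evaluated with stratified 5-fold cross-validation; the backbone choice, denoising/preprocessing, and classifier are assumptions, not the published configuration.

```python
# Hedged sketch: frozen ViT features feeding an ensemble classifier with stratified 5-fold CV.
import timm
import torch
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

vit = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0).eval()

@torch.no_grad()
def extract_features(images: torch.Tensor) -> torch.Tensor:
    # images: (N, 3, 224, 224) pre-processed (denoised, resized, normalised) US frames.
    return vit(images)                                # (N, 768) pooled ViT features

images = torch.randn(40, 3, 224, 224)                 # synthetic stand-in for US frames
labels = torch.tensor([0, 1] * 20)                    # 1 = UDP present (synthetic labels)

X = extract_features(images).numpy()
clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, labels.numpy(), cv=cv, scoring="accuracy")
print("mean 5-fold accuracy:", scores.mean())
```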