Page 35 of 41408 results

Benchmarking Chest X-ray Diagnosis Models Across Multinational Datasets

Qinmei Xu, Yiheng Li, Xianghao Zhan, Ahmet Gorkem Er, Brittany Dashevsky, Chuanjun Xu, Mohammed Alawad, Mengya Yang, Liu Ya, Changsheng Zhou, Xiao Li, Haruka Itakura, Olivier Gevaert

arXiv preprint, May 21 2025
Foundation models leveraging vision-language pretraining have shown promise in chest X-ray (CXR) interpretation, yet their real-world performance across diverse populations and diagnostic tasks remains insufficiently evaluated. This study benchmarks the diagnostic performance and generalizability of foundation models versus traditional convolutional neural networks (CNNs) on multinational CXR datasets. We evaluated eight CXR diagnostic models (five vision-language foundation models and three CNN-based architectures) across 37 standardized classification tasks using six public datasets from the USA, Spain, India, and Vietnam, and three private datasets from hospitals in China. Performance was assessed using AUROC, AUPRC, and other metrics across both shared and dataset-specific tasks. Foundation models outperformed CNNs in both accuracy and task coverage. MAVL, a model incorporating knowledge-enhanced prompts and structured supervision, achieved the highest performance on public (mean AUROC: 0.82; AUPRC: 0.32) and private (mean AUROC: 0.95; AUPRC: 0.89) datasets, ranking first in 14 of 37 public and 3 of 4 private tasks. All models showed reduced performance on pediatric cases, with average AUROC dropping from 0.88 ± 0.18 in adults to 0.57 ± 0.29 in children (p = 0.0202). These findings highlight the value of structured supervision and prompt design in radiologic AI and suggest future directions including geographic expansion and ensemble modeling for clinical deployment. Code for all evaluated models is available at https://drive.google.com/drive/folders/1B99yMQm7bB4h1sVMIBja0RfUu8gLktCE
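
For readers less familiar with the two headline metrics in the abstract above, the following is a minimal Python sketch of how AUROC and AUPRC might be computed for a single binary classification task. It is an illustration only, not the authors' evaluation code: the function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, AUPRC is approximated by average precision, and tied scores are not handled specially in the average-precision ranking.

```python
def auroc(labels, scores):
    """AUROC via the pairwise (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A positive outscoring a negative counts 1; a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a common step-wise estimate of AUPRC."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    # Rank cases from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy task: 1 = finding present, 0 = finding absent.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
```

Unlike AUROC, AUPRC depends on how common the positive class is, which is one plausible reason a benchmark with rare findings could report a high mean AUROC alongside a much lower mean AUPRC.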

Mask of Truth: Model Sensitivity to Unexpected Regions of Medical Images.

Sourget T, Hestbek-Møller M, Jiménez-Sánchez A, Junchi Xu J, Cheplygina V

PubMed paper, May 20 2025
The development of larger models for medical image analysis has led to increased performance. However, it has also affected our ability to explain and validate model decisions. Models can use non-relevant parts of images, also called spurious correlations or shortcuts, to obtain high performance on benchmark datasets but fail in real-world scenarios. In this work, we challenge the capacity of convolutional neural networks (CNNs) to classify chest X-rays and eye fundus images while masking out clinically relevant parts of the image. We show that all models trained on the PadChest dataset, irrespective of the masking strategy, are able to obtain an area under the curve (AUC) above random. Moreover, the models trained on full images obtain good performance on images without the region of interest (ROI), even superior to that obtained on images containing only the ROI. We also reveal a possible spurious correlation in the Chákṣu dataset, although performance there is more aligned with what would be expected of an unbiased model. We go beyond the performance analysis by using the explainability method SHAP and analyzing embeddings. We asked a radiology resident to interpret chest X-rays under different masking strategies to complement our findings with clinical knowledge.

Federated learning in low-resource settings: A chest imaging study in Africa -- Challenges and lessons learned

Jorge Fabila, Lidia Garrucho, Víctor M. Campello, Carlos Martín-Isla, Karim Lekadir

arXiv preprint, May 20 2025
This study explores the use of Federated Learning (FL) for tuberculosis (TB) diagnosis using chest X-rays in low-resource settings across Africa. FL allows hospitals to collaboratively train AI models without sharing raw patient data, addressing privacy concerns and data scarcity that hinder traditional centralized models. The research involved hospitals and research centers in eight African countries. Most sites used local datasets, while Ghana and The Gambia used public ones. The study compared locally trained models with a federated model built across all institutions to evaluate FL's real-world feasibility. Despite its promise, implementing FL in sub-Saharan Africa faces challenges such as poor infrastructure, unreliable internet, limited digital literacy, and weak AI regulations. Some institutions were also reluctant to share model updates due to data control concerns. In conclusion, FL shows strong potential for enabling AI-driven healthcare in underserved regions, but broader adoption will require improvements in infrastructure, education, and regulatory support.
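
The abstract above does not say which aggregation rule the federated model used. As an illustration only, the sketch below assumes a FedAvg-style rule, in which a coordinator combines per-site model parameters weighted by each site's number of training samples, so that only parameters (never raw images) leave a hospital. The function name `federated_average` and the example numbers are hypothetical.

```python
def federated_average(site_params, site_sizes):
    """Sample-size-weighted average of per-site parameter vectors.

    site_params: list of equal-length lists of floats, one per site.
    site_sizes:  number of local training samples at each site.
    """
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_params[0])
    averaged = [0.0] * dim
    for params, n in zip(site_params, site_sizes):
        if len(params) != dim:
            raise ValueError("all sites must share the same model shape")
        for i, value in enumerate(params):
            # Each site contributes in proportion to its data volume.
            averaged[i] += value * n / total
    return averaged


# Two hypothetical sites: a small one (100 images) and a larger one (300).
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
```

In a real deployment this step would repeat over many rounds, with each site retraining locally from the updated global parameters. The connectivity and infrastructure problems described in the abstract would affect exactly this exchange.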

RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection

Wenjun Hou, Yi Cheng, Kaishuai Xu, Heng Li, Yan Hu, Wenjie Li, Jiang Liu

arXiv preprint, May 20 2025
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, including radiology report generation. Previous approaches have attempted to utilize multimodal LLMs for this task, enhancing their performance through the integration of domain-specific knowledge retrieval. However, these approaches often overlook the knowledge already embedded within the LLMs, leading to redundant information integration and inefficient utilization of learned representations. To address this limitation, we propose RADAR, a framework for enhancing radiology report generation with supplementary knowledge injection. RADAR improves report generation by systematically leveraging both the internal knowledge of an LLM and externally retrieved information. Specifically, it first extracts the model's acquired knowledge that aligns with expert image-based classification outputs. It then retrieves relevant supplementary knowledge to further enrich this information. Finally, by aggregating both sources, RADAR generates more accurate and informative radiology reports. Extensive experiments on MIMIC-CXR, CheXpert-Plus, and IU X-ray demonstrate that our model outperforms state-of-the-art LLMs in both language quality and clinical accuracy.

Diagnostic value of fully automated CT pulmonary angiography in patients with chronic thromboembolic pulmonary hypertension and chronic thromboembolic disease.

Lin Y, Li M, Xie S

PubMed paper, May 20 2025
To evaluate the value of employing artificial intelligence (AI)-assisted CT pulmonary angiography (CTPA) for patients with chronic thromboembolic pulmonary hypertension (CTEPH) and chronic thromboembolic disease (CTED). A single-center, retrospective analysis of 350 sequential patients with right heart catheterization (RHC)-confirmed CTEPH, CTED, and normal controls was conducted. Parameters such as the main pulmonary artery diameter (MPAd), the ratio of MPA to ascending aorta diameter (MPAd/AAd), the ratio of right to left ventricle diameter (RVd/LVd), and the ratio of RV to LV volume (RVv/LVv) were evaluated using automated AI software and compared with manual analysis. The reliability was assessed through an intraclass correlation coefficient (ICC) analysis. The diagnostic accuracy was determined using receiver-operating characteristic (ROC) curves. Compared to CTED and control groups, CTEPH patients were significantly more likely to have elevated automatic CTPA metrics (all p < 0.001). Automated MPAd, MPAd/AAd, and RVv/LVv had a strong correlation with mPAP (r = 0.952, 0.904, and 0.815, respectively, all p < 0.001). The automated and manual CTPA analyses showed strong concordance. For the CTEPH and CTED categories, the optimal area under the curve (AU-ROC) reached 0.939 (CI: 0.908-0.969). In the CTEPH and control groups, the best AU-ROC was 0.970 (CI: 0.953-0.988). In the CTED and control groups, the best AU-ROC was 0.782 (CI: 0.724-0.840). Automated AI-driven CTPA analysis provides a dependable approach for evaluating patients with CTEPH, CTED, and normal controls, demonstrating excellent consistency and efficiency.
Question: Guidelines do not advocate for applying treatment protocols for CTEPH to patients with CTED; early detection of the condition is crucial.
Findings: Automated CTPA analysis was feasible in 100% of patients with good agreement and would have added information for early detection and identification.
Clinical relevance: Automated AI-driven CTPA analysis provides a reliable approach demonstrating excellent consistency and efficiency. Additionally, these noninvasive imaging findings may aid in treatment stratification and determining optimal intervention directed by RHC.

CT-guided CBCT Multi-Organ Segmentation Using a Multi-Channel Conditional Consistency Diffusion Model for Lung Cancer Radiotherapy.

Chen X, Qiu RLJ, Pan S, Shelton J, Yang X, Kesarwala AH

PubMed paper, May 20 2025
In cone beam computed tomography (CBCT)-guided adaptive radiotherapy, rapid and precise segmentation of organs-at-risk (OARs) is essential for accurate dose verification and online replanning. The quality of CBCT images obtained with current onboard CBCT imagers and clinical imaging protocols, however, is often compromised by artifacts such as scatter and motion, particularly for thoracic CBCTs. These artifacts not only degrade image contrast but also obscure anatomical boundaries, making accurate segmentation on CBCT images significantly more challenging compared to planning CT images. To address these persistent challenges, we propose a novel multi-channel conditional consistency diffusion model (MCCDM) for segmentation of OARs in thoracic CBCT images (CBCT-MCCDM), which harnesses its domain transfer capabilities to improve segmentation accuracy across different imaging modalities. By jointly training the MCCDM with CT images and their corresponding masks, our framework enables an end-to-end mapping learning process that generates accurate segmentation of OARs. This CBCT-MCCDM was used to delineate esophagus, heart, the left and right lungs, and spinal cord on CBCT images from each patient with lung cancer. We quantitatively evaluated our approach by comparing model-generated contours with ground truth contours from 33 patients with lung cancer treated with 5-fraction stereotactic body radiation therapy (SBRT), demonstrating its potential to enhance segmentation accuracy despite the presence of challenging CBCT artifacts. The proposed method was evaluated using average Dice similarity coefficients (DSC), sensitivity, specificity, 95th Percentile Hausdorff Distance (HD95), and mean surface distance (MSD) for each of the five OARs. The method achieved average DSC values of 0.82, 0.88, 0.95, 0.96, and 0.96 for the esophagus, heart, left lung, right lung, and spinal cord, respectively. Sensitivity values were 0.813, 0.922, 0.956, 0.958, and 0.929, respectively, while specificity values were 0.991, 0.994, 0.996, 0.996, and 0.995, respectively. We compared the proposed method with two state-of-the-art methods, a CBCT-only method and U-Net, and demonstrated that the proposed CBCT-MCCDM.
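
The overlap metrics reported in the abstract above (DSC, sensitivity, specificity) can be illustrated with a minimal Python sketch. It assumes binary masks flattened to one value per voxel (1 = inside the organ, 0 = outside). The function name `segmentation_metrics` and the toy masks are hypothetical, and the distance-based metrics (HD95, MSD) are not covered because they require surface geometry.

```python
def segmentation_metrics(pred, truth):
    """Dice, sensitivity, and specificity for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1  # voxel correctly labeled as organ
        elif p and not t:
            fp += 1  # voxel wrongly labeled as organ
        elif t:
            fn += 1  # organ voxel that was missed
        else:
            tn += 1  # background correctly left unlabeled
    overlap_denominator = 2 * tp + fp + fn
    return {
        # Convention assumed here: two empty masks agree perfectly.
        "dice": 2 * tp / overlap_denominator if overlap_denominator else 1.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical 8-voxel example.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = segmentation_metrics(predicted, reference)
```

Because background voxels usually far outnumber organ voxels in a CT or CBCT volume, specificity tends to sit close to 1 even for imperfect contours, so Dice and sensitivity are generally the more discriminating of the three.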

Fusing radiomics and deep learning features for automated classification of multi-type pulmonary nodule.

Du L, Tang G, Che Y, Ling S, Chen X, Pan X

PubMed paper, May 20 2025
The accurate classification of lung nodules is critical to achieving personalized lung cancer treatment and prognosis prediction. The treatment options for lung cancer and the prognosis of patients are closely related to the type of lung nodules, but there are many types of lung nodules, and the distinctions between certain types are subtle, making accurate classification based on traditional medical imaging technology and doctor experience challenging. In this study, a novel method was used to analyze quantitative features in CT images using CT radiomics to reveal the characteristics of pulmonary nodules, and then feature fusion was used to integrate radiomics features and deep learning features to improve the accuracy of classification. This paper proposes a fusion feature pulmonary nodule classification method that fuses radiomics features with deep learning neural network features, aiming to automatically classify different types of pulmonary nodules (such as Malignancy, Calcification, Spiculation, Lobulation, Margin, and Texture). By introducing the Discriminant Correlation Analysis feature fusion algorithm, the method maximizes the complementarity between the two types of features and the differences between different classes. This ensures interaction between the information, effectively utilizing the complementary characteristics of the features. The LIDC-IDRI dataset is used for training, and the fusion feature model has been validated for its advantages and effectiveness in classifying multiple types of pulmonary nodules. The experimental results show that the fusion feature model outperforms the single-feature model in all classification tasks. The AUCs for the tasks of classifying Calcification, Lobulation, Margin, Spiculation, Texture, and Malignancy reached 0.9663, 0.8113, 0.8815, 0.8140, 0.9010, and 0.9316, respectively. 
In tasks such as nodule calcification and texture classification, the fusion feature model significantly improved the recognition ability of minority classes. The fusion of radiomics features and deep learning neural network features can effectively enhance the overall performance of pulmonary nodule classification models while also improving the recognition of minority classes when there is a significant class imbalance.

Artificial intelligence based pulmonary vessel segmentation: an opportunity for automated three-dimensional planning of lung segmentectomy.

Mank QJ, Thabit A, Maat APWM, Siregar S, Van Walsum T, Kluin J, Sadeghi AH

PubMed paper, May 19 2025
This study aimed to develop an automated method for pulmonary artery and vein segmentation in both left and right lungs from computed tomography (CT) images using artificial intelligence (AI). The segmentations were evaluated using PulmoSR software, which provides 3D visualizations of patient-specific anatomy, potentially enhancing a surgeon's understanding of the lung structure. A dataset of 125 CT scans from lung segmentectomy patients at Erasmus MC was used. Manual annotations for pulmonary arteries and veins were created with 3D Slicer. nnU-Net models were trained for both lungs and assessed using Dice score, sensitivity, and specificity. Intraoperative recordings demonstrated clinical applicability. A paired t-test evaluated statistical significance of the differences between automatic and manual segmentations. The nnU-Net model, trained at full 3D resolution, achieved a mean Dice score between 0.91 and 0.92. The mean sensitivity and specificity were: left artery: 0.86 and 0.99, right artery: 0.84 and 0.99, left vein: 0.85 and 0.99, right vein: 0.85 and 0.99. The automatic method reduced segmentation time from ∼1.5 hours to under 5 min. Five cases were evaluated to demonstrate how the segmentations support lung segmentectomy procedures. P-values for Dice scores were all below 0.01, indicating statistical significance. The nnU-Net models successfully performed automatic segmentation of pulmonary arteries and veins in both lungs. When integrated with visualization tools, these automatic segmentations can enhance preoperative and intraoperative planning by providing detailed 3D views of patient anatomy.

Non-invasive CT based multiregional radiomics for predicting pathologic complete response to preoperative neoadjuvant chemoimmunotherapy in non-small cell lung cancer.

Fan S, Xie J, Zheng S, Wang J, Zhang B, Zhang Z, Wang S, Cui Y, Liu J, Zheng X, Ye Z, Cui X, Yue D

PubMed paper, May 19 2025
This study aims to develop and validate a multiregional radiomics model to predict pathological complete response (pCR) to neoadjuvant chemoimmunotherapy in non-small cell lung cancer (NSCLC), and further evaluate the performance of the model in different specific subgroups (N2 stage and anti-PD-1/PD-L1). A total of 216 patients with NSCLC who underwent neoadjuvant chemoimmunotherapy followed by surgical intervention were included and randomly assigned to training and validation sets. From pre-treatment baseline CT, one intratumoral (T) and two peritumoral regions (P3: 0-3 mm; P6: 0-6 mm) were extracted. Five radiomics models were developed using machine learning algorithms to predict pCR, utilizing selected features from intratumoral (T), peritumoral (P3, P6), and combined intra- and peritumoral regions (T + P3, T + P6). Additionally, the predictive efficacy of the optimal model was specifically assessed for patients in the N2 stage and anti-PD-1/PD-L1 subgroups. A total of 51.4% (111/216) of patients exhibited pCR following neoadjuvant chemoimmunotherapy. Multivariable analysis identified that only the T + P3 radiomics signature served as an independent predictor of pCR (P < 0.001). The multiregional radiomics model (T + P3) exhibited superior predictive performance for pCR, achieving an area under the curve (AUC) of 0.75 in the validation cohort. Furthermore, this multiregional model maintained robust predictive accuracy in both N2 stage and anti-PD-1/PD-L1 subgroups, with an AUC of 0.829 and 0.833, respectively. The proposed multiregional radiomics model showed potential in predicting pCR in NSCLC after neoadjuvant chemoimmunotherapy, and demonstrated good predictive performance in different specific subgroups. This capability may assist clinicians in identifying suitable candidates for neoadjuvant chemoimmunotherapy and promote the advancement in precision therapy.

Diagnosis of early idiopathic pulmonary fibrosis: current status and future perspective.

Wang X, Xia X, Hou Y, Zhang H, Han W, Sun J, Li F

PubMed paper, May 19 2025
The standard approach to diagnosing idiopathic pulmonary fibrosis (IPF) includes identifying the usual interstitial pneumonia (UIP) pattern via high resolution computed tomography (HRCT) or lung biopsy and excluding known causes of interstitial lung disease (ILD). However, limitations of manual interpretation of lung imaging, along with other reasons such as lack of relevant knowledge and non-specific symptoms have hindered the timely diagnosis of IPF. This review proposes the definition of early IPF, emphasizes the diagnostic urgency of early IPF, and highlights current diagnostic strategies and future prospects for early IPF. The integration of artificial intelligence (AI), specifically machine learning (ML) and deep learning (DL), is revolutionizing the diagnostic procedure of early IPF by standardizing and accelerating the interpretation of thoracic images. Innovative bronchoscopic techniques such as transbronchial lung cryobiopsy (TBLC), genomic classifier, and endobronchial optical coherence tomography (EB-OCT) provide less invasive diagnostic alternatives. In addition, chest auscultation, serum biomarkers, and susceptibility genes are pivotal for the indication of early diagnosis. Ongoing research is essential for refining diagnostic methods and treatment strategies for early IPF.