Latest Papers on Radiology AI. Tags: Chest

Artificial intelligence for diagnosis in interstitial lung disease and digital ontology for unclassified interstitial lung disease.

Baba T, Goto T, Kitamura Y, Iwasawa T, Okudela K, Takemura T, Osawa A, Ogura T

•papers•Sep 24 2025

Multidisciplinary discussion (MDD) is the gold standard for diagnosis in interstitial lung disease (ILD). However, its inter-rater agreement is not satisfactory, and access to the MDD is limited due to a shortage of ILD experts. Therefore, artificial intelligence would be helpful for diagnosing ILD. We retrospectively analyzed data from 630 patients with ILD, including clinical information, CT images, and pathological results. The ILD classification into four clinicopathologic entities (i.e., idiopathic pulmonary fibrosis, non-specific interstitial pneumonia, hypersensitivity pneumonitis, connective tissue disease) consists of two stages: first, pneumonia pattern classification of CT images using a convolutional neural network (CNN) model; second, multimodal (clinical, radiological, and pathological) classification using a support vector machine (SVM). The performance of the classification algorithm was evaluated using 5-fold cross-validation. The mean accuracies of the CNN model and SVM were 62.4 % and 85.4 %, respectively. For multimodal classification using SVM, the overall accuracy was very high, especially with sensitivities for idiopathic pulmonary fibrosis and hypersensitivity pneumonitis exceeding 90 %. When pneumonia patterns from CT images, pathological results, or clinical information were not used, the SVM accuracy was 84.3 %, 70.3 % and 79.8 %, respectively, suggesting that pathological results contributed most to MDD diagnosis. When an unclassifiable interstitial pneumonia was input, the SVM output tended to align with the most likely diagnosis by the expert MDD team. The algorithm based on multimodal information can assist in diagnosing interstitial lung disease and is suitable for ontology diagnosis.

CT Classification Chest Retrospective Clinical In Silico Academic Lab
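The ablation figures in the abstract above can be read as accuracy drops relative to the full multimodal SVM (85.4 %). The sketch below is only an arithmetic illustration of the reported numbers; the dictionary keys and function name are illustrative labels, not taken from the paper.

```python
# Reported mean accuracy of the full multimodal SVM (from the abstract), in percent.
FULL_ACCURACY = 85.4

# Reported SVM accuracy when one input is withheld (from the abstract), in percent.
# The key names are illustrative labels, not the authors' terminology.
ACCURACY_WITHOUT = {
    "ct_pneumonia_pattern": 84.3,
    "pathology": 70.3,
    "clinical_information": 79.8,
}


def accuracy_drops(full, ablated):
    """Return the percentage-point drop in accuracy for each withheld input."""
    return {name: round(full - acc, 1) for name, acc in ablated.items()}


drops = accuracy_drops(FULL_ACCURACY, ACCURACY_WITHOUT)
most_important = max(drops, key=drops.get)

# List the inputs from the largest to the smallest accuracy drop.
for name, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: -{drop} percentage points")
print("largest drop:", most_important)
```

Withholding pathology costs 15.1 percentage points, against 5.6 for clinical information and 1.1 for CT patterns, which is the basis for the abstract's statement that pathological results contributed most.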

Artificial Intelligence Chest CT Imaging for the Diagnosis of Tuberculosis-Destroyed Lung with PH.

Yu W, Liu M, Qin W, Liu J, Chen S, Chen Y, Hu B, Chen Y, Liu E, Jin X, Liu S, Li C, Zhu Z

•papers•Sep 24 2025

This study explored the clinical characteristics of tuberculosis-destroyed lung (TDL) with pulmonary hypertension (PH) and used artificial intelligence (AI) chest CT imaging to diagnose PH in patients with TDL. Fifty-one patients with TDL were included. Based on the results of right heart catheterization, the patients were divided into two groups: TDL with PH (n=31) and TDL without PH (n=20). The original chest CT data of the patients were reconstructed, segmented, and rendered using AI, and lung volume-related data were calculated. The differences in clinical data, hemodynamic data, and lung volume-related data between the two groups were compared. The proportion of TDL patients with PH was significantly higher than that of those without TDL (61.82% vs. 22.64%, P<0.01). There were significant differences between the two groups in pulmonary function, PCWP/PVR, PASP/TRV, and total volume of destroyed lung tissue (VTDLT) (P<0.05), and VTDLT was positively correlated with mean pulmonary arterial pressure (mPAP). For the combined diagnosis (VTDLT + PASP), the area under the curve (AUC) was 0.917 (95% CI: 0.802-1), with a predicted probability of 0.51 and a Youden index of 0.789. The sensitivity was 90% and the specificity was 88.9%. TDL accompanied by PH is associated with restrictive disorders. The VTDLT is positively correlated with mPAP. Calculating the VTDLT and combining it with the PASP estimated from echocardiography assists in the diagnosis of PH in these patients.

CT Segmentation Chest Retrospective Clinical In Silico
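The Youden index quoted in the abstract above follows directly from the reported sensitivity and specificity, since J = sensitivity + specificity - 1. A minimal check, with an illustrative function name:

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract for VTDLT + PASP.
j = youden_index(sensitivity=0.90, specificity=0.889)
print(f"Youden index: {j:.3f}")
```

The result agrees with the reported value of 0.789.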

Comparative Evaluation of Radiomics and Deep Learning Models for Disease Detection in Chest Radiography.

He Z, McMillan AB

•papers•Sep 23 2025

The application of artificial intelligence (AI) in medical imaging has revolutionized diagnostic practices, enabling advanced analysis and interpretation of radiological data. This study presents a comprehensive evaluation of radiomics-based and deep learning-based approaches for disease detection in chest radiography, focusing on COVID-19, lung opacity, and viral pneumonia. While deep learning models, particularly convolutional neural networks (CNNs) and vision transformers (ViTs), learn directly from image data, radiomics-based models extract handcrafted features, offering potential advantages in data-limited scenarios. We systematically compared the diagnostic performance of various AI models, including Decision Trees, Gradient Boosting, Random Forests, Support Vector Machines (SVMs), and Multi-Layer Perceptrons (MLPs) for radiomics, against state-of-the-art deep learning models such as InceptionV3, EfficientNetL, and ConvNeXtXLarge. Performance was evaluated across multiple sample sizes. At 24 samples, EfficientNetL achieved an AUC of 0.839, outperforming SVM (AUC = 0.762). At 4000 samples, InceptionV3 achieved the highest AUC of 0.996, compared to 0.885 for Random Forest. A Scheirer-Ray-Hare test confirmed significant main and interaction effects of model type and sample size on all metrics. Post hoc Mann-Whitney U tests with Bonferroni correction further revealed consistent performance advantages for deep learning models across most conditions. These findings provide statistically validated, data-driven recommendations for model selection in diagnostic AI. Deep learning models demonstrated higher performance and better scalability with increasing data availability, while radiomics-based models may remain useful in low-data contexts. This study addresses a critical gap in AI-based diagnostic research by offering practical guidance for deploying AI models across diverse clinical environments.

X-Ray Classification Chest Methodology In Silico

Imaging in chronic thromboembolic pulmonary hypertension: review of the current literature.

Hekimoglu K, Gopalan D, Onur MR, Kahraman G, Akay T

•papers•Sep 23 2025

Chronic thromboembolic pulmonary hypertension (CTEPH) is a severe, life-threatening complication of pulmonary embolism with pulmonary hypertension (PH). The combination of insufficient resolution of thrombi following pulmonary emboli and accompanying microvascular disease results in PH. Advances in imaging can offer better insight into CTEPH diagnosis and management, but lack of disease awareness among radiologists has been shown to be a cause of CTEPH misdiagnosis or delayed diagnosis. This review highlights features pertinent to CTEPH diagnosis. The primary focus is on different modalities with their distinctive signs and newly developed technologies employing artificial intelligence systems.

Mixed Modality Classification Chest Review Concept

Threshold optimization in AI chest radiography analysis: integrating real-world data and clinical subgroups.

Rudolph J, Huemmer C, Preuhs A, Buizza G, Dinkel J, Koliogiannis V, Fink N, Goller SS, Schwarze V, Heimer M, Hoppe BF, Liebig T, Ricke J, Sabel BO, Rueckel J

•papers•Sep 22 2025

Manufacturer-defined AI thresholds for chest x-ray (CXR) often lack customization options. Threshold optimization strategies utilizing users' clinical real-world data along with pathology-enriched validation data may better address subgroup-specific and user-specific needs. A pathology-enriched dataset (study cohort, 563 CXRs) with pleural effusions, consolidations, pneumothoraces, nodules, and unremarkable findings was analysed by an AI system and six reference radiologists. The same AI model was applied to a routine dataset (clinical cohort, 15,786 consecutive routine CXRs). Iterative receiver operating characteristic analysis linked achievable sensitivities (study cohort) to resulting AI alert rates in clinical routine inpatient or outpatient subgroups. "Optimized" thresholds (OTs) were defined by a 1% sensitivity increase leading to more than a 1% rise in AI alert rates. Threshold comparisons (OTs versus AI vendor's default thresholds (AIDT) versus Youden's thresholds) were based on 400 clinical cohort cases with expert radiologists' reference. AIDTs, OTs, and Youden's thresholds varied across scenarios, with OTs differing based on tailoring for inpatient or outpatient CXRs. AIDT lowering most reasonably improved sensitivity for pleural effusion, with increases from 46.8% (AIDT) to 87.2% (OT) for outpatients and from 76.3% (AIDT) to 93.5% (OT) for inpatients; similar trends appeared for consolidations. Conversely, regarding inpatient nodule detection, increasing the threshold improved accuracy from 69.5% (AIDT) to 82.5% (OT) without compromising sensitivity. Graphical analysis supports threshold selection by illustrating estimated sensitivities and clinical routine AI alert rates. An innovative, subgroup-specific AI threshold optimization is proposed, automatically implemented and transferable to other AI algorithms and varying clinical subgroup settings.
Individually customizing thresholds tailored to specific medical experts' needs and patient subgroup characteristics is promising and may enhance diagnostic accuracy and the clinical acceptance of diagnostic AI algorithms. Customizing AI thresholds individually addresses specific user/patient subgroup needs. The presented approach utilizes pathology-enriched and real-world subgroup data for optimization. Potential is shown by comparing individualized thresholds with vendor defaults. Distinct thresholds for in- and outpatient CXR AI analysis may improve perception. The automated pipeline methodology is transferable to other AI models or subgroups.

X-Ray Detection Chest Retrospective Clinical In Silico Academic Lab
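One possible reading of the "optimized threshold" rule in the abstract above is sketched below: lower the threshold step by step and stop once a gain in sensitivity costs more than an equal rise in alert rate. This is an interpretation, not the authors' pipeline. The function name, the input layout, and every number in `example_points` are hypothetical and chosen only to show the mechanics.

```python
def pick_operating_point(points, max_alert_per_sens=1.0):
    """Pick an operating point from (threshold, sensitivity_pct, alert_rate_pct) tuples.

    `points` must be ordered from the highest threshold to the lowest, so that
    sensitivity and alert rate do not decrease along the list. Each step to a
    lower threshold is accepted while the added alert rate per percentage point
    of added sensitivity stays at or below `max_alert_per_sens`.
    """
    chosen = points[0]
    for prev, nxt in zip(points, points[1:]):
        d_sens = nxt[1] - prev[1]    # sensitivity gained by lowering the threshold
        d_alert = nxt[2] - prev[2]   # alert rate added by lowering the threshold
        if d_sens <= 0 or d_alert / d_sens > max_alert_per_sens:
            break  # no gain, or the gain costs too many extra alerts
        chosen = nxt
    return chosen


# Hypothetical values for illustration only; not data from the study.
example_points = [
    (0.9, 50.0, 5.0),
    (0.8, 60.0, 8.0),
    (0.7, 70.0, 14.0),
    (0.6, 75.0, 24.0),
    (0.5, 78.0, 40.0),
]

selected = pick_operating_point(example_points)
print("selected threshold:", selected[0])
```

In this toy input the step from 0.7 to 0.6 adds 5 points of sensitivity for 10 points of alert rate, so the walk stops at 0.7. Running the same function on separate inpatient and outpatient alert-rate curves would yield subgroup-specific thresholds in the spirit of the abstract.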

Machine learning predicts severe adverse events and salvage success of CT-guided lung biopsy after nondiagnostic transbronchial lung biopsy.

Yang S, Hua Z, Chen Y, Liu L, Wang Z, Cheng Y, Wang J, Xu Z, Chen C

•papers•Sep 22 2025

To address the unmet clinical need for validated risk stratification tools in salvage CT-guided percutaneous lung biopsy (PNLB) following nondiagnostic transbronchial lung biopsy (TBLB), we aimed to develop machine learning models predicting severe adverse events (SAEs) in PNLB (Model 1) and diagnostic success of salvage PNLB post-TBLB failure (Model 2). This multicenter predictive modeling study enrolled 2910 cases undergoing PNLB across two centers (Center 1: n = 2653 (2016-2020); Center 2: n = 257 (2017-2022)) with complete imaging and clinical documentation meeting predefined inclusion and exclusion criteria. Key variables were selected via LASSO regression, followed by development and validation of Model 1 (incorporating sex, smoking, pleural contact, lesion size, and puncture depth) and Model 2 (including age, lesion size, lesion characteristics, and post-bronchoscopic pathological categories (PBPCs)) using ten machine learning algorithms. Model performance was rigorously evaluated through discrimination metrics, calibration curves, and decision curve analysis to assess clinical applicability. A total of 2653 and 257 PNLB cases were included from the two centers. In external validation, Model 1 achieved ROC-AUC 0.717 (95% CI: 0.609-0.825) and PR-AUC 0.258 (95% CI: 0.0365-0.708), while Model 2 exhibited ROC-AUC 0.884 (95% CI: 0.784-0.984) and PR-AUC 0.852 (95% CI: 0.784-0.896), with XGBoost outperforming other algorithms. The dual XGBoost system stratifies salvage PNLB candidates by quantifying SAE risks (AUC = 0.717) versus diagnostic yield (AUC = 0.884), addressing the unmet need for personalized biopsy pathway optimization. Question: Current tools cannot quantify severe adverse event (SAE) risks versus salvage diagnostic success for CT-guided lung biopsy (PNLB) after failed transbronchial biopsy (TBLB).
Findings: Dual XGBoost models successfully predicted the risks of PNLB SAEs (AUC = 0.717) and diagnostic success post-TBLB failure (AUC = 0.884) with validated clinical stratification benefits. Clinical relevance: The dual XGBoost system guides clinical decision-making by integrating individual risk of SAEs with predictors of diagnostic success, enabling personalized salvage biopsy strategies that balance safety and diagnostic yield.

CT Classification Chest Retrospective Clinical In Silico Academic Lab

Beyond Diagnosis: Evaluating Multimodal LLMs for Pathology Localization in Chest Radiographs

Advait Gosai, Arun Kavishwar, Stephanie L. McNamara, Soujanya Samineni, Renato Umeton, Alexander Chowdhury, William Lotter

•preprint•Sep 22 2025

Recent work has shown promising performance of frontier large language models (LLMs) and their multimodal counterparts in medical quizzes and diagnostic tasks, highlighting their potential for broad clinical utility given their accessible, general-purpose nature. However, beyond diagnosis, a fundamental aspect of medical image interpretation is the ability to localize pathological findings. Evaluating localization not only has clinical and educational relevance but also provides insight into a model's spatial understanding of anatomy and disease. Here, we systematically assess two general-purpose MLLMs (GPT-4 and GPT-5) and a domain-specific model (MedGemma) in their ability to localize pathologies on chest radiographs, using a prompting pipeline that overlays a spatial grid and elicits coordinate-based predictions. Averaged across nine pathologies in the CheXlocalize dataset, GPT-5 exhibited a localization accuracy of 49.7%, followed by GPT-4 (39.1%) and MedGemma (17.7%), all lower than a task-specific CNN baseline (59.9%) and a radiologist benchmark (80.1%). Despite modest performance, error analysis revealed that GPT-5's predictions were largely in anatomically plausible regions, just not always precisely localized. GPT-4 performed well on pathologies with fixed anatomical locations, but struggled with spatially variable findings and exhibited anatomically implausible predictions more frequently. MedGemma demonstrated the lowest performance on all pathologies, showing limited capacity to generalize to this novel task. Our findings highlight both the promise and limitations of current MLLMs in medical imaging and underscore the importance of integrating them with task-specific tools for reliable use.

X-Ray Segmentation Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

CPT-4DMR: Continuous sPatial-Temporal Representation for 4D-MRI Reconstruction

Xinyang Wu, Muheng Li, Xia Li, Orso Pusterla, Sairos Safai, Philippe C. Cattin, Antony J. Lomax, Ye Zhang

•preprint•Sep 22 2025

Four-dimensional MRI (4D-MRI) is a promising technique for capturing respiratory-induced motion in radiation therapy planning and delivery. Conventional 4D reconstruction methods, which typically rely on phase binning or separate template scans, struggle to capture temporal variability, complicate workflows, and impose heavy computational loads. We introduce a neural representation framework that considers respiratory motion as a smooth, continuous deformation steered by a 1D surrogate signal, completely replacing the conventional discrete sorting approach. The new method fuses motion modeling with image reconstruction through two synergistic networks: the Spatial Anatomy Network (SAN) encodes a continuous 3D anatomical representation, while a Temporal Motion Network (TMN), guided by Transformer-derived respiratory signals, produces temporally consistent deformation fields. Evaluation using a free-breathing dataset of 19 volunteers demonstrates that our template- and phase-free method accurately captures both regular and irregular respiratory patterns, while preserving vessel and bronchial continuity with high anatomical fidelity. The proposed method significantly improves efficiency, reducing the total processing time from approximately five hours required by conventional discrete sorting methods to just 15 minutes of training. Furthermore, it enables inference of each 3D volume in under one second. The framework accurately reconstructs 3D images at any respiratory state, achieves superior performance compared to conventional methods, and demonstrates strong potential for application in 4D radiation therapy planning and real-time adaptive treatment.

MRI Reconstruction Chest Methodology In Silico Academic Lab Reproducibility Breakthrough

Explainable AI-driven analysis of radiology reports using text and image data: An experimental study.

Zamir MT, Khan SU, Gelbukh A, Felipe Riverón EM, Gelbukh I

•papers•Sep 22 2025

Artificial intelligence is increasingly being integrated into clinical diagnostics, yet its lack of transparency hinders trust and adoption among healthcare professionals. Explainable AI (XAI) has the potential to improve the interpretability and reliability of AI-based decisions in clinical practice. This study evaluates the use of XAI for interpreting radiology reports to improve healthcare practitioners' confidence in and comprehension of AI-assisted diagnostics. The study employed the Indiana University chest X-ray dataset, containing 3169 textual reports and 6471 images. Textual reports were classified as either normal or abnormal using a range of machine learning approaches, including traditional machine learning models and ensemble methods, deep learning models (LSTM), and advanced transformer-based language models (GPT-2, T5, LLaMA-2, LLaMA-3.1). For image-based classification, convolutional neural networks (CNNs), including DenseNet121 and DenseNet169, were used. Top-performing models were interpreted using the XAI methods SHAP and LIME to support clinical decision making by enhancing transparency and trust in model predictions. The LLaMA-3.1 model achieved the highest accuracy, 98%, in classifying the textual radiology reports. Statistical analysis confirmed the model's robustness: Cohen's kappa (k=0.981) indicated near-perfect agreement beyond chance, and both the chi-square test and Fisher's exact test revealed a highly significant association between actual and predicted labels (p<0.0001). McNemar's test yielded a non-significant result (p=0.25), suggesting balanced class performance. For imaging data, the highest accuracy of 84% was achieved using the DenseNet169 and DenseNet121 models. To assess explainability, LIME and SHAP were applied to the best-performing models.
These methods consistently highlighted medical terms such as "opacity", "consolidation" and "pleural" as clear indicators of abnormal findings in textual reports. The research underscores that explainability is an essential component of any AI system used in diagnostics and is helpful in the design and implementation of AI in the healthcare sector. Such an approach improves diagnostic accuracy and builds confidence among health workers, who will use explainable AI in clinical settings in the future.

X-Ray Classification Chest Methodology In Silico Academic Lab GenAI
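Cohen's kappa, cited in the abstract above, corrects observed agreement for the agreement expected by chance from the label marginals. The sketch below shows the standard binary-case calculation. The confusion-matrix counts are hypothetical and are not the study's data, and the function name is illustrative.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix of actual vs. predicted labels."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total  # fraction of cases where labels agree

    # Chance agreement: product of the predicted and actual marginal rates,
    # summed over the positive and negative classes.
    chance_pos = ((tp + fp) / total) * ((tp + fn) / total)
    chance_neg = ((fn + tn) / total) * ((fp + tn) / total)
    expected = chance_pos + chance_neg

    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for illustration only (abnormal = positive class).
kappa = cohens_kappa(tp=45, fp=5, fn=5, tn=45)
print(f"kappa = {kappa:.2f}")
```

With these toy counts the observed agreement is 0.90 and the chance agreement is 0.50, giving a kappa of 0.80.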

Causal Representation Learning from Multimodal Clinical Records under Non-Random Modality Missingness

Zihan Liang, Ziwen Pan, Ruoxuan Xiong

•preprint•Sep 21 2025

Clinical notes contain rich patient information, such as diagnoses or medications, making them valuable for patient representation learning. Recent advances in large language models have further improved the ability to extract meaningful representations from clinical texts. However, clinical notes are often missing. For example, in our analysis of the MIMIC-IV dataset, 24.5% of patients have no available discharge summaries. In such cases, representations can be learned from other modalities such as structured data, chest X-rays, or radiology reports. Yet the availability of these modalities is influenced by clinical decision-making and varies across patients, resulting in modality missing-not-at-random (MMNAR) patterns. We propose a causal representation learning framework that leverages observed data and informative missingness in multimodal clinical records. It consists of: (1) an MMNAR-aware modality fusion component that integrates structured data, imaging, and text while conditioning on missingness patterns to capture patient health and clinician-driven assignment; (2) a modality reconstruction component with contrastive learning to ensure semantic sufficiency in representation learning; and (3) a multitask outcome prediction model with a rectifier that corrects for residual bias from specific modality observation patterns. Comprehensive evaluations across MIMIC-IV and eICU show consistent gains over the strongest baselines, achieving up to 13.8% AUC improvement for hospital readmission and 13.1% for ICU admission.

Mixed Modality Classification Chest Methodology In Silico Benchmark SOTA
