A Comparative Evaluation of Meta-Learning Models for Few-Shot Chest X-Ray Disease Classification.

Quiñonez-Baca LC, Ramirez-Alonso G, Gaxiola F, Manzo-Martinez A, Cornejo R, Lopez-Flores DR

PubMed · Sep 21 2025
Background/Objectives: The limited availability of labeled data, particularly in the medical domain, poses a significant challenge for training accurate diagnostic models. While deep learning techniques have demonstrated notable efficacy in image-based tasks, they require large annotated datasets, and in data-scarce scenarios, especially those involving rare diseases, their performance deteriorates significantly. Meta-learning offers a promising alternative by enabling models to adapt quickly to new tasks using prior knowledge and only a few labeled examples. This study evaluates the effectiveness of representative meta-learning models for thoracic disease classification in chest X-rays. Methods: We conduct a comparative evaluation of four meta-learning models: Prototypical Networks, Relation Networks, MAML, and FoMAML. First, we assess five backbone architectures (ConvNeXt, DenseNet-121, ResNet-50, MobileNetV2, and ViT) using a Prototypical Network. The best-performing backbone is then used across all meta-learning models for a fair comparison. Experiments are performed on the ChestX-ray14 dataset under a 2-way setting with multiple k-shot configurations. Results: Prototypical Networks combined with DenseNet-121 achieved the best performance in the 2-way, 10-shot configuration, with a recall of 68.1%, an F1-score of 67.4%, and a precision of 69.3%. In a disease-specific analysis, Hernia obtained the best classification results. Furthermore, Prototypical and Relation Networks demonstrate significantly higher computational efficiency, requiring fewer FLOPs and shorter execution times than MAML and FoMAML. Conclusions: Prototype-based meta-learning, particularly with DenseNet-121, proves to be a robust and computationally efficient approach for few-shot chest X-ray disease classification. These findings highlight its potential for real-world clinical applications, especially in scenarios with limited annotated medical data.
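
The prototype classification at the core of the winning model is simple enough to sketch. Below is a minimal, illustrative PyTorch version of a 2-way, k-shot episode (not the authors' code): class prototypes are the mean support embeddings, and queries are scored by negative squared Euclidean distance to each prototype. The toy linear backbone is a stand-in for the DenseNet-121 feature extractor the paper selects.

```python
import torch

def prototypical_episode(embed, support_x, support_y, query_x, n_way=2):
    """Classify query images against class prototypes (mean support embeddings).

    embed     : any feature extractor, e.g. a DenseNet-121 trunk (stand-in here)
    support_x : (n_way * k_shot, C, H, W) labelled support images
    support_y : (n_way * k_shot,) integer class labels in [0, n_way)
    query_x   : (n_query, C, H, W) images to classify
    """
    z_support = embed(support_x)                      # (n_way*k, d)
    z_query = embed(query_x)                          # (n_query, d)
    # One prototype per class: the mean of its support embeddings.
    prototypes = torch.stack(
        [z_support[support_y == c].mean(dim=0) for c in range(n_way)]
    )                                                 # (n_way, d)
    # Logits are negative squared Euclidean distances to each prototype.
    logits = -torch.cdist(z_query, prototypes) ** 2   # (n_query, n_way)
    return logits.softmax(dim=-1)

# Toy usage with a linear "backbone" on flattened 8x8 single-channel images.
embed = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 16))
support_x, query_x = torch.randn(20, 1, 8, 8), torch.randn(5, 1, 8, 8)
support_y = torch.arange(20) % 2                      # 2-way, 10-shot episode
print(prototypical_episode(embed, support_x, support_y, query_x).shape)
```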

Causal Representation Learning from Multimodal Clinical Records under Non-Random Modality Missingness

Zihan Liang, Ziwen Pan, Ruoxuan Xiong

arXiv preprint · Sep 21 2025
Clinical notes contain rich patient information, such as diagnoses or medications, making them valuable for patient representation learning. Recent advances in large language models have further improved the ability to extract meaningful representations from clinical texts. However, clinical notes are often missing. For example, in our analysis of the MIMIC-IV dataset, 24.5% of patients have no available discharge summaries. In such cases, representations can be learned from other modalities such as structured data, chest X-rays, or radiology reports. Yet the availability of these modalities is influenced by clinical decision-making and varies across patients, resulting in modality missing-not-at-random (MMNAR) patterns. We propose a causal representation learning framework that leverages observed data and informative missingness in multimodal clinical records. It consists of: (1) an MMNAR-aware modality fusion component that integrates structured data, imaging, and text while conditioning on missingness patterns to capture patient health and clinician-driven assignment; (2) a modality reconstruction component with contrastive learning to ensure semantic sufficiency in representation learning; and (3) a multitask outcome prediction model with a rectifier that corrects for residual bias from specific modality observation patterns. Comprehensive evaluations across MIMIC-IV and eICU show consistent gains over the strongest baselines, achieving up to 13.8% AUC improvement for hospital readmission and 13.1% for ICU admission.
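
The paper's MMNAR-aware fusion component is described only at a high level; the sketch below illustrates the core idea under our own assumptions: missing modalities are zero-imputed, and the binary availability mask is embedded and fused alongside the modality features, so that which modalities were ordered is itself a learnable signal. All module names and dimensions are invented for illustration.

```python
import torch
import torch.nn as nn

class MMNARFusion(nn.Module):
    """Fuse per-modality embeddings while conditioning on the missingness mask.

    Illustrative sketch: missing modalities are zero-imputed, and the binary
    availability mask is embedded and concatenated, letting the model exploit
    the fact that *which* modalities were ordered is informative (MMNAR).
    """
    def __init__(self, dims=(32, 64, 128), hidden=64):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.mask_embed = nn.Linear(len(dims), hidden)
        self.fuse = nn.Sequential(
            nn.Linear(hidden * (len(dims) + 1), hidden), nn.ReLU())

    def forward(self, modalities, mask):
        # modalities: list of (B, d_i) tensors (zeros where missing)
        # mask: (B, n_modalities) float tensor, 1 = observed, 0 = missing
        parts = [p(x) * mask[:, i:i + 1]
                 for i, (p, x) in enumerate(zip(self.proj, modalities))]
        parts.append(self.mask_embed(mask))  # missingness pattern as a signal
        return self.fuse(torch.cat(parts, dim=-1))

# Toy batch: structured data, imaging, and text embeddings for 4 patients.
fusion = MMNARFusion()
xs = [torch.randn(4, 32), torch.randn(4, 64), torch.randn(4, 128)]
mask = torch.tensor([[1., 1., 1.], [1., 0., 1.], [1., 1., 0.], [1., 0., 0.]])
print(fusion(xs, mask).shape)  # torch.Size([4, 64])
```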

Chest computed tomography-based artificial intelligence-aided latent class analysis for diagnosis of severe pneumonia.

Chu C, Guo Y, Lu Z, Gui T, Zhao S, Cui X, Lu S, Jiang M, Li W, Gao C

PubMed · Sep 20 2025
Little literature describes artificial intelligence (AI)-aided diagnosis of severe pneumonia (SP) subphenotypes or the association of those subphenotypes with the efficacy of ventilatory treatment. The aim of our study is to determine whether clinical and biological heterogeneity, for example in ventilation and gas exchange, exists among patients with SP, using chest computed tomography (CT)-based AI-aided latent class analysis (LCA). This retrospective study included 413 patients diagnosed with SP and hospitalized at Xinhua Hospital from June 1, 2015 to May 30, 2020. AI quantification results of chest CT, alone and combined with additional clinical variables, were used to develop LCA models in an SP population. The optimal subphenotypes were determined by evaluating statistical indicators of all the LCA models, and their clinical implications, such as guiding ventilation strategies, were further explored with statistical methods. The two-class LCA model based on AI quantification results of chest CT described the biological characteristics of the SP population well and hence yielded two clinical subphenotypes. Patients with subphenotype-1 had milder infections than patients with subphenotype-2 (P < 0.001) and had lower 30-day (P < 0.001), 90-day (P < 0.001), in-hospital (P = 0.001), and 2-year (P < 0.001) mortality. Patients with subphenotype-1 also showed a better match between the percentage of non-infected lung volume (used to quantify ventilation) and oxygen saturation (used to reflect gas exchange) than patients with subphenotype-2, and the difference in the matching degree of lung ventilation and gas exchange between the two subphenotypes was significant (P < 0.001). Compared with patients with subphenotype-2, those with subphenotype-1 showed a relatively better match between CT-based AI metrics of the non-infected region and oxygenation, and their clinical outcomes were effectively improved after receiving invasive ventilation treatment. A two-class LCA model based on AI quantification of chest CT in the SP population revealed, in particular, clinical heterogeneity of lung function. Identifying the degree of match between ventilation and gas exchange may help guide decisions about assisted ventilation.
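
A two-class latent class analysis over continuous AI quantification features is, in spirit, a two-component mixture model (latent profile analysis, the continuous-variable analogue of LCA). The hedged sketch below uses scikit-learn's Gaussian mixture as a stand-in, with BIC as one statistical indicator for choosing the number of classes; the feature names and data are synthetic, not the study's.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical AI quantification features per patient, e.g. percentage of
# non-infected lung volume and lesion burden (names invented for illustration).
X = np.vstack([
    rng.normal([80, 5], [8, 2], size=(200, 2)),    # milder infections
    rng.normal([55, 20], [10, 5], size=(200, 2)),  # more severe infections
])
X = StandardScaler().fit_transform(X)

# Fit candidate models and pick the class count by BIC (one of the
# statistical indicators a latent class analysis would compare).
models = {k: GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in (1, 2, 3, 4)}
bic = {k: m.bic(X) for k, m in models.items()}
best_k = min(bic, key=bic.get)
print("BIC per class count:", bic, "-> chosen:", best_k)

# Posterior class membership assigns each patient a subphenotype label.
labels = models[best_k].predict(X)
print("subphenotype sizes:", np.bincount(labels))
```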

Radiologist Interaction with AI-Generated Preliminary Reports: A Longitudinal Multi-Reader Study.

Hong EK, Suh CH, Nukala M, Esfahani A, Licaros A, Madan R, Hunsaker A, Hammer M

PubMed · Sep 20 2025
To investigate the integration of multimodal AI-generated reports into the radiology workflow over time, focusing on their impact on efficiency, acceptability, and report quality. A multicase, multireader study involved 756 publicly available chest radiographs interpreted by five radiologists using preliminary reports generated by a radiology-specific multimodal AI model, divided into seven sequential batches of 108 radiographs each. Two thoracic radiologists assessed the final reports using RADPEER criteria for agreement and a 5-point Likert scale for quality. Reading times, rates of acceptance without modification, agreement, and quality scores were measured, with statistical analyses evaluating trends across the seven sequential batches. Radiologists' reading times for chest radiographs decreased from 25.8 seconds in Batch 1 to 19.3 seconds in Batch 7 (p < .001). Acceptability increased from 54.6% to 60.2% (p < .001), with normal chest radiographs demonstrating higher acceptance rates (68.9%) than abnormal chest radiographs (52.6%; p < .001). Median agreement and quality scores remained stable for normal chest radiographs but varied significantly for abnormal chest radiographs (p < .05). The introduction of AI-generated reports improved the efficiency of chest radiograph interpretation, and acceptability increased over time. However, agreement and quality scores showed variability, particularly in abnormal cases, emphasizing the need for oversight in the interpretation of complex chest radiographs.
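
The batch-over-batch trends reported here are standard tests for trend across ordered groups. The sketch below, on synthetic data loosely anchored to the abstract's endpoint values (not the study data), shows one way such trends could be tested with SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
batches = np.repeat(np.arange(1, 8), 108)  # 7 sequential batches x 108 CXRs

# Synthetic reading times drifting downward across batches (loosely anchored
# to the abstract's 25.8s -> 19.3s endpoints; not the study data).
times = rng.normal(27.0 - 1.1 * batches, 4.0)

# Linear trend of reading time across sequential batches.
lin = stats.linregress(batches, times)
print(f"reading-time slope {lin.slope:.2f} s/batch, p = {lin.pvalue:.1e}")

# Trend in acceptance-without-modification across batches (binary outcome);
# Spearman correlation with batch index as a simple nonparametric trend test.
accepted = rng.random(batches.size) < (0.52 + 0.012 * batches)
rho, p = stats.spearmanr(batches, accepted.astype(float))
print(f"acceptance trend rho = {rho:.3f}, p = {p:.1e}")
```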

PneumoNet: Deep Neural Network for Advanced Pneumonia Detection.

Mahesh TR, Gupta M, Thakur A, Khan SB, Quasim MT, Almusharraf A

PubMed · Sep 19 2025
Advancements in computational methods in medicine have brought extensive improvement in the diagnosis of illness, with machine learning models such as convolutional neural networks (CNNs) leading the charge. Pneumonia detection from chest X-ray images remains one of the greatest challenges in diagnostic practice and medical imaging, requiring reliable discrimination between standard chest X-ray appearances and pneumonia-specific findings. Classical machine learning models and early deep learning methods achieve good performance but are hampered by accuracy, generalizability, and preprocessing issues, and their clinical use is constrained by high false-positive rates and inconsistent performance across a broad spectrum of datasets. This work introduces PneumoNet, a novel deep learning model designed for accurate pneumonia detection from chest X-ray images. PneumoNet is a CNN that stacks several convolution and pooling layers followed by fully connected dense layers to extract intricate features from X-ray images; the innovation of the approach lies in its advanced layer structure and its training, both optimized to substantially enhance feature extraction and classification performance. The model was trained and cross-validated on a well-curated dataset with a balanced representation of normal and pneumonia cases. Quantitative results show an overall accuracy of 98%, precision of 96% for normal and 98% for pneumonia cases, and recall of 96% and 98%, respectively, highlighting the consistency of the model. These performance measures collectively indicate the model's promise to improve the diagnostic process, a substantial advance over current methods that paves the way for clinical application.
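
The described layer pattern, stacked convolution and pooling blocks feeding fully connected dense layers for binary normal/pneumonia classification, can be sketched generically in PyTorch. Layer counts and widths below are assumptions for illustration, not the authors' published configuration.

```python
import torch
import torch.nn as nn

class PneumoNetSketch(nn.Module):
    """Illustrative conv -> pool -> dense pattern for normal/pneumonia CXR
    classification; layer counts and widths are assumptions, not the
    authors' published configuration."""
    def __init__(self, n_classes=2):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out), nn.ReLU(), nn.MaxPool2d(2))
        self.features = nn.Sequential(
            block(1, 32), block(32, 64), block(64, 128))
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, n_classes))

    def forward(self, x):  # x: (B, 1, H, W) grayscale chest X-ray
        return self.classifier(self.features(x))

model = PneumoNetSketch()
logits = model(torch.randn(2, 1, 224, 224))
print(logits.shape)  # torch.Size([2, 2])
```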

MFFC-Net: Multi-feature Fusion Deep Networks for Classifying Pulmonary Edema of a Pilot Study by Using Lung Ultrasound Image with Texture Analysis and Transfer Learning Technique.

Bui NT, Luoma CE, Zhang X

PubMed · Sep 19 2025
Lung ultrasound (LUS) has been widely used in point-of-care systems, in both pediatric and adult populations, to support a range of clinical diagnoses. This research aims to develop an interpretable system that uses a deep fusion network to classify LUS videos/patients based on features extracted with texture analysis and transfer learning techniques, to assist physicians. The pulmonary edema dataset includes 56 LUS videos and 4234 LUS frames; the COVID-BLUES dataset includes 294 LUS videos and 15,826 frames. The proposed multi-feature fusion classification network (MFFC-Net) comprises: (1) deep features extracted from Inception-ResNet-v2 and Inception-v3, together with nine texture features from the gray-level co-occurrence matrix (GLCM) and histogram of the region of interest (ROI); (2) a neural network that classifies LUS images from the fused feature input; and (3) four models (ANN, SVM, XGBoost, and kNN) for classifying COVID/non-COVID patients. The training process was evaluated on accuracy (0.9969), F1-score (0.9968), sensitivity (0.9967), specificity (0.9990), and precision (0.9970) after fivefold cross-validation. ANOVA on the nine texture features of LUS images showed a significant difference between pulmonary edema and normal lungs (p < 0.01). At the frame level, the MFFC-Net model achieved 100% accuracy and an ROC-AUC of 1.000 against video-level ground truth across the four groups of LUS videos. At the patient level, on the COVID-BLUES dataset, the highest accuracy was 81.25% with the kNN model. The proposed MFFC-Net model has 125 times higher information density (ID) than Inception-ResNet-v2 and 53.2 times higher than Inception-v3.
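
The handcrafted half of the fusion input can be reproduced with scikit-image: the snippet below computes a few representative GLCM statistics plus an ROI histogram. The paper uses nine texture features; the exact set is not reproduced here, so these are stand-ins. In the full model, such features would be concatenated with deep features from Inception-ResNet-v2 and Inception-v3 before classification.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in LUS ROI

# Gray-level co-occurrence matrix over 4 directions at distance 1.
glcm = graycomatrix(roi, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

# Representative GLCM statistics, averaged over directions; a fusion model
# would concatenate these with deep CNN features before classification.
texture = np.array([graycoprops(glcm, p).mean() for p in
                    ("contrast", "homogeneity", "energy", "correlation")])
hist = np.histogram(roi, bins=8, range=(0, 256), density=True)[0]
feature_vector = np.concatenate([texture, hist])
print(feature_vector.shape)  # (12,) texture + histogram features
```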

AI-Based Algorithm to Detect Heart and Lung Disease From Acute Chest Computed Tomography Scans: Protocol for an Algorithm Development and Validation Study.

Olesen ASO, Miger K, Ørting SN, Petersen J, de Bruijne M, Boesen MP, Andersen MB, Grand J, Thune JJ, Nielsen OW

PubMed · Sep 19 2025
Dyspnea is a common cause of hospitalization, posing diagnostic challenges among older adult patients with multimorbid conditions. Chest computed tomography (CT) scans are increasingly used in patients with dyspnea and offer superior diagnostic accuracy over chest radiographs, but their use is limited by a shortage of radiologists. This study aims to develop and validate artificial intelligence (AI) algorithms that automatically analyze acute CT scans and provide immediate feedback on the likelihood of pneumonia, pulmonary embolism, and cardiac decompensation; this protocol focuses on cardiac decompensation. We designed a retrospective method development and validation study, approved by the Danish National Committee on Health Research Ethics (1575037). We extracted 4672 acute chest CT scans with corresponding radiological reports from the Copenhagen University Hospital-Bispebjerg and Frederiksberg, Denmark, from 2016 to 2021. The scans will be randomly split into training (2/3) and internal validation (1/3) sets. Development of the AI algorithm involves parameter tuning and feature selection using cross-validation. Internal validation uses radiological reports as the ground truth, with algorithm-specific thresholds chosen to achieve true-positive and true-negative rates of 90% or greater for the heart and lung diseases. The AI models will be validated on low-dose chest CT scans from consecutive patients admitted with acute dyspnea and on coronary CT angiography scans from patients with acute coronary syndrome. As of August 2025, CT data extraction has been completed. Algorithm development, including image segmentation and natural language processing, is ongoing; for pulmonary congestion, it has been completed. Internal and external validation are planned, with overall validation expected to conclude in 2025 and final results available in 2026. The results are expected to enhance clinical decision-making by providing immediate, AI-driven insights from CT scans, benefiting both clinicians and patients. DERR1-10.2196/77030.
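
Algorithm-specific thresholds meeting a 90% true-positive or true-negative rate can be read off a validation ROC curve. Below is a hedged sketch on synthetic scores (the protocol does not publish its thresholding code), using scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Synthetic validation set: labels stand in for the radiological-report
# ground truth; scores stand in for the algorithm's output for, e.g.,
# cardiac decompensation.
y_true = rng.integers(0, 2, size=1000)
scores = np.where(y_true == 1,
                  rng.normal(0.7, 0.2, 1000),
                  rng.normal(0.3, 0.2, 1000))

fpr, tpr, thresholds = roc_curve(y_true, scores)

# thresholds from roc_curve are in descending order, so:
# first index reaching TPR >= 0.90 keeps specificity as high as possible;
# last index keeping TNR >= 0.90 keeps sensitivity as high as possible.
thr_sens90 = thresholds[np.argmax(tpr >= 0.90)]
thr_spec90 = thresholds[np.where(1 - fpr >= 0.90)[0][-1]]
print(f"threshold for TPR>=0.90: {thr_sens90:.3f}, "
      f"for TNR>=0.90: {thr_spec90:.3f}")
```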

Radiology Report Conditional 3D CT Generation with Multi Encoder Latent diffusion Model

Sina Amirrajab, Zohaib Salahuddin, Sheng Kuang, Henry C. Woodruff, Philippe Lambin

arXiv preprint · Sep 18 2025
Text-to-image latent diffusion models have recently advanced medical image synthesis, but applications to 3D CT generation remain limited. Existing approaches rely on simplified prompts, neglecting the rich semantic detail in full radiology reports, which reduces text-image alignment and clinical fidelity. We propose Report2CT, a radiology-report-conditional latent diffusion framework that synthesizes 3D chest CT volumes directly from free-text radiology reports, incorporating both the findings and impression sections using multiple text encoders. Report2CT integrates three pretrained medical text encoders (BiomedVLP-CXR-BERT, MedEmbed, and ClinicalBERT) to capture nuanced clinical context. Radiology reports and voxel spacing information condition a 3D latent diffusion model trained on 20,000 CT volumes from the CT-RATE dataset. Model performance was evaluated using Frechet Inception Distance (FID) for real-synthetic distributional similarity and CLIP-based metrics for semantic alignment, with additional qualitative and quantitative comparisons against the GenerateCT model. Report2CT generated anatomically consistent CT volumes with excellent visual quality and text-image alignment. Multi-encoder conditioning improved CLIP scores, indicating stronger preservation of fine-grained clinical details in the free-text radiology reports. Classifier-free guidance further enhanced alignment with only a minor trade-off in FID. We ranked first in the VLM3D Challenge at MICCAI 2025 on Text-Conditional CT Generation and achieved state-of-the-art performance across all evaluation metrics. By leveraging complete radiology reports and multi-encoder text conditioning, Report2CT advances 3D CT synthesis, producing clinically faithful, high-quality synthetic data.
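
Classifier-free guidance, one of the levers the paper reports, mixes conditional and unconditional denoiser predictions at sampling time. The generic sketch below (not the Report2CT codebase; the toy denoiser and embedding sizes are invented) shows the guidance step and why a larger scale trades FID for alignment.

```python
import torch

def cfg_denoise(model, x_t, t, text_emb, null_emb, guidance_scale=4.0):
    """Classifier-free guidance step for a text-conditional diffusion model.

    Generic sketch: the denoiser is run once conditioned on the report
    embedding and once on a learned 'null' embedding, and the two noise
    predictions are extrapolated. A larger guidance_scale trades diversity
    (FID) for text-image alignment, the trade-off the paper reports.
    """
    eps_cond = model(x_t, t, text_emb)    # conditioned on the radiology report
    eps_uncond = model(x_t, t, null_emb)  # unconditional (dropped-out) branch
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy denoiser over a tiny 3D latent, just to show the call shapes.
model = lambda x, t, c: x * 0.1 + c.mean() * 0.01
x_t = torch.randn(1, 4, 8, 8, 8)                 # 3D CT latent volume
text_emb, null_emb = torch.randn(1, 768), torch.zeros(1, 768)
eps = cfg_denoise(model, x_t, torch.tensor([50]), text_emb, null_emb)
print(eps.shape)  # torch.Size([1, 4, 8, 8, 8])
```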

Multimodal radiomics fusion for predicting postoperative recurrence in NSCLC patients.

Mehri-Kakavand G, Mdletshe S, Amini M, Wang A

PubMed · Sep 18 2025
Postoperative recurrence in non-small cell lung cancer (NSCLC) affects up to 55% of patients, underscoring the limits of TNM staging. We assessed multimodal radiomics, combining positron emission tomography (PET), computed tomography (CT), and clinicopathological (CP) data, for personalized recurrence prediction. Data from 131 NSCLC patients with PET/CT imaging and CP variables were analysed. Radiomics features were extracted using PyRadiomics (1,316 PET and 1,409 CT features per tumor), with robustness testing and selection yielding 20 CT, 20 PET, and 23 CP variables. Prediction models were trained using logistic regression (L1, L2, Elastic Net), Random Forest, Gradient Boosting, XGBoost, and CatBoost. Nested cross-validation with SMOTE addressed class imbalance. Fusion strategies included early (feature concatenation), intermediate (stacked ensembles), and late (weighted averaging) fusion. Among single modalities, CT with Elastic Net achieved the highest cross-validated AUC (0.679, 95% CI: 0.57–0.79). Fusion improved performance: PET + CT + Clinical late fusion with Elastic Net achieved the best cross-validated AUC (0.811, 95% CI: 0.69–0.91). Out-of-fold ROC curves confirmed stronger discrimination for the fusion model (AUC = 0.836 vs. 0.741 for CT). Fusion also showed better calibration, higher net clinical benefit (decision-curve analysis), and clearer survival stratification (Kaplan–Meier). Integrating PET, CT, and CP data, particularly via late fusion with Elastic Net, enhances discrimination beyond single-modality models and supports more consistent risk stratification. These findings suggest practical potential for informing postoperative surveillance and adjuvant therapy decisions, encouraging a shift beyond TNM alone toward interpretable multimodal frameworks. External validation in larger, multicenter cohorts is warranted. The online version contains supplementary material available at 10.1007/s00432-025-06311-w.
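
Late fusion, the best-performing strategy here, amounts to a weighted average of per-modality predicted probabilities. Below is a hedged scikit-learn sketch on synthetic stand-ins for the selected features; the weights are illustrative, whereas the paper tunes its models inside nested cross-validation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 131
y = rng.integers(0, 2, size=n)
# Stand-ins for the selected feature sets (20 CT, 20 PET, 23 CP variables).
X = {"ct": rng.normal(size=(n, 20)) + y[:, None] * 0.4,
     "pet": rng.normal(size=(n, 20)) + y[:, None] * 0.3,
     "cp": rng.normal(size=(n, 23)) + y[:, None] * 0.2}

# One Elastic Net-penalized logistic model per modality.
models = {m: LogisticRegression(penalty="elasticnet", solver="saga",
                                l1_ratio=0.5, C=1.0, max_iter=5000).fit(Xm, y)
          for m, Xm in X.items()}

# Late fusion: weighted average of per-modality recurrence probabilities.
# Weights are illustrative; in practice they are tuned on validation folds.
weights = {"ct": 0.4, "pet": 0.35, "cp": 0.25}
p_fused = sum(weights[m] * models[m].predict_proba(X[m])[:, 1] for m in X)
print(p_fused[:5].round(3))
```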

Limitations of Public Chest Radiography Datasets for Artificial Intelligence: Label Quality, Domain Shift, Bias and Evaluation Challenges

Amy Rafferty, Rishi Ramaesh, Ajitha Rajan

arXiv preprint · Sep 18 2025
Artificial intelligence has shown significant promise in chest radiography, where deep learning models can approach radiologist-level diagnostic performance. Progress has been accelerated by large public datasets such as MIMIC-CXR, ChestX-ray14, PadChest, and CheXpert, which provide hundreds of thousands of labelled images with pathology annotations. However, these datasets also present important limitations. Automated label extraction from radiology reports introduces errors, particularly in handling uncertainty and negation, and radiologist review frequently disagrees with assigned labels. In addition, domain shift and population bias restrict model generalisability, while evaluation practices often overlook clinically meaningful measures. We conduct a systematic analysis of these challenges, focusing on label quality, dataset bias, and domain shift. Our cross-dataset domain shift evaluation across multiple model architectures revealed substantial external performance degradation, with pronounced reductions in AUPRC and F1 scores relative to internal testing. To assess dataset bias, we trained a source-classification model that distinguished datasets with near-perfect accuracy, and performed subgroup analyses showing reduced performance for minority age and sex groups. Finally, expert review by two board-certified radiologists identified significant disagreement with public dataset labels. Our findings highlight important clinical weaknesses of current benchmarks and emphasise the need for clinician-validated datasets and fairer evaluation frameworks.
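
The cross-dataset evaluation the authors perform reduces to a train-on-one, test-on-another loop with clinically meaningful metrics. Here is a schematic sketch (synthetic data; the metric calls are real scikit-learn functions) of measuring the external AUPRC and F1 drop.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, f1_score

def external_degradation(model, internal, external, threshold=0.5):
    """Compare internal vs. external test performance for one trained model.

    internal/external: (X, y) test sets drawn from the training dataset and
    from a different dataset (e.g. train on one public CXR benchmark, test
    on another). Returns per-set AUPRC/F1 and the external drop.
    """
    results = {}
    for name, (X, y) in (("internal", internal), ("external", external)):
        p = model.predict_proba(X)[:, 1]
        results[name] = {"auprc": average_precision_score(y, p),
                         "f1": f1_score(y, p >= threshold)}
    drop = {k: results["internal"][k] - results["external"][k]
            for k in ("auprc", "f1")}
    return results, drop

# Toy illustration: the "external" set has a shifted feature distribution.
rng = np.random.default_rng(0)
ya, yb = rng.integers(0, 2, 500), rng.integers(0, 2, 500)
Xa = rng.normal(size=(500, 10)) + ya[:, None] * 0.8
Xb = rng.normal(loc=0.5, size=(500, 10)) + yb[:, None] * 0.4  # shifted domain
model = LogisticRegression(max_iter=1000).fit(Xa, ya)
print(external_degradation(model, (Xa, ya), (Xb, yb))[1])
```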