Page 23 of 66652 results

On the effectiveness of multimodal privileged knowledge distillation in two vision transformer based diagnostic applications

Simon Baur, Alexandra Benova, Emilio Dolgener Cantú, Jackie Ma

arXiv preprint, Aug 6 2025
Deploying deep learning models in clinical practice often requires leveraging multiple data modalities, such as images, text, and structured data, to achieve robust and trustworthy decisions. However, not all modalities are always available at inference time. In this work, we propose multimodal privileged knowledge distillation (MMPKD), a training strategy that utilizes additional modalities available solely during training to guide a unimodal vision model. Specifically, we used a text-based teacher model for chest radiographs (MIMIC-CXR) and a tabular metadata-based teacher model for mammography (CBIS-DDSM) to distill knowledge into a vision transformer student model. We show that MMPKD can improve the zero-shot ability of the resulting attention maps to localize regions of interest (ROI) in input images, although this effect does not generalize across domains, contrary to what prior research suggested.

Application of prediction model based on CT radiomics in prognosis of patients with non-small cell lung cancer.

Peng Z, Wang Y, Qi Y, Hu H, Fu Y, Li J, Li W, Li Z, Guo W, Shen C, Jiang J, Yang B

PubMed paper, Aug 6 2025
To establish and validate the utility of computed tomography (CT) radiomics for the prognosis of patients with non-small cell lung cancer (NSCLC). Overall, 215 patients with a pathologic diagnosis of NSCLC were included. Chest CT images and clinical data were collected before treatment, and follow-up was conducted to assess brain metastasis and survival. Radiomics characteristics were extracted from the chest CT lung window images of each patient, key characteristics were screened, the radiomics score (Radscore) was calculated, and radiomics, clinical, and combined models were constructed using clinically independent predictive factors. A nomogram was constructed based on the final joint model to visualize prediction results. Predictive efficacy was evaluated using the concordance index (C-index), and survival (Kaplan-Meier) and calibration curves were drawn to further evaluate predictive efficacy. The training set included 151 patients (43 with brain metastasis and 108 without) and the validation set included 64 patients (18 with brain metastasis and 46 without). Multivariate analysis revealed that lymph node metastasis, lymphocyte percentage, and neuron-specific enolase (NSE) were independent predictors of brain metastasis in patients with NSCLC. The areas under the curve (AUCs) of these three models were 0.733, 0.836, and 0.849, respectively, in the training set and 0.739, 0.779, and 0.816, respectively, in the validation set. Multivariate Cox regression analysis revealed that the number of brain metastases, distant metastases elsewhere, and C-reactive protein levels were independent predictors of postoperative survival in patients with brain metastases (P < 0.05). The calibration curve showed that the predicted values of the prognostic prediction model agreed well with the actual values.
The model based on CT radiomics characteristics can effectively predict NSCLC brain metastasis and its prognosis and provide guidance for individualized treatment of NSCLC patients.
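The abstract above evaluates its models with the concordance index. As a rough illustration of what that metric measures, here is a minimal sketch of Harrell's C-index in plain Python. The function name and the toy data are hypothetical and do not come from the study; tied event times are simply skipped, which a production implementation would handle more carefully.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when patient i had the event before time j.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not data from the paper).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
print(f"C-index on toy data: {toy_c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk.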

GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

arXiv preprint, Aug 5 2025
Chest X-Ray (CXR) imaging for pulmonary diagnosis raises significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly with diffusion models, offer significant promise for effectively minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance the complete suppression of bones with preserving local texture details. Additionally, their high computational demand and extended processing time hinder their practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset SZCH-X-Rays, and the public dataset JSRT, reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.

A novel lung cancer diagnosis model using hybrid convolution (2D/3D)-based adaptive DenseUnet with attention mechanism.

Deepa J, Badhu Sasikala L, Indumathy P, Jerrin Simla A

PubMed paper, Aug 5 2025
Existing Lung Cancer Diagnosis (LCD) models have difficulty detecting early-stage lung cancer because of the asymptomatic nature of the disease, which leads to an increased death rate among patients. It is therefore important to diagnose lung disease at an early stage to save the lives of affected persons. Hence, this research work aims to develop an efficient lung disease diagnosis model using deep learning techniques for the early and accurate detection of lung cancer. Initially, the proposed model collects the required CT images from standard benchmark datasets. Then, lung cancer segmentation is performed using the developed Hybrid Convolution (2D/3D)-based Adaptive DenseUnet with Attention mechanism (HC-ADAM). The Hybrid Sewing Training with Spider Monkey Optimization (HSTSMO) is introduced to optimize the parameters of the HC-ADAM segmentation approach. Finally, the segmented lung nodule images are passed to the lung cancer classification stage, where the Hybrid Adaptive Dilated Networks with Attention mechanism (HADN-AM) are implemented with a serial cascade of ResNet and Long Short Term Memory (LSTM) to attain better classification performance. The accuracy, precision, and F1-score of the developed model on the LIDC-IDRI dataset are 96.3%, 96.38%, and 96.36%, respectively.

R2GenKG: Hierarchical Multi-modal Knowledge Graph for LLM-based Radiology Report Generation

Futian Wang, Yuhan Qiao, Xiao Wang, Fuling Wang, Yuxiang Zhang, Dengdi Sun

arXiv preprint, Aug 5 2025
X-ray medical report generation is one of the important applications of artificial intelligence in healthcare. With the support of large foundation models, the quality of medical report generation has significantly improved. However, challenges such as hallucination and weak disease diagnostic capability still persist. In this paper, we first construct a large-scale multi-modal medical knowledge graph (termed M3KG) based on the ground truth medical reports using GPT-4o. It contains 2477 entities, 3 kinds of relations, 37424 triples, and 6943 disease-aware vision tokens for the CheXpert Plus dataset. Then, we sample it to obtain multi-granularity semantic graphs and use an R-GCN encoder for feature extraction. For the input X-ray image, we adopt the Swin-Transformer to extract the vision features and interact with the knowledge using cross-attention. The vision tokens are fed into a Q-former, which retrieves the disease-aware vision tokens using another cross-attention. Finally, we adopt the large language model to map the semantic knowledge graph, input X-ray image, and disease-aware vision tokens into language descriptions. Extensive experiments on multiple datasets fully validated the effectiveness of our proposed knowledge graph and X-ray report generation framework. The source code of this paper will be released at https://github.com/Event-AHU/Medical_Image_Analysis.

The Use of Artificial Intelligence to Improve Detection of Acute Incidental Pulmonary Emboli.

Kuzo RS, Levin DL, Bratt AK, Walkoff LA, Suman G, Houghton DE

PubMed paper, Aug 4 2025
Incidental pulmonary emboli (IPE) are frequently overlooked by radiologists. Artificial intelligence (AI) algorithms have been developed to aid detection of pulmonary emboli. This study measured the diagnostic performance of AI compared with prospective interpretation by radiologists. A commercially available AI algorithm was used to retrospectively review 14,453 contrast-enhanced outpatient CT CAP exams in 9171 patients where PE was not clinically suspected. Natural language processing (NLP) searches of reports identified IPE detected prospectively. Thoracic radiologists reviewed all cases read as positive by AI or NLP to confirm IPE and assess the most proximal level of clot and overall clot burden. A total of 1,400 cases read as negative by both the initial radiologist and AI were re-reviewed to assess for additional IPE. Radiologists prospectively detected 218 IPE and AI detected an additional 36 unreported cases. AI missed 30 cases of IPE detected by the radiologist and had 94 false positives. For the 36 IPE missed by the radiologist, median clot burden was 1 and 19 were solitary segmental or subsegmental. For the 30 IPE missed by AI, one case had large central emboli and the others were small, with 23 solitary subsegmental emboli. Radiologist re-review of the 1,400 exams interpreted as negative found 8 additional cases of IPE. Compared with radiologists, AI had similar sensitivity but reduced positive predictive value. Our experience indicates that the AI tool is not ready to be used autonomously without human oversight, but a human observer plus AI is better than either alone for detection of incidental pulmonary emboli.
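The counts quoted in the abstract above allow a rough back-of-the-envelope check of the "similar sensitivity but reduced positive predictive value" statement. The sketch below is only an illustration under stated assumptions: it treats the union of radiologist and AI detections as the confirmed-positive pool and ignores the 8 extra cases found in the 1,400-exam re-review sample. The derived figures are not reported in the abstract itself.

```python
# Counts quoted in the abstract.
radiologist_detected = 218  # IPE reported prospectively by radiologists
ai_only_detected = 36       # additional unreported IPE found by AI
ai_missed = 30              # radiologist-detected IPE that AI missed
ai_false_positives = 94     # AI-positive cases not confirmed as IPE

# Assumption: confirmed IPE = union of both readers' detections.
# The 8 cases found on re-review of the negative sample are excluded.
confirmed_ipe = radiologist_detected + ai_only_detected
ai_true_positives = radiologist_detected - ai_missed + ai_only_detected

ai_sensitivity = ai_true_positives / confirmed_ipe
radiologist_sensitivity = radiologist_detected / confirmed_ipe
ai_ppv = ai_true_positives / (ai_true_positives + ai_false_positives)

print(f"AI sensitivity:          {ai_sensitivity:.3f}")
print(f"Radiologist sensitivity: {radiologist_sensitivity:.3f}")
print(f"AI PPV:                  {ai_ppv:.3f}")
```

Radiologist false positives are not given in the abstract, so a radiologist PPV cannot be derived the same way.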

Analysis on artificial intelligence-based chest computed tomography in multidisciplinary treatment models for discriminating benign and malignant pulmonary nodules.

Liu XY, Shan FC, Li H, Zhu JB

PubMed paper, Aug 4 2025
To evaluate the effectiveness of AI-based chest Computed Tomography (CT) in a Multidisciplinary Diagnosis and Treatment (MDT) model for differentiating benign and malignant pulmonary nodules. This retrospective study screened a total of 87 patients with pulmonary nodules who were treated between January 2019 and December 2020 at Binzhou People's Hospital, Qingdao Municipal Hospital, and Laiwu People's Hospital. AI analysis, MDT consultation, and a combined diagnostic approach were assessed using postoperative pathology as the reference standard. Among 87 nodules, 69 (79.31%) were malignant, and 18 (20.69%) were benign. AI analysis showed moderate agreement with pathology (κ = 0.637, p < 0.05), while MDT and the combined approach demonstrated higher consistency (κ = 0.847 and 0.888, respectively; p < 0.05). Sensitivity and specificity were as follows: AI (89.86%, 77.78%, AUC = 0.838), MDT (100%, 77.78%, AUC = 0.889), and the combined approach (100%, 83.33%, AUC = 0.917). The accuracy of the combined method (96.55%) was superior to MDT (95.40%) and AI alone (87.36%) (p < 0.05). AI-based chest CT combined with MDT may improve diagnostic accuracy and shows potential for broader clinical application.
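The percentages in the abstract above can be cross-checked from 2x2 confusion counts. The counts below are not stated in the abstract; they are back-calculated from the reported sensitivity, specificity, and the 69 malignant / 18 benign split, so treat them as an assumption. The sketch also assumes each reported AUC comes from a single binary decision, in which case it equals the mean of sensitivity and specificity.

```python
def agreement_metrics(tp, fn, fp, tn):
    """Metrics for one diagnostic approach versus pathology."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    # AUC of a ROC curve with a single operating point.
    single_point_auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
        "auc": single_point_auc,
    }


# Back-calculated (tp, fn, fp, tn); an assumption, not reported counts.
inferred_counts = {
    "AI": (62, 7, 4, 14),
    "MDT": (69, 0, 4, 14),
    "Combined": (69, 0, 3, 15),
}

results = {name: agreement_metrics(*c) for name, c in inferred_counts.items()}
for name, m in results.items():
    print(name, {k: round(v, 4) for k, v in m.items()})
```

With these inferred counts, the computed values line up with the reported sensitivity, specificity, accuracy, κ, and AUC figures to the stated precision, which suggests the abstract's numbers are internally consistent.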

A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering

Ziruo Yi, Jinyu Liu, Ting Xiao, Mark V. Albert

arXiv preprint, Aug 4 2025
Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges in factual accuracy, hallucinations, and cross-modal misalignment. We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. We evaluate our system on a challenging RVQA set curated via model disagreement filtering, comprising consistently hard cases across multiple MLLMs. Extensive experiments demonstrate the superiority and effectiveness of our system over strong MLLM baselines, with a case study illustrating its reliability and interpretability. This work highlights the potential of multi-agent approaches to support explainable and trustworthy clinical AI applications that require complex reasoning.

AI-Driven Integration of Deep Learning with Lung Imaging, Functional Analysis, and Blood Gas Metrics for Perioperative Hypoxemia Prediction: Progress and Perspectives.

Huang K, Wu C, Fang J, Pi R

PubMed paper, Aug 4 2025
This Perspective article explores the transformative role of artificial intelligence (AI) in predicting perioperative hypoxemia through the integration of deep learning (DL) with multimodal clinical data, including lung imaging, pulmonary function tests (PFTs), and arterial blood gas (ABG) analysis. Perioperative hypoxemia, defined as arterial oxygen partial pressure (PaO₂) <60 mmHg or oxygen saturation (SpO₂) <90%, poses significant risks of delayed recovery and organ dysfunction. Traditional diagnostic methods, such as radiological imaging and ABG analysis, often lack integrated predictive accuracy. AI frameworks, particularly convolutional neural networks (CNNs) and hybrid models like TD-CNNLSTM-LungNet, demonstrate exceptional performance in detecting pulmonary inflammation and stratifying hypoxemia risk, achieving up to 96.57% accuracy in pneumonia subtype differentiation and an AUC of 0.96 for postoperative hypoxemia prediction. Multimodal AI systems, such as DeepLung-Predict, unify CT scans, PFTs, and ABG parameters to enhance predictive precision, surpassing conventional methods by 22%. However, challenges persist, including dataset heterogeneity, model interpretability, and clinical workflow integration. Future directions emphasize multicenter validation, explainable AI (XAI) frameworks, and pragmatic trials to ensure equitable and reliable deployment. This AI-driven approach not only optimizes resource allocation but also mitigates financial burdens on healthcare systems by enabling early interventions and reducing ICU admission risks.
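The hypoxemia definition quoted in the abstract above (PaO₂ <60 mmHg or SpO₂ <90%) can be written as a simple labeling rule, for instance when preparing targets for a prediction model. This is a minimal sketch: the function and constant names are hypothetical, and only the two thresholds come from the abstract.

```python
from typing import Optional

PAO2_THRESHOLD_MMHG = 60.0     # from the abstract: PaO2 < 60 mmHg
SPO2_THRESHOLD_PERCENT = 90.0  # from the abstract: SpO2 < 90%


def is_hypoxemic(
    pao2_mmhg: Optional[float] = None,
    spo2_percent: Optional[float] = None,
) -> bool:
    """Return True if either available measurement is below its threshold."""
    if pao2_mmhg is None and spo2_percent is None:
        raise ValueError("at least one of PaO2 or SpO2 is required")
    low_pao2 = pao2_mmhg is not None and pao2_mmhg < PAO2_THRESHOLD_MMHG
    low_spo2 = spo2_percent is not None and spo2_percent < SPO2_THRESHOLD_PERCENT
    return low_pao2 or low_spo2


print(is_hypoxemic(pao2_mmhg=55.0))                     # below the PaO2 cutoff
print(is_hypoxemic(pao2_mmhg=80.0, spo2_percent=95.0))  # both above cutoffs
```

Both comparisons are strict, so values exactly at 60 mmHg or 90% are not flagged, matching the "<" in the quoted definition.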

A Dual Radiomic and Dosiomic Filtering Technique for Locoregional Radiation Pneumonitis Prediction in Breast Cancer Patients

Zhenyu Yang, Qian Chen, Rihui Zhang, Manju Liu, Fengqiu Guo, Minjie Yang, Min Tang, Lina Zhou, Chunhao Wang, Minbin Chen, Fang-Fang Yin

arXiv preprint, Aug 4 2025
Purpose: Radiation pneumonitis (RP) is a serious complication of intensity-modulated radiation therapy (IMRT) for breast cancer patients, underscoring the need for precise and explainable predictive models. This study presents an Explainable Dual-Omics Filtering (EDOF) model that integrates spatially localized dosiomic and radiomic features for voxel-level RP prediction. Methods: A retrospective cohort of 72 breast cancer patients treated with IMRT was analyzed, including 28 who developed RP. The EDOF model consists of two components: (1) dosiomic filtering, which extracts local dose intensity and spatial distribution features from planning dose maps, and (2) radiomic filtering, which captures texture-based features from pre-treatment CT scans. These features are jointly analyzed using the Explainable Boosting Machine (EBM), a transparent machine learning model that enables feature-specific risk evaluation. Model performance was assessed using five-fold cross-validation, reporting area under the curve (AUC), sensitivity, and specificity. Feature importance was quantified by mean absolute scores, and Partial Dependence Plots (PDPs) were used to visualize nonlinear relationships between RP risk and dual-omic features. Results: The EDOF model achieved strong predictive performance (AUC = 0.95 ± 0.01; sensitivity = 0.81 ± 0.05). The most influential features included dosiomic Intensity Mean, dosiomic Intensity Mean Absolute Deviation, and radiomic SRLGLE. PDPs revealed that RP risk increases beyond 5 Gy and rises sharply between 10-30 Gy, consistent with clinical dose thresholds. SRLGLE also captured structural heterogeneity linked to RP in specific lung regions. Conclusion: The EDOF framework enables spatially resolved, explainable RP prediction and may support personalized radiation planning to mitigate pulmonary toxicity.