Page 51 of 58574 results

Comprehensive Lung Disease Detection Using Deep Learning Models and Hybrid Chest X-ray Data with Explainable AI

Shuvashis Sarker, Shamim Rahim Refat, Faika Fairuj Preotee, Tanvir Rouf Shawon, Raihan Tanvir

arXiv preprint · May 21, 2025
Advanced diagnostic instruments are crucial for the accurate detection and treatment of lung diseases, which affect millions of individuals globally. This study examines the effectiveness of deep learning and transfer learning models using a hybrid dataset, created by merging four individual datasets from Bangladesh and global sources. The hybrid dataset significantly enhances model accuracy and generalizability, particularly in detecting COVID-19, pneumonia, lung opacity, and normal lung conditions from chest X-ray images. A range of models, including CNN, VGG16, VGG19, InceptionV3, Xception, ResNet50V2, InceptionResNetV2, MobileNetV2, and DenseNet121, were applied to both individual and hybrid datasets. The results showed superior performance on the hybrid dataset, with VGG16, Xception, ResNet50V2, and DenseNet121 each achieving an accuracy of 99%. This consistent performance across the hybrid dataset highlights the robustness of these models in handling diverse data while maintaining high accuracy. To understand the models' implicit behavior, explainable AI techniques were employed to illuminate their black-box nature. Specifically, LIME was used to enhance the interpretability of model predictions, especially in cases of misclassification, contributing to the development of reliable and interpretable AI-driven solutions for medical imaging.
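LIME's core idea, as applied above, is simple: perturb interpretable components of an input, query the black-box model on each perturbation, and fit a locally weighted linear surrogate whose coefficients rank the components. The sketch below is a minimal from-scratch illustration with NumPy and scikit-learn, not the authors' pipeline; `lime_style_explanation`, the segment scheme, and the toy black box are illustrative assumptions (the study presumably uses the `lime` package on superpixels of chest X-rays).

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_style_explanation(predict_fn, x, num_segments, num_samples=500,
                           kernel_width=0.25, seed=0):
    """Approximate a black-box prediction locally with a weighted linear model.

    x is split into `num_segments` interpretable components (for images these
    would be superpixels); each perturbed sample turns segments on or off.
    """
    rng = np.random.default_rng(seed)
    # Binary masks: which segments are kept in each perturbed sample.
    masks = rng.integers(0, 2, size=(num_samples, num_segments))
    masks[0] = 1  # include the unperturbed instance
    seg_size = len(x) // num_segments
    preds = np.empty(num_samples)
    for i, m in enumerate(masks):
        x_pert = x.copy()
        for s in range(num_segments):
            if m[s] == 0:
                x_pert[s * seg_size:(s + 1) * seg_size] = 0.0  # "hide" segment
        preds[i] = predict_fn(x_pert)
    # Proximity kernel: samples closer to the original get higher weight.
    dist = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=weights)
    return surrogate.coef_  # per-segment importance

# Toy black box: only the first quarter of the input matters.
black_box = lambda v: float(v[:4].sum())
coefs = lime_style_explanation(black_box, np.ones(16), num_segments=4)
```

The surrogate's largest coefficient lands on the segment the black box actually depends on, which is exactly the attribution LIME surfaces for a misclassified X-ray.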

Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs

Kanan Kiguchi, Yunhao Tu, Katsuhiro Ajito, Fady Alnajjar, Kazuyuki Murase

arXiv preprint · May 21, 2025
We propose a novel framework for integrating fragmented multi-modal data in Alzheimer's disease (AD) research using large language models (LLMs) and knowledge graphs. While traditional multimodal analysis requires matched patient IDs across datasets, our approach demonstrates population-level integration of MRI, gene expression, biomarkers, EEG, and clinical indicators from independent cohorts. Statistical analysis identified significant features in each modality, which were connected as nodes in a knowledge graph. LLMs then analyzed the graph to extract potential correlations and generate hypotheses in natural language. This approach revealed several novel relationships, including a potential pathway linking metabolic risk factors to tau protein abnormalities via neuroinflammation (r>0.6, p<0.001), and unexpected correlations between frontal EEG channels and specific gene expression profiles (r=0.42-0.58, p<0.01). Cross-validation with independent datasets confirmed the robustness of major findings, with consistent effect sizes across cohorts (variance <15%). The reproducibility of these findings was further supported by expert review (Cohen's κ=0.82) and computational validation. Our framework enables cross-modal integration at a conceptual level without requiring patient ID matching, offering new possibilities for understanding AD pathology through fragmented data reuse and generating testable hypotheses for future research.
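The "significant features become nodes, strong correlations become edges" step can be sketched in a few lines. The feature names, threshold, and synthetic cohort summaries below are illustrative assumptions, not the paper's data; the point is only the mechanism of linking population-level features without patient ID matching.

```python
import numpy as np

def build_correlation_graph(features, r_threshold=0.6):
    """Connect population-level features whose Pearson |r| exceeds a threshold.

    `features` maps "modality:feature" names to 1-D arrays of cohort-level
    summaries (not patient-matched rows), mimicking the idea of linking
    significant features from independent cohorts as knowledge-graph edges.
    """
    names = list(features)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = np.corrcoef(features[a], features[b])[0, 1]
            if abs(r) > r_threshold:
                edges.append((a, b, round(float(r), 3)))
    return edges

# Hypothetical cohort summaries: tau tracks metabolic risk; EEG is independent.
rng = np.random.default_rng(42)
metabolic = rng.normal(size=50)
feats = {
    "biomarker:tau": metabolic * 0.9 + rng.normal(scale=0.3, size=50),
    "clinical:metabolic_risk": metabolic,
    "eeg:frontal_power": rng.normal(size=50),
}
graph_edges = build_correlation_graph(feats)
```

An LLM would then be prompted over the resulting edge list to propose natural-language hypotheses, as the framework describes.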

The Desmoid Dilemma: Challenges and Opportunities in Assessing Tumor Burden and Therapeutic Response.

Chang YC, Nixon B, Souza F, Cardoso FN, Dayan E, Geiger EJ, Rosenberg A, D'Amato G, Subhawong T

PubMed · May 21, 2025
Desmoid tumors are rare, locally invasive soft-tissue tumors with unpredictable clinical behavior. Imaging plays a crucial role in their diagnosis, measurement of disease burden, and assessment of treatment response. However, desmoid tumors' unique imaging features present challenges to conventional imaging metrics. The heterogeneous nature of these tumors, with a variable composition (fibrous, myxoid, or cellular), complicates accurate delineation of tumor boundaries and volumetric assessment. Furthermore, desmoid tumors can demonstrate prolonged stability or spontaneous regression, and biologic quiescence is often manifested by collagenization rather than bulk size reduction, making traditional size-based response criteria, such as Response Evaluation Criteria in Solid Tumors (RECIST), suboptimal. To overcome these limitations, advanced imaging techniques offer promising opportunities. Functional and parametric imaging methods, such as diffusion-weighted MRI, dynamic contrast-enhanced MRI, and T2 relaxometry, can provide insights into tumor cellularity and maturation. Radiomics and artificial intelligence approaches may enhance quantitative analysis by extracting and correlating complex imaging features with biological behavior. Moreover, imaging biomarkers could facilitate earlier detection of treatment efficacy or resistance, enabling tailored therapy. By integrating advanced imaging into clinical practice, it may be possible to refine the evaluation of disease burden and treatment response, ultimately improving the management and outcomes of patients with desmoid tumors.

Synthesizing [<sup>18</sup>F]PSMA-1007 PET bone images from CT images with GAN for early detection of prostate cancer bone metastases: a pilot validation study.

Chai L, Yao X, Yang X, Na R, Yan W, Jiang M, Zhu H, Sun C, Dai Z, Yang X

PubMed · May 21, 2025
[<sup>18</sup>F]FDG PET/CT scan combined with [<sup>18</sup>F]PSMA-1007 PET/CT scan is commonly conducted for detecting bone metastases in prostate cancer (PCa). However, it is expensive and may expose patients to more radiation hazards. This study explores deep learning (DL) techniques to synthesize [<sup>18</sup>F]PSMA-1007 PET bone images from CT bone images for the early detection of bone metastases in PCa, which may reduce additional PET/CT scans and relieve the burden on patients. We retrospectively collected paired whole-body (WB) [<sup>18</sup>F]PSMA-1007 PET/CT images from 152 patients with clinical and pathological diagnosis results, including 123 PCa and 29 cases of benign lesions. The average age of the patients was 67.48 ± 10.87 years, and the average lesion size was 8.76 ± 15.5 mm. The paired low-dose CT and PET images were preprocessed and segmented to construct the WB bone structure images. The 152 subjects were randomly stratified into training, validation, and test groups in a 92:41:19 split. Two generative adversarial network (GAN) models, Pix2pix and CycleGAN, were trained to synthesize [<sup>18</sup>F]PSMA-1007 PET bone images from paired CT bone images. The performance of the two synthesis models was evaluated using the quantitative metrics of mean absolute error (MAE), mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index metric (SSIM), as well as the target-to-background ratio (TBR). The results of DL-based image synthesis indicated that the synthesis of [<sup>18</sup>F]PSMA-1007 PET bone images from low-dose CT bone images was highly feasible. The Pix2pix model performed better, with an SSIM of 0.97, PSNR of 44.96, MSE of 0.80, and MAE of 0.10. The TBRs of bone metastasis lesions calculated on DL-synthesized PET bone images were highly correlated with those of real PET bone images (Pearson's r > 0.90) and had no significant differences (p > 0.05).
It is feasible to generate synthetic [<sup>18</sup>F]PSMA-1007 PET bone images from CT bone images by using DL techniques with reasonable accuracy, which can provide information for early detection of PCa bone metastases.
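The pixel-level metrics used above are straightforward to compute. A minimal NumPy sketch follows, with random arrays standing in for the real and DL-synthesized PET bone images; SSIM is omitted because it requires windowed local statistics (e.g. `skimage.metrics.structural_similarity`).

```python
import numpy as np

def synthesis_metrics(real, synth, data_range=1.0):
    """MAE, MSE, and PSNR between a real and a synthesized image.

    A minimal stand-in for the paper's quantitative evaluation;
    `data_range` is the maximum possible pixel value.
    """
    real = real.astype(np.float64)
    synth = synth.astype(np.float64)
    mae = np.abs(real - synth).mean()
    mse = ((real - synth) ** 2).mean()
    # PSNR is infinite for a perfect reconstruction (zero MSE).
    psnr = float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)
    return mae, mse, psnr

# Synthetic stand-ins: a "real" image and a near-identical "synthesized" one.
rng = np.random.default_rng(0)
real = rng.random((64, 64))
synth = np.clip(real + rng.normal(scale=0.01, size=(64, 64)), 0, 1)
mae, mse, psnr = synthesis_metrics(real, synth)
```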

Performance of multimodal prediction models for intracerebral hemorrhage outcomes using real-world data.

Matsumoto K, Suzuki M, Ishihara K, Tokunaga K, Matsuda K, Chen J, Yamashiro S, Soejima H, Nakashima N, Kamouchi M

PubMed · May 21, 2025
We aimed to develop and validate multimodal models integrating computed tomography (CT) images, text, and tabular clinical data to predict poor functional outcomes and in-hospital mortality in patients with intracerebral hemorrhage (ICH). These models were designed to assist non-specialists in emergency settings with limited access to stroke specialists. A retrospective analysis of 527 patients with ICH admitted to a Japanese tertiary hospital between April 2019 and February 2022 was conducted. Deep learning techniques were used to extract features from three-dimensional CT images and unstructured data, which were then combined with tabular data to develop an L1-regularized logistic regression model to predict poor functional outcomes (modified Rankin scale score 3-6) and in-hospital mortality. The model's performance was evaluated by assessing discrimination metrics, calibration plots, and decision curve analysis (DCA) using temporal validation data. The multimodal model utilizing both imaging and text data, such as medical interviews, exhibited the highest performance in predicting poor functional outcomes. In contrast, the model that combined imaging with tabular data, including physiological and laboratory results, demonstrated the best predictive performance for in-hospital mortality. These models exhibited high discriminative performance, with areas under the receiver operating characteristic curve (AUROCs) of 0.86 (95% CI: 0.79-0.92) and 0.91 (95% CI: 0.84-0.96) for poor functional outcomes and in-hospital mortality, respectively. Calibration was satisfactory for predicting poor functional outcomes, but requires refinement for mortality prediction. The models performed similarly to or better than conventional risk scores, and DCA curves supported their clinical utility. Multimodal prediction models have the potential to aid non-specialists in making informed decisions regarding ICH cases in emergency departments as part of clinical decision support systems.
Enhancing real-world data infrastructure and improving model calibration are essential for successful implementation in clinical practice.
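The fuse-then-regularize design can be sketched with scikit-learn: concatenate learned image features with tabular variables, then let an L1 penalty select a sparse subset. The feature dimensions, synthetic outcome, and regularization strength below are illustrative assumptions, not the study's actual data or tuning.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical fused design matrix: deep features extracted from CT (random
# stand-ins here) concatenated with tabular clinical variables.
rng = np.random.default_rng(1)
n = 400
image_feats = rng.normal(size=(n, 32))   # e.g. penultimate-layer CNN outputs
tabular = rng.normal(size=(n, 6))        # e.g. age, GCS, blood pressure, ...
X = np.hstack([image_feats, tabular])

# Synthetic binary outcome driven by only two features (sparse true signal).
logit = 1.5 * X[:, 0] - 2.0 * X[:, 33]
y = (logit + rng.normal(scale=0.5, size=n) > 0).astype(int)

X = StandardScaler().fit_transform(X)
# The L1 penalty drives most coefficients to exactly zero (feature selection).
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
n_selected = int((clf.coef_ != 0).sum())
```

The surviving nonzero coefficients identify which fused features actually carry outcome signal, which is the practical appeal of L1 regularization in this setting.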

Customized GPT-4V(ision) for radiographic diagnosis: can large language model detect supernumerary teeth?

Aşar EM, İpek İ, Bilge K

PubMed · May 21, 2025
With the growing capabilities of language models like ChatGPT to process text and images, this study evaluated their accuracy in detecting supernumerary teeth on periapical radiographs. A customized GPT-4V model (CGPT-4V) was also developed to assess whether domain-specific training could improve diagnostic performance compared to standard GPT-4V and GPT-4o models. One hundred eighty periapical radiographs (90 with and 90 without supernumerary teeth) were evaluated using GPT-4V, GPT-4o, and a fine-tuned CGPT-4V model. Each image was assessed separately with the standardized prompt "Are there any supernumerary teeth in the radiograph above?" to avoid contextual bias. Three dental experts scored the responses using a three-point Likert scale for positive cases and a binary scale for negatives. Chi-square tests and ROC analysis were used to compare model performances (p < 0.05). Among the three models, CGPT-4V exhibited the highest accuracy, detecting supernumerary teeth correctly in 91% of cases, compared to 77% for GPT-4o and 63% for GPT-4V. The CGPT-4V model also demonstrated a significantly lower false positive rate (16%) than GPT-4V (42%). A statistically significant difference was found between CGPT-4V and GPT-4o (p < 0.001), while no significant difference was observed between GPT-4V and CGPT-4V or between GPT-4V and GPT-4o. Additionally, CGPT-4V successfully identified multiple supernumerary teeth in radiographs where present. These findings highlight the diagnostic potential of customized GPT models in dental radiology. Future research should focus on multicenter validation, seamless clinical integration, and cost-effectiveness to support real-world implementation.
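Scoring yes/no answers against ground truth reduces to accuracy and a false-positive rate over the negative cases. A minimal sketch follows; the ten hypothetical responses are made up for illustration and are not the study's data.

```python
def detection_stats(answers, labels):
    """Accuracy and false-positive rate for yes/no radiograph answers.

    `answers` and `labels` are booleans: True = "supernumerary tooth present".
    """
    assert len(answers) == len(labels)
    correct = sum(a == l for a, l in zip(answers, labels))
    # False positives are "yes" answers on truly negative radiographs.
    negatives = [a for a, l in zip(answers, labels) if not l]
    fp_rate = sum(negatives) / len(negatives) if negatives else 0.0
    return correct / len(answers), fp_rate

# Hypothetical responses over 10 radiographs (5 positive, 5 negative).
labels  = [True] * 5 + [False] * 5
answers = [True, True, True, True, False, False, False, True, False, False]
acc, fpr = detection_stats(answers, labels)
```

From counts like these, a chi-square test (e.g. `scipy.stats.chi2_contingency` on the 2x2 tables) would compare models as the study does.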

Right Ventricular Strain as a Key Feature in Interpretable Machine Learning for Identification of Takotsubo Syndrome: A Multicenter CMR-based Study.

Du Z, Hu H, Shen C, Mei J, Feng Y, Huang Y, Chen X, Guo X, Hu Z, Jiang L, Su Y, Biekan J, Lyv L, Chong T, Pan C, Liu K, Ji J, Lu C

PubMed · May 21, 2025
To develop an interpretable machine learning (ML) model based on cardiac magnetic resonance (CMR) multimodal parameters and clinical data to discriminate Takotsubo syndrome (TTS), acute myocardial infarction (AMI), and acute myocarditis (AM), and to further assess the diagnostic value of right ventricular (RV) strain in TTS. This study analyzed CMR and clinical data of 130 patients from three centers. Key features were selected using least absolute shrinkage and selection operator regression and random forest. Data were split into a training cohort and an internal testing cohort (ITC) in the ratio 7:3, with overfitting avoided using leave-one-out cross-validation and bootstrap methods. Nine ML models were evaluated using standard performance metrics, with Shapley additive explanations (SHAP) analysis used for model interpretation. A total of 11 key features were identified. The extreme gradient boosting model showed the best performance, with an area under the curve (AUC) value of 0.94 (95% CI: 0.85-0.97) in the ITC. Right ventricular basal circumferential strain (RVCS-basal) was the most important feature for identifying TTS. Its absolute value was significantly higher in TTS patients than in AMI and AM patients (-9.93%, -5.21%, and -6.18%, respectively, p < 0.001), with values above -6.55% contributing to a diagnosis of TTS. This study developed an interpretable ternary classification ML model for identifying TTS and used SHAP analysis to elucidate the significant value of RVCS-basal in TTS diagnosis. An online calculator (https://lsszxyy.shinyapps.io/XGboost/) based on this model was developed to provide immediate decision support for clinical use.
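Ranking features by their contribution, as SHAP does above, can be approximated with permutation importance when the `shap` package is unavailable. The sketch below uses scikit-learn's gradient boosting rather than XGBoost, on synthetic three-class data in which the first column plays the role of RVCS-basal; every name and number in it is an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Synthetic three-class problem: column 0 (the "RVCS-basal" stand-in)
# dominates the class labels 0/1/2 (think TTS / AMI / AM).
rng = np.random.default_rng(7)
n = 300
X = rng.normal(size=(n, 5))
y = np.digitize(X[:, 0] + 0.3 * rng.normal(size=n), bins=[-0.5, 0.5])

clf = GradientBoostingClassifier(random_state=0).fit(X, y)
# Permutation importance: shuffle one column at a time and measure the
# drop in score; a large drop means the model relies on that feature.
imp = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
top_feature = int(np.argmax(imp.importances_mean))
```

Unlike SHAP, this gives only a global ranking, not per-patient attributions, but it surfaces the same headline finding: which single feature the classifier leans on.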

An Exploratory Approach Towards Investigating and Explaining Vision Transformer and Transfer Learning for Brain Disease Detection

Shuvashis Sarker, Shamim Rahim Refat, Faika Fairuj Preotee, Shifat Islam, Tashreef Muhammad, Mohammad Ashraful Hoque

arXiv preprint · May 21, 2025
The brain is a highly complex organ that manages many important tasks, including movement, memory and thinking. Brain-related conditions, like tumors and degenerative disorders, can be hard to diagnose and treat. Magnetic Resonance Imaging (MRI) serves as a key tool for identifying these conditions, offering high-resolution images of brain structures. Despite this, interpreting MRI scans can be complicated. This study tackles this challenge by conducting a comparative analysis of Vision Transformer (ViT) and Transfer Learning (TL) models such as VGG16, VGG19, ResNet50V2, and MobileNetV2 for classifying brain diseases using MRI data from a Bangladesh-based dataset. ViTs, known for their ability to capture global relationships in images, are particularly effective for medical imaging tasks. Transfer learning helps to mitigate data constraints by fine-tuning pre-trained models. Furthermore, Explainable AI (XAI) methods such as GradCAM, GradCAM++, LayerCAM, ScoreCAM, and Faster-ScoreCAM are employed to interpret model predictions. The results demonstrate that ViT surpasses the transfer learning models, achieving a classification accuracy of 94.39%. The integration of XAI methods enhances model transparency, offering crucial insights to aid medical professionals in diagnosing brain diseases with greater precision.
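Grad-CAM itself is a short computation: the channel weights are the spatially averaged gradients of the target score with respect to a convolutional layer's activations, and the heatmap is the ReLU of the weighted activation sum. A minimal NumPy sketch with toy activations and gradients, not tied to any model in the study:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one conv layer's activations and gradients.

    Both arrays have shape (C, H, W) for a single image. Channel weights
    (alpha_k) are the spatially averaged gradients; the map is the ReLU of
    the weighted sum of activation channels, normalized to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))                       # alpha_k
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy example: channel 0 activates in the top-left corner and carries all
# the gradient signal, so the heatmap should light up only there.
acts = np.zeros((2, 4, 4)); acts[0, 0, 0] = 1.0
grads = np.zeros((2, 4, 4)); grads[0] = 1.0
heatmap = grad_cam(acts, grads)
```

Variants like GradCAM++ and ScoreCAM change how the channel weights are computed, but the weighted-sum-plus-ReLU structure is the same.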

RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection

Wenjun Hou, Yi Cheng, Kaishuai Xu, Heng Li, Yan Hu, Wenjie Li, Jiang Liu

arXiv preprint · May 20, 2025
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, including radiology report generation. Previous approaches have attempted to utilize multimodal LLMs for this task, enhancing their performance through the integration of domain-specific knowledge retrieval. However, these approaches often overlook the knowledge already embedded within the LLMs, leading to redundant information integration and inefficient utilization of learned representations. To address this limitation, we propose RADAR, a framework for enhancing radiology report generation with supplementary knowledge injection. RADAR improves report generation by systematically leveraging both the internal knowledge of an LLM and externally retrieved information. Specifically, it first extracts the model's acquired knowledge that aligns with expert image-based classification outputs. It then retrieves relevant supplementary knowledge to further enrich this information. Finally, by aggregating both sources, RADAR generates more accurate and informative radiology reports. Extensive experiments on MIMIC-CXR, CheXpert-Plus, and IU X-ray demonstrate that our model outperforms state-of-the-art LLMs in both language quality and clinical accuracy.
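The "retrieve relevant supplementary knowledge" step can be sketched as similarity search over a text corpus. RADAR's actual corpus and scoring are not specified in this abstract, so the snippets and TF-IDF scoring below are illustrative assumptions only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical knowledge snippets standing in for a retrieval corpus.
corpus = [
    "Pleural effusion appears as blunting of the costophrenic angle.",
    "Cardiomegaly is an enlarged cardiac silhouette on chest X-ray.",
    "Pneumothorax shows a visible visceral pleural line.",
]
# A finding derived from the image-based classifier's output.
finding = "enlarged cardiac silhouette suggesting cardiomegaly"

# Fit TF-IDF on corpus plus query, then rank snippets by cosine similarity.
vec = TfidfVectorizer().fit(corpus + [finding])
sims = cosine_similarity(vec.transform([finding]), vec.transform(corpus))[0]
best = int(sims.argmax())
```

The top-ranked snippet would then be injected into the report-generation prompt alongside the LLM's own internal knowledge, mirroring the aggregation step described above.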