Sort by:
Page 108 of 1331324 results

Enhancing medical explainability in deep learning for age-related macular degeneration diagnosis.

Shi L

pubmed logopapersMay 15 2025
Deep learning models hold significant promise for disease diagnosis but often lack transparency in their decision-making processes, limiting trust and hindering clinical adoption. This study introduces a novel multi-task learning framework to enhance the medical explainability of deep learning models for diagnosing age-related macular degeneration (AMD) using fundus images. The framework simultaneously performs AMD classification and lesion segmentation, allowing the model to support its diagnoses with AMD-associated lesions identified through segmentation. In addition, we perform an in-depth interpretability analysis of the model, proposing the Medical Explainability Index (MXI), a novel metric that quantifies the medical relevance of the generated heatmaps by comparing them with the model's lesion segmentation output. This metric provides a measurable basis to evaluate whether the model's decisions are grounded in clinically meaningful information. The proposed method was trained and evaluated on the Automatic Detection Challenge on Age-Related Macular Degeneration (ADAM) dataset. Experimental results demonstrate robust performance, achieving an area under the curve (AUC) of 0.96 for classification and a Dice similarity coefficient (DSC) of 0.59 for segmentation, outperforming single-task models. By offering interpretable and clinically relevant insights, our approach aims to foster greater trust in AI-driven disease diagnosis and facilitate its adoption in clinical practice.

A monocular endoscopic image depth estimation method based on a window-adaptive asymmetric dual-branch Siamese network.

Chong N, Yang F, Wei K

pubmed logopapersMay 15 2025
Minimally invasive surgery involves entering the body through small incisions or natural orifices, using a medical endoscope for observation and clinical procedures. However, traditional endoscopic images often suffer from low texture and uneven illumination, which can negatively impact surgical and diagnostic outcomes. To address these challenges, many researchers have applied deep learning methods to enhance the processing of endoscopic images. This paper proposes a monocular medical endoscopic image depth estimation method based on a window-adaptive asymmetric dual-branch Siamese network. In this network, one branch focuses on processing global image information, while the other branch concentrates on local details. An improved lightweight Squeeze-and-Excitation (SE) module is added to the final layer of each branch, dynamically adjusting the inter-channel weights through self-attention. The outputs from both branches are then integrated using a lightweight cross-attention feature fusion module, enabling cross-branch feature interaction and enhancing the overall feature representation capability of the network. Extensive ablation and comparative experiments were conducted on medical datasets (EAD2019, Hamlyn, M2caiSeg, UCL) and a non-medical dataset (NYUDepthV2), with both qualitative and quantitative results-measured in terms of RMSE, AbsRel, FLOPs and running time-demonstrating the superiority of the proposed model. Additionally, comparisons with CT images show good organ boundary matching capability, highlighting the potential of our method for clinical applications. The key code of this paper is available at: https://github.com/superchongcnn/AttenAdapt_DE .

Dual-Domain deep prior guided sparse-view CT reconstruction with multi-scale fusion attention.

Wu J, Lin J, Jiang X, Zheng W, Zhong L, Pang Y, Meng H, Li Z

pubmed logopapersMay 15 2025
Sparse-view CT reconstruction is a challenging ill-posed inverse problem, where insufficient projection data leads to degraded image quality with increased noise and artifacts. Recent deep learning approaches have shown promising results in CT reconstruction. However, existing methods often neglect projection data constraints and rely heavily on convolutional neural networks, resulting in limited feature extraction capabilities and inadequate adaptability. To address these limitations, we propose a Dual-domain deep Prior-guided Multi-scale fusion Attention (DPMA) model for sparse-view CT reconstruction, aiming to enhance reconstruction accuracy while ensuring data consistency and stability. First, we establish a residual regularization strategy that applies constraints on the difference between the prior image and target image, effectively integrating deep learning-based priors with model-based optimization. Second, we develop a multi-scale fusion attention mechanism that employs parallel pathways to simultaneously model global context, regional dependencies, and local details in a unified framework. Third, we incorporate a physics-informed consistency module based on range-null space decomposition to ensure adherence to projection data constraints. Experimental results demonstrate that DPMA achieves improved reconstruction quality compared to existing approaches, particularly in noise suppression, artifact reduction, and fine detail preservation.

Development and Validation of Ultrasound Hemodynamic-based Prediction Models for Acute Kidney Injury After Renal Transplantation.

Ni ZH, Xing TY, Hou WH, Zhao XY, Tao YL, Zhou FB, Xing YQ

pubmed logopapersMay 14 2025
Acute kidney injury (AKI) post-renal transplantation often has a poor prognosis. This study aimed to identify patients with elevated risks of AKI after kidney transplantation. A retrospective analysis was conducted on 422 patients who underwent kidney transplants from January 2020 to April 2023. Participants from 2020 to 2022 were randomized to training group (n=261) and validation group 1 (n=113), and those in 2023, as validation group 2 (n=48). Risk factors were determined by employing logistic regression analysis alongside the least absolute shrinkage and selection operator, making use of ultrasound hemodynamic, clinical, and laboratory information. Models for prediction were developed using logistic regression analysis and six machine-learning techniques. The evaluation of the logistic regression model encompassed its discrimination, calibration, and applicability in clinical settings, and a nomogram was created to illustrate the model. SHapley Additive exPlanations were used to explain and visualize the best of the six machine learning models. The least absolute shrinkage and selection operator combined with logistic regression identified and incorporated five risk factors into the predictive model. The logistic regression model (AUC=0.927 in the validation set 1; AUC=0.968 in the validation set 2) and the random forest model (AUC=0.946 in the validation set 1;AUC=0.996 in the validation set 2) showed good performance post-validation, with no significant difference in their predictive accuracy. These findings can assist clinicians in the early identification of patients at high risk for AKI, allowing for timely interventions and potentially enhancing the prognosis following kidney transplantation.

Explainability Through Human-Centric Design for XAI in Lung Cancer Detection

Amy Rafferty, Rishi Ramaesh, Ajitha Rajan

arxiv logopreprintMay 14 2025
Deep learning models have shown promise in lung pathology detection from chest X-rays, but widespread clinical adoption remains limited due to opaque model decision-making. In prior work, we introduced ClinicXAI, a human-centric, expert-guided concept bottleneck model (CBM) designed for interpretable lung cancer diagnosis. We now extend that approach and present XpertXAI, a generalizable expert-driven model that preserves human-interpretable clinical concepts while scaling to detect multiple lung pathologies. Using a high-performing InceptionV3-based classifier and a public dataset of chest X-rays with radiology reports, we compare XpertXAI against leading post-hoc explainability methods and an unsupervised CBM, XCBs. We assess explanations through comparison with expert radiologist annotations and medical ground truth. Although XpertXAI is trained for multiple pathologies, our expert validation focuses on lung cancer. We find that existing techniques frequently fail to produce clinically meaningful explanations, omitting key diagnostic features and disagreeing with radiologist judgments. XpertXAI not only outperforms these baselines in predictive accuracy but also delivers concept-level explanations that better align with expert reasoning. While our focus remains on explainability in lung cancer detection, this work illustrates how human-centric model design can be effectively extended to broader diagnostic contexts - offering a scalable path toward clinically meaningful explainable AI in medical diagnostics.

Recognizing artery segments on carotid ultrasonography using embedding concatenation of deep image and vision-language models.

Lo CM, Sung SF

pubmed logopapersMay 14 2025
Evaluating large artery atherosclerosis is critical for predicting and preventing ischemic strokes. Ultrasonographic assessment of the carotid arteries is the preferred first-line examination due to its ease of use, noninvasive, and absence of radiation exposure. This study proposed an automated classification model for the common carotid artery (CCA), carotid bulb, internal carotid artery (ICA), and external carotid artery (ECA) to enhance the quantification of carotid artery examinations.&#xD;Approach: A total of 2,943 B-mode ultrasound images (CCA: 1,563; bulb: 611; ICA: 476; ECA: 293) from 288 patients were collected. Three distinct sets of embedding features were extracted from artificial intelligence networks including pre-trained DenseNet201, vision Transformer (ViT), and echo contrastive language-image pre-training (EchoCLIP) models using deep learning architectures for pattern recognition. These features were then combined in a support vector machine (SVM) classifier to interpret the anatomical structures in B-mode images.&#xD;Main results: After ten-fold cross-validation, the model achieved an accuracy of 82.3%, which was significantly better than using individual feature sets, with a p-value of <0.001.&#xD;Significance: The proposed model could make carotid artery examinations more accurate and consistent with the achieved classification accuracy. The source code is available at https://github.com/buddykeywordw/Artery-Segments-Recognition&#xD.

An Annotated Multi-Site and Multi-Contrast Magnetic Resonance Imaging Dataset for the study of the Human Tongue Musculature.

Ribeiro FL, Zhu X, Ye X, Tu S, Ngo ST, Henderson RD, Steyn FJ, Kiernan MC, Barth M, Bollmann S, Shaw TB

pubmed logopapersMay 14 2025
This dataset provides the first annotated, openly available MRI-based imaging dataset for investigations of tongue musculature, including multi-contrast and multi-site MRI data from non-disease participants. The present dataset includes 47 participants collated from three studies: BeLong (four participants; T2-weighted images), EATT4MND (19 participants; T2-weighted images), and BMC (24 participants; T1-weighted images). We provide manually corrected segmentations of five key tongue muscles: the superior longitudinal, combined transverse/vertical, genioglossus, and inferior longitudinal muscles. Other phenotypic measures, including age, sex, weight, height, and tongue muscle volume, are also available for use. This dataset will benefit researchers across domains interested in the structure and function of the tongue in health and disease. For instance, researchers can use this data to train new machine learning models for tongue segmentation, which can be leveraged for segmentation and tracking of different tongue muscles engaged in speech formation in health and disease. Altogether, this dataset provides the means to the scientific community for investigation of the intricate tongue musculature and its role in physiological processes and speech production.

Early detection of Alzheimer's disease progression stages using hybrid of CNN and transformer encoder models.

Almalki H, Khadidos AO, Alhebaishi N, Senan EM

pubmed logopapersMay 14 2025
Alzheimer's disease (AD) is a neurodegenerative disorder that affects memory and cognitive functions. Manual diagnosis is prone to human error, often leading to misdiagnosis or delayed detection. MRI techniques help visualize the fine tissues of the brain cells, indicating the stage of disease progression. Artificial intelligence techniques analyze MRI with high accuracy and extract subtle features that are difficult to diagnose manually. In this study, a modern methodology was designed that combines the power of CNN models (ResNet101 and GoogLeNet) to extract local deep features and the power of Vision Transformer (ViT) models to extract global features and find relationships between image spots. First, the MRI images of the Open Access Imaging Studies Series (OASIS) dataset were improved by two filters: the adaptive median filter (AMF) and Laplacian filter. The ResNet101 and GoogLeNet models were modified to suit the feature extraction task and reduce computational cost. The ViT architecture was modified to reduce the computational cost while increasing the number of attention vertices to further discover global features and relationships between image patches. The enhanced images were fed into the proposed ViT-CNN methodology. The enhanced images were fed to the modified ResNet101 and GoogLeNet models to extract the deep feature maps with high accuracy. Deep feature maps were fed into the modified ViT model. The deep feature maps were partitioned into 32 feature maps using ResNet101 and 16 feature maps using GoogLeNet, both with a size of 64 features. The feature maps were encoded to recognize the spatial arrangement of the patch and preserve the relationship between patches, helping the self-attention layers distinguish between patches based on their positions. They were fed to the transformer encoder, which consisted of six blocks and multiple vertices to focus on different patterns or regions simultaneously. Finally, the MLP classification layers classify each image into one of four dataset classes. The improved ResNet101-ViT hybrid methodology outperformed the GoogLeNet-ViT hybrid methodology. ResNet101-ViT achieved 98.7% accuracy, 95.05% AUC, 96.45% precision, 99.68% sensitivity, and 97.78% specificity.

Whole-body CT-to-PET synthesis using a customized transformer-enhanced GAN.

Xu B, Nie Z, He J, Li A, Wu T

pubmed logopapersMay 14 2025
Positron emission tomography with 2-deoxy-2-[fluorine-18]fluoro-D-glucose integrated with computed tomography (18F-FDG PET-CT) is a multi-modality medical imaging technique widely used for screening and diagnosis of lesions and tumors, in which, CT can provide detailed anatomical structures, while PET can show metabolic activities. Nevertheless, it has disadvantages such as long scanning time, high cost, and relatively high radiation doses.&#xD;&#xD;Purpose: We propose a deep learning model for the whole-body CT-to-PET synthesis task, generating high-quality synthetic PET images that are comparable to real ones in both clinical relevance and diagnostic value.&#xD;&#xD;Material: We collect 102 pairs of 3D CT and PET scans, which are sliced into 27,240 pairs of 2D CT and PET images ( training: 21,855 pairs, validation: 2,810, testing: 2,575 pairs).&#xD;&#xD;Methods: We propose a Transformer-enhanced Generative Adversarial Network (GAN) for whole-body CT-to-PET synthesis task. The CPGAN model uses residual blocks and Fully Connected Transformer Residual (FCTR) blocks to capture both local features and global contextual information. A customized loss function incorporating structural consistency is designed to improve the quality of synthesized PET images.&#xD;&#xD;Results: Both quantitative and qualitative evaluation results demonstrate effectiveness of the CPGAN model. The mean and standard variance of NRMSE,PSNR and SSIM values on test set are (16.90 ± 12.27) × 10-4, 28.71 ± 2.67 and 0.926 ± 0.033, respectively, outperforming other seven state-of-the-art models. Three radiologists independently and blindly evaluated and gave subjective scores to 100 randomly chosen PET images (50 real and 50 synthetic). By Wilcoxon signed rank test, there are no statistical differences between the synthetic PET images and the real ones.&#xD;&#xD;Conclusions: Despite the inherent limitations of CT images to directly reflect biological information of metabolic tissues, CPGAN model effectively synthesizes satisfying PET images from CT scans, which has potential in reducing the reliance on actual PET-CT scans.

CT-based AI framework leveraging multi-scale features for predicting pathological grade and Ki67 index in clear cell renal cell carcinoma: a multicenter study.

Yang H, Zhang Y, Li F, Liu W, Zeng H, Yuan H, Ye Z, Huang Z, Yuan Y, Xiang Y, Wu K, Liu H

pubmed logopapersMay 14 2025
To explore whether a CT-based AI framework, leveraging multi-scale features, can offer a non-invasive approach to accurately predict pathological grade and Ki67 index in clear cell renal cell carcinoma (ccRCC). In this multicenter retrospective study, a total of 1073 pathologically confirmed ccRCC patients from seven cohorts were split into internal cohorts (training and validation sets) and an external test set. The AI framework comprised an image processor, a 3D-kidney and tumor segmentation model by 3D-UNet, a multi-scale features extractor built upon unsupervised learning, and a multi-task classifier utilizing XGBoost. A quantitative model interpretation technique, known as SHapley Additive exPlanations (SHAP), was employed to explore the contribution of multi-scale features. The 3D-UNet model showed excellent performance in segmenting both the kidney and tumor regions, with Dice coefficients exceeding 0.92. The proposed multi-scale features model exhibited strong predictive capability for pathological grading and Ki67 index, with AUROC values of 0.84 and 0.87, respectively, in the internal validation set, and 0.82 and 0.82, respectively, in the external test set. The SHAP results demonstrated that features from radiomics, the 3D Auto-Encoder, and dimensionality reduction all made significant contributions to both prediction tasks. The proposed AI framework, leveraging multi-scale features, accurately predicts the pathological grade and Ki67 index of ccRCC. The CT-based AI framework leveraging multi-scale features offers a promising avenue for accurately predicting the pathological grade and Ki67 index of ccRCC preoperatively, indicating a direction for non-invasive assessment. Non-invasively determining pathological grade and Ki67 index in ccRCC could guide treatment decisions. The AI framework integrates segmentation, classification, and model interpretation, enabling fully automated analysis. The AI framework enables non-invasive preoperative detection of high-risk tumors, assisting clinical decision-making.
Page 108 of 1331324 results
Show
per page
Get Started

Upload your X-ray image and get interpretation.

Upload now →

Disclaimer: X-ray Interpreter's AI-generated results are for informational purposes only and not a substitute for professional medical advice. Always consult a healthcare professional for medical diagnosis and treatment.