Latest Papers on Radiology AI

Explainable deep learning for age and gender estimation in dental CBCT scans using attention mechanisms and multi task learning.

Pishghadam N, Esmaeilyfard R, Paknahad M

•papers•May 24 2025

Accurate and interpretable age estimation and gender classification are essential in forensic and clinical diagnostics, particularly when using high-dimensional medical imaging data such as Cone Beam Computed Tomography (CBCT). Traditional CBCT-based approaches often suffer from high computational costs and limited interpretability, reducing their applicability in forensic investigations. This study aims to develop a multi-task deep learning framework that enhances both accuracy and explainability in CBCT-based age estimation and gender classification using attention mechanisms. We propose a multi-task learning (MTL) model that simultaneously estimates age and classifies gender using panoramic slices extracted from CBCT scans. To improve interpretability, we integrate the Convolutional Block Attention Module (CBAM) and Grad-CAM visualization, highlighting relevant craniofacial regions. The dataset includes 2,426 CBCT images from individuals aged 7 to 23 years, and performance is assessed using Mean Absolute Error (MAE) for age estimation and accuracy for gender classification. The proposed model achieves an MAE of 1.08 years for age estimation and 95.3% accuracy in gender classification, significantly outperforming conventional CBCT-based methods. CBAM enhances the model's ability to focus on clinically relevant anatomical features, while Grad-CAM provides visual explanations, improving interpretability. Additionally, using panoramic slices instead of full 3D CBCT volumes reduces computational costs without sacrificing accuracy. Our framework improves both accuracy and interpretability in forensic age estimation and gender classification from CBCT images. By incorporating explainable AI techniques, this model provides a computationally efficient and clinically interpretable tool for forensic and medical applications.
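
The two evaluation metrics named in this abstract, MAE for age and accuracy for gender, can be sketched in a few lines of Python. This is only an illustration of the standard definitions, not the authors' evaluation code; the function names and the toy values in the usage note are assumptions made for the example.

```python
from typing import Sequence


def mean_absolute_error(true_ages: Sequence[float],
                        predicted_ages: Sequence[float]) -> float:
    """Average absolute difference between true and predicted ages (in years)."""
    if len(true_ages) != len(predicted_ages) or len(true_ages) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def accuracy(true_labels: Sequence[int],
             predicted_labels: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels) or len(true_labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)
```

For instance, `mean_absolute_error([10, 12, 20], [11, 12, 18])` averages the errors 1, 0 and 2 years, and `accuracy([0, 1, 1, 0], [0, 1, 0, 0])` counts three matches out of four (toy values, not data from the study).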

CT Classification Methodology In Silico Academic Lab GenAI

Stroke prediction in elderly patients with atrial fibrillation using machine learning combined clinical and left atrial appendage imaging phenotypic features.

Huang H, Xiong Y, Yao Y, Zeng J

•papers•May 24 2025

Atrial fibrillation (AF) is one of the primary etiologies for ischemic stroke, and it is of paramount importance to delineate the risk phenotypes among elderly AF patients and to investigate more efficacious models for predicting stroke risk. This single-center prospective cohort study collected clinical data and cardiac computed tomography angiography (CTA) images from elderly AF patients. The clinical phenotypes and left atrial appendage (LAA) radiomic phenotypes of elderly AF patients were identified through K-means clustering. The independent correlations between these phenotypes and stroke risk were subsequently analyzed. Five machine learning algorithms (Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Random Forest, and Extreme Gradient Boosting) were selected to develop a predictive model for stroke risk in this patient cohort. The model was assessed using the Area Under the Receiver Operating Characteristic Curve (AUROC), Hosmer-Lemeshow tests, and Decision Curve Analysis. A total of 419 elderly AF patients (≥ 65 years old) were included. K-means clustering identified three clinical phenotypes: Group A (cardiac enlargement/dysfunction), Group B (normal phenotype), and Group C (metabolic/coagulation abnormalities). Stroke incidence was highest in Group A (19.3%) and Group C (14.5%) versus Group B (3.3%). Similarly, LAA radiomic phenotypes revealed elevated stroke risk in patients with enlarged LAA structure (Group B: 20.0%) and complex LAA morphology (Group C: 14.0%) compared to normal LAA (Group A: 2.9%). Among the five machine learning models, the SVM model achieved superior prediction performance (AUROC: 0.858 [95% CI: 0.830-0.887]). The stroke-risk prediction model for elderly AF patients constructed based on the SVM algorithm has strong predictive efficacy.
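
The AUROC used to compare the models can be read as the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it. A minimal pure-Python sketch of that pairwise definition follows; the function name and the toy inputs in the usage note are assumptions for illustration, and the confidence intervals reported in the abstract are not reproduced.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for the event (e.g. stroke), 0 otherwise.
    scores: model outputs, where a higher value means higher predicted risk.
    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

With the toy inputs `auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])`, three of the four positive-negative pairs are ranked correctly. The double loop is quadratic in the number of samples, which is fine for a cohort of a few hundred patients; a rank-based formulation would scale better.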

CT Classification Cardiac Prospective Clinical Pilot Academic Lab Benchmark SOTA

Deep learning-based identification of vertebral fracture and osteoporosis in lateral spine radiographs and DXA vertebral fracture assessment to predict incident fracture.

Hong N, Cho SW, Lee YH, Kim CO, Kim HC, Rhee Y, Leslie WD, Cummings SR, Kim KM

•papers•May 24 2025

Deep learning (DL) identification of vertebral fractures and osteoporosis in lateral spine radiographs and DXA vertebral fracture assessment (VFA) images may improve fracture risk assessment in older adults. In 26 299 lateral spine radiographs from 9276 individuals attending a tertiary-level institution (60% train set; 20% validation set; 20% test set; VERTE-X cohort), DL models were developed to detect prevalent vertebral fracture (pVF) and osteoporosis. The pre-trained DL models from lateral spine radiographs were then fine-tuned in 30% of a DXA VFA dataset (KURE cohort), with performance evaluated in the remaining 70% test set. The area under the receiver operating characteristic curve (AUROC) for DL models to detect pVF and osteoporosis was 0.926 (95% CI 0.908-0.955) and 0.848 (95% CI 0.827-0.869) from VERTE-X spine radiographs, respectively, and 0.924 (95% CI 0.905-0.942) and 0.867 (95% CI 0.853-0.881) from KURE DXA VFA images, respectively. A total of 13.3% and 13.6% of individuals sustained an incident fracture during a median follow-up of 5.4 years and 6.4 years in the VERTE-X test set (n = 1852) and KURE test set (n = 2456), respectively. Incident fracture risk was significantly greater among individuals with DL-detected vertebral fracture (hazard ratios [HRs] 3.23 [95% CI 2.51-5.17] and 2.11 [95% CI 1.62-2.74] for the VERTE-X and KURE test sets) or DL-detected osteoporosis (HR 2.62 [95% CI 1.90-3.63] and 2.14 [95% CI 1.72-2.66]), which remained significant after adjustment for clinical risk factors and femoral neck bone mineral density. DL scores improved incident fracture discrimination and net benefit when combined with clinical risk factors. In summary, DL-detected pVF and osteoporosis in lateral spine radiographs and DXA VFA images enhanced fracture risk prediction in older adults.

X-Ray Detection Musculoskeletal Retrospective Clinical In Silico Academic Lab

SW-ViT: A Spatio-Temporal Vision Transformer Network with Post Denoiser for Sequential Multi-Push Ultrasound Shear Wave Elastography

Ahsan Habib Akash, MD Jahin Alam, Md. Kamrul Hasan

•preprint•May 24 2025

Objective: Ultrasound Shear Wave Elastography (SWE) demonstrates great potential in assessing soft-tissue pathology by mapping tissue stiffness, which is linked to malignancy. Traditional SWE methods have shown promise in estimating tissue elasticity, yet their susceptibility to noise interference, reliance on limited training data, and inability to generate segmentation masks concurrently present notable challenges to accuracy and reliability. Approach: In this paper, we propose SW-ViT, a novel two-stage deep learning framework for SWE that integrates a CNN-Spatio-Temporal Vision Transformer-based reconstruction network with an efficient Transformer-based post-denoising network. The first stage uses a 3D ResNet encoder with multi-resolution spatio-temporal Transformer blocks that capture spatial and temporal features, followed by a squeeze-and-excitation attention decoder that reconstructs 2D stiffness maps. To address data limitations, a patch-based training strategy is adopted for localized learning and reconstruction. In the second stage, a denoising network with a shared encoder and dual decoders processes inclusion and background regions to produce a refined stiffness map and segmentation mask. A hybrid loss combining regional, smoothness, fusion, and Intersection over Union (IoU) components ensures improvements in both reconstruction and segmentation. Results: On simulated data, our method achieves PSNR of 32.68 dB, CNR of 46.78 dB, and SSIM of 0.995. On phantom data, results include PSNR of 21.11 dB, CNR of 42.14 dB, and SSIM of 0.936. Segmentation IoU values reach 0.949 (simulation) and 0.738 (phantom) with ASSD values being 0.184 and 1.011, respectively. Significance: SW-ViT delivers robust, high-quality elasticity map estimates from noisy SWE data and holds clear promise for clinical application.
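
Two of the reported metrics, PSNR for the reconstructed stiffness map and IoU for the segmentation mask, follow standard definitions that can be sketched in plain Python. This is an illustrative sketch only: the images are assumed to be flattened into one-dimensional sequences, and the `peak` value is an assumption, since the abstract does not state which peak the authors use.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float], peak: float) -> float:
    """Peak signal-to-noise ratio in dB for two flattened images."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def iou(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Intersection over Union for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0
```

As a quick check on toy values, a constant error of 1 against a peak of 10 gives a mean squared error of 1 and therefore 20 dB, and two masks that share one pixel out of three marked pixels give an IoU of 1/3.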

Ultrasound Segmentation Methodology In Silico Academic Lab

Deep Learning for Breast Cancer Detection: Comparative Analysis of ConvNeXT and EfficientNet

Mahmudul Hasan

•preprint•May 24 2025

Breast cancer is the most commonly occurring cancer worldwide. It caused 670,000 deaths globally in 2022, as reported by the WHO. Yet since health officials began routine mammography screening in age groups deemed at risk in the 1980s, breast cancer mortality has decreased by 40% in high-income nations. Every day, more people receive a breast cancer diagnosis. Reducing cancer-related deaths requires early detection and treatment. This paper compares two convolutional neural networks, ConvNeXT and EfficientNet, for predicting the likelihood of cancer in mammograms from screening exams. Image preprocessing, classification, and performance evaluation are the main parts of the procedure. Several evaluation metrics were used to compare the performance of the models. The results show that ConvNeXT performs better, with a 94.33% AUC score, 93.36% accuracy, and 95.13% F-score, compared to EfficientNet with a 92.34% AUC score, 91.47% accuracy, and 93.06% F-score on the RSNA screening mammography breast cancer dataset.
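
The abstract reports an F-score without giving its weighting. Assuming it refers to the common F1 score (the harmonic mean of precision and recall), it can be computed directly from confusion-matrix counts, as in this sketch. The function name and the counts in the usage note are made up for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 score from confusion-matrix counts.

    Equivalent to the harmonic mean of precision and recall:
    F1 = 2*TP / (2*TP + FP + FN).
    Returns 0.0 when there are no positives to score at all.
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator
```

For example, 8 true positives with 2 false positives and 2 false negatives give precision and recall of 0.8 each, so `f1_score(8, 2, 2)` is also 0.8. Unlike accuracy, this score ignores true negatives, which matters on screening data where most exams are cancer-free.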

Mammography Classification Breast Methodology In Silico Academic Lab

TK-Mamba: Marrying KAN with Mamba for Text-Driven 3D Medical Image Segmentation

Haoyu Yang, Yuxiang Cai, Jintao Chen, Xuhong Zhang, Wenhui Lei, Xiaoming Shi, Jianwei Yin, Yankai Jiang

•preprint•May 24 2025

3D medical image segmentation is vital for clinical diagnosis and treatment but is challenged by high-dimensional data and complex spatial dependencies. Traditional single-modality networks, such as CNNs and Transformers, are often limited by computational inefficiency and constrained contextual modeling in 3D settings. We introduce a novel multimodal framework that leverages Mamba and Kolmogorov-Arnold Networks (KAN) as an efficient backbone for long-sequence modeling. Our approach features three key innovations: First, an EGSC (Enhanced Gated Spatial Convolution) module captures spatial information when unfolding 3D images into 1D sequences. Second, we extend Group-Rational KAN (GR-KAN), a Kolmogorov-Arnold Networks variant with rational basis functions, into 3D-Group-Rational KAN (3D-GR-KAN) for 3D medical imaging - its first application in this domain - enabling superior feature representation tailored to volumetric data. Third, a dual-branch text-driven strategy leverages CLIP's text embeddings: one branch swaps one-hot labels for semantic vectors to preserve inter-organ semantic relationships, while the other aligns images with detailed organ descriptions to enhance semantic alignment. Experiments on the Medical Segmentation Decathlon (MSD) and KiTS23 datasets show our method achieving state-of-the-art performance, surpassing existing approaches in accuracy and efficiency. This work highlights the power of combining advanced sequence modeling, extended network architectures, and vision-language synergy to push forward 3D medical image segmentation, delivering a scalable solution for clinical use. The source code is openly available at https://github.com/yhy-whu/TK-Mamba.

Mixed Modality Segmentation Methodology In Silico Academic Lab Open Code Benchmark SOTA

Joint Reconstruction of Activity and Attenuation in PET by Diffusion Posterior Sampling in Wavelet Coefficient Space

Clémentine Phung-Ngoc, Alexandre Bousse, Antoine De Paepe, Hong-Phuong Dang, Olivier Saut, Dimitris Visvikis

•preprint•May 24 2025

Attenuation correction (AC) is necessary for accurate activity quantification in positron emission tomography (PET). Conventional reconstruction methods typically rely on attenuation maps derived from a co-registered computed tomography (CT) or magnetic resonance imaging scan. However, this additional scan may complicate the imaging workflow, introduce misalignment artifacts and increase radiation exposure. In this paper, we propose a joint reconstruction of activity and attenuation (JRAA) approach that eliminates the need for auxiliary anatomical imaging by relying solely on emission data. This framework combines a wavelet diffusion model (WDM) and diffusion posterior sampling (DPS) to reconstruct fully three-dimensional (3-D) data. Experimental results show our method outperforms maximum likelihood activity and attenuation (MLAA) and MLAA with UNet-based post-processing, and yields high-quality noise-free reconstructions across various count settings when time-of-flight (TOF) information is available. It is also able to reconstruct non-TOF data, although the reconstruction quality significantly degrades in low-count (LC) conditions, limiting its practical effectiveness in such settings. This approach represents a step towards stand-alone PET imaging by reducing the dependence on anatomical modalities while maintaining quantification accuracy, even in low-count scenarios when TOF information is available.

PET Reconstruction Whole Body Methodology In Silico Academic Lab

MATI: A GPU-accelerated toolbox for microstructural diffusion MRI simulation and data fitting with a graphical user interface.

Xu J, Devan SP, Shi D, Pamulaparthi A, Yan N, Zu Z, Smith DS, Harkins KD, Gore JC, Jiang X

•papers•May 24 2025

To introduce MATI (Microstructural Analysis Toolbox for Imaging), a versatile MATLAB-based toolbox that combines both simulation and data fitting capabilities for microstructural dMRI research. MATI provides a user-friendly graphical user interface that enables researchers, including those without much programming experience, to perform advanced simulations and data analyses for microstructural MRI research. For simulation, MATI supports arbitrary microstructural tissues and pulse sequences. For data fitting, MATI supports a range of fitting methods, including traditional non-linear least squares, Bayesian approaches, machine learning, and dictionary matching methods, allowing users to tailor analyses based on specific research needs. Optimized with vectorized matrix operations and high-performance numerical libraries, MATI achieves high computational efficiency, enabling rapid simulations and data fitting on CPU and GPU hardware. While designed for microstructural dMRI, MATI's generalized framework can be extended to other imaging methods, making it a flexible and scalable tool for quantitative MRI research. MATI offers a significant step toward translating advanced microstructural MRI techniques into clinical applications.

MRI Classification Neurological Methodology In Silico Academic Lab Open Code

Cross-Fusion Adaptive Feature Enhancement Transformer: Efficient high-frequency integration and sparse attention enhancement for brain MRI super-resolution.

Yang Z, Xiao H, Wang X, Zhou F, Deng T, Liu S

•papers•May 24 2025

High-resolution magnetic resonance imaging (MRI) is essential for diagnosing and treating brain diseases. Transformer-based approaches demonstrate strong potential in MRI super-resolution by capturing long-range dependencies effectively. However, existing Transformer-based super-resolution methods face several challenges: (1) they primarily focus on low-frequency information, neglecting the utilization of high-frequency information; (2) they lack effective mechanisms to integrate both low-frequency and high-frequency information; (3) they struggle to effectively eliminate redundant information during the reconstruction process. To address these issues, we propose the Cross-fusion Adaptive Feature Enhancement Transformer (CAFET). Our model maximizes the potential of both CNNs and Transformers. It consists of four key blocks: a high-frequency enhancement block for extracting high-frequency information; a hybrid attention block for capturing global information and local fitting, which includes channel attention and shifted rectangular window attention; a large-window fusion attention block for integrating local high-frequency features and global low-frequency features; and an adaptive sparse overlapping attention block for dynamically retaining key information and enhancing the aggregation of cross-window features. Extensive experiments validate the effectiveness of the proposed method. On the BraTS and IXI datasets, with an upsampling factor of ×2, the proposed method achieves a maximum PSNR improvement of 2.4 dB and 1.3 dB compared to state-of-the-art methods, along with an SSIM improvement of up to 0.16% and 1.42%. Similarly, at an upsampling factor of ×4, the proposed method achieves a maximum PSNR improvement of 1.04 dB and 0.3 dB over the current leading methods, along with an SSIM improvement of up to 0.25% and 1.66%. Our method is capable of reconstructing high-quality super-resolution brain MRI images, demonstrating significant clinical potential.

MRI Reconstruction Neurological Methodology In Silico Academic Lab

Generalizable AI approach for detecting projection type and left-right reversal in chest X-rays.

Ohta Y, Katayama Y, Ichida T, Utsunomiya A, Ishida T

•papers•May 23 2025

The verification of chest X-ray images involves several checkpoints, including orientation and reversal. To address the challenges of manual verification, this study developed an artificial intelligence (AI)-based system using a deep convolutional neural network (DCNN) to automatically verify the consistency between the imaging direction and examination orders. The system classified the chest X-ray images into four categories: anteroposterior (AP), posteroanterior (PA), flipped AP, and flipped PA. To evaluate the impact of internal and external datasets on the classification accuracy, the DCNN was trained using multiple publicly available chest X-ray datasets and tested on both internal and external data. The results demonstrated that the DCNN accurately classified the imaging directions and detected image reversal. However, the classification accuracy was strongly influenced by the training dataset. When trained exclusively on NIH data, the network achieved an accuracy of 98.9% on the same dataset; however, this reduced to 87.8% when evaluated with PADChest data. When trained on a mixed dataset, the accuracy improved to 96.4%; however, it decreased to 76.0% when tested on an external COVID-CXNet dataset. Further, using Grad-CAM, we visualized the decision-making process of the network, highlighting the areas of influence, such as the cardiac silhouette and arm positioning, depending on the imaging direction. Thus, this study demonstrated the potential of AI in assisting in automating the verification of imaging direction and positioning in chest X-rays. However, the network must be fine-tuned to local data characteristics to achieve optimal performance.
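
The four-class setup (AP, PA, flipped AP, flipped PA) suggests one simple way to obtain reversed examples: mirror a correctly oriented image left to right and relabel it. The abstract does not say how the flipped images were produced, so the sketch below is an assumption for illustration only. The image is represented as a nested list of pixel rows, and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # rows of pixel values

# Label given to the mirrored copy of each correctly oriented projection.
FLIPPED_LABEL: Dict[str, str] = {"AP": "flipped AP", "PA": "flipped PA"}


def flip_left_right(image: Image) -> Image:
    """Return a left-right mirrored copy of the image."""
    return [row[::-1] for row in image]


def make_flipped_sample(image: Image, label: str) -> Tuple[Image, str]:
    """Create the mirrored image and its 'flipped' class label."""
    if label not in FLIPPED_LABEL:
        raise ValueError(f"expected 'AP' or 'PA', got {label!r}")
    return flip_left_right(image), FLIPPED_LABEL[label]
```

Calling `make_flipped_sample(image, "PA")` returns the mirrored pixels together with the label `"flipped PA"`, and flipping twice restores the original image.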

X-Ray Classification Chest Methodology In Silico Academic Lab
