
SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection

Yao Wang, Dong Yang, Zhi Qiao, Wenjian Huang, Liuzhi Yang, Zhen Qian

arXiv preprint, Sep 1 2025
Abnormality detection in medical imaging is a critical task requiring both high efficiency and accuracy to support effective diagnosis. While convolutional neural networks (CNNs) and Transformer-based models are widely used, both face intrinsic challenges: CNNs have limited receptive fields, restricting their ability to capture broad contextual information, and Transformers encounter prohibitive computational costs when processing high-resolution medical images. Mamba, a recent innovation in natural language processing, has gained attention for its ability to process long sequences with linear complexity, offering a promising alternative. Building on this foundation, we present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. This approach effectively mitigates the loss of high-frequency information caused by frequency bias and correlates frequency-domain features with spatial features, thereby enhancing the model's ability to capture global context. To further improve long-range dependencies, we propose the Visual State-Space Module (VSSM) and introduce a novel Hilbert Curve Scanning technique to strengthen spatial correlations and local dependencies, further optimizing the Mamba framework. Comprehensive experiments show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
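The abstract does not give the scanning implementation, but the locality property it relies on can be illustrated with the classic Hilbert-curve index-to-coordinate mapping (a standard textbook construction, not the authors' code): flattening a 2D feature map in this order keeps spatially adjacent pixels adjacent in the 1D sequence fed to the state-space model.

```python
def hilbert_d2xy(order, d):
    """Map a 1D Hilbert-curve index d to (x, y) on a 2^order x 2^order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:          # rotate the quadrant when moving along the base edge
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_scan_order(order):
    """Traversal order for a square feature map of side 2^order."""
    n = 1 << order
    return [hilbert_d2xy(order, d) for d in range(n * n)]
```

Every consecutive pair of visited cells is a direct grid neighbor, which is exactly the local-dependency property a raster scan lacks.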

Evaluating Undersampling Schemes and Deep Learning Reconstructions for High-Resolution 3D Double Echo Steady State Knee Imaging at 7 T: A Comparison Between GRAPPA, CAIPIRINHA, and Compressed Sensing.

Marth T, Marth AA, Kajdi GW, Nickel MD, Paul D, Sutter R, Nanz D, von Deuster C

PubMed, Sep 1 2025
The 3-dimensional (3D) double echo steady state (DESS) magnetic resonance imaging sequence can image knee cartilage with high, isotropic resolution, particularly at high and ultra-high field strengths. Advanced undersampling techniques with high acceleration factors can provide the short acquisition times required for clinical use. However, the optimal undersampling scheme and its limits are unknown. High-resolution isotropic (reconstructed voxel size: 0.3 × 0.3 × 0.3 mm³) 3D DESS images of 40 knees in 20 volunteers were acquired at 7 T with varying undersampling factors (R = 4-30) and schemes (regular: GRAPPA, CAIPIRINHA; incoherent: compressed sensing [CS]), whereas the remaining imaging parameters were kept constant. All imaging data were reconstructed with deep learning (DL) algorithms. Three readers rated image quality on a 4-point Likert scale. Four-fold accelerated GRAPPA was used as reference standard. Incidental cartilage lesions were graded on a modified Whole-Organ Magnetic Resonance Imaging Score (WORMS). Friedman's analysis of variance characterized rating differences. The interreader agreement was assessed using κ statistics. The quality of 16-fold accelerated CS images was not rated significantly different from that of 4-fold accelerated GRAPPA and 8-fold accelerated CAIPIRINHA images, whereas the corresponding data were acquired 4.5 and 2 times faster (1:12 min:s) than in 4-fold accelerated GRAPPA (5:22 min:s) and 8-fold accelerated CAIPIRINHA (2:22 min:s) acquisitions, respectively. Interreader agreement for incidental cartilage lesions was almost perfect for 4-fold accelerated GRAPPA (κ = 0.91), 8-fold accelerated CAIPIRINHA (κ = 0.86), and 8- to 16-fold accelerated CS (κ = 0.91). Our results suggest significant advantages of incoherent versus regular undersampling patterns for high-resolution 3D DESS cartilage imaging with high acceleration factors.
The combination of CS undersampling with DL reconstruction enables fast, isotropic, high-resolution acquisitions without apparent impairment of image quality. Since DESS specific absorption rate values tend to be moderate, CS DESS with DL reconstruction promises potential for high-resolution assessment of cartilage morphology and other musculoskeletal anatomies at 7 T.
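The quoted speedup factors follow directly from the reported acquisition times, which can be checked with a few lines of arithmetic:

```python
def mmss_to_s(t):
    """Convert a 'min:s' string such as '5:22' to seconds."""
    m, s = t.split(":")
    return int(m) * 60 + int(s)

# Acquisition times reported in the abstract
grappa4 = mmss_to_s("5:22")   # 4-fold accelerated GRAPPA
caipi8  = mmss_to_s("2:22")   # 8-fold accelerated CAIPIRINHA
cs16    = mmss_to_s("1:12")   # 16-fold accelerated CS

speedup_vs_grappa = grappa4 / cs16   # ~4.5x faster, as stated
speedup_vs_caipi  = caipi8 / cs16    # ~2x faster, as stated
```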

Can super resolution via deep learning improve classification accuracy in dental radiography?

Çelik B, Mikaeili M, Genç MZ, Çelik ME

PubMed, Sep 1 2025
Deep learning-driven super resolution (SR) aims to enhance the quality and resolution of images, offering potential benefits in dental imaging. Although extensive research has focused on deep learning based dental classification tasks, the impact of applying SR techniques on classification remains underexplored. This study seeks to address this gap by evaluating and comparing the performance of deep learning classification models on dental images with and without SR enhancement. An open-source dental image dataset was utilized to investigate the impact of SR on image classification performance. SR was applied by 2 models with scaling ratios of 2 and 4, while classification was performed by 4 deep learning models. Performances were evaluated by well-accepted metrics like structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), accuracy, recall, precision, and F1 score. The effect of SR on classification performance is interpreted through 2 different approaches. The 2 SR models yielded average SSIM and PSNR values of 0.904 and 36.71 for increasing resolution at both scaling ratios. Average accuracy and F1 score for classification models trained and tested with the SR-generated images were 0.859 and 0.873. In the first of the comparisons carried out with 2 different approaches, it was observed that the accuracy increased in at least half of the cases (8 out of 16) when different models and scaling ratios were considered, while in the second approach, SR showed a significantly higher performance for almost all cases (12 out of 16). This study demonstrated that classification with SR-generated images significantly improved outcomes. For the first time, the classification performance of dental radiographs with resolution improved by SR has been investigated. Significant performance improvement was observed compared to the case without SR.
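The PSNR figure quoted above follows the standard definition; a minimal NumPy version (the study's exact implementation is not given) is:

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

A value near 36.7 dB, as reported, corresponds to a small mean squared error relative to the 8-bit dynamic range.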

Pulmonary Embolism Survival Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data.

Zhong Z, Zhang H, Fayad FH, Lancaster AC, Sollee J, Kulkarni S, Lin CT, Li J, Gao X, Collins S, Greineder CF, Ahn SH, Bai HX, Jiao Z, Atalay MK

PubMed, Sep 1 2025
Pulmonary embolism (PE) is a significant cause of mortality in the United States. The objective of this study is to implement deep learning (DL) models using computed tomography pulmonary angiography (CTPA), clinical data, and PE Severity Index (PESI) scores to predict PE survival. In total, 918 patients (median age 64 y, range 13 to 99 y, 48% male) with 3978 CTPAs were identified via retrospective review across 3 institutions. To predict survival, an AI model was used to extract disease-related imaging features from CTPAs. Imaging features and clinical variables were then incorporated into independent DL models to predict survival outcomes. Cross-modal fusion CoxPH models were used to develop multimodal models from combinations of DL models and calculated PESI scores. Five multimodal models were developed as follows: (1) using CTPA imaging features only, (2) using clinical variables only, (3) using both CTPA and clinical variables, (4) using CTPA and PESI score, and (5) using CTPA, clinical variables, and PESI score. Performance was evaluated using the concordance index (c-index). Kaplan-Meier analysis was performed to stratify patients into high-risk and low-risk groups. Additional factor-risk analysis was conducted to account for right ventricular (RV) dysfunction. For both data sets, the multimodal models incorporating CTPA features, clinical variables, and PESI score achieved higher c-indices than PESI alone. Following the stratification of patients into high-risk and low-risk groups by models, survival outcomes differed significantly (both P <0.001). A strong correlation was found between high-risk grouping and RV dysfunction. Multiomic DL models incorporating CTPA features, clinical data, and PESI achieved higher c-indices than PESI alone for PE survival prediction.
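The c-index used to score these models can be illustrated with a small sketch of Harrell's concordance index (a common definition; not necessarily the authors' exact estimator). It is the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed survival times.

```python
def concordance_index(times, events, scores):
    """Harrell's c-index. Higher score = higher predicted risk,
    so shorter survival should come with a higher score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had the event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5   # ties count half
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect risk ranking, which is why a fused model beating PESI's c-index is the key claim here.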

Automatic detection of mandibular fractures on CT scan using deep learning.

Liu Y, Wang X, Tu Y, Chen W, Shi F, You M

PubMed, Sep 1 2025
This study explores the application of artificial intelligence (AI), specifically deep learning, in the detection and classification of mandibular fractures using CT scans. Data from 459 patients were retrospectively obtained from West China Hospital of Stomatology, Sichuan University, spanning from 2020 to 2023. The CT scans were divided into training, testing, and independent validation sets. This research focuses on training and validating a deep learning model using the nnU-Net segmentation framework for pixel-level accuracy in identifying fracture locations. Additionally, a 3D-ResNet with pre-trained weights was employed to classify fractures into 3 types based on severity. Performance metrics included sensitivity, precision, specificity, and area under the receiver operating characteristic curve (AUC). The study achieved high diagnostic accuracy in mandibular fracture detection, with sensitivity >0.93, precision >0.79, and specificity >0.80. For mandibular fracture classification, accuracies were all above 0.718, with a mean AUC of 0.86. Detection and classification of mandibular fractures in CT images can be significantly enhanced using the nnU-Net segmentation framework, aiding in clinical diagnosis.
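The reported thresholds follow the standard confusion-matrix definitions; the counts below are hypothetical, chosen only to reproduce numbers of the quoted magnitude:

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall over true fracture cases
    precision   = tp / (tp + fp)   # fraction of flagged cases that are real
    specificity = tn / (tn + fp)   # recall over non-fracture cases
    return sensitivity, precision, specificity

# Hypothetical counts consistent with sensitivity >0.93, precision >0.79, specificity >0.80
sens, prec, spec = detection_metrics(tp=93, fp=20, tn=80, fn=7)
```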

Explainable self-supervised learning for medical image diagnosis based on DINO V2 model and semantic search.

Hussien A, Elkhateb A, Saeed M, Elsabawy NM, Elnakeeb AE, Elrashidy N

PubMed, Sep 1 2025
Medical images have become indispensable for decision-making and significantly affect treatment planning. However, increasing medical imaging has widened the gap between medical images and available radiologists, leading to delays and diagnosis errors. Recent studies highlight the potential of deep learning (DL) in medical image diagnosis. However, their reliance on labelled data limits their applicability in various clinical settings. As a result, recent studies explore the role of self-supervised learning to overcome these challenges. Our study aims to address these challenges by examining the performance of self-supervised learning (SSL) in diverse medical image datasets and comparing it with traditional pre-trained supervised learning models. Unlike prior SSL methods that focus solely on classification, our framework leverages DINOv2's embeddings to enable semantic search in medical databases (via Qdrant), allowing clinicians to retrieve similar cases efficiently. This addresses a critical gap in clinical workflows where rapid case retrieval is needed. The results affirmed SSL's ability, especially DINOv2, to overcome the challenge associated with labelling data and provide an accurate diagnosis superior to traditional supervised learning. DINOv2 achieves classification accuracies of 100%, 99%, 99%, 100%, and 95% on the lung cancer, brain tumour, leukaemia, and eye retina disease datasets, respectively. While existing SSL models (e.g., BYOL, SimCLR) lack interpretability, we uniquely combine DINOv2 with ViT-CX, a causal explanation method tailored for transformers.
This provides clinically actionable heatmaps, revealing how the model localizes tumors and cellular patterns, a feature absent in prior SSL medical imaging studies. Furthermore, our research explores the impact of semantic search in the medical imaging domain and how it can revolutionize the querying process alongside SSL. Qdrant is used to store the embeddings of the developed model after training, and cosine similarity measures the distance between an image query and the stored embeddings. Our study aims to enhance the efficiency and accuracy of medical image analysis, ultimately improving the decision-making process.
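The retrieval step described above reduces to nearest-neighbor search under cosine similarity. A minimal sketch, assuming raw embedding vectors rather than a Qdrant deployment:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_emb, db_embs, k=3):
    """Indices of the k stored embeddings most similar to the query."""
    sims = [cosine_similarity(query_emb, e) for e in db_embs]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
```

In the described system, `db_embs` would hold DINOv2 embeddings of previously diagnosed cases, so the top hits are the visually most similar prior patients; a vector database such as Qdrant performs the same ranking with approximate indexing for scale.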

Application of deep learning for detection of nasal bone fracture on X-ray nasal bone lateral view.

Mortezaei T, Dalili Kajan Z, Mirroshandel SA, Mehrpour M, Shahidzadeh S

PubMed, Sep 1 2025
This study aimed to assess the efficacy of deep learning applications for the detection of nasal bone fracture on X-ray nasal bone lateral view. In this retrospective observational study, 2968 X-ray nasal bone lateral views of trauma patients were collected from a radiology centre and randomly divided into training, validation, and test sets. Preprocessing included noise reduction using a Gaussian filter and image resizing. Edge detection was performed using the Canny edge detector. Feature extraction was conducted using the gray-level co-occurrence matrix (GLCM), histogram of oriented gradients (HOG), and local binary pattern (LBP) techniques. Several deep learning models, namely CNN, VGG16, VGG19, MobileNet, Xception, ResNet50V2, and InceptionV3, were employed for the classification of images into 2 classes, normal and fracture. The accuracy was highest for VGG16 and Swin Transformer (0.79), followed by ResNet50V2 and InceptionV3 (0.74), Xception (0.72), and MobileNet (0.71). The AUC was highest for VGG16 (0.86), followed by VGG19 (0.84), MobileNet and Xception (0.83), and Swin Transformer (0.79). The tested deep learning models were capable of detecting nasal bone fractures on X-ray nasal bone lateral views with high accuracy. VGG16 was the best model with successful results.
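As one illustration of the noise-reduction step, a Gaussian blur can be written directly in NumPy as a separable convolution (a generic sketch; the study's filter parameters are not specified):

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=None):
    """Normalized 1D Gaussian kernel; truncated at ~3 sigma by default."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_smooth(image, sigma=1.0):
    """Separable Gaussian blur: filter rows, then columns, with edge padding."""
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
```

Because the kernel is normalized and separable, smoothing preserves flat regions exactly while suppressing high-frequency noise before edge detection.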

Comparison of the diagnostic performance of the artificial intelligence-based TIRADS algorithm with established classification systems for thyroid nodules.

Bozkuş A, Başar Y, Güven K

PubMed, Sep 1 2025
This study aimed to evaluate and compare the diagnostic performance of various Thyroid Imaging Reporting and Data Systems (TIRADS), with a particular focus on the artificial intelligence-based TIRADS (AI-TIRADS), in characterizing thyroid nodules. In this retrospective study conducted between April 2016 and May 2022, 1,322 thyroid nodules from 1,139 patients with confirmed cytopathological diagnoses were included. Each nodule was assessed using TIRADS classifications defined by the American College of Radiology (ACR-TIRADS), the American Thyroid Association (ATA-TIRADS), the European Thyroid Association (EU-TIRADS), the Korean Thyroid Association (K-TIRADS), and the AI-TIRADS. Three radiologists independently evaluated the ultrasound (US) characteristics of the nodules using all classification systems. Diagnostic performance was assessed using sensitivity, specificity, positive predictive value (PPV), and negative predictive value, and comparisons were made using the McNemar test. Among the nodules, 846 (64%) were benign, 299 (22.6%) were of intermediate risk, and 147 (11.1%) were malignant. The AI-TIRADS demonstrated a PPV of 21.2% and a specificity of 53.6%, outperforming the other systems in specificity without compromising sensitivity. The specificities of the ACR-TIRADS, the ATA-TIRADS, the EU-TIRADS, and the K-TIRADS were 44.6%, 39.3%, 40.1%, and 40.1%, respectively (all pairwise comparisons with the AI-TIRADS: P < 0.001). The PPVs for the ACR-TIRADS, the ATA-TIRADS, the EU-TIRADS, and the K-TIRADS were 18.5%, 17.9%, 17.9%, and 17.4%, respectively (all pairwise comparisons with the AI-TIRADS, excluding the ACR-TIRADS: P < 0.05). The AI-TIRADS shows promise in improving diagnostic specificity and reducing unnecessary biopsies in thyroid nodule assessment while maintaining high sensitivity. The findings suggest that the AI-TIRADS may enhance risk stratification, leading to better patient management.
Additionally, the study found that the presence of multiple suspicious US features markedly increases the risk of malignancy, whereas isolated features do not substantially elevate the risk. The AI-TIRADS can enhance thyroid nodule risk stratification by improving diagnostic specificity and reducing unnecessary biopsies, potentially leading to more efficient patient management and better utilization of healthcare resources.
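The pairwise comparisons rely on the McNemar test, which uses only the discordant pairs, i.e., nodules classified correctly by one system but not the other. A sketch of the exact binomial form (illustrative; the paper does not state which variant was used, and the counts `b` and `c` here are hypothetical):

```python
import math

def mcnemar_exact(b, c):
    """Exact (binomial) McNemar test p-value.
    b, c: counts of nodules where exactly one of the two classifiers is correct.
    Under the null both systems err equally often, so b ~ Binomial(b + c, 0.5)."""
    n = b + c
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)   # two-sided
```

Balanced disagreements give no evidence of a difference, while lopsided disagreements (one system correct far more often on the discordant nodules) drive the p-value down, which is how the specificity gaps above reach significance.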

Added prognostic value of histogram features from preoperative multi-modal diffusion MRI in predicting Ki-67 proliferation for adult-type diffuse gliomas.

Huang Y, He S, Hu H, Ma H, Huang Z, Zeng S, Mazu L, Zhou W, Zhao C, Zhu N, Wu J, Liu Q, Yang Z, Wang W, Shen G, Zhang N, Chu J

PubMed, Sep 1 2025
Ki-67 labelling index (LI), a critical marker of tumor proliferation, is vital for grading adult-type diffuse gliomas and predicting patient survival. However, its accurate assessment currently relies on invasive biopsy or surgical resection. This makes it challenging to non-invasively predict Ki-67 LI and subsequent prognosis. Therefore, this study aimed to investigate whether histogram analysis of multi-parametric diffusion model metrics, specifically diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), and neurite orientation dispersion and density imaging (NODDI), could help predict Ki-67 LI in adult-type diffuse gliomas and further predict patient survival. A total of 123 patients with diffuse gliomas who underwent preoperative bipolar spin-echo diffusion magnetic resonance imaging (MRI) were included. Diffusion metrics (DTI, DKI, and NODDI) and their histogram features were extracted and used to develop a nomogram model in the training set (n=86), and the performance was verified in the test set (n=37). The area under the receiver operating characteristic curve of the nomogram model was calculated. The outcome cohort, including 123 patients, was used to evaluate the predictive value of the diffusion nomogram model for overall survival (OS). Cox proportional hazards regression was performed to predict OS. Among 123 patients, 87 exhibited high Ki-67 LI (Ki-67 LI >5%). The patients had a mean age of 46.08±13.24 years, with 39 being female. Tumor grading showed 46 cases of grade 2, 21 cases of grade 3, and 56 cases of grade 4. The nomogram model included eight histogram features from diffusion MRI and showed good performance for predicting Ki-67 LI, with areas under the receiver operating characteristic curve (AUCs) of 0.92 [95% confidence interval (CI): 0.85-0.98, sensitivity =0.85, specificity =0.84] and 0.84 (95% CI: 0.64-0.98, sensitivity =0.77, specificity =0.73) in the training set and test set, respectively.
The nomogram incorporating these variables showed good discrimination for Ki-67 LI prediction and glioma grading. A low nomogram score relative to the median value in the outcome cohort was independently associated with OS (P<0.01). Accurate prediction of Ki-67 LI in adult-type diffuse glioma patients was achieved using a multi-modal diffusion MRI histogram radiomics model, which also reliably and accurately predicted survival. ClinicalTrials.gov Identifier: NCT06572592.
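First-order histogram features of the kind described can be computed directly from the voxel values of a diffusion-metric map within a tumor ROI. The feature set below is illustrative; the paper's exact eight features are not listed in the abstract.

```python
import numpy as np

def histogram_features(values):
    """First-order histogram features of a diffusion-metric map
    (values flattened from a tumor region of interest)."""
    v = np.asarray(values, dtype=np.float64).ravel()
    mean = v.mean()
    std = v.std()
    # Standardized third and fourth moments
    skew = np.mean((v - mean) ** 3) / std ** 3 if std > 0 else 0.0
    kurt = np.mean((v - mean) ** 4) / std ** 4 if std > 0 else 0.0
    p10, p50, p90 = np.percentile(v, [10, 50, 90])
    return {"mean": mean, "std": std, "skewness": skew,
            "kurtosis": kurt, "p10": p10, "median": p50, "p90": p90}
```

Such scalar summaries of each metric map (DTI, DKI, NODDI) are what feed the nomogram, so the whole pipeline stays non-invasive.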

RibPull: Implicit Occupancy Fields and Medial Axis Extraction for CT Ribcage Scans

Emmanouil Nikolakakis, Amine Ouasfi, Julie Digne, Razvan Marinescu

arXiv preprint, Sep 1 2025
We present RibPull, a methodology that utilizes implicit occupancy fields to bridge computational geometry and medical imaging. Implicit 3D representations use continuous functions that handle sparse and noisy data more effectively than discrete methods. While voxel grids are standard for medical imaging, they suffer from resolution limitations, topological information loss, and inefficient handling of sparsity. Coordinate functions preserve complex geometrical information and represent a better solution for sparse data representation, while allowing for further morphological operations. Implicit scene representations enable neural networks to encode entire 3D scenes within their weights. The result is a continuous function that can implicitly compensate for sparse signals and infer further information about the 3D scene by passing any combination of 3D coordinates as input to the model. In this work, we use neural occupancy fields that predict whether a 3D point lies inside or outside an object to represent CT-scanned ribcages. We also apply a Laplacian-based contraction to extract the medial axis of the ribcage, thus demonstrating a geometrical operation that benefits greatly from continuous coordinate-based 3D scene representations versus voxel-based representations. We evaluate our methodology on 20 medical scans from the RibSeg dataset, which is itself an extension of the RibFrac dataset. We will release our code upon publication.
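The query interface of an occupancy field can be shown with a toy closed-form example, where a sphere stands in for the trained ribcage network (a hypothetical stand-in, not the paper's model): any batch of 3D coordinates maps to inside/outside values, with no grid resolution involved.

```python
import numpy as np

def sphere_occupancy(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Toy occupancy field: 1.0 inside the sphere, 0.0 outside.
    A neural occupancy network replaces this closed form with an MLP
    fitted to the ribcage, but the query interface is identical:
    an (N, 3) batch of coordinates -> (N,) occupancy values."""
    p = np.asarray(points, dtype=np.float64) - np.asarray(center)
    return (np.linalg.norm(p, axis=-1) <= radius).astype(np.float64)
```

Because the field is defined everywhere, downstream geometric operations such as medial-axis contraction can sample it at arbitrary resolution rather than being tied to a fixed voxel grid.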
