Latest Papers on Radiology AI. Tags: None

Efficient needle guidance: multi-camera augmented reality navigation without patient-specific calibration.

Wei Y, Huang B, Zhao B, Lin Z, Zhou SZ

•papers•Jul 12 2025

Augmented reality (AR) technology holds significant promise for enhancing surgical navigation in needle-based procedures such as biopsies and ablations. However, most existing AR systems rely on patient-specific markers, which disrupt clinical workflows and require time-consuming preoperative calibrations, thereby hindering operational efficiency and precision. We developed a novel multi-camera AR navigation system that eliminates the need for patient-specific markers by utilizing ceiling-mounted markers mapped to fixed medical imaging devices. A hierarchical optimization framework integrates both marker mapping and multi-camera calibration. Deep learning techniques are employed to enhance marker detection and registration accuracy. Additionally, a vision-based pose compensation method is implemented to mitigate errors caused by patient movement, improving overall positional accuracy. Validation through phantom experiments and simulated clinical scenarios demonstrated an average puncture accuracy of 3.72 ± 1.21 mm. The system reduced needle placement time by 20 s compared to traditional marker-based methods. It also effectively corrected errors induced by patient movement, with a mean positional error of 0.38 pixels and an angular deviation of 0.51 <math xmlns="http://www.w3.org/1998/Math/MathML"><mmultiscripts><mrow></mrow> <mrow></mrow> <mo>∘</mo></mmultiscripts> </math> . These results highlight the system's precision, adaptability, and reliability in realistic surgical conditions. This marker-free AR guidance system significantly streamlines surgical workflows while enhancing needle navigation accuracy. Its simplicity, cost-effectiveness, and adaptability make it an ideal solution for both high- and low-resource clinical environments, offering the potential for improved precision, reduced procedural time, and better patient outcomes.

Mixed Modality Registration Methodology Phantom/Animal Academic Lab Benchmark SOTA

PanoDiff-SR: Synthesizing Dental Panoramic Radiographs using Diffusion and Super-resolution

Sanyam Jain, Bruna Neves de Freitas, Andreas Basse-OConnor, Alexandros Iosifidis, Ruben Pauwels

•preprint•Jul 12 2025

There has been increasing interest in the generation of high-quality, realistic synthetic medical images in recent years. Such synthetic datasets can mitigate the scarcity of public datasets for artificial intelligence research, and can also be used for educational purposes. In this paper, we propose a combination of diffusion-based generation (PanoDiff) and Super-Resolution (SR) for generating synthetic dental panoramic radiographs (PRs). The former generates a low-resolution (LR) seed of a PR (256 X 128) which is then processed by the SR model to yield a high-resolution (HR) PR of size 1024 X 512. For SR, we propose a state-of-the-art transformer that learns local-global relationships, resulting in sharper edges and textures. Experimental results demonstrate a Frechet inception distance score of 40.69 between 7243 real and synthetic images (in HR). Inception scores were 2.55, 2.30, 2.90 and 2.98 for real HR, synthetic HR, real LR and synthetic LR images, respectively. Among a diverse group of six clinical experts, all evaluating a mixture of 100 synthetic and 100 real PRs in a time-limited observation, the average accuracy in distinguishing real from synthetic images was 68.5% (with 50% corresponding to random guessing).

X-Ray Image Synthesis Methodology In Silico Open Dataset

Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift

Behraj Khan, Tahir Syed

•preprint•Jul 12 2025

Foundation models like CLIP and SAM have transformed computer vision and medical imaging via low-shot transfer learning. However, deployment of these models hindered by two key challenges: \textit{distribution shift} between training and test data, and \textit{confidence misalignment} that leads to overconfident incorrect predictions. These issues manifest differently in vision-language classification and medical segmentation tasks, yet existing solutions remain domain-specific. We propose \textit{StaRFM}, a unified framework addressing both challenges. It introduces a Fisher information penalty (FIP), extended to 3D medical data via patch-wise regularization, to reduce covariate shift in CLIP and SAM embeddings. Additionally, a confidence misalignment penalty (CMP), reformulated for voxel-level predictions, calibrates uncertainty in segmentation tasks. We theoretically derive PAC-Bayes bounds showing FIP controls generalization via the Fisher-Rao norm, while CMP minimizes calibration error through Brier score optimization. StaRFM shows consistent performance like \texttt{+}3.5\% accuracy and 28\% lower ECE on 19 vision datasets (e.g., ImageNet, Office-Home), 84.7\% DSC and 4.8mm HD95 in medical segmentation (e.g., BraTS, ATLAS), and 40\% lower cross-domain performance gap compared to prior benchmarking methods. The framework is plug-and-play, requiring minimal architectural changes for seamless integration with foundation models. Code and models will be released at https://anonymous.4open.science/r/StaRFM-C0CD/README.md

MRI Segmentation Neurological Methodology In Silico Academic Lab Benchmark SOTA Open Code

Oriented tooth detection: a CBCT image processing method integrated with RoI transformer.

Zhao Z, Wu B, Su S, Liu D, Wu Z, Gao R, Zhang N

•papers•Jul 11 2025

Cone beam computed tomography (CBCT) has revolutionized dental imaging due to its high spatial resolution and ability to provide detailed three-dimensional reconstructions of dental structures. This study introduces an innovative CBCT image processing method using an oriented object detection approach integrated with a Region of Interest (RoI) Transformer. This study addresses the challenge of accurate tooth detection and classification in PAN derived from CBCT, introducing an innovative oriented object detection approach, which has not been previously applied in dental imaging. This method better aligns with the natural growth patterns of teeth, allowing for more accurate detection and classification of molars, premolars, canines, and incisors. By integrating RoI transformer, the model demonstrates relatively acceptable performance metrics compared to conventional horizontal detection methods, while also offering enhanced visualization capabilities. Furthermore, post-processing techniques, including distance and grayscale value constraints, are employed to correct classification errors and reduce false positives, especially in areas with missing teeth. The experimental results indicate that the proposed method achieves an accuracy of 98.48%, a recall of 97.21%, an F1 score of 97.21%, and an mAP of 98.12% in tooth detection. The proposed method enhances the accuracy of tooth detection in CBCT-derived PAN by reducing background interference and improving the visualization of tooth orientation.

CT Detection Methodology In Silico Academic Lab

HNOSeg-XS: Extremely Small Hartley Neural Operator for Efficient and Resolution-Robust 3D Image Segmentation.

Wong KCL, Wang H, Syeda-Mahmood T

•papers•Jul 11 2025

In medical image segmentation, convolutional neural networks (CNNs) and transformers are dominant. For CNNs, given the local receptive fields of convolutional layers, long-range spatial correlations are captured through consecutive convolutions and pooling. However, as the computational cost and memory footprint can be prohibitively large, 3D models can only afford fewer layers than 2D models with reduced receptive fields and abstract levels. For transformers, although long-range correlations can be captured by multi-head attention, its quadratic complexity with respect to input size is computationally demanding. Therefore, either model may require input size reduction to allow more filters and layers for better segmentation. Nevertheless, given their discrete nature, models trained with patch-wise training or image downsampling may produce suboptimal results when applied on higher resolutions. To address this issue, here we propose the resolution-robust HNOSeg-XS architecture. We model image segmentation by learnable partial differential equations through the Fourier neural operator which has the zero-shot super-resolution property. By replacing the Fourier transform by the Hartley transform and reformulating the problem in the frequency domain, we created the HNOSeg-XS model, which is resolution robust, fast, memory efficient, and extremely parameter efficient. When tested on the BraTS'23, KiTS'23, and MVSeg'23 datasets with a Tesla V100 GPU, HNOSeg-XS showed its superior resolution robustness with fewer than 34.7k model parameters. It also achieved the overall best inference time (< 0.24 s) and memory efficiency (< 1.8 GiB) compared to the tested CNN and transformer models<sup>1</sup>.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

Automated MRI protocoling in neuroradiology in the era of large language models.

Reiner LN, Chelbi M, Fetscher L, Stöckel JC, Csapó-Schmidt C, Guseynova S, Al Mohamad F, Bressem KK, Nawabi J, Siebert E, Wattjes MP, Scheel M, Meddeb A

•papers•Jul 11 2025

This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs), comparing an open-source (LLama 3.1 405B) and a proprietary model (GPT-4o) with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge. This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. LLMs were tasked with assigning MRI protocols and contrast medium administration with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation. Data from 100 neuroradiology reports (mean age = 54.2 years ± 18.41, women 50%) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for LLama 3.1 (Sequences: 38% vs. 70%, P < .001, Contrast Media: 77% vs. 94%, P < .001), and GPT-4o (Sequences: 43% vs. 81%, P < .001, Contrast Media: 79% vs. 92%, P = .006). GPT-4o outperformed LLama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with comparable accuracies to the radiologists (81% ± 0.21, P = .43). Both models equaled radiologists in predicting contrast media administration (LLama 3.1 RAG: 94% vs. 91% ± 0.2, P = .37, GPT-4o RAG: 92% vs. 91% ± 0.24, P = .48). Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.

MRI LLM Radiology Report Neurological Retrospective Clinical In Silico Academic Lab GenAI Benchmark SOTA

Semi-supervised Medical Image Segmentation Using Heterogeneous Complementary Correction Network and Confidence Contrastive Learning.

Li L, Xue M, Li S, Dong Z, Liao T, Li P

•papers•Jul 11 2025

Semi-supervised medical image segmentation techniques have demonstrated significant potential and effectiveness in clinical diagnosis. The prevailing approaches using the mean-teacher (MT) framework achieve promising image segmentation results. However, due to the unreliability of the pseudo labels generated by the teacher model, existing methods still have some inherent limitations that must be considered and addressed. In this paper, we propose an innovative semi-supervised method for medical image segmentation by combining the heterogeneous complementary correction network and confidence contrastive learning (HC-CCL). Specifically, we develop a triple-branch framework by integrating a heterogeneous complementary correction (HCC) network into the MT framework. HCC serves as an auxiliary branch that corrects prediction errors in the student model and provides complementary information. To improve the capacity for feature learning in our proposed model, we introduce a confidence contrastive learning (CCL) approach with a novel sampling strategy. Furthermore, we develop a momentum style transfer (MST) method to narrow the gap between labeled and unlabeled data distributions. In addition, we introduce a Cutout-style augmentation for unsupervised learning to enhance performance. Three medical image datasets (including left atrial (LA) dataset, NIH pancreas dataset, Brats-2019 dataset) were employed to rigorously evaluate HC-CCL. Quantitative results demonstrate significant performance advantages over existing approaches, achieving state-of-the-art performance across all metrics. The implementation will be released at https://github.com/xxmmss/HC-CCL .

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA Open Code

A novel artificial Intelligence-Based model for automated Lenke classification in adolescent idiopathic scoliosis.

Xie K, Zhu S, Lin J, Li Y, Huang J, Lei W, Yan Y

•papers•Jul 11 2025

To develop an artificial intelligence (AI)-driven model for automatic Lenke classification of adolescent idiopathic scoliosis (AIS) and assess its performance. This retrospective study utilized 860 spinal radiographs from 215 AIS patients with four views, including 161 training sets and 54 testing sets. Additionally, 1220 spinal radiographs from 610 patients with only anterior-posterior (AP) and lateral (LAT) views were collected for training. The model was designed to perform keypoint detection, pedicle segmentation, and AIS classification based on a custom classification strategy. Its performance was evaluated against the gold standard using metrics such as mean absolute difference (MAD), intraclass correlation coefficient (ICC), Bland-Altman plots, Cohen's Kappa, and the confusion matrix. In comparison to the gold standard, the MAD for all predicted angles was 2.29°, with an excellent ICC. Bland-Altman analysis revealed minimal differences between the methods. For Lenke classification, the model exhibited exceptional consistency in curve type, lumbar modifier, and thoracic sagittal profile, with average Kappa values of 0.866, 0.845, and 0.827, respectively, and corresponding accuracy rates of 87.07%, 92.59%, and 92.59%. Subgroup analysis further confirmed the model's high consistency, with Kappa values ranging from 0.635 to 0.930, 0.672 to 0.926, and 0.815 to 0.847, and accuracy rates between 90.7 and 98.1%, 92.6-98.3%, and 92.6-98.1%, respectively. This novel AI system facilitates the rapid and accurate automatic Lenke classification, offering potential assistance to spinal surgeons.

X-Ray Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab

An integrated strategy based on radiomics and quantum machine learning: diagnosis and clinical interpretation of pulmonary ground-glass nodules.

Huang X, Xu F, Zhu W, Yao L, He J, Su J, Zhao W, Hu H

•papers•Jul 11 2025

Accurate classification of pulmonary pure ground-glass nodules (pGGNs) is essential for distinguishing invasive adenocarcinoma (IVA) from adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA), which significantly influences treatment decisions. This study aims to develop a high-precision integrated strategy by combining radiomics-based feature extraction, Quantum Machine Learning (QML) models, and SHapley Additive exPlanations (SHAP) analysis to improve diagnostic accuracy and interpretability in pGGN classification. A total of 322 pGGNs from 275 patients were retrospectively analyzed. The CT images was randomly divided into training and testing cohorts (80:20), with radiomic features extracted from the training cohort. Three QML models-Quantum Support Vector Classifier (QSVC), Pegasos QSVC, and Quantum Neural Network (QNN)-were developed and compared with a classical Support Vector Machine (SVM). SHAP analysis was applied to interpret the contribution of radiomic features to the models' predictions. All three QML models outperformed the classical SVM, with the QNN model achieving the highest improvements ([Formula: see text]) in classification metrics, including accuracy (89.23%, 95% CI: 81.54% - 95.38%), sensitivity (96.55%, 95% CI: 89.66% - 100.00%), specificity (83.33%, 95% CI: 69.44% - 94.44%), and area under the curve (AUC) (0.937, 95% CI: 0.871 - 0.983), respectively. SHAP analysis identified Low Gray Level Run Emphasis (LGLRE), Gray Level Non-uniformity (GLN), and Size Zone Non-uniformity (SZN) as the most critical features influencing classification. This study demonstrates that the proposed integrated strategy, combining radiomics, QML models, and SHAP analysis, significantly enhances the accuracy and interpretability of pGGN classification, particularly in small-sample datasets. It offers a promising tool for early, non-invasive lung cancer diagnosis and helps clinicians make more informed treatment decisions. Not applicable.

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Interpretable MRI Subregional Radiomics-Deep Learning Model for Preoperative Lymphovascular Invasion Prediction in Rectal Cancer: A Dual-Center Study.

Huang T, Zeng Y, Jiang R, Zhou Q, Wu G, Zhong J

•papers•Jul 11 2025

Develop a fusion model based on explainable machine learning, combining multiparametric MRI subregional radiomics and deep learning, to preoperatively predict the lymphovascular invasion status in rectal cancer. We collected data from RC patients with histopathological confirmation from two medical centers, with 301 patients used as a training set and 75 patients as an external validation set. Using K-means clustering techniques, we meticulously divided the tumor areas into multiple subregions and extracted crucial radiomic features from them. Additionally, we employed an advanced Vision Transformer (ViT) deep learning model to extract features. These features were integrated to construct the SubViT model. To better understand the decision-making process of the model, we used the Shapley Additive Properties (SHAP) tool to evaluate the model's interpretability. Finally, we comprehensively assessed the performance of the SubViT model through receiver operating characteristic (ROC) curves, decision curve analysis (DCA), and the Delong test, comparing it with other models. In this study, the SubViT model demonstrated outstanding predictive performance in the training set, achieving an area under the curve (AUC) of 0.934 (95% confidence interval: 0.9074 to 0.9603). It also performed well in the external validation set, with an AUC of 0.884 (95% confidence interval: 0.8055 to 0.9616), outperforming both subregion radiomics and imaging-based models. Furthermore, decision curve analysis (DCA) indicated that the SubViT model provides higher clinical utility compared to other models. As an advanced composite model, the SubViT model demonstrated its efficiency in the non-invasive assessment of local vascular invasion (LVI) in rectal cancer.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Filter Papers

Tags

Efficient needle guidance: multi-camera augmented reality navigation without patient-specific calibration.

PanoDiff-SR: Synthesizing Dental Panoramic Radiographs using Diffusion and Super-resolution

Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift

Oriented tooth detection: a CBCT image processing method integrated with RoI transformer.

HNOSeg-XS: Extremely Small Hartley Neural Operator for Efficient and Resolution-Robust 3D Image Segmentation.

Automated MRI protocoling in neuroradiology in the era of large language models.

Semi-supervised Medical Image Segmentation Using Heterogeneous Complementary Correction Network and Confidence Contrastive Learning.

A novel artificial Intelligence-Based model for automated Lenke classification in adolescent idiopathic scoliosis.

An integrated strategy based on radiomics and quantum machine learning: diagnosis and clinical interpretation of pulmonary ground-glass nodules.

Interpretable MRI Subregional Radiomics-Deep Learning Model for Preoperative Lymphovascular Invasion Prediction in Rectal Cancer: A Dual-Center Study.

Ready to Sharpen Your Edge?