
Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?

Hanxue Gu, Yaqian Chen, Nicholas Konz, Qihang Li, Maciej A. Mazurowski

arXiv preprint · Jul 15, 2025
Foundation models, pre-trained on large image datasets and capable of capturing rich feature representations, have recently shown potential for zero-shot image registration. However, their performance has mostly been tested on rigid or less complex structures, such as the brain or abdominal organs, and it remains unclear whether these models can handle more challenging, deformable anatomy. Breast MRI registration is particularly difficult due to significant anatomical variation between patients, deformation caused by patient positioning, and the thin, complex internal structures of fibroglandular tissue, where accurate alignment is crucial. Whether foundation model-based registration algorithms can address this level of complexity remains an open question. In this study, we provide a comprehensive evaluation of foundation model-based registration algorithms for breast MRI. We assess five pre-trained encoders, including DINO-v2, SAM, MedSAM, SSLSAM, and MedCLIP, across four key breast registration tasks that capture variation across imaging dates, sequences, modalities, and patient disease status (lesion versus no lesion). Our results show that foundation model-based algorithms such as SAM outperform traditional registration baselines for overall breast alignment, especially under large domain shifts, but struggle to capture fine details of fibroglandular tissue. Interestingly, additional pre-training or fine-tuning on medical or breast-specific images, as in MedSAM and SSLSAM, does not improve registration performance and may even decrease it in some cases. Further work is needed to understand how domain-specific training influences registration and to explore targeted strategies that improve both global alignment and fine-structure accuracy. We publicly release our code at https://github.com/mazurowski-lab/Foundation-based-reg.
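As an illustration of the general recipe these encoders enable, the sketch below optimizes a dense displacement field against a feature-space similarity computed by a frozen pre-trained encoder. This is a minimal, assumption-laden sketch, not the authors' pipeline: a torchvision ResNet-18 stands in for DINO-v2/SAM, and the loss weights, resolution, and optimizer settings are illustrative.

```python
# Feature-based deformable registration with a frozen encoder (sketch).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

encoder = resnet18(weights="IMAGENET1K_V1")
encoder = torch.nn.Sequential(*list(encoder.children())[:-2]).eval()  # keep feature maps
for p in encoder.parameters():
    p.requires_grad_(False)  # frozen: only the displacement field is optimized

def warp(img, disp):
    """Warp img (N,C,H,W) with a dense displacement field disp (N,2,H,W)."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    return F.grid_sample(img, grid + disp.permute(0, 2, 3, 1), align_corners=True)

fixed = torch.rand(1, 3, 224, 224)   # stand-ins for breast MRI slices
moving = torch.rand(1, 3, 224, 224)
disp = torch.zeros(1, 2, 224, 224, requires_grad=True)
opt = torch.optim.Adam([disp], lr=1e-2)

for _ in range(100):
    opt.zero_grad()
    warped = warp(moving, disp)
    feat_loss = F.mse_loss(encoder(warped), encoder(fixed))  # encoder-space similarity
    smooth = disp.diff(dim=2).abs().mean() + disp.diff(dim=3).abs().mean()
    (feat_loss + 0.1 * smooth).backward()
    opt.step()
```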

Efficient needle guidance: multi-camera augmented reality navigation without patient-specific calibration.

Wei Y, Huang B, Zhao B, Lin Z, Zhou SZ

PubMed paper · Jul 12, 2025
Augmented reality (AR) technology holds significant promise for enhancing surgical navigation in needle-based procedures such as biopsies and ablations. However, most existing AR systems rely on patient-specific markers, which disrupt clinical workflows and require time-consuming preoperative calibrations, thereby hindering operational efficiency and precision. We developed a novel multi-camera AR navigation system that eliminates the need for patient-specific markers by utilizing ceiling-mounted markers mapped to fixed medical imaging devices. A hierarchical optimization framework integrates both marker mapping and multi-camera calibration. Deep learning techniques are employed to enhance marker detection and registration accuracy. Additionally, a vision-based pose compensation method is implemented to mitigate errors caused by patient movement, improving overall positional accuracy. Validation through phantom experiments and simulated clinical scenarios demonstrated an average puncture accuracy of 3.72 ± 1.21 mm. The system reduced needle placement time by 20 s compared to traditional marker-based methods. It also effectively corrected errors induced by patient movement, with a mean positional error of 0.38 pixels and an angular deviation of 0.51°. These results highlight the system's precision, adaptability, and reliability in realistic surgical conditions. This marker-free AR guidance system significantly streamlines surgical workflows while enhancing needle navigation accuracy. Its simplicity, cost-effectiveness, and adaptability make it an ideal solution for both high- and low-resource clinical environments, offering the potential for improved precision, reduced procedural time, and better patient outcomes.
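The marker-free idea rests on chaining rigid transforms: ceiling markers are calibrated once against the fixed imaging device, so any camera that sees a ceiling marker can locate imaging-space targets without a patient-attached marker. Below is a minimal sketch of that transform chain, with illustrative matrices rather than the paper's calibration values.

```python
# Homogeneous-transform chain for marker-free AR targeting (sketch).
import numpy as np

def make_T(R, t):
    """Build a 4x4 homogeneous transform from rotation R (3x3) and translation t (3,)."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# One-time calibration: ceiling marker pose in the imaging-device frame.
T_device_from_marker = make_T(np.eye(3), np.array([0.0, 1.5, 2.0]))

# Per-frame detection: marker pose in the camera frame (e.g., from fiducial detection).
T_camera_from_marker = make_T(np.eye(3), np.array([0.2, 0.1, 1.8]))

# Chain: camera <- marker <- device; no patient-attached marker needed.
T_camera_from_device = T_camera_from_marker @ np.linalg.inv(T_device_from_marker)

# Map an imaging-space needle target into the camera frame for the AR overlay.
target_device = np.array([0.05, -0.02, 0.30, 1.0])  # homogeneous coordinates
target_camera = T_camera_from_device @ target_device
print(target_camera[:3])
```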

AI lesion tracking in PET/CT imaging: a proposal for a Siamese-based CNN pipeline applied to PSMA PET/CT scans.

Hein SP, Schultheiss M, Gafita A, Zaum R, Yagubbayli F, Tauber R, Rauscher I, Eiber M, Pfeiffer F, Weber WA

PubMed paper · Jul 8, 2025
Assessing tumor response to systemic therapies is one of the main applications of PET/CT. Routinely, only a small subset of index lesions out of multiple lesions is analyzed. However, this operator-dependent selection may bias the results due to possible significant inter-metastatic heterogeneity of response to therapy. Automated, AI-based approaches for lesion tracking hold promise in enabling the analysis of many more lesions and thus providing a better assessment of tumor response. This work introduces a Siamese CNN approach for lesion tracking between PET/CT scans. Our approach is applied to the laborious task of tracking a high number of bone lesions in full-body baseline and follow-up [⁶⁸Ga]Ga- or [¹⁸F]F-PSMA PET/CT scans after two cycles of [¹⁷⁷Lu]Lu-PSMA therapy in patients with metastatic castration-resistant prostate cancer. Data preparation includes lesion segmentation and affine registration. Our algorithm extracts suitable lesion patches and forwards them into a Siamese CNN trained to classify the lesion patch pairs as corresponding or non-corresponding lesions. Experiments were performed with different input patch types and with 2D and 3D Siamese networks. The CNN model successfully learned to classify lesion assignments, reaching an accuracy of 83% (AUC = 0.91) in its best configuration. For corresponding lesions, the pipeline achieved a tracking accuracy of 89%. We showed that a CNN can facilitate the tracking of multiple lesions in PSMA PET/CT scans. Future clinical studies are necessary to determine whether this improves the prediction of therapy outcomes.
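A minimal sketch of the Siamese patch classifier described above: twin encoders with shared weights embed baseline and follow-up lesion patches, and a small head classifies the pair as corresponding or not. The layer sizes and patch size are assumptions, not the authors' architecture.

```python
# Siamese CNN for lesion-pair correspondence classification (sketch).
import torch
import torch.nn as nn

class SiameseLesionTracker(nn.Module):
    def __init__(self, patch_size=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * (patch_size // 4) ** 2, 128),
        )
        self.head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, patch_a, patch_b):
        za, zb = self.encoder(patch_a), self.encoder(patch_b)  # shared weights
        return self.head(torch.cat([za, zb], dim=1))  # logit: corresponding pair?

model = SiameseLesionTracker()
baseline = torch.rand(8, 1, 32, 32)   # stand-ins for PET/CT lesion patches
followup = torch.rand(8, 1, 32, 32)
labels = torch.randint(0, 2, (8, 1)).float()
loss = nn.BCEWithLogitsLoss()(model(baseline, followup), labels)
loss.backward()
```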

A novel framework for fully-automated co-registration of intravascular ultrasound and optical coherence tomography imaging data

Xingwei He, Kit Mills Bransby, Ahmet Emir Ulutas, Thamil Kumaran, Nathan Angelo Lecaros Yap, Gonul Zeren, Hesong Zeng, Yaojun Zhang, Andreas Baumbach, James Moon, Anthony Mathur, Jouke Dijkstra, Qianni Zhang, Lorenz Raber, Christos V Bourantas

arXiv preprint · Jul 8, 2025
Aims: To develop a deep-learning (DL) framework that allows fully automated longitudinal and circumferential co-registration of intravascular ultrasound (IVUS) and optical coherence tomography (OCT) images. Methods and results: Data from 230 patients (714 vessels) with acute coronary syndrome who underwent near-infrared spectroscopy (NIRS)-IVUS and OCT imaging in their non-culprit vessels were included in the present analysis. The lumen borders annotated by expert analysts in 61,655 NIRS-IVUS and 62,334 OCT frames, and the side branches and calcific tissue identified in 10,000 NIRS-IVUS frames and 10,000 OCT frames, were used to train DL solutions for the automated extraction of these features. The trained DL solutions were used to process NIRS-IVUS and OCT images, and their output was used by a dynamic time warping algorithm to co-register the NIRS-IVUS and OCT images longitudinally, while the circumferential registration of the IVUS and OCT was optimized through dynamic programming. On a test set of 77 vessels from 22 patients, the DL method showed high concordance with the expert analysts for the longitudinal and circumferential co-registration of the two imaging sets (concordance correlation coefficient >0.99 for the longitudinal and >0.90 for the circumferential co-registration). The Williams Index was 0.96 for longitudinal and 0.97 for circumferential co-registration, indicating performance comparable to the analysts. The time needed for the DL pipeline to process imaging data from a vessel was < 90 s. Conclusion: The fully automated, DL-based framework introduced in this study for the co-registration of IVUS and OCT is fast and provides estimations that compare favorably to the expert analysts. These features render it useful for the analysis of large-scale data collected in studies that incorporate multimodality imaging to characterize plaque composition.
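For the longitudinal step, dynamic time warping aligns per-frame feature sequences from the two pullbacks. The sketch below implements classic DTW on a pairwise cost matrix; using lumen area as the per-frame feature is an assumption, since the abstract does not specify the exact cost function.

```python
# Dynamic time warping for longitudinal IVUS/OCT frame alignment (sketch).
import numpy as np

def dtw_path(cost):
    """Classic DTW on a pairwise cost matrix; returns the warping path."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    path, (i, j) = [], (n, m)
    while (i, j) != (0, 0):  # backtrack along the cheapest predecessors
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j), (i, j - 1), (i - 1, j - 1)], key=lambda p: acc[p])
    return path[::-1]

ivus_area = np.random.rand(50)   # stand-in per-frame lumen areas
oct_area = np.random.rand(60)
cost = np.abs(ivus_area[:, None] - oct_area[None, :])
alignment = dtw_path(cost)       # list of (ivus_frame, oct_frame) pairs
```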

AI-enhanced patient-specific dosimetry in I-131 planar imaging with a single oblique view.

Jalilifar M, Sadeghi M, Emami-Ardekani A, Bitarafan-Rajabi A, Geravand K, Geramifar P

PubMed paper · Jul 8, 2025
This study aims to enhance dosimetry accuracy in ¹³¹I planar imaging by utilizing a single oblique view and Monte Carlo (MC)-validated dose point kernels (DPKs), alongside the integration of artificial intelligence (AI) for accurate dose prediction within planar imaging. Forty patients with thyroid cancer post-thyroidectomy and 30 with neuroendocrine tumors underwent planar and SPECT/CT imaging. Organ thicknesses were estimated from whole-body (WB) planar images with an additional oblique view. DPKs and organ-specific S-values were used to estimate the absorbed doses. Six AI algorithms were used for dose estimation: multilayer perceptron (MLP), linear regression, support vector regression, decision tree, convolutional neural network, and U-Net. Planar image counts, body thickness, patient BMI, age, S-values, and tissue attenuation coefficients served as inputs to the AI algorithms. To provide the ground truth, CT-based segmentation generated binary masks for each organ, and the corresponding SPECT images were used for GATE MC dosimetry. The MLP-predicted dose values showed superior performance across all organs, with the lowest mean absolute error in the liver and higher errors in the spleen and salivary glands. Notably, MLP-based dose estimations closely matched ground truth data, with differences < 15% in most tissues. The MLP-estimated dose values present a robust patient-specific dosimetry approach capable of swiftly predicting absorbed doses in different organs using WB planar images and a single oblique view. This approach facilitates the implementation of 2D planar imaging as a pre-therapeutic technique for a more accurate assessment of the administered activity.
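A minimal sketch of the best-performing approach, an MLP regressing organ absorbed dose from the listed inputs; the feature layout, layer sizes, and data here are stand-in assumptions.

```python
# MLP regression of organ absorbed dose from planar-imaging features (sketch).
from sklearn.neural_network import MLPRegressor
import numpy as np

rng = np.random.default_rng(0)
# Columns: planar counts, organ thickness, BMI, age, S-value, attenuation coefficient.
X = rng.random((70, 6))            # 70 patients (stand-in features)
y = rng.random(70)                 # GATE Monte Carlo doses (stand-in ground truth)

mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
mlp.fit(X[:56], y[:56])            # simple train/test split
pred = mlp.predict(X[56:])
mae = np.abs(pred - y[56:]).mean() # compare against MC ground truth
print(f"MAE: {mae:.3f}")
```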

Uncertainty and normalized glandular dose evaluations in digital mammography and digital breast tomosynthesis with a machine learning methodology.

Sarno A, Massera RT, Paternò G, Cardarelli P, Marshall N, Bosmans H, Bliznakova K

PubMed paper · Jul 8, 2025
To predict the normalized glandular dose (DgN) coefficients and the related uncertainty in mammography and digital breast tomosynthesis (DBT) using a machine learning algorithm and patient-like digital breast models. 126 patient-like digital breast phantoms were used for DgN Monte Carlo ground truth calculations. An Automatic Relevance Determination Regression algorithm was used to predict DgN from anatomical breast features. These features included compressed breast thickness, glandular fraction by volume, glandular volume, and the center of mass and standard deviation of the glandular tissue distribution in the cranio-caudal direction. A data imputation algorithm was explored to allow omitting the latter two features. Five-fold cross-validation showed that the predictive model estimates DgN with a 1% average difference from the ground truth; this difference was less than 3% in 50% of the cases. The average uncertainty of the estimated DgN values was 9%. Excluding the information related to the glandular distribution increased this uncertainty to 17% without inducing a significant discrepancy in the estimated DgN values, with half of the predicted cases differing from the ground truth by less than 9%. The data imputation algorithm reduced the estimated uncertainty without restoring the original performance. Predictive performance improved with increasing tube voltage. The proposed methodology predicts DgN in mammography and DBT for patient-derived breasts with an uncertainty below 9%. On the prediction tests, the average difference from the ground truth was 1%, with 50% of the cohort cases differing by less than 5%.
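scikit-learn's ARDRegression illustrates the core mechanism here: it returns both a point estimate and a predictive standard deviation, matching the paper's uncertainty reporting. A minimal sketch with random stand-in data follows (the feature list mirrors the abstract; the values are not the phantom data).

```python
# Automatic Relevance Determination regression of DgN with uncertainty (sketch).
from sklearn.linear_model import ARDRegression
import numpy as np

rng = np.random.default_rng(1)
# Features: thickness, glandular fraction, glandular volume, CC center of mass,
# CC standard deviation of the glandular distribution.
X = rng.random((126, 5))          # 126 patient-like phantoms (stand-ins)
dgn = rng.random(126)             # Monte Carlo DgN ground truth (stand-in)

ard = ARDRegression()
ard.fit(X[:100], dgn[:100])
pred, std = ard.predict(X[100:], return_std=True)  # per-phantom uncertainty
print(f"mean relative uncertainty: {(std / pred).mean():.1%}")
```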

Development and validation of an improved volumetric breast density estimation model using the ResNet technique.

Asai Y, Yamamuro M, Yamada T, Kimura Y, Ishii K, Nakamura Y, Otsuka Y, Kondo Y

PubMed paper · Jul 7, 2025
Temporal changes in volumetric breast density (VBD) may serve as prognostic biomarkers for predicting the risk of future breast cancer development. However, accurately measuring VBD from archived X-ray mammograms remains challenging. In a previous study, we proposed a method to estimate volumetric breast density using imaging parameters (tube voltage, tube current, and exposure time) and patient age. This approach, based on a multiple regression model, achieved a determination coefficient (R²) of 0.868. Approach: In this study, we developed and applied machine learning models (Random Forest, XGBoost) and the deep learning model Residual Network (ResNet) to the same dataset. Model performance was assessed using several metrics: determination coefficient, correlation coefficient, root mean square error, mean absolute error, root mean square percentage error, and mean absolute percentage error. Five-fold cross-validation was conducted to ensure robust validation. Main results: The best-performing fold yielded R² values of 0.895, 0.907, and 0.918 for Random Forest, XGBoost, and ResNet, respectively, all surpassing the previous study's results. ResNet consistently achieved the lowest error values across all metrics. Significance: These findings suggest that ResNet successfully achieved the task of accurately determining VBD from past mammography, a task that has not been realised to date. We are confident that this achievement contributes to advancing research aimed at predicting future risks of breast cancer development by enabling high-accuracy time-series analyses of retrospective VBD.
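Applying a ResNet to these tabular inputs presumably means residual learning with skip connections around small fully connected blocks; the abstract does not detail the configuration, so the sketch below is an assumption-based illustration of that design.

```python
# Residual MLP regressing VBD from tabular imaging parameters (sketch).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return torch.relu(x + self.fc(x))  # identity skip connection

model = nn.Sequential(
    nn.Linear(4, 64),                 # tube voltage, tube current, exposure time, age
    ResidualBlock(), ResidualBlock(),
    nn.Linear(64, 1),                 # volumetric breast density estimate
)
x = torch.rand(16, 4)
vbd = model(x)
```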

DHR-Net: Dynamic Harmonized registration network for multimodal medical images.

Yang X, Li D, Chen S, Deng L, Wang J, Huang S

PubMed paper · Jul 5, 2025
Deep learning has driven remarkable advancements in medical image registration, and deep neural network-based non-rigid deformation field generation methods demonstrate high accuracy in single-modality scenarios. However, multi-modal medical image registration still faces critical challenges. To address the insufficient anatomical consistency and unstable deformation field optimization of existing methods in cross-modal registration tasks, this paper proposes an end-to-end medical image registration method based on a Dynamic Harmonized Registration framework (DHR-Net). DHR-Net employs a cascaded two-stage architecture comprising a translation network and a registration network that operate in sequential processing phases. Furthermore, we propose a loss function based on the Noise Contrastive Estimation framework, which enhances anatomical consistency in cross-modal translation by maximizing mutual information between input and transformed image patches. This loss function incorporates a dynamic temperature adjustment mechanism that progressively tightens feature contrast constraints during training to improve high-frequency detail preservation, thereby better constraining the topological structure of target images. Experiments conducted on the M&M Heart Dataset demonstrate that DHR-Net outperforms existing methods in registration accuracy, deformation field smoothness, and cross-modal robustness. The framework significantly enhances the registration quality of cardiac images while preserving anatomical structures exceptionally well, exhibiting promising potential for clinical applications.
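The contrastive loss with a dynamic temperature can be illustrated with a standard InfoNCE formulation: matching input/translated patch embeddings sit on the diagonal of a similarity matrix, and the temperature is annealed during training. The embedding source and annealing schedule below are assumptions, not the paper's exact mechanism.

```python
# InfoNCE loss with an annealed temperature (sketch).
import torch
import torch.nn.functional as F

def info_nce(z_src, z_trans, temperature):
    """z_src, z_trans: (N, D) embeddings of matching input/translated patch pairs."""
    z_src = F.normalize(z_src, dim=1)
    z_trans = F.normalize(z_trans, dim=1)
    logits = z_src @ z_trans.t() / temperature   # (N, N) similarity matrix
    targets = torch.arange(z_src.size(0))        # positives on the diagonal
    return F.cross_entropy(logits, targets)

# In training, z_a/z_b would come from the translation network's patch embeddings.
z_a, z_b = torch.rand(32, 128), torch.rand(32, 128)
for step in range(1000):
    # Dynamic temperature: start soft (0.5), tighten toward 0.05 over training.
    tau = max(0.05, 0.5 - 0.45 * step / 1000)
    loss = info_nce(z_a, z_b, tau)
```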

Joint Shape Reconstruction and Registration via a Shared Hybrid Diffeomorphic Flow.

Shi H, Wang P, Zhang S, Zhao X, Yang B, Zhang C

PubMed paper · Jul 3, 2025
Deep implicit functions (DIFs) effectively represent shapes by using a neural network to map 3D spatial coordinates to scalar values that encode the shape's geometry, but it is difficult to establish correspondences between shapes directly, limiting their use in medical image registration. Recently presented deformation field-based methods learn implicit templates by combining template field learning with DIFs and deformation field learning, establishing shape correspondence through deformation fields. Although these approaches enable joint learning of shape representation and shape correspondence, the decoupled optimization of the template and deformation fields, caused by the absence of deformation annotations, leads to a relatively accurate template field but an under-optimized deformation field. In this paper, we propose a novel implicit template learning framework via a shared hybrid diffeomorphic flow (SHDF), which enables shared optimization of deformation and template, contributing to better deformations and shape representation. Specifically, we formulate the signed distance function (SDF, a type of DIF) as a one-dimensional (1D) integral, unifying dimensions to match the form used in solving the ordinary differential equation (ODE) for deformation field learning. The SDF in 1D integral form is then integrated seamlessly into deformation field learning. Using a recurrent learning strategy, we frame shape representations and deformations as solving different initial value problems of the same ODE. We also introduce a global smoothness regularization to handle local optima due to limited outside-of-shape data. Experiments on medical datasets show that SHDF outperforms state-of-the-art methods in shape representation and registration.
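The shared-ODE view can be illustrated by integrating points through a velocity field: deformations (and, in the paper, the SDF itself) arise from solving initial value problems of the same ODE. The sketch below uses plain Euler steps and a stand-in velocity network; the step count and field are illustrative, not the authors' solver.

```python
# Euler integration of points through a learned velocity field (sketch).
import torch

def integrate(points, velocity_fn, steps=8, t1=1.0):
    """Euler-integrate points (N,3) through dx/dt = v(x) from t=0 to t=t1."""
    dt = t1 / steps
    x = points
    for _ in range(steps):
        x = x + dt * velocity_fn(x)
    return x

velocity = torch.nn.Sequential(  # stand-in for the learned velocity field
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 3)
)
template_pts = torch.rand(1000, 3)
deformed_pts = integrate(template_pts, velocity)  # template -> instance shape
```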

A novel few-shot learning framework for supervised diffeomorphic image registration network.

Chen K, Han H, Wei J, Zhang Y

PubMed paper · Jul 2, 2025
Image registration is a key technique in image processing and analysis. Due to its high complexity, traditional registration frameworks often fail to meet real-time demands in practice. To meet these demands, several deep learning registration networks have been proposed, both supervised and unsupervised. Unsupervised networks rely on large amounts of training data to minimize specific loss functions, but the lack of physical information constraints results in lower accuracy compared with supervised networks. Supervised networks for medical image registration, however, face two major challenges: physical mesh folding and the scarcity of labeled training data. To address these two challenges, we propose a novel few-shot learning framework for image registration. The framework contains two parts: a random diffeomorphism generator (RDG) and a supervised few-shot learning network for image registration. By randomly generating a complex vector field, the RDG produces a series of diffeomorphisms. With the diffeomorphisms generated by the RDG, only a few images (theoretically, a single image) are needed to generate a series of labels for training the supervised few-shot learning network. To eliminate physical mesh folding, the loss function of the proposed network only needs to ensure smoothness of the deformation; no other control for mesh folding elimination is necessary. The experimental results indicate that the proposed method demonstrates superior performance in eliminating physical mesh folding compared to other existing learning-based methods. Our code is available at https://github.com/weijunping111/RDG-TMI.git.
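A minimal sketch of the RDG idea: sample a random coarse velocity field, upsample it, and integrate by scaling and squaring to obtain a folding-free deformation; warping a single image with it yields a labeled training pair. The smoothing, scale, and step count are assumptions, not the authors' generator.

```python
# Random diffeomorphism generation via scaling and squaring (sketch).
import torch
import torch.nn.functional as F

def random_diffeomorphism(h, w, scale=2.0, squarings=6):
    v = torch.randn(1, 2, h // 8, w // 8)                       # coarse random noise
    v = F.interpolate(v, size=(h, w), mode="bilinear", align_corners=True)
    v = scale * v / (2 ** squarings)                            # small initial step
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs, ys), -1).unsqueeze(0)
    disp = v.permute(0, 2, 3, 1)
    for _ in range(squarings):                                   # compose the field with itself
        disp = disp + F.grid_sample(disp.permute(0, 3, 1, 2), grid + disp,
                                    align_corners=True).permute(0, 2, 3, 1)
    return disp                                                  # (1, H, W, 2) displacement

img = torch.rand(1, 1, 128, 128)                                 # the single labeled image
disp = random_diffeomorphism(128, 128)
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 128), torch.linspace(-1, 1, 128), indexing="ij")
grid = torch.stack((xs, ys), -1).unsqueeze(0)
warped = F.grid_sample(img, grid + disp, align_corners=True)     # training pair: (img, warped), label = disp
```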
