Page 1 of 983 results

SurgPointTransformer: transformer-based vertebra shape completion using RGB-D imaging.

Massalimova A, Liebmann F, Jecklin S, Carrillo F, Farshad M, Fürnstahl P

PubMed | Papers | Dec 1 2025
State-of-the-art computer- and robot-assisted surgery systems rely on intraoperative imaging technologies such as computed tomography and fluoroscopy to provide detailed 3D visualizations of patient anatomy. However, these methods expose both patients and clinicians to ionizing radiation. This study introduces a radiation-free approach for 3D spine reconstruction using RGB-D data. Inspired by the "mental map" surgeons form during procedures, we present SurgPointTransformer, a shape completion method that reconstructs unexposed spinal regions from sparse surface observations. The method begins with a vertebra segmentation step that extracts vertebra-level point clouds for subsequent shape completion. SurgPointTransformer then uses an attention mechanism to learn the relationship between visible surface features and the complete spine structure. The approach is evaluated on an ex vivo dataset comprising nine samples, with CT-derived data used as ground truth. SurgPointTransformer significantly outperforms state-of-the-art baselines, achieving a Chamfer distance of 5.39 mm, an F-score of 0.85, an Earth mover's distance of 11.00 and a signal-to-noise ratio of 22.90 dB. These results demonstrate the potential of our method to reconstruct 3D vertebral shapes without exposing patients to ionizing radiation. This work contributes to the advancement of computer-aided and robot-assisted surgery by enhancing system perception and intelligence.
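
The Chamfer distance and F-score quoted above are point-cloud metrics. The abstract does not say which variant of each was used, so the following is only a minimal sketch of one common definition (symmetric mean nearest-neighbour distance, and an F-score at a distance threshold). The function names and the `threshold` parameter are illustrative assumptions, not the authors' implementation.

```python
import math

def nearest_distances(source, target):
    """For each point in source, the Euclidean distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbour distance in both directions.
    Assumption: unsquared distances, summed over the two directions."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)

def f_score(pred, truth, threshold):
    """Harmonic mean of precision and recall, where a point counts as matched
    if its nearest neighbour in the other cloud lies within threshold."""
    precision = sum(d <= threshold for d in nearest_distances(pred, truth)) / len(pred)
    recall = sum(d <= threshold for d in nearest_distances(truth, pred)) / len(truth)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Tiny hypothetical clouds: one point matches exactly, one is off by 1 unit
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(pred, truth)
fs = f_score(pred, truth, threshold=0.5)
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is fine for illustration but would normally be replaced by a spatial index for real vertebra point clouds.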

URFM: A general Ultrasound Representation Foundation Model for advancing ultrasound image diagnosis.

Kang Q, Lao Q, Gao J, Bao W, He Z, Du C, Lu Q, Li K

PubMed | Papers | Aug 15 2025
Ultrasound imaging is critical for clinical diagnostics, providing insights into various diseases and organs. However, artificial intelligence (AI) in this field faces challenges, such as the need for large labeled datasets and limited task-specific model applicability, particularly due to ultrasound's low signal-to-noise ratio (SNR). To overcome these challenges, we introduce the Ultrasound Representation Foundation Model (URFM), designed to learn robust, generalizable representations from unlabeled ultrasound images, enabling label-efficient adaptation to diverse diagnostic tasks. URFM is pre-trained on over 1M images from 15 major anatomical organs using representation-based masked image modeling (MIM), an advanced self-supervised learning technique. Unlike traditional pixel-based MIM, URFM integrates high-level representations from BiomedCLIP, a specialized medical vision-language model, to address the low SNR issue. Extensive evaluation shows that URFM outperforms state-of-the-art methods, offering enhanced generalization, label efficiency, and training-time efficiency. URFM's scalability and flexibility signal a significant advancement in diagnostic accuracy and clinical workflow optimization in ultrasound imaging.

SKOOTS: Skeleton oriented object segmentation for mitochondria

Buswinka, C. J., Osgood, R. T., Nitta, H., Indzhykulian, A. A.

bioRxiv | Preprint | Aug 13 2025
Segmenting individual instances of mitochondria from imaging datasets can provide rich quantitative information, but is prohibitively time-consuming when done manually, prompting interest in the development of automated algorithms using deep neural networks. Existing solutions for various segmentation tasks are optimized for either: high-resolution three-dimensional imaging, relying on well-defined object boundaries (e.g., whole neuron segmentation in volumetric electron microscopy datasets); or low-resolution two-dimensional imaging, boundary-invariant but poorly suited to large 3D objects (e.g., whole-cell segmentation of light microscopy images). Mitochondria in whole-cell 3D electron microscopy datasets often lie in the middle ground - large, yet with ambiguous borders, challenging current segmentation tools. To address this, we developed skeleton-oriented object segmentation (SKOOTS) - a novel approach that efficiently segments large, densely packed mitochondria. SKOOTS accurately and efficiently segments mitochondria in previously difficult contexts and can also be applied to segment other objects in 3D light microscopy datasets. This approach bridges a critical gap between existing segmentation approaches, improving the utility of automated analysis of three-dimensional biomedical imaging data. We demonstrate the utility of SKOOTS by applying it to segment over 15,000 cochlear hair cell mitochondria across experimental conditions in under 2 hours on a consumer-grade PC, enabling downstream morphological analysis that revealed subtle structural changes following aminoglycoside exposure - differences not detectable using analysis approaches currently used in the field.

Dynamic Survival Prediction using Longitudinal Images based on Transformer

Bingfan Liu, Haolun Shi, Jiguo Cao

arXiv | Preprint | Aug 12 2025
Survival analysis utilizing multiple longitudinal medical images plays a pivotal role in the early detection and prognosis of diseases by providing insight beyond single-image evaluations. However, current methodologies often inadequately utilize censored data, overlook correlations among longitudinal images measured over multiple time points, and lack interpretability. We introduce SurLonFormer, a novel Transformer-based neural network that integrates longitudinal medical imaging with structured data for survival prediction. Our architecture comprises three key components: a Vision Encoder for extracting spatial features, a Sequence Encoder for aggregating temporal information, and a Survival Encoder based on the Cox proportional hazards model. This framework effectively incorporates censored data, addresses scalability issues, and enhances interpretability through occlusion sensitivity analysis and dynamic survival prediction. Extensive simulations and a real-world application in Alzheimer's disease analysis demonstrate that SurLonFormer achieves superior predictive performance and successfully identifies disease-related imaging biomarkers.
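
The abstract states that the Survival Encoder is based on the Cox proportional hazards model, which is how censored subjects can still contribute to training. As a hedged illustration only, the sketch below computes the standard Cox negative log partial likelihood (Breslow-style handling of ties) from per-subject risk scores. The function name, argument layout, and averaging over events are assumptions; the paper's actual loss may differ.

```python
import math

def cox_neg_log_partial_likelihood(times, events, risk_scores):
    """Average negative log partial likelihood of the Cox model.

    times:       observed time for each subject (event or censoring time)
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: model output (log relative hazard) for each subject
    """
    loss = 0.0
    n_events = 0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            # Censored subjects contribute no term of their own,
            # but they still appear in the risk sets of earlier events.
            continue
        # Risk set: everyone still under observation at time t_i
        risk_set_sum = sum(
            math.exp(risk_scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        loss -= risk_scores[i] - math.log(risk_set_sum)
        n_events += 1
    return loss / n_events if n_events else 0.0

# Hypothetical toy cohort: subject 1 is censored at time 2
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
uninformative = cox_neg_log_partial_likelihood(times, events, [0.0, 0.0, 0.0])
informative = cox_neg_log_partial_likelihood(times, events, [1.0, 0.0, 0.0])
```

In the toy cohort, giving the earliest failing subject a higher risk score lowers the loss, which is the ordering behaviour a survival model is trained to produce.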

Switchable Deep Beamformer for High-quality and Real-time Passive Acoustic Mapping.

Zeng Y, Li J, Zhu H, Lu S, Li J, Cai X

PubMed | Papers | Aug 12 2025
Passive acoustic mapping (PAM) is a promising tool for monitoring acoustic cavitation activities in ultrasound therapy applications. Data-adaptive beamformers for PAM have better image quality compared with time exposure acoustics (TEA) algorithms. However, data-adaptive beamformers are computationally expensive. In this work, we develop a deep beamformer based on a generative adversarial network that can switch between different transducer arrays and reconstruct high-quality PAM images directly from radiofrequency ultrasound signals with low computational cost. The deep beamformer was trained on a dataset consisting of simulated and experimental cavitation signals of single and multiple microbubble clouds measured by different (linear and phased) arrays covering 1-15 MHz. We compared the performance of the deep beamformer to TEA and three different data-adaptive beamformers using simulated and experimental test datasets. Compared with TEA, the deep beamformer reduced the energy spread area by 27.3%-77.8% and improved the image signal-to-noise ratio by 13.9-25.1 dB on average for the different arrays in our data. Compared with the data-adaptive beamformers, the deep beamformer reduced the computational cost by three orders of magnitude, achieving an image reconstruction time of 10.5 ms in our data, while the image quality was as good as that of the data-adaptive beamformers. These results demonstrate the potential of the deep beamformer for high-resolution monitoring of microbubble cavitation activities for ultrasound therapy.

CMVFT: A Multi-Scale Attention Guided Framework for Enhanced Keratoconus Suspect Classification in Multi-View Corneal Topography.

Lu Y, Li B, Zhang Y, Qi Y, Shi X

PubMed | Papers | Aug 11 2025
This retrospective cross-sectional study aimed to develop a multi-view fusion framework that effectively identifies suspect keratoconus cases and facilitates early clinical intervention. The dataset comprised 573 corneal topography maps representing eyes classified as normal, suspect, or keratoconus. We designed the Corneal Multi-View Fusion Transformer (CMVFT), which integrates features from seven standard corneal topography maps. A pretrained ResNet-50 extracts single-view representations that are further refined by a custom-designed Multi-Scale Attention Module (MSAM). This integrated design specifically compensates for the representation gap commonly encountered when applying Transformers to small-sample corneal topography datasets by dynamically bridging local convolution-based feature extraction with global self-attention mechanisms. A subsequent fusion Transformer then models long-range dependencies across views for comprehensive multi-view feature integration. The primary outcome measure was the framework's ability to differentiate suspect cases from normal and keratoconus cases, thereby creating a pathway for early clinical intervention. Experimental evaluation demonstrated that CMVFT effectively distinguishes suspect cases within a feature space characterized by overlapping attributes. Ablation studies confirmed that both the MSAM and the fusion Transformer are essential for robust multi-view feature integration, successfully compensating for potential representation shortcomings in small datasets. This study is the first to apply a Transformer-driven multi-view fusion approach in corneal topography analysis. By compensating for the representation gap inherent in small-sample settings, CMVFT shows promise in enabling the identification of suspect keratoconus cases and supporting early clinical intervention.

SOFA: Deep Learning Framework for Simulating and Optimizing Atrial Fibrillation Ablation

Yunsung Chung, Chanho Lim, Ghassan Bidaoui, Christian Massad, Nassir Marrouche, Jihun Hamm

arXiv | Preprint | Aug 11 2025
Atrial fibrillation (AF) is a prevalent cardiac arrhythmia often treated with catheter ablation procedures, but procedural outcomes are highly variable. Evaluating and improving ablation efficacy is challenging due to the complex interaction between patient-specific tissue and procedural factors. This paper asks two questions: Can AF recurrence be predicted by simulating the effects of procedural parameters? How should we ablate to reduce AF recurrence? We propose SOFA (Simulating and Optimizing Atrial Fibrillation Ablation), a novel deep-learning framework that addresses these questions. SOFA first simulates the outcome of an ablation strategy by generating a post-ablation image depicting scar formation, conditioned on a patient's pre-ablation LGE-MRI and the specific procedural parameters used (e.g., ablation locations, duration, temperature, power, and force). During this simulation, it predicts AF recurrence risk. Critically, SOFA then introduces an optimization scheme that refines these procedural parameters to minimize the predicted risk. Our method leverages a multi-modal, multi-view generator that processes 2.5D representations of the atrium. Quantitative evaluations show that SOFA accurately synthesizes post-ablation images and that our optimization scheme leads to a 22.18% reduction in the model-predicted recurrence risk. To the best of our knowledge, SOFA is the first framework to integrate the simulation of procedural effects, recurrence prediction, and parameter optimization, offering a novel tool for personalizing AF ablation.

Unsupervised learning for inverse problems in computed tomography

Laura Hellwege, Johann Christopher Engster, Moritz Schaar, Thorsten M. Buzug, Maik Stille

arXiv | Preprint | Aug 7 2025
This study presents an unsupervised deep learning approach for computed tomography (CT) image reconstruction, leveraging the inherent similarities between deep neural network training and conventional iterative reconstruction methods. By incorporating forward and backward projection layers within the deep learning framework, we demonstrate the feasibility of reconstructing images from projection data without relying on ground-truth images. Our method is evaluated on the two-dimensional 2DeteCT dataset, showcasing superior performance in terms of mean squared error (MSE) and structural similarity index (SSIM) compared to traditional filtered backprojection (FBP) and maximum likelihood (ML) reconstruction techniques. Additionally, our approach significantly reduces reconstruction time, making it a promising alternative for real-time medical imaging applications. Future work will focus on extending this methodology to three-dimensional reconstructions and enhancing the adaptability of the projection geometry.
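
The idea of reconstructing without ground-truth images can be illustrated by a loss defined purely in the projection domain: forward-project the current estimate, compare it with the measured projections, and back-project the residual as the update. The sketch below does this by plain gradient descent on the pixels of a 2x2 image. It involves no neural network, and the system matrix `A` (row sums and column sums) is a hypothetical toy geometry rather than the geometry of the 2DeteCT dataset.

```python
# Hypothetical toy system matrix: two "views" of a flattened 2x2 image
# (first two rows: horizontal ray sums; last two rows: vertical ray sums)
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def forward_project(image):
    """Apply A: map image pixels to projection values."""
    return [sum(a * x for a, x in zip(row, image)) for row in A]

def back_project(projections):
    """Apply the transpose of A: smear projection values back over the pixels."""
    return [
        sum(A[i][j] * projections[i] for i in range(len(A)))
        for j in range(len(A[0]))
    ]

def reconstruct(measured, steps=300, step_size=0.2):
    """Minimise 0.5 * ||A x - measured||^2 by gradient descent.
    Only the measured projections are used; no reference image is needed."""
    image = [0.0] * len(A[0])
    for _ in range(steps):
        residual = [p - m for p, m in zip(forward_project(image), measured)]
        gradient = back_project(residual)
        image = [x - step_size * g for x, g in zip(image, gradient)]
    return image

true_image = [1.0, 2.0, 3.0, 4.0]       # used only to simulate measurements
measured = forward_project(true_image)
recon = reconstruct(measured)
```

Two views do not determine a 2x2 image uniquely in general; the exact recovery in this toy case is a property of the chosen image. What the sketch is meant to show is only that the data-consistency loss can be driven to zero using the forward and backward projection operators alone.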

Quantum annealing feature selection on light-weight medical image datasets.

Nau MA, Nutricati LA, Camino B, Warburton PA, Maier AK

PubMed | Papers | Aug 7 2025
We investigate the use of quantum computing algorithms on real quantum hardware to tackle the computationally intensive task of feature selection for light-weight medical image datasets. Feature selection is often formulated as a k of n selection problem, where the complexity grows binomially with increasing k and n. Quantum computers, particularly quantum annealers, are well suited to such problems and may offer advantages under certain problem formulations. We present a method to solve larger feature selection instances than previously demonstrated on commercial quantum annealers. Our approach combines a linear Ising penalty mechanism with subsampling and thresholding techniques to enhance scalability. The method is tested in a toy problem where feature selection identifies pixel masks used to reconstruct small-scale medical images. We compare our approach against a range of feature selection strategies, including randomized baselines, classical supervised and unsupervised methods, combinatorial optimization via classical and quantum solvers, and learning-based feature representations. The results indicate that quantum annealing-based feature selection is effective for this simplified use case, demonstrating its potential in high-dimensional optimization tasks. However, its applicability to broader, real-world problems remains uncertain, given the current limitations of quantum computing hardware. While learned feature representations such as autoencoders achieve superior reconstruction performance, they do not offer the same level of interpretability or direct control over input feature selection as our approach.
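
To make the k of n formulation concrete, the sketch below writes feature selection as a binary optimization problem and solves a tiny instance by exhaustive search. Note the substitution: it uses the common quadratic penalty `(sum(x) - k)**2` to enforce the selection size, not the linear Ising penalty described in the abstract, and the relevance and redundancy values are made-up illustrative numbers.

```python
from itertools import product
from math import comb

def qubo_energy(x, relevance, redundancy, k, penalty):
    """Energy of a binary selection vector x (lower is better).

    relevance:  per-feature reward for selecting feature i
    redundancy: dict mapping (i, j) pairs to a cost for selecting both
    penalty:    weight of the quadratic term that pushes sum(x) towards k
    """
    linear = -sum(r * xi for r, xi in zip(relevance, x))
    pairwise = sum(w * x[i] * x[j] for (i, j), w in redundancy.items())
    constraint = penalty * (sum(x) - k) ** 2
    return linear + pairwise + constraint

def brute_force_select(relevance, redundancy, k, penalty):
    """Enumerate all 2**n binary vectors; only feasible for very small n."""
    n = len(relevance)
    return min(
        product((0, 1), repeat=n),
        key=lambda x: qubo_energy(x, relevance, redundancy, k, penalty),
    )

# Hypothetical instance: features 0 and 1 are both relevant but redundant
relevance = [0.9, 0.8, 0.1, 0.7]
redundancy = {(0, 1): 0.9}
best = brute_force_select(relevance, redundancy, k=2, penalty=2.0)
n_subsets = comb(4, 2)  # number of size-k subsets, the binomial growth noted above
```

Exhaustive search stands in for the annealer here; on real hardware the same energy function would be mapped to qubit biases and couplings instead of being enumerated.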

Conditional Diffusion Model with Anatomical-Dose Dual Constraints for End-to-End Multi-Tumor Dose Prediction

Hui Xie, Haiqin Hu, Lijuan Ding, Qing Li, Yue Sun, Tao Tan

arXiv | Preprint | Aug 4 2025
Radiotherapy treatment planning often relies on time-consuming, trial-and-error adjustments that heavily depend on the expertise of specialists, while existing deep learning methods face limitations in generalization, prediction accuracy, and clinical applicability. To tackle these challenges, we propose ADDiff-Dose, an Anatomical-Dose Dual Constraints Conditional Diffusion Model for end-to-end multi-tumor dose prediction. The model employs LightweightVAE3D to compress high-dimensional CT data and integrates multimodal inputs, including target and organ-at-risk (OAR) masks and beam parameters, within a progressive noise addition and denoising framework. It incorporates conditional features via a multi-head attention mechanism and utilizes a composite loss function combining MSE, conditional terms, and KL divergence to ensure both dosimetric accuracy and compliance with clinical constraints. Evaluation on a large-scale public dataset (2,877 cases) and three external institutional cohorts (450 cases in total) demonstrates that ADDiff-Dose significantly outperforms traditional baselines, achieving an MAE of 0.101-0.154 (compared to 0.316 for UNet and 0.169 for GAN models), a DICE coefficient of 0.927 (a 6.8% improvement), and limiting spinal cord maximum dose error to within 0.1 Gy. The average plan generation time per case is reduced to 22 seconds. Ablation studies confirm that the structural encoder enhances compliance with clinical dose constraints by 28.5%. To our knowledge, this is the first study to introduce a conditional diffusion model framework for radiotherapy dose prediction, offering a generalizable and efficient solution for automated treatment planning across diverse tumor sites, with the potential to substantially reduce planning time and improve clinical workflow efficiency.
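
For readers unfamiliar with the two headline metrics, here is a minimal sketch of mean absolute error over dose values and the Dice coefficient between two binary masks. The abstract does not specify how doses were normalized or how masks were derived for the Dice computation, so the flattened-list inputs below are assumptions made purely for illustration.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)

def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A and B| / (|A| + |B|) for binary masks given as 0/1 sequences."""
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    if total == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * intersection / total

# Hypothetical flattened values, not data from the paper
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```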