Towards a zero-shot low-latency navigation for open surgery augmented reality applications.

Authors

Schwimmbeck M, Khajarian S, Auer C, Wittenberg T, Remmele S

Affiliations (6)

  • Research Group Medical Technologies, University of Applied Sciences Landshut, Landshut, Germany. [email protected].
  • Chair for Visual Computing, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany. [email protected].
  • Research Group Medical Technologies, University of Applied Sciences Landshut, Landshut, Germany.
  • Intelligent Embedded Systems Lab, University of Freiburg, Freiburg im Breisgau, Germany.
  • Chair for Visual Computing, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
  • Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany.

Abstract

Augmented reality (AR) enhances surgical navigation by superimposing visible anatomical structures with three-dimensional virtual models using head-mounted displays (HMDs). In particular, interventions such as open liver surgery can benefit from AR navigation, as it aids in identifying and distinguishing tumors and risk structures. However, there is a lack of automatic and markerless methods that are robust against real-world challenges, such as partial occlusion and organ motion. We introduce a novel multi-device approach for automatic live navigation in open liver surgery that enhances the visualization and interaction capabilities of a HoloLens 2 HMD through precise and reliable registration using an Intel RealSense RGB-D camera. The intraoperative RGB-D segmentation and the preoperative CT data are utilized to register a virtual liver model to the target anatomy. An AR-prompted Segment Anything Model (SAM) enables robust segmentation of the liver in situ without the need for additional training data. To mitigate algorithmic latency, Double Exponential Smoothing (DES) is applied to forecast registration results. We conducted a phantom study for open liver surgery, investigating various scenarios of liver motion, viewpoints, and occlusion. The mean registration errors (8.31 mm-18.78 mm TRE) are comparable to those reported in prior work, while our approach demonstrates high success rates even for high occlusion factors and strong motion. Using forecasting, we bypassed the algorithmic latency of 79.8 ms per frame, with median forecasting errors below 2 mm and below 1.5 degrees between the quaternions. To our knowledge, this is the first work to approach markerless in situ visualization by combining a multi-device method with forecasting and a foundation model for segmentation and tracking. This enables a more reliable and precise AR registration of surgical targets with low latency. Our approach can be applied to other surgical applications and AR hardware with minimal effort.
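The abstract does not spell out the DES forecasting step, but the general technique is standard. A minimal sketch of Double Exponential Smoothing (Holt's linear method) for forecasting one scalar component of a registration result (e.g., a translation coordinate) might look like the following; the smoothing factors `alpha` and `beta` and the function name are illustrative assumptions, not values from the paper, and in practice each pose parameter would be smoothed separately (with quaternions renormalized after forecasting).

```python
def des_forecast(observations, alpha=0.5, beta=0.5, horizon=1):
    """Forecast a 1-D series with Double Exponential Smoothing (Holt's method).

    Maintains a smoothed level and trend over past observations and
    extrapolates `horizon` steps ahead. alpha/beta are illustrative
    defaults, not the paper's tuned values.
    """
    # Initialize level with the first observation and trend with the
    # first difference.
    level = observations[0]
    trend = observations[1] - observations[0]
    for x in observations[1:]:
        prev_level = level
        # Blend the new observation with the previous level-plus-trend.
        level = alpha * x + (1 - alpha) * (level + trend)
        # Update the trend estimate from the change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Linear extrapolation of the current level and trend.
    return level + horizon * trend
```

On a perfectly linear series the forecast is exact, which is the intuition for why DES can bridge a fixed per-frame algorithmic latency when motion is locally smooth.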

Topics

Journal Article
