Unsupervised Cardiac Video Translation Via Motion Feature Guided Diffusion Model

July 1, 2025

arXiv: 2507.02003v1

Authors

Swakshar Deb, Nian Wu, Frederick H. Epstein, Miaomiao Zhang

Abstract

This paper presents MFD-V2V, a novel motion feature guided diffusion model for unpaired video-to-video translation, designed to synthesize dynamic, high-contrast cine cardiac magnetic resonance (CMR) from lower-contrast, artifact-prone displacement encoding with stimulated echoes (DENSE) CMR sequences. To achieve this, we first introduce a Latent Temporal Multi-Attention (LTMA) registration network that learns more accurate and consistent cardiac motions from cine CMR image videos. We then develop a multi-level motion feature guided diffusion model, equipped with a specialized Spatio-Temporal Motion Encoder (STME) that extracts fine-grained motion conditioning, to improve synthesis quality and fidelity. We evaluate MFD-V2V on a comprehensive cardiac dataset, demonstrating superior performance over the state of the art in both quantitative metrics and qualitative assessments. Furthermore, we show that our synthesized cine CMRs improve downstream clinical and analytical tasks, underscoring the broader impact of our approach. Our code is publicly available at https://github.com/SwaksharDeb/MFD-V2V.

Topics

eess.IV
