Cross-modal image generation with uncertainty quantification from echocardiogram to MRI.
Authors
Affiliations (6)
- School of Computing, University of Otago, New Zealand.
- Indian Association for the Cultivation of Science, Kolkata, India.
- University College London, London, UK.
- Department of Medicine, University of Otago, New Zealand.
- University College London, London, UK; The Alan Turing Institute, London, UK.
- School of Computing, University of Otago, New Zealand. Electronic address: [email protected].
Abstract
Medical imaging is fundamental to cardiovascular diagnostics, with modalities such as Transthoracic Echocardiography (TTE) and Cardiac Magnetic Resonance (CMR) offering complementary strengths. TTE provides real-time, non-invasive visualization of cardiac function but is often limited by operator dependency and incomplete views. In contrast, CMR delivers comprehensive, high-resolution structural assessments, though at a greater time and cost burden. To address these limitations, this study explores cross-modal generative modeling techniques for synthesizing CMR-like images directly from TTE. We propose a novel architecture that combines a UNet backbone with a vision transformer, using the UNet for feature extraction and the transformer for global attention to improve image synthesis quality. Quantitative and qualitative evaluations demonstrate the model's ability to produce realistic and anatomically consistent CMR images, with strong potential to improve diagnostic accuracy and clinical decision-making across multiple imaging modalities.