Influence of high-performance image-to-image translation networks on clinical visual assessment and outcome prediction: utilizing ultrasound to MRI translation in prostate cancer.

Authors

Salmanpour MR,Mousavi A,Xu Y,Weeks WB,Hacihaliloglu I

Affiliations (6)

  • Department of Radiology, University of British Columbia, Vancouver, BC, Canada.
  • AI for Good Research Lab, Microsoft Corporation, Redmond, WA, USA.
  • Department of Computer, Abhar Branch, Islamic Azad University, Abhar, Iran.
  • AI for Good Research Lab, Microsoft Corporation, Redmond, WA, USA.
  • Department of Radiology, University of British Columbia, Vancouver, BC, Canada.
  • Department of Medicine, University of British Columbia, Vancouver, BC, Canada.

Abstract

Image-to-image (I2I) translation networks have emerged as promising tools for generating synthetic medical images; however, their clinical reliability and ability to preserve diagnostically relevant features remain underexplored. This study evaluates the performance of state-of-the-art 2D/3D I2I networks for converting ultrasound (US) images to synthetic MRI in prostate cancer (PCa) imaging. The novelty lies in combining radiomics, expert clinical evaluation, and classification performance to comprehensively benchmark these models for potential integration into real-world diagnostic workflows. Ten leading I2I networks were used to synthesize MRI from US input in a dataset of 794 PCa patients. Radiomics feature (RF) analysis used Spearman correlation to assess whether high-performing networks (SSIM > 0.85) preserved quantitative imaging biomarkers. In a qualitative evaluation, seven experienced physicians assessed the anatomical realism, presence of artifacts, and diagnostic interpretability of the synthetic images. Additionally, classification tasks on the synthetic images were run with two machine learning models and one deep learning model to assess the practical diagnostic benefit. Together, these evaluations tested whether synthetic MRI retained clinically relevant patterns, supported expert interpretation, and improved diagnostic accuracy. Among all networks, 2D-Pix2Pix achieved the highest SSIM (0.855 ± 0.032). RF analysis showed that 76 of 186 features were preserved post-translation, while the remainder were degraded or lost. Qualitative feedback revealed consistent issues with low-level feature preservation and artifact generation, particularly in lesion-rich regions. Importantly, classification performance using synthetic MRI significantly exceeded that of US-based input, achieving an average accuracy and AUC of ~0.93 ± 0.05.
Although 2D-Pix2Pix showed the best overall performance in similarity and partial RF preservation, improvements are still required in lesion-level fidelity and artifact suppression. The combination of radiomics, qualitative, and classification analyses offered a holistic view of the current strengths and limitations of I2I models, supporting their potential in clinical applications pending further refinement and validation.
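The Spearman-based check of radiomics feature preservation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature names, the toy values, and the 0.8 cutoff for calling a feature "preserved" are all assumptions made for illustration, since the abstract does not state the criterion used.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (sx * sy)

def preserved_features(real, synthetic, threshold=0.8):
    """Names of features whose real-vs-synthetic rho meets the assumed threshold."""
    return [name for name in real
            if spearman_rho(real[name], synthetic[name]) >= threshold]

# Toy per-patient feature values (hypothetical, not taken from the study).
real = {"mean_intensity": [1.0, 2.0, 3.0, 4.0, 5.0],
        "glcm_contrast": [0.2, 0.9, 0.4, 0.7, 0.1]}
synthetic = {"mean_intensity": [1.1, 2.3, 2.9, 4.8, 5.2],
             "glcm_contrast": [0.5, 0.1, 0.6, 0.2, 0.9]}
kept = preserved_features(real, synthetic)
print(kept)
```

Here each feature is compared across patients between real and synthetic MRI, so a feature counts as preserved when the translation keeps the patient ordering, even if absolute values shift. A stricter analysis might also require agreement in magnitude.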

Topics

Journal Article
