A Genomics-Guided Multimodal Contrastive Learning Framework for Clinically Significant Prostate Cancer Risk Stratification with Missing Clinical Data.
Authors
Affiliations (5)
- Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City 07738, Mexico.
- Department of Computer Sciences, Bahria University, Lahore 54600, Pakistan.
- Faculty of Allied Health Sciences, Superior University, Lahore 54000, Pakistan.
- Interdisciplinary Professional Unit in Engineering and Advanced Technologies (UPIITA), Instituto Politécnico Nacional (IPN), Mexico City 07340, Mexico.
- Higher School of Computing (ESCOM), Instituto Politécnico Nacional (IPN), Mexico City 07738, Mexico.
Abstract
Heterogeneous data integration remains a major challenge in intelligent information systems, particularly under missing-modality and cross-domain conditions. Existing multimodal fusion approaches often rely on complete datasets and weak alignment mechanisms, limiting their robustness and practical applicability. This study aims to develop and evaluate a genomics-guided multimodal representation learning framework that enables robust heterogeneous data fusion, reliable cross-modal correspondence, and accurate prediction under incomplete-data conditions. We propose a multimodal learning architecture that models genomics as the primary biological anchor and learns conditional projections to imaging modalities, including multiparametric MRI and whole-slide histopathology (WSI). The framework formulates multimodal fusion as a genomics-guided contrastive learning problem, incorporates domain-specific optimization constraints, and learns a latent shared-state representation to support inference without requiring fully paired datasets. Evaluation was conducted using public datasets, including TCGA-PRAD and TCIA, across low-risk versus higher-risk/clinically significant prostate cancer (csPCa) discrimination, Gleason-based risk stratification, and clinically significant outcome prediction tasks under realistic multimodal and missing-modality scenarios. In the adequately powered Genomics+WSI cohort (n = 486), the framework achieved an AUROC of 0.985 ± 0.005 for low-risk versus higher-risk/csPCa discrimination (p < 0.001). Exploratory analysis in a small, matched Genomics+MRI cohort (n = 28) yielded an AUROC of 0.980 ± 0.006 for the same endpoint; these findings are reported descriptively with bootstrap confidence intervals due to limited sample size.
Because the negative reference group consisted of low-risk prostate cancer cases rather than cancer-free controls, results are interpreted as within-cancer risk discrimination rather than de novo cancer detection. The framework achieved weighted accuracy up to 92.1%, Cohen's κ up to 0.86, and reduced critical decision errors by 58%. Calibration remained strong (expected calibration error, ECE, of 0.021 to 0.024), and decision-curve analysis indicated improved utility with reduced unnecessary invasive workups in retrospective modeling. Robustness analysis demonstrated AUROC degradation below 0.04 under domain shifts. Single-modality inference using genomics alone maintained AUROC > 0.90. Interpretability analysis revealed feature attributions aligned with domain-relevant genomic markers. The proposed framework provides a scalable and generalizable solution for heterogeneous multimodal data fusion, supporting reliable prediction, robustness to missing modalities, and applicability to complex information systems beyond the studied domain.
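The abstract does not give the loss function, so the following is only a minimal sketch of what a genomics-anchored contrastive objective with missing-modality handling could look like. It assumes an InfoNCE-style loss over cosine similarities, in which each patient's genomic embedding is the anchor, the same patient's imaging embedding is the positive, and other patients' imaging embeddings are negatives. The function names, the `temperature` value, and the convention of marking a missing imaging modality with `None` are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def genomics_anchored_info_nce(genomic, imaging, temperature=0.1):
    """Mean InfoNCE loss with genomic embeddings as anchors (illustrative only).

    genomic: list of embedding vectors, one per patient.
    imaging: list of the same length; an entry is None when that patient's
             imaging modality is missing (assumed convention).
    Patients without imaging are skipped as anchors and are not used as
    negatives, so unpaired cases do not contribute to this term.
    """
    paired = [i for i, z in enumerate(imaging) if z is not None]
    if len(paired) < 2:
        raise ValueError("need at least two paired patients for negatives")

    losses = []
    for i in paired:
        # Similarity of anchor i to every available imaging embedding.
        logits = [cosine(genomic[i], imaging[j]) / temperature for j in paired]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        positive = cosine(genomic[i], imaging[i]) / temperature
        losses.append(log_denominator - positive)
    return sum(losses) / len(losses)


# Toy example: the third patient has no imaging and is ignored by the loss.
genomic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0], None]
swapped = [[0.0, 1.0], [1.0, 0.0], None]
loss_aligned = genomics_anchored_info_nce(genomic, aligned)
loss_swapped = genomics_anchored_info_nce(genomic, swapped)
```

In this toy case the loss is near zero when each genomic embedding matches its own patient's imaging embedding and large when the pairs are swapped. A real implementation would operate on learned encoder outputs in batches and would combine this term with the task and domain-specific losses the abstract mentions.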