Virtual Treatment for radiotherapy: Multimodal generative prediction of longitudinal NSCLC tumor progression.
Authors
Affiliations (8)
- Unit of Artificial Intelligence and Computer Systems, Università Campus Bio-Medico di Roma, Department of Engineering, Rome, Italy; Multi-Specialist Clinical Institute for Orthopaedic Trauma Care (COT), Messina, Italy.
- Unit of Artificial Intelligence and Computer Systems, Università Campus Bio-Medico di Roma, Department of Engineering, Rome, Italy.
- Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, Umeå, Sweden.
- Operative Research Unit of Radiation Oncology, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy.
- Operative Research Unit of Radiation Oncology, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy; Research Unit of Radiation Oncology, Department of Medicine and Surgery, Università Campus Bio-Medico di Roma, Rome, Italy.
- Department of Biomedical Engineering, University of Basel, Allschwil, Switzerland.
- Department of Naval, Electrical, Electronics and Telecommunications Engineering, University of Genoa, Genova, Italy.
- Unit of Artificial Intelligence and Computer Systems, Università Campus Bio-Medico di Roma, Department of Engineering, Rome, Italy; Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, Umeå, Sweden.
Abstract
Predicting the longitudinal evolution of tumors during radiotherapy is a complex and clinically critical challenge in medical image analysis, especially when models are conditioned solely on imaging data. This work introduces a Virtual Treatment (VT) framework that formulates non-small cell lung cancer (NSCLC) tumor progression as a multimodal conditional image-to-image translation problem, in which the generative process is driven not only by past CT scans but also by treatment-related and patient-specific clinical information. Using a private longitudinal dataset of 222 NSCLC patients with 895 CT scans acquired during radiotherapy, the study shows that incorporating additional modalities such as delivered dose increments, demographic characteristics, histological diagnosis, and tumor staging (cT, cN) improves the ability of generative models to predict plausible future anatomical states. A benchmarking study across four families of generative models, including 2D GANs, 2.5D diffusion models, and fully 3D latent diffusion architectures, shows that multimodal conditioning is essential for capturing patient-specific radiobiological dynamics. Unlike traditional image-only synthesis methods, the proposed VT framework uses these complementary modalities to refine the generative trajectory, so that the model forecasts future CT scans from both anatomical evolution and clinical context and thereby reflects realistic treatment-response patterns. Across quantitative metrics, tumor volumetric evaluation, and statistical tests, diffusion-based models conditioned on the combination of demographic and tumor-related features achieved the most stable and dose-aware forecasts of tumor evolution. These findings underscore the importance of multimodal information for generative modeling in longitudinal oncology and position VT as a promising tool for in-silico treatment monitoring and adaptive radiotherapy support in NSCLC.
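To make the idea of multimodal conditioning concrete, the following is a minimal sketch of how the non-imaging inputs named in the abstract (dose increment, demographics, histology, cT, cN) might be packed into a single conditioning vector for a generative model. It is an illustration under stated assumptions, not the authors' implementation: the category vocabularies, the normalisation constants, and the names ClinicalContext and build_condition_vector are all hypothetical, and the abstract does not specify how the clinical variables are actually encoded.

```python
from dataclasses import dataclass

# Hypothetical category vocabularies; the abstract does not list the real ones.
SEX = ("F", "M")
HISTOLOGY = ("adenocarcinoma", "squamous", "other")
CT_STAGE = ("T1", "T2", "T3", "T4")
CN_STAGE = ("N0", "N1", "N2", "N3")

# Assumed normalisation constants, chosen only to keep values roughly in [0, 1].
MAX_DOSE_GY = 70.0
MAX_AGE_YEARS = 100.0


@dataclass
class ClinicalContext:
    """Non-imaging inputs for one (past scan -> future scan) prediction step."""
    dose_increment_gy: float  # dose delivered between the two scans
    age_years: float
    sex: str
    histology: str
    ct_stage: str
    cn_stage: str


def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot list over its vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown category {value!r}, expected one of {vocabulary}")
    return [1.0 if value == item else 0.0 for item in vocabulary]


def build_condition_vector(ctx):
    """Concatenate scaled continuous features and one-hot categorical features."""
    return (
        [ctx.dose_increment_gy / MAX_DOSE_GY, ctx.age_years / MAX_AGE_YEARS]
        + one_hot(ctx.sex, SEX)
        + one_hot(ctx.histology, HISTOLOGY)
        + one_hot(ctx.ct_stage, CT_STAGE)
        + one_hot(ctx.cn_stage, CN_STAGE)
    )


# Example with made-up values for a single patient and time step.
example = ClinicalContext(14.0, 65.0, "M", "squamous", "T2", "N1")
condition = build_condition_vector(example)
```

In a conditional image-to-image setup, a vector of this kind would typically be embedded and supplied to the generator or denoiser alongside the past CT scan; how the VT framework injects it is not described in the abstract.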