
Synthetic data generation method improves risk prediction model for early tumor recurrence after surgery in patients with pancreatic cancer.

Authors

Jeong H, Lee JM, Kim HS, Chae H, Yoon SJ, Shin SH, Han IW, Heo JS, Min JH, Hyun SH, Kim H

Affiliations (6)

  • Division of Hepatobiliary-Pancreatic Surgery, Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea.
  • Division of Hepatobiliary-Pancreatic Surgery, Department of Surgery, Daejeon Eulji University Medical Center, Eulji University School of Medicine, Daejeon, South Korea.
  • Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea.
  • Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea.
  • Department of Nuclear Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea.
  • Division of Hepatobiliary-Pancreatic Surgery, Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea.

Abstract

Pancreatic cancer is aggressive and has high recurrence rates, so accurate prediction models are needed for effective treatment planning, particularly when deciding between neoadjuvant chemotherapy and upfront surgery. This study explores the use of variational autoencoder (VAE)-generated synthetic data to predict early tumor recurrence (within six months) in pancreatic cancer patients who underwent upfront surgery. Preoperative data from 158 patients (January 2021 to December 2022) were analyzed, and machine learning models, including Logistic Regression, Random Forest (RF), Gradient Boosting Machine (GBM), and Deep Neural Networks (DNN), were trained on both the original and the synthetic datasets. The VAE-generated dataset (n = 94) closely matched the original data (p > 0.05) and enhanced model performance, improving accuracy (GBM: 0.81 to 0.87; RF: 0.84 to 0.87) and sensitivity (GBM: 0.73 to 0.91; RF: 0.82 to 0.91). PET/CT-derived metabolic parameters were the strongest predictors, accounting for 54.7% of the model's predictive power, with maximum standardized uptake value (SUVmax) showing the highest importance (0.182, 95% CI: 0.165-0.199). This study demonstrates that synthetic data can significantly enhance predictive models for pancreatic cancer recurrence, especially in data-limited scenarios, offering a promising strategy for oncology prediction models.
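The abstract reports its gains as accuracy and sensitivity. As a reminder of how those two metrics are defined, here is a minimal sketch in plain Python. The function name `accuracy_and_sensitivity` and the toy labels are assumptions made up for illustration; they are not the study's code or data.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels.

    Label 1 stands for "early recurrence", label 0 for "no early recurrence".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # recurrences caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-recurrences caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # recurrences missed

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity (recall) is undefined when there are no positive cases.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Toy labels invented for illustration only (NOT data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
# prints: accuracy=0.80, sensitivity=0.75
```

Sensitivity counts only the patients who actually recurred, so it can move much more than accuracy when a model catches one or two additional recurrences in a small test set.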

Topics

  • Pancreatic Neoplasms
  • Neoplasm Recurrence, Local
  • Journal Article
