
PETWB-REP: A Multi-Cancer Whole-Body FDG PET/CT Dataset with Corresponding Radiology Reports.

March 13, 2026

Authors

Xue L,Feng G,Zhang W,Zhang Y,Li L,Wang S,Peng L,Peng S,Gao X

Affiliations (9)

  • Department of Nuclear Medicine/PET Center, Huashan Hospital, Fudan University, Shanghai, China.
  • Shanghai Academy of Artificial Intelligence for Science, Shanghai, China.
  • Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China.
  • Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China.
  • Shanghai Academy of Artificial Intelligence for Science, Shanghai, China.
  • Human Phenome Institute, Fudan University, Shanghai, China.
  • Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, China.
  • Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China.
  • Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China.

Abstract

Publicly available, large-scale medical imaging datasets are crucial for developing and validating artificial intelligence (AI) models and conducting retrospective clinical research. However, multimodal datasets that integrate functional and anatomical imaging with high-quality radiology reports across diverse malignancies remain scarce. Here, we present PETWB-REP, a curated dataset comprising whole-body ¹⁸F-fluorodeoxyglucose (FDG) Positron Emission Tomography/Computed Tomography (PET/CT) scans and corresponding radiology reports from 490 patients. The cohort encompasses a broad spectrum of malignancies, including but not limited to lung, liver, breast, prostate, and ovarian cancers. Distinct from existing resources, PETWB-REP is organized following the Brain Imaging Data Structure (BIDS) standard, providing both raw data (with 3D de-facing for privacy) and processed derivatives (SUV-converted and registered). Each case includes bilingual (Chinese and English) de-identified textual reports and structured clinical metadata. This dataset is uniquely positioned to support multi-center validation and cross-disciplinary research in medical imaging, radiomics, automated report generation, and multimodal representation learning.
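
The abstract mentions SUV-converted derivatives but does not describe the conversion pipeline. As a rough illustration only, the sketch below shows the standard body-weight SUV formula that such a conversion commonly uses. The function name, parameter names, and example values are hypothetical and are not taken from the dataset; the sketch also assumes the PET activity values are already decay-corrected to the scan start and that tissue density is 1 g/mL.

```python
import math

# Physical half-life of fluorine-18 in seconds (about 109.77 minutes).
F18_HALF_LIFE_S = 109.77 * 60


def suv_bw(activity_bq_per_ml, injected_dose_bq, weight_kg,
           uptake_time_s, half_life_s=F18_HALF_LIFE_S):
    """Body-weight SUV for one voxel or region (hypothetical helper).

    activity_bq_per_ml: measured activity concentration in Bq/mL
    injected_dose_bq:   dose measured at injection time, in Bq
    weight_kg:          patient body weight in kg
    uptake_time_s:      seconds between injection and scan start
    """
    # Decay-correct the injected dose from injection time to scan start.
    decay_factor = math.exp(-math.log(2) * uptake_time_s / half_life_s)
    dose_at_scan_bq = injected_dose_bq * decay_factor
    # SUVbw = concentration / (dose / weight), with weight in grams.
    return activity_bq_per_ml * (weight_kg * 1000.0) / dose_at_scan_bq


if __name__ == "__main__":
    # Illustrative numbers only: 5 kBq/mL, 350 MBq injected, 70 kg, 60 min uptake.
    print(round(suv_bw(5000.0, 350e6, 70.0, 3600.0), 3))
```

Anyone working with the released derivatives should consult the dataset's own documentation for the conversion actually applied, since vendors and pipelines differ in decay-correction reference times and in the SUV variant used (body weight, lean body mass, or body surface area).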

Topics

Journal Article
