Summary Report of the SNMMI AI Task Force Radiomics Challenge 2024.
Authors
Affiliations (9)
- Department of Radiology and Nuclear Medicine, Amsterdam UMC, Cancer Center Amsterdam, Amsterdam, The Netherlands.
- Departments of Radiology and Physics, University of British Columbia, Vancouver, British Columbia, Canada.
- Department of Radiology and Nuclear Medicine, Amsterdam UMC, Cancer Center Amsterdam, Amsterdam, The Netherlands.
- Department of Hematology, West German Cancer Center, University Hospital Essen, University of Duisburg-Essen, Essen, Germany.
- Clinic and Polyclinic for Nuclear Medicine, Department of Nuclear Medicine, University of Leipzig, Leipzig, Germany.
- Department of Hematology, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands.
- Department of Hematology, Amsterdam UMC, Cancer Center Amsterdam, Amsterdam, The Netherlands.
- Department of Epidemiology and Data Science, Amsterdam Public Health Research Institute, Amsterdam UMC, Amsterdam, The Netherlands.
- Institut Curie, Université PSL, Laboratoire d'Imagerie Translationnelle en Oncologie, Orsay, France.
Abstract
In medical imaging, challenges are competitions that aim to provide a fair comparison of different methodologic solutions to a common problem. Challenges typically focus on addressing real-world problems, such as segmentation, detection, and prediction tasks, using various types of medical images and associated data. Here, we describe the organization and results of such a challenge to compare machine-learning models for predicting survival in patients with diffuse large B-cell lymphoma using a baseline <sup>18</sup>F-FDG PET/CT radiomics dataset.
<b>Methods:</b> This challenge aimed to predict progression-free survival (PFS) in patients with diffuse large B-cell lymphoma, either as a binary outcome (shorter than 2 y versus longer than 2 y) or as a continuous outcome (survival in months). All participants were provided with a radiomic training dataset, including the ground truth survival for designing a predictive model and a radiomic test dataset without ground truth. Figures of merit (FOMs) used to assess model performance were the root-mean-square error for continuous outcomes and the C-index for 1-, 2-, and 3-y PFS binary outcomes. The challenge was endorsed and initiated by the Society of Nuclear Medicine and Molecular Imaging AI Task Force.
<b>Results:</b> Nineteen models for predicting PFS as a continuous outcome from 15 teams were received. Among those models, external validation identified 6 models showing similar performance to that of a simple general linear reference model using SUV and total metabolic tumor volumes (TMTV) only. Twelve models for predicting binary outcomes were submitted by 9 teams. External validation showed that 1 model had higher, but nonsignificant, C-index values compared with values obtained by a simple logistic regression model using SUV and TMTV.
<b>Conclusion:</b> Some of the radiomic-based machine-learning models developed by participants showed better FOMs than did simple linear or logistic regression models based on SUV and TMTV only, although the differences in observed FOMs were nonsignificant. This suggests that, for the challenge dataset, adding sophisticated radiomic features and using machine learning provided limited or no value for outcome prediction.
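To illustrate the two figures of merit named in the Methods, the following is a minimal Python sketch, not the challenge's actual evaluation code. The function names `rmse` and `c_index_binary` and all numbers are hypothetical. The abstract does not say how the C-index was computed, so the sketch assumes the simplest binary-outcome form (the fraction of event/non-event patient pairs in which the patient with the event received the higher predicted risk, with ties counted as one half) and ignores censoring.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted PFS (e.g., in months)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def c_index_binary(events, risk_scores):
    """Concordance index for a binary outcome (assumed form, censoring ignored).

    events: 1 if the patient progressed before the time horizon, else 0.
    risk_scores: model output, where a higher value means higher predicted risk.
    """
    if len(events) != len(risk_scores):
        raise ValueError("inputs must be of equal length")
    concordant = 0.0
    comparable_pairs = 0
    for i, event_i in enumerate(events):
        if event_i != 1:
            continue
        for j, event_j in enumerate(events):
            if event_j != 0:
                continue
            comparable_pairs += 1
            if risk_scores[i] > risk_scores[j]:
                concordant += 1.0   # event patient ranked as higher risk
            elif risk_scores[i] == risk_scores[j]:
                concordant += 0.5   # tie counts as half-concordant
    if comparable_pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / comparable_pairs


# Hypothetical data for four patients (illustrative only).
observed_pfs_months = [10, 30, 50, 24]
predicted_pfs_months = [12, 28, 44, 30]
progressed_within_2y = [1, 0, 0, 1]
predicted_risk = [0.9, 0.2, 0.6, 0.5]

print("RMSE (months):", rmse(observed_pfs_months, predicted_pfs_months))
print("C-index:", c_index_binary(progressed_within_2y, predicted_risk))
```

Under this assumed definition, a C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, whereas a lower root-mean-square error indicates better continuous predictions.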