
AutoPET Challenge on Fully Automated Lesion Segmentation in Oncologic PET/CT Imaging, Part 2: Domain Generalization.

December 30, 2025

Authors

Dexl J, Gatidis S, Früh M, Jeblick K, Mittermeier A, Stüber AT, Schachtner B, Topalis J, Fabritius MP, Gu S, Murugesan GK, VanOss J, Ye J, He J, Alloula A, Papież BW, Mesbah Z, Modzelewski R, Hadlich M, Marinov Z, Stiefelhagen R, Isensee F, Maier-Hein KH, Galdran A, Nikolaou K, la Fougère C, Kim M, Kallenberg N, Kleesiek J, Herrmann K, Werner R, Ingrisch M, Cyran CC, Küstner T

Affiliations (29)

  • Department of Radiology, LMU University Hospital, LMU Munich, Munich, Germany; [email protected].
  • Munich Center for Machine Learning, Munich, Germany.
  • Department of Radiology, University Hospital Tübingen, Tübingen, Germany.
  • Department of Radiology, Stanford University, Stanford, California.
  • Department of Radiology, LMU University Hospital, LMU Munich, Munich, Germany.
  • Comprehensive Pneumology Center, Member of the German Center for Lung Research, Munich, Germany.
  • Konrad Zuse School of Excellence in Reliable AI, Garching, Germany.
  • BAMF Health, Grand Rapids, Michigan.
  • Shanghai AI Lab, Shanghai, China.
  • Big Data Institute, University of Oxford, Oxford, United Kingdom.
  • Université Rouen Normandie, LITIS UR 4108, Rouen, France.
  • Nuclear Medicine Department, Henri Becquerel Cancer Center, Rouen, France.
  • Siemens Healthcare SAS, Courbevoie, France.
  • Karlsruhe Institute of Technology, Karlsruhe, Germany.
  • HIDSS4Health, Karlsruhe and Heidelberg, Germany.
  • Division of Medical Image Computing, German Cancer Research Center, Heidelberg, Germany.
  • Helmholtz Imaging, German Cancer Research Center, Heidelberg, Germany.
  • Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany.
  • Universitat Pompeu Fabra, Barcelona, Spain.
  • AIML, University of Adelaide, Adelaide, South Australia, Australia.
  • Cluster of Excellence 2180, Image-Guided and Functionally Instructed Tumor Therapies, Tübingen, Germany.
  • Department of Nuclear Medicine and Clinical Molecular Imaging, University Hospital Tübingen, Tübingen, Germany.
  • German Cancer Consortium, Partner Site Tübingen, Tübingen, Germany.
  • Institute for Artificial Intelligence in Medicine, University Hospital Essen, Essen, Germany.
  • University Duisburg-Essen, Essen, Germany.
  • Cancer Research Center Cologne Essen, University Medicine Essen, Essen, Germany.
  • German Cancer Consortium, Partner Site Essen, Essen, Germany.
  • Department of Nuclear Medicine, University Medicine Essen, Essen, Germany.
  • Department of Nuclear Medicine, LMU University Hospital, LMU Munich, Munich, Germany.

Abstract

This article reports the results of the second iteration of the autoPET challenge on automated lesion segmentation in whole-body PET/CT, held in conjunction with the 26th International Conference on Medical Image Computing and Computer Assisted Intervention in 2023. In contrast to the first autoPET challenge, which served as a proof of concept, this study investigates whether machine learning-based segmentation models trained on data from a single source can maintain performance across clinically relevant variations in PET/CT data, reflecting the demands of real-world deployment.

Methods: A comprehensive biomedical segmentation challenge on PET/CT domain generalization was designed and conducted. Participants were tasked with training machine learning models on annotated whole-body 18F-FDG data (n = 1,014). These models were then evaluated on a test set of 200 samples from 5 clinically relevant domains, including variations in institution, pathology, and population, as well as a different tracer. Performance was measured in terms of average Dice similarity coefficient, average false-positive volume, and average false-negative volume. The best-performing teams were awarded in 3 categories. Furthermore, a detailed post-challenge analysis examined results across domains and individual cases, along with a ranking analysis.

Results: Generalization from a single-source domain remains a significant challenge. Seventeen international teams successfully participated in the challenge. The best-performing team reached an average Dice similarity coefficient of 0.5038, a mean false-positive volume of 87.8388 mL, and a mean false-negative volume of 8.4154 mL on the test set. nnU-Net was the most commonly used framework, with most participants using a 3-dimensional U-Net. Despite competitive in-domain results, out-of-domain performance deteriorated substantially, particularly on pediatric and prostate-specific membrane antigen data. Detailed error analysis revealed frequent false positives due to physiologic uptake and decreased sensitivity in detecting small or low-uptake lesions. A majority-vote ensemble offered minimal performance gains, whereas an oracle ensemble indicated that further gains are hypothetically achievable. Ranking analysis showed that no single team consistently outperformed all others across ranking schemes.

Conclusion: The second autoPET challenge provides a comprehensive evaluation of the current state of automated PET/CT tumor segmentation, highlighting both progress and the persistent challenges of single-source domain generalization, as well as the need for diverse public datasets to enhance algorithm robustness.
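For readers unfamiliar with the three reported metrics, the sketch below illustrates how they are commonly computed on binary 3-D lesion masks, using scipy for connected-component labeling. The function names, the voxel_vol_ml parameter, and the component-wise definitions of false-positive and false-negative volume are illustrative assumptions; the challenge's reference implementation may differ in its details.

# Minimal sketch of the three evaluation metrics mentioned in the abstract:
# Dice similarity coefficient, false-positive volume, and false-negative volume.
# Assumes binary 3-D masks (prediction and ground truth) and a known voxel volume in mL.
import numpy as np
from scipy import ndimage


def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0


def false_positive_volume(pred: np.ndarray, gt: np.ndarray, voxel_vol_ml: float) -> float:
    """Volume (mL) of predicted connected components with no overlap to ground truth
    (an assumed component-wise definition)."""
    labels, n = ndimage.label(pred)
    fp_voxels = 0
    for i in range(1, n + 1):
        component = labels == i
        if not np.logical_and(component, gt).any():
            fp_voxels += component.sum()
    return fp_voxels * voxel_vol_ml


def false_negative_volume(pred: np.ndarray, gt: np.ndarray, voxel_vol_ml: float) -> float:
    """Volume (mL) of ground-truth lesion components entirely missed by the prediction
    (an assumed component-wise definition)."""
    labels, n = ndimage.label(gt)
    fn_voxels = 0
    for i in range(1, n + 1):
        component = labels == i
        if not np.logical_and(component, pred).any():
            fn_voxels += component.sum()
    return fn_voxels * voxel_vol_ml

Under these assumed definitions, a model that misses a lesion entirely is penalized through the false-negative volume, while spurious uptake (e.g., physiologic uptake misclassified as lesion) contributes to the false-positive volume even when the overall Dice score is barely affected.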

Topics

Journal Article
