
Validation of an AI Method for Automated Lymphoma Metabolic Tumor Volume Segmentation Using a Public Benchmark PET/CT Dataset.

March 5, 2026

Authors

Sadik M, Larsson M, Enqvist O, Edenbrandt L, Trägårdh E

Affiliations (5)

  • Department of Molecular and Clinical Medicine, Sahlgrenska University Hospital, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden.
  • Eigenvision AB, Lund, Sweden.
  • Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden.
  • Department of Clinical Physiology and Nuclear Medicine, Skåne University Hospital, Malmö, Sweden.
  • Department of Translational Medicine and Wallenberg Center for Molecular Medicine, Lund University, Malmö, Sweden.

Abstract

The aim of this study was to evaluate the performance of an artificial intelligence (AI)-based method for automated segmentation of total metabolic tumor volume (TMTV) in ¹⁸F-FDG PET/CT scans of patients with lymphoma, using an independent, publicly available benchmark dataset curated and segmented by expert readers in a previously published study.

Methods: The AI model, based on a 3-dimensional U-Net architecture implemented in MONAI (the medical open-source network for AI framework), was trained on 1,500 ¹⁸F-FDG PET/CT scans of patients with lymphoma. It was tested on a benchmark dataset comprising 60 baseline scans (20 each of follicular lymphoma, Hodgkin lymphoma, and diffuse large B-cell lymphoma), each segmented by 3 or 4 nuclear medicine physicians using an SUV threshold of 4. Agreement between AI-derived and benchmark TMTVs was assessed using Bland-Altman analysis, with acceptable deviation defined as within 10% or 10 cm³, consistent with interreader variability reported in the benchmark study.

Results: In 50 (83%) of the 60 benchmark cases, AI-derived TMTVs were within 10% or 10 cm³ of the benchmark reference. In 4 of the remaining 10 cases, AI-derived results were within the same margin of at least 1 of the expert readers, indicating partial concordance.

Conclusion: The AI-based method achieved high concordance with expert-derived TMTVs in a standardized benchmark setting. The findings demonstrate that the AI model performs comparably to human experts in most cases, even in an externally curated dataset deliberately enriched with challenging cases by its original authors. The AI model's ability to produce accurate, reproducible segmentations without user interaction could significantly reduce manual workload and interreader variability in lymphoma imaging. However, human supervision is required to minimize errors.
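The agreement criterion described in the abstract (deviation within 10% or 10 cm³, alongside a Bland-Altman analysis) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that either condition alone is sufficient, that the 10% is taken relative to the reference volume, and that the Bland-Altman limits of agreement use the conventional mean difference ± 1.96 standard deviations; the abstract does not specify these details. The function names `within_margin` and `bland_altman` are hypothetical.

```python
from statistics import mean, stdev


def within_margin(ai_cm3, ref_cm3, rel_tol=0.10, abs_tol_cm3=10.0):
    """Return True if the AI volume deviates from the reference by
    at most 10 cm^3 OR at most 10% of the reference volume.

    Assumption: either condition suffices, and the percentage is
    relative to the reference; the abstract does not state this.
    """
    diff = abs(ai_cm3 - ref_cm3)
    return diff <= abs_tol_cm3 or diff <= rel_tol * ref_cm3


def bland_altman(ai_volumes, ref_volumes):
    """Return (bias, lower limit, upper limit) of agreement.

    Uses the conventional mean difference +/- 1.96 sample standard
    deviations (an assumption; the study may have used a variant).
    """
    diffs = [a - r for a, r in zip(ai_volumes, ref_volumes)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def concordant_fraction(ai_volumes, ref_volumes):
    """Fraction of cases in which the AI volume is within the margin."""
    hits = sum(within_margin(a, r) for a, r in zip(ai_volumes, ref_volumes))
    return hits / len(ai_volumes)
```

Under these assumptions, the absolute margin dominates for small tumor volumes (below 100 cm³) and the relative margin dominates for larger ones, so a fixed deviation of a few cubic centimeters is not penalized in low-burden cases.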

Topics

Journal Article
