Multidisciplinary Consensus Prostate Contours on Magnetic Resonance Imaging: Educational Atlas and Reference Standard for Artificial Intelligence Benchmarking.

Authors

Song Y, Dornisch AM, Dess RT, Margolis DJA, Weinberg EP, Barrett T, Cornell M, Fan RE, Harisinghani M, Kamran SC, Lee JH, Li CX, Liss MA, Rusu M, Santos J, Sonn GA, Vidic I, Woolen SA, Dale AM, Seibert TM

Affiliations (20)

  • Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California; Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, California.
  • Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California.
  • Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan.
  • Department of Radiology, Cornell University, Ithaca, New York.
  • Department of Clinical Imaging Sciences, University of Rochester Medical Center, Rochester, New York.
  • Department of Radiology, University of Cambridge, Cambridge, United Kingdom.
  • Radformation, New York, New York.
  • Department of Urology, Stanford School of Medicine, Palo Alto, California.
  • Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts.
  • Department of Radiation Oncology, Massachusetts General Hospital, Boston, Massachusetts.
  • Department of Radiology, Stanford School of Medicine, Palo Alto, California.
  • Institute for Computational and Mathematical Engineering, Stanford University, Palo Alto, California.
  • Department of Urology, University of Texas Health Sciences Center San Antonio, San Antonio, Texas.
  • Department of Urology, Stanford School of Medicine, Palo Alto, California; Department of Radiology, Stanford School of Medicine, Palo Alto, California; Department of Biomedical Data Science, Stanford University, Palo Alto, California.
  • Quibim, New York, New York.
  • Department of Urology, Stanford School of Medicine, Palo Alto, California; Department of Radiology, Stanford School of Medicine, Palo Alto, California.
  • Cortechs.ai, San Diego, California.
  • Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California.
  • Department of Radiology, University of California San Diego, La Jolla, California; Department of Neurosciences, University of California San Diego, La Jolla, California; Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, California.
  • Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California; Department of Radiology, University of California San Diego, La Jolla, California; Department of Bioengineering, University of California San Diego, La Jolla, California; Department of Urology, University of California San Diego, La Jolla, California. Electronic address: [email protected].

Abstract

Evaluation of artificial intelligence (AI) algorithms for prostate segmentation is challenging because ground truth is lacking. We aimed to: (1) create a reference standard data set with precise prostate contours by expert consensus, and (2) evaluate various AI tools against this standard. We obtained prostate magnetic resonance imaging cases from six institutions in the Qualitative Prostate Imaging Consortium. A panel of 4 experts (2 genitourinary radiologists and 2 prostate radiation oncologists) meticulously developed consensus prostate segmentations on axial T2-weighted series. We evaluated the performance of 6 AI tools (3 commercially available and 3 academic) using Dice scores, distance from reference contour, and volume error. The panel achieved consensus prostate segmentation on each slice of all 68 patient cases included in the reference data set. We present 2 patient examples to serve as contouring guides. Depending on the AI tool, median Dice scores (across patients) ranged from 0.80 to 0.94 for whole prostate segmentation. For a typical (median) patient, AI tools had a mean error over the prostate surface ranging from 1.3 to 2.4 mm. They maximally deviated 3.0 to 9.4 mm outside the prostate and 3.0 to 8.5 mm inside the prostate for a typical patient. Error in prostate volume measurement for a typical patient ranged from 4.3% to 31.4%. We established an expert consensus benchmark for prostate segmentation. The best-performing AI tools have typical accuracy greater than that reported for radiation oncologists using computed tomography scans (the most common clinical approach for radiation therapy planning). Physician review remains essential to detect occasional major errors.
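
For readers reproducing a similar benchmark, the sketch below illustrates the three kinds of metrics named in the abstract (Dice score, distance from the reference contour, and volume error) for binary masks on a shared voxel grid. This is a minimal illustration, not the authors' code; the function names and the `spacing` parameter are assumptions for the example.

```python
# Minimal sketch (not the authors' code) of segmentation evaluation metrics.
# Assumes `reference` and `prediction` are boolean NumPy arrays on the same
# voxel grid, and `spacing` is the (z, y, x) voxel size in millimeters.
import numpy as np
from scipy import ndimage


def dice_score(reference, prediction):
    """Dice coefficient: 2 * |A intersect B| / (|A| + |B|)."""
    intersection = np.logical_and(reference, prediction).sum()
    total = reference.sum() + prediction.sum()
    return 2.0 * intersection / total if total > 0 else 1.0


def surface_distances_mm(reference, prediction, spacing):
    """Distances (mm) from each prediction-surface voxel to the reference surface."""
    ref_surface = reference ^ ndimage.binary_erosion(reference)
    pred_surface = prediction ^ ndimage.binary_erosion(prediction)
    # Distance from every voxel to the nearest reference-surface voxel.
    dist_to_ref = ndimage.distance_transform_edt(~ref_surface, sampling=spacing)
    return dist_to_ref[pred_surface]


def volume_error_percent(reference, prediction, spacing):
    """Absolute prediction volume error as a percentage of reference volume."""
    voxel_mm3 = float(np.prod(spacing))
    ref_vol = reference.sum() * voxel_mm3
    pred_vol = prediction.sum() * voxel_mm3
    return abs(pred_vol - ref_vol) / ref_vol * 100.0
```

The mean of `surface_distances_mm` corresponds roughly to the reported mean error over the prostate surface, and its maximum to the maximal deviation; distinguishing deviation outside versus inside the prostate would additionally require a signed distance, which is omitted in this sketch.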

Topics

Magnetic Resonance Imaging, Artificial Intelligence, Prostatic Neoplasms, Benchmarking, Prostate, Journal Article
