Deep learning for appendicitis: development of a three-dimensional localization model on CT.

Authors

Takaishi T, Kawai T, Kokubo Y, Fujinaga T, Ojio Y, Yamamoto T, Hayashi K, Owatari Y, Ito H, Hiwatashi A

Affiliations (4)

  • Department of Radiology, Nagoya City University Graduate School of Medical Sciences, Kawasumi Mizuho-Cho, Mizuho-Ku, Nagoya, Aichi, 467-8602, Japan.
  • Department of Radiology, Nagoya City University Graduate School of Medical Sciences, Kawasumi Mizuho-Cho, Mizuho-Ku, Nagoya, Aichi, 467-8602, Japan.
  • Department of Radiology, Nagoya City University West Medical Center, Nagoya, Japan.
  • IT Solution Division, Medical Systems Business Div, FUJIFILM Corporation, Tokyo, Japan.

Abstract

The aim of this study was to develop and evaluate a deep learning model for detecting appendicitis on abdominal CT.

This retrospective single-center study included 567 CTs of patients with appendicitis (330 males; age range, 20-96 years) obtained between 2011 and 2020, randomly split into training (n = 517) and validation (n = 50) sets. The validation set was supplemented with 50 control CTs performed for acute abdomen. For the test dataset, 100 appendicitis CTs and 100 control CTs were collected consecutively from a separate period after 2021. Exclusion criteria were age < 20 years, perforation, an unclear appendix, and appendiceal tumors. Appendicitis CTs were annotated with three-dimensional bounding boxes encompassing the inflamed appendix. All CTs were unenhanced, with a 5-mm slice thickness and a 512 × 512 pixel matrix. The deep learning algorithm was based on a faster region-based convolutional neural network (Faster R-CNN). Two board-certified radiologists visually graded the model's predictions on the test dataset using a 5-point Likert scale (0: no detection, 1: false, 2: poor, 3: fair, 4: good), with scores ≥ 3 considered true positives. Inter-rater agreement was assessed using weighted kappa statistics. The effects of intra-abdominal fat, periappendiceal fat stranding, presence of an appendicolith, and appendix diameter on the model's recall were analyzed using binary logistic regression.

On the test dataset, the model showed a precision of 0.66 (87/132), a recall of 0.87 (87/100), and a false-positive rate per patient of 0.23 (45/200). The inter-rater agreement for Likert scores of 2-4 was κ = 0.76. The logistic regression analysis showed that only intra-abdominal fat had a significant effect on the model's recall (p = 0.02).

We developed a model capable of detecting appendicitis on CT with a three-dimensional bounding box.
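The reported detection metrics follow from the counts given in the abstract, and the arithmetic can be checked with a short sketch. The code below is illustrative only and is not the authors' evaluation code: the names `DetectionCounts`, `is_true_positive`, and `summarize` are hypothetical, and it assumes that the 132 predictions are the sum of 87 true positives and 45 false positives, and that the 200 patients are the 100 appendicitis and 100 control CTs of the test dataset.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Counts taken from the abstract (hypothetical container, not the authors' code)."""
    true_positives: int       # predictions graded >= 3 on the Likert scale
    false_positives: int      # remaining predictions
    appendicitis_cases: int   # test CTs with appendicitis
    total_patients: int       # appendicitis + control test CTs


def is_true_positive(likert_score: int, threshold: int = 3) -> bool:
    """Apply the abstract's rule: a Likert score >= 3 counts as a true positive."""
    return likert_score >= threshold


def summarize(counts: DetectionCounts) -> dict:
    """Compute precision, recall, and false positives per patient (standard definitions)."""
    predictions = counts.true_positives + counts.false_positives
    return {
        "precision": counts.true_positives / predictions,
        "recall": counts.true_positives / counts.appendicitis_cases,
        "fp_per_patient": counts.false_positives / counts.total_patients,
    }


# Counts as stated in the abstract: 87/132, 87/100, 45/200.
counts = DetectionCounts(
    true_positives=87,
    false_positives=45,
    appendicitis_cases=100,
    total_patients=200,
)
metrics = summarize(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the three ratios are 87/132, 87/100, and 45/200, which round to the 0.66, 0.87, and 0.23 reported above.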

Topics

Journal Article
