S-MEOD: A Novel Evaluation Metric for Frame-Based Medical Object Detection.
Authors
Affiliations (2)
- Department of Computer Science, Engineering and Information Technology, Shiraz University, Shiraz, Iran.
- Department of Computer Science, Engineering and Information Technology, Shiraz University, Shiraz, Iran.
Abstract
Traditional metrics such as precision, recall, mean Average Precision (mAP), and F-score are widely used to evaluate object detection models. However, in some frame-based medical scenarios, these metrics fail to capture the true effectiveness of a model. For instance, a model may detect true positives in only a few frames of a sequence, yielding perfect precision, but miss the same targets in the remaining frames, producing many false negatives and therefore a very low recall. In practice, such a model may still function effectively as a medical assistant and accurately identify critical features, yet the traditional metrics do not reflect this acceptable performance. This study addresses this limitation by introducing a new evaluation metric tailored to frame-based medical object detection tasks. We propose S-MEOD (Sequential Method of Evaluation for Object Detection), a novel metric that combines Sequence-aware Precision (SaP) and Sequence-oriented Detection (SoD) to provide a more comprehensive assessment of model performance. The metric was evaluated on frame-based sequences of medical data using object detection models, including YOLO-based architectures. Experimental evaluations showed that S-MEOD reflects model effectiveness in frame-based detection more accurately and intuitively than traditional metrics.
In our experimental evaluation on coronary angiography data, increasing the confidence threshold led to higher precision (up to 0.964) and mAP50 (≈ 0.49), but caused recall to drop from ≈ 0.22 to 0.028 and the F1-score from ≈ 0.29 to 0.055. Correspondingly, S-MEOD, for which lower values indicate better performance, increased from 1.30 at low thresholds to 2.06 at high thresholds, indicating a substantial deterioration in temporal detection performance. Compared to traditional metrics, S-MEOD more accurately reflects clinically relevant detection behavior by distinguishing between sparse high-precision detections and genuine sequence-level detection failure. S-MEOD offers an easy-to-interpret and reliable alternative to existing metrics for evaluating frame-based medical object detection models. Its adoption could improve the assessment of clinical applicability and redefine performance standards in medical imaging research.
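The scenario described in the abstract, where a model finds the target in only a few frames of a sequence, can be illustrated with a minimal Python sketch of the traditional frame-level metrics. The `FrameCounts` type, the 100-frame sequence, and the `target_seen_in_sequence` helper are hypothetical and chosen only for illustration. In particular, the helper is a crude sequence-level check and is not the S-MEOD, SaP, or SoD computation, which the abstract does not define.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameCounts:
    """Per-frame detection outcome counts (hypothetical structure)."""
    tp: int  # true positives in this frame
    fp: int  # false positives in this frame
    fn: int  # false negatives (missed targets) in this frame


def precision_recall_f1(frames: List[FrameCounts]) -> Tuple[float, float, float]:
    """Pool counts over all frames and compute the standard metrics."""
    tp = sum(f.tp for f in frames)
    fp = sum(f.fp for f in frames)
    fn = sum(f.fn for f in frames)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def target_seen_in_sequence(frames: List[FrameCounts]) -> bool:
    """Illustrative only: was the target correctly detected in any frame?"""
    return any(f.tp > 0 for f in frames)


# Assumed example: one target visible in each of 100 frames.
# The model detects it correctly in 3 frames, misses it in the other 97,
# and produces no false positives.
frames = [FrameCounts(tp=1, fp=0, fn=0)] * 3 + [FrameCounts(tp=0, fp=0, fn=1)] * 97

precision, recall, f1 = precision_recall_f1(frames)
seen = target_seen_in_sequence(frames)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} seen={seen}")
```

In this assumed sequence, precision is perfect while recall and F1 are very low, even though the target was correctly flagged at least once. That gap between frame-level scores and sequence-level usefulness is the one the abstract argues a sequence-aware metric should capture.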