4DO-DETR for otitis media detection.

June 12, 2026

Authors

Zhao X, Zhang H, Liu D, Lei G

Affiliations (8)

  • Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Xiangnan University, Chenzhou, 423300, China.
  • School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, 230026, China.
  • The University of Hong Kong, Hong Kong, China.
  • School of Computer and Artificial Intelligence, Xiangnan University, Chenzhou, 423300, China.
  • Key Laboratory of Medical Imaging and Artificial Intelligence of Hunan Province, Xiangnan University, Chenzhou, 423300, China.
  • Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Xiangnan University, Chenzhou, 423300, China. [email protected].
  • Key Laboratory of Medical Imaging and Artificial Intelligence of Hunan Province, Xiangnan University, Chenzhou, 423300, China. [email protected].
  • Clinical College, Xiangnan University, Chenzhou, 423300, China. [email protected].

Abstract

Otitis media (OM) and its complications can cause immense suffering for patients. However, because the number of experts and their knowledge base are limited, human specialists can review only a limited number of CT images, which makes auxiliary diagnosis of OM crucial. Object detection is one of the primary tasks in computer vision, and with the development of Transformers and attention mechanisms in recent years, DETR-series detectors have gradually become mainstream. However, these detectors are highly sensitive to the number of decoder layers: even slight deviations in the layer count significantly impair model performance. The required number of decoder layers varies across datasets, and grayscale images such as CT scans likely need a different number than color images. Given the stringent performance requirements of clinical CT diagnostics, this paper analyzes the decline in model effectiveness caused by excessive decoder layers and proposes a new model, 4DO-DETR, to address the performance instability of DETR. 4DO-DETR shows significant improvement over the baseline model (DN-DAB-DETR) and surpasses Co-DETR, DINO, and RT-DETR, which are state-of-the-art (SOTA) algorithms from the past two years. Rigorous experimental evaluations on a benchmark medical dataset demonstrate that our method achieves higher scores than other sophisticated network models: our model’s mAP reached 56.8%, higher than DINO’s 54.7%, Co-DETR’s 54.0%, and the baseline’s 45.1%. The datasets employed in this paper are available at https://github.com/promisedong/Four-DO-DETR.
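
To put the reported scores side by side, the short Python sketch below computes the absolute mAP margin of 4DO-DETR over each comparison model. It uses only the four percentages stated in the abstract; the variable names are illustrative, and no RT-DETR figure is included because the abstract does not report one.

```python
# mAP values (percent) as reported in the abstract; nothing else is assumed.
reported_map = {
    "4DO-DETR": 56.8,
    "DINO": 54.7,
    "Co-DETR": 54.0,
    "DN-DAB-DETR (baseline)": 45.1,
}

proposed = reported_map["4DO-DETR"]

# Absolute margin in percentage points, rounded to one decimal place
# to avoid floating-point noise in the subtraction.
margins = {
    name: round(proposed - score, 1)
    for name, score in reported_map.items()
    if name != "4DO-DETR"
}

for name, margin in margins.items():
    print(f"4DO-DETR vs {name}: +{margin} mAP points")
```

On these figures the margin is about 2 to 3 points over DINO and Co-DETR and more than 11 points over the baseline.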

Topics

Journal Article
