
Medical hierarchical image classification via dual-geometry image-text learning.

April 30, 2026

Authors

Fan L, Sowmya A, Meijering E, Yu Z, Ge Z, Song Y

Affiliations (3)

  • Centre for Healthy Brain Ageing, Discipline of Psychiatry and Mental Health, School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Australia; School of Computer Science and Engineering, UNSW Sydney, Australia.
  • School of Computer Science and Engineering, UNSW Sydney, Australia.
  • Department of Data Science & AI, Monash University, Australia.

Abstract

Hierarchical image classification is a fundamental challenge in medical image analysis, as tree-structured taxonomies inherently reflect biological and clinical relationships, spanning the general categorisation of disease entities and fine-grained cellular distinctions. Existing approaches primarily rely on multi-task learning and fine-grained detection, often requiring intricate model design and complex training strategies. In this paper, we aim to exploit the negative curvature property of hyperbolic space, which allows efficient representation of hierarchical structures. We propose a dual-geometry image-text framework, termed H²CL. Specifically, we introduce a lightweight classifier head on top of image backbones to extract both Euclidean and hyperbolic features, which are then combined to simultaneously preserve taxonomic consistency from an etiological perspective and enhance instance discrimination from a morphological perspective. Furthermore, a text branch is incorporated to integrate label semantics, where an entailment loss is employed to jointly model image-text alignment and inter-sample relationships. Extensive experiments on cervical cell, skin lesion, and gallbladder disease datasets demonstrate that our framework consistently outperforms advanced methods. Compared to the standard Swin Transformer, H²CL achieves an average accuracy improvement of 7% across all three datasets at the fine-grained level, with similarly consistent gains observed when integrated with other backbone models. The source code is publicly available at https://github.com/MCPathology/H2CL.
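
To illustrate why negative curvature suits tree-structured labels, the following is a minimal sketch and is not taken from the H2CL repository. It assumes the unit Poincaré ball (curvature -1) as the hyperbolic model; the function names `expmap0`, `poincare_dist`, and `dual_features` are hypothetical, and the plain concatenation in `dual_features` is only one possible way to combine the two geometries, since the abstract does not specify how the features are fused.

```python
import math

def expmap0(v):
    """Map a Euclidean (tangent) vector at the origin onto the unit Poincare ball."""
    norm = math.sqrt(sum(t * t for t in v))
    if norm == 0.0:
        return list(v)
    # tanh keeps the result strictly inside the ball for moderate norms;
    # very large norms saturate to the boundary in floating point.
    scale = math.tanh(norm) / norm
    return [scale * t for t in v]

def poincare_dist(x, y):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * sq / ((1.0 - nx) * (1.0 - ny)))

def dual_features(z):
    """Hypothetical fusion: Euclidean feature followed by its hyperbolic image."""
    return list(z) + expmap0(z)

# Two pairs with the same Euclidean gap (0.1) along one axis.
near_origin = poincare_dist([0.1, 0.0], [0.2, 0.0])
near_boundary = poincare_dist([0.8, 0.0], [0.9, 0.0])
print(near_origin < near_boundary)
```

The same Euclidean gap corresponds to a larger hyperbolic distance near the boundary, so there is more room to separate many fine-grained classes there while coarse parent classes sit closer to the origin. That is the usual intuition behind embedding hierarchies in hyperbolic space.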

Topics

Journal Article
