HMC-transducer: hierarchical mamba-CNN transducer for robust liver tumor segmentation.

January 23, 2026

Authors

Zhu J, Xu C, Lei C, Zhang G, Fang S, Zhang S, Chen J, Wang X

Affiliations (11)

  • Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, China.
  • Department of Oncology, The Second Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China.
  • Zhejiang Key Laboratory of Blood-Stasis-Toxin Syndrome, Zhejiang Chinese Medical University, Hangzhou, Zhejiang Province, China.
  • Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China.
  • School of Basic Medical Sciences, Zhejiang Chinese Medical University, Hangzhou, Zhejiang Province, China.
  • Traditional Chinese Medicine "Preventing Disease" Wisdom Health Project Research Center of Zhejiang, Hangzhou, Zhejiang Province, China.
  • Emergency Department, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, China.
  • Department of Hepatobiliary Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang, Hebei, China. [email protected].
  • Department of Oncology, Tongde Hospital of Zhejiang Province, Hangzhou, Zhejiang Province, China. [email protected].
  • Zhejiang Key Laboratory of Disease-Syndrome Integration for Cancer Prevention and Treatment, Tongde Hospital of Zhejiang Province, Hangzhou, Zhejiang Province, China. [email protected].
  • Emergency Department, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, China. [email protected].

Abstract

Accurate segmentation of liver tumors from computed tomography (CT) scans is critical for clinical diagnosis and treatment planning, yet it remains a significant challenge because tumors vary widely in shape and size and often have indistinct boundaries. Existing deep learning models, dominated by convolutional neural networks (CNNs) and transformers, face a fundamental trade-off: CNNs excel at capturing local features but are limited in modeling long-range spatial dependencies, while transformers capture global context at a quadratic cost that is computationally prohibitive for high-resolution 3D volumes. To overcome this, we propose the Hierarchical Mamba-CNN Transducer (HMC-Transducer), a novel and efficient hybrid architecture. Our model integrates the strengths of CNNs with the linear-complexity long-range modeling capabilities of Mamba, a recent state space model. The core innovations are twofold: (1) a direction-aware 3D Mamba (DA3D-Mamba) block, specifically designed to process volumetric data by preserving spatial topology along all three axes, and (2) a Mamba-CNN Transducer block with a gated fusion mechanism that learns to adaptively weigh and combine local and global features at each level of the network hierarchy. Extensive experiments on multiple public benchmarks, including LiTS17, MSD-liver, and KiTS21, demonstrate that HMC-Transducer not only sets a new state of the art in segmentation accuracy but also exhibits superior generalization and computational efficiency compared to leading CNN- and transformer-based methods. Our work presents a significant step towards developing generalizable and practical segmentation models for clinical applications.
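
The abstract describes the gated fusion mechanism only at a high level, so the following is a minimal sketch of one common way such a gate can be built, not the authors' implementation. It assumes the local (CNN) and global (Mamba) branches produce feature maps of the same shape, and that the gate is a sigmoid over a pointwise (1x1x1-convolution-like) projection of their concatenation. The function name `gated_fusion` and the parameters `w_gate` and `b_gate` are hypothetical, and NumPy stands in for a deep learning framework.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(local_feat, global_feat, w_gate, b_gate):
    """Blend two feature maps with a per-voxel, per-channel gate.

    Hypothetical sketch (assumed form, not taken from the paper):
      local_feat, global_feat: arrays of shape (C, D, H, W)
      w_gate: (C, 2C) weights of a pointwise projection
      b_gate: (C,) bias
    """
    # Stack both branches along the channel axis: (2C, D, H, W).
    stacked = np.concatenate([local_feat, global_feat], axis=0)
    # Pointwise projection back to C channels, applied at every voxel.
    logits = np.einsum("oc,cdhw->odhw", w_gate, stacked)
    logits = logits + b_gate[:, None, None, None]
    gate = sigmoid(logits)  # values in (0, 1)
    # Convex combination: gate near 1 favors local, near 0 favors global.
    return gate * local_feat + (1.0 - gate) * global_feat


# Toy inputs standing in for one level of the network hierarchy.
rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 16, 16
local_feat = rng.standard_normal((C, D, H, W))
global_feat = rng.standard_normal((C, D, H, W))
w_gate = rng.standard_normal((C, 2 * C)) * 0.1
b_gate = np.zeros(C)

fused = gated_fusion(local_feat, global_feat, w_gate, b_gate)
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding local and global values. With all-zero gate parameters the result reduces to a plain average of the two branches.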

Topics

Journal Article
