MSTM-Net: a two-stage prostate cancer segmentation network based on swin-transformer-mamba architecture.

May 29, 2026 (PubMed)

Authors

Chen J, Liu X, Wang S, Ji W

Affiliations (4)

  • Shanghai University of Engineering Science, Shanghai, 201620, China.
  • Shanghai University of Engineering Science, Shanghai, 201620, China.
  • Department of Cell Biology, Harvard Medical School, Boston, 02115, MA, USA.
  • Shanghai Electric Power Hospital, Shanghai, 201620, China.

Abstract

Magnetic resonance imaging (MRI) has become a core imaging modality for prostate cancer screening and diagnosis. Accurate, automatic segmentation of lesion regions is critical for subsequent staging assessment and treatment planning. To this end, this study proposes a two-stage segmentation framework for multimodal MRI. In the first stage, the prostate gland is segmented to extract the region of interest (ROI), thereby removing complex pelvic background structures. In the second stage, fine-grained prostate cancer lesion segmentation is performed within the ROI, enabling the model to focus on anatomically plausible lesion regions. A segmentation network, termed MSTM-Net, is developed based on this framework. The network adopts a Swin Transformer-based decoder architecture. At the input stage, T2-weighted images and apparent diffusion coefficient (ADC) maps are spatially aligned and concatenated along the channel dimension. During decoding, a Mamba module based on state-space modeling is introduced to jointly capture local structural information and long-range dependencies. Multi-head attention fusion and multi-scale feature fusion are further integrated into the skip connections to enhance the consistency between shallow spatial details and deep semantic representations. Experiments conducted on the cleaned PROSTATEx dataset demonstrate that the proposed method achieves a Dice score of 95.38% for prostate gland segmentation and 63.89% for lesion segmentation, outperforming the best comparative network by approximately 4 percentage points, with an mIoU of 61.32%. Furthermore, cross-dataset validation on the PI-CAI dataset yields a Dice score of 63.14%, indicating good generalization ability and clinical feasibility for automated prostate cancer segmentation. The proposed MSTM-Net demonstrates effective performance for prostate cancer segmentation in multimodal MRI, achieving improved accuracy and feature representation compared with existing methods. The results indicate that the two-stage framework combined with multimodal fusion and state-space modeling is a promising approach, although further validation on larger and more diverse datasets is required to enhance robustness and generalization.
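
The data flow described in the abstract (channel-wise concatenation of aligned T2 and ADC inputs, cropping to the gland ROI between the two stages, and Dice-based evaluation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`stack_modalities`, `roi_bounding_box`, `crop_to_roi`, `dice_score`), the 2D slice shapes, and the `margin` parameter are assumptions introduced here for illustration, and the segmentation networks themselves are omitted.

```python
import numpy as np


def stack_modalities(t2: np.ndarray, adc: np.ndarray) -> np.ndarray:
    """Concatenate two spatially aligned 2D slices along a leading channel axis."""
    if t2.shape != adc.shape:
        raise ValueError("T2 and ADC slices must share the same spatial shape")
    return np.stack([t2, adc], axis=0)  # shape: (2, H, W)


def roi_bounding_box(gland_mask: np.ndarray, margin: int = 0):
    """Bounding box (r0, r1, c0, c1) of a binary gland mask, padded by `margin`.

    `margin` is a hypothetical parameter; the abstract does not state
    how the ROI is padded, if at all.
    """
    rows = np.flatnonzero(gland_mask.any(axis=1))
    cols = np.flatnonzero(gland_mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("gland mask is empty; no ROI can be extracted")
    height, width = gland_mask.shape
    r0 = max(int(rows[0]) - margin, 0)
    r1 = min(int(rows[-1]) + 1 + margin, height)
    c0 = max(int(cols[0]) - margin, 0)
    c1 = min(int(cols[-1]) + 1 + margin, width)
    return r0, r1, c0, c1


def crop_to_roi(image: np.ndarray, box) -> np.ndarray:
    """Crop the last two (spatial) axes of `image` to the bounding box."""
    r0, r1, c0, c1 = box
    return image[..., r0:r1, c0:c1]


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / denom)
```

In a two-stage pipeline of this kind, the mask passed to `roi_bounding_box` would come from the first-stage gland segmentation, and the cropped two-channel array would be the input to the second-stage lesion network.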

Topics

Journal Article
