Are Vision-xLSTM-embedded U-Nets better at segmenting medical images?

Authors

Dutta P, Bose S, Roy SK, Mitra S

Affiliations (4)

  • Machine Intelligence Unit, Indian Statistical Institute, 203, B.T. Road, Kolkata, 700108, West Bengal, India.
  • Department of Computer Science and Engineering, Jadavpur University, 188, Raja Subodh Chandra Mallick Rd, Kolkata, 700032, West Bengal, India.
  • Department of Computer Science and Engineering, Alipurduar Government Engineering and Management College, Alipurduar, 736206, West Bengal, India.
  • Machine Intelligence Unit, Indian Statistical Institute, 203, B.T. Road, Kolkata, 700108, West Bengal, India.

Abstract

Efficient segmentation strategies for medical images have evolved from an initial dependence on Convolutional Neural Networks (CNNs) to the current investigation of hybrid models that combine CNNs with Vision Transformers (ViTs). There is an increasing focus on architectures that are both high-performing and computationally efficient, so that they can be deployed on remote systems with limited resources. Although transformers can capture global dependencies in the input space, they incur high computational and storage costs. This research proposes Vision Extended Long Short-Term Memory (Vision-xLSTM) as an appropriate backbone for medical image segmentation, offering strong performance at reduced computational cost. The study investigates the integration of CNNs with Vision-xLSTM by introducing the novel U-VixLSTM. The Vision-xLSTM blocks capture the temporal and global relationships among the patches extracted from the CNN feature maps. The convolutional feature reconstruction path then upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. U-VixLSTM outperforms state-of-the-art networks on the publicly available Synapse, ISIC and ACDC datasets. The findings suggest that U-VixLSTM is a promising alternative to ViTs for medical image segmentation, delivering effective performance without a substantial computational burden. This makes it feasible to deploy in resource-limited healthcare environments for faster diagnosis. Code is available at https://github.com/duttapallabi2907/U-VixLSTM.
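
The cost argument in the abstract can be made concrete with a rough count. The sketch below is not taken from the U-VixLSTM code; it assumes non-overlapping square patches, full self-attention that compares every token pair, and a recurrent scan that performs one state update per token. The function names, feature-map sizes and patch size are hypothetical and chosen only for illustration.

```python
def num_tokens(height, width, patch):
    """Number of non-overlapping patch tokens cut from a height x width feature map."""
    if height % patch or width % patch:
        raise ValueError("feature map must be divisible by the patch size")
    return (height // patch) * (width // patch)


def attention_pair_count(n_tokens):
    """Token-pair interactions in full self-attention: grows as N^2."""
    return n_tokens * n_tokens


def recurrent_step_count(n_tokens):
    """Sequential state updates in a recurrent scan over the tokens: grows as N."""
    return n_tokens


if __name__ == "__main__":
    # Hypothetical feature-map sizes and patch size (assumptions, not from the paper).
    for side in (32, 64, 128):
        n = num_tokens(side, side, patch=4)
        print(side, n, attention_pair_count(n), recurrent_step_count(n))
```

Under these assumptions, doubling the side of the feature map quadruples the number of tokens, which multiplies the attention pair count by sixteen but the recurrent step count by only four. The counts ignore hidden dimensions and constant factors, so they indicate scaling behaviour rather than measured cost.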

Topics

Journal Article
