X-LAT-Net: An Interpretable Lightweight Axial Transformer Network for Pancreatic CT Segmentation
Authors
Abstract
Accurate segmentation of the pancreas in CT images is notoriously difficult due to its deep-seated anatomical position, complex morphological variation, and low contrast against surrounding soft tissues. Although deep learning has significantly improved segmentation performance, state-of-the-art models often suffer from excessive parameter counts and high computational complexity, hindering their deployment in resource-constrained clinical settings. Crucially, existing models operate as opaque black boxes without transparent decision-making mechanisms, which erodes clinician trust and restricts the practical utility of AI-aided diagnosis. To address these challenges, we propose X-LAT-Net, a lightweight segmentation network designed to balance efficiency, accuracy, and clinical interpretability. Built on a U-shaped architecture, the network incorporates an Axial Depthwise Convolution module to capture long-range spatial dependencies with minimal parameter overhead. Concurrently, we introduce an Interpretable Cross-scale Transformer (X-CATrans) module; this component not only enhances global context modeling but also generates intuitive attention heatmaps that visualize the model's decision-making rationale. Furthermore, a Shift-enhanced MLP module refines the capture of indistinct pancreatic boundaries. Extensive experiments on the NIH Pancreas-CT dataset demonstrate that X-LAT-Net achieves a Dice coefficient of 82.34% with only 1.6M parameters. It outperforms existing mainstream methods in both accuracy and inference speed while bolstering clinical confidence through visual interpretability, offering an efficient and reliable solution for intelligent pancreatic cancer screening.
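To illustrate the parameter-saving idea behind the Axial Depthwise Convolution module, the following is a minimal NumPy sketch, not the paper's implementation: each channel is filtered by a 1-D kernel along the height axis and then by another 1-D kernel along the width axis, so a K×K receptive field costs roughly 2K weights per channel instead of K². The function name and kernel shapes are hypothetical choices for this illustration.

```python
import numpy as np

def axial_depthwise_conv(x, kh, kw):
    """Sketch of axial depthwise convolution (assumed form, not the
    paper's exact module).

    x  : array of shape (C, H, W)
    kh : per-channel 1-D kernels of shape (C, K) applied along height
    kw : per-channel 1-D kernels of shape (C, K) applied along width
    """
    out = np.empty(x.shape, dtype=float)
    for c in range(x.shape[0]):
        # Vertical pass: convolve every column with this channel's height kernel.
        v = np.apply_along_axis(
            lambda col: np.convolve(col, kh[c], mode="same"), 0, x[c]
        )
        # Horizontal pass: convolve every row of the result with the width kernel.
        out[c] = np.apply_along_axis(
            lambda row: np.convolve(row, kw[c], mode="same"), 1, v
        )
    return out
```

With an identity kernel such as `[0, 1, 0]` on both axes, the sketch returns the input unchanged, which is a quick sanity check that the two 1-D passes compose as intended; in a real network the kernels would be learned and the passes fused with pointwise convolutions.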