3D CSFA-UNet: a unified attention-driven deep learning framework for accurate knee MRI segmentation and osteoarthritis severity classification.
Authors
Affiliations (2)
- Dr. Mahalingam College of Engineering and Technology, Pollachi, Tamil Nadu, India. [email protected].
- Dr. Mahalingam College of Engineering and Technology, Pollachi, Tamil Nadu, India.
Abstract
Although several recent multi-task deep learning methods already perform segmentation and classification jointly, many still face limitations in clinical applicability, such as restricted multi-scale context modeling, insufficient attention to clinically relevant spatial-channel cues, or heavy computational cost that hinders deployment. Building on these advances, we propose a unified, multi-stage framework for joint segmentation and classification of 3D knee MRI volumes that targets improved diagnostic precision, interpretability, and efficiency. The pipeline begins with Gaussian Guided Filtering to enhance anatomical boundaries while suppressing noise. A novel 3D CSFA-UNet (Channel-Spatial Feature Attention UNet) performs segmentation with embedded multi-scale context via Atrous Spatial Pyramid Pooling (ASPP). To reduce redundancy and isolate discriminative signals, we introduce the Desert Scorpion Feature Selector (DSFS), a metaheuristic feature-selection module. Selected features are classified by a Spiking Transformer network that uses Leaky Integrate-and-Fire (LIF) neurons and graph-attention layers to capture temporal sensitivity and inter-structure context. Falcon Hunting Optimisation (FHO) is used to tune hyperparameters for robust performance. Evaluated on the publicly available OAI dataset, the proposed model achieved a Dice Similarity Coefficient (DSC) of 98.10%, Intersection over Union (IoU) of 96.26%, Average Surface Distance (ASD) of 0.45 mm, and 95th-percentile Hausdorff Distance (HD95) of 1.85 mm for segmentation. For classification, the model attained an accuracy of 99.15%, precision of 98.82%, recall of 99.11%, and an F1-score of 99.04%, demonstrating its robustness and reliability across both segmentation and grading tasks.
This framework therefore advances clinically relevant, interpretable joint segmentation-classification for image-guided orthopaedic diagnostics.
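The abstract credits ASPP for the network's multi-scale context. The toy sketch below illustrates the underlying idea of atrous (dilated) convolution in plain Python, reduced to 1D for brevity; the actual model operates on 3D volumes with learned kernels, and the kernel values and dilation rates here are illustrative assumptions, not taken from the paper.

```python
def dilated_conv1d(x, kernel, dilation):
    """Valid-mode 1D convolution whose receptive field is widened by
    sampling the input with (dilation - 1) gaps between kernel taps,
    without adding any parameters."""
    k = len(kernel)
    span = (k - 1) * dilation + 1  # effective receptive-field width
    return [
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ]

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    """Toy ASPP head: run the same kernel at several dilation rates in
    parallel and average-pool each branch, mimicking how ASPP gathers
    context at multiple scales before fusing the branches."""
    branches = [dilated_conv1d(x, kernel, r) for r in rates]
    return [sum(b) / len(b) for b in branches]
```

With dilation 2, a 2-tap kernel spans three input positions instead of two, which is how larger anatomical context is captured at the same parameter cost.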
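The Spiking Transformer's temporal sensitivity rests on LIF neuron dynamics. The minimal discrete-time simulation below uses a common LIF formulation with a hard reset; the time constant, threshold, and reset rule are generic defaults assumed for illustration, since the abstract does not specify the paper's exact parameterisation.

```python
def lif_simulate(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire (LIF) neuron over discrete
    time steps using the common update
        v[t] = v[t-1] + (x[t] - (v[t-1] - v_reset)) / tau,
    emitting a binary spike and hard-resetting whenever the membrane
    potential v reaches v_threshold."""
    v = v_reset
    spikes = []
    for x in inputs:
        v = v + (x - (v - v_reset)) / tau  # leaky integration of input
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset                    # hard reset after firing
        else:
            spikes.append(0)
    return spikes
```

Because the membrane potential leaks between steps, only sufficiently strong or sustained input drives the neuron over threshold, which is what gives spiking layers their sensitivity to input timing.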