Hybrid-CMLP: Hybrid CNN-MLP Networks for Low-to-standard-dose PET Synthesis
Authors
Abstract
Positron Emission Tomography (PET) is a critical medical imaging modality for detecting abnormalities and diagnosing diseases. However, radiation exposure from the radiotracer remains a significant concern. Synthesizing standard-dose PET (sPET) images from low-dose PET (lPET) images offers a compelling solution, enabling high-quality imaging while substantially reducing radiation risk to patients. Existing PET synthesis methods predominantly rely on convolutional neural networks (CNNs), which are inherently limited by their local receptive fields. Transformer-based architectures have been explored to address this limitation by modeling long-range dependencies through self-attention. However, the high computational and memory costs of self-attention restrict Transformers to downsampled feature maps, limiting their ability to model global anatomical and functional correlations at full resolution. Multilayer perceptrons (MLPs), in contrast, offer a computationally efficient way to capture long-range dependencies at full resolution. Yet existing MLP-based methods for PET synthesis lack mechanisms to capture localized features, such as subtle tissue textures, that are critical for PET functionality. In this study, we propose Hybrid-CMLP, a hybrid CNN-MLP network architecture for PET synthesis. Central to this architecture is a novel Hybrid-Syn block, which integrates axis-wise MLP branches that capture global dependencies along the spatial dimensions with dual CNN branches that extract both fine-grained and broad contextual features. To further enhance fine-grained reasoning, we introduce an Adaptive Fusion Mechanism (AFM) that dynamically integrates global and local features based on spatial context. Extensive experiments on two benchmark datasets demonstrate that Hybrid-CMLP consistently outperforms state-of-the-art methods in both qualitative and quantitative evaluations.
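To make the efficiency argument concrete, the following back-of-the-envelope sketch compares how the two kinds of global mixing scale with image size. It is an illustrative cost model only, not the authors' implementation. It assumes that "axis-wise MLP" means one dense H x H mixing along height followed by one dense W x W mixing along width, and that self-attention treats every pixel as a token. The function names `self_attention_pairs` and `axis_mlp_macs` are hypothetical.

```python
# Illustrative cost model only; not taken from the Hybrid-CMLP implementation.


def self_attention_pairs(h: int, w: int) -> int:
    """Entries in a single attention map when every pixel is a token.

    With N = H * W tokens, the attention map has N * N entries.
    """
    n = h * w
    return n * n


def axis_mlp_macs(h: int, w: int) -> int:
    """Multiply-adds per channel for dense mixing along height, then width.

    Assumption: an H x H weight matrix is applied to each of the W columns
    (W * H^2 multiply-adds), then a W x W weight matrix is applied to each
    of the H rows (H * W^2 multiply-adds), giving H * W * (H + W) in total.
    """
    return h * w * (h + w)


if __name__ == "__main__":
    for size in (32, 128, 256):
        attn = self_attention_pairs(size, size)
        mlp = axis_mlp_macs(size, size)
        print(
            f"{size}x{size}: attention-map entries={attn:,} "
            f"axis-MLP multiply-adds={mlp:,} ratio={attn / mlp:.0f}x"
        )
```

Under these assumptions the gap grows linearly with the side length of a square feature map (a factor of S/2 for an S x S map), which is consistent with attention being applied to downsampled features while axis-wise mixing remains affordable at full resolution. The two quantities are not identical units (map entries versus multiply-adds per channel), so the ratio should be read as a scaling trend rather than a measured speedup.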