The U-shaped architecture has emerged as a crucial paradigm in the design of
medical image segmentation networks. However, because convolution is inherently
local, a fully convolutional segmentation network with a U-shaped architecture
struggles to extract global context information, which is vital for the precise
localization of lesions. While
hybrid architectures combining CNNs and Transformers can address this issue,
their application in real medical scenarios is limited by the computational
resource constraints of the deployment environment and edge devices. In
addition, the convolutional inductive bias of lightweight networks is well
suited to scarce medical data, and Transformer-based networks lack it. In
order to extract global context information while taking advantage of this
inductive bias, we propose CMUNeXt, an efficient, lightweight, fully
convolutional medical image segmentation network that enables fast and accurate
auxiliary diagnosis in real-world scenarios. CMUNeXt leverages a large-kernel,
inverted-bottleneck design to thoroughly mix distant spatial and location
information, efficiently extracting global context information. We also
introduce the Skip-Fusion block, designed to enable smooth skip connections and
ensure ample feature fusion. Experimental results on multiple medical image
datasets demonstrate that CMUNeXt outperforms existing heavyweight and
lightweight medical image segmentation networks in terms of segmentation
performance, while offering faster inference, fewer parameters, and lower
computational cost. The code is available at
https://github.com/FengheTan9/CMUNeXt.
Comment: 8 pages, 3 figures
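
To make the efficiency argument for a large-kernel, inverted-bottleneck design concrete, the following minimal sketch counts learnable parameters for one such block and compares it with a dense convolution of the same kernel size. It is not the CMUNeXt implementation: the use of a depthwise convolution for the large kernel, the expansion ratio of 4, the channel width of 64, and the helper names are all assumptions made for illustration, and normalization layers are ignored.

```python
def conv_params(in_ch, out_ch, kernel, groups=1, bias=True):
    """Learnable parameters of a 2D convolution with a square kernel."""
    weights = (in_ch // groups) * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def inverted_bottleneck_params(channels, kernel, expansion=4):
    """Hypothetical block: depthwise k x k conv, then 1x1 expand, then 1x1 project.

    Assumption: the large kernel is applied depthwise (groups == channels),
    so its cost grows with k*k*channels rather than k*k*channels**2.
    """
    depthwise = conv_params(channels, channels, kernel, groups=channels)
    expand = conv_params(channels, channels * expansion, 1)
    project = conv_params(channels * expansion, channels, 1)
    return depthwise + expand + project


def dense_conv_params(channels, kernel):
    """A standard (non-grouped) k x k convolution, for comparison."""
    return conv_params(channels, channels, kernel)


if __name__ == "__main__":
    channels = 64  # illustrative width, not taken from the paper
    for kernel in (3, 7, 9):
        block = inverted_bottleneck_params(channels, kernel)
        dense = dense_conv_params(channels, kernel)
        print(f"k={kernel}: block={block} dense={dense}")
```

Under these assumptions, enlarging the kernel changes only the depthwise term, so the block's parameter count grows slowly with kernel size while a dense convolution grows quadratically in the channel count for every added kernel position.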