More complex encoder is not all you need

Dong, Zhiqi; Geng, Dehua; Li, Yusong; Wang, Pengwei; Xu, Longwei; Xu, Mingyuan; Yang, Weibin

Authors: Zhiqi Dong, Dehua Geng, Yusong Li, Pengwei Wang, Longwei Xu, Mingyuan Xu, Weibin Yang
Publication date: 27 October 2023

Abstract

U-Net and its variants have been widely used in medical image segmentation. However, most current U-Net variants confine their improvement strategies to building a more complex encoder, while leaving the decoder unchanged or adopting a simple symmetric structure. These approaches overlook the true function of the decoder: receiving low-resolution feature maps from the encoder and restoring feature map resolution and lost information through upsampling. As a result, the decoder, especially its upsampling component, plays a crucial role in enhancing segmentation outcomes. However, in 3D medical image segmentation, the commonly used transposed convolution can produce visual artifacts. This issue stems from the absence of a direct relationship between adjacent pixels in the output feature map. Furthermore, a plain encoder already possesses sufficient feature extraction capability, because downsampling gradually expands the receptive field, but the information lost during downsampling cannot be ignored. To address this gap in the research, we extend our focus beyond the encoder and introduce neU-Net (i.e., not complex encoder U-Net), which incorporates a novel Sub-pixel Convolution for upsampling to construct a powerful decoder. Additionally, we introduce a multi-scale wavelet input module on the encoder side to provide additional information. Our model design achieves excellent results, surpassing other state-of-the-art methods on both the Synapse and ACDC datasets.
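
The abstract does not spell out how the proposed Sub-pixel Convolution is built, so the following is only a minimal sketch of the generic 3D sub-pixel (depth-to-space) rearrangement that this kind of upsampling relies on, not the authors' module. It assumes a channel-first volume stored as nested Python lists, and the helper name `pixel_shuffle_3d` and the channel ordering are illustrative choices. In a real network, a learned convolution would first produce the `C * r**3` channels that are rearranged here.

```python
def pixel_shuffle_3d(x, r):
    """Rearrange a (C * r**3, D, H, W) volume into (C, D*r, H*r, W*r).

    `x` is a nested list indexed as x[channel][z][y][x]; `r` is the
    upscaling factor. The name and channel ordering are assumptions made
    for this sketch, not taken from the paper.
    """
    c_in = len(x)
    d, h, w = len(x[0]), len(x[0][0]), len(x[0][0][0])
    if r < 1 or c_in % (r ** 3) != 0:
        raise ValueError("channel count must be divisible by r**3")
    c_out = c_in // (r ** 3)

    # Allocate the upsampled output volume.
    out = [[[[0.0] * (w * r) for _ in range(h * r)]
            for _ in range(d * r)]
           for _ in range(c_out)]

    for co in range(c_out):
        for rd in range(r):
            for rh in range(r):
                for rw in range(r):
                    # Input channel that holds this sub-voxel offset.
                    ch = co * r ** 3 + rd * r * r + rh * r + rw
                    for z in range(d):
                        for y in range(h):
                            for xx in range(w):
                                # Each output voxel is copied from exactly
                                # one input value; nothing is summed.
                                out[co][z * r + rd][y * r + rh][xx * r + rw] = \
                                    x[ch][z][y][xx]
    return out
```

In this rearrangement, each block of `r**3` neighbouring output voxels is filled from the channels at a single input position, which is one way to tie adjacent outputs together instead of accumulating overlapping kernel contributions as a strided transposed convolution does.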

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2309.11139

Last time updated on 10/10/2023