Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
Spatial pooling has been proven highly effective in capturing long-range
contextual information for pixel-wise prediction tasks, such as scene parsing.
In this paper, beyond conventional spatial pooling that usually has a regular
shape of NxN, we rethink the formulation of spatial pooling by introducing a
new pooling strategy, called strip pooling, which considers a long but narrow
kernel, i.e., 1xN or Nx1. Based on strip pooling, we further investigate
spatial pooling architecture design by 1) introducing a new strip pooling
module that enables backbone networks to efficiently model long-range
dependencies, 2) presenting a novel building block with diverse spatial pooling
as a core, and 3) systematically comparing the performance of the proposed
strip pooling and conventional spatial pooling techniques. Both novel
pooling-based designs are lightweight and can serve as an efficient
plug-and-play module in existing scene parsing networks. Extensive experiments
on popular benchmarks (e.g., ADE20K and Cityscapes) demonstrate that our simple
approach establishes new state-of-the-art results. Code is made available at
https://github.com/Andrew-Qibin/SPNet.
Comment: Published as a CVPR 2020 paper
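
To make the 1xN and Nx1 kernels concrete, the following is a minimal sketch of strip pooling on a single 2D feature map, written in plain Python. It is not taken from the SPNet repository: the function names `strip_pool` and `fuse_strips` are hypothetical, average pooling is assumed, the strips are assumed to span the full width or height of the map, and the fusion by broadcast addition is an illustrative choice that leaves out any learned layers the paper's module may use.

```python
def strip_pool(feature_map):
    """Average-pool a 2D map (list of equal-length rows) along full strips.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    # Horizontal strip (1 x W kernel): one average per row, giving H values.
    row_means = [sum(row) / width for row in feature_map]
    # Vertical strip (H x 1 kernel): one average per column, giving W values.
    col_means = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]
    return row_means, col_means


def fuse_strips(row_means, col_means):
    """Broadcast both strip outputs back to H x W and add them.

    Assumed fusion for illustration only: each output position combines
    the context of its entire row and its entire column.
    """
    return [[r + c for c in col_means] for r in row_means]


if __name__ == "__main__":
    fmap = [[1, 2, 3],
            [4, 5, 6]]
    rows, cols = strip_pool(fmap)
    print(rows)                     # per-row averages
    print(cols)                     # per-column averages
    print(fuse_strips(rows, cols))  # H x W map of row + column context
```

Each fused position depends on every element in its row and its column, which is how a narrow kernel can carry long-range context along one axis while staying local along the other, in contrast to an NxN window that mixes both axes equally.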