GFF: Gated Fully Fusion for Semantic Segmentation
Semantic segmentation produces a comprehensive understanding of scenes by
densely predicting the category for each pixel. High-level features from Deep
Convolutional Neural Networks already demonstrate their effectiveness in
semantic segmentation tasks; however, the coarse resolution of high-level
features often leads to inferior results for small/thin objects where detailed
information is important. It is natural to consider importing low-level
features to compensate for the lost detailed information in high-level
features. Unfortunately, simply combining multi-level features suffers from the
semantic gap among them. In this paper, we propose a new architecture, named
Gated Fully Fusion (GFF), to selectively fuse features from multiple levels
using gates in a fully connected way. Specifically, features at each level are
enhanced by higher-level features with stronger semantics and lower-level
features with more details, and gates are used to control the propagation of
useful information, which significantly reduces noise during fusion. We
achieve state-of-the-art results on four challenging scene parsing datasets
including Cityscapes, Pascal Context, COCO-Stuff and ADE20K.
Comment: accepted by AAAI-2020 (oral
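The gating idea described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the fusion rule, so the formula used here (each level keeps its own features scaled by 1 + gate, and receives the gated features of all other levels scaled by 1 - gate) is an assumption, as are the names `gated_fully_fuse` and `gate_weights`, the single-channel gate produced by a per-pixel linear projection, and the premise that all levels have already been resized to a common resolution.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fully_fuse(features, gate_weights):
    """Fuse multi-level features with per-pixel gates (illustrative sketch).

    features:     list of arrays of shape (H, W, C), one per level, assumed
                  to be already resized to the same resolution.
    gate_weights: list of vectors of shape (C,), one per level; a
                  hypothetical stand-in for a learned 1x1 convolution.
    Returns a list of fused arrays, one per level, each of shape (H, W, C).
    """
    # One gate map per level, values in (0, 1), shape (H, W, 1).
    gates = [sigmoid(f @ w)[..., None] for f, w in zip(features, gate_weights)]

    # Sum of gated features over all levels.
    total = sum(g * f for g, f in zip(gates, features))

    fused = []
    for f, g in zip(features, gates):
        # Gated messages coming from every *other* level.
        others = total - g * f
        # Assumed rule: a confident level (gate near 1) keeps its own
        # features and accepts little from the other levels.
        fused.append((1.0 + g) * f + (1.0 - g) * others)
    return fused
```

With zero gate weights every gate equals 0.5, so three levels of all-ones features each fuse to 1.5 + 0.5 * (0.5 + 0.5) = 2.0, which is a convenient sanity check for the sketch.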
Real-time Semantic Segmentation with Context Aggregation Network
With the increasing demand for autonomous systems, pixelwise semantic
segmentation for visual scene understanding needs to be not only accurate but
also efficient for potential real-time applications. In this paper, we propose
Context Aggregation Network, a dual-branch convolutional neural network, with
significantly lower computational costs compared to the state of the art,
while maintaining a competitive prediction accuracy. Building upon the existing
dual-branch architectures for high-speed semantic segmentation, we design a
cheap high-resolution branch for effective spatial detailing and a context
branch with light-weight versions of global aggregation and local distribution
blocks, able to capture both the long-range and local contextual dependencies
required for accurate semantic segmentation, with low computational overhead.
We evaluate our method on two semantic segmentation datasets, namely Cityscapes
and UAVid. On the Cityscapes test set, our model achieves
state-of-the-art results with an mIoU of 75.9%, at 76 FPS on an NVIDIA RTX 2080Ti
and 8 FPS on a Jetson Xavier NX. On the UAVid dataset, our proposed
network achieves an mIoU of 63.5% with high execution speed (15 FPS).
Comment: extended version of v
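For readers unfamiliar with the mIoU figures quoted above, the metric can be sketched as follows: build a confusion matrix over all pixels, compute intersection over union per class, and average over the classes that occur. The function name `mean_iou` is hypothetical, and this sketch omits details that benchmark evaluation scripts typically handle, such as ignore labels and accumulation over a whole dataset.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps.

    pred, target: arrays of the same shape with class ids in
                  [0, num_classes).
    Classes absent from both pred and target are skipped in the average.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # confusion[t, p] counts pixels with true class t predicted as class p.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    # Union = predicted-as-class + truly-class - counted-twice overlap.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.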