Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation
Pixel-level annotations are expensive and time-consuming to obtain. Hence,
weak supervision using only image tags could have a significant impact in
semantic segmentation. Recent years have seen great progress in
weakly-supervised semantic segmentation, whether from a single image or from
videos. However, most existing methods are designed to handle a single
background class. In practical applications, such as autonomous navigation, it
is often crucial to reason about multiple background classes. In this paper, we
introduce an approach to doing so by making use of classifier heatmaps. We then
develop a two-stream deep architecture that jointly leverages appearance and
motion, and design a loss based on our heatmaps to train it. Our experiments
demonstrate the benefits of our classifier heatmaps and of our two-stream
architecture on challenging urban scene datasets and on the YouTube-Objects
benchmark, where we obtain state-of-the-art results.
Comment: 11 pages, 4 figures, 7 tables, Accepted in ICCV 201
WEDGE: Web-Image Assisted Domain Generalization for Semantic Segmentation
Domain generalization for semantic segmentation is in high demand in
real-world applications, where a trained model is expected to work well in
previously unseen domains. One challenge is the lack of training data that
covers the diverse distributions of possible unseen domains. In this
paper, we propose a WEb-image assisted Domain GEneralization (WEDGE) scheme,
which is the first to exploit the diversity of web-crawled images for
generalizable semantic segmentation. To explore and exploit the real-world data
distributions, we collect a web-crawled dataset which presents large diversity
in terms of weather conditions, sites, lighting, camera styles, etc. We also
present a method which injects the style representation of the web-crawled data
into the source domain on-the-fly during training, which enables the network to
experience images of diverse styles with reliable labels for effective
training. Moreover, we use the web-crawled dataset with predicted pseudo labels
for training to further enhance the capability of the network. Extensive
experiments demonstrate that our method clearly outperforms existing domain
generalization techniques
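
The abstract above does not say how the style representation is injected. As a rough illustration only, the sketch below assumes that "style" means per-channel feature mean and standard deviation, and transfers those statistics in the manner of adaptive instance normalization. This is a swapped-in, commonly used technique, not necessarily the WEDGE method; the function names `channel_stats` and `inject_style` and the random arrays standing in for feature maps are hypothetical.

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-channel mean and standard deviation of a (C, H, W) feature map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std


def inject_style(source_feat, style_mean, style_std, eps=1e-5):
    """Re-normalize source features so they carry the given style statistics.

    Assumption: style is represented by channel-wise mean/std only.
    The spatial layout of the source (and thus its labels) is untouched.
    """
    src_mean, src_std = channel_stats(source_feat, eps)
    normalized = (source_feat - src_mean) / src_std
    return normalized * style_std + style_mean


# Hypothetical stand-ins for a labeled source feature map and a web-crawled one.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(3, 8, 8))
web = rng.normal(2.0, 3.0, size=(3, 8, 8))

web_mean, web_std = channel_stats(web)
stylized = inject_style(source, web_mean, web_std)
```

Because only channel statistics change, the source labels remain valid for the stylized features, which matches the abstract's point about seeing diverse styles with reliable labels.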