Joint Object and Part Segmentation using Deep Learned Potentials
Segmenting semantic objects from images and parsing them into their
respective semantic parts are fundamental steps towards detailed object
understanding in computer vision. In this paper, we propose a joint solution
that tackles semantic object and part segmentation simultaneously, in which
higher object-level context is provided to guide part segmentation, and more
detailed part-level localization is utilized to refine object segmentation.
Specifically, we first introduce the concept of semantic compositional parts
(SCP) in which similar semantic parts are grouped and shared among different
objects. A two-channel fully convolutional network (FCN) is then trained to
provide the SCP and object potentials at each pixel. At the same time, a
compact set of segments can also be obtained from the SCP predictions of the
network. Given the potentials and the generated segments, in order to explore
long-range context, we finally construct an efficient fully connected
conditional random field (FCRF) to jointly predict the final object and part
labels. Extensive evaluation on three different datasets shows that our
approach can mutually enhance the performance of object and part segmentation,
and outperforms the current state of the art on both tasks.
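The core fusion idea can be illustrated with a toy sketch. The compatibility-based gating below is a deliberate simplification of the paper's FCRF inference, and every name in it is hypothetical:

```python
import numpy as np

def joint_part_scores(scp_pot, obj_pot, compat):
    """Gate part potentials by compatible object evidence (simplified sketch).

    scp_pot: (P, H, W) semantic-compositional-part potentials
    obj_pot: (O, H, W) object-class potentials
    compat:  (P, O) binary matrix, 1 where part p can belong to object o
    Returns (P, H, W) part scores boosted where a compatible object is likely.
    """
    # aggregate, per pixel, the object evidence each part is compatible with
    obj_context = np.einsum('po,ohw->phw', compat, obj_pot)
    return scp_pot * obj_context

# toy example: 2 parts, 2 objects, 3x3 image, one-to-one compatibility
scp = np.random.rand(2, 3, 3)
obj = np.random.rand(2, 3, 3)
compat = np.array([[1, 0], [0, 1]])  # part 0 -> object 0, part 1 -> object 1
scores = joint_part_scores(scp, obj, compat)
```

With a one-to-one `compat`, each part's score is simply scaled by its own object's potential; the paper's full model additionally enforces long-range pairwise consistency.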
LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention
LiDAR-based semantic segmentation is critical in the fields of robotics and
autonomous driving as it provides a comprehensive understanding of the scene.
This paper proposes a lightweight and efficient projection-based semantic
segmentation network called LENet with an encoder-decoder structure for
LiDAR-based semantic segmentation. The encoder is composed of a novel
multi-scale convolutional attention (MSCA) module with varying receptive field
sizes to capture features. The decoder employs an Interpolation And Convolution
(IAC) mechanism that upsamples multi-resolution feature maps via bilinear
interpolation and fuses features from the previous and current decoder stages
through a single convolution layer. This approach significantly
reduces the network's complexity while also improving its accuracy.
Additionally, we introduce multiple auxiliary segmentation heads to further
refine the network's accuracy. Extensive evaluations on publicly available
datasets, including SemanticKITTI, SemanticPOSS, and nuScenes, show that our
proposed method is lighter, more efficient, and robust compared to
state-of-the-art semantic segmentation methods. Full implementation is
available at https://github.com/fengluodb/LENet.
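The decoder's fusion step can be sketched in NumPy as "upsample, concatenate, one 1x1 convolution". This is an assumed minimal reading of the IAC description, not the reference implementation, and the function names are invented:

```python
import numpy as np

def bilinear_upsample(x, scale=2):
    """Upsample a (C, H, W) feature map by `scale` with bilinear interpolation
    (align_corners=False convention)."""
    C, H, W = x.shape
    ys = (np.arange(H * scale) + 0.5) / scale - 0.5
    xs = (np.arange(W * scale) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 1)
    y1 = np.clip(y0 + 1, 0, H - 1)
    x1 = np.clip(x0 + 1, 0, W - 1)
    wy = np.clip(ys - y0, 0, 1)[None, :, None]
    wx = np.clip(xs - x0, 0, 1)[None, None, :]
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bot = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy

def iac_fuse(deep, skip, weight):
    """IAC-style fusion (hypothetical sketch): upsample the lower-resolution
    features, concatenate with the current stage, apply one 1x1 convolution.

    deep: (C1, H/2, W/2), skip: (C2, H, W), weight: (C_out, C1 + C2)
    """
    up = bilinear_upsample(deep)
    fused = np.concatenate([up, skip], axis=0)       # (C1 + C2, H, W)
    return np.einsum('oc,chw->ohw', weight, fused)   # 1x1 convolution
```

Replacing transposed convolutions with interpolation plus a single convolution is what keeps the parameter count low here.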
Improving Facial Attribute Prediction using Semantic Segmentation
Attributes are semantically meaningful characteristics whose applicability
widely crosses category boundaries. They are particularly important in
describing and recognizing concepts where no explicit training example is
given, e.g., in zero-shot learning. Additionally, since attributes are
human describable, they can be used for efficient human-computer interaction.
In this paper, we propose to employ semantic segmentation to improve facial
attribute prediction. The core idea lies in the fact that many facial
attributes describe local properties. In other words, the probability that an
attribute appears in a face image is far from uniform across the spatial
domain. We build our facial attribute prediction model jointly with a deep
semantic segmentation network. This harnesses the localization cues learned by
the semantic segmentation to guide the attention of the attribute prediction to
the regions where different attributes naturally show up. As a result of this
approach, in addition to recognition, we are able to localize the attributes,
despite merely having access to image level labels (weak supervision) during
training. We evaluate our proposed method on CelebA and LFWA datasets and
achieve results superior to prior art. Furthermore, we show that the
reverse also holds: semantic face parsing improves when facial attributes are
available. This reaffirms the need to jointly model these two interconnected
tasks.
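One way to read "segmentation guides the attention of attribute prediction" is to use per-region segmentation posteriors as spatial attention when pooling features for each attribute. The sketch below assumes this reading; the region-to-attribute mapping and all names are hypothetical, not the paper's method:

```python
import numpy as np

def mask_guided_attribute_logits(features, seg_probs, region_of, W, b):
    """Hypothetical sketch of segmentation-guided attribute prediction.

    features:  (C, H, W) shared convolutional features
    seg_probs: (K, H, W) per-pixel face-region posteriors (e.g. hair, mouth)
    region_of: (A,) index of the region most relevant to each attribute
    W: (A, C), b: (A,) per-attribute linear classifier
    """
    A = len(region_of)
    logits = np.empty(A)
    for a, k in enumerate(region_of):
        # use the region's mask as a normalized spatial attention map
        attn = seg_probs[k] / (seg_probs[k].sum() + 1e-8)
        # attention-weighted global pooling instead of uniform average pooling
        pooled = np.einsum('chw,hw->c', features, attn)
        logits[a] = W[a] @ pooled + b[a]
    return logits

# toy example: 4 feature channels, two regions, two attributes
feats = np.ones((4, 5, 5))
segs = np.ones((2, 5, 5))
logits = mask_guided_attribute_logits(feats, segs, [0, 1],
                                      np.ones((2, 4)), np.zeros(2))
```

Because the pooling weights come from the segmentation, the attribute head receives localization for free, which is also what makes weakly supervised attribute localization possible at test time.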
- …