Improving Spatial Codification in Semantic Segmentation
This paper explores novel approaches for improving the spatial codification
for the pooling of local descriptors to solve the semantic segmentation
problem. We propose to partition the image into three regions for each object
to be described: Figure, Border and Ground. This partition aims at minimizing
the influence of the image context on the object description and vice versa by
introducing an intermediate zone around the object contour. Furthermore, we
also propose a richer visual descriptor of the object by applying a Spatial
Pyramid over the Figure region. Two novel Spatial Pyramid configurations are
explored: Cartesian-based and crown-based Spatial Pyramids. We test these
approaches with state-of-the-art techniques and show that they improve the
Figure-Ground based pooling in the Pascal VOC 2011 and 2012 semantic
segmentation challenges.
Comment: Paper accepted at the IEEE International Conference on Image
Processing, ICIP 2015. Quebec City, 27-30 September. Project page:
https://imatge.upc.edu/web/publications/improving-spatial-codification-semantic-segmentatio
CoupleNet: Coupling Global Structure with Local Parts for Object Detection
The region-based Convolutional Neural Network (CNN) detectors such as Faster
R-CNN or R-FCN have already shown promising results for object detection by
combining the region proposal subnetwork and the classification subnetwork
together. Although R-FCN has achieved higher detection speed while keeping the
detection performance, the global structure information is ignored by the
position-sensitive score maps. To fully explore the local and global
properties, in this paper, we propose a novel fully convolutional network,
named CoupleNet, to couple the global structure with local parts for object
detection. Specifically, the object proposals obtained by the Region Proposal
Network (RPN) are fed into the coupling module, which consists of two
branches. One branch adopts the position-sensitive RoI (PSRoI) pooling to
capture the local part information of the object, while the other employs the
RoI pooling to encode the global and context information. Next, we design
different coupling strategies and normalization methods to make full use of the
complementary advantages between the global and local branches. Extensive
experiments demonstrate the effectiveness of our approach. We achieve
state-of-the-art results on all three challenging datasets, i.e. a mAP of 82.7%
on VOC07, 80.4% on VOC12, and 34.4% on COCO. Code will be made publicly
available.
Comment: Accepted by ICCV 201
Monocular SLAM Supported Object Recognition
In this work, we develop a monocular SLAM-aware object recognition system
that is able to achieve considerably stronger recognition performance, as
compared to classical object recognition systems that function on a
frame-by-frame basis. By incorporating several key ideas including multi-view
object proposals and efficient feature encoding methods, our proposed system is
able to detect and robustly recognize objects in its environment using a single
RGB camera in near-constant time. Through experiments, we illustrate the
utility of using such a system to effectively detect and recognize objects,
incorporating multiple object viewpoint detections into a unified prediction
hypothesis. The performance of the proposed recognition system is evaluated on
the UW RGB-D Dataset, showing strong recognition performance and scalable
run-time performance compared to current state-of-the-art recognition systems.
Comment: Accepted to appear at Robotics: Science and Systems 2015, Rome, Ital