Depth map compression via 3D region-based representation
In 3D video, view synthesis is used to create new virtual views between
encoded camera views. Errors in the coding of the depth maps introduce
geometry inconsistencies in synthesized views. In this paper, a new 3D plane
representation of the scene is presented which improves the performance of
current standard video codecs in the view synthesis domain. Two image segmentation
algorithms are proposed for generating a color and depth segmentation.
Using both partitions, depth maps are segmented into regions free of
sharp discontinuities, without having to explicitly signal all depth edges. The
resulting regions are represented using a planar model in the 3D world scene.
This 3D representation allows an efficient encoding while preserving the 3D
characteristics of the scene. The 3D planes open up the possibility to code
multiview images with a unique representation.
Postprint (author's final draft
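The idea of representing a depth region by a plane in 3D can be illustrated with a minimal sketch. It is not the authors' method: it assumes a pinhole camera with hypothetical intrinsics (f, cx, cy), a region given as a boolean mask, and an ordinary least-squares plane fit via SVD. The function names backproject and fit_plane are made up for this illustration.

```python
import numpy as np

def backproject(depth, mask, f, cx, cy):
    """Lift the masked pixels to 3D camera coordinates (assumed pinhole model)."""
    v, u = np.nonzero(mask)                      # row (v) and column (u) indices
    z = depth[v, u]
    return np.column_stack([(u - cx) * z / f, (v - cy) * z / f, z])

def fit_plane(points):
    """Least-squares plane through 3D points: unit normal n and offset d with n.p = d."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                              # direction of smallest variance
    return normal, float(normal @ centroid)

# Toy region: synthesize the depth of a slanted plane n_true.p = d_true.
f, cx, cy = 100.0, 4.0, 4.0                      # hypothetical intrinsics
n_true, d_true = np.array([0.1, -0.2, 1.0]), 3.0
v, u = np.mgrid[0:8, 0:8]
depth = d_true / (n_true[0] * (u - cx) / f + n_true[1] * (v - cy) / f + n_true[2])
mask = np.ones_like(depth, dtype=bool)

points = backproject(depth, mask, f, cx, cy)
normal, d = fit_plane(points)
residual = np.abs(points @ normal - d).max()     # distance of the points to the fitted plane
```

In this sketch a whole region is summarized by four numbers (the normal and the offset), which hints at why a planar model can be cheap to encode; how the paper actually parameterizes and codes its planes is not stated in the abstract.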
Single-Shot Clothing Category Recognition in Free-Configurations with Application to Autonomous Clothes Sorting
This paper proposes a single-shot approach for recognising clothing
categories from 2.5D features. We propose two visual features, BSP (B-Spline
Patch) and TSD (Topology Spatial Distances) for this task. The local BSP
features are encoded by LLC (Locality-constrained Linear Coding) and fused with
three different global features. Our visual feature is robust to deformable
shapes and our approach is able to recognise the category of unknown clothing
in unconstrained and random configurations. We integrated the category
recognition pipeline with a stereo vision system, clothing instance detection,
and dual-arm manipulators to achieve an autonomous sorting system. To verify
the performance of our proposed method, we build a high-resolution RGBD
clothing dataset of 50 clothing items of 5 categories sampled in random
configurations (a total of 2,100 clothing samples). Experimental results show
that our approach is able to reach 83.2% accuracy while classifying clothing
items which were previously unseen during training. This advances beyond the
previous state-of-the-art by 36.2%. Finally, we evaluate the proposed approach
in an autonomous robot sorting system, in which the robot recognises a clothing
item from an unconstrained pile, grasps it, and sorts it into a box according
to its category. Our proposed sorting system achieves reasonable sorting
success rates with single-shot perception.
Comment: 9 pages, accepted by IROS201
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we give an overview of recent graph
spectral techniques in GSP specifically for image / video processing. The topics
covered include image compression, image restoration, image filtering and image
segmentation.
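Treating an image patch as a signal on a graph can be sketched as follows. The sketch assumes one common construction that the abstract does not specify: a 4-connected pixel grid, Gaussian edge weights on intensity differences with a hypothetical parameter sigma, and the combinatorial Laplacian L = D - W. The graph Fourier transform is then the projection onto the eigenvectors of L.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected grid over the patch.

    Edge weights are exp(-(x_i - x_j)^2 / sigma^2), so edges that cross a
    strong intensity change get a weight close to zero (an assumed choice).
    """
    h, w = patch.shape
    W = np.zeros((h * w, h * w))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):      # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-(diff ** 2) / sigma ** 2)
    return np.diag(W.sum(axis=1)) - W

# Toy 4x4 patch with a vertical step edge.
patch = np.tile(np.array([0.1, 0.1, 0.9, 0.9]), (4, 1))
x = patch.ravel()

L = grid_graph_laplacian(patch)
evals, U = np.linalg.eigh(L)     # graph frequencies (ascending) and graph Fourier basis
coeffs = U.T @ x                 # forward graph Fourier transform
x_rec = U @ coeffs               # inverse transform recovers the signal
```

Because the weights follow the image structure, a piecewise-smooth patch like this one tends to concentrate its energy in the low graph frequencies, which is the property that the compression and restoration uses of GSP rely on.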
CASENet: Deep Category-Aware Semantic Edge Detection
Boundary and edge cues are highly beneficial in improving a wide variety of
vision tasks such as semantic segmentation, object recognition, stereo, and
object proposal generation. Recently, the problem of edge detection has been
revisited and significant progress has been made with deep learning. While
classical edge detection is a challenging binary problem in itself,
category-aware semantic edge detection is by nature an even more challenging
multi-label problem. We model the problem such that each edge pixel can be
associated with more than one class as they appear in contours or junctions
belonging to two or more semantic classes. To this end, we propose a novel
end-to-end deep semantic edge learning architecture based on ResNet and a new
skip-layer architecture where category-wise edge activations at the top
convolution layer share and are fused with the same set of bottom layer
features. We then propose a multi-label loss function to supervise the fused
activations. We show that our proposed architecture benefits this problem with
better performance, and we outperform the current state-of-the-art semantic
edge detection methods by a large margin on standard data sets such as SBD and
Cityscapes.
Comment: Accepted to CVPR 201
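The multi-label formulation, in which one edge pixel may belong to several classes at once, can be illustrated with a minimal sketch. It assumes an independent sigmoid cross-entropy per class and per pixel, averaged without any class reweighting; the abstract does not give the exact loss, so this is only a stand-in, and the function name multilabel_edge_loss is hypothetical.

```python
import numpy as np

def multilabel_edge_loss(logits, labels):
    """Mean sigmoid cross-entropy over arrays of shape (classes, height, width).

    Each class is scored independently, so a pixel can be a positive for
    several classes at once, unlike a softmax over classes.
    """
    x = logits.astype(float)
    y = labels.astype(float)
    # Numerically stable form of -[y*log(s(x)) + (1-y)*log(1-s(x))]
    per_entry = np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x)))
    return float(per_entry.mean())

# Toy labels: 3 classes on a 2x2 image; pixel (0, 0) lies on a boundary
# shared by class 0 and class 1, so it is positive for both.
labels = np.zeros((3, 2, 2), dtype=int)
labels[0, 0, 0] = 1
labels[1, 0, 0] = 1
labels[2, 1, 1] = 1

uninformed = multilabel_edge_loss(np.zeros(labels.shape), labels)   # all probabilities 0.5
confident = multilabel_edge_loss(20.0 * (2 * labels - 1), labels)   # strongly correct logits
```

With a softmax the two positive labels at pixel (0, 0) would compete with each other; with per-class sigmoids both can be driven towards one, which is the point of modelling the task as multi-label.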