Saliency-guided Adaptive Seeding for Supervoxel Segmentation
We propose a new saliency-guided method for generating supervoxels in 3D
space. Rather than using an evenly distributed spatial seeding procedure, our
method uses visual saliency to guide the process of supervoxel generation. This
results in densely distributed, small, and precise supervoxels in salient
regions which often contain objects, and larger supervoxels in less salient
regions that often correspond to background. Our approach largely improves the
quality of the resulting supervoxel segmentation in terms of boundary recall
and under-segmentation error on publicly available benchmarks.
Comment: 6 pages, accepted to IROS201
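The seeding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a per-voxel saliency score is already available and places seeds by systematic sampling along the cumulative saliency, so that salient regions receive more seeds. The function name `saliency_seeds` and the dictionary input format are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def saliency_seeds(saliency, n_seeds):
    """Pick seed voxels with density proportional to saliency.

    saliency: dict mapping a voxel coordinate (x, y, z) to a
              non-negative saliency score (assumed input format).
    n_seeds:  number of seeds to place.
    Returns a list of voxel coordinates; a highly salient voxel can be
    chosen more than once, so a real pipeline would deduplicate.
    """
    voxels = sorted(saliency)                      # fixed scan order
    cumulative = list(accumulate(saliency[v] for v in voxels))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("saliency must contain a positive score")
    seeds = []
    for i in range(n_seeds):
        # Evenly spaced targets in cumulative-saliency space.
        target = (i + 0.5) * total / n_seeds
        idx = min(bisect_right(cumulative, target), len(voxels) - 1)
        seeds.append(voxels[idx])
    return seeds


# Toy volume: a 1D strip where the right half is nine times more salient.
strip = {(x, 0, 0): (1.0 if x < 4 else 9.0) for x in range(8)}
seeds = saliency_seeds(strip, 8)
```

In this toy strip most seeds fall in the salient right half, which is the qualitative behaviour the abstract describes.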
Semantic Object Parsing with Graph LSTM
By taking the semantic object parsing task as an exemplar application
scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network,
which is the generalization of LSTM from sequential data or multi-dimensional
data to general graph-structured data. Particularly, instead of evenly and
fixedly dividing an image into pixels or patches as in existing multi-dimensional
LSTM structures (e.g., Row, Grid and Diagonal LSTMs), we take each
arbitrary-shaped superpixel as a semantically consistent node, and adaptively
construct an undirected graph for each image, where the spatial relations of
the superpixels are naturally used as edges. Constructed on such an adaptive
graph topology, the Graph LSTM is more naturally aligned with the visual
patterns in the image (e.g., object boundaries or appearance similarities) and
provides a more economical information propagation route. Furthermore, for each
optimization step over Graph LSTM, we propose to use a confidence-driven scheme
to update the hidden and memory states of nodes progressively till all nodes
are updated. In addition, for each node, the forget gates are adaptively
learned to capture different degrees of semantic correlation with neighboring
nodes. Comprehensive evaluations on four diverse semantic object parsing
datasets demonstrate the significant superiority of our Graph LSTM over
other state-of-the-art solutions.
Comment: 18 page
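As a rough illustration of the graph construction described above, the sketch below builds an undirected adjacency graph from a superpixel label map and orders the nodes by a confidence score. It covers only the topology and visiting order, not the LSTM itself; the 4-connectivity rule, the descending-confidence ordering, and the names `superpixel_graph` and `update_order` are assumptions made for illustration.

```python
def superpixel_graph(labels):
    """Build an undirected superpixel adjacency graph.

    labels: 2D list of superpixel ids, one per pixel.
    Returns a dict mapping each id to the set of ids it touches
    (4-connectivity, an assumption of this sketch).
    """
    rows, cols = len(labels), len(labels[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Checking right and down neighbours covers every pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


def update_order(confidence):
    """Visit nodes from most to least confident (ties broken by id)."""
    return sorted(confidence, key=lambda node: (-confidence[node], node))


labels = [[0, 1, 2],
          [0, 1, 2]]
graph = superpixel_graph(labels)
order = update_order({0: 0.2, 1: 0.9, 2: 0.5})
```

Here superpixels 0 and 2 are not adjacent, so information between them would have to pass through node 1.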
A Few Photons Among Many: Unmixing Signal and Noise for Photon-Efficient Active Imaging
Conventional LIDAR systems require hundreds or thousands of photon detections
to form accurate depth and reflectivity images. Recent photon-efficient
computational imaging methods are remarkably effective with only 1.0 to 3.0
detected photons per pixel, but they are not demonstrated at
signal-to-background ratio (SBR) below 1.0 because their imaging accuracies
degrade significantly in the presence of high background noise. We introduce a
new approach to depth and reflectivity estimation that focuses on unmixing
contributions from signal and noise sources. At each pixel in an image,
short-duration range gates are adaptively determined and applied to remove
detections likely to be due to noise. For pixels with too few detections to
perform this censoring accurately, we borrow data from neighboring pixels to
improve depth estimates, where the neighborhood formation is also adaptive to
scene content. Algorithm performance is demonstrated on experimental data at
varying levels of noise. Results show improved performance of both reflectivity
and depth estimates over state-of-the-art methods, especially at low
signal-to-background ratios. In particular, accurate imaging is demonstrated
with SBR as low as 0.04. This validation of a photon-efficient, noise-tolerant
method demonstrates the viability of rapid, long-range, and low-power LIDAR
imaging.
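The censoring step can be sketched for a single pixel as follows. This is a simplification, not the paper's algorithm: it assumes a fixed gate width and keeps the detections inside the window that contains the most photons, whereas the abstract describes adaptively determined gates and borrowing from neighboring pixels. The function name `range_gate` and the sample timestamps are hypothetical.

```python
def range_gate(times, gate_width):
    """Keep the detections inside the densest window of a given width.

    times:      photon detection times for one pixel (any unit).
    gate_width: duration of the range gate, in the same unit.
    Returns the sorted detections that fall inside the best window.
    """
    if not times:
        return []
    ts = sorted(times)
    best_start, best_count = 0, 0
    j = 0
    for i in range(len(ts)):
        # Advance j to the first detection outside the window starting at ts[i].
        while j < len(ts) and ts[j] - ts[i] <= gate_width:
            j += 1
        if j - i > best_count:
            best_start, best_count = i, j - i
    return ts[best_start:best_start + best_count]


# Three signal photons near t = 41 mixed with three background photons.
detections = [3.0, 41.2, 40.9, 77.5, 41.0, 12.4]
kept = range_gate(detections, gate_width=0.5)
depth_estimate = sum(kept) / len(kept)
```

Averaging only the gated detections gives a depth estimate that is not pulled toward the uniformly scattered background arrivals.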
Salient Object Detection via Augmented Hypotheses
In this paper, we propose using augmented hypotheses which consider
objectness, foreground and compactness for salient object detection. Our
algorithm consists of four basic steps. First, our method generates the
objectness map via objectness hypotheses. Based on the objectness map, we
estimate the foreground margin and compute the corresponding foreground map
which prefers the foreground objects. From the objectness map and the
foreground map, the compactness map is formed to favor the compact objects. We
then derive a saliency measure that produces a pixel-accurate saliency map
which uniformly covers the objects of interest and consistently separates fore-
and background. We finally evaluate the proposed framework on two challenging
datasets, MSRA-1000 and iCoSeg. Our extensive experimental results show that
our method outperforms state-of-the-art approaches.
Comment: IJCAI 2015 pape
Superpixels: An Evaluation of the State-of-the-Art
Superpixels group perceptually similar pixels to create visually meaningful
entities while heavily reducing the number of primitives for subsequent
processing steps. Because of these properties, superpixel algorithms have received
much attention since their naming in 2003. Today, publicly available
superpixel algorithms have turned into standard tools in low-level vision. As
such, and due to their quick adoption in a wide range of applications,
appropriate benchmarks are crucial for algorithm selection and comparison.
Until now, the rapidly growing number of algorithms as well as varying
experimental setups hindered the development of a unifying benchmark. We
present a comprehensive evaluation of 28 state-of-the-art superpixel algorithms
utilizing a benchmark focussing on fair comparison and designed to provide new
insights relevant for applications. To this end, we explicitly discuss
parameter optimization and the importance of strictly enforcing connectivity.
Furthermore, by extending well-known metrics, we are able to summarize
algorithm performance independent of the number of generated superpixels,
thereby overcoming a major limitation of available benchmarks. Furthermore, we
discuss runtime, robustness against noise, blur and affine transformations,
implementation details as well as aspects of visual quality. Finally, we
present an overall ranking of superpixel algorithms which redefines the
state-of-the-art and enables researchers to easily select appropriate
algorithms and the corresponding implementations which themselves are made
publicly available as part of our benchmark at
davidstutz.de/projects/superpixel-benchmark/
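One plausible way to summarize a metric independently of the number of superpixels is to average it over a range of superpixel counts, that is, to take the normalized area under the metric-versus-count curve. The sketch below does this with the trapezoidal rule; the function name `average_over_k` and the sample values are hypothetical, and the benchmark's exact definition may differ.

```python
def average_over_k(ks, values):
    """Average a metric over superpixel counts with the trapezoidal rule.

    ks:     superpixel counts at which the metric was measured.
    values: metric value at each count (same length as ks).
    Returns the area under the curve divided by the span of ks.
    """
    if len(ks) != len(values) or len(ks) < 2:
        raise ValueError("need at least two (count, value) pairs")
    pairs = sorted(zip(ks, values))
    area = 0.0
    for (k0, v0), (k1, v1) in zip(pairs, pairs[1:]):
        area += 0.5 * (v0 + v1) * (k1 - k0)
    return area / (pairs[-1][0] - pairs[0][0])


# Made-up error values for one algorithm at three superpixel counts.
summary = average_over_k([200, 400, 800], [0.30, 0.20, 0.10])
```

A single number like this lets two algorithms be compared even when they cannot be tuned to produce exactly the same number of superpixels.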
Video Propagation Networks
We propose a technique that propagates information forward through video
data. The method is conceptually simple and can be applied to tasks that
require the propagation of structured information, such as semantic labels,
based on video content. We propose a 'Video Propagation Network' that processes
video frames in an adaptive manner. The model is applied online: it propagates
information forward without the need to access future frames. In particular, we
combine two components: a temporal bilateral network for dense and video
adaptive filtering, followed by a spatial network to refine features and
increase flexibility. We present experiments on video object segmentation and
semantic video segmentation and show increased performance compared to the
best previous task-specific methods, while having favorable runtime.
Additionally, we demonstrate our approach on an example regression task of color
propagation in a grayscale video.
Comment: Appearing in Computer Vision and Pattern Recognition, 2017 (CVPR'17