1,407 research outputs found

Saliency-guided Adaptive Seeding for Supervoxel Segmentation

Author: Frintrop Simone
Gao Ge
Lauri Mikko
Zhang Jianwei
Publication date: 19/10/2017

We propose a new saliency-guided method for generating supervoxels in 3D space. Rather than using an evenly distributed spatial seeding procedure, our method uses visual saliency to guide the process of supervoxel generation. This results in densely distributed, small, and precise supervoxels in salient regions, which often contain objects, and larger supervoxels in less salient regions, which often correspond to background. Our approach substantially improves the quality of the resulting supervoxel segmentation in terms of boundary recall and under-segmentation error on publicly available benchmarks. Comment: 6 pages, accepted to IROS201
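The seeding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it only assumes that seeds are shared out across regions in proportion to a per-region saliency score, so that salient regions receive more (and therefore smaller) supervoxels. The function name `allocate_seeds` and its inputs are hypothetical.

```python
def allocate_seeds(saliency, total_seeds):
    """Split total_seeds across regions in proportion to their saliency.

    saliency: non-negative per-region saliency scores (hypothetical input).
    Returns integer seed counts that sum to total_seeds.
    """
    total = sum(saliency)
    if total <= 0:
        raise ValueError("saliency must contain a positive value")
    # Ideal (fractional) share of seeds for each region.
    shares = [total_seeds * s / total for s in saliency]
    counts = [int(share) for share in shares]
    # Hand out the seeds lost to rounding, largest remainder first.
    leftover = total_seeds - sum(counts)
    by_remainder = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Two background regions and one salient region.
print(allocate_seeds([1, 1, 8], 10))
```

Largest-remainder rounding is used here only so that the counts always add up to the requested number of seeds.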

arXiv.org e-Print Archive

Crossref

Semantic Object Parsing with Graph LSTM

Author: A Graves
E Simo-Serra
F Xia
S Hochreiter
X Liang
Y Wang
Publication date: 22/03/2016

By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which generalizes LSTM from sequential or multi-dimensional data to general graph-structured data. In particular, instead of evenly and fixedly dividing an image into pixels or patches as in existing multi-dimensional LSTM structures (e.g., Row, Grid and Diagonal LSTMs), we take each arbitrary-shaped superpixel as a semantically consistent node, and adaptively construct an undirected graph for each image, where the spatial relations of the superpixels are naturally used as edges. Constructed on such an adaptive graph topology, the Graph LSTM is more naturally aligned with the visual patterns in the image (e.g., object boundaries or appearance similarities) and provides a more economical information propagation route. Furthermore, for each optimization step over Graph LSTM, we propose to use a confidence-driven scheme to update the hidden and memory states of nodes progressively until all nodes are updated. In addition, for each node, the forget gates are adaptively learned to capture different degrees of semantic correlation with neighboring nodes. Comprehensive evaluations on four diverse semantic object parsing datasets demonstrate the significant superiority of our Graph LSTM over other state-of-the-art solutions. Comment: 18 page
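The graph construction described in this abstract (superpixels as nodes, spatial adjacency as undirected edges) can be sketched as follows. This is an illustrative assumption, not the paper's code: it takes a small 2D grid of integer superpixel labels and uses 4-connectivity; `superpixel_graph` is a hypothetical name.

```python
def superpixel_graph(labels):
    """Build undirected edges between superpixels that touch in a 2D label grid."""
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            # Checking right and bottom neighbours covers every 4-connected pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        # Store each undirected edge with the smaller label first.
                        edges.add((min(a, b), max(a, b)))
    return edges


grid = [
    [0, 0, 1],
    [0, 2, 1],
    [2, 2, 1],
]
print(sorted(superpixel_graph(grid)))
```

Each edge is stored once with the smaller label first, which keeps the graph undirected without duplicate entries.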

arXiv.org e-Print Archive

Crossref

A Few Photons Among Many: Unmixing Signal and Noise for Photon-Efficient Active Imaging

Author: Goyal Vivek K
Rapp Joshua
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/09/2016

Conventional LIDAR systems require hundreds or thousands of photon detections to form accurate depth and reflectivity images. Recent photon-efficient computational imaging methods are remarkably effective with only 1.0 to 3.0 detected photons per pixel, but they have not been demonstrated at signal-to-background ratios (SBR) below 1.0 because their imaging accuracy degrades significantly in the presence of high background noise. We introduce a new approach to depth and reflectivity estimation that focuses on unmixing contributions from signal and noise sources. At each pixel in an image, short-duration range gates are adaptively determined and applied to remove detections likely to be due to noise. For pixels with too few detections to perform this censoring accurately, we borrow data from neighboring pixels to improve depth estimates, where the neighborhood formation is also adaptive to scene content. Algorithm performance is demonstrated on experimental data at varying levels of noise. Results show improved performance of both reflectivity and depth estimates over state-of-the-art methods, especially at low signal-to-background ratios. In particular, accurate imaging is demonstrated with SBR as low as 0.04. This validation of a photon-efficient, noise-tolerant method demonstrates the viability of rapid, long-range, and low-power LIDAR imaging.
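The censoring step in this abstract can be pictured with a simplified sketch. It is not the authors' algorithm: the abstract says the gates are determined adaptively, whereas this example assumes a fixed gate width and simply keeps the window that contains the most detections, on the assumption that signal photons cluster in time while background photons are spread out. The name `best_range_gate` and the integer time bins are hypothetical.

```python
def best_range_gate(times, gate_width):
    """Find the window [start, start + gate_width] holding the most detections.

    times: photon detection times at one pixel (hypothetical time-bin indices).
    Returns (gate_start, kept_detections).
    """
    if not times:
        return None, []
    ordered = sorted(times)
    best_start, best_count = ordered[0], 0
    j = 0
    for i, start in enumerate(ordered):
        # Advance j to the first detection that falls outside the gate.
        while j < len(ordered) and ordered[j] <= start + gate_width:
            j += 1
        if j - i > best_count:
            best_start, best_count = start, j - i
    kept = [t for t in ordered if best_start <= t <= best_start + gate_width]
    return best_start, kept


start, kept = best_range_gate([97, 50, 3, 52, 51], gate_width=2)
print(start, kept)
```

A depth estimate could then be taken from the kept detections only, for example their mean, while detections outside the gate are discarded as likely noise.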

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)

Salient Object Detection via Augmented Hypotheses

Author: Nguyen Tam V.
Sepulveda Jose
Publication date: 29/05/2015

In this paper, we propose using augmented hypotheses, which consider objectness, foreground and compactness, for salient object detection. Our algorithm consists of four basic steps. First, our method generates the objectness map via objectness hypotheses. Based on the objectness map, we estimate the foreground margin and compute the corresponding foreground map, which favors the foreground objects. From the objectness map and the foreground map, the compactness map is formed to favor compact objects. We then derive a saliency measure that produces a pixel-accurate saliency map which uniformly covers the objects of interest and consistently separates foreground and background. We finally evaluate the proposed framework on two challenging datasets, MSRA-1000 and iCoSeg. Our extensive experimental results show that our method outperforms state-of-the-art approaches. Comment: IJCAI 2015 pape

arXiv.org e-Print Archive

University of Dayton

Superpixels: An Evaluation of the State-of-the-Art

Author: Hermans Alexander
Leibe Bastian
Stutz David
Publication venue: 'Elsevier BV'
Publication date: 19/04/2017

Superpixels group perceptually similar pixels to create visually meaningful entities while heavily reducing the number of primitives for subsequent processing steps. Because of these properties, superpixel algorithms have received much attention since their naming in 2003. Today, publicly available superpixel algorithms have become standard tools in low-level vision. As such, and due to their quick adoption in a wide range of applications, appropriate benchmarks are crucial for algorithm selection and comparison. Until now, the rapidly growing number of algorithms as well as varying experimental setups hindered the development of a unifying benchmark. We present a comprehensive evaluation of 28 state-of-the-art superpixel algorithms utilizing a benchmark focussing on fair comparison and designed to provide new insights relevant for applications. To this end, we explicitly discuss parameter optimization and the importance of strictly enforcing connectivity. Furthermore, by extending well-known metrics, we are able to summarize algorithm performance independent of the number of generated superpixels, thereby overcoming a major limitation of available benchmarks. In addition, we discuss runtime, robustness against noise, blur and affine transformations, implementation details as well as aspects of visual quality. Finally, we present an overall ranking of superpixel algorithms which redefines the state-of-the-art and enables researchers to easily select appropriate algorithms and the corresponding implementations, which themselves are made publicly available as part of our benchmark at davidstutz.de/projects/superpixel-benchmark/
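This abstract does not name the metrics it extends. As an assumed example, boundary recall (named in the first record on this page) can be sketched as the fraction of ground-truth boundary pixels that lie within a small tolerance of a superpixel boundary. The boundary definition, the square tolerance window and the function names below are illustrative choices, not the benchmark's implementation.

```python
def boundary_pixels(labels):
    """Pixels whose right or bottom neighbour carries a different label."""
    rows, cols = len(labels), len(labels[0])
    marked = set()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                marked.add((r, c))
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                marked.add((r, c))
    return marked


def boundary_recall(ground_truth, superpixels, tolerance=1):
    """Fraction of ground-truth boundary pixels near a superpixel boundary."""
    gt = boundary_pixels(ground_truth)
    sp = boundary_pixels(superpixels)
    if not gt:
        return 1.0  # no boundaries to recover
    offsets = range(-tolerance, tolerance + 1)
    hits = sum(
        1
        for (r, c) in gt
        if any((r + dr, c + dc) in sp for dr in offsets for dc in offsets)
    )
    return hits / len(gt)


truth = [[0, 0, 1, 1], [0, 0, 1, 1]]
shifted = [[0, 1, 1, 1], [0, 1, 1, 1]]
print(boundary_recall(truth, shifted, tolerance=1))
print(boundary_recall(truth, shifted, tolerance=0))
```

The tolerance parameter controls how far a superpixel boundary may be from the true boundary and still count as a match.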

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Video Propagation Networks

Author: Gadde Raghudeep
Gehler Peter V.
Jampani Varun
Publication date: 01/01/2017

We propose a technique that propagates information forward through video data. The method is conceptually simple and can be applied to tasks that require the propagation of structured information, such as semantic labels, based on video content. We propose a 'Video Propagation Network' that processes video frames in an adaptive manner. The model is applied online: it propagates information forward without the need to access future frames. In particular, we combine two components: a temporal bilateral network for dense and video-adaptive filtering, followed by a spatial network to refine features and increase flexibility. We present experiments on video object segmentation and semantic video segmentation and show increased performance compared to the best previous task-specific methods, while having favorable runtime. Additionally, we demonstrate our approach on an example regression task of color propagation in a grayscale video. Comment: Appearing in Computer Vision and Pattern Recognition, 2017 (CVPR'17

arXiv.org e-Print Archive

MPG.PuRe