625 research outputs found

Semantically Guided Depth Upsampling

Author: A Geiger
A Kundu
D Scharstein
J Kopf
J Liu
K He
K Yamaguchi
L Ladický
M Everingham
M Kiechle
P Dollar
Publication date: 02/08/2016

We present a novel method for accurate and efficient upsampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth interpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines globally consistent solutions and preserves fine details and sharp depth boundaries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state-of-the-art, even for very sparse measurements.

Comment: German Conference on Pattern Recognition 2016 (Oral)
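The abstract above combines its cues in a geodesic distance so that interpolation does not leak across object boundaries. As a rough illustration only (this is not the authors' method, which also fits locally planar elements and minimizes a global energy), the following sketch propagates sparse depth seeds over a 4-connected pixel grid with Dijkstra's algorithm, making it expensive to step onto pixels with a high boundary score. The function name `geodesic_fill`, the `boundary` cost map, the step cost `1 + boundary`, and the nearest-seed assignment are all assumptions made for the example.

```python
import heapq


def geodesic_fill(boundary, seeds):
    """Give every pixel the depth of its geodesically nearest seed.

    boundary: 2D list of non-negative boundary strengths (hypothetical cue map).
    seeds:    dict {(row, col): depth} holding the sparse measurements.
    """
    rows, cols = len(boundary), len(boundary[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    depth = [[None] * cols for _ in range(rows)]
    heap = []
    for (r, c), d in seeds.items():
        dist[r][c] = 0.0
        depth[r][c] = d
        heapq.heappush(heap, (0.0, r, c, d))

    while heap:
        cur, r, c, d = heapq.heappop(heap)
        if cur > dist[r][c]:
            continue  # stale queue entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed step cost: unit length plus the boundary penalty
                # of the pixel being entered.
                new = cur + 1.0 + boundary[nr][nc]
                if new < dist[nr][nc]:
                    dist[nr][nc] = new
                    depth[nr][nc] = d
                    heapq.heappush(heap, (new, nr, nc, d))
    return depth


flat = geodesic_fill([[0, 0, 0, 0, 0, 0]], {(0, 0): 1.0, (0, 5): 5.0})
edged = geodesic_fill([[0, 10, 0, 0, 0, 0]], {(0, 0): 1.0, (0, 5): 5.0})
```

In this toy row, the strong boundary at column 1 moves the depth discontinuity from the geometric midpoint towards the boundary: column 2 takes the left seed's depth in `flat` and the right seed's depth in `edged`.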

arXiv.org e-Print Archive

Crossref

Localizing Region-Based Active Contours

Author: Allen Tannenbaum
Shawn Lankton
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

DOI: 10.1109/TIP.2008.2004611

In this paper, we propose a natural framework that allows any region-based segmentation energy to be re-formulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behaviors of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models.
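To make the contrast between local and global image statistics concrete, here is a minimal sketch, assuming a binary segmentation mask and a circular neighbourhood around one contour point. It computes the mean intensity inside and outside the contour, restricted to that neighbourhood; a very large radius recovers the global means. The helper name `local_means` and the toy data are hypothetical, and the sketch shows only the statistics, not the contour evolution described in the abstract.

```python
def local_means(image, mask, center, radius):
    """Mean intensity inside/outside the mask within a disc around `center`.

    image:  2D list of intensities.
    mask:   2D list of 0/1 values, 1 meaning "inside the contour".
    center: (row, col) of the contour point being examined.
    radius: neighbourhood radius in pixels (the degree of localization).
    """
    cr, cc = center
    inside, outside = [], []
    for r in range(max(0, cr - radius), min(len(image), cr + radius + 1)):
        for c in range(max(0, cc - radius), min(len(image[0]), cc + radius + 1)):
            if (r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2:
                (inside if mask[r][c] else outside).append(image[r][c])

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(inside), mean(outside)


image = [[0, 0, 10, 20, 100, 100]]
mask = [[1, 1, 1, 0, 0, 0]]
local = local_means(image, mask, (0, 2), 1)     # statistics near the contour
globl = local_means(image, mask, (0, 2), 10)    # disc covers the whole image
```

Near the contour the two regions differ only slightly (5 versus 20), while the global means are pulled apart by the distant bright pixels, which is the kind of heterogeneity a purely global energy cannot account for.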

Scholarly Materials And Research @ Georgia Tech

CiteSeerX

Scene Parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

Author: Couprie Camille
Farabet Clément
LeCun Yann
Najman Laurent
Publication date: 01/01/2012

Scene parsing, or semantic segmentation, consists in labeling each pixel in an image with the category of the object it belongs to. It is a challenging task that involves the simultaneous detection, segmentation and recognition of all the objects in the image. The scene parsing method proposed here starts by computing a tree of segments from a graph of pixel dissimilarities. Simultaneously, a set of dense feature vectors is computed which encodes regions of multiple sizes centered on each pixel. The feature extractor is a multiscale convolutional network trained from raw pixels. The feature vectors associated with the segments covered by each node in the tree are aggregated and fed to a classifier which produces an estimate of the distribution of object categories contained in the segment. A subset of tree nodes that cover the image are then selected so as to maximize the average "purity" of the class distributions, hence maximizing the overall likelihood that each segment will contain a single object. The convolutional network feature extractor is trained end-to-end from raw pixels, alleviating the need for engineered features. After training, the system is parameter free. The system yields record accuracies on the Stanford Background Dataset (8 classes), the Sift Flow Dataset (33 classes) and the Barcelona Dataset (170 classes) while being an order of magnitude faster than competing approaches, producing a $320 \times 240$ image labeling in less than 1 second.

Comment: 9 pages, 4 figures - Published in 29th International Conference on Machine Learning (ICML 2012), Jun 2012, Edinburgh, United Kingdom
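The cover-selection step can be pictured with a small dynamic program over the segment tree. The sketch below is one possible reading rather than the paper's exact procedure: it assumes "purity" is measured by the entropy of each node's predicted class distribution, weights that entropy by segment size, and at every node keeps either the node itself or the best cover of its children, whichever costs less. The dictionary layout (`name`, `size`, `dist`, `children`) is hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy of a class distribution (lower means purer)."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def best_cover(node):
    """Return (cost, nodes) for the cheapest cover of the pixels under `node`.

    A cover is a set of tree nodes whose segments partition the region.
    Cost is size-weighted entropy, an assumed stand-in for impurity.
    """
    own_cost = node["size"] * entropy(node["dist"])
    if not node["children"]:
        return own_cost, [node]
    child_cost, child_cover = 0.0, []
    for child in node["children"]:
        cost, cover = best_cover(child)
        child_cost += cost
        child_cover += cover
    if own_cost <= child_cost:
        return own_cost, [node]   # the merged segment is at least as pure
    return child_cost, child_cover


leaf_a = {"name": "a", "size": 2, "dist": [1.0, 0.0], "children": []}
leaf_b = {"name": "b", "size": 2, "dist": [0.0, 1.0], "children": []}
root = {"name": "root", "size": 4, "dist": [0.5, 0.5], "children": [leaf_a, leaf_b]}
cost, cover = best_cover(root)
names = [n["name"] for n in cover]
```

Here the root mixes two classes evenly while each child is pure, so the selected cover consists of the two leaves.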

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences

Author: Bailer Christian
Kuschk Georg
Schuster René
Stricker Didier
Wasenmüller Oliver
Publication date: 27/10/2017

While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches. To this end, we find sparse matches across two stereo image pairs that are detected without any prior regularization and perform dense interpolation preserving geometric and motion boundaries by using edge information. A few iterations of variational energy minimization are performed to refine our results, which are thoroughly evaluated on the KITTI benchmark and additionally compared to state-of-the-art on MPI Sintel. For application in an automotive context, we further show that an optional ego-motion model helps to boost performance and blends smoothly into our approach to produce a segmentation of the scene into static and dynamic parts.

Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 201

arXiv.org e-Print Archive

Crossref

DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation

Author: Farag Amal
Liu Jiamin
Lu Le
Roth Holger R.
Shin Hoo-Chang
Summers Ronald M.
Turkbey Evrim
Publication date: 21/06/2015

Automatic organ segmentation is an important yet challenging problem for medical image analysis. The pancreas is an abdominal organ with very high anatomical variability. This inhibits previous segmentation methods from achieving high accuracies, especially compared to other organs such as the liver, heart or kidneys. In this paper, we present a probabilistic bottom-up approach for pancreas segmentation in abdominal computed tomography (CT) scans, using multi-level deep convolutional networks (ConvNets). We propose and evaluate several variations of deep ConvNets in the context of hierarchical, coarse-to-fine classification on image patches and regions, i.e. superpixels. We first present a dense labeling of local image patches via $P{-}\mathrm{ConvNet}$ and nearest neighbor fusion. Then we describe a regional ConvNet ($R_1{-}\mathrm{ConvNet}$) that samples a set of bounding boxes around each image superpixel at different scales of contexts in a "zoom-out" fashion. Our ConvNets learn to assign class probabilities for each superpixel region of being pancreas. Last, we study a stacked $R_2{-}\mathrm{ConvNet}$ leveraging the joint space of CT intensities and the $P{-}\mathrm{ConvNet}$ dense probability maps. Both 3D Gaussian smoothing and 2D conditional random fields are exploited as structured predictions for post-processing. We evaluate on CT images of 82 patients in 4-fold cross-validation. We achieve a Dice Similarity Coefficient of $83.6 \pm 6.3\%$ in training and $71.8 \pm 10.7\%$ in testing.

Comment: To be presented at MICCAI 2015 - 18th International Conference on Medical Computing and Computer Assisted Interventions, Munich, Germany
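For readers unfamiliar with the metric reported above, the Dice Similarity Coefficient of two segmentations A and B is 2|A ∩ B| / (|A| + |B|). The short sketch below computes it for sets of voxel indices; the function name and the convention of returning 1.0 when both sets are empty are assumptions of this example, not something the abstract specifies.

```python
def dice(pred, truth):
    """Dice Similarity Coefficient between two collections of voxel indices."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # assumed convention: two empty segmentations agree fully
    return 2 * len(pred & truth) / (len(pred) + len(truth))


score = dice({1, 2, 3}, {2, 3, 4})  # two shared voxels out of three each
```

With two of three voxels shared on each side, the score is 2·2 / (3 + 3) = 2/3.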

arXiv.org e-Print Archive

CiteSeerX

Efficient Decomposition of Image and Mesh Graphs by Lifted Multicuts

Author: Andres Bjoern
Bonneel Nicolas
Brox Thomas
Keuper Margret
Lavoué Guillaume
Levinkov Evgeny
Publication date: 01/01/2015

Formulations of the Image Decomposition Problem as a Multicut Problem (MP) w.r.t. a superpixel graph have received considerable attention. In contrast, instances of the MP w.r.t. a pixel grid graph have received little attention, firstly, because the MP is NP-hard and instances w.r.t. a pixel grid graph are hard to solve in practice, and, secondly, due to the lack of long-range terms in the objective function of the MP. We propose a generalization of the MP with long-range terms (LMP). We design and implement two efficient algorithms (primal feasible heuristics) for the MP and LMP which allow us to study instances of both problems w.r.t. the pixel grid graphs of the images in the BSDS-500 benchmark. The decompositions we obtain do not differ significantly from the state of the art, suggesting that the LMP is a competitive formulation of the Image Decomposition Problem. To demonstrate the generality of the LMP, we apply it also to the Mesh Decomposition Problem posed by the Princeton benchmark, obtaining state-of-the-art decompositions.
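As a small illustration of what long-range terms add to the objective, the sketch below evaluates a multicut-style cost for a given decomposition. It assumes a minimization convention in which an edge's cost is paid whenever its endpoints fall in different components, and it assumes every label forms a connected component of the base graph, so a long-range (lifted) edge is cut exactly when its endpoint labels differ. The function name and the toy costs are hypothetical; this evaluates a candidate solution and is not one of the paper's heuristics.

```python
def lifted_multicut_cost(labels, base_edges, lifted_edges):
    """Sum the costs of all cut edges under the node labelling `labels`.

    labels:       list mapping node index -> component label.
    base_edges:   (u, v, cost) tuples between neighbouring nodes.
    lifted_edges: (u, v, cost) tuples between non-neighbouring nodes.
    """
    return sum(
        cost
        for u, v, cost in list(base_edges) + list(lifted_edges)
        if labels[u] != labels[v]
    )


# Path graph 0-1-2-3 with one long-range term between its two ends.
base = [(0, 1, 2.0), (1, 2, -1.0), (2, 3, 2.0)]
lifted = [(0, 3, -3.0)]
split = lifted_multicut_cost([0, 0, 1, 1], base, lifted)
merged = lifted_multicut_cost([0, 0, 0, 0], base, lifted)
```

Splitting the path in the middle cuts the base edge (1, 2) and the lifted edge (0, 3), so the long-range term contributes to the preference for this decomposition over keeping everything merged.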

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

HAL

MPG.PuRe

Hal-Diderot