1,448 research outputs found
Learning sparse representations of depth
This paper introduces a new method for learning and inferring sparse
representations of depth (disparity) maps. The proposed algorithm relaxes the
usual assumption of the stationary noise model in sparse coding. This enables
learning from data corrupted with spatially varying noise or uncertainty,
typically obtained by laser range scanners or structured light depth cameras.
Sparse representations are learned from the Middlebury database disparity maps
and then exploited in a two-layer graphical model for inferring depth from
stereo, by including a sparsity prior on the learned features. Since they
capture higher-order dependencies in the depth structure, these priors can
complement smoothness priors commonly used in depth inference based on Markov
Random Field (MRF) models. Inference on the proposed graph is achieved using an
alternating iterative optimization technique, where the first layer is solved
using an existing MRF-based stereo matching algorithm, then held fixed as the
second layer is solved using the proposed non-stationary sparse coding
algorithm. This leads to a general method for improving solutions of state of
the art MRF-based depth estimation algorithms. Our experimental results first
show that depth inference using learned representations leads to state of the
art denoising of depth maps obtained from laser range scanners and a time of
flight camera. Furthermore, we show that adding sparse priors improves the
results of two depth estimation methods: the classical graph cut algorithm by
Boykov et al. and the more recent algorithm of Woodford et al.Comment: 12 page
Sub-cortical brain structure segmentation using F-CNN's
In this paper we propose a deep learning approach for segmenting sub-cortical
structures of the human brain in Magnetic Resonance (MR) image data. We draw
inspiration from a state-of-the-art Fully-Convolutional Neural Network (F-CNN)
architecture for semantic segmentation of objects in natural images, and adapt
it to our task. Unlike previous CNN-based methods that operate on image
patches, our model is applied on a full blown 2D image, without any alignment
or registration steps at testing time. We further improve segmentation results
by interpreting the CNN output as potentials of a Markov Random Field (MRF),
whose topology corresponds to a volumetric grid. Alpha-expansion is used to
perform approximate inference imposing spatial volumetric homogeneity to the
CNN priors. We compare the performance of the proposed pipeline with a similar
system using Random Forest-based priors, as well as state-of-art segmentation
algorithms, and show promising results on two different brain MRI datasets.Comment: ISBI 2016: International Symposium on Biomedical Imaging, Apr 2016,
Prague, Czech Republi
Structured learning of sum-of-submodular higher order energy functions
Submodular functions can be exactly minimized in polynomial time, and the
special case that graph cuts solve with max flow \cite{KZ:PAMI04} has had
significant impact in computer vision
\cite{BVZ:PAMI01,Kwatra:SIGGRAPH03,Rother:GrabCut04}. In this paper we address
the important class of sum-of-submodular (SoS) functions
\cite{Arora:ECCV12,Kolmogorov:DAM12}, which can be efficiently minimized via a
variant of max flow called submodular flow \cite{Edmonds:ADM77}. SoS functions
can naturally express higher order priors involving, e.g., local image patches;
however, it is difficult to fully exploit their expressive power because they
have so many parameters. Rather than trying to formulate existing higher order
priors as an SoS function, we take a discriminative learning approach,
effectively searching the space of SoS functions for a higher order prior that
performs well on our training set. We adopt a structural SVM approach
\cite{Joachims/etal/09a,Tsochantaridis/etal/04} and formulate the training
problem in terms of quadratic programming; as a result we can efficiently
search the space of SoS priors via an extended cutting-plane algorithm. We also
show how the state-of-the-art max flow method for vision problems
\cite{Goldberg:ESA11} can be modified to efficiently solve the submodular flow
problem. Experimental comparisons are made against the OpenCV implementation of
the GrabCut interactive segmentation technique \cite{Rother:GrabCut04}, which
uses hand-tuned parameters instead of machine learning. On a standard dataset
\cite{Gulshan:CVPR10} our method learns higher order priors with hundreds of
parameter values, and produces significantly better segmentations. While our
focus is on binary labeling problems, we show that our techniques can be
naturally generalized to handle more than two labels
- …