8,105 research outputs found
Growing Regression Forests by Classification: Applications to Object Pose Estimation
In this work, we propose a novel node splitting method for regression trees
and incorporate it into the regression forest framework. Unlike traditional
binary splitting, where the splitting rule is selected from a predefined set of
binary splitting rules via trial-and-error, the proposed node splitting method
first finds clusters of the training data which at least locally minimize the
empirical loss without considering the input space. Then splitting rules which
preserve the found clusters as much as possible are determined by casting the
problem into a classification problem. Consequently, our new node splitting
method enjoys more freedom in choosing the splitting rules, resulting in more
efficient tree structures. In addition to the Euclidean target space, we
present a variant which can naturally deal with a circular target space by the
proper use of circular statistics. We apply the regression forest employing our
node splitting to head pose estimation (Euclidean target space) and car
direction estimation (circular target space) and demonstrate that the proposed
method significantly outperforms state-of-the-art methods (38.5% and 22.5%
error reduction respectively).Comment: Paper accepted by ECCV 201
Boosted Random ferns for object detection
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level, and still retain efficiency. First, we define binary features on the histogram of oriented gradients-domain (as opposed to intensity-), allowing for a better representation of intra-class variability. Second, both the positions where ferns are evaluated within the sliding window, and the location of the binary features for each fern are not chosen completely at random, but instead we use a boosting strategy to pick the most discriminative combination of them. This is further enhanced by our third contribution, that is to adapt the boosting strategy to enable sharing of binary features among different ferns, yielding high recognition rates at a low computational cost. And finally, we show that training can be performed online, for sequentially arriving images. Overall, the resulting classifier can be very efficiently trained, densely evaluated for all image locations in about 0.1 seconds, and provides detection rates similar to competing approaches that require expensive and significantly slower processing times. We demonstrate the effectiveness of our approach by thorough experimentation in publicly available datasets in which we compare against state-of-the-art, and for tasks of both 2D detection and 3D multi-view estimation.Peer ReviewedPostprint (author's final draft
Simultaneous Facial Landmark Detection, Pose and Deformation Estimation under Facial Occlusion
Facial landmark detection, head pose estimation, and facial deformation
analysis are typical facial behavior analysis tasks in computer vision. The
existing methods usually perform each task independently and sequentially,
ignoring their interactions. To tackle this problem, we propose a unified
framework for simultaneous facial landmark detection, head pose estimation, and
facial deformation analysis, and the proposed model is robust to facial
occlusion. Following a cascade procedure augmented with model-based head pose
estimation, we iteratively update the facial landmark locations, facial
occlusion, head pose and facial de- formation until convergence. The
experimental results on benchmark databases demonstrate the effectiveness of
the proposed method for simultaneous facial landmark detection, head pose and
facial deformation estimation, even if the images are under facial occlusion.Comment: International Conference on Computer Vision and Pattern Recognition,
201
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Modern deep learning systems successfully solve many perception tasks such as
object pose estimation when the input image is of high quality. However, in
challenging imaging conditions such as on low-resolution images or when the
image is corrupted by imaging artifacts, current systems degrade considerably
in accuracy. While a loss in performance is unavoidable, we would like our
models to quantify their uncertainty in order to achieve robustness against
images of varying quality. Probabilistic deep learning models combine the
expressive power of deep learning with uncertainty quantification. In this
paper, we propose a novel probabilistic deep learning model for the task of
angular regression. Our model uses von Mises distributions to predict a
distribution over object pose angle. Whereas a single von Mises distribution is
making strong assumptions about the shape of the distribution, we extend the
basic model to predict a mixture of von Mises distributions. We show how to
learn a mixture model using a finite and infinite number of mixture components.
Our model allows for likelihood-based training and efficient inference at test
time. We demonstrate on a number of challenging pose estimation datasets that
our model produces calibrated probability predictions and competitive or
superior point estimates compared to the current state-of-the-art
Recovering 6D Object Pose and Predicting Next-Best-View in the Crowd
Object detection and 6D pose estimation in the crowd (scenes with multiple
object instances, severe foreground occlusions and background distractors), has
become an important problem in many rapidly evolving technological areas such
as robotics and augmented reality. Single shot-based 6D pose estimators with
manually designed features are still unable to tackle the above challenges,
motivating the research towards unsupervised feature learning and
next-best-view estimation. In this work, we present a complete framework for
both single shot-based 6D object pose estimation and next-best-view prediction
based on Hough Forests, the state of the art object pose estimator that
performs classification and regression jointly. Rather than using manually
designed features we a) propose an unsupervised feature learnt from
depth-invariant patches using a Sparse Autoencoder and b) offer an extensive
evaluation of various state of the art features. Furthermore, taking advantage
of the clustering performed in the leaf nodes of Hough Forests, we learn to
estimate the reduction of uncertainty in other views, formulating the problem
of selecting the next-best-view. To further improve pose estimation, we propose
an improved joint registration and hypotheses verification module as a final
refinement step to reject false detections. We provide two additional
challenging datasets inspired from realistic scenarios to extensively evaluate
the state of the art and our framework. One is related to domestic environments
and the other depicts a bin-picking scenario mostly found in industrial
settings. We show that our framework significantly outperforms state of the art
both on public and on our datasets.Comment: CVPR 2016 accepted paper, project page:
http://www.iis.ee.ic.ac.uk/rkouskou/6D_NBV.htm
- …