1,373 research outputs found
Multi-View Priors for Learning Detectors from Sparse Viewpoint Data
While the majority of today's object class models provide only 2D bounding
boxes, far richer output hypotheses are desirable including viewpoint,
fine-grained category, and 3D geometry estimate. However, models trained to
provide richer output require larger amounts of training data, preferably well
covering the relevant aspects such as viewpoint and fine-grained categories. In
this paper, we address this issue from the perspective of transfer learning,
and design an object class model that explicitly leverages correlations between
visual features. Specifically, our model represents prior distributions over
permissible multi-view detectors in a parametric way -- the priors are learned
once from training data of a source object class, and can later be used to
facilitate the learning of a detector for a target class. As we show in our
experiments, this transfer is not only beneficial for detectors based on
basic-level category representations, but also enables the robust learning of
detectors that represent classes at finer levels of granularity, where training
data is typically even scarcer and more unbalanced. As a result, we report
largely improved performance in simultaneous 2D object localization and
viewpoint estimation on a recent dataset of challenging street scenes.Comment: 13 pages, 7 figures, 4 tables, International Conference on Learning
Representations 201
One-shot learning of object categories
Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned from by maximum likelihood (ML) and maximum a posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully
Learning Human Pose Estimation Features with Convolutional Networks
This paper introduces a new architecture for human pose estimation using a
multi- layer convolutional network architecture and a modified learning
technique that learns low-level features and higher-level weak spatial models.
Unconstrained human pose estimation is one of the hardest problems in computer
vision, and our new architecture and learning schema shows significant
improvement over the current state-of-the-art results. The main contribution of
this paper is showing, for the first time, that a specific variation of deep
learning is able to outperform all existing traditional architectures on this
task. The paper also discusses several lessons learned while researching
alternatives, most notably, that it is possible to learn strong low-level
feature detectors on features that might even just cover a few pixels in the
image. Higher-level spatial models improve somewhat the overall result, but to
a much lesser extent then expected. Many researchers previously argued that the
kinematic structure and top-down information is crucial for this domain, but
with our purely bottom up, and weak spatial model, we could improve other more
complicated architectures that currently produce the best results. This mirrors
what many other researchers, like those in the speech recognition, object
recognition, and other domains have experienced
- …