2,094 research outputs found
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (eg. onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to model
the ordinal aspect in the videos, approximately. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field
Localizing Objects While Learning Their Appearance
Abstract. Learning a new object class from cluttered training images is very challenging when the location of object instances is unknown. Previous works generally require objects covering a large portion of the images. We present a novel approach that can cope with extensive clutter as well as large scale and appearance variations between object instances. To make this possible we propose a conditional random field that starts from generic knowledge and then progressively adapts to the new class. Our approach simultaneously localizes object instances while learning an appearance model specific for the class. We demonstrate this on the challenging Pascal VOC 2007 dataset. Furthermore, our method enables to train any state-of-the-art object detector in a weakly supervised fashion, although it would normally require object location annotations.
- …