"'Who are you?' - Learning person specific classifiers from video"
We investigate the problem of automatically labelling
faces of characters in TV or movie material with their
names, using only weak supervision from automatically aligned
subtitle and script text. Our previous work (Everingham
et al. [8]) demonstrated promising results on the
task, but the coverage of the method (proportion of video
labelled) and generalization was limited by a restriction to
frontal faces and nearest neighbour classification.
In this paper we build on that method, extending the coverage
greatly by the detection and recognition of characters
in profile views. In addition, we make the following contributions:
(i) seamless tracking, integration and recognition
of profile and frontal detections, and (ii) a character specific
multiple kernel classifier which is able to learn the features
best able to discriminate between the characters.
We report results on seven episodes of the TV series
“Buffy the Vampire Slayer”, demonstrating significantly increased
coverage and performance with respect to previous
methods on this material.
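As a rough illustration of the multiple kernel idea in contribution (ii), the sketch below combines one kernel per feature type into a single weighted kernel. It is not the authors' classifier: the RBF kernel, the `gamma` value, the hand-set `weights`, and the function names are all assumptions made for illustration. In an actual multiple kernel learner the weights would be learned for each character rather than fixed.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def combined_kernel(feats_a, feats_b, weights, gamma=1.0):
    """Weighted sum of per-feature kernels (hypothetical helper).

    feats_a, feats_b: lists of arrays, one array per feature type.
    weights: one non-negative weight per feature type (assumed fixed here).
    """
    total = np.zeros((feats_a[0].shape[0], feats_b[0].shape[0]))
    for fa, fb, w in zip(feats_a, feats_b, weights):
        total += w * rbf_kernel(fa, fb, gamma)
    return total


# Toy data: 4 face tracks described by two made-up feature types.
rng = np.random.default_rng(0)
features = [rng.normal(size=(4, 3)), rng.normal(size=(4, 5))]
weights = [0.7, 0.3]
K = combined_kernel(features, features, weights)
```

A matrix such as `K` could then be passed to any kernel classifier that accepts a precomputed kernel; which features receive large weights is what lets the classifier focus on the most discriminative cues.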
Combining multiple resolutions into hierarchical representations for kernel-based image classification
The geographic object-based image analysis (GEOBIA) framework has gained
increasing interest recently. Following this popular paradigm, we propose a
novel multiscale classification approach operating on a hierarchical image
representation built from two images at different resolutions. They capture the
same scene with different sensors and are naturally fused together through the
hierarchical representation, where coarser levels are built from a Low Spatial
Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels
are generated from a High Spatial Resolution (HSR) or Very High Spatial
Resolution (VHSR) image. Such a representation allows one to benefit from the
context information thanks to the coarser levels, and subregions' spatial
arrangement information thanks to the finer levels. Two dedicated structured
kernels are then used to perform machine learning directly on the constructed
hierarchical representation. This strategy overcomes the limits of conventional
GEOBIA classification procedures that can handle only one or very few
pre-selected scales. Experiments run on an urban classification task show that
the proposed approach can substantially improve the classification accuracy w.r.t.
conventional approaches working on a single scale.
Comment: International Conference on Geographic Object-Based Image Analysis
(GEOBIA 2016), University of Twente in Enschede, The Netherlands
Compositional Model based Fisher Vector Coding for Image Classification
Deriving from the gradient vector of a generative model of local features,
Fisher vector coding (FVC) has been identified as an effective coding method
for image classification. Most, if not all, FVC implementations employ the
Gaussian mixture model (GMM) to depict the generation process of local
features. However, the representative power of the GMM could be limited because
it essentially assumes that local features can be characterized by a fixed
number of feature prototypes and the number of prototypes is usually small in
FVC. To handle this limitation, in this paper we break the convention which
assumes that a local feature is drawn from one of a few Gaussian distributions.
Instead, we adopt a compositional mechanism which assumes that a local feature
is drawn from a Gaussian distribution whose mean vector is composed as the
linear combination of multiple key components and the combination weight is a
latent random variable. In this way, we can greatly enhance the representative
power of the generative model of FVC. To implement our idea, we designed two
particular generative models with such a compositional mechanism.
Comment: Fixed typos. 16 pages. Appearing in IEEE T. Pattern Analysis and
Machine Intelligence (TPAMI)
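The compositional mechanism described in this abstract can be sketched as a small sampling routine: a local feature is drawn from a Gaussian whose mean is a linear combination of key components, with a latent combination weight. The abstract does not state the prior on the weight or the noise model, so the standard normal prior on `u`, the isotropic noise level `sigma`, and all names below are assumptions for illustration, not the paper's two models.

```python
import numpy as np


def sample_compositional(bases, n_samples, sigma, rng):
    """Draw features x = B u + noise (hypothetical sketch).

    bases: array of shape (dim, n_components); columns are key components.
    sigma: standard deviation of the isotropic Gaussian noise (assumed).
    Returns the sampled features and the latent weights that produced them.
    """
    dim, n_components = bases.shape
    # Latent combination weights; a standard normal prior is assumed here.
    u = rng.standard_normal((n_samples, n_components))
    # The mean of each feature is a linear combination of the key components.
    means = u @ bases.T
    noise = sigma * rng.standard_normal((n_samples, dim))
    return means + noise, u


rng = np.random.default_rng(0)
B = rng.normal(size=(8, 3))  # 3 key components in an 8-dimensional space
x, u = sample_compositional(B, n_samples=5, sigma=0.1, rng=rng)
```

Because `u` varies continuously, the set of possible means is not limited to a small number of prototypes, which is the contrast with a GMM that the abstract draws.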