7,378 research outputs found
Max-margin Metric Learning for Speaker Recognition
Probabilistic linear discriminant analysis (PLDA) is a popular normalization
approach for the i-vector model, and has delivered state-of-the-art performance
in speaker recognition. A potential problem of the PLDA model, however, is that
it essentially assumes Gaussian distributions over speaker vectors, which is
not always true in practice. Additionally, the objective function is not
directly related to the goal of the task, e.g., discriminating true speakers
and imposters. In this paper, we propose a max-margin metric learning approach
to solve the problems. It learns a linear transform with a criterion that the
margin between target and imposter trials are maximized. Experiments conducted
on the SRE08 core test show that compared to PLDA, the new approach can obtain
comparable or even better performance, though the scoring is simply a cosine
computation
Beyond Gauss: Image-Set Matching on the Riemannian Manifold of PDFs
State-of-the-art image-set matching techniques typically implicitly model
each image-set with a Gaussian distribution. Here, we propose to go beyond
these representations and model image-sets as probability distribution
functions (PDFs) using kernel density estimators. To compare and match
image-sets, we exploit Csiszar f-divergences, which bear strong connections to
the geodesic distance defined on the space of PDFs, i.e., the statistical
manifold. Furthermore, we introduce valid positive definite kernels on the
statistical manifolds, which let us make use of more powerful classification
schemes to match image-sets. Finally, we introduce a supervised dimensionality
reduction technique that learns a latent space where f-divergences reflect the
class labels of the data. Our experiments on diverse problems, such as
video-based face recognition and dynamic texture classification, evidence the
benefits of our approach over the state-of-the-art image-set matching methods
Highly Efficient Regression for Scalable Person Re-Identification
Existing person re-identification models are poor for scaling up to large
data required in real-world applications due to: (1) Complexity: They employ
complex models for optimal performance resulting in high computational cost for
training at a large scale; (2) Inadaptability: Once trained, they are
unsuitable for incremental update to incorporate any new data available. This
work proposes a truly scalable solution to re-id by addressing both problems.
Specifically, a Highly Efficient Regression (HER) model is formulated by
embedding the Fisher's criterion to a ridge regression model for very fast
re-id model learning with scalable memory/storage usage. Importantly, this new
HER model supports faster than real-time incremental model updates therefore
making real-time active learning feasible in re-id with human-in-the-loop.
Extensive experiments show that such a simple and fast model not only
outperforms notably the state-of-the-art re-id methods, but also is more
scalable to large data with additional benefits to active learning for reducing
human labelling effort in re-id deployment
- …