Bounded-Distortion Metric Learning
Metric learning aims to embed one metric space into another to benefit tasks
like classification and clustering. Although a greatly distorted metric space
has a high degree of freedom to fit training data, it is prone to overfitting
and numerical inaccuracy. This paper presents {\it bounded-distortion metric
learning} (BDML), a new metric learning framework which amounts to finding an
optimal Mahalanobis metric space with a bounded-distortion constraint. An
efficient solver based on the multiplicative weights update method is proposed.
Moreover, we generalize BDML to pseudo-metric learning and devise the
semidefinite relaxation and a randomized algorithm to approximately solve it.
We further provide theoretical analysis to show that distortion is a key
ingredient for stability and generalization ability of our BDML algorithm.
Extensive experiments on several benchmark datasets yield promising results.
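As a rough illustration of the ingredients above, the sketch below computes a squared Mahalanobis distance and applies one matrix multiplicative-weights-style update with a unit-trace normalisation. This is a toy approximation, not the paper's BDML solver, and all function names are hypothetical:

```python
import numpy as np

def mahalanobis_dist(x, y, M):
    """Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y)."""
    d = x - y
    return float(d @ M @ d)

def mwu_step(M, loss_grad, eta=0.1):
    """One matrix multiplicative-weights-style update:
    M <- exp(log M - eta * G), renormalised to unit trace,
    which keeps M positive definite by construction."""
    # Work in the eigenbasis to apply the matrix log/exp.
    w, V = np.linalg.eigh(M)
    log_M = V @ np.diag(np.log(np.clip(w, 1e-12, None))) @ V.T
    w2, V2 = np.linalg.eigh(log_M - eta * loss_grad)
    M_new = V2 @ np.diag(np.exp(w2)) @ V2.T
    return M_new / np.trace(M_new)

M = np.eye(3) / 3.0                       # start from the scaled identity
x, y = np.array([1.0, 0.0, 0.0]), np.zeros(3)
G = np.outer(x - y, x - y)                # gradient of d_M(x, y) w.r.t. M
M = mwu_step(M, G)
print(mahalanobis_dist(x, y, M))          # shrinks along the penalised direction
```

The unit-trace renormalisation is what bounds the metric: the eigenvalues of M stay non-negative and sum to one after every update.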
Positive Semidefinite Metric Learning Using Boosting-like Algorithms
The success of many machine learning and pattern recognition methods relies
heavily upon the identification of an appropriate distance metric on the input
data. It is often beneficial to learn such a metric from the input training
data, instead of using a default one such as the Euclidean distance. In this
work, we propose a boosting-based technique, termed BoostMetric, for learning a
quadratic Mahalanobis distance metric. Learning a valid Mahalanobis distance
metric requires enforcing the constraint that the matrix parameter to the
metric remains positive definite. Semidefinite programming is often used to
enforce this constraint, but it does not scale well and is not easy to implement.
BoostMetric is instead based on the observation that any positive semidefinite
matrix can be decomposed into a linear combination of trace-one rank-one
matrices. BoostMetric thus uses rank-one positive semidefinite matrices as weak
learners within an efficient and scalable boosting-based learning process. The
resulting methods are easy to implement, efficient, and can accommodate various
types of constraints. We extend traditional boosting algorithms in that the
weak learner is a positive semidefinite matrix with trace and rank equal to one,
rather than a classifier or regressor. Experiments on various datasets
demonstrate that the proposed algorithms compare favorably to
state-of-the-art methods in terms of classification accuracy and running time.
Comment: 30 pages, appearing in the Journal of Machine Learning Research
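The decomposition BoostMetric builds on, that any positive semidefinite matrix is a non-negative combination of trace-one rank-one matrices, follows directly from the eigendecomposition. A minimal NumPy sketch (function names are ours, not the paper's):

```python
import numpy as np

def rank_one_decomposition(M):
    """Decompose a PSD matrix M into a non-negative combination of
    trace-one rank-one matrices: M = sum_i w_i * u_i u_i^T, ||u_i|| = 1."""
    eigvals, eigvecs = np.linalg.eigh(M)
    terms = []
    for lam, u in zip(eigvals, eigvecs.T):   # columns of eigvecs are unit eigenvectors
        if lam > 1e-10:                       # keep only the positive spectrum
            terms.append((lam, np.outer(u, u)))  # trace(u u^T) = ||u||^2 = 1
    return terms

# Any PSD matrix admits such a decomposition; check on a random example.
A = np.random.randn(4, 4)
M = A @ A.T                                   # PSD by construction
terms = rank_one_decomposition(M)
M_rebuilt = sum(w * Z for w, Z in terms)
print(np.allclose(M, M_rebuilt))              # → True
```

In BoostMetric the terms of this sum are not computed in one shot by eigendecomposition; instead, each rank-one matrix is found greedily as a weak learner inside the boosting loop.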
Investigation of new learning methods for visual recognition
Visual recognition is one of the most difficult and prevailing problems in computer vision and pattern recognition due to the challenges in understanding the semantics and contents of digital images. Two major components of a visual recognition system are discriminatory feature representation and efficient and accurate pattern classification. This dissertation therefore focuses on developing new learning methods for visual recognition.
Based on the conventional sparse representation, which shows its robustness for visual recognition problems, a series of new methods is proposed. Specifically, first, a new locally linear K nearest neighbor method, or LLK method, is presented. The LLK method derives a new representation, which is an approximation to the ideal representation, by optimizing an objective function based on a host of criteria for sparsity, locality, and reconstruction. The novel representation is further processed by two new classifiers, namely, an LLK based classifier (LLKc) and a locally linear nearest mean based classifier (LLNc), for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Second, a new generative and discriminative sparse representation (GDSR) method is proposed by taking advantage of both a coarse modeling of the generative information and a modeling of the discriminative information. The proposed GDSR method integrates two new criteria, namely, a discriminative criterion and a generative criterion, into the conventional sparse representation criterion. A new generative and discriminative sparse representation based classification (GDSRc) method is then presented based on the derived new representation. Finally, a new Score space based multiple Metric Learning (SML) method is presented for a challenging visual recognition application, namely, recognizing kinship relations or kinship verification. The proposed SML method, which goes beyond the conventional Mahalanobis distance metric learning, not only learns the distance metric but also models the generative process of features by taking advantage of the score space. The SML method is optimized by solving a constrained, non-negative, and weighted variant of the sparse representation problem.
To assess the feasibility of the proposed new learning methods, several visual recognition tasks are considered, including face recognition, scene recognition, object recognition, computational fine art analysis, action recognition, fine-grained recognition, and kinship verification. The experimental results show that the proposed new learning methods achieve better performance than other popular methods.
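The locality-plus-reconstruction idea behind the LLK representation can be sketched as a least-squares fit over the k nearest training samples. This toy version omits the sparsity criterion and the dissertation's actual objective and classifiers; all names are hypothetical:

```python
import numpy as np

def local_linear_weights(x, X_train, k=5):
    """Approximate x as a linear combination of its k nearest training
    samples: a rough sketch of the locality and reconstruction criteria
    only (the full LLK objective also includes a sparsity term)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]                # indices of the k nearest neighbours
    N = X_train[idx].T                         # columns are the neighbours
    w, *_ = np.linalg.lstsq(N, x, rcond=None)  # least-squares reconstruction weights
    return idx, w

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
x = X[:5].mean(axis=0)                         # a query point near some training samples
idx, w = local_linear_weights(x, X, k=5)
residual = np.linalg.norm(X[idx].T @ w - x)    # residual of the local fit
print(residual)
```

A classifier in this spirit would then compare such reconstruction residuals per class and assign the query to the class that reconstructs it best.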
Advances in generative modelling: from component analysis to generative adversarial networks
This Thesis revolves around datasets and algorithms, with a focus on generative modelling. In particular, we first turn our attention to a novel, multi-attribute, 2D facial dataset. We then present deterministic as well as probabilistic Component Analysis (CA) techniques which can be applied to multi-attribute 2D as well as 3D data. We finally present deep learning generative approaches specially designed to manipulate 3D facial data.
Most 2D facial datasets available in the literature are: a) automatically or semi-automatically collected, and thus contain noisy labels that hinder benchmarking and comparison between algorithms; and b) not annotated for multiple attributes. In the first part of the Thesis, we present the first manually collected and annotated database which contains labels for multiple attributes. As we demonstrate in a series of experiments, it can be used in a number of applications ranging from image translation to age-invariant face recognition.
Moving on, we turn our attention to CA methodologies. CA approaches, although being able to only capture linear relationships between data, can still be proven to be efficient in data such as UV maps or 3D data registered in a common template, since they are well aligned. The introduction of more complex datasets in the literature, which contain labels for multiple attributes, naturally brought the need for novel algorithms that can simultaneously handle multiple attributes. In this Thesis, we cover novel CA approaches which are specifically designed to be utilised in datasets annotated with respect to multiple attributes and can be used in a variety of tasks, such as 2D image denoising and translation, as well as 3D data generation and identification.
Nevertheless, while CA methods are indeed efficient when handling registered 3D facial data, linear 3D generative models lack detail when it comes to reconstructing or generating finer facial characteristics. To alleviate this, in the final part of this Thesis we propose a novel generative framework harnessing the power of Generative Adversarial Networks.
A multimodal deep learning framework using local feature representations for face recognition
The most recent face recognition systems are mainly dependent on feature representations obtained using either local handcrafted descriptors, such as local binary patterns (LBP), or a deep learning approach, such as the deep belief network (DBN). However, the former usually suffers from the wide variations in face images, while the latter usually discards the local facial features, which are proven to be important for face recognition. In this paper, a novel framework based on merging the advantages of local handcrafted feature descriptors with the DBN is proposed to address the face recognition problem in unconstrained conditions. Firstly, a novel multimodal local feature extraction approach based on merging the advantages of the Curvelet transform with the Fractal dimension is proposed, termed the Curvelet–Fractal approach. The main motivation of this approach is that the Curvelet transform, a new anisotropic and multidirectional transform, can efficiently represent the main structure of the face (e.g., edges and curves), while the Fractal dimension is one of the most powerful texture descriptors for face images. Secondly, a novel framework is proposed, termed the multimodal deep face recognition (MDFR) framework, to add feature representations by training a DBN on top of the local feature representations instead of the pixel intensity representations. We demonstrate that representations acquired by the proposed MDFR framework are complementary to those acquired by the Curvelet–Fractal approach. Finally, the performance of the proposed approaches has been evaluated by conducting a number of extensive experiments on four large-scale face datasets: the SDUMLA-HMT, FERET, CAS-PEAL-R1, and LFW databases. The results obtained from the proposed approaches outperform other state-of-the-art approaches (e.g., LBP, DBN, WPCA), achieving new state-of-the-art results on all the employed datasets.
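The Fractal dimension used above as a texture descriptor is commonly estimated by box counting. The sketch below is our own simplification, not the paper's exact pipeline: count occupied boxes N(s) at several box sizes s and fit the slope of log N(s) against log(1/s):

```python
import numpy as np

def box_counting_dimension(img, threshold=0.5):
    """Estimate the fractal (box-counting) dimension of an image:
    threshold to binary, count occupied s-by-s boxes at several
    scales s, and fit log N(s) ~ D * log(1/s)."""
    binary = img > threshold
    n = min(binary.shape)
    sizes = [s for s in (2, 4, 8, 16) if s <= n]
    counts = []
    for s in sizes:
        h, w = binary.shape[0] // s, binary.shape[1] // s
        # Tile the image into s-by-s boxes and mark each box occupied
        # if it contains any foreground pixel.
        boxes = binary[:h * s, :w * s].reshape(h, s, w, s)
        counts.append(np.count_nonzero(boxes.any(axis=(1, 3))))
    # The slope of log-counts against log(1/size) is the dimension estimate.
    return np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)[0]

# Sanity check: a completely filled square has dimension 2.
img = np.ones((64, 64))
print(round(box_counting_dimension(img), 2))  # → 2.0
```

For face texture, the estimate is typically computed over local patches of the (grey-level) image, so that each patch contributes one roughness feature.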