19,902 research outputs found
Multi-Task Learning Using Neighborhood Kernels
This paper introduces a new and effective algorithm for learning kernels in a
Multi-Task Learning (MTL) setting. Although we consider an MTL scenario here,
our approach can easily be applied to standard single-task learning as well.
As shown by our empirical results, our algorithm consistently outperforms
traditional kernel learning algorithms such as the uniform combination
solution, convex combinations of base kernels, and several kernel
alignment-based models, which have been proven to give promising results in
the past. We
present a Rademacher complexity bound based on which a new Multi-Task Multiple
Kernel Learning (MT-MKL) model is derived. In particular, we propose a Support
Vector Machine-regularized model in which, for each task, an optimal kernel is
learned based on a neighborhood-defining kernel that is not restricted to be
positive semi-definite. Comparative experimental results are showcased that
underline the merits of our neighborhood-defining framework in both
classification and regression problems.
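For reference, the uniform and convex combination baselines the abstract compares against can be sketched in a few lines (a minimal illustration, not the paper's neighborhood-kernel algorithm; the RBF widths are arbitrary choices):

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared distances -> RBF Gram matrix.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def combine_kernels(kernels, weights=None):
    # Convex combination of base Gram matrices; uniform if no weights given.
    k = len(kernels)
    if weights is None:
        weights = np.full(k, 1.0 / k)
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0) and np.isclose(weights.sum(), 1.0)
    return sum(w * K for w, K in zip(weights, kernels))

X = np.random.RandomState(0).randn(5, 3)
Ks = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
K_uniform = combine_kernels(Ks)
```

A convex combination of PSD Gram matrices with nonnegative weights is itself PSD, which is why this family is the standard MKL baseline.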
A Binary Classification Framework for Two-Stage Multiple Kernel Learning
With the advent of kernel methods, automating the task of specifying a
suitable kernel has become increasingly important. In this context, the
Multiple Kernel Learning (MKL) problem of finding a combination of
pre-specified base kernels that is suitable for the task at hand has received
significant attention from researchers. In this paper we show that Multiple
Kernel Learning can be framed as a standard binary classification problem with
additional constraints that ensure the positive definiteness of the learned
kernel. Framing MKL in this way has the distinct advantage that it makes it
easy to leverage the extensive research in binary classification to develop
better performing and more scalable MKL algorithms that are conceptually
simpler, and, arguably, more accessible to practitioners. Experiments on nine
data sets from different domains show that, despite its simplicity, the
proposed technique compares favorably with current leading MKL approaches.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012).
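The pairwise reduction at the heart of this framing can be sketched as follows (a hedged illustration under simplified assumptions, not the paper's exact formulation or solver): every example pair becomes one binary training point whose features are the base-kernel values and whose label is the product of the two class labels; projecting the weights onto the nonnegative orthant keeps the learned combination PSD.

```python
import numpy as np

def pairwise_mkl_dataset(Ks, y):
    # Each pair (i, j) becomes a feature vector of base-kernel values
    # with label y_i * y_j (+1: same class, -1: different class).
    n = len(y)
    feats, labels = [], []
    for i in range(n):
        for j in range(i + 1, n):
            feats.append([K[i, j] for K in Ks])
            labels.append(y[i] * y[j])
    return np.array(feats), np.array(labels)

def learn_weights(Z, t, lr=0.1, steps=200):
    # Logistic regression with a nonnegativity projection; keeping the
    # weights >= 0 ensures the combined kernel stays positive semi-definite.
    w = np.ones(Z.shape[1]) / Z.shape[1]
    for _ in range(steps):
        margins = np.clip(t * (Z @ w), -30, 30)  # clip for numerical stability
        grad = -(t / (1 + np.exp(margins))) @ Z / len(t)
        w = np.maximum(w - lr * grad, 0.0)  # project onto the nonneg orthant
    return w
```

Any off-the-shelf binary classifier that supports such a sign constraint could be substituted for the toy gradient loop, which is the scalability argument the abstract makes.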
Continuous Molecular Fields Approach Applied to Structure-Activity Modeling
The Method of Continuous Molecular Fields is a universal approach to predict
various properties of chemical compounds, in which molecules are represented by
means of continuous fields (such as electrostatic, steric, electron density
functions, etc.). The essence of the proposed approach consists in performing
statistical analysis of functional molecular data by means of joint application
of kernel machine learning methods and special kernels which compare molecules
by computing overlap integrals of their molecular fields. This approach is an
alternative to traditional methods of building 3D structure-activity and
structure-property models based on the use of fixed sets of molecular
descriptors. The methodology of the approach is described in this chapter,
followed by its application to building regression 3D-QSAR models and
conducting virtual screening based on one-class classification models. The main
directions of the further development of this approach are outlined at the end
of the chapter.
Comment: To be published in Applications of Computational Techniques in
Pharmacy and Medicine.
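For a single Gaussian per atom, the overlap integral this approach relies on has a closed form; a minimal sketch (assuming one isotropic 3D Gaussian of width alpha per atom, a deliberate simplification of the full continuous-field machinery):

```python
import numpy as np

def field_overlap_kernel(mol_a, mol_b, alpha=0.5):
    # Overlap integral of two molecular fields, each modelled as a sum of
    # 3D Gaussians centred on the atom coordinates; each pairwise integral
    # has the closed form (pi / (2*alpha))**1.5 * exp(-alpha*||r1-r2||**2 / 2).
    coef = (np.pi / (2 * alpha)) ** 1.5
    d2 = np.sum((mol_a[:, None, :] - mol_b[None, :, :]) ** 2, axis=-1)
    return coef * np.exp(-alpha * d2 / 2).sum()
```

Because the integral is symmetric in its two arguments and positive for overlapping fields, it can serve directly as a kernel between molecules in downstream kernel machines.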
A Global Alignment Kernel based Approach for Group-level Happiness Intensity Estimation
With the progress in automatic human behavior understanding, analysing the
perceived affect of multiple people has received increasing interest in the
affective computing community. Unlike conventional facial expression analysis, this paper
primarily focuses on analysing the behaviour of multiple people in an image.
The proposed method is based on support vector regression with the combined
global alignment kernels (GAKs) to estimate the happiness intensity of a group
of people. We first exploit Riesz-based volume local binary pattern (RVLBP) and
deep convolutional neural network (CNN) based features for characterizing
facial images. Furthermore, we propose to use the GAK for RVLBP and deep CNN
features, respectively, to explicitly measure the similarity of two
group-level images. Specifically, we exploit a global weight sort scheme to
sort the face images from a group-level image according to their spatial
weights, yielding an efficient data structure for GAK. Lastly, we propose
multiple kernel learning based on three combination strategies for combining
the two respective GAKs based on RVLBP and deep CNN features, thereby
enhancing the discriminative ability of each GAK. Intensive experiments are performed on the
challenging group-level happiness intensity database, namely HAPPEI. Our
experimental results demonstrate that the proposed approach achieves promising
performance for group happiness intensity analysis when compared with recent
state-of-the-art methods.
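The global alignment kernel itself admits a compact dynamic-programming sketch (this is the generic GAK recursion with a plain Gaussian local kernel; the paper's weighted variant and positive-definiteness corrections are omitted):

```python
import numpy as np

def global_alignment_kernel(x, y, sigma=1.0):
    # Soft sum over all monotone alignments of the two sequences: each
    # cell accumulates the local similarity times the mass of the three
    # predecessor cells (the classic GAK dynamic program).
    n, m = len(x), len(y)
    M = np.zeros((n + 1, m + 1))
    M[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d2 = np.sum((x[i - 1] - y[j - 1]) ** 2)
            k = np.exp(-d2 / (2 * sigma**2))
            M[i, j] = k * (M[i - 1, j] + M[i, j - 1] + M[i - 1, j - 1])
    return M[n, m]
```

Unlike dynamic time warping, which keeps only the best alignment, GAK sums over all of them, which is what makes it usable as a kernel inside support vector regression.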
Transfer Adaptation Learning: A Decade Survey
The world we see is ever-changing and it always changes with people, things,
and the environment. Domain is referred to as the state of the world at a
certain moment. A research problem is characterized as transfer adaptation
learning (TAL) when it needs knowledge correspondence between different
moments/domains. Conventional machine learning aims to find a model with the
minimum expected risk on test data by minimizing the regularized empirical risk
on the training data, which, however, assumes that the training and test data
share a similar joint probability distribution. TAL aims to build models that
can perform tasks of the target domain by learning knowledge from a
semantically related but distributionally different source domain. It is an
energetic research field of increasing influence and importance, exhibiting a
rapidly growing publication trend. This paper surveys the advances of TAL
methodologies over the past decade, and discusses the technical challenges
and essential problems of TAL with deep insights and new perspectives.
Broader solutions of transfer adaptation learning created by researchers are
identified, i.e.,
instance re-weighting adaptation, feature adaptation, classifier adaptation,
deep network adaptation and adversarial adaptation, which are beyond the early
semi-supervised and unsupervised split. The survey helps researchers rapidly
but comprehensively understand and identify the research foundation, research
status, theoretical limitations, future challenges and under-studied issues
(universality, interpretability, and credibility) to be broken in the field
toward universal representation and safe applications in open-world scenarios.
Comment: 26 pages, 4 figures.
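Of the solution families the survey identifies, feature adaptation is perhaps the easiest to illustrate; a minimal sketch in the spirit of correlation alignment (CORAL is one representative method, chosen here for brevity; the regulariser eps is an assumption):

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-3):
    # CORrelation ALignment: whiten the source features, then re-colour
    # them with the target covariance, and shift to the target mean
    # (a simple instance of the feature-adaptation family).
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    def sqrtm(C, inv=False):
        # Matrix square root via eigendecomposition (C is SPD here).
        vals, vecs = np.linalg.eigh(C)
        vals = 1.0 / np.sqrt(vals) if inv else np.sqrt(vals)
        return (vecs * vals) @ vecs.T
    return (Xs - Xs.mean(0)) @ sqrtm(Cs, inv=True) @ sqrtm(Ct) + Xt.mean(0)
```

After the transform, the second-order statistics of the source features approximately match those of the target, so a classifier trained on the aligned source data transfers more gracefully.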
A Distributionally Robust Optimization Method for Adversarial Multiple Kernel Learning
We propose a novel data-driven method to learn a mixture of multiple kernels
with random features that is certifiably robust against adversarial inputs.
Specifically, we consider a distributionally robust optimization of the
kernel-target alignment with respect to the distribution of training samples
over a distributional ball defined by the Kullback-Leibler (KL) divergence. The
distributionally robust optimization problem can be recast as a min-max
optimization whose objective function includes a log-sum term. We develop a
mini-batch biased stochastic primal-dual proximal method to solve the min-max
optimization. To debias the mini-batch algorithm, we use the Gumbel perturbation
technique to estimate the log-sum term. We establish theoretical guarantees for
the performance of the proposed multiple kernel learning method. In particular,
we prove the consistency, asymptotic normality, stochastic equicontinuity, and
the minimax rate of the empirical estimators. In addition, based on the notion
of Rademacher and Gaussian complexities, we establish distributionally robust
generalization bounds that are tighter than previous known bounds. More
specifically, we leverage matrix concentration inequalities to establish
distributionally robust generalization bounds. We validate our kernel learning
approach for classification with kernel SVMs on a synthetic dataset generated
by sampling multivariate Gaussian distributions with different variance
structures. We also apply our kernel learning approach to the MNIST dataset
and evaluate its robustness to perturbation of input images under different
adversarial models. More specifically, we examine the robustness of the
proposed kernel model selection technique against FGSM, PGM, C\&W, and DDN
adversarial perturbations, and compare its performance with alternative
state-of-the-art multiple kernel learning paradigms.
Comment: Major revision. The title and abstract have been updated.
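The log-sum term mentioned above comes from the dual of the KL-constrained inner supremum; a small sketch evaluating that dual for a fixed dual variable lam (the actual method optimises the alignment objective jointly, which this does not attempt):

```python
import numpy as np

def kl_robust_objective(losses, lam, rho):
    # Dual of the supremum over a KL ball of radius rho around the
    # empirical distribution: lam * log E[exp(loss / lam)] + lam * rho.
    # The log-sum-exp is computed stably by shifting by the max loss.
    losses = np.asarray(losses, dtype=float)
    m = losses.max()
    lse = m / lam + np.log(np.mean(np.exp((losses - m) / lam)))
    return lam * lse + lam * rho
```

As lam grows the objective collapses to the ordinary empirical mean, and a larger ball radius rho always makes the robust objective more pessimistic, which matches the min-max structure described in the abstract.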
Principled Non-Linear Feature Selection
Recent non-linear feature selection approaches employing greedy optimisation
of Centred Kernel Target Alignment (KTA) exhibit strong results in terms of
generalisation accuracy and sparsity. However, they are computationally
prohibitive for large datasets. We propose randSel, a randomised feature
selection algorithm, with attractive scaling properties. Our theoretical
analysis of randSel provides strong probabilistic guarantees for correct
identification of relevant features. RandSel's characteristics make it an ideal
candidate for identifying informative learned representations. We have
conducted experiments to establish the performance of this approach, and present
encouraging results, including a 3rd position result in the recent ICML black
box learning challenge as well as competitive results for signal peptide
prediction, an important problem in bioinformatics.
Comment: arXiv admin note: substantial text overlap with arXiv:1311.563
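A loose sketch of such a randomised alignment-based scoring scheme (this follows the general idea of scoring random feature subsets by centred KTA; the subset size, round count, and scoring rule here are assumptions, not randSel's exact procedure):

```python
import numpy as np

def centered_alignment(K1, K2):
    # Cosine similarity between centred Gram matrices.
    n = K1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    K1c, K2c = H @ K1 @ H, H @ K2 @ H
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rand_sel_scores(X, y, n_rounds=200, seed=0):
    # Randomised scoring: in each round, keep a random half of the
    # features and credit every kept feature with the alignment between
    # the linear kernel on that subset and the target kernel y y^T.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Ky = np.outer(y, y).astype(float)
    scores, counts = np.zeros(d), np.zeros(d)
    for _ in range(n_rounds):
        mask = rng.random(d) < 0.5
        if not mask.any():
            continue
        Xs = X[:, mask]
        a = centered_alignment(Xs @ Xs.T, Ky)
        scores[mask] += a
        counts[mask] += 1
    return scores / np.maximum(counts, 1)
```

Features that reliably raise the alignment whenever they are included accumulate higher average scores, which is the probabilistic mechanism behind the guarantees the abstract refers to.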
CNN-based Action Recognition and Supervised Domain Adaptation on 3D Body Skeletons via Kernel Feature Maps
Deep learning is ubiquitous across many areas of computer vision. It
often requires large scale datasets for training before being fine-tuned on
small-to-medium scale problems. Activity, or, in other words, action
recognition, is one of many application areas of deep learning. While there
exist many Convolutional Neural Network architectures that work with the RGB
and optical flow frames, training on the time sequences of 3D body skeleton
joints is often performed via recurrent networks such as LSTM.
In this paper, we propose a new representation which encodes sequences of 3D
body skeleton joints in texture-like representations derived from
mathematically rigorous kernel methods. Such a representation becomes the first
layer in a standard CNN network e.g., ResNet-50, which is then used in the
supervised domain adaptation pipeline to transfer information from the source
to target dataset. This lets us leverage the available Kinect-based data beyond
training on a single dataset and outperform simple fine-tuning on any two
datasets combined in a naive manner. More specifically, in this paper we
utilize the overlapping classes between datasets. We associate datapoints of
the same class via so-called commonality, known from the supervised domain
adaptation. We demonstrate state-of-the-art results on three publicly available
benchmarks.
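One way to picture a kernel-derived, texture-like encoding of a joint sequence (a toy sketch using RBF responses to a fixed pivot set; the paper's representation is built from mathematically rigorous kernel feature maps, of which this is only loosely indicative):

```python
import numpy as np

def kernel_texture_map(seq, pivots, sigma=1.0):
    # Encode a variable-length sequence of 3D joint coordinates as a
    # fixed-height map: entry (i, t) is the RBF similarity between the
    # joint position at time t and the i-th pivot, so the sequence
    # becomes a texture-like image a standard CNN can consume.
    d2 = np.sum((seq[None, :, :] - pivots[:, None, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma**2))
```

The resulting 2D array can be stacked per joint and fed to the first layer of a network such as ResNet-50, which is the role the paper assigns to its kernel representation.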
Algorithms for Learning Kernels Based on Centered Alignment
This paper presents new and effective algorithms for learning kernels. In
particular, as shown by our empirical results, these algorithms consistently
outperform the so-called uniform combination solution that has proven to be
difficult to improve upon in the past, as well as other algorithms for learning
kernels based on convex combinations of base kernels in both classification and
regression. Our algorithms are based on the notion of centered alignment which
is used as a similarity measure between kernels or kernel matrices. We present
a number of novel algorithmic, theoretical, and empirical results for learning
kernels based on our notion of centered alignment. In particular, we describe
efficient algorithms for learning a maximum alignment kernel by showing that
the problem can be reduced to a simple QP and discuss a one-stage algorithm for
learning both a kernel and a hypothesis based on that kernel using an
alignment-based regularization. Our theoretical results include a novel
concentration bound for centered alignment between kernel matrices, the proof
of the existence of effective predictors for kernels with high alignment, both
for classification and for regression, and the proof of stability-based
generalization bounds for a broad family of algorithms for learning kernels
based on centered alignment. We also report the results of experiments with our
centered alignment-based algorithms in both classification and regression.
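The centered alignment similarity at the core of these algorithms can be written in a few lines (a direct sketch of the standard definition):

```python
import numpy as np

def centered_alignment(K1, K2):
    # Centre both Gram matrices with H = I - (1/n) * ones, then take
    # their normalised Frobenius inner product: the cosine similarity
    # between the two centred kernel matrices.
    n = K1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    K1c, K2c = H @ K1 @ H, H @ K2 @ H
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))
```

Taking K2 to be the target kernel y y^T recovers the alignment with the labels, which is the quantity the paper's QP maximises over convex combinations of base kernels.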
An introduction to domain adaptation and transfer learning
In machine learning, if the training data is an unbiased sample of an
underlying distribution, then the learned classification function will make
accurate predictions for new samples. However, if the training data is not an
unbiased sample, then there will be differences between how the training data
is distributed and how the test data is distributed. Standard classifiers
cannot cope with changes in data distributions between training and test
phases, and will not perform well. Domain adaptation and transfer learning are
sub-fields within machine learning that are concerned with accounting for these
types of changes. Here, we present an introduction to these fields, guided by
the question: when and how can a classifier generalize from a source to a
target domain? We will start with a brief introduction into risk minimization,
and how transfer learning and domain adaptation expand upon this framework.
Following that, we discuss three special cases of data set shift, namely prior,
covariate and concept shift. For more complex domain shifts, there are a wide
variety of approaches. These are categorized into: importance-weighting,
subspace mapping, domain-invariant spaces, feature augmentation, minimax
estimators and robust algorithms. A number of points will arise, which we will
discuss in the last section. We conclude with the remark that many open
questions will have to be addressed before transfer learners and
domain-adaptive classifiers become practical.
Comment: Technical Report. 41 pages, 5 figures.