    Multi-Task Learning Using Neighborhood Kernels

    This paper introduces a new and effective algorithm for learning kernels in a Multi-Task Learning (MTL) setting. Although we consider an MTL scenario here, our approach can easily be applied to standard single-task learning as well. As shown by our empirical results, our algorithm consistently outperforms traditional kernel learning algorithms such as the uniform combination solution, convex combinations of base kernels, and some kernel alignment-based models, which have been proven to give promising results in the past. We present a Rademacher complexity bound based on which a new Multi-Task Multiple Kernel Learning (MT-MKL) model is derived. In particular, we propose a Support Vector Machine-regularized model in which, for each task, an optimal kernel is learned based on a neighborhood-defining kernel that is not restricted to be positive semi-definite. Comparative experimental results are showcased that underline the merits of our neighborhood-defining framework in both classification and regression problems.

    A Binary Classification Framework for Two-Stage Multiple Kernel Learning

    With the advent of kernel methods, automating the task of specifying a suitable kernel has become increasingly important. In this context, the Multiple Kernel Learning (MKL) problem of finding a combination of pre-specified base kernels that is suitable for the task at hand has received significant attention from researchers. In this paper we show that Multiple Kernel Learning can be framed as a standard binary classification problem with additional constraints that ensure the positive definiteness of the learned kernel. Framing MKL in this way has the distinct advantage that it makes it easy to leverage the extensive research in binary classification to develop better-performing and more scalable MKL algorithms that are conceptually simpler and, arguably, more accessible to practitioners. Experiments on nine data sets from different domains show that, despite its simplicity, the proposed technique compares favorably with current leading MKL approaches. Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

    Continuous Molecular Fields Approach Applied to Structure-Activity Modeling

    The Method of Continuous Molecular Fields is a universal approach to predicting various properties of chemical compounds, in which molecules are represented by means of continuous fields (such as electrostatic, steric, electron density functions, etc.). The essence of the proposed approach consists in performing statistical analysis of functional molecular data by jointly applying kernel machine learning methods and special kernels that compare molecules by computing overlap integrals of their molecular fields. This approach is an alternative to traditional methods of building 3D structure-activity and structure-property models based on the use of fixed sets of molecular descriptors. The methodology of the approach is described in this chapter, followed by its application to building regression 3D-QSAR models and conducting virtual screening based on one-class classification models. The main directions of the further development of this approach are outlined at the end of the chapter. Comment: To be published in Applications of Computational Techniques in Pharmacy and Medicine

    A Global Alignment Kernel based Approach for Group-level Happiness Intensity Estimation

    With the progress in automatic human behavior understanding, analysing the perceived affect of multiple people has received interest in the affective computing community. Unlike conventional facial expression analysis, this paper primarily focuses on analysing the behaviour of multiple people in an image. The proposed method is based on support vector regression with combined global alignment kernels (GAKs) to estimate the happiness intensity of a group of people. We first exploit Riesz-based volume local binary pattern (RVLBP) and deep convolutional neural network (CNN) based features for characterizing facial images. Furthermore, we propose to use a GAK for the RVLBP and deep CNN features, respectively, to explicitly measure the similarity of two group-level images. Specifically, we exploit the global weight sort scheme to sort the face images from a group-level image according to their spatial weights, producing an efficient data structure for the GAK. Lastly, we propose multiple kernel learning with three combination strategies to combine the two GAKs based on RVLBP and deep CNN features, thereby enhancing the discriminative ability of each GAK. Intensive experiments are performed on the challenging group-level happiness intensity database, namely HAPPEI. Our experimental results demonstrate that the proposed approach achieves promising performance for group happiness intensity analysis when compared with recent state-of-the-art methods.

    Transfer Adaptation Learning: A Decade Survey

    The world we see is ever-changing and it always changes with people, things, and the environment. A domain refers to the state of the world at a certain moment. A research problem is characterized as transfer adaptation learning (TAL) when it needs knowledge correspondence between different moments/domains. Conventional machine learning aims to find a model with the minimum expected risk on test data by minimizing the regularized empirical risk on the training data, which, however, supposes that the training and test data share a similar joint probability distribution. TAL aims to build models that can perform tasks in the target domain by learning knowledge from a semantically related but differently distributed source domain. It is an energetic research field of increasing influence and importance, with a rapidly growing number of publications. This paper surveys the advances of TAL methodologies in the past decade, and discusses the technical challenges and essential problems of TAL with deep insights and new perspectives. Broader solutions of transfer adaptation learning being created by researchers are identified, i.e., instance re-weighting adaptation, feature adaptation, classifier adaptation, deep network adaptation and adversarial adaptation, which are beyond the early semi-supervised and unsupervised split. The survey helps researchers rapidly but comprehensively understand and identify the research foundation, research status, theoretical limitations, future challenges and under-studied issues (universality, interpretability, and credibility) to be broken in the field toward universal representation and safe applications in open-world scenarios. Comment: 26 pages, 4 figures

    A Distributionally Robust Optimization Method for Adversarial Multiple Kernel Learning

    We propose a novel data-driven method to learn a mixture of multiple kernels with random features that is certifiably robust against adversarial inputs. Specifically, we consider a distributionally robust optimization of the kernel-target alignment with respect to the distribution of training samples over a distributional ball defined by the Kullback-Leibler (KL) divergence. The distributionally robust optimization problem can be recast as a min-max optimization whose objective function includes a log-sum term. We develop a mini-batch biased stochastic primal-dual proximal method to solve the min-max optimization. To debias the mini-batch algorithm, we use the Gumbel perturbation technique to estimate the log-sum term. We establish theoretical guarantees for the performance of the proposed multiple kernel learning method. In particular, we prove the consistency, asymptotic normality, stochastic equicontinuity, and the minimax rate of the empirical estimators. In addition, based on the notion of Rademacher and Gaussian complexities, we establish distributionally robust generalization bounds that are tighter than previously known bounds. More specifically, we leverage matrix concentration inequalities to establish distributionally robust generalization bounds. We validate our kernel learning approach for classification with kernel SVMs on a synthetic dataset generated by sampling multivariate Gaussian distributions with different variance structures. We also apply our kernel learning approach to the MNIST dataset and evaluate its robustness to perturbation of input images under different adversarial models. More specifically, we examine the robustness of the proposed kernel model selection technique against FGSM, PGM, C&W, and DDN adversarial perturbations, and compare its performance with alternative state-of-the-art multiple kernel learning paradigms. Comment: Major revision. The title and abstract have been updated

    Principled Non-Linear Feature Selection

    Recent non-linear feature selection approaches employing greedy optimisation of Centred Kernel Target Alignment (KTA) exhibit strong results in terms of generalisation accuracy and sparsity. However, they are computationally prohibitive for large datasets. We propose randSel, a randomised feature selection algorithm with attractive scaling properties. Our theoretical analysis of randSel provides strong probabilistic guarantees for correct identification of relevant features. RandSel's characteristics make it an ideal candidate for identifying informative learned representations. We have conducted experiments to establish the performance of this approach, and present encouraging results, including a 3rd position result in the recent ICML black box learning challenge as well as competitive results for signal peptide prediction, an important problem in bioinformatics. Comment: arXiv admin note: substantial text overlap with arXiv:1311.563

    CNN-based Action Recognition and Supervised Domain Adaptation on 3D Body Skeletons via Kernel Feature Maps

    Deep learning is ubiquitous across many areas of computer vision. It often requires large scale datasets for training before being fine-tuned on small-to-medium scale problems. Activity, or, in other words, action recognition, is one of many application areas of deep learning. While there exist many Convolutional Neural Network architectures that work with RGB and optical flow frames, training on the time sequences of 3D body skeleton joints is often performed via recurrent networks such as LSTM. In this paper, we propose a new representation which encodes sequences of 3D body skeleton joints in texture-like representations derived from mathematically rigorous kernel methods. Such a representation becomes the first layer in a standard CNN, e.g., ResNet-50, which is then used in the supervised domain adaptation pipeline to transfer information from the source to the target dataset. This lets us leverage the available Kinect-based data beyond training on a single dataset and outperform simple fine-tuning on any two datasets combined in a naive manner. More specifically, in this paper we utilize the overlapping classes between datasets. We associate datapoints of the same class via so-called commonality, known from supervised domain adaptation. We demonstrate state-of-the-art results on three publicly available benchmarks.

    Algorithms for Learning Kernels Based on Centered Alignment

    This paper presents new and effective algorithms for learning kernels. In particular, as shown by our empirical results, these algorithms consistently outperform the so-called uniform combination solution that has proven to be difficult to improve upon in the past, as well as other algorithms for learning kernels based on convex combinations of base kernels in both classification and regression. Our algorithms are based on the notion of centered alignment, which is used as a similarity measure between kernels or kernel matrices. We present a number of novel algorithmic, theoretical, and empirical results for learning kernels based on our notion of centered alignment. In particular, we describe efficient algorithms for learning a maximum alignment kernel by showing that the problem can be reduced to a simple QP, and discuss a one-stage algorithm for learning both a kernel and a hypothesis based on that kernel using an alignment-based regularization. Our theoretical results include a novel concentration bound for centered alignment between kernel matrices, the proof of the existence of effective predictors for kernels with high alignment, both for classification and for regression, and the proof of stability-based generalization bounds for a broad family of algorithms for learning kernels based on centered alignment. We also report the results of experiments with our centered alignment-based algorithms in both classification and regression.
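
    As a rough illustration of the centered alignment mentioned in the abstract above, the sketch below computes the standard quantity: both kernel matrices are centered, then their Frobenius inner product is divided by the product of their Frobenius norms. It is not the paper's learning algorithm; the function names (`center`, `centered_alignment`, `target_kernel`) and the toy data are assumptions made for this example.

```python
import math


def center(K):
    """Center a square kernel matrix: K_c = H K H, with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def frobenius(A, B):
    """Frobenius inner product <A, B>_F of two equally sized matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def centered_alignment(K1, K2):
    """Centered alignment: <K1c, K2c>_F / (||K1c||_F * ||K2c||_F)."""
    K1c, K2c = center(K1), center(K2)
    denom = math.sqrt(frobenius(K1c, K1c) * frobenius(K2c, K2c))
    if denom == 0.0:
        raise ValueError("alignment is undefined for a kernel that centers to zero")
    return frobenius(K1c, K2c) / denom


def target_kernel(y):
    """Label kernel y y^T, the usual alignment target for labels or responses."""
    return [[yi * yj for yj in y] for yi in y]


# Hypothetical toy data: a linear kernel on 1-D inputs against +/-1 labels.
x = [2.0, 1.5, -1.0, -2.5]
y = [1, 1, -1, -1]
K = [[a * b for b in x] for a in x]
print(centered_alignment(K, target_kernel(y)))
```

    The value is invariant to positive rescaling of either kernel, and a kernel always has alignment 1 with itself.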

    An introduction to domain adaptation and transfer learning

    In machine learning, if the training data is an unbiased sample of an underlying distribution, then the learned classification function will make accurate predictions for new samples. However, if the training data is not an unbiased sample, then there will be differences between how the training data is distributed and how the test data is distributed. Standard classifiers cannot cope with changes in data distributions between training and test phases, and will not perform well. Domain adaptation and transfer learning are sub-fields within machine learning that are concerned with accounting for these types of changes. Here, we present an introduction to these fields, guided by the question: when and how can a classifier generalize from a source to a target domain? We will start with a brief introduction to risk minimization, and how transfer learning and domain adaptation expand upon this framework. Following that, we discuss three special cases of data set shift, namely prior, covariate and concept shift. For more complex domain shifts, there are a wide variety of approaches. These are categorized into: importance-weighting, subspace mapping, domain-invariant spaces, feature augmentation, minimax estimators and robust algorithms. A number of points will arise, which we will discuss in the last section. We conclude with the remark that many open questions will have to be addressed before transfer learners and domain-adaptive classifiers become practical. Comment: Technical Report. 41 pages, 5 figures
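
    The importance-weighting idea named in the abstract above can be sketched as follows. Under covariate shift, each source-domain loss is reweighted by the density ratio w(x) = p_target(x) / p_source(x). This is a minimal sketch that assumes both densities are known 1-D Gaussians, which is rarely true in practice (the ratio usually has to be estimated); the function names and numbers are hypothetical and are not taken from the report.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, source, target):
    """Density ratios w(x) = p_target(x) / p_source(x).

    `source` and `target` are (mean, std) pairs; assuming known Gaussians
    is a simplification made only for this illustration.
    """
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_risk(losses, weights):
    """Importance-weighted empirical risk: (1/n) * sum_i w_i * loss_i."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical source samples and per-sample losses of some fixed classifier.
xs = [-1.0, 0.0, 1.0]
losses = [1.0, 0.0, 1.0]
weights = importance_weights(xs, source=(0.0, 1.0), target=(1.0, 1.0))
print(weighted_risk(losses, weights))
```

    When the source and target densities coincide, every weight equals 1 and the estimate reduces to the ordinary empirical risk.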