209 research outputs found

    On Quantile Regression in Reproducing Kernel Hilbert Spaces with Data Sparsity Constraint

    For spline regression, it is well known that the choice of knots is crucial to the performance of the estimator. Learning in a Reproducing Kernel Hilbert Space (RKHS), a general framework that covers smoothing splines, has a similar issue. However, the selection of training data points for the kernel functions in the RKHS representation has not been carefully studied in the literature. In this paper we study quantile regression as an example of learning in an RKHS. In this case, the regular squared norm penalty does not perform training data selection. We propose a data sparsity constraint that imposes thresholding on the kernel function coefficients to achieve a sparse kernel function representation. We demonstrate that the proposed data sparsity method can deliver competitive prediction performance in certain situations and performance comparable to that of the traditional squared norm penalty in others. The data sparsity method can therefore serve as a competitive alternative to the squared norm penalty. Some theoretical properties of the proposed data sparsity constraint are established, and both simulated and real data sets are used to demonstrate its usefulness.
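The thresholded-coefficient idea in this abstract can be illustrated with a minimal sketch: kernel quantile regression under the pinball (check) loss, with an L1 penalty and soft-thresholding on the coefficients as a convex surrogate for the data sparsity constraint. All hyperparameters and helper names below are illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_kernel_quantile_fit(X, y, tau=0.5, lam=0.1, lr=0.01, steps=2000):
    """Fit f(x) = sum_i alpha_i k(x_i, x) by minimising the pinball loss
    with an L1 penalty on alpha (an illustrative convex surrogate for the
    data-sparsity constraint), via ISTA-style proximal subgradient steps."""
    n = len(y)
    K = rbf_kernel(X, X)
    alpha = np.zeros(n)
    for _ in range(steps):
        r = y - K @ alpha                       # residuals
        # subgradient of the pinball loss sum_i rho_tau(r_i) w.r.t. alpha
        g = np.where(r > 0, -tau, 1.0 - tau)
        alpha -= lr * (K.T @ g) / n
        # soft-thresholding drives small coefficients to exactly zero,
        # yielding the sparse kernel representation the abstract describes
        alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - lr * lam, 0.0)
    return alpha

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(80, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=80)
alpha = sparse_kernel_quantile_fit(X, y, tau=0.5)
print("nonzero coefficients:", np.count_nonzero(alpha), "of", len(alpha))
```

Training points whose coefficients are thresholded to zero drop out of the fitted function, which is the "training data selection" that the plain squared norm penalty does not perform.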

    Multi-view Metric Learning in Vector-valued Kernel Spaces

    We consider the problem of metric learning for multi-view data and present a novel method for learning within-view as well as between-view metrics in vector-valued kernel spaces, as a way to capture the multi-modal structure of the data. We formulate two convex optimization problems to jointly learn the metric and the classifier or regressor in kernel feature spaces. An iterative three-step multi-view metric learning algorithm is derived from the optimization problems. In order to scale the computation to large training sets, a block-wise Nyström approximation of the multi-view kernel matrix is introduced. We justify our approach theoretically and experimentally, and show its performance on real-world datasets against relevant state-of-the-art methods.
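The scaling device mentioned here, a Nyström approximation of a kernel matrix, can be sketched in its standard single-view form; the paper's block-wise multi-view variant applies the same idea per view-block. Landmark count, kernel choice, and data are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, m=50, gamma=1.0, seed=0):
    """Standard Nystrom approximation K ~= C W^+ C^T, returned as a
    feature map Phi with Phi @ Phi.T ~= K, using m random landmarks."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)           # n x m cross-kernel block
    W = rbf_kernel(X[idx], X[idx], gamma)      # m x m landmark kernel
    # eigendecomposition of W gives a numerically safe W^{-1/2}
    vals, vecs = np.linalg.eigh(W)
    vals = np.maximum(vals, 1e-12)
    W_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return C @ W_inv_sqrt                      # n x m feature matrix

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
Phi = nystrom_features(X, m=50)
K_approx = Phi @ Phi.T
K_exact = rbf_kernel(X, X)
err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {err:.3f}")
```

Because downstream solvers only need products with the m-dimensional features rather than the full n x n Gram matrix, the cost of the metric-learning steps drops from O(n^2) storage to O(nm).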

    Learning Theory and Approximation

    Learning theory studies data structures from samples and aims at understanding the unknown functional relations behind them. This leads to interesting theoretical problems which can often be attacked with methods from Approximation Theory. This workshop - the second of this type at the MFO - concentrated on the following recent topics: learning of manifolds and the geometry of data; sparsity and dimension reduction; error analysis and algorithmic aspects, including kernel-based methods for regression and classification; and the application of multiscale aspects and refinement algorithms to learning.

    Function Embeddings for Multi-modal Bayesian Inference

    Tractable Bayesian inference is a fundamental challenge in robotics and machine learning. Standard approaches such as Gaussian process regression and Kalman filtering make strong Gaussianity assumptions about the underlying distributions. Such assumptions, however, can quickly break down when dealing with complex systems such as the dynamics of a robot or multi-variate spatial models. In this thesis we aim to solve Bayesian regression and filtering problems without making assumptions about the underlying distributions. We develop techniques to produce rich posterior representations for complex, multi-modal phenomena. Our work extends kernel Bayes' rule (KBR), which uses empirical estimates of distributions derived from a set of training samples and embeds them into a high-dimensional reproducing kernel Hilbert space (RKHS). Bayes' rule itself operates on elements of this space. Our first contribution is an efficient method for estimating posterior density functions from kernel Bayes' rule, applied to both filtering and regression. By embedding fixed-mean mixtures of component distributions, we can efficiently find an approximate pre-image by optimising the mixture weights using a convex quadratic program. The result is a complex, multi-modal posterior representation. Our next contributions are methods for estimating cumulative distributions and quantile estimates from the posterior embedding of kernel Bayes' rule. We examine a number of novel methods, including those based on our density estimation techniques, as well as directly estimating the cumulative distribution through the reproducing property of RKHSs. Finally, we develop a novel method for scaling kernel Bayes' rule inference to large datasets, using a reduced-set construction optimised via the posterior likelihood. This method retains the ability to perform multi-output inference, as well as our earlier contributions of explicitly non-Gaussian posterior representations and quantile estimates.
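The pre-image step described here, recovering mixture weights from an RKHS posterior embedding, amounts to a small constrained quadratic problem. The sketch below solves it with projected gradient descent onto the probability simplex as a stand-in for the thesis's QP solver; the kernel, data, and coefficient vector are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def project_simplex(v):
    # Euclidean projection onto {w : w >= 0, sum(w) = 1}
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def preimage_weights(K, beta, steps=500):
    """Given a posterior embedding mu = sum_j beta_j phi(x_j) (beta may be
    signed), find simplex weights w minimising the RKHS distance
    ||sum_i w_i phi(x_i) - mu||^2 = w^T K w - 2 w^T K beta + const,
    by projected gradient descent with a safe 1/L step size."""
    L = 2.0 * np.linalg.eigvalsh(K).max()   # gradient Lipschitz constant
    lr = 1.0 / L
    w = np.full(len(beta), 1.0 / len(beta))
    for _ in range(steps):
        grad = 2.0 * (K @ w - K @ beta)
        w = project_simplex(w - lr * grad)
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 2))
K = rbf_kernel(X, X, gamma=0.5)
beta = rng.normal(size=60)                  # signed KBR coefficients
w = preimage_weights(K, beta)
print("weights sum:", w.sum(), "min weight:", w.min())
```

The returned weights are nonnegative and sum to one, so they define a valid mixture, which is what lets the embedding be read back as an explicit multi-modal posterior density.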