
    Recursive Aggregation of Estimators by Mirror Descent Algorithm with Averaging

    We consider a recursive algorithm to construct an aggregated estimator from a finite number of base decision rules in the classification problem. The estimator approximately minimizes a convex risk functional under the $\ell_1$-constraint. It is defined by a stochastic version of the mirror descent algorithm (i.e., of the method which performs gradient descent in the dual space) with an additional averaging. The main result of the paper is an upper bound for the expected accuracy of the proposed estimator. This bound is of the order $\sqrt{(\log M)/t}$ with an explicit and small constant factor, where $M$ is the dimension of the problem and $t$ stands for the sample size. A similar bound is proved for a more general setting that covers, in particular, the regression model with squared loss. Comment: 29 pages; mai 200
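The idea of taking gradient steps in the dual space and averaging the iterates can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the weights live on the probability simplex (a simplification of the $\ell_1$-constraint), uses the hinge loss as the convex surrogate, and picks a fixed step size of order $\sqrt{(\log M)/t}$. The function name and input layout are hypothetical.

```python
import math


def mirror_descent_averaged(samples, M, step=None):
    """Entropic stochastic mirror descent on the simplex with iterate averaging.

    samples: list of (h, y) pairs, where h holds the M base-rule outputs
             in [-1, 1] and y is a label in {-1, +1} (assumed layout).
    Returns the average of the weight vectors visited.
    """
    t = len(samples)
    if step is None:
        # Assumed fixed step size of order sqrt(log M / t).
        step = math.sqrt(2.0 * math.log(M) / t)
    zeta = [0.0] * M        # dual variable (accumulated negative gradients)
    theta = [1.0 / M] * M   # primal iterate, starts at uniform weights
    avg = [0.0] * M
    for h, y in samples:
        # Add the current iterate to the running average.
        for j in range(M):
            avg[j] += theta[j] / t
        # Subgradient of the hinge loss max(0, 1 - y * <theta, h>).
        margin = y * sum(w * hj for w, hj in zip(theta, h))
        grad = [-y * hj if margin < 1.0 else 0.0 for hj in h]
        # Gradient step in the dual space.
        zeta = [z - step * g for z, g in zip(zeta, grad)]
        # Map back to the simplex (softmax, shifted for numerical stability).
        zmax = max(zeta)
        expz = [math.exp(z - zmax) for z in zeta]
        total = sum(expz)
        theta = [e / total for e in expz]
    return avg
```

The returned vector is a convex combination of the base rules; the averaging step is what the abstract refers to as "an additional averaging".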

    Combining Kernel Functions in Supervised Learning Models.

    The research activity has mainly dealt with supervised Machine Learning algorithms, specifically within the context of kernel methods. A kernel function is a positive definite function mapping data from the original input space into a higher dimensional Hilbert space. Unlike classical linear methods, where problems are solved by seeking a linear function that separates points in the input space, kernel methods all share the same basic idea: the original input data is mapped onto a higher dimensional feature set where the new coordinates are not computed, only the inner products of input points. In this way, kernel methods make it possible to deal with non-linearly separable sets of data by using linear models in the feature space, that is, any Machine Learning method that uses a linear function to determine the best fit for a given set of data. Instead of employing one single kernel function, Multiple Kernel Learning algorithms tackle the problem of selecting kernel functions by using a combination of preset base kernels. Infinite Kernel Learning further extends this idea by exploiting a combination of possibly infinitely many base kernels. The core idea of the research activity is to utilize a novel complex combination of kernel functions in already existing or modified supervised Machine Learning frameworks. Specifically, we considered two frameworks: Extreme Learning Machine, which has the structure of classical feedforward Neural Networks but is characterized by hidden node variables randomly assigned at the beginning of the algorithm; and Support Vector Machine, a class of linear algorithms based on the idea of separating data with a hyperplane having as wide a margin as possible. The first proposed model extends the classical Extreme Learning Machine formulation using a combination of possibly infinitely many base kernels, presenting a two-step algorithm. The second result uses a preexisting multi-task kernel function in a novel Support Vector Machine framework.
Multi-task learning defines the Machine Learning problem of solving more than one task at the same time, with the main goal of taking into account the existing multi-task relationships. To be able to use the existing multi-task kernel function, we had to construct a new framework based on the classical Support Vector Machine one, taking care of every multi-task correlation factor.
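Two points from this abstract lend themselves to a small illustration: a kernel returns a feature-space inner product without computing the feature coordinates, and a nonnegative weighted sum of base kernels is again a kernel. The sketch below is only illustrative; the choice of base kernels, the fixed weights, and all function names are assumptions, and no weight learning (as in Multiple Kernel Learning) is performed.

```python
import math


def dot(x, z):
    """Plain inner product in the input space."""
    return sum(a * b for a, b in zip(x, z))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = <x, z>^2."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map matching poly2_kernel for 2-D inputs."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an assumed bandwidth parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def combine(kernels, weights):
    """Weighted sum of base kernels; weights must be nonnegative."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")

    def combined(x, z):
        return sum(w * k(x, z) for w, k in zip(weights, kernels))

    return combined


def gram_matrix(kernel, points):
    """Matrix of pairwise kernel evaluations."""
    return [[kernel(p, q) for q in points] for p in points]
```

For 2-D inputs, `poly2_kernel(x, z)` agrees with `dot(poly2_features(x), poly2_features(z))`, which is the sense in which the feature coordinates never need to be computed. A combined kernel such as `combine([poly2_kernel, rbf_kernel], [0.3, 0.7])` can be passed to `gram_matrix` like any single kernel.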

    Local-Aggregate Modeling for Big-Data via Distributed Optimization: Applications to Neuroimaging

    Technological advances have led to a proliferation of structured big data that have matrix-valued covariates. We are specifically motivated to build predictive models for multi-subject neuroimaging data based on each subject's brain imaging scans. This is an ultra-high-dimensional problem that consists of a matrix of covariates (brain locations by time points) for each subject; few methods currently exist to fit supervised models directly to this tensor data. We propose a novel modeling and algorithmic strategy to apply generalized linear models (GLMs) to this massive tensor data in which one set of variables is associated with locations. Our method begins by fitting GLMs to each location separately, and then builds an ensemble by blending information across locations through regularization with what we term an aggregating penalty. Our so-called Local-Aggregate Model can be fit in a completely distributed manner over the locations using an Alternating Direction Method of Multipliers (ADMM) strategy, which greatly reduces the computational burden. Furthermore, we propose to select the appropriate model through a novel sequence of faster algorithmic solutions that is similar to regularization paths. We will demonstrate both the computational and predictive modeling advantages of our methods via simulations and an EEG classification problem. Comment: 41 pages, 5 figures and 3 tables
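The general pattern of solving local problems separately and then coordinating them through ADMM can be sketched with a toy consensus problem. This is not the Local-Aggregate Model: the abstract does not specify the aggregating penalty, so the sketch swaps in a hard consensus constraint and scalar least-squares local fits. The function name, `rho`, and the iteration count are assumptions.

```python
def consensus_admm(local_targets, rho=1.0, iters=100):
    """Toy consensus ADMM (scaled form).

    Minimizes sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z for all i,
    where a_i = local_targets[i]. Each x_i update uses only its own a_i,
    so the local steps could run in parallel.
    """
    n = len(local_targets)
    x = [0.0] * n   # local variables, one per "location"
    u = [0.0] * n   # scaled dual variables
    z = 0.0         # shared consensus variable
    for _ in range(iters):
        # Local step: closed-form minimizer of
        # 0.5 * (x - a)^2 + (rho / 2) * (x - z + u)^2.
        x = [(a + rho * (z - ui)) / (1.0 + rho)
             for a, ui in zip(local_targets, u)]
        # Aggregation step: average of the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate the disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return x, z
```

In this toy problem the consensus value converges to the mean of the local targets, so `consensus_admm([1.0, 2.0, 6.0])` returns a `z` close to 3.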

    Kernel Methods for Machine Learning with Life Science Applications
