Recursive Aggregation of Estimators by Mirror Descent Algorithm with Averaging
We consider a recursive algorithm to construct an aggregated estimator from a
finite number of base decision rules in the classification problem. The
estimator approximately minimizes a convex risk functional under the
l1-constraint. It is defined by a stochastic version of the mirror descent
algorithm (i.e., of the method which performs gradient descent in the dual
space) with an additional averaging. The main result of the paper is an upper
bound for the expected accuracy of the proposed estimator. This bound is of the
order $\sqrt{(\log M)/t}$ with an explicit and small constant factor, where
$M$ is the dimension of the problem and $t$ stands for the sample size. A similar
bound is proved for a more general setting that covers, in particular, the
regression model with squared loss.
Comment: 29 pages
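To make the recursion concrete, here is a minimal sketch of this flavor of algorithm: stochastic mirror descent with the entropic mirror map on the probability simplex (exponentiated-gradient updates in the dual space), followed by averaging of the iterates. The hinge surrogate loss, the step-size schedule, and all names here are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def mirror_descent_aggregate(F, y, L=1.0):
    """Aggregate M base decision rules by stochastic mirror descent
    with averaging, using the entropic mirror map on the simplex
    (exponentiated-gradient updates).

    F : (t, M) array; F[i, j] is base rule j evaluated on example i
    y : (t,) array of labels in {-1, +1}
    """
    t, M = F.shape
    theta = np.zeros(M)                 # dual variable
    avg = np.zeros(M)                   # running sum of primal iterates
    for i in range(t):
        # primal weights: gradient of the entropic potential at theta
        w = np.exp(theta - theta.max())
        w /= w.sum()
        avg += w
        # stochastic subgradient of a convex surrogate (hinge) loss
        if y[i] * (F[i] @ w) < 1.0:
            g = -y[i] * F[i]
        else:
            g = np.zeros(M)
        theta -= g / (L * np.sqrt(i + 1))   # gradient step in the dual space
    return avg / t                      # averaged weights on the simplex
```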
Combining Kernel Functions in Supervised Learning Models.
The research activity has mainly dealt with supervised Machine Learning algorithms,
specifically within the context of kernel methods. A kernel function is a positive definite
function mapping data from the original input space into a higher-dimensional Hilbert
space. Unlike classical linear methods, where problems are solved by seeking a linear
function that separates points in the input space, kernel methods all share the same
basic idea: the original input data are mapped into a higher-dimensional feature space
whose new coordinates are never computed explicitly; only inner products between
mapped input points are evaluated. In this way, kernel methods make it possible to deal
with non-linearly separable data sets while still using linear models, that is, Machine
Learning methods that fit the given data with a linear function, in the feature space.
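To make the "inner products only" point concrete, a minimal NumPy sketch of a Gaussian (RBF) kernel is given below: the Gram matrix it returns contains exactly the feature-space inner products a linear method needs, while the (infinite-dimensional) feature map itself is never constructed. Function names and the example data are illustrative.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2): the
    feature-space inner products, computed without ever forming the
    (infinite-dimensional) feature map."""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.clip(sq, 0.0, None))

X = np.random.randn(5, 3)
K = rbf_kernel(X, X)   # 5x5 matrix of pairwise "feature-space" inner products
```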
Instead of employing one single kernel function, Multiple Kernel Learning algorithms
tackle the problem of selecting kernel functions by using a combination of preset base
kernels. Infinite Kernel Learning further extends this idea by exploiting a combination
of possibly infinitely many base kernels. The core idea of the research activity is to
utilize a novel combination of kernel functions in existing or modified supervised
Machine Learning frameworks. Specifically, we considered two frameworks: the Extreme
Learning Machine, which has the structure of a classical feedforward Neural Network but
whose hidden-node parameters are randomly assigned at the start of the algorithm; and
the Support Vector Machine, a class of linear algorithms based on the idea of separating
data with a hyperplane having as wide a margin as possible. The first proposed model
extends the classical Extreme Learning Machine formulation using a combination of
possibly infinitely many base kernels, via a two-step algorithm.
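As an illustration of the multiple-kernel idea (a generic convex combination of preset base kernels, not the thesis's specific two-step or infinite-kernel algorithm; names are illustrative):

```python
import numpy as np

def combined_kernel(grams, weights):
    """Convex combination of precomputed base Gram matrices.

    grams   : list of (n, n) positive semidefinite Gram matrices
    weights : nonnegative mixing weights (normalized to sum to one)
    A convex combination of PSD kernels is again a valid kernel,
    so the result can be fed to any kernel method unchanged.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, grams))
```

In Multiple Kernel Learning the weights themselves are treated as extra parameters and optimized jointly with the underlying model, rather than fixed in advance.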
The second result uses a preexisting multi-task kernel function in a novel Support
Vector Machine framework. Multi-task learning is the Machine Learning problem of
solving more than one task at the same time, with the main goal of exploiting the
existing relationships among tasks. To be able to use the existing multi-task kernel
function, we constructed a new framework based on the classical Support Vector
Machine one, taking care of every multi-task correlation factor.
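The abstract does not specify which multi-task kernel is used; one standard construction from the literature (in the style of Evgeniou and Pontil) couples a base kernel on the inputs with a positive semidefinite task-similarity matrix. The sketch below shows that flavor only, with illustrative names and values.

```python
import numpy as np

def multitask_kernel(k_base, B):
    """Kernel over (input, task) pairs:
    K((x, s), (z, t)) = k_base(x, z) * B[s, t].
    The product of PSD kernels is PSD, so this is a valid kernel."""
    def K(x, s, z, t):
        return k_base(x, z) * B[s, t]
    return K

# Two tasks assumed moderately related (illustrative values)
B = np.array([[1.0, 0.5],
              [0.5, 1.0]])
K = multitask_kernel(lambda x, z: float(np.dot(x, z)), B)
```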
Local-Aggregate Modeling for Big-Data via Distributed Optimization: Applications to Neuroimaging
Technological advances have led to a proliferation of structured big data
that have matrix-valued covariates. We are specifically motivated to build
predictive models for multi-subject neuroimaging data based on each subject's
brain imaging scans. This is an ultra-high-dimensional problem that consists of
a matrix of covariates (brain locations by time points) for each subject; few
methods currently exist to fit supervised models directly to this tensor data.
We propose a novel modeling and algorithmic strategy to apply generalized
linear models (GLMs) to this massive tensor data in which one set of variables
is associated with locations. Our method begins by fitting GLMs to each
location separately, and then builds an ensemble by blending information across
locations through regularization with what we term an aggregating penalty. Our
so-called Local-Aggregate Model can be fit in a completely distributed manner
over the locations using an Alternating Direction Method of Multipliers (ADMM)
strategy, and thus greatly reduces the computational burden. Furthermore, we
propose to select the appropriate model through a novel sequence of faster
algorithmic solutions that is similar to regularization paths. We will
demonstrate both the computational and predictive modeling advantages of our
methods via simulations and an EEG classification problem.
Comment: 41 pages, 5 figures and 3 tables
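As a rough illustration of the distributed pattern described above (not the paper's Local-Aggregate algorithm), the following consensus-ADMM sketch fits one least-squares model per location and couples the locations only through a shared variable updated under a ridge-style stand-in for the aggregating penalty; the problem sizes, the penalty, and the parameters are illustrative assumptions.

```python
import numpy as np

def consensus_admm(Xs, ys, lam=1.0, rho=1.0, iters=100):
    """Consensus ADMM for sum_l 0.5*||X_l b_l - y_l||^2 + 0.5*lam*||z||^2
    subject to b_l = z. The local updates touch only that location's
    data, so they can run fully in parallel across locations.

    Xs : list of (n_l, p) design matrices, one per location
    ys : list of (n_l,) responses, one per location
    """
    L, p = len(Xs), Xs[0].shape[1]
    B = np.zeros((L, p))            # local coefficients
    U = np.zeros((L, p))            # scaled dual variables
    z = np.zeros(p)                 # shared (aggregated) coefficients
    for _ in range(iters):
        for l in range(L):          # distributable local solves
            A = Xs[l].T @ Xs[l] + rho * np.eye(p)
            b = Xs[l].T @ ys[l] + rho * (z - U[l])
            B[l] = np.linalg.solve(A, b)
        # global update: proximal step of the aggregating penalty
        z = rho * (B + U).sum(0) / (lam + rho * L)
        U += B - z                  # dual ascent on the consensus gap
    return z, B
```

Because each local solve sees only its own (X_l, y_l), the inner loop maps directly onto one worker per location, which is the source of the computational savings the abstract describes.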