4,201 research outputs found
A study of the classification of low-dimensional data with supervised manifold learning
Supervised manifold learning methods learn data representations by preserving
the geometric structure of data while enhancing the separation between data
samples from different classes. In this work, we propose a theoretical study of
supervised manifold learning for classification. We consider nonlinear
dimensionality reduction algorithms that yield linearly separable embeddings of
training data and present generalization bounds for this type of algorithm. A
necessary condition for satisfactory generalization performance is that the
embedding allow the construction of a sufficiently regular interpolation
function in relation to the separation margin of the embedding. We show that
for supervised embeddings satisfying this condition, the classification error
decays at an exponential rate with the number of training samples. Finally, we
examine the separability of supervised nonlinear embeddings that aim to
preserve the low-dimensional geometric structure of data based on graph
representations. The proposed analysis is supported by experiments on several
real data sets.
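To make the setting concrete, here is a minimal, hypothetical sketch (not the paper's method) of a supervised, Laplacian-eigenmaps-style embedding followed by a linear classifier: class labels are used to down-weight between-class graph edges, and the linear separability of the resulting embedding is then checked with a linear SVM. The dataset, weighting scheme, and parameters are illustrative assumptions.

```python
# Hypothetical sketch of a supervised graph-based nonlinear embedding followed by a
# linear classifier; illustrates the setting of the abstract, not the paper's algorithm.
import numpy as np
from scipy.linalg import eigh
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

def supervised_embedding(X, y, dim=2, sigma=1.0, cross_class_weight=0.1):
    # Gaussian affinities, down-weighted between samples of different classes
    # so that the embedding enhances class separation.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    same = (y[:, None] == y[None, :]).astype(float)
    W *= same + cross_class_weight * (1 - same)
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # unnormalized graph Laplacian
    # Smallest non-trivial generalized eigenvectors of (L, D) give the embedding.
    vals, vecs = eigh(L, D)
    return vecs[:, 1:dim + 1]

X, y = load_iris(return_X_y=True)
Z = supervised_embedding(X, y, dim=2)
clf = LinearSVC(C=10.0, max_iter=10000).fit(Z, y)   # check linear separability of the embedding
print("training accuracy of a linear classifier on the embedding:", clf.score(Z, y))
```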
Invertibility of symmetric random matrices
We study n by n symmetric random matrices H, possibly discrete, with iid
above-diagonal entries. We show that H is singular with probability at most
exp(-n^c), and the spectral norm of the inverse of H is O(sqrt{n}).
Furthermore, the spectrum of H is delocalized on the optimal scale o(n^{-1/2}).
These results improve upon a polynomial singularity bound due to Costello, Tao
and Vu, and they generalize, up to constant factors, results of Tao and Vu, and
Erdos, Schlein and Yau.
Comment: 53 pages. Minor corrections, changes in presentation. To appear in
Random Structures and Algorithms.
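A small Monte Carlo check (ours, not the paper's) of what the bound on the inverse means in practice: for symmetric matrices with iid +/-1 above-diagonal entries, the smallest singular value should stay of order n^{-1/2}, so s_min(H) * sqrt(n) should remain roughly constant as n grows. The matrix sizes and trial counts below are arbitrary.

```python
# Monte Carlo illustration: for symmetric random sign matrices H, the smallest singular
# value s_min(H) is of order n^{-1/2}, equivalently ||H^{-1}|| = 1/s_min(H) = O(sqrt(n)).
import numpy as np

rng = np.random.default_rng(0)

def median_smin(n, trials=20):
    vals = []
    for _ in range(trials):
        A = rng.choice([-1.0, 1.0], size=(n, n))
        H = np.triu(A) + np.triu(A, 1).T        # symmetric, iid +/-1 above-diagonal entries
        vals.append(np.linalg.svd(H, compute_uv=False).min())
    return np.median(vals)

for n in (50, 100, 200, 400):
    print(n, median_smin(n) * np.sqrt(n))       # roughly constant if s_min ~ n^{-1/2}
```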
Regularization in kernel learning
Under mild assumptions on the kernel, we obtain the best known error rates in
a regularized learning scenario taking place in the corresponding reproducing
kernel Hilbert space (RKHS). The main novelty in the analysis is a proof that
one can use a regularization term that grows significantly slower than the
standard quadratic growth in the RKHS norm.
Comment: Published at http://dx.doi.org/10.1214/09-AOS728 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
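For context only, the standard quadratic-penalty baseline the abstract refers to (kernel ridge regression with penalty lambda * ||f||_K^2) can be written in a few lines. The paper's contribution concerns regularization terms that grow more slowly than this quadratic norm, which the sketch below does not implement; data and parameters are illustrative.

```python
# Regularized learning in an RKHS with the *standard* quadratic penalty (kernel ridge
# regression), shown only as the baseline the abstract compares against.
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * d2)

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    K = gaussian_kernel(X, X, gamma)
    # Representer theorem: f(x) = sum_i alpha_i k(x_i, x); the quadratic RKHS penalty
    # gives the closed form alpha = (K + n*lam*I)^{-1} y.
    return np.linalg.solve(K + len(y) * lam * np.eye(len(y)), y)

def kernel_ridge_predict(X_train, alpha, X_test, gamma=1.0):
    return gaussian_kernel(X_test, X_train, gamma) @ alpha

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
alpha = kernel_ridge_fit(X, y)
X_test = np.linspace(-3, 3, 5)[:, None]
print(kernel_ridge_predict(X, alpha, X_test))
```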
Local Rademacher Complexity-based Learning Guarantees for Multi-Task Learning
We show a Talagrand-type concentration inequality for Multi-Task Learning
(MTL), using which we establish sharp excess risk bounds for MTL in terms of
distribution- and data-dependent versions of the Local Rademacher Complexity
(LRC). We also give a new bound on the LRC for norm regularized as well as
strongly convex hypothesis classes, which applies not only to MTL but also to
the standard i.i.d. setting. Combining both results, one can now easily derive
fast-rate bounds on the excess risk for many prominent MTL methods,
including---as we demonstrate---Schatten-norm, group-norm, and
graph-regularized MTL. The derived bounds reflect a relationship akin to a
conservation law of asymptotic convergence rates. This very relationship allows
for trading off slower rates w.r.t. the number of tasks for faster rates with
respect to the number of available samples per task, when compared to the rates
obtained via a traditional, global Rademacher analysis.
Comment: In this version, some arguments and results (of the previous version)
have been corrected or modified.
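As an illustration of one hypothesis class such bounds cover, here is a hedged sketch of group-norm (l_{2,1}) regularized multi-task least squares solved by proximal gradient descent. The synthetic data, step size, and function names are our own assumptions, not the authors' code or experiments.

```python
# Illustrative sketch: group-norm (l_{2,1}) regularized multi-task least squares,
# solved by proximal gradient; each feature's weights are shrunk jointly across tasks.
import numpy as np

def prox_group_norm(W, t):
    # Row-wise soft thresholding of the d x T weight matrix (one row per feature).
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * W

def mtl_group_lasso(Xs, ys, lam=0.1, lr=0.1, iters=500):
    d, T = Xs[0].shape[1], len(Xs)
    W = np.zeros((d, T))
    for _ in range(iters):
        G = np.zeros_like(W)
        for t in range(T):                        # gradient of the averaged squared losses
            G[:, t] = Xs[t].T @ (Xs[t] @ W[:, t] - ys[t]) / len(ys[t])
        W = prox_group_norm(W - lr * G, lr * lam) # proximal step for the l_{2,1} penalty
    return W

rng = np.random.default_rng(2)
d, T, n = 20, 5, 100
W_true = np.zeros((d, T))
W_true[:5] = rng.standard_normal((5, T))          # support shared across all tasks
Xs = [rng.standard_normal((n, d)) for _ in range(T)]
ys = [X @ W_true[:, t] + 0.1 * rng.standard_normal(n) for t, X in enumerate(Xs)]
W_hat = mtl_group_lasso(Xs, ys)
print("rows recovered as nonzero:", np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 1e-3))
```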