44 research outputs found

    Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction

    We study the interplay between surrogate methods for structured prediction and techniques from multitask learning designed to leverage relationships between surrogate outputs. We propose an efficient algorithm based on trace norm regularization which, unlike previous methods, does not require explicit knowledge of the coding/decoding functions of the surrogate framework. As a result, our algorithm can be applied to the broad class of problems in which the surrogate space is large or even infinite dimensional. We derive excess risk bounds for trace norm regularized structured prediction, which imply consistency and learning rates for our estimator. We also identify relevant regimes in which our approach can enjoy better generalization performance than previous methods. Numerical experiments on ranking problems indicate that enforcing low-rank relations among surrogate outputs may indeed provide a significant advantage in practice.
    Comment: 42 pages, 1 table
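    The trace norm (nuclear norm) penalty at the heart of this abstract is typically handled with proximal methods whose key step is singular value soft-thresholding. The sketch below is not the paper's estimator (which operates implicitly in the surrogate space); it is a minimal, assumed least-squares instance of trace norm regularized multi-output regression, with all function names my own:

```python
import numpy as np

def svt(W, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_* (trace norm)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)

def trace_norm_regression(X, Y, lam, iters=500):
    """Proximal gradient for  min_W  0.5 * ||X W - Y||_F^2 + lam * ||W||_*  (illustrative only)."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2        # 1/L, with L the Lipschitz constant of the gradient
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        grad = X.T @ (X @ W - Y)                  # gradient of the smooth least-squares term
        W = svt(W - step * grad, step * lam)      # proximal step shrinks the singular values of W
    return W
```

    Shrinking the singular values of W drives it toward low rank, which couples the output columns together; that coupling is the low-rank relation among surrogate outputs the abstract refers to.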

    Some Statistical Properties of Spectral Regression Estimators

    In this thesis we explore different spectral regression estimators in order to solve the regression problem that arises when multiple columns of the design matrix are linearly dependent. We explore two scenarios:
    • Scenario 1: p ≪ n, where there exist at least two columns xj and xk that are nearly linearly dependent, which indicates collinearity and makes X⊤X nearly singular.
    • Scenario 2: n ≪ p, where there are more predictors than observations, so some columns must be linear combinations of other columns, which indicates linear dependence.
    In both scenarios the matrix X⊤X (which arises when solving the normal equations) is ill conditioned due to collinearity: it becomes singular, making the least squares estimate unstable and impossible to compute. In this thesis we explore different methods (variable selection, regularization, compression and dimensionality reduction) that address this issue. For variable selection we use Stepwise Selection regression as well as Best Subset Selection regression. Two approaches to Stepwise Selection regression are assessed: Forward Selection and Backward Elimination. Performance of our regression models is assessed with criterion-based procedures such as AIC, BIC, R², adjusted R² and Mallow's Cp statistic. Chapter three introduces the concepts of general regularization and Ridge Regression, as well as subsequent shrinkage methods such as the Lasso, the Bayesian Lasso and the Elastic Net. Chapter five looks at compression and dimensionality reduction procedures, outlined via SVD (Singular Value Decomposition) and eigenvector decomposition. Hard thresholding is subsequently introduced via SPCA (Sparse Principal Component Analysis) and a novel approach using RPCA (Robust Principal Component Analysis); we also show how RPCA can aid with data and image compression. The study concludes with an empirical exploration of all the methods outlined above, using several performance indicators on simulated and real data sets. The data sets are assessed via cross-validation: we determine the optimal values of the tuning settings and then evaluate the predictive and explanatory performance.
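    Since the thesis frames these estimators spectrally, one concrete illustration is ridge regression written through the SVD of X: each coefficient direction is shrunk by a factor s/(s² + λ), so the tiny singular values produced by collinearity no longer destabilize the solution. A minimal sketch, assuming a dense numpy design matrix (the function name is mine, not the thesis's):

```python
import numpy as np

def ridge_via_svd(X, y, lam):
    """Ridge estimate computed through the SVD of X (a hypothetical helper, for illustration).

    With X = U diag(s) V^T, the ridge solution is
        beta = V diag(s / (s^2 + lam)) U^T y,
    which stays well defined even when X^T X is singular (collinear columns, or n << p).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    shrink = s / (s ** 2 + lam)       # directions with small singular values are damped, not inverted
    return Vt.T @ (shrink * (U.T @ y))
```

    At lam = 0 this reduces to the pseudoinverse (minimum-norm least squares) solution; increasing lam trades variance for bias along the ill-conditioned directions.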

    Structured sparsity via optimal interpolation norms

    We study norms that can be used as penalties in machine learning problems. In particular, we consider norms that are defined by an optimal interpolation problem and whose additional structure can be used to encourage specific characteristics, such as sparsity, in the solution to a learning problem. We first study a norm that is defined as an infimum of quadratics parameterized over a convex set. We show that this formulation includes the k-support norm for sparse vector learning and its Moreau envelope, the box-norm. These extend naturally to spectral regularizers for matrices, and we introduce the spectral k-support norm and spectral box-norm. We study their properties and apply the penalties to low rank matrix and multitask learning problems. We next introduce two generalizations of the k-support norm. The first is the (k, p)-support norm; in the matrix setting, the additional parameter p allows us to better learn the curvature of the spectrum of the underlying solution. The second applies to multilinear algebra: by considering the ranks of a tensor's matricizations, we obtain a k-support norm that can be used to learn a low rank tensor. For each of these norms we provide an optimization method to solve the underlying learning problem, and we present numerical experiments. Finally, we present a general framework for optimal interpolation norms, focusing on a formulation that involves an infimal convolution coupled with a linear operator and that captures several of the penalties discussed in this thesis. We conclude with an algorithm to solve regularization problems with norms of this type, together with numerical experiments illustrating the method.
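    The k-support norm discussed above interpolates between the ℓ1 norm (k = 1) and the ℓ2 norm (k = d). A known closed form (due to Argyriou, Foygel and Srebro, 2012) sorts the absolute entries and splits them into a quadratically penalized head and an ℓ1-penalized tail; the sketch below reflects my reading of that closed form rather than code from the thesis, and the function name is mine:

```python
import numpy as np

def k_support_norm(w, k):
    """k-support norm of a vector via the closed form of Argyriou et al. (2012).

    Sorting |w| in decreasing order as z_1 >= ... >= z_d, the norm is
        ( sum_{i <= k-r-1} z_i^2 + (1/(r+1)) * (sum_{i >= k-r} z_i)^2 )^(1/2)
    for the unique r in {0, ..., k-1} with
        z_{k-r-1} > (1/(r+1)) * sum_{i >= k-r} z_i >= z_{k-r}   (convention: z_0 = +inf).
    Recovers the l1 norm at k = 1 and the l2 norm at k = len(w).
    """
    z = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]
    tail = np.cumsum(z[::-1])[::-1]              # tail[i] = z[i] + ... + z[d-1]
    for r in range(k):
        head = z[k - r - 2] if k - r - 2 >= 0 else np.inf   # z_{k-r-1}, with z_0 = +inf
        t = tail[k - r - 1] / (r + 1)
        if head > t >= z[k - r - 1]:
            return float(np.sqrt(np.sum(z[:k - r - 1] ** 2) + (r + 1) * t ** 2))
    return float(np.linalg.norm(z))              # fallback for degenerate ties

# Sanity checks: k = 1 gives the l1 norm, k = d gives the l2 norm.
w = np.array([3.0, -1.0, 2.0])
assert np.isclose(k_support_norm(w, 1), np.abs(w).sum())
assert np.isclose(k_support_norm(w, 3), np.linalg.norm(w))
```

    The spectral k-support norm mentioned in the abstract is then obtained, in the usual way for spectral regularizers, by applying this vector norm to the singular values of a matrix.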