New perspectives in cross-validation
Appealing due to its universality, cross-validation is a ubiquitous tool for model tuning and selection. At its core, cross-validation splits the data (potentially several times), alternately using some of the data to fit a model and the rest to test it. This produces a reliable estimate of the risk, although many questions remain concerning how best to compare such estimates across different models. Despite its widespread use, many theoretical problems remain unanswered for cross-validation, particularly in high-dimensional regimes where bias issues are non-negligible. We first provide an asymptotic analysis of the cross-validated risk in relation to the train-test split risk for a large class of estimators under stability conditions. This analysis takes the form of a central limit theorem and allows us to characterize the speed-up of the cross-validation procedure for general parametric M-estimators. In particular, we show that when the loss used for fitting differs from that used for evaluation, k-fold cross-validation may offer a variance reduction that is less (or greater) than k. We then turn our attention to the high-dimensional regime (where the number of parameters is comparable to the number of observations). In this regime, k-fold cross-validation exhibits asymptotic bias, and hence increasing the number of folds is of interest. We study the extreme case of leave-one-out cross-validation and show that, for generalized linear models under smoothness conditions, it is a consistent estimate of the risk at the optimal rate. Given the large computational requirements of leave-one-out cross-validation, we finally consider the problem of obtaining a fast approximate leave-one-out (ALO) estimator. We propose a general strategy for deriving formulas for such ALO estimators for penalized generalized linear models, and apply it to many common estimators such as the LASSO, SVM, and nuclear norm minimization. The performance of these approximations is evaluated on simulated and real datasets.
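As a minimal illustration of the k-fold procedure discussed in the abstract, the sketch below estimates out-of-sample risk with a fitting loss (squared error) that differs from the evaluation loss (absolute error), the setting in which the variance reduction may deviate from k. The helper names (kfold_cv_risk, fit, loss) are ours for illustration and do not come from the paper.

    import numpy as np

    def kfold_cv_risk(X, y, fit, loss, k=5, seed=0):
        """Estimate out-of-sample risk by k-fold cross-validation.

        fit(X_train, y_train) -> fitted parameters
        loss(params, X_test, y_test) -> average test loss on the held-out fold
        """
        n = X.shape[0]
        idx = np.random.default_rng(seed).permutation(n)
        folds = np.array_split(idx, k)
        risks = []
        for f in folds:
            train = np.setdiff1d(idx, f)        # all indices outside the fold
            params = fit(X[train], y[train])
            risks.append(loss(params, X[f], y[f]))
        return float(np.mean(risks))            # average risk across the k folds

    # Example: least-squares fit, evaluated with absolute (not squared) loss,
    # i.e. a fitting loss that differs from the evaluation loss.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))
    y = X @ rng.standard_normal(5) + rng.standard_normal(200)
    fit = lambda A, b: np.linalg.lstsq(A, b, rcond=None)[0]
    loss = lambda w, A, b: float(np.mean(np.abs(A @ w - b)))
    print(kfold_cv_risk(X, y, fit, loss, k=5))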
Frank-Wolfe-type methods for a class of nonconvex inequality-constrained problems
The Frank-Wolfe (FW) method, which implements efficient linear oracles that minimize linear approximations of the objective function over a fixed compact convex set, has recently received much attention in the optimization and machine learning literature. In this paper, we propose a new FW-type method for minimizing a smooth function over a compact set defined as the level set of a single difference-of-convex function, based on new generalized linear-optimization oracles (LO). We show that these LOs can be computed efficiently with closed-form solutions in some important optimization models that arise in compressed sensing and machine learning. In addition, under a mild strict feasibility condition, we establish the subsequential convergence of our nonconvex FW-type method. Since the feasible region of our generalized LO typically changes from iteration to iteration, our convergence analysis differs completely from those in existing works on FW-type methods that deal with a fixed feasible region across subproblems. Finally, motivated by the away steps used to accelerate FW-type methods for convex problems, we further design an away-step oracle to supplement our nonconvex FW-type method, and establish subsequential convergence of this variant. Numerical results on the matrix completion problem with standard datasets are presented to demonstrate the efficiency of the proposed FW-type method and its away-step variant.