Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons
Consider the standard Gaussian linear regression model $Y = X\beta + \varepsilon$,
where $Y \in \mathbb{R}^n$ is a response vector and $X \in \mathbb{R}^{n \times p}$ is a design matrix.
Numerous works have been devoted to building efficient estimators of $\beta$
when $p$ is much larger than $n$. In such a situation, a classical approach
amounts to assuming that $\beta$ is approximately sparse. This paper studies
the minimax risks of estimation and testing over classes of $k$-sparse vectors
$\beta$. These bounds shed light on the limitations due to
high-dimensionality. The results encompass the problem of prediction
(estimation of $X\beta$), the inverse problem (estimation of $\beta$) and
linear testing (testing $X\beta = 0$). Interestingly, an elbow effect occurs
when the number of relevant variables $k$ becomes large compared to $n/\log(p/k)$.
Indeed, the minimax risks and hypothesis separation distances blow up in this
ultra-high dimensional setting. We also prove that even dimension reduction
techniques cannot provide satisfying results in an ultra-high dimensional
setting. Moreover, we compute the minimax risks when the variance of the noise
is unknown. The knowledge of this variance is shown to play a significant role
in the optimal rates of estimation and testing. All these minimax bounds
provide a characterization of statistical problems that are so difficult that
no procedure can provide satisfying results.
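As a point of reference for the model above, here is a minimal simulation sketch (not taken from the paper) of the $k$-sparse Gaussian linear model; the dimensions and noise level are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n observations, p variables, k nonzero coefficients.
n, p, k, sigma = 100, 1000, 5, 1.0

X = rng.standard_normal((n, p))                 # design matrix
beta = np.zeros(p)
support = rng.choice(p, size=k, replace=False)  # location of the k nonzero entries
beta[support] = rng.standard_normal(k)          # a k-sparse coefficient vector
y = X @ beta + sigma * rng.standard_normal(n)   # Y = X beta + noise

# The ultra-high-dimensional regime discussed above corresponds to
# k * log(p / k) being large relative to n.
print(k * np.log(p / k), "versus n =", n)
```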
Adaptive robust variable selection
Heavy-tailed high-dimensional data are commonly encountered in various
scientific fields and pose great challenges to modern statistical analysis. A
natural procedure to address this problem is to use penalized quantile
regression with a weighted $L_1$-penalty, called the weighted robust Lasso
(WR-Lasso), in which weights are introduced to ameliorate the bias problem
induced by the $L_1$-penalty. In the ultra-high dimensional setting, where the
dimensionality can grow exponentially with the sample size, we investigate the
model selection oracle property and establish the asymptotic normality of the
WR-Lasso. We show that only mild conditions on the model error distribution are
needed. Our theoretical results also reveal that adaptive choice of the weight
vector is essential for the WR-Lasso to enjoy these nice asymptotic properties.
To make the WR-Lasso practically feasible, we propose a two-step procedure,
called adaptive robust Lasso (AR-Lasso), in which the weight vector in the
second step is constructed from the $L_1$-penalized quantile regression
estimate of the first step. This two-step procedure is justified
theoretically to possess the oracle property and asymptotic normality.
Numerical studies demonstrate the favorable finite-sample performance of the
AR-Lasso.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1191 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
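For intuition, the two-step idea can be sketched as a toy illustration (not the authors' implementation): the weight rule $1/(|\hat\beta_j|+\varepsilon)$ and the tuning constants below are placeholder assumptions, and the weighted $L_1$-penalty is emulated by rescaling the columns of the design matrix.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

def ar_lasso_sketch(X, y, tau=0.5, alpha=0.1, eps=1e-4):
    """Two-step weighted L1-penalized quantile regression (illustrative only)."""
    # Step 1: plain L1-penalized quantile regression with uniform weights.
    step1 = QuantileRegressor(quantile=tau, alpha=alpha).fit(X, y)

    # Step 2: turn the first-step coefficients into weights (placeholder rule),
    # and emulate the weighted L1 penalty by rescaling the columns of X.
    w = 1.0 / (np.abs(step1.coef_) + eps)
    step2 = QuantileRegressor(quantile=tau, alpha=alpha).fit(X / w, y)

    return step2.coef_ / w, step2.intercept_    # undo the column rescaling
```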
Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
In many application areas, data are collected on a categorical response and
high-dimensional categorical predictors, with the goals being to build a
parsimonious model for classification while doing inferences on the important
predictors. In settings such as genomics, there can be complex interactions
among the predictors. By using a carefully structured Tucker factorization, we
define a model that can characterize any conditional probability, while
facilitating variable selection and modeling of higher-order interactions.
Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm
for posterior computation accommodating uncertainty in the predictors to be
included. Under near sparsity assumptions, the posterior distribution for the
conditional probability is shown to achieve close to the parametric rate of
contraction even in ultra high-dimensional settings. The methods are
illustrated using simulation examples and biomedical applications.
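To illustrate the kind of structure involved (this is not the paper's model or its MCMC algorithm), a Tucker-style factorization of a conditional probability table for a categorical response given two categorical predictors can be written in a few lines of numpy; all sizes and the random components below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up sizes: C response classes, two predictors with d1 and d2 levels,
# and k1, k2 latent classes in the Tucker-style factorization.
C, d1, d2, k1, k2 = 3, 4, 5, 2, 2

def random_simplex(shape, axis):
    """Random nonnegative array normalized to sum to one along `axis`."""
    a = rng.random(shape)
    return a / a.sum(axis=axis, keepdims=True)

core = random_simplex((C, k1, k2), axis=0)  # core[:, h1, h2] is a distribution over y
pi1 = random_simplex((d1, k1), axis=1)      # soft assignment of x1 levels to latent classes
pi2 = random_simplex((d2, k2), axis=1)      # soft assignment of x2 levels to latent classes

# P(y | x1, x2) = sum_{h1, h2} core[y, h1, h2] * pi1[x1, h1] * pi2[x2, h2]
P = np.einsum('yab,ia,jb->yij', core, pi1, pi2)

assert np.allclose(P.sum(axis=0), 1.0)      # every conditional distribution sums to one
```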
Private Incremental Regression
Data is continuously generated by modern data sources, and a recent challenge
in machine learning has been to develop techniques that perform well in an
incremental (streaming) setting. In this paper, we investigate the problem of
private machine learning, where, as is common in practice, the data is not given
all at once but rather arrives incrementally over time.
We introduce the problems of private incremental ERM and private incremental
regression where the general goal is to always maintain a good empirical risk
minimizer for the history observed under differential privacy. Our first
contribution is a generic transformation of private batch ERM mechanisms into
private incremental ERM mechanisms, based on a simple idea of invoking the
private batch ERM procedure at some regular time intervals. We take this
construction as a baseline for comparison. We then provide two mechanisms for
the private incremental regression problem. Our first mechanism is based on
privately constructing a noisy incremental gradient function, which is then
used in a modified projected gradient procedure at every timestep. This
mechanism has an excess empirical risk of roughly $\sqrt{d}$, where $d$ is the
dimensionality of the data. While from the results of [Bassily et al. 2014]
this bound is tight in the worst-case, we show that certain geometric
properties of the input and constraint set can be used to derive significantly
better results for certain interesting regression problems.
Comment: To appear in PODS 2017
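A stripped-down sketch of the noisy-gradient-plus-projection idea follows; it is not the paper's mechanism, and the noise scale, step size, and iteration count are placeholders that are not calibrated to any privacy budget.

```python
import numpy as np

def noisy_projected_gradient(X, y, radius=1.0, steps=200, lr=0.1,
                             noise_scale=0.1, seed=0):
    """Least-squares regression by projected gradient descent with Gaussian noise
    added to every gradient (illustrative; noise_scale is NOT privacy-calibrated)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / n              # gradient of the mean squared error
        grad += noise_scale * rng.standard_normal(d)  # perturb the gradient
        theta -= lr * grad
        norm = np.linalg.norm(theta)
        if norm > radius:                             # project back onto the L2 ball
            theta *= radius / norm
    return theta
```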
$L_1$-Penalization in Functional Linear Regression with Subgaussian Design
We study functional regression with random subgaussian design and real-valued
response. The focus is on the problems in which the regression function can be
well approximated by a functional linear model with the slope function being
"sparse" in the sense that it can be represented as a sum of a small number of
well separated "spikes". This can be viewed as an extension of now classical
sparse estimation problems to the case of infinite dictionaries. We study an
estimator of the regression function based on penalized empirical risk
minimization with quadratic loss and the complexity penalty defined in terms of
$L_1$-norm (a continuous version of LASSO). The main goal is to introduce
several important parameters characterizing sparsity in this class of problems
and to prove sharp oracle inequalities showing how the $L_2$-error of the
continuous LASSO estimator depends on the underlying sparsity of the problem.
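For intuition only, a crude discretized analogue of such an estimator (not the one analyzed in the paper) samples the functional covariates and the slope function on a grid and approximates both the integral in the linear model and the $L_1$-penalty by Riemann sums, so an ordinary Lasso on the grid values stands in for the continuous penalty; all sizes and tuning values below are made up.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Grid on [0, 1]; all sizes, spike locations, and tuning values are made up.
m = 200
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

# "Sparse" slope function: two narrow, well-separated bumps.
f_true = np.exp(-((t - 0.2) / 0.01) ** 2) - 0.5 * np.exp(-((t - 0.7) / 0.01) ** 2)

# Functional covariates sampled on the grid (rough random paths) and responses
# y_i = integral of X_i(t) f(t) dt + noise, with the integral as a Riemann sum.
n = 100
X = np.cumsum(rng.standard_normal((n, m)), axis=1) / np.sqrt(m)
y = X @ f_true * dt + 0.1 * rng.standard_normal(n)

# Lasso on the grid values: the fitted coefficients play the role of f(t_j) * dt,
# so their L1 norm approximates the integral of |f| (the "continuous" penalty).
fit = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(X, y)
f_hat = fit.coef_ / dt
```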
Entropy-based convergence rates of greedy algorithms
We present convergence estimates of two types of greedy algorithms in terms
of the metric entropy of underlying compact sets. In the first part, we measure
the error of a standard greedy reduced basis method for parametric PDEs by the
metric entropy of the solution manifold in Banach spaces. This contrasts with
the classical analysis based on the Kolmogorov n-widths and enables us to
obtain direct comparisons between the greedy algorithm error and the entropy
numbers, where the multiplicative constants are explicit and simple. The
entropy-based convergence estimate is sharp and improves upon the classical
width-based analysis of reduced basis methods for elliptic model problems. In
the second part, we derive a novel and simple convergence analysis of the
classical orthogonal greedy algorithm for nonlinear dictionary approximation
using the metric entropy of the symmetric convex hull of the dictionary. This
also improves upon existing results by giving a direct comparison between the
algorithm error and the metric entropy.
Comment: 22 pages, no figures
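In the discrete setting, the orthogonal greedy algorithm over a finite dictionary is essentially orthogonal matching pursuit; the following compact sketch (with an arbitrary random dictionary and a fixed iteration cap as the stopping rule) illustrates the greedy selection and re-projection steps.

```python
import numpy as np

def orthogonal_greedy(dictionary, target, n_steps):
    """Orthogonal greedy algorithm (orthogonal matching pursuit) over the columns
    of `dictionary`: pick the atom most correlated with the current residual,
    then re-fit the target on all selected atoms by least squares."""
    selected = []
    approx = np.zeros_like(target)
    for _ in range(n_steps):
        residual = target - approx
        scores = np.abs(dictionary.T @ residual)   # correlation with the residual
        scores[selected] = -np.inf                 # never pick the same atom twice
        selected.append(int(np.argmax(scores)))
        A = dictionary[:, selected]
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)  # orthogonal projection
        approx = A @ coef
    return selected, approx

# Tiny usage example with a random, normalized dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((50, 200))
D /= np.linalg.norm(D, axis=0)
f = D[:, [3, 17, 90]] @ np.array([1.0, -2.0, 0.5])
atoms, f_hat = orthogonal_greedy(D, f, n_steps=5)
```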