A Bayesian Approach for Noisy Matrix Completion: Optimal Rate under General Sampling Distribution
Bayesian methods for low-rank matrix completion with noise have been shown to
be very efficient computationally. While the behaviour of penalized
minimization methods in this problem is well understood from both the
theoretical and computational points of view, the theoretical optimality of
Bayesian estimators has not been explored yet. In this paper, we propose a
Bayesian estimator for matrix completion under a general sampling
distribution. We also provide an oracle inequality for this estimator. This
inequality proves that, whatever the rank of the matrix to be estimated, our
estimator reaches the minimax-optimal rate of convergence (up to a
logarithmic factor). We end the paper with a short simulation study.
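As a point of reference for the rate in question, oracle inequalities of this
type are usually stated in the normalized Frobenius norm; a hedged sketch of
the usual form for an $m_1 \times m_2$ matrix $M^*$ of rank $r$ observed at
$n$ noisy entries (the notation here is assumed, not quoted from the paper):

\[
\frac{1}{m_1 m_2}\,\mathbb{E}\big\|\widehat{M} - M^*\big\|_F^2 \;\lesssim\; \frac{r\,\max(m_1, m_2)\,\log(m_1 + m_2)}{n},
\]

which matches the known minimax lower bound for rank-$r$ matrices up to the
logarithmic factor.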
Robust linear least squares regression
We consider the problem of robustly predicting as well as the best linear
combination of given functions in least squares regression, and variants of
this problem including constraints on the parameters of the linear combination.
For the ridge estimator and the ordinary least squares estimator, and their
variants, we provide new risk bounds of order $d/n$ without a logarithmic
factor, unlike some standard results, where $d$ is the dimension of the linear
span of the given functions and $n$ is the size of the training data. We
also provide a new estimator with better deviations in the presence of
heavy-tailed noise. It is based on truncating differences of losses in a
min-max framework and satisfies a risk bound both in expectation and in
deviations. The key common surprising factor of these results is the absence of
exponential moment condition on the output distribution while achieving
exponential deviations. All risk bounds are obtained through a PAC-Bayesian
analysis on truncated differences of losses. Experimental results strongly back
up our truncated min-max estimator.

Comment: Published at http://dx.doi.org/10.1214/11-AOS918 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org). arXiv admin note: significant text
overlap with arXiv:0902.173
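To make the min-max construction concrete, here is a minimal sketch of
estimation by truncated loss differences over a finite candidate set, assuming
a simple clipping function psi, a scale lam, and the candidate grid itself;
none of these choices are taken from the paper, whose estimator is defined
more carefully.

    import numpy as np

    def psi(x):
        # Bounded, odd influence function used to truncate loss differences;
        # a simple clipping choice, assumed here for illustration.
        return np.clip(x, -1.0, 1.0)

    def robust_comparison(y, X, theta_a, theta_b, lam=0.1):
        # Truncated empirical mean of the difference of squared losses
        # between candidates theta_a and theta_b; truncation limits the
        # influence of heavy-tailed outputs.
        diff = (y - X @ theta_a) ** 2 - (y - X @ theta_b) ** 2
        return np.mean(psi(lam * diff)) / lam

    def truncated_minmax(y, X, candidates, lam=0.1):
        # Pick the candidate whose worst truncated comparison against all
        # other candidates is smallest (min over a of max over b).
        scores = [max(robust_comparison(y, X, a, b, lam) for b in candidates)
                  for a in candidates]
        return candidates[int(np.argmin(scores))]

The point of the truncation is visible here: each comparison is bounded by
1/lam, so a single extreme output cannot dominate the empirical comparison,
which is what allows exponential deviations without exponential moment
assumptions on the output distribution.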
A reduced-rank approach to predicting multiple binary responses through machine learning
This paper investigates the problem of simultaneously predicting multiple
binary responses by utilizing a shared set of covariates. Our approach
incorporates machine learning techniques for binary classification, without
making assumptions about the underlying observations. Instead, our focus lies
on a group of predictors, aiming to identify the one that minimizes prediction
error. Unlike previous studies that primarily address estimation error, we
directly analyze the prediction error of our method using PAC-Bayesian bounds
techniques. In this paper, we introduce a pseudo-Bayesian approach capable of
handling incomplete response data. Our strategy is efficiently implemented
using the Langevin Monte Carlo method. Through simulation studies and a
practical application using real data, we demonstrate the effectiveness of
our proposed method, producing results comparable to, and sometimes better
than, those of the current state-of-the-art method.
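As a rough illustration of the computational strategy, the unadjusted
Langevin algorithm draws approximate samples from a (pseudo-)posterior using
only the gradient of its log-density; the step size, iteration count, and the
generic grad_log_post interface below are assumptions for this sketch, not
the authors' implementation.

    import numpy as np

    def langevin_monte_carlo(grad_log_post, theta0, step=1e-4, n_iter=5000, rng=None):
        # Unadjusted Langevin algorithm: a gradient step on the log
        # pseudo-posterior plus Gaussian noise at the diffusion scale
        # sqrt(2 * step), yielding approximate posterior samples.
        rng = rng or np.random.default_rng(0)
        theta = theta0.copy()
        samples = []
        for _ in range(n_iter):
            noise = rng.standard_normal(theta.shape)
            theta = theta + step * grad_log_post(theta) + np.sqrt(2 * step) * noise
            samples.append(theta.copy())
        return samples

For a reduced-rank model, theta would collect the low-rank factors, and
predictions would be averaged over the retained samples.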
Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
We present a comprehensive study of multilayer neural networks with binary
activation, relying on the PAC-Bayesian theory. Our contributions are twofold:
(i) we develop an end-to-end framework to train a binary activated deep neural
network, overcoming the fact that the binary activation function is
non-differentiable; (ii) we provide nonvacuous PAC-Bayesian generalization
bounds for binary activated deep neural networks. Notably, our results are
obtained by minimizing the expected loss of an architecture-dependent
aggregation of binary activated deep neural networks. The performance of our
approach is assessed through a thorough protocol of numerical experiments on
real-life datasets.
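For context on what computing such a bound involves, the following sketch
evaluates the standard Seeger-style PAC-Bayes bound by inverting the binary
KL divergence; this is a generic computation, assumed here for illustration
rather than the specific bound proved in the paper.

    import numpy as np
    from scipy.optimize import brentq

    def kl_bernoulli(q, p):
        # KL divergence between Bernoulli(q) and Bernoulli(p).
        eps = 1e-12
        q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
        return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

    def pac_bayes_kl_bound(emp_risk, kl_div, n, delta=0.05):
        # With probability 1 - delta, the true risk is at most the largest p
        # with kl(emp_risk || p) <= (KL(Q||P) + log(2*sqrt(n)/delta)) / n.
        rhs = (kl_div + np.log(2 * np.sqrt(n) / delta)) / n
        upper = 1 - 1e-9
        if kl_bernoulli(emp_risk, upper) <= rhs:
            return 1.0  # bound is vacuous
        return brentq(lambda p: kl_bernoulli(emp_risk, p) - rhs, emp_risk, upper)

A bound is called nonvacuous when the returned value is meaningfully below 1,
i.e., when the KL term between posterior and prior is small enough relative
to the sample size.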
Sparse Estimation by Exponential Weighting
Consider a regression model with fixed design and Gaussian noise where the
regression function can potentially be well approximated by a function that
admits a sparse representation in a given dictionary. This paper resorts to
exponential weights to exploit this underlying sparsity by implementing the
principle of sparsity pattern aggregation. This model-selection take on sparse
estimation allows us to derive sparsity oracle inequalities in several popular
frameworks, including ordinary sparsity, fused sparsity and group sparsity. One
striking aspect of these theoretical results is that they hold under no
condition on the dictionary. Moreover, we describe an efficient implementation
of the sparsity pattern aggregation principle that compares favorably to
state-of-the-art procedures on some basic numerical examples.

Comment: Published at http://dx.doi.org/10.1214/12-STS393 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
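To illustrate the sparsity pattern aggregation principle, here is a
brute-force sketch for a small dictionary: each support gets a least-squares
fit, weighted exponentially by its residual sum of squares and a prior
favoring small supports. The temperature beta and the prior are assumptions
of this sketch; the efficient implementation described in the paper avoids
the exhaustive enumeration used here.

    import numpy as np
    from itertools import combinations

    def sparsity_pattern_aggregation(X, y, sigma2, beta=4.0):
        # Enumerate all supports S, fit least squares restricted to S, and
        # average the fits with exponential weights exp(-RSS/(beta*sigma2))
        # times a prior that penalizes large supports. Only feasible for
        # small p, since there are 2**p patterns.
        n, p = X.shape
        thetas, logws = [], []
        for k in range(p + 1):
            for S in combinations(range(p), k):
                theta = np.zeros(p)
                if S:
                    theta[list(S)] = np.linalg.lstsq(X[:, list(S)], y, rcond=None)[0]
                rss = np.sum((y - X @ theta) ** 2)
                logws.append(-rss / (beta * sigma2) - k * np.log(2 * p))
                thetas.append(theta)
        logws = np.array(logws)
        w = np.exp(logws - logws.max())  # stabilized exponential weights
        return np.average(np.stack(thetas), axis=0, weights=w)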
Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary
The recently launched LinkedIn Salary product has been designed with the goal
of providing compensation insights to the world's professionals and thereby
helping them optimize their earning potential. We describe the overall design
and architecture of the statistical modeling system underlying this product. We
focus on the unique data mining challenges while designing and implementing the
system, and describe the modeling components such as Bayesian hierarchical
smoothing that help to compute and present robust compensation insights to
users. We report on extensive evaluation with nearly one year of de-identified
compensation data collected from over one million LinkedIn users, thereby
demonstrating the efficacy of the statistical models. We also highlight the
lessons learned through the deployment of our system at LinkedIn.

Comment: Conference information: ACM International Conference on Information
and Knowledge Management (CIKM 2017).
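As a minimal sketch of the Bayesian hierarchical smoothing idea (not
LinkedIn's production model), a sparse cohort's estimate can be shrunk toward
its parent cohort, with the parent acting as a conjugate prior; the cohort
hierarchy and the parent_strength pseudo-count below are assumptions.

    import numpy as np

    def smoothed_cohort_mean(cohort_values, parent_mean, parent_strength=20.0):
        # Shrink a cohort's empirical mean toward its parent cohort's mean.
        # Small cohorts are pulled strongly toward the parent; large cohorts
        # stay close to their own data.
        n = len(cohort_values)
        if n == 0:
            return parent_mean
        w = n / (n + parent_strength)
        return w * float(np.mean(cohort_values)) + (1 - w) * parent_mean

For instance, a title-region-company cohort with only a handful of reported
salaries would be smoothed toward the broader title-region cohort, trading
variance for a small, controlled bias.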
Exponential Screening and optimal rates of sparse estimation
In high-dimensional linear regression, the goal pursued here is to estimate
an unknown regression function using linear combinations of a suitable set of
covariates. One of the key assumptions for the success of any statistical
procedure in this setup is to assume that the linear combination is sparse in
some sense, for example, that it involves only few covariates. We consider a
general, not necessarily linear, regression with Gaussian noise and study the
related question of finding a linear combination of approximating functions
which is at the same time sparse and has small mean squared error
(MSE). We introduce a new estimation procedure, called Exponential Screening
that shows remarkable adaptation properties. It adapts to the linear
combination that optimally balances MSE and sparsity, whether the latter is
measured in terms of the number of non-zero entries in the combination
($\ell_0$ norm) or in terms of the global weight of the combination ($\ell_1$
norm). The power of this adaptation result is illustrated by showing that
Exponential Screening solves optimally and simultaneously all the problems of
aggregation in Gaussian regression that have been discussed in the literature.
Moreover, we show that the performance of the Exponential Screening estimator
cannot be improved in a minimax sense, even if the optimal sparsity is known in
advance. The theoretical and numerical superiority of Exponential Screening
over state-of-the-art sparse estimation procedures is also discussed.
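For orientation, adaptation results of this kind are typically stated as a
sparsity oracle inequality; a hedged sketch of the shape such a bound takes
(constants and normalization assumed here, not quoted from the paper), for a
dictionary of $M$ functions and sample size $n$:

\[
\mathbb{E}\,\big\|\widehat{f} - f\big\|_n^2 \;\le\; \min_{\theta \in \mathbb{R}^M} \left\{ \big\|f_\theta - f\big\|_n^2 + C\,\frac{\sigma^2 \|\theta\|_0}{n}\,\log\!\Big(1 + \frac{eM}{\|\theta\|_0 \vee 1}\Big) \right\} + \frac{C'\sigma^2}{n},
\]

so the estimator automatically balances approximation error against a
sparsity penalty of minimax-optimal order.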