A Bayesian Approach for Noisy Matrix Completion: Optimal Rate under General Sampling Distribution
Bayesian methods for low-rank matrix completion with noise have been shown to
be very efficient computationally. While the behaviour of penalized
minimization methods in this problem is well understood from both the
theoretical and computational points of view, the theoretical optimality of
Bayesian estimators has not been explored yet. In this paper, we propose a
Bayesian estimator for matrix completion under a general sampling
distribution. We also provide an oracle inequality for this estimator. This
inequality proves that, whatever the rank of the matrix to be estimated, our
estimator reaches the minimax-optimal rate of convergence (up to a
logarithmic factor). We end the paper with a short simulation study.
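As a point of reference for the rate in question, oracle inequalities of this
type are usually stated in the normalized Frobenius norm; a hedged sketch of
the usual form for an $m_1 \times m_2$ matrix $M^*$ of rank $r$ observed at
$n$ noisy entries (the notation here is assumed, not quoted from the paper):

\[
\frac{1}{m_1 m_2}\,\mathbb{E}\big\|\widehat{M} - M^*\big\|_F^2 \;\lesssim\; \frac{r\,\max(m_1, m_2)\,\log(m_1 + m_2)}{n},
\]

which matches the known minimax lower bound for rank-$r$ matrices up to the
logarithmic factor.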
Robust linear least squares regression
We consider the problem of robustly predicting as well as the best linear
combination of given functions in least squares regression, and variants of
this problem including constraints on the parameters of the linear combination.
For the ridge estimator and the ordinary least squares estimator, and their
variants, we provide new risk bounds of order $d/n$ without a logarithmic
factor, unlike some standard results, where $d$ is the dimension of the linear
span of the given functions and $n$ is the size of the training data. We
also provide a new estimator with better deviations in the presence of
heavy-tailed noise. It is based on truncating differences of losses in a
min-max framework and satisfies a risk bound both in expectation and in
deviations. The key common surprising factor of these results is the absence of
exponential moment condition on the output distribution while achieving
exponential deviations. All risk bounds are obtained through a PAC-Bayesian
analysis on truncated differences of losses. Experimental results strongly back
up our truncated min-max estimator.

Comment: Published at http://dx.doi.org/10.1214/11-AOS918 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org). arXiv admin note: significant text
overlap with arXiv:0902.173
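To make the min-max construction concrete, here is a minimal sketch of
estimation by truncated loss differences over a finite candidate set, assuming
a simple clipping function psi, a scale lam, and the candidate grid itself;
none of these choices are taken from the paper, whose estimator is defined
more carefully.

    import numpy as np

    def psi(x):
        # Bounded, odd influence function used to truncate loss differences;
        # a simple clipping choice, assumed here for illustration.
        return np.clip(x, -1.0, 1.0)

    def robust_comparison(y, X, theta_a, theta_b, lam=0.1):
        # Truncated empirical mean of the difference of squared losses
        # between candidates theta_a and theta_b; truncation limits the
        # influence of heavy-tailed outputs.
        diff = (y - X @ theta_a) ** 2 - (y - X @ theta_b) ** 2
        return np.mean(psi(lam * diff)) / lam

    def truncated_minmax(y, X, candidates, lam=0.1):
        # Pick the candidate whose worst truncated comparison against all
        # other candidates is smallest (min over a of max over b).
        scores = [max(robust_comparison(y, X, a, b, lam) for b in candidates)
                  for a in candidates]
        return candidates[int(np.argmin(scores))]

The point of the truncation is visible here: each comparison is bounded by
1/lam, so a single extreme output cannot dominate the empirical comparison,
which is what allows exponential deviations without exponential moment
assumptions on the output distribution.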
A reduced-rank approach to predicting multiple binary responses through machine learning
This paper investigates the problem of simultaneously predicting multiple
binary responses by utilizing a shared set of covariates. Our approach
incorporates machine learning techniques for binary classification, without
making assumptions about the underlying observations. Instead, our focus lies
on a group of predictors, aiming to identify the one that minimizes prediction
error. Unlike previous studies that primarily address estimation error, we
directly analyze the prediction error of our method using PAC-Bayesian bounds
techniques. In this paper, we introduce a pseudo-Bayesian approach capable of
handling incomplete response data. Our strategy is efficiently implemented
using the Langevin Monte Carlo method. Through simulation studies and a
practical application using real data, we demonstrate the effectiveness of
our proposed method, producing results comparable to, and sometimes better
than, those of the current state-of-the-art method.
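As a rough illustration of the computational strategy, the unadjusted
Langevin algorithm draws approximate samples from a (pseudo-)posterior using
only the gradient of its log-density; the step size, iteration count, and the
generic grad_log_post interface below are assumptions for this sketch, not
the authors' implementation.

    import numpy as np

    def langevin_monte_carlo(grad_log_post, theta0, step=1e-4, n_iter=5000, rng=None):
        # Unadjusted Langevin algorithm: a gradient step on the log
        # pseudo-posterior plus Gaussian noise at the diffusion scale
        # sqrt(2 * step), yielding approximate posterior samples.
        rng = rng or np.random.default_rng(0)
        theta = theta0.copy()
        samples = []
        for _ in range(n_iter):
            noise = rng.standard_normal(theta.shape)
            theta = theta + step * grad_log_post(theta) + np.sqrt(2 * step) * noise
            samples.append(theta.copy())
        return samples

For a reduced-rank model, theta would collect the low-rank factors, and
predictions would be averaged over the retained samples.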
Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
We present a comprehensive study of multilayer neural networks with binary
activation, relying on the PAC-Bayesian theory. Our contributions are twofold:
(i) we develop an end-to-end framework to train a binary activated deep neural
network, overcoming the fact that the binary activation function is
non-differentiable; (ii) we provide nonvacuous PAC-Bayesian generalization
bounds for binary activated deep neural networks. Notably, our results are
obtained by minimizing the expected loss of an architecture-dependent
aggregation of binary activated deep neural networks. The performance of our
approach is assessed through a thorough protocol of numerical experiments on
real-life datasets.
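For context on what computing such a bound involves, the following sketch
evaluates the standard Seeger-style PAC-Bayes bound by inverting the binary
KL divergence; this is a generic computation, assumed here for illustration
rather than the specific bound proved in the paper.

    import numpy as np
    from scipy.optimize import brentq

    def kl_bernoulli(q, p):
        # KL divergence between Bernoulli(q) and Bernoulli(p).
        eps = 1e-12
        q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
        return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

    def pac_bayes_kl_bound(emp_risk, kl_div, n, delta=0.05):
        # With probability 1 - delta, the true risk is at most the largest p
        # with kl(emp_risk || p) <= (KL(Q||P) + log(2*sqrt(n)/delta)) / n.
        rhs = (kl_div + np.log(2 * np.sqrt(n) / delta)) / n
        upper = 1 - 1e-9
        if kl_bernoulli(emp_risk, upper) <= rhs:
            return 1.0  # bound is vacuous
        return brentq(lambda p: kl_bernoulli(emp_risk, p) - rhs, emp_risk, upper)

A bound is called nonvacuous when the returned value is meaningfully below 1,
i.e., when the KL term between posterior and prior is small enough relative
to the sample size.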
Sparse Estimation by Exponential Weighting
Consider a regression model with fixed design and Gaussian noise where the
regression function can potentially be well approximated by a function that
admits a sparse representation in a given dictionary. This paper resorts to
exponential weights to exploit this underlying sparsity by implementing the
principle of sparsity pattern aggregation. This model-selection take on sparse
estimation allows us to derive sparsity oracle inequalities in several popular
frameworks, including ordinary sparsity, fused sparsity and group sparsity. One
striking aspect of these theoretical results is that they hold under no
condition on the dictionary. Moreover, we describe an efficient implementation
of the sparsity pattern aggregation principle that compares favorably to
state-of-the-art procedures on some basic numerical examples.

Comment: Published at http://dx.doi.org/10.1214/12-STS393 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
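To illustrate the sparsity pattern aggregation principle, here is a
brute-force sketch for a small dictionary: each support gets a least-squares
fit, weighted exponentially by its residual sum of squares and a prior
favoring small supports. The temperature beta and the prior are assumptions
of this sketch; the efficient implementation described in the paper avoids
the exhaustive enumeration used here.

    import numpy as np
    from itertools import combinations

    def sparsity_pattern_aggregation(X, y, sigma2, beta=4.0):
        # Enumerate all supports S, fit least squares restricted to S, and
        # average the fits with exponential weights exp(-RSS/(beta*sigma2))
        # times a prior that penalizes large supports. Only feasible for
        # small p, since there are 2**p patterns.
        n, p = X.shape
        thetas, logws = [], []
        for k in range(p + 1):
            for S in combinations(range(p), k):
                theta = np.zeros(p)
                if S:
                    theta[list(S)] = np.linalg.lstsq(X[:, list(S)], y, rcond=None)[0]
                rss = np.sum((y - X @ theta) ** 2)
                logws.append(-rss / (beta * sigma2) - k * np.log(2 * p))
                thetas.append(theta)
        logws = np.array(logws)
        w = np.exp(logws - logws.max())  # stabilized exponential weights
        return np.average(np.stack(thetas), axis=0, weights=w)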
Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary
The recently launched LinkedIn Salary product has been designed with the goal
of providing compensation insights to the world's professionals and thereby
helping them optimize their earning potential. We describe the overall design
and architecture of the statistical modeling system underlying this product. We
focus on the unique data mining challenges while designing and implementing the
system, and describe the modeling components such as Bayesian hierarchical
smoothing that help to compute and present robust compensation insights to
users. We report on extensive evaluation with nearly one year of de-identified
compensation data collected from over one million LinkedIn users, thereby
demonstrating the efficacy of the statistical models. We also highlight the
lessons learned through the deployment of our system at LinkedIn.

Comment: Conference information: ACM International Conference on Information
and Knowledge Management (CIKM 2017).
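As a minimal sketch of the Bayesian hierarchical smoothing idea (not
LinkedIn's production model), a sparse cohort's estimate can be shrunk toward
its parent cohort, with the parent acting as a conjugate prior; the cohort
hierarchy and the parent_strength pseudo-count below are assumptions.

    import numpy as np

    def smoothed_cohort_mean(cohort_values, parent_mean, parent_strength=20.0):
        # Shrink a cohort's empirical mean toward its parent cohort's mean.
        # Small cohorts are pulled strongly toward the parent; large cohorts
        # stay close to their own data.
        n = len(cohort_values)
        if n == 0:
            return parent_mean
        w = n / (n + parent_strength)
        return w * float(np.mean(cohort_values)) + (1 - w) * parent_mean

For instance, a title-region-company cohort with only a handful of reported
salaries would be smoothed toward the broader title-region cohort, trading
variance for a small, controlled bias.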
Exponential Screening and optimal rates of sparse estimation
In high-dimensional linear regression, the goal pursued here is to estimate
an unknown regression function using linear combinations of a suitable set of
covariates. One of the key assumptions for the success of any statistical
procedure in this setup is to assume that the linear combination is sparse in
some sense, for example, that it involves only few covariates. We consider a
general, not necessarily linear, regression with Gaussian noise and study the
related question of finding a linear combination of approximating functions
which is at the same time sparse and has small mean squared error
(MSE). We introduce a new estimation procedure, called Exponential Screening
that shows remarkable adaptation properties. It adapts to the linear
combination that optimally balances MSE and sparsity, whether the latter is
measured in terms of the number of non-zero entries in the combination
($\ell_0$ norm) or in terms of the global weight of the combination ($\ell_1$
norm). The power of this adaptation result is illustrated by showing that
Exponential Screening solves optimally and simultaneously all the problems of
aggregation in Gaussian regression that have been discussed in the literature.
Moreover, we show that the performance of the Exponential Screening estimator
cannot be improved in a minimax sense, even if the optimal sparsity is known in
advance. The theoretical and numerical superiority of Exponential Screening
over state-of-the-art sparse estimation procedures is also discussed.
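For orientation, adaptation results of this kind are typically stated as a
sparsity oracle inequality; a hedged sketch of the shape such a bound takes
(constants and normalization assumed here, not quoted from the paper), for a
dictionary of $M$ functions and sample size $n$:

\[
\mathbb{E}\,\big\|\widehat{f} - f\big\|_n^2 \;\le\; \min_{\theta \in \mathbb{R}^M} \left\{ \big\|f_\theta - f\big\|_n^2 + C\,\frac{\sigma^2 \|\theta\|_0}{n}\,\log\!\Big(1 + \frac{eM}{\|\theta\|_0 \vee 1}\Big) \right\} + \frac{C'\sigma^2}{n},
\]

so the estimator automatically balances approximation error against a
sparsity penalty of minimax-optimal order.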