Finite sample performance of linear least squares estimators under sub-Gaussian martingale difference noise
Linear least squares is a well-known technique for parameter estimation,
used even when it is sub-optimal because of its very low computational
requirements and because it does not require exact knowledge of the noise
statistics. Surprisingly, bounding the probability of large errors with
finitely many samples has remained open, especially when the noise is
correlated with unknown covariance. In this paper we analyze the finite
sample performance of the linear least squares estimator under
sub-Gaussian martingale difference noise. Applying concentration of
measure bounds, we obtain tight bounds on the tail of the estimator's
distribution, including probability tail bounds on the norm of the
estimation error. We show that the probability of a large error decays
exponentially fast in the number of samples, which yields the number of
samples required to ensure a given accuracy with high probability. Our
analysis method is simple and uses simple, explicit bounds on the
estimation error; the tightness of the bounds is tested through
simulation. The proposed bounds make it possible to predict the number of
samples required for least squares estimation even when least squares is
sub-optimal and used only for its computational simplicity. The finite
sample analysis of least squares under this general noise model is novel.
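
As a rough numerical companion to the abstract above, the following sketch
(purely illustrative: the ARCH-style conditional scale, the Gaussian design,
and all parameter values are assumptions, not the paper's construction)
estimates the tail probability P(||theta_hat - theta|| > delta) of the least
squares error under a martingale difference noise sequence, so the decay in
the sample size can be observed empirically:

    import numpy as np

    rng = np.random.default_rng(0)

    def mds_noise(n, rng):
        # Illustrative martingale difference noise: E[e_t | past] = 0, with a
        # conditional scale driven by the previous innovation. Capping the
        # scale keeps the sequence uniformly (conditionally) sub-Gaussian.
        e = np.zeros(n)
        e[0] = rng.standard_normal()
        for t in range(1, n):
            scale = np.sqrt(0.5 + 0.3 * min(e[t - 1] ** 2, 4.0))
            e[t] = scale * rng.standard_normal()
        return e

    def tail_prob(n, theta, delta, trials, rng):
        # Monte Carlo estimate of P(||theta_hat - theta|| > delta).
        d = theta.size
        hits = 0
        for _ in range(trials):
            X = rng.standard_normal((n, d))       # random design
            y = X @ theta + mds_noise(n, rng)     # linear model, MDS noise
            theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
            hits += np.linalg.norm(theta_hat - theta) > delta
        return hits / trials

    theta = np.array([1.0, -2.0, 0.5])
    for n in (50, 100, 200, 400):
        print(n, tail_prob(n, theta, delta=0.25, trials=2000, rng=rng))

The printed frequencies shrinking rapidly as n doubles is the qualitative
behaviour the paper's exponential tail bounds describe.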
The Expected Norm of a Sum of Independent Random Matrices: An Elementary Approach
In contemporary applied and computational mathematics, a frequent challenge
is to bound the expectation of the spectral norm of a sum of independent random
matrices. This quantity is controlled by the norm of the expected square of the
random matrix and the expectation of the maximum squared norm achieved by one
of the summands; there is also a weak dependence on the dimension of the random
matrix. The purpose of this paper is to give a complete, elementary proof of
this important, but underappreciated, inequality.
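
Schematically, and writing \lesssim for an inequality that holds only up to
absolute constants, the bound described above takes the form
\[
\mathbb{E}\,\|Z\| \;\lesssim\; \sqrt{v(Z)\,\log d}
\;+\; \Bigl(\mathbb{E}\max_k \|S_k\|^2\Bigr)^{1/2} \log d,
\qquad
v(Z) := \max\bigl\{\,\|\mathbb{E}\,ZZ^{*}\|,\ \|\mathbb{E}\,Z^{*}Z\|\,\bigr\},
\]
where Z = \sum_k S_k is the sum of independent, centered random matrices and
d is the total dimension. This display merely restates the quantities named
in the abstract; the precise constants, logarithmic factors, and the matching
lower bound are as given in the paper.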
Sample average approximation with heavier tails II: localization in stochastic convex optimization and persistence results for the Lasso
We present exponential finite-sample nonasymptotic deviation inequalities for
the SAA estimator's near-optimal solution set over the class of stochastic
optimization problems with heavy-tailed random \emph{convex} functions in the
objective and constraints. Such a setting is better suited to problems where a
sub-Gaussian data-generating distribution is not to be expected, e.g., in stochastic
portfolio optimization. One of our contributions is to exploit \emph{convexity}
of the perturbed objective and the perturbed constraints as a property which
entails \emph{localized} deviation inequalities for joint feasibility and
optimality guarantees. This means that our bounds are significantly tighter in
terms of diameter and metric entropy since they depend only on the near-optimal
solution set but not on the whole feasible set. As a result, we obtain a much
sharper sample complexity estimate when compared to a general nonconvex
problem. In our analysis, we derive some localized deterministic perturbation
error bounds for convex optimization problems which are of independent
interest. To obtain our results, we assume only a metrically regular convex
feasible set, which possibly does not satisfy the Slater condition and whose
solution set need not be metrically regular. In this general setting, joint near feasibility
and near optimality are guaranteed. If in addition the set satisfies the Slater
condition, we obtain finite-sample simultaneous \emph{exact} feasibility and
near optimality guarantees (for a sufficiently small tolerance). Another
contribution of our work is to present, as a proof of concept of our localized
techniques, a persistence result for a variant of the Lasso estimator under
very weak assumptions on the data-generating distribution.
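
As a small proof-of-concept companion to the persistence claim, the sketch
below (purely illustrative: the Student-t design and noise, the fixed penalty
level, and the sparse coefficient vector are assumptions, not the paper's
construction) fits a Lasso on heavy-tailed data and tracks its out-of-sample
risk as the sample size grows:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    p = 20
    beta = np.zeros(p)
    beta[:3] = [1.5, -1.0, 0.5]        # sparse ground truth (illustrative)

    def sample(n):
        # Heavy-tailed design and noise: Student-t with 3 degrees of freedom,
        # a regime where sub-Gaussian assumptions fail.
        X = rng.standard_t(df=3, size=(n, p))
        y = X @ beta + rng.standard_t(df=3, size=n)
        return X, y

    X_test, y_test = sample(50_000)    # large hold-out as a population proxy
    for n in (100, 400, 1600, 6400):
        X, y = sample(n)
        fit = Lasso(alpha=0.1, max_iter=10_000).fit(X, y)
        risk = np.mean((y_test - fit.predict(X_test)) ** 2)
        print(f"n={n:5d}  out-of-sample risk={risk:.3f}")

Persistence here means only that the predictive risk of the fitted estimator
approaches that of the best comparable linear predictor as n grows; the
printed risks stabilizing is the qualitative behaviour a persistence result
predicts.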