High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity
Although the standard formulations of prediction problems involve
fully-observed and noiseless data drawn in an i.i.d. manner, many applications
involve noisy and/or missing data, possibly involving dependence, as well. We
study these issues in the context of high-dimensional sparse linear regression,
and propose novel estimators for the cases of noisy, missing and/or dependent
data. Many standard approaches to noisy or missing data, such as those using
the EM algorithm, lead to optimization problems that are inherently nonconvex,
and it is difficult to establish theoretical guarantees on practical
algorithms. While our approach also involves optimizing nonconvex programs, we
are able to both analyze the statistical error associated with any global
optimum, and more surprisingly, to prove that a simple algorithm based on
projected gradient descent will converge in polynomial time to a small
neighborhood of the set of all global minimizers. On the statistical side, we
provide nonasymptotic bounds that hold with high probability for the cases of
noisy, missing and/or dependent data. On the computational side, we prove that
under the same types of conditions required for statistical consistency, the
projected gradient descent algorithm is guaranteed to converge at a geometric
rate to a near-global minimizer. We illustrate these theoretical predictions
with simulations, showing close agreement with the predicted scalings.
Comment: Published at http://dx.doi.org/10.1214/12-AOS1018 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
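As a rough illustration of the projected gradient descent idea mentioned in the abstract above, the following minimal sketch minimizes a quadratic surrogate loss over an l1-ball. It is not the authors' implementation. The function names, the assumption that covariates carry additive noise with known covariance sigma_w2 * I, and the resulting surrogates Gamma = Z^T Z / n - sigma_w2 * I and gamma = Z^T y / n are illustrative assumptions not stated in the abstract. Gamma may fail to be positive semidefinite, which is where the nonconvexity comes from.

```python
def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius} (sort-based)."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    u = sorted((abs(x) for x in v), reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - radius) / k
        if uk > t:  # holds for a prefix of k; the last hit gives the threshold
            theta = t
    return [max(abs(x) - theta, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def corrected_surrogates(Z, y, sigma_w2):
    """Hypothetical helper: Gamma = Z^T Z / n - sigma_w2 * I, gamma = Z^T y / n.

    Assumes (for illustration) additive covariate noise with covariance
    sigma_w2 * I that is known in advance.
    """
    n, p = len(Z), len(Z[0])
    Gamma = [[sum(Z[r][i] * Z[r][j] for r in range(n)) / n
              - (sigma_w2 if i == j else 0.0)
              for j in range(p)] for i in range(p)]
    gamma = [sum(Z[r][i] * y[r] for r in range(n)) / n for i in range(p)]
    return Gamma, gamma


def projected_gradient(Gamma, gamma, radius, step, iters=500):
    """Minimize 0.5 * b^T Gamma b - gamma^T b subject to ||b||_1 <= radius."""
    p = len(gamma)
    beta = [0.0] * p
    for _ in range(iters):
        grad = [sum(Gamma[i][j] * beta[j] for j in range(p)) - gamma[i]
                for i in range(p)]
        beta = project_l1_ball([b - step * g for b, g in zip(beta, grad)],
                               radius)
    return beta
```

In use, one would build `Gamma, gamma = corrected_surrogates(Z, y, sigma_w2)` from the noisy design matrix and then call `projected_gradient(Gamma, gamma, radius, step)`. The radius and step size are tuning choices that this sketch leaves to the caller.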
On Distributed Linear Estimation With Observation Model Uncertainties
We consider distributed estimation of a Gaussian source in a heterogeneous,
bandwidth-constrained sensor network, where the source is corrupted by
independent multiplicative and additive observation noises, with incomplete
statistical knowledge of the multiplicative noise. For multi-bit quantizers, we
derive the closed-form mean-square-error (MSE) expression for the linear
minimum MSE (LMMSE) estimator at the fusion center (FC). For both error-free and
erroneous communication channels, we propose several rate allocation methods,
namely longest root-to-leaf path, greedy, and integer relaxation, to (i) minimize the
MSE given a network bandwidth constraint, and (ii) minimize the required
network bandwidth given a target MSE. We also derive the Bayesian Cramér-Rao
lower bound (CRLB) and compare the MSE performance of our proposed methods
against the CRLB. Our results corroborate that, for low-power multiplicative
observation noises and adequate network bandwidth, the gaps between the MSE of
our proposed methods and the CRLB are negligible, while the performance of
other methods like individual and uniform rate allocation is not satisfactory.
Cosmological constraints from the convergence 1-point probability distribution
We examine the cosmological information available from the 1-point
probability distribution (PDF) of the weak-lensing convergence field, utilizing
fast L-PICOLA simulations and a Fisher analysis. We find competitive
constraints in the $\Omega_m$-$\sigma_8$ plane from the convergence PDF with
pixels compared to the cosmic shear power spectrum with an
equivalent number of modes (). The convergence PDF also partially
breaks the degeneracy cosmic shear exhibits in that parameter space. A joint
analysis of the convergence PDF and shear 2-point function also reduces the
impact of shape measurement systematics, to which the PDF is less susceptible,
and improves the total figure of merit by a factor of , depending on the
level of systematics. Finally, we present a correction factor necessary for
calculating the unbiased Fisher information from finite differences using a
limited number of cosmological simulations.
Comment: 10 pages, 5 figures
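To make the finite-difference Fisher calculation mentioned in the abstract above more concrete, here is a minimal sketch under simplifying assumptions: a Gaussian likelihood whose covariance is diagonal and independent of the parameters, so that F_ij = sum_k (d mu_k / d theta_i)(d mu_k / d theta_j) / var_k. The names `central_difference`, `fisher_matrix`, and `toy_model` are hypothetical. The sketch does not include the correction factor the abstract refers to.

```python
def central_difference(model, theta, i, step):
    """Central finite-difference derivative of model(theta) w.r.t. theta[i]."""
    up, down = list(theta), list(theta)
    up[i] += step
    down[i] -= step
    return [(a - b) / (2.0 * step) for a, b in zip(model(up), model(down))]


def fisher_matrix(model, theta, steps, variances):
    """Fisher matrix for a Gaussian likelihood with fixed diagonal covariance.

    model     : maps a parameter list to a list of predicted data values
                (e.g. binned PDF values); a stand-in for simulation output
    steps     : finite-difference step for each parameter
    variances : per-data-point variance (diagonal covariance assumed)
    """
    n = len(theta)
    derivs = [central_difference(model, theta, i, steps[i]) for i in range(n)]
    return [[sum(da * db / var
                 for da, db, var in zip(derivs[i], derivs[j], variances))
             for j in range(n)] for i in range(n)]


def toy_model(theta):
    """Hypothetical linear data model, used only to exercise the code."""
    a, b = theta
    return [a + b, a - b, 2.0 * a]


F = fisher_matrix(toy_model, [1.0, 2.0], [0.5, 0.5], [1.0, 1.0, 1.0])
```

With real simulations the derivatives are themselves noisy, and a full analysis would use the complete inverse covariance rather than a diagonal one.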