High-dimensional Statistical Inference: from Vector to Matrix
Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications and has attracted considerable recent attention in many fields, including statistics, applied mathematics, and electrical engineering. In this thesis, we consider several such problems, including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion).
The first part of the thesis discusses compressed sensing and affine rank minimization in both the noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool that represents points in a polytope by convex combinations of sparse vectors; the technique is elementary yet leads to sharp results. Sharp bounds on the restricted isometry constants of the sensing matrix are shown to guarantee the exact recovery of all sparse signals in the noiseless case through constrained ℓ1 minimization, and analogous bounds on the restricted isometry constants of the linear map ensure the exact reconstruction of all matrices of rank at most r in the noiseless case via constrained nuclear norm minimization. Moreover, weaker restricted isometry conditions are shown to be insufficient to guarantee the exact recovery of all k-sparse signals for large k, and a similar result holds for matrix recovery. In addition, slightly strengthened versions of these conditions are shown to be sufficient for the stable recovery of approximately sparse signals and low-rank matrices, respectively, in the noisy case.
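The constrained ℓ1 minimization referred to above, min ‖x‖₁ subject to Ax = b, can be cast as a linear program by splitting x into its positive and negative parts. The following is a minimal sketch, not the thesis's own code; the problem sizes, seed, and use of SciPy's `linprog` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 40, 20, 3                      # ambient dim, measurements, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sensing matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

# min ||x||_1 s.t. Ax = b as an LP: x = u - v with u, v >= 0,
# minimize sum(u) + sum(v) subject to A(u - v) = b.
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]            # recovered sparse signal
```

With k well below the ℓ1 phase transition for these dimensions, the recovered `x_hat` matches `x_true` to solver precision.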
In the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for the stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained, and the proposed estimator is shown to be rate-optimal under certain conditions. The estimator is easy to implement via convex programming and performs well numerically. The techniques and main results developed in this chapter also have implications for other related statistical problems. As an application, estimation of spiked covariance matrices from one-dimensional random projections is considered; the results demonstrate that it is possible to accurately estimate the covariance matrix of a high-dimensional distribution based only on one-dimensional projections.
In the third part of the thesis, we consider another setting of low-rank matrix completion. The current literature on matrix completion focuses primarily on independent sampling models, under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, the proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix is observed. We provide theoretical justification for the proposed SMC method and derive lower bounds for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
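The structured-missingness setting above can be illustrated in the exactly low-rank case: if the observed rows and columns index blocks A11, A12, A21 of a rank-r matrix and A11 itself has rank r, the missing block satisfies A22 = A21 A11⁺ A12. The sketch below demonstrates this identity with hypothetical sizes; the thesis addresses the harder approximately low-rank case.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, m1 = 30, 2, 10                     # m1 observed rows/columns (illustrative)
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # exactly rank r

# Observed blocks: the first m1 rows and first m1 columns.
A11, A12, A21 = M[:m1, :m1], M[:m1, m1:], M[m1:, :m1]

# Recover the unobserved block via the pseudoinverse identity.
A22_hat = A21 @ np.linalg.pinv(A11) @ A12
```

For an exactly rank-r matrix this recovery is exact whenever the observed corner block captures the full rank.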
Matrix Completion via Max-Norm Constrained Optimization
Matrix completion has been well studied under the uniform sampling model and
the trace-norm regularized methods perform well both theoretically and
numerically in such a setting. However, the uniform sampling model is
unrealistic for a range of applications and the standard trace-norm relaxation
can behave very poorly when the underlying sampling scheme is non-uniform.
In this paper we propose and analyze a max-norm constrained empirical risk
minimization method for noisy matrix completion under a general sampling model.
The optimal rate of convergence is established under the Frobenius norm loss in
the context of approximately low-rank matrix reconstruction. It is shown that
the max-norm constrained method is minimax rate-optimal and yields a unified and robust approximate recovery guarantee with respect to the sampling distributions. The computational effectiveness of this method is also discussed, based on first-order algorithms for solving convex optimization problems involving max-norm regularization.
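First-order schemes for max-norm constraints typically exploit the factorization characterization ‖M‖_max = min over M = UVᵀ of (max row norm of U)·(max row norm of V): bounding the row norms of the factors enforces a max-norm bound. The sketch below is an illustrative projected gradient method under that idea, not the paper's algorithm; the rank, step size, bound R, and problem sizes are all assumptions.

```python
import numpy as np

def maxnorm_complete(Y, mask, R=9.0, rank=8, lr=0.01, iters=3000, seed=0):
    """Sketch: gradient descent on M = U @ V.T fit to observed entries,
    rescaling any row of U or V whose norm exceeds sqrt(R); this keeps
    the factorization inside a max-norm ball of radius R."""
    rng = np.random.default_rng(seed)
    n1, n2 = Y.shape
    U = 0.1 * rng.standard_normal((n1, rank))
    V = 0.1 * rng.standard_normal((n2, rank))
    for _ in range(iters):
        resid = mask * (U @ V.T - Y)          # residual on observed entries only
        U, V = U - lr * resid @ V, V - lr * resid.T @ U
        for F in (U, V):                      # project rows onto the ball
            norms = np.linalg.norm(F, axis=1, keepdims=True)
            F *= np.minimum(1.0, np.sqrt(R) / np.maximum(norms, 1e-12))
    return U @ V.T
```

The projection step is what distinguishes this from plain factored matrix completion: it never lets any single entry of the reconstruction grow unboundedly, which is the robustness property max-norm regularization buys under non-uniform sampling.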
Guarantees of Riemannian Optimization for Low Rank Matrix Completion
We study the Riemannian optimization methods on the embedded manifold of low
rank matrices for the problem of matrix completion, which is about recovering a
low rank matrix from its partial entries. Assume that $m$ entries of an $n\times n$ rank-$r$ matrix are sampled independently and uniformly with replacement. We
first prove that with high probability the Riemannian gradient descent and
conjugate gradient descent algorithms initialized by one step hard thresholding
are guaranteed to converge linearly to the measured matrix provided
\begin{align*} m\geq C_\kappa n^{1.5}r\log^{1.5}(n), \end{align*} where
$C_\kappa$ is a numerical constant depending on the condition number $\kappa$ of the
underlying matrix. The sampling complexity has been further improved to
\begin{align*} m\geq C_\kappa nr^2\log^{2}(n) \end{align*} via the resampled
Riemannian gradient descent initialization. The analysis of the new
initialization procedure relies on an asymmetric restricted isometry property
of the sampling operator and the curvature of the low rank matrix manifold.
Numerical simulations show that the algorithms are able to recover a low rank
matrix from nearly the minimum number of measurements.
Network Topology Mapping from Partial Virtual Coordinates and Graph Geodesics
For many important network types (e.g., sensor networks in complex harsh
environments and social networks) physical coordinate systems (e.g.,
Cartesian), and physical distances (e.g., Euclidean), are either difficult to
discern or inapplicable. Accordingly, coordinate systems and characterizations
based on hop-distance measurements, such as Topology Preserving Maps (TPMs) and
Virtual-Coordinate (VC) systems are attractive alternatives to Cartesian
coordinates for many network algorithms. Herein, we present an approach to
recover geometric and topological properties of a network with a small set of
distance measurements. In particular, our approach is a combination of shortest
path (often called geodesic) recovery concepts and low-rank matrix completion,
generalized to the case of hop-distances in graphs. Results for sensor networks
embedded in 2-D and 3-D spaces, as well as social networks, indicate that
the method can accurately capture the network connectivity with a small set of
measurements. TPM generation can now also be based on various context
appropriate measurements or VC systems, as long as they characterize different
nodes by distances to small sets of random nodes (instead of a set of global
anchors). The proposed method is a significant generalization that allows the
topology to be extracted from a random set of graph shortest paths, making it
applicable in contexts such as social networks where VC generation may not be
possible.
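A virtual coordinate of the kind described above can be sketched as a vector of hop distances to a small set of anchor nodes, computed by breadth-first search; the tiny path graph and anchor choice below are illustrative only.

```python
from collections import deque

def hop_distances(adj, src):
    """BFS hop distances from src in an unweighted graph (adjacency lists)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Virtual coordinates: characterize each node by hop distances to anchors.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # path graph 0-1-2-3
anchors = [0, 3]
vc = {v: [hop_distances(adj, a)[v] for a in anchors] for v in adj}
```

Each node's coordinate vector here has one entry per anchor; with only a few random anchors, the resulting partial distance matrix is the kind of input the low-rank completion step then fills in.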