The Overlap Gap Property in Principal Submatrix Recovery
We study support recovery for a principal submatrix with elevated mean, hidden
in a symmetric mean-zero Gaussian matrix. We establish that there exists a
constant such that the MLE recovers a constant proportion of the hidden
submatrix when the elevated mean exceeds a threshold, while such recovery is
information-theoretically impossible below a matching threshold. The MLE is
computationally intractable in general, and in fact, for sufficiently small
signal strength, this problem is conjectured to exhibit a
\emph{statistical-computational gap}. To provide rigorous evidence for this, we
study the likelihood landscape for this problem, and establish that in part of
the conjectured hard regime it exhibits a variant of the \emph{Overlap Gap
Property (OGP)}. As a direct consequence, we establish that a family of local
MCMC-based algorithms does not achieve optimal recovery. Finally, we establish
that above a higher signal threshold, a simple spectral method recovers a
constant proportion of the hidden submatrix.
Comment: 42 pages, 1 figure
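The spectral method mentioned in the abstract can be sketched numerically: plant an elevated-mean block inside symmetric Gaussian noise and read the support off the top eigenvector. The sizes and signal strength below are illustrative choices, not the paper's exact scaling.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, mu = 400, 80, 1.0  # matrix size, submatrix size, elevated mean (illustrative)

# Plant a k x k principal submatrix with elevated mean mu inside a
# symmetric mean-zero Gaussian matrix.
support = rng.choice(n, size=k, replace=False)
G = rng.standard_normal((n, n))
A = (G + G.T) / np.sqrt(2)          # symmetric, mean-zero Gaussian noise
A[np.ix_(support, support)] += mu   # hidden elevated-mean block

# Spectral method: the planted block induces an outlier eigenvalue, and the
# hidden support shows up in the large-magnitude entries of the top eigenvector.
_, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, -1]
estimate = np.argsort(-np.abs(v))[:k]

overlap = len(set(estimate) & set(support)) / k
print(f"fraction of hidden support recovered: {overlap:.2f}")
```

With the signal strength chosen above the spectral threshold, the top eigenvector concentrates on the hidden support and a constant proportion (here nearly all) of it is recovered.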
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization
We study the problem of detecting a structured, low-rank signal matrix
corrupted with additive Gaussian noise. This includes clustering in a Gaussian
mixture model, sparse PCA, and submatrix localization. Each of these problems
is conjectured to exhibit a sharp information-theoretic threshold, below which
the signal is too weak for any algorithm to detect. We derive upper and lower
bounds on these thresholds by applying the first and second moment methods to
the likelihood ratio between these "planted models" and null models where the
signal matrix is zero. Our bounds differ by at most a factor of root two when
the rank is large (in the clustering and submatrix localization problems, when
the number of clusters or blocks is large) or the signal matrix is very sparse.
Moreover, our upper bounds show that for each of these problems there is a
significant regime where reliable detection is information-theoretically
possible but where known algorithms such as PCA fail completely, since the
spectrum of the observed matrix is uninformative. This regime is analogous to
the conjectured 'hard but detectable' regime for community detection in sparse
graphs.
Comment: For sparse PCA and submatrix localization, we determine the
information-theoretic threshold exactly in the limit where the number of
blocks is large or the signal matrix is very sparse, based on a conditional
second moment method, closing the factor-of-root-two gap in the first version.
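The claim that the spectrum of the observed matrix can be uninformative even when detection is possible is tied to the BBP-type spectral transition. A minimal numerical sketch, using an illustrative rank-one spiked Wigner model rather than the paper's exact models:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000  # illustrative dimension

def top_eig(beta):
    """Largest eigenvalue of a rank-one spike beta*x x^T plus Wigner noise."""
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2 * n)  # Wigner matrix, bulk edge near 2
    return np.linalg.eigvalsh(beta * np.outer(x, x) + W)[-1]

# Below the spectral (BBP) threshold beta = 1, the top eigenvalue sticks to
# the bulk edge 2 and the spectrum carries no trace of the planted signal;
# above it, an outlier appears near beta + 1/beta.
low = top_eig(0.5)   # signal present, yet spectrum looks like pure noise
high = top_eig(3.0)  # detectable outlier
print(low, high)
```

In the regime the abstract describes, the signal sits below this spectral threshold while still being strong enough for (inefficient) likelihood-based detection.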
Computational Barriers to Estimation from Low-Degree Polynomials
One fundamental goal of high-dimensional statistics is to detect or recover
structure from noisy data. In many cases, the data can be faithfully modeled by
a planted structure (such as a low-rank matrix) perturbed by random noise. But
even for these simple models, the computational complexity of estimation is
sometimes poorly understood. A growing body of work studies low-degree
polynomials as a proxy for computational complexity: it has been demonstrated
in various settings that low-degree polynomials of the data can match the
statistical performance of the best known polynomial-time algorithms for
detection. While prior work has studied the power of low-degree polynomials for
the task of detecting the presence of hidden structures, it has failed to
address the estimation problem in settings where detection is qualitatively
easier than estimation.
In this work, we extend the method of low-degree polynomials to address
problems of estimation and recovery. For a large class of "signal plus noise"
problems, we give a user-friendly lower bound for the best possible mean
squared error achievable by any degree-D polynomial. To our knowledge, this is
the first instance in which the low-degree polynomial method can establish
low-degree hardness of recovery problems where the associated detection problem
is easy. As applications, we give a tight characterization of the low-degree
minimum mean squared error for the planted submatrix and planted dense subgraph
problems, resolving (in the low-degree framework) open problems about the
computational complexity of recovery in both cases.
Comment: 38 pages
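The notion of a degree-D polynomial estimator can be illustrated in the simplest case D = 1 on a toy planted-submatrix instance. The parameters below are hypothetical, and the shrinkage coefficient is fit empirically as a stand-in for the population optimum:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, mu = 300, 60, 2.0  # hypothetical instance sizes, not the paper's regime

S = rng.choice(n, size=k, replace=False)
M = np.zeros((n, n))
M[np.ix_(S, S)] = mu                 # planted-submatrix signal
Y = M + rng.standard_normal((n, n))  # observed signal-plus-noise matrix

# Degree-1 estimator of M: entrywise linear shrinkage a*Y.  The mean-square
# optimal coefficient is a = E[M*Y] / E[Y^2]; here it is fit empirically as a
# stand-in for the population expectation.
a = (M * Y).mean() / (Y ** 2).mean()
mse_deg1 = ((M - a * Y) ** 2).mean()
mse_zero = (M ** 2).mean()  # MSE of the trivial all-zeros estimate
print(mse_deg1, mse_zero)   # shrinkage improves on the trivial estimate
```

The paper's lower bounds concern the best achievable MSE over all polynomials up to a given degree D; this degree-1 example only shows what such an estimator looks like, not the optimal one.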
Simultaneously Structured Models with Application to Sparse and Low-rank Matrices
The topic of recovery of a structured model given a small number of linear
observations has been well-studied in recent years. Examples include recovering
sparse or group-sparse vectors, low-rank matrices, and the sum of sparse and
low-rank matrices, among others. In various applications in signal processing
and machine learning, the model of interest is known to be structured in
several ways at the same time, for example, a matrix that is simultaneously
sparse and low-rank.
Often norms that promote each individual structure are known, and allow for
recovery using an order-wise optimal number of measurements (e.g., the $\ell_1$
norm for sparsity, the nuclear norm for matrix rank). Hence, it is reasonable to
minimize a combination of such norms. We show that, surprisingly, if we use
multi-objective optimization with these norms, then we can do no better,
order-wise, than an algorithm that exploits only one of the present structures.
This result suggests that to fully exploit the multiple structures, we need an
entirely new convex relaxation, i.e., not one that is a function of the convex
relaxations used for each structure. We then specialize our results to the case
of sparse and low-rank matrices. We show that a nonconvex formulation of the
problem can recover the model from very few measurements, which is on the order
of the degrees of freedom of the matrix, whereas the convex problem obtained
from a combination of the $\ell_1$ and nuclear norms requires many more
measurements. This proves an order-wise gap between the performance of the
convex and nonconvex recovery problems in this case. Our framework applies to
arbitrary structure-inducing norms as well as to a wide range of measurement
ensembles. This allows us to give performance bounds for problems such as
sparse phase retrieval and low-rank tensor completion.
Comment: 38 pages, 9 figures
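The combined objective discussed above, a weighted sum of a sparsity-promoting norm and a rank-promoting norm, can be written down directly. A minimal sketch with illustrative weights and sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
n, s, tau = 50, 5, 1.0  # size, sparsity, trade-off weight (all illustrative)

# A simultaneously sparse and low-rank matrix: the outer product of s-sparse
# vectors has rank 1 with only s*s nonzero entries.
u = np.zeros(n)
u[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
X = np.outer(u, u)

def combined_norm(M, tau):
    """Weighted sum of the l1 norm (promotes sparsity) and the nuclear norm
    (promotes low rank): the kind of multi-objective surrogate analyzed here."""
    l1 = np.abs(M).sum()
    nuclear = np.linalg.svd(M, compute_uv=False).sum()
    return l1 + tau * nuclear

# The combined penalty is small on the structured matrix and large on an
# unstructured one with the same Frobenius energy.
dense = rng.standard_normal((n, n))
dense *= np.linalg.norm(X) / np.linalg.norm(dense)
print(combined_norm(X, tau), combined_norm(dense, tau))
```

The paper's negative result is about minimizing such combinations subject to linear measurements; this sketch only shows the objective itself, not a recovery procedure.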
Efficient reconstruction of band-limited sequences from nonuniformly decimated versions by use of polyphase filter banks
An efficient polyphase structure for the reconstruction of a band-limited sequence from a nonuniformly decimated version is developed. Theoretically, the reconstruction involves implementing a bank of multilevel filters, and it is shown how all of these reconstruction filters can be obtained at the cost of one Mth-band low-pass filter and a constant matrix multiplier. The resulting structure is therefore more general than previous schemes. In addition, the method offers a direct means of controlling the overall reconstruction distortion T(z) by appropriate design of a low-pass prototype filter P(z). Extensions of these results to multiband band-limited signals and to the case of nonconsecutive nonuniform subsampling are also summarized, along with generalizations to the multidimensional case. Design examples are included to demonstrate the theory, and the complexity of the new method is seen to be much lower than that of earlier ones.
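The underlying principle can be illustrated without the polyphase machinery: a sequence confined to a known frequency band lives in a low-dimensional subspace, so enough retained samples, even nonuniform ones, determine it exactly. This is a sketch of that principle only, not of the paper's filter-bank structure:

```python
import numpy as np

rng = np.random.default_rng(4)
N, B = 64, 8  # sequence length and number of occupied DFT bins (illustrative)

# A band-limited sequence: energy confined to the lowest B DFT bins.
coeffs = rng.standard_normal(B) + 1j * rng.standard_normal(B)
F = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(B)) / N)
x = (F @ coeffs) / N

# Nonuniform decimation: retain an irregular subset of the samples.
kept = np.sort(rng.choice(N, size=2 * B, replace=False))

# Since x lives in a B-dimensional subspace, the retained samples determine it:
# solve least squares for the DFT coefficients and resynthesize.
c_hat, *_ = np.linalg.lstsq(F[kept] / N, x[kept], rcond=None)
x_hat = (F @ c_hat) / N
print(np.max(np.abs(x - x_hat)))  # near machine precision
```

The paper's contribution is an efficient filter-bank realization of this reconstruction; the direct least-squares solve above would be far more expensive in a streaming setting.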