390 research outputs found
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization
We study the problem of detecting a structured, low-rank signal matrix
corrupted with additive Gaussian noise. This includes clustering in a Gaussian
mixture model, sparse PCA, and submatrix localization. Each of these problems
is conjectured to exhibit a sharp information-theoretic threshold, below which
the signal is too weak for any algorithm to detect. We derive upper and lower
bounds on these thresholds by applying the first and second moment methods to
the likelihood ratio between these "planted models" and null models where the
signal matrix is zero. Our bounds differ by at most a factor of root two when
the rank is large (in the clustering and submatrix localization problems, when
the number of clusters or blocks is large) or the signal matrix is very sparse.
Moreover, our upper bounds show that for each of these problems there is a
significant regime where reliable detection is information-theoretically
possible but where known algorithms such as PCA fail completely, since the
spectrum of the observed matrix is uninformative. This regime is analogous to
the conjectured 'hard but detectable' regime for community detection in sparse
graphs. Comment: For sparse PCA and submatrix localization, we determine the
information-theoretic threshold exactly in the limit where the number of
blocks is large or the signal matrix is very sparse based on a conditional
second moment method, closing the factor of root two gap in the first versio
MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel
This paper considers probabilistic estimation of a low-rank matrix from
non-linear element-wise measurements of its elements. We derive the
corresponding approximate message passing (AMP) algorithm and its state
evolution. Relying on non-rigorous but standard assumptions motivated by
statistical physics, we characterize the minimum mean squared error (MMSE)
achievable information theoretically and with the AMP algorithm. Unlike in
related problems of linear estimation, in the present setting the MMSE depends
on the output channel only through a single parameter: its Fisher information.
We illustrate this striking finding by analysis of submatrix localization, and
of detection of communities hidden in a dense stochastic block model. For this
example we locate the computational and statistical boundaries that are not
equal for rank larger than four. Comment: 10 pages, Allerton Conference on Communication, Control, and
Computing 201
Sigma Point Belief Propagation
The sigma point (SP) filter, also known as the unscented Kalman filter, is an
attractive alternative to the extended Kalman filter and the particle filter.
Here, we extend the SP filter to nonsequential Bayesian inference corresponding
to loopy factor graphs. We propose sigma point belief propagation (SPBP) as a
low-complexity approximation of the belief propagation (BP) message passing
scheme. SPBP achieves approximate marginalizations of posterior distributions
corresponding to (generally) loopy factor graphs. It is well suited for
decentralized inference because of its low communication requirements. For a
decentralized, dynamic sensor localization problem, we demonstrate that SPBP
can outperform nonparametric (particle-based) BP while requiring significantly
fewer computations and communications. Comment: 5 pages, 1 figur
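The sigma point idea mentioned in the abstract above can be illustrated with a minimal sketch of the scalar unscented transform, which is the building block of the SP filter and not the SPBP algorithm itself. The function name `unscented_transform_1d`, the choice `kappa=2.0`, and the test function are assumptions made for illustration; they do not come from the paper.

```python
import math


def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian N(mean, var) through f using sigma points.

    Illustrative sketch only: uses the classic three-point rule for
    dimension n = 1 with scaling parameter kappa (an assumed default).
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    # Sigma points: the mean, plus one point on each side of it.
    points = [mean, mean + spread, mean - spread]
    # Weights sum to one; the central point gets kappa / (n + kappa).
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Push each sigma point through the nonlinearity.
    ys = [f(x) for x in points]
    # Weighted sample mean and variance approximate the output moments.
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


if __name__ == "__main__":
    # Hypothetical example: propagate N(1, 0.25) through f(x) = x^2.
    print(unscented_transform_1d(1.0, 0.25, lambda x: x * x))
```

Only three function evaluations are needed per transformed variable, which is the source of the low computational cost relative to particle-based methods that the abstract refers to.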
Inference And Learning: Computational Difficulty And Efficiency
In this thesis, we mainly investigate two collections of problems: statistical network inference and model selection in regression. The common feature shared by these two types of problems is that they typically exhibit an interesting phenomenon in terms of computational difficulty and efficiency.
For statistical network inference, our goal is to infer the network structure based on a noisy observation of the network. Statistically, we model the network as generated from the structural information in the presence of noise, for example, the planted submatrix model (for bipartite weighted graphs), the stochastic block model, and the Watts-Strogatz model. As the relative amount of "signal-to-noise" varies, the problems exhibit different stages of computational difficulty. On the theoretical side, we investigate these stages by characterizing the transition thresholds on the "signal-to-noise" ratio for the aforementioned models. On the methodological side, we provide new computationally efficient procedures to reconstruct the network structure for each model.
For model selection in regression, our goal is to learn a "good" model based on a certain model class from the observed data sequences (feature and response pairs), when the model can be misspecified. More concretely, we study two model selection problems: to learn from general classes of functions based on i.i.d. data with minimal assumptions, and to select from the sparse linear model class based on possibly adversarially chosen data in a sequential fashion. We develop new theoretical and algorithmic tools beyond empirical risk minimization to study these problems from a learning theory point of view.
Mutual Information in Rank-One Matrix Estimation
We consider the estimation of an n-dimensional vector x from the knowledge of
noisy and possibly non-linear element-wise measurements of xx^T, a very
generic problem that contains, e.g., the stochastic 2-block model, submatrix
localization or the spike perturbation of random matrices. We use an
interpolation method proposed by Guerra and later refined by Korada and Macris.
We prove that the Bethe mutual information (related to the Bethe free energy
and conjectured to be exact by Lesieur et al. on the basis of the non-rigorous
cavity method) always yields an upper bound to the exact mutual information. We
also provide a lower bound using a similar technique. For concreteness, we
illustrate our findings on the sparse PCA problem, and observe that (a) our
bounds match for a large region of parameters and (b) there exists a phase
transition in a region where the spectrum remains uninformative. While we
present only the case of rank-one symmetric matrix estimation, our proof
technique is readily extendable to low-rank symmetric matrix or low-rank
symmetric tensor estimation. Comment: 8 pages, 1 figure
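To make the rank-one observation model concrete, here is a minimal sketch under strong simplifying assumptions that are not part of the abstract: a linear Gaussian output channel, a ±1 prior on the entries of x, and the scaling Y = (lam / n) xx^T + W / sqrt(n) with symmetric Gaussian noise W. It estimates x with the top eigenvector of Y, that is, the plain spectral (PCA) baseline, not the Bethe or AMP analysis the paper studies. The name `spectral_overlap` and all default parameters are hypothetical.

```python
import math
import random


def spectral_overlap(n=150, lam=3.0, iters=100, shift=3.0, seed=0):
    """Overlap between the planted vector and the top eigenvector of Y.

    Illustrative sketch only: Y = (lam / n) * x x^T + W / sqrt(n), with x
    uniform in {-1, +1}^n and W a symmetric Gaussian matrix (assumed model).
    """
    rng = random.Random(seed)
    x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    # Build the symmetric observation matrix entry by entry.
    Y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            noise = rng.gauss(0.0, 1.0) / math.sqrt(n)
            Y[i][j] = Y[j][i] = lam / n * x[i] * x[j] + noise
    # Power iteration on Y + shift * I; the shift keeps the iteration
    # locked onto the largest (not the most negative) eigenvalue.
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        v = [sum(row[j] * v[j] for j in range(n)) + shift * v[i]
             for i, row in enumerate(Y)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    # Normalized overlap |<v, x>| / (|v| |x|), a value in [0, 1].
    return abs(sum(a * b for a, b in zip(v, x))) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical comparison of a strong and a weak signal.
    print(spectral_overlap(lam=3.0), spectral_overlap(lam=0.3))
```

In this dense scaling the top eigenvector is known to correlate with x only when lam exceeds 1; for a sparse prior, as in the sparse PCA illustration in the abstract, detection can remain possible in a region where this spectral estimate carries no information.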