Selecting the rank of truncated SVD by Maximum Approximation Capacity
Truncated Singular Value Decomposition (SVD) calculates the closest rank-k
approximation of a given input matrix. Selecting the appropriate rank
defines a critical model order choice in most applications of SVD. To obtain a
principled cut-off criterion for the spectrum, we convert the underlying
optimization problem into a noisy channel coding problem. The optimal
approximation capacity of this channel controls the appropriate strength of
regularization to suppress noise. In simulation experiments, this
information-theoretic method for determining the optimal rank competes with
state-of-the-art model selection techniques.
Comment: 7 pages, 5 figures; will be presented at the IEEE International
Symposium on Information Theory (ISIT) 2011. The conference version has only
5 pages. This version has an extended appendix.
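As a minimal sketch of the object whose rank is being selected: truncating the SVD to its top k singular values gives the best rank-k approximation in the Frobenius norm (Eckart–Young). This illustrates the truncation itself only, not the paper's capacity-based criterion for choosing k.

```python
import numpy as np

def truncated_svd(A, k):
    """Best rank-k approximation of A in the Frobenius norm
    (Eckart-Young theorem). Choosing k is the model-order
    decision the abstract's criterion addresses."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
A2 = truncated_svd(A, 2)
print(np.linalg.matrix_rank(A2))  # 2
```

The approximation error is the tail of the singular spectrum, so it decreases weakly as k grows; the rank-selection problem is exactly where to cut this spectrum.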
Greedy MAXCUT Algorithms and their Information Content
MAXCUT is a classical NP-hard problem for graph partitioning and serves as
a typical case of the symmetric non-monotone Unconstrained Submodular
Maximization (USM) problem. Applications of MAXCUT are abundant in machine
learning, computer vision and statistical physics. Greedy algorithms to
approximately solve MAXCUT rely on greedy vertex labelling or on an edge
contraction strategy. These algorithms have been studied by measuring their
approximation ratios in the worst-case setting, but little is known about
their average-case robustness to noise contamination of the input data.
average case. Adapting the framework of Approximation Set Coding, we present a
method to exactly measure the cardinality of the algorithmic approximation sets
of five greedy MAXCUT algorithms. Their information contents are explored for
graph instances generated by two different noise models: the edge reversal
model and Gaussian edge weights model. The results provide insights into the
robustness of different greedy heuristics and techniques for MAXCUT, which can
be used for algorithm design of general USM problems.
Comment: This is a longer version of the paper published in the 2015 IEEE
Information Theory Workshop (ITW).
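A generic greedy vertex-labelling heuristic of the kind studied here can be sketched as follows; this is a plain illustration of the strategy, not any of the five specific algorithms analyzed in the paper.

```python
def greedy_maxcut(W):
    """Greedy vertex labelling for MAXCUT: place each vertex on the side
    opposite the heavier total weight to already-labelled vertices, so the
    heavier weight crosses the cut. W is a symmetric weight matrix."""
    n = len(W)
    side = [0] * n
    for v in range(1, n):  # vertex 0 is fixed to side 0
        w0 = sum(W[v][u] for u in range(v) if side[u] == 0)
        w1 = sum(W[v][u] for u in range(v) if side[u] == 1)
        side[v] = 1 if w0 >= w1 else 0
    return side

def cut_value(W, side):
    """Total weight of edges crossing the cut."""
    return sum(W[u][v] for u in range(len(W))
               for v in range(u) if side[u] != side[v])

W = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # unit-weight triangle
print(greedy_maxcut(W), cut_value(W, greedy_maxcut(W)))  # [0, 1, 1] 2
```

For nonnegative weights this greedy labelling cuts at least half the total edge weight; the paper's average-case analysis asks how informative such heuristics remain under noisy inputs.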
Learning Dictionaries with Bounded Self-Coherence
Sparse coding in learned dictionaries has been established as a successful
approach for signal denoising, source separation and solving inverse problems
in general. A dictionary learning method adapts an initial dictionary to a
particular signal class by iteratively computing an approximate factorization
of a training data matrix into a dictionary and a sparse coding matrix. The
learned dictionary is characterized by two properties: the coherence of the
dictionary to observations of the signal class, and the self-coherence of the
dictionary atoms. A high coherence to the signal class enables the sparse
coding of signal observations with a small approximation error, while a low
self-coherence of the atoms guarantees atom recovery and a more rapid residual
error decay rate for the sparse coding algorithm. The two goals of high signal
coherence and low self-coherence typically conflict; one therefore seeks a
trade-off between them, depending on the application. We present a dictionary
learning method with an effective control over the self-coherence of the
trained dictionary, enabling a trade-off between maximizing the sparsity of
codings and approximating an equiangular tight frame.
Comment: 4 pages, 2 figures; IEEE Signal Processing Letters, vol. 19, no. 12,
2012.
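The self-coherence above is, in the usual mutual-coherence sense, the largest absolute inner product between two distinct unit-norm atoms. A minimal sketch of measuring it for a given dictionary (the learning method itself is not implemented here):

```python
import numpy as np

def self_coherence(D):
    """Mutual coherence of dictionary D (atoms as columns): the maximum
    absolute inner product between two distinct normalized atoms."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    G = np.abs(Dn.T @ Dn)        # Gram matrix of normalized atoms
    np.fill_diagonal(G, 0.0)     # ignore each atom's self inner product
    return float(G.max())

# Orthonormal atoms have self-coherence 0; an equiangular tight frame
# attains the smallest possible coherence for its dimensions.
print(self_coherence(np.eye(4)))  # 0.0
```

Bounding this quantity during training is what trades off sparse-coding accuracy against recovery guarantees, as the abstract describes.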
Exact Recovery for a Family of Community-Detection Generative Models
Generative models for networks with communities have been studied extensively
for being a fertile ground to establish information-theoretic and computational
thresholds. In this paper we propose a new toy model for planted generative
models, called the planted Random Energy Model (P-REM), inspired by Derrida's
REM. For
this model we provide the asymptotic behaviour of the probability of error for
the maximum likelihood estimator and hence the exact recovery threshold. As an
application, we further consider the Weighted Stochastic Block Model with two
non-equally sized communities (2-WSBM) on uniform hypergraphs, which is
equivalent to the P-REM on both sides of the spectrum, for high and low edge
cardinality. We provide upper and lower bounds on exact recoverability for any
edge cardinality, mapping these problems to the aforementioned P-REM. To the
best of our knowledge, these are the first consistency results for the 2-WSBM
on graphs and on hypergraphs with non-equally sized communities.
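For orientation only, a toy sampler for a two-community weighted block model on an ordinary graph; the parameter names here are ours, and the hypergraph case studied in the paper is not covered.

```python
import numpy as np

def sample_2wsbm(n, frac, mu_in, mu_out, rng):
    """Toy weighted SBM with two (possibly unequal) communities: each edge
    weight is a unit-variance Gaussian whose mean is mu_in inside a
    community and mu_out across. Illustrative sketch only; frac, mu_in,
    mu_out are our own names, not the paper's notation."""
    labels = (np.arange(n) < int(frac * n)).astype(int)  # planted partition
    same = labels[:, None] == labels[None, :]
    means = np.where(same, mu_in, mu_out)
    W = rng.standard_normal((n, n)) + means
    W = np.triu(W, 1)   # one Gaussian draw per edge
    W = W + W.T         # symmetrize; diagonal stays zero
    return W, labels

W, labels = sample_2wsbm(10, 0.3, 2.0, 0.0, np.random.default_rng(1))
print(W.shape, np.allclose(W, W.T))  # (10, 10) True
```

Exact recovery asks when the planted labels can be reconstructed (up to symmetry) from W alone with probability tending to one; the paper locates this threshold via a mapping to the P-REM.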