Probabilistic Clustering Using Maximal Matrix Norm Couplings
In this paper, we present a local information theoretic approach to
explicitly learn probabilistic clustering of a discrete random variable. Our
formulation yields a convex maximization problem for which it is NP-hard to
find the global optimum. In order to algorithmically solve this optimization
problem, we propose two relaxations that are solved via gradient ascent and
alternating maximization. Experiments on the MSR Sentence Completion Challenge,
MovieLens 100K, and Reuters-21578 datasets demonstrate that our approach is
competitive with existing techniques and worthy of further investigation.
Comment: Presented at the 56th Annual Allerton Conference on Communication, Control, and Computing, 2018.
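A minimal sketch of the alternating-maximization idea mentioned above, on a toy bilinear coupling objective over probability simplices (an illustrative assumption, not the paper's actual relaxation or objective):

```python
import numpy as np

def alternating_maximization(A, iters=50, seed=0):
    """Toy alternating maximization of x^T A y over probability
    simplices: with one argument fixed, the objective is linear in
    the other, so each update puts all mass on the best coordinate."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    y = rng.dirichlet(np.ones(n))            # random point on the simplex
    for _ in range(iters):
        x = np.eye(m)[np.argmax(A @ y)]      # best response in x given y
        y = np.eye(n)[np.argmax(A.T @ x)]    # best response in y given x
    return x, y, x @ A @ y

# Hypothetical coupling matrix chosen so the iteration has a unique
# fixed point; the procedure converges to the entry A[1, 1] = 2.0.
A = np.array([[1.0, 0.2],
              [1.5, 2.0]])
x, y, val = alternating_maximization(A)
```

As in the paper's setting, each partial update is exact, but the procedure only guarantees a local optimum of the overall (NP-hard) maximization.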
On Relations Between the Relative Entropy and χ²-Divergence, Generalizations and Applications
The relative entropy and chi-squared divergence are fundamental divergence
measures in information theory and statistics. This paper is focused on a study
of integral relations between the two divergences, the implications of these
relations, their information-theoretic applications, and some generalizations
pertaining to the rich class of f-divergences. Applications that are studied
in this paper refer to lossless compression, the method of types and large
deviations, strong data-processing inequalities, bounds on contraction
coefficients and maximal correlation, and the convergence rate to stationarity
of a type of discrete-time Markov chains.
Comment: Published in the Entropy journal, May 18, 2020. Journal version (open access) is available at https://www.mdpi.com/1099-4300/22/5/56
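The two divergences the paper relates can be computed directly for discrete distributions; the sketch below also checks the classical inequality D(P‖Q) ≤ log(1 + χ²(P‖Q)) on a toy example (the example distributions are assumptions for illustration):

```python
import math

def kl_divergence(p, q):
    """Relative entropy D(P||Q) in nats, for discrete distributions
    given as lists of probabilities (terms with p_i = 0 contribute 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chi2_divergence(p, q):
    """Chi-squared divergence chi^2(P||Q) = sum_i (p_i - q_i)^2 / q_i."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = kl_divergence(p, q)
chi2 = chi2_divergence(p, q)

# A standard relation between the two: D(P||Q) <= log(1 + chi^2(P||Q)).
assert kl <= math.log(1 + chi2)
```

Both are instances of f-divergences (f(t) = t log t and f(t) = (t − 1)², respectively), the class whose integral relations the paper generalizes.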
Comparison of Channels: Criteria for Domination by a Symmetric Channel
This paper studies the basic question of whether a given channel can be
dominated (in the precise sense of being more noisy) by a q-ary symmetric
channel. The concept of "less noisy" relation between channels originated in
network information theory (broadcast channels) and is defined in terms of
mutual information or Kullback-Leibler divergence. We provide an equivalent
characterization in terms of χ²-divergence. Furthermore, we develop a
simple criterion for domination by a q-ary symmetric channel in terms of the
minimum entry of the stochastic matrix defining the channel. The criterion
is strengthened for the special case of additive noise channels over finite
Abelian groups. Finally, it is shown that domination by a symmetric channel
implies (via comparison of Dirichlet forms) a logarithmic Sobolev inequality
for the original channel.
Comment: 31 pages, 2 figures. Presented at the 2017 IEEE International Symposium on Information Theory (ISIT).
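The two objects the abstract's criterion compares can be written down concretely: the q-ary symmetric channel as a row-stochastic matrix, and the minimum entry of a given channel matrix. A small sketch (the example channel `V` is an assumption for illustration):

```python
import numpy as np

def q_ary_symmetric_channel(q, delta):
    """Row-stochastic matrix of the q-ary symmetric channel: an input
    symbol is kept with probability 1 - delta and flipped to each of
    the other q - 1 symbols with probability delta / (q - 1)."""
    W = np.full((q, q), delta / (q - 1))
    np.fill_diagonal(W, 1 - delta)
    return W

W = q_ary_symmetric_channel(3, 0.2)

# A hypothetical channel V; its minimum entry is the quantity the
# paper's domination criterion is stated in terms of.
V = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.1, 0.7]])
min_entry = V.min()
```

Every row of `W` sums to one, and larger `delta` makes the channel noisier, pushing `W` toward the uniform matrix.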
Discovering Potential Correlations via Hypercontractivity
Discovering a correlation from one variable to another is of
fundamental scientific and practical interest. While existing correlation
measures are suitable for discovering average correlation, they fail to
discover hidden or potential correlations. To bridge this gap, (i) we postulate
a set of natural axioms that we expect a measure of potential correlation to
satisfy; (ii) we show that the rate of information bottleneck, i.e., the
hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we
provide a novel estimator to estimate the hypercontractivity coefficient from
samples; and (iv) we provide numerical experiments demonstrating that this
proposed estimator discovers potential correlations among various indicators of
WHO datasets, is robust in discovering gene interactions from gene expression
time series data, and is statistically more powerful than the estimators for
other correlation measures in binary hypothesis testing of canonical examples
of potential correlations.
Comment: 30 pages, 19 figures, accepted for publication in the 31st Conference on Neural Information Processing Systems (NIPS 2017).
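The gap between average and potential correlation can be illustrated with synthetic data: a dependence that holds only in a rare regime is nearly invisible to an average measure such as the Pearson coefficient. A toy sketch (the data-generating model is an assumption, not one of the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# "Potential" correlation: Y depends on X only in a rare regime.
x = rng.uniform(-1, 1, n)
rare = rng.random(n) < 0.05          # ~5% of samples fall in the regime
noise = rng.normal(0, 1, n)
y = np.where(rare, x, noise)         # Y = X in the regime, pure noise otherwise

overall = np.corrcoef(x, y)[0, 1]             # weak: averaged over all samples
within = np.corrcoef(x[rare], y[rare])[0, 1]  # strong inside the rare regime
```

An average measure reports `overall` close to zero, even though the dependence inside the regime is perfect; a measure satisfying the paper's axioms is meant to flag exactly this kind of hidden relationship.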