
    Probabilistic Clustering Using Maximal Matrix Norm Couplings

    In this paper, we present a local information-theoretic approach to explicitly learn probabilistic clusterings of a discrete random variable. Our formulation yields a convex maximization problem for which it is NP-hard to find the global optimum. In order to solve this optimization problem algorithmically, we propose two relaxations that are solved via gradient ascent and alternating maximization. Experiments on the MSR Sentence Completion Challenge, MovieLens 100K, and Reuters-21578 datasets demonstrate that our approach is competitive with existing techniques and worthy of further investigation.
    Comment: Presented at the 56th Annual Allerton Conference on Communication, Control, and Computing, 2018
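    The paper's own solvers are not reproduced here. As a loose illustration of the gradient-ascent relaxation strategy the abstract mentions, the Python sketch below runs projected gradient ascent on a generic convex surrogate objective ||A·Q||_F² over row-stochastic soft-assignment matrices; the objective, the matrix A, and all function names are hypothetical stand-ins, not the authors' formulation.

    import numpy as np

    def project_rows_to_simplex(Q):
        # Euclidean projection of each row of Q onto the probability simplex
        # (the standard sorting-based algorithm).
        n, k = Q.shape
        U = np.sort(Q, axis=1)[:, ::-1]          # each row sorted descending
        css = np.cumsum(U, axis=1) - 1.0
        idx = np.arange(1, k + 1)
        rho = (U - css / idx > 0).sum(axis=1)    # active-set size per row
        theta = css[np.arange(n), rho - 1] / rho
        return np.maximum(Q - theta[:, None], 0.0)

    def cluster_by_projected_ascent(A, k, steps=500, lr=0.05, seed=0):
        # Ascend f(Q) = ||A @ Q||_F**2 over row-stochastic Q. Since this
        # maximizes a convex function, only a local optimum is guaranteed,
        # consistent with the NP-hardness noted in the abstract.
        rng = np.random.default_rng(seed)
        Q = project_rows_to_simplex(rng.random((A.shape[1], k)))
        for _ in range(steps):
            grad = 2.0 * A.T @ (A @ Q)           # gradient of ||A Q||_F^2
            Q = project_rows_to_simplex(Q + lr * grad)
        return Q                                 # row i ≈ P(cluster | x = i)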

    On Relations Between the Relative Entropy and χ²-Divergence, Generalizations and Applications

    The relative entropy and chi-squared divergence are fundamental divergence measures in information theory and statistics. This paper studies integral relations between the two divergences, the implications of these relations, their information-theoretic applications, and some generalizations pertaining to the rich class of f-divergences. The applications studied here include lossless compression, the method of types and large deviations, strong data-processing inequalities, bounds on contraction coefficients and maximal correlation, and the convergence rate to stationarity of a class of discrete-time Markov chains.
    Comment: Published in the Entropy journal, May 18, 2020. Journal version (open access) is available at https://www.mdpi.com/1099-4300/22/5/56
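    To make the abstract concrete, one representative identity of this family, verifiable by differentiating D(P‖(1−s)P+sQ) with respect to s, is the following; it is offered here only as an example of the kind of integral relation studied, not as the paper's headline result:

    \[
      D(P \,\|\, Q) \;=\; \int_0^1 \frac{1}{s}\,
      \chi^2\!\bigl(P \,\big\|\, (1-s)P + sQ\bigr)\, \mathrm{d}s,
    \]
    where
    \[
      D(P\|Q) = \sum_x P(x)\,\log\frac{P(x)}{Q(x)}, \qquad
      \chi^2(P\|Q) = \sum_x \frac{\bigl(P(x)-Q(x)\bigr)^2}{Q(x)}.
    \]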

    Comparison of Channels: Criteria for Domination by a Symmetric Channel

    This paper studies the basic question of whether a given channel V can be dominated (in the precise sense of being more noisy) by a q-ary symmetric channel. The concept of the "less noisy" relation between channels originated in network information theory (broadcast channels) and is defined in terms of mutual information or Kullback-Leibler divergence. We provide an equivalent characterization in terms of χ²-divergence. Furthermore, we develop a simple criterion for domination by a q-ary symmetric channel in terms of the minimum entry of the stochastic matrix defining the channel V. The criterion is strengthened for the special case of additive noise channels over finite Abelian groups. Finally, it is shown that domination by a symmetric channel implies (via comparison of Dirichlet forms) a logarithmic Sobolev inequality for the original channel.
    Comment: 31 pages, 2 figures. Presented at the 2017 IEEE International Symposium on Information Theory (ISIT)
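    For readers unfamiliar with the terminology, a standard divergence-based formulation of the "less noisy" relation, and the q-ary symmetric channel it is compared against, can be written as follows (these are the usual textbook definitions, not notation taken from this paper):

    \[
      W \succeq_{\mathrm{ln}} V
      \quad\Longleftrightarrow\quad
      D(PW \,\|\, QW) \;\ge\; D(PV \,\|\, QV)
      \quad \text{for all input distributions } P, Q,
    \]
    and the q-ary symmetric channel with crossover probability \(\delta\) has transition matrix
    \[
      W_\delta(y \mid x) =
      \begin{cases}
        1-\delta, & y = x, \\
        \dfrac{\delta}{q-1}, & y \neq x.
      \end{cases}
    \]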

    Discovering Potential Correlations via Hypercontractivity

    Discovering a correlation from one variable to another is of fundamental scientific and practical interest. While existing correlation measures are suitable for discovering average correlation, they fail to discover hidden or potential correlations. To bridge this gap, (i) we postulate a set of natural axioms that we expect a measure of potential correlation to satisfy; (ii) we show that the rate of information bottleneck, i.e., the hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we provide a novel estimator to estimate the hypercontractivity coefficient from samples; and (iv) we provide numerical experiments demonstrating that this proposed estimator discovers potential correlations among various indicators of WHO datasets, is robust in discovering gene interactions from gene-expression time-series data, and is statistically more powerful than the estimators for other correlation measures in binary hypothesis testing of canonical examples of potential correlations.
    Comment: 30 pages, 19 figures, accepted for publication in the 31st Conference on Neural Information Processing Systems (NIPS 2017)
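    For context, the hypercontractivity coefficient referenced above admits a standard information-theoretic characterization as the best achievable "information bottleneck rate" over all summaries U of X; this is the usual formulation in the literature, and the paper's estimator targets this quantity from samples:

    \[
      s^*(X;Y) \;=\; \sup_{\substack{U \,:\; U - X - Y \\ I(U;X) > 0}}
      \frac{I(U;Y)}{I(U;X)},
    \]
    where U - X - Y denotes a Markov chain, so that s^*(X;Y) measures the largest fraction of the information extracted about X that carries over to Y.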