
350 research outputs found

Optimal Kullback-Leibler Aggregation via Information Bottleneck

Author: Geiger Bernhard C.
Koeppl Heinz
Kubin Gernot
Petrov Tatjana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

In this paper, we present a method for reducing a regular, discrete-time Markov chain (DTMC) to another DTMC with a given, typically much smaller number of states. The cost of reduction is defined as the Kullback-Leibler divergence rate between a projection of the original process through a partition function and a DTMC on the correspondingly partitioned state space. Finding the reduced model with minimal cost is computationally expensive, as it requires an exhaustive search among all state space partitions, and an exact evaluation of the reduction cost for each candidate partition. Our approach deals with the latter problem by minimizing an upper bound on the reduction cost instead of minimizing the exact cost. The proposed upper bound is easy to compute, and it is tight if the original chain is lumpable with respect to the partition. Then, we express the problem in the form of information bottleneck optimization, and propose using the agglomerative information bottleneck algorithm to search for a sub-optimal partition greedily, rather than exhaustively. The theory is illustrated with examples and one application scenario in the context of modeling bio-molecular interactions. Comment: 13 pages, 4 figures
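
The abstract above relies on the notion of a chain being lumpable with respect to a partition. As a minimal sketch of that notion only (not the paper's bound or its search algorithm), the following checks the classical strong-lumpability condition: every state in a block must have the same total probability of jumping into each block. The function name `lump`, the list-based matrix layout, and the example matrix `P` are illustrative assumptions, not taken from the paper.

```python
def lump(P, partition, tol=1e-12):
    """Return the aggregated transition matrix if P is strongly lumpable
    with respect to `partition`, otherwise None.

    P         -- row-stochastic matrix as a list of lists (hypothetical layout)
    partition -- partition[i] is the block index (0..k-1) of state i;
                 every block index is assumed to be used at least once
    """
    k = max(partition) + 1
    blocks = [[i for i, b in enumerate(partition) if b == a] for a in range(k)]
    Q = []
    for src in blocks:
        # For each state in the source block, total probability of
        # jumping into each destination block.
        rows = [[sum(P[i][j] for j in dst) for dst in blocks] for i in src]
        first = rows[0]
        # Lumpability: all states in the block must agree on these sums.
        for r in rows[1:]:
            if any(abs(a - b) > tol for a, b in zip(first, r)):
                return None
        Q.append(first)
    return Q


# Illustrative 3-state chain (made up for this example).
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]

print(lump(P, [0, 1, 1]))  # states 1 and 2 merged: lumpable, 2x2 matrix
print(lump(P, [0, 0, 1]))  # states 0 and 1 merged: not lumpable, None
```

When the check succeeds, the returned matrix is itself a DTMC on the blocks, which is the situation in which the abstract says the proposed upper bound is tight.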

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Trieste

TUbiblio

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Approximate inference in hidden Markov models using iterative active state selection

Author: Andrieu C
Piechocki RJ
Vithanage CM
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2006

Explore Bristol Research

Reduction of Markov Chains using a Value-of-Information-Based Approach

Author: Principe Jose C.
Sledge Isaac J.
Publication venue: 'MDPI AG'
Publication date: 01/03/2019

In this paper, we propose an approach to obtain reduced-order models of Markov chains. Our approach is composed of two information-theoretic processes. The first is a means of comparing pairs of stationary chains on different state spaces, which is done via the negative Kullback-Leibler divergence defined on a model joint space. Model reduction is achieved by solving a value-of-information criterion with respect to this divergence. Optimizing the criterion leads to a probabilistic partitioning of the states in the high-order Markov chain. A single free parameter that emerges through the optimization process dictates both the partition uncertainty and the number of state groups. We provide a data-driven means of choosing the `optimal' value of this free parameter, which sidesteps the need to know a priori the number of state groups in an arbitrary chain. Comment: Submitted to Entropy

arXiv.org e-Print Archive

Directory of Open Access Journals

Empirical and Strong Coordination via Soft Covering with Polar Codes

Author: Bloch Matthieu
Chou Remi A.
Kliewer Joerg
Publication date: 06/06/2018

We design polar codes for empirical coordination and strong coordination in two-node networks. Our constructions hinge on the fact that polar codes enable explicit low-complexity schemes for soft covering. We leverage this property to propose explicit and low-complexity coding schemes that achieve the capacity regions of both empirical coordination and strong coordination for sequences of actions taking value in an alphabet of prime cardinality. Our results improve previously known polar coding schemes, which (i) were restricted to uniform distributions and to actions obtained via binary symmetric channels for strong coordination, (ii) required a non-negligible amount of common randomness for empirical coordination, and (iii) assumed that the simulation of discrete memoryless channels could be perfectly implemented. As a by-product of our results, we obtain a polar coding scheme that achieves channel resolvability for an arbitrary discrete memoryless channel whose input alphabet has prime cardinality. Comment: 14 pages, two-column, 5 figures, accepted to IEEE Transactions on Information Theory

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL - Université de Franche-Comté

Hal-Diderot

Information theoretic novelty detection

Author: Anderson
Barnett
Bishop
Eguchi
Fisher
Guido Sanguinetti
Hayton
He
Horton
Markou
Martinez
Maurizio Filippone
Quinn
Roberts
Schölkopf
Singer
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We present a novel approach to online change detection problems when the training sample size is small. The proposed approach is based on estimating the expected information content of a new data point and allows an accurate control of the false positive rate even for small data sets. In the case of the Gaussian distribution, our approach is analytically tractable and closely related to classical statistical tests. We then propose an approximation scheme to extend our approach to the case of the mixture of Gaussians. We extensively evaluate our approach on synthetic data and on three real benchmark data sets. The experimental validation shows that our method maintains a good overall accuracy, but significantly improves the control over the false positive rate.

CiteSeerX

Crossref

Enlighten

White Rose Research Online

The information bottleneck method

Author: Bialek William
Pereira Fernando C.
Tishby Naftali
Publication date: 01/01/1999

We define the relevant information in a signal $x \in X$ as being the information that this signal provides about another signal $y \in Y$. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal $x$ requires more than just predicting $y$, it also requires specifying which features of $X$ play a role in the prediction. We formalize this problem as that of finding a short code for $X$ that preserves the maximum information about $Y$. That is, we squeeze the information that $X$ provides about $Y$ through a `bottleneck' formed by a limited set of codewords $\tilde{X}$. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure $d(x,\tilde{x})$ emerges from the joint statistics of $X$ and $Y$. This approach yields an exact set of self consistent equations for the coding rules $X \to \tilde{X}$ and $\tilde{X} \to Y$. Solutions to these equations can be found by a convergent re-estimation method that generalizes the Blahut-Arimoto algorithm. Our variational principle provides a surprisingly rich framework for discussing a variety of problems in signal processing and learning, as will be described in detail elsewhere.
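
For orientation, the variational problem and the self-consistent equations mentioned in the abstract are commonly written as below. This is a sketch in standard notation: $\tilde{X}$ denotes the codeword variable, $\beta$ is the Lagrange multiplier trading compression against preserved information, and $Z(x,\beta)$ is a normalization term. The symbols $\beta$ and $Z$ do not appear in the abstract itself.

```latex
% Information bottleneck functional, minimized over the encoder p(\tilde{x} \mid x)
\mathcal{L}\left[p(\tilde{x} \mid x)\right] = I(\tilde{X}; X) - \beta\, I(\tilde{X}; Y)

% Self-consistent equations satisfied at a stationary point
p(\tilde{x} \mid x) = \frac{p(\tilde{x})}{Z(x,\beta)}
    \exp\!\left(-\beta\, D_{\mathrm{KL}}\!\left[\,p(y \mid x)\,\middle\|\,p(y \mid \tilde{x})\,\right]\right)

p(\tilde{x}) = \sum_{x} p(x)\, p(\tilde{x} \mid x)

p(y \mid \tilde{x}) = \frac{1}{p(\tilde{x})} \sum_{x} p(y \mid x)\, p(\tilde{x} \mid x)\, p(x)
```

In this form the distortion measure that emerges is $d(x,\tilde{x}) = D_{\mathrm{KL}}[\,p(y \mid x)\,\|\,p(y \mid \tilde{x})\,]$, and iterating the three updates in turn is the re-estimation scheme that generalizes Blahut-Arimoto.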

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server