Information theoretic properties of Markov Random Fields, and their algorithmic applications
Markov random fields are a popular model for high-dimensional probability distributions. Over the years, many mathematical, statistical and algorithmic problems on them have been studied. Until recently, the only known algorithms for provably learning them relied on exhaustive search, correlation decay or various incoherence assumptions. Bresler [1] gave an algorithm for learning general Ising models on bounded degree graphs. His approach was based on a structural result about mutual information in Ising models. Here we take a more conceptual approach to proving lower bounds on the mutual information. Our proof generalizes well beyond Ising models, to arbitrary Markov random fields with higher order interactions. As an application, we obtain algorithms for learning Markov random fields on bounded degree graphs on n nodes with r-order interactions in n^r time and log n sample complexity. Our algorithms also extend to various partial observation models.
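A minimal sketch of the mutual-information idea behind such structure learning, not the paper's actual procedure: estimate pairwise and conditional mutual information from samples and keep an edge only when no single conditioning node explains the dependence away. The threshold, the 3-node chain example, and the restriction to single-node conditioning sets are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for discrete samples x, y."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    pairs, counts = np.unique(np.column_stack([x, y]), axis=0, return_counts=True)
    px = {v: c / n for v, c in zip(*np.unique(x, return_counts=True))}
    py = {v: c / n for v, c in zip(*np.unique(y, return_counts=True))}
    mi = 0.0
    for (a, b), c in zip(pairs, counts):
        p_ab = c / n
        mi += p_ab * np.log(p_ab / (px[a] * py[b]))
    return mi

def cond_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z)."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    cmi = 0.0
    for zv in np.unique(z):
        mask = z == zv
        cmi += mask.mean() * mutual_information(x[mask], y[mask])
    return cmi

def learn_edges(samples, threshold=0.02):
    """Keep edge (i, j) only if no single conditioning node kills the dependence."""
    n_nodes = samples.shape[1]
    edges = []
    for i, j in combinations(range(n_nodes), 2):
        others = [k for k in range(n_nodes) if k not in (i, j)]
        scores = [cond_mutual_information(samples[:, i], samples[:, j], samples[:, k])
                  for k in others] or [mutual_information(samples[:, i], samples[:, j])]
        if min(scores) > threshold:
            edges.append((i, j))
    return edges

# Toy usage on a 3-node chain: nodes 0-1 and 1-2 interact, 0-2 do not.
rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, size=5000)
x1 = np.where(rng.random(5000) < 0.9, x0, 1 - x0)
x2 = np.where(rng.random(5000) < 0.9, x1, 1 - x1)
print(learn_edges(np.column_stack([x0, x1, x2])))   # expected: [(0, 1), (1, 2)]
```

Naive pairwise thresholding alone would also connect 0 and 2 in this chain; the conditioning step is what prunes it, which is where mutual-information lower bounds of the kind studied in the paper become relevant.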
Information-theoretic inference of common ancestors
A directed acyclic graph (DAG) partially represents the conditional
independence structure among observations of a system if the local Markov
condition holds, that is, if every variable is independent of its
non-descendants given its parents. In general, there is a whole class of DAGs
that represents a given set of conditional independence relations. We are
interested in properties of this class that can be derived from observations of
a subsystem only. To this end, we prove an information theoretic inequality
that allows for the inference of common ancestors of observed parts in any DAG
representing some unknown larger system. More explicitly, we show that a large
amount of dependence in terms of mutual information among the observations
implies the existence of a common ancestor that distributes this information.
Within the causal interpretation of DAGs, our result can be seen as a
quantitative extension of Reichenbach's Principle of Common Cause to more than
two variables. Our conclusions are valid also for non-probabilistic
observations such as binary strings, since we state the proof for an
axiomatized notion of mutual information that includes the stochastic as well
as the algorithmic version.
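A toy sketch of the qualitative statement, not the paper's inequality: observations that are noisy copies of a hidden ancestor share a large amount of information, measured here by the total correlation C(X_1,...,X_n) = sum_i H(X_i) - H(X_1,...,X_n). The generative model, sample size, and the choice of total correlation as the dependence measure are assumptions made for illustration.

```python
import numpy as np

def entropy(columns):
    """Plug-in joint entropy in bits of one or more discrete columns."""
    arr = np.column_stack(columns)
    _, counts = np.unique(arr, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def total_correlation(columns):
    """Total correlation: sum of marginal entropies minus the joint entropy."""
    return sum(entropy([c]) for c in columns) - entropy(columns)

rng = np.random.default_rng(1)
n = 20000

# Scenario A: three independent fair coins -- no dependence to explain.
independent = [rng.integers(0, 2, n) for _ in range(3)]

# Scenario B: noisy copies of a hidden ancestor Z that distributes its information.
z = rng.integers(0, 2, n)
copies = [np.where(rng.random(n) < 0.9, z, 1 - z) for _ in range(3)]

print("total correlation, independent:", total_correlation(independent))
print("total correlation, common ancestor:", total_correlation(copies))
```

The second value is far from zero, which is the kind of strong multivariate dependence that, by the paper's result, can only be accounted for by an ancestor common to the observed variables.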
Bits from Biology for Computational Intelligence
Computational intelligence is broadly defined as biologically-inspired
computing. Usually, inspiration is drawn from neural systems. This article
shows how to analyze neural systems using information theory to obtain
constraints that help identify the algorithms run by such systems and the
information they represent. Algorithms and representations identified
information-theoretically may then guide the design of biologically inspired
computing systems (BICS). The material covered includes the necessary
introduction to information theory and the estimation of information theoretic
quantities from neural data. We then show how to analyze the information
encoded in a system about its environment, and also discuss recent
methodological developments on the question of how much information each agent
carries about the environment uniquely, redundantly, or synergistically
together with other agents. Finally, we introduce the framework of local
information dynamics, where information processing is decomposed into component
processes of information storage, transfer, and modification -- locally in
space and time. We close by discussing example applications of these measures
to neural data and other complex systems.
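A minimal sketch of one such local quantity, pointwise transfer entropy from a source Y to a target X with history length 1, estimated by plug-in probabilities for binary series; dedicated toolkits such as JIDT provide far more careful estimators. The coupled time series and parameters below are illustrative assumptions.

```python
import numpy as np
from collections import Counter

def local_transfer_entropy(x, y):
    """Local TE values log2 p(x_t | x_{t-1}, y_{t-1}) / p(x_t | x_{t-1})."""
    x, y = np.asarray(x), np.asarray(y)
    triples = list(zip(x[1:], x[:-1], y[:-1]))   # (x_t, x_{t-1}, y_{t-1})
    c_xyz = Counter(triples)
    c_yz = Counter(zip(x[:-1], y[:-1]))          # (x_{t-1}, y_{t-1})
    c_xz = Counter(zip(x[1:], x[:-1]))           # (x_t, x_{t-1})
    c_z = Counter(x[:-1])                        # x_{t-1}
    local = []
    for xt, xp, yp in triples:
        p_with_source = c_xyz[(xt, xp, yp)] / c_yz[(xp, yp)]
        p_without_source = c_xz[(xt, xp)] / c_z[xp]
        local.append(np.log2(p_with_source / p_without_source))
    return np.array(local)

# Toy usage: X copies Y with a one-step delay, so Y transfers information to X.
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 10000)
x = np.empty_like(y)
x[0] = 0
x[1:] = np.where(rng.random(9999) < 0.85, y[:-1], 1 - y[:-1])
local_te = local_transfer_entropy(x, y)
print("average transfer entropy (bits):", local_te.mean())
```

Averaging the local values recovers the usual transfer entropy, while the per-timestep values are what the local information dynamics framework inspects to see where and when information is transferred.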