5,842 research outputs found
Optimal Kullback-Leibler Aggregation via Information Bottleneck
In this paper, we present a method for reducing a regular, discrete-time
Markov chain (DTMC) to another DTMC with a given, typically much smaller number
of states. The cost of reduction is defined as the Kullback-Leibler divergence
rate between a projection of the original process through a partition function
and a DTMC on the correspondingly partitioned state space. Finding the reduced
model with minimal cost is computationally expensive, as it requires an
exhaustive search among all state space partitions, and an exact evaluation of
the reduction cost for each candidate partition. Our approach deals with the
latter problem by minimizing an upper bound on the reduction cost instead of
minimizing the exact cost. The proposed upper bound is easy to compute, and it
is tight if the original chain is lumpable with respect to the partition. Then,
we express the problem in the form of information bottleneck optimization, and
propose using the agglomerative information bottleneck algorithm to search for
a suboptimal partition greedily, rather than exhaustively. The theory is
illustrated with examples and one application scenario in the context of
modeling bio-molecular interactions. Comment: 13 pages, 4 figures
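The abstract notes that the bound is tight when the chain is lumpable with respect to the partition. As a minimal sketch of what that condition means (the standard strong-lumpability test, not the paper's aggregation method), the following checks whether every state in a block has the same total probability of jumping into each block. The function names, the example matrix `P`, and the tolerance are illustrative assumptions.

```python
def block_sums(P, partition):
    """For each state i, total probability of jumping into each block."""
    return [[sum(P[i][j] for j in block) for block in partition]
            for i in range(len(P))]


def is_lumpable(P, partition, tol=1e-9):
    """Strong lumpability: all states in a block share the same block sums."""
    sums = block_sums(P, partition)
    for block in partition:
        reference = sums[block[0]]
        for i in block[1:]:
            if any(abs(a - b) > tol for a, b in zip(sums[i], reference)):
                return False
    return True


def lumped_matrix(P, partition):
    """Transition matrix on the blocks (meaningful only if the chain is lumpable)."""
    sums = block_sums(P, partition)
    return [sums[block[0]] for block in partition]


# Hypothetical 3-state chain; states 1 and 2 behave alike at the block level.
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]
good = [[0], [1, 2]]
bad = [[0, 1], [2]]

print(is_lumpable(P, good), is_lumpable(P, bad))
print(lumped_matrix(P, good))
```

For the partition `good`, states 1 and 2 both move to block `[0]` with probability 0.2, so the projected process is again a Markov chain on two states; for `bad`, states 0 and 1 disagree and the projection is not Markov in general.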
The information bottleneck method
We define the relevant information in a signal $x \in X$ as being the
information that this signal provides about another signal $y \in Y$. Examples
include the information that face images provide about the names of the people
portrayed, or the information that speech sounds provide about the words
spoken. Understanding the signal $x$ requires more than just predicting $y$; it
also requires specifying which features of $X$ play a role in the prediction.
We formalize this problem as that of finding a short code for $X$ that
preserves the maximum information about $Y$. That is, we squeeze the
information that $X$ provides about $Y$ through a `bottleneck' formed by a
limited set of codewords $\tilde{X}$. This constrained optimization problem can be
seen as a generalization of rate distortion theory in which the distortion
measure $d(x, \tilde{x})$ emerges from the joint statistics of $X$ and $Y$. This
approach yields an exact set of self-consistent equations for the coding rules
$X \to \tilde{X}$ and $\tilde{X} \to Y$. Solutions to these equations can be found by a
convergent re-estimation method that generalizes the Blahut-Arimoto algorithm.
Our variational principle provides a surprisingly rich framework for discussing
a variety of problems in signal processing and learning, as will be described
in detail elsewhere.
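The abstract does not state the variational principle or the self-consistent equations. For orientation, they are commonly written as below, with $\tilde{X}$ the codeword variable; the Lagrange multiplier $\beta$ and the normalizer $Z(x,\beta)$ are notation assumed here, not taken from this listing.

```latex
\begin{align}
\mathcal{L}\big[p(\tilde{x}\mid x)\big] &= I(X;\tilde{X}) - \beta\, I(\tilde{X};Y), \\
p(\tilde{x}\mid x) &= \frac{p(\tilde{x})}{Z(x,\beta)}
  \exp\!\Big(-\beta\, D_{\mathrm{KL}}\big[\,p(y\mid x)\,\big\|\,p(y\mid \tilde{x})\,\big]\Big), \\
p(\tilde{x}) &= \sum_{x} p(x)\, p(\tilde{x}\mid x), \\
p(y\mid \tilde{x}) &= \frac{1}{p(\tilde{x})} \sum_{x} p(y\mid x)\, p(\tilde{x}\mid x)\, p(x).
\end{align}
```

Iterating the last three equations in turn is the re-estimation scheme the abstract refers to; the Kullback-Leibler term plays the role of the distortion measure that emerges from the joint statistics.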
Information-Distilling Quantizers
Let $X$ and $Y$ be dependent random variables. This paper considers the
problem of designing a scalar quantizer for $Y$ to maximize the mutual
information between the quantizer's output and $X$, and develops fundamental
properties and bounds for this form of quantization, which is connected to the
log-loss distortion criterion. The main focus is the regime of low $I(X;Y)$,
where it is shown that, if $X$ is binary, a constant fraction of the mutual
information can always be preserved using $\mathcal{O}(\log(1/I(X;Y)))$
quantization levels, and there exist distributions for which this many
quantization levels are necessary. Furthermore, for larger finite alphabets $2 < |\mathcal{X}| < \infty$, it is established that an $\eta$-fraction of the
mutual information can be preserved using roughly $(\log(|\mathcal{X}|/I(X;Y)))^{\eta\cdot(|\mathcal{X}|-1)}$ quantization levels.
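To make the quantity being optimized concrete, here is a small sketch that measures how much mutual information survives a given quantizer. It writes X for the variable of interest and Y for the observation being quantized (labels assumed here), and the joint distribution, the cell assignment, and the function names are made up for illustration; it evaluates one fixed quantizer and does not search for the best one.

```python
from math import log2


def mutual_information(joint):
    """Mutual information in bits of a joint pmf given as joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (pa[a] * pb[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )


def quantize(joint, cells, levels):
    """Merge columns of joint[x][y] so that value y maps to level cells[y]."""
    merged = [[0.0] * levels for _ in joint]
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            merged[a][cells[b]] += p
    return merged


# Hypothetical joint pmf: binary X (rows), four-valued Y (columns).
joint = [
    [0.30, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.30],
]
cells = [0, 0, 1, 1]  # two-level quantizer: {y0, y1} -> 0, {y2, y3} -> 1

i_full = mutual_information(joint)
i_quantized = mutual_information(quantize(joint, cells, levels=2))
print(f"fraction preserved: {i_quantized / i_full:.3f}")
```

By the data processing inequality the quantized value can never exceed the unquantized one, so the printed fraction lies between 0 and 1.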
A multistage linear array assignment problem
The implementation of certain algorithms on parallel processing computing architectures can involve partitioning contiguous elements into a fixed number of groups, each of which is to be handled by a single processor. It is desired to find an assignment of elements to processors that minimizes the sum of the maximum workloads experienced at each stage. This problem can be viewed as a multi-objective network optimization problem. Polynomially-bounded algorithms are developed for the case of two stages, whereas the associated decision problem (for an arbitrary number of stages) is shown to be NP-complete. Heuristic procedures are therefore proposed and analyzed for the general problem. Computational experience with one of the exact algorithms, incorporating certain pruning rules, is presented. Empirical results also demonstrate that one of the heuristic procedures is especially effective in practice.
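The objective described above can be sketched with a brute-force search, assuming each stage assigns a known workload to every element and that the same contiguous partition is used at all stages (both assumptions about the data, which the abstract does not spell out). This enumerates every placement of cut points, so it is exponential in general and is not one of the paper's algorithms.

```python
from itertools import combinations


def cost(workloads, cuts):
    """Sum over stages of the largest group workload for one contiguous partition."""
    n = len(workloads[0])
    bounds = [0, *cuts, n]
    return sum(
        max(sum(stage[a:b]) for a, b in zip(bounds, bounds[1:]))
        for stage in workloads
    )


def best_partition(workloads, groups):
    """Try every way of placing groups - 1 cut points between the elements."""
    n = len(workloads[0])
    return min(
        (cost(workloads, cuts), cuts)
        for cuts in combinations(range(1, n), groups - 1)
    )


# Hypothetical instance: 2 stages, 4 elements, 2 processors.
workloads = [
    [4, 1, 1, 4],  # stage 1 workload per element
    [1, 4, 4, 1],  # stage 2 workload per element
]
print(best_partition(workloads, groups=2))  # (10, (2,))
```

Cutting after the second element gives a maximum load of 5 in each stage, for a total of 10; either of the other two cuts gives 6 + 9 = 15.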
Speeding up Martins' algorithm for multiple objective shortest path problems
The latest transportation systems must compute the best routes in a large network with respect to multiple objectives simultaneously, and in a very short time. The label setting algorithm of Martins efficiently finds this set of Pareto optimal paths, but tends to be slow for large networks such as transportation networks. In this article we investigate a number of speedup measures, resulting in new algorithms. It is shown that the calculation time to find the Pareto optimal set can be reduced considerably. Moreover, it is mathematically proven that these algorithms still produce the Pareto optimal set of paths.
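Multi-objective label setting rests on a dominance test between cost vectors. The sketch below shows only that test and the resulting Pareto filter, with all objectives minimized; it is not Martins' algorithm or any of the speedups studied in the article, and the sample labels are invented.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(labels):
    """Keep the labels that no other label dominates."""
    return [a for a in labels if not any(dominates(b, a) for b in labels)]


# Hypothetical (travel time, cost) labels for paths reaching the same node.
labels = [(4, 9), (5, 5), (6, 6), (9, 2), (9, 3)]
print(pareto_front(labels))  # [(4, 9), (5, 5), (9, 2)]
```

Here `(6, 6)` is dominated by `(5, 5)` and `(9, 3)` by `(9, 2)`, so three labels remain. The filter compares every pair, which is quadratic in the number of labels.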
Breaking Instance-Independent Symmetries In Exact Graph Coloring
Code optimization and high level synthesis can be posed as constraint
satisfaction and optimization problems, such as graph coloring used in register
allocation. Graph coloring is also used to model more traditional CSPs relevant
to AI, such as planning, time-tabling and scheduling. Provably optimal
solutions may be desirable for commercial and defense applications.
Additionally, for applications such as register allocation and code
optimization, naturally-occurring instances of graph coloring are often small
and can be solved optimally. A recent wave of improvements in algorithms for
Boolean satisfiability (SAT) and 0-1 Integer Linear Programming (ILP) suggests
generic problem-reduction methods, rather than problem-specific heuristics,
because (1) heuristics may be upset by new constraints, (2) heuristics tend to
ignore structure, and (3) many relevant problems are provably inapproximable.
Problem reductions often lead to highly symmetric SAT instances, and
symmetries are known to slow down SAT solvers. In this work, we compare several
avenues for symmetry breaking, in particular when certain kinds of symmetry are
present in all generated instances. Our focus on reducing CSPs to SAT allows us
to leverage recent dramatic improvement in SAT solvers and automatically
benefit from future progress. We can use a variety of black-box SAT solvers
without modifying their source code because our symmetry-breaking techniques
are static, i.e., we detect symmetries and add symmetry breaking predicates
(SBPs) during pre-processing.
An important result of our work is that among the types of
instance-independent SBPs we studied and their combinations, the simplest and
least complete constructions are the most effective. Our experiments also
clearly indicate that instance-independent symmetries should mostly be
processed together with instance-specific symmetries rather than at the
specification level, contrary to what has been suggested in the literature.
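As a rough illustration of the reduction and of a static, instance-independent symmetry-breaking predicate, the sketch below encodes k-coloring as CNF and optionally pins vertex 0 to color 0. That predicate is a deliberately simple example chosen here and is not claimed to be one of the constructions evaluated in the paper. The variable numbering, the function names, and the exhaustive model counter (usable on tiny instances only, standing in for a real SAT solver) are likewise assumptions.

```python
from itertools import product


def coloring_cnf(n, edges, k, break_symmetry=False):
    """CNF clauses (lists of signed ints, DIMACS style) for k-coloring n vertices."""
    def var(v, c):
        return v * k + c + 1  # true iff vertex v gets color c

    clauses = []
    for v in range(n):
        clauses.append([var(v, c) for c in range(k)])  # at least one color
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1), -var(v, c2)])  # at most one color
    for u, v in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])  # endpoints differ
    if break_symmetry and n > 0:
        clauses.append([var(0, 0)])  # colors are interchangeable, so fix one
    return clauses


def count_models(clauses, num_vars):
    """Count satisfying assignments by exhaustive enumeration."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count


triangle = [(0, 1), (1, 2), (0, 2)]
plain = count_models(coloring_cnf(3, triangle, 3), 9)
broken = count_models(coloring_cnf(3, triangle, 3, break_symmetry=True), 9)
print(plain, broken)  # 6 2
```

The triangle has six proper 3-colorings, all equivalent under permuting the colors; pinning one vertex leaves two, and satisfiability is unchanged because any coloring can be relabeled so that vertex 0 receives color 0.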