Information-Preserving Markov Aggregation
We present a sufficient condition for a non-injective function of a Markov
chain to be a second-order Markov chain with the same entropy rate as the
original chain. This permits an information-preserving state space reduction by
merging states or, equivalently, lossless compression of a Markov source on a
sample-by-sample basis. The cardinality of the reduced state space is bounded
from below by the node degrees of the transition graph associated with the
original Markov chain.
We also present an algorithm listing all possible information-preserving
state space reductions, for a given transition graph. We illustrate our results
by applying the algorithm to a bi-gram letter model of an English text.
Comment: 7 pages, 3 figures, 2 tables
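The cost of a non-preserving merge can be made concrete with a small numerical sketch. The Python snippet below is illustrative only: it computes entropy rates and a stationary-weighted lumping of a transition matrix, not the paper's sufficient condition or its enumeration algorithm; the example matrix is hypothetical.

```python
import numpy as np

def stationary(P):
    # Left eigenvector of P for eigenvalue 1, normalized to a distribution.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    return pi / pi.sum()

def entropy_rate(P):
    # H = -sum_i pi_i sum_j P_ij log2 P_ij  (bits per symbol)
    pi = stationary(P)
    terms = np.where(P > 0, P * np.log2(np.where(P > 0, P, 1.0)), 0.0)
    return float(-(pi @ terms.sum(axis=1)))

def lump(P, partition):
    # Merge states according to `partition` (a list of lists of state
    # indices), weighting rows by the stationary mass within each block.
    pi = stationary(P)
    k = len(partition)
    Q = np.zeros((k, k))
    for a, A in enumerate(partition):
        wA = pi[A] / pi[A].sum()
        for b, B in enumerate(partition):
            Q[a, b] = wA @ P[np.ix_(A, B)].sum(axis=1)
    return Q
```

For the i.i.d. chain with every row equal to (1/2, 1/4, 1/4), merging the two 1/4-states drops the entropy rate from 1.5 to 1 bit per symbol; an information-preserving reduction is precisely a merge for which no such drop occurs.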
Approximately Sampling Elements with Fixed Rank in Graded Posets
Graded posets frequently arise throughout combinatorics, where it is natural
to try to count the number of elements of a fixed rank. These counting problems
are often #P-complete, so we consider approximation algorithms for
counting and uniform sampling. We show that for certain classes of posets,
biased Markov chains that walk along edges of their Hasse diagrams allow us to
approximately generate samples with any fixed rank in expected polynomial time.
Our arguments do not rely on the typical proofs of log-concavity, which are
used to construct a stationary distribution with a specific mode in order to
give a lower bound on the probability of outputting an element of the desired
rank. Instead, we infer this directly from bounds on the mixing time of the
chains through a new method.
A noteworthy application of our method is sampling restricted classes of
integer partitions of n. We give the first provably efficient Markov chain
algorithm to uniformly sample integer partitions of n from general restricted
classes. Several observations allow us to reduce the space required by this
chain and, for unrestricted integer partitions, its expected running time.
Related applications include sampling permutations with a fixed number of
inversions and lozenge tilings on the triangular lattice with a fixed average
height.
Comment: 23 pages, 12 figures
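As a toy version of the biased-walk idea (not the paper's chains or its restricted partition classes), one can sample a fixed-rank element of the Boolean lattice, i.e. subsets of {0, ..., n-1} graded by size, with a Metropolis walk along Hasse edges; the bias lam places the mode of the rank distribution near the target rank k, and by permutation symmetry the first rank-k state visited is uniform over k-subsets. All names here are illustrative.

```python
import random

def sample_fixed_rank(n, k, rng=None):
    """Return a uniform k-subset of {0, ..., n-1} via a biased Hasse walk.

    Each step proposes toggling one element (a Hasse-diagram edge) and
    accepts with Metropolis probabilities for the stationary weight
    lam ** |S|; the walk stops at the first state of rank k.
    """
    rng = rng or random.Random(0)
    lam = k / (n - k) if k < n else 1.0   # puts the mode of |S| near k
    S = set()
    while len(S) != k:
        i = rng.randrange(n)
        if i in S:
            # Removing i lowers the rank; accept w.p. min(1, 1/lam).
            if rng.random() < min(1.0, 1.0 / lam):
                S.discard(i)
        else:
            # Adding i raises the rank; accept w.p. min(1, lam).
            if rng.random() < min(1.0, lam):
                S.add(i)
    return frozenset(S)
```

The abstract's contribution is proving that walks of this flavor mix in expected polynomial time on much richer graded posets, where uniformity at a fixed rank is not available by symmetry.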
Stochastic kinetics of viral capsid assembly based on detailed protein structures
We present a generic computational framework for the simulation of viral
capsid assembly which is quantitative and specific. Starting from PDB files
containing atomic coordinates, the algorithm builds a coarse grained
description of protein oligomers based on graph rigidity. These reduced protein
descriptions are used in an extended Gillespie algorithm to investigate the
stochastic kinetics of the assembly process. The association rates are obtained
from a diffusive Smoluchowski equation for rapid coagulation, modified to
account for water shielding and protein structure. The dissociation rates are
derived by interpreting the splitting of oligomers as a process of graph
partitioning akin to the escape from a multidimensional well. This modular
framework is quantitative yet computationally tractable, with a small number of
physically motivated parameters. The methodology is illustrated using two
different viruses which are shown to follow quantitatively different assembly
pathways. We also show how in this model the quasi-stationary kinetics of
assembly can be described as a Markovian cascading process in which only a few
intermediates and a small proportion of pathways are present. The observed
pathways and intermediates can be related a posteriori to structural and
energetic properties of the capsid oligomers.
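The Gillespie step can be illustrated with a toy assembly reaction. The sketch below uses hypothetical rates and a single dimerization step 2A <-> A2 rather than the paper's structure-derived rate model; it shows the standard SSA loop into which the Smoluchowski association rates and graph-partitioning dissociation rates would be plugged.

```python
import random

def gillespie_dimer(nA, nD, k_on, k_off, t_max, rng=None):
    # Minimal Gillespie SSA for the toy assembly step A + A <-> A2.
    # Propensities: a1 = k_on * nA * (nA - 1) / 2   (association)
    #               a2 = k_off * nD                  (dissociation)
    rng = rng or random.Random(0)
    t, traj = 0.0, [(0.0, nA, nD)]
    while t < t_max:
        a1 = k_on * nA * (nA - 1) / 2.0
        a2 = k_off * nD
        a0 = a1 + a2
        if a0 == 0:
            break                          # no reaction can fire
        t += rng.expovariate(a0)           # waiting time to the next event
        if rng.random() * a0 < a1:         # pick a reaction w.p. a_i / a0
            nA, nD = nA - 2, nD + 1
        else:
            nA, nD = nA + 2, nD - 1
        traj.append((t, nA, nD))
    return traj
```

In the paper's framework the state space is far larger (all rigidity-derived oligomer species), but each step has this same propensity-draw-update structure.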
Ergodic Control and Polyhedral approaches to PageRank Optimization
We study a general class of PageRank optimization problems which consist in
finding an optimal outlink strategy for a web site subject to design
constraints. We consider both a continuous problem, in which one can choose
the intensity of a link, and a discrete one, in which each page has
obligatory links, facultative links, and forbidden links. We show that the
continuous problem, as well as its discrete variant when there are no
constraints coupling different pages, can both be modeled by constrained Markov
decision processes with ergodic reward, in which the webmaster determines the
transition probabilities of websurfers. Although the number of actions turns
out to be exponential, we show that an associated polytope of transition
measures has a concise representation, from which we deduce that the continuous
problem is solvable in polynomial time, and that the same is true for the
discrete problem when there are no coupling constraints. We also provide
efficient algorithms, adapted to very large networks. Then, we investigate the
qualitative features of optimal outlink strategies, and identify in particular
assumptions under which there exists a "master" page to which all controlled
pages should point. We report numerical results on fragments of the real web
graph.
Comment: 39 pages
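For intuition only, here is a brute-force sketch of how an outlink choice changes a page's PageRank: power iteration on a tiny hypothetical graph, with candidate link sets scored by re-running it. This is the exponential enumeration the paper shows how to avoid via its constrained-MDP and polytope formulation; graph and parameters are illustrative.

```python
import numpy as np

def pagerank(A, alpha=0.85, tol=1e-12):
    # Standard power iteration; A[i, j] = 1 if page i links to page j.
    # Assumes every row has at least one outlink (no dangling pages).
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)   # websurfer transition matrix
    G = alpha * P + (1.0 - alpha) / n      # damped "Google matrix"
    pi = np.full(n, 1.0 / n)
    while True:
        new = pi @ G
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
```

Scoring each admissible outlink set for a controlled page means one `pagerank` call per set; since the number of sets is exponential in the number of facultative links, the point of the paper is that the optimum can nevertheless be found in polynomial time.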
Markov chain aggregation and its applications to combinatorial reaction networks
We consider a continuous-time Markov chain (CTMC) whose state space is
partitioned into aggregates, and each aggregate is assigned a probability
measure. A sufficient condition for defining a CTMC over the aggregates is
presented as a variant of weak lumpability, which also guarantees that the
measure over the original process can be recovered from that of the
aggregated one. We show how the applicability of de-aggregation depends on
the initial distribution. Applications are a major focus of the article: we
illustrate that stochastic rule-based models of biochemical reaction
networks form an important area of use for the tools developed here.
For the rule-based models, the construction of the aggregates and computation
of the distribution over the aggregates are algorithmic. The techniques are
exemplified in three case studies.
Comment: 29 pages, 9 figures, 1 table; Ganguly and Petrov are authors with
equal contribution
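To make the aggregation step concrete, here is a sketch of the simplest special case: strong lumpability of a CTMC generator, where every state in a block has the same total rate into each other block. The paper's condition (a variant of weak lumpability, with probability measures attached to the aggregates) is strictly more general; the function name and example generator below are illustrative.

```python
import numpy as np

def aggregate_generator(Q, partition, tol=1e-9):
    # Check strong lumpability of CTMC generator Q with respect to
    # `partition` (a list of lists of state indices) and return the
    # lumped generator over the aggregates.
    k = len(partition)
    Qhat = np.zeros((k, k))
    for a, A in enumerate(partition):
        for b, B in enumerate(partition):
            # Total rate into block B from each individual state of A.
            rates = Q[np.ix_(A, B)].sum(axis=1)
            if not np.allclose(rates, rates[0], atol=tol):
                raise ValueError("partition is not strongly lumpable")
            Qhat[a, b] = rates[0]
    return Qhat
```

For rule-based biochemical models the aggregates typically collect the combinatorially many molecular species that a rule cannot distinguish, which is why their construction can be made algorithmic.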
Generalized Markov stability of network communities
We address the problem of community detection in networks by introducing a
general definition of Markov stability, based on the difference between the
probability fluxes of a Markov chain on the network at different time scales.
The specific implementation of the quality function and the resulting optimal
community structure thus become dependent both on the type of Markov process
and on the specific Markov times considered. For instance, if we use a natural
Markov chain dynamics and discount its stationary distribution -- that is, we
take as reference process the dynamics at infinite time -- we obtain the
standard formulation of the Markov stability. Notably, the possibility to use
finite-time transition probabilities to define the reference process naturally
allows detecting communities at different resolutions, without the need to
consider a continuous-time Markov chain in the small time limit. The main
advantage of our general formulation of Markov stability based on dynamical
flows is that we work with lumped Markov chains on network partitions, having
the same stationary distribution as the original process. In this way the form
of the quality function becomes invariant under partitioning, leading to a
self-consistent definition of community structures at different aggregation
scales.
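The standard form recovered above can be written down compactly. The sketch below computes r(t) = trace(H^T (Pi P^t - pi pi^T) H) for a hard partition encoded by an indicator matrix H, with the stationary flow pi pi^T as the reference process; as we read the abstract, the generalized definition would replace pi pi^T by the flux at a second Markov time. Names are illustrative.

```python
import numpy as np

def markov_stability(P, partition, t=1):
    # r(t) = trace( H^T (Pi P^t - pi pi^T) H ),  Pi = diag(pi),
    # where pi is the stationary distribution of the row-stochastic P
    # and H[i, c] = 1 iff node i belongs to community c.
    n = P.shape[0]
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    pi = pi / pi.sum()
    H = np.zeros((n, len(partition)))
    for c, block in enumerate(partition):
        H[block, c] = 1.0
    flow = np.diag(pi) @ np.linalg.matrix_power(P, t) - np.outer(pi, pi)
    return float(np.trace(H.T @ flow @ H))
```

Putting all nodes in one community always gives r(t) = 0, since the probability fluxes and the stationary reference both sum to 1; positive values signal partitions that retain flow inside communities at time scale t.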