    Information-Preserving Markov Aggregation

    We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain. This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis. The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain. We also present an algorithm listing all possible information-preserving state space reductions for a given transition graph. We illustrate our results by applying the algorithm to a bi-gram letter model of an English text. Comment: 7 pages, 3 figures, 2 tables
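
    The quantity being preserved in the abstract above is the entropy rate. As a minimal sketch (not the paper's reduction algorithm), the entropy rate of a first-order chain with transition matrix P and stationary distribution pi is H = -sum_i pi_i sum_j P_ij log2 P_ij. The function names and the power-iteration settings below are illustrative assumptions, and the chain is assumed to be irreducible and aperiodic.

```python
import math


def stationary_distribution(P, iterations=10_000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes the chain is irreducible and aperiodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def entropy_rate(P):
    """Entropy rate in bits per sample of a stationary first-order Markov chain.
    Terms with P[i][j] == 0 are skipped, following the convention 0 * log 0 = 0."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )


if __name__ == "__main__":
    fair_coin = [[0.5, 0.5], [0.5, 0.5]]
    print(entropy_rate(fair_coin))
```

    An aggregation is information-preserving in the sense of the abstract when the lumped process has the same entropy rate as the original chain; the sketch only computes the original chain's side of that comparison.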

    Approximately Sampling Elements with Fixed Rank in Graded Posets

    Graded posets frequently arise throughout combinatorics, where it is natural to try to count the number of elements of a fixed rank. These counting problems are often #P-complete, so we consider approximation algorithms for counting and uniform sampling. We show that for certain classes of posets, biased Markov chains that walk along edges of their Hasse diagrams allow us to approximately generate samples with any fixed rank in expected polynomial time. Our arguments do not rely on the typical proofs of log-concavity, which are used to construct a stationary distribution with a specific mode in order to give a lower bound on the probability of outputting an element of the desired rank. Instead, we infer this directly from bounds on the mixing time of the chains through a method we call balanced bias. A noteworthy application of our method is sampling restricted classes of integer partitions of $n$. We give the first provably efficient Markov chain algorithm to uniformly sample integer partitions of $n$ from general restricted classes. Several observations allow us to improve the efficiency of this chain to require $O(n^{1/2}\log(n))$ space, and for unrestricted integer partitions, expected $O(n^{9/4})$ time. Related applications include sampling permutations with a fixed number of inversions and lozenge tilings on the triangular lattice with a fixed average height. Comment: 23 pages, 12 figures
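
    The general idea of a biased walk on a Hasse diagram combined with rejection on rank can be illustrated on a toy poset. The sketch below is not the paper's chain and does not use its balanced bias analysis: it runs a Metropolis walk on the Boolean lattice of subsets of {0, ..., n-1}, graded by size, with stationary weight proportional to lam^|S|, and returns the state once its rank equals k. The function name, the choice lam = k / (n - k), and the burn-in length are assumptions made for illustration, and the sketch requires 0 < k < n.

```python
import random


def sample_fixed_rank_subset(n, k, burn_in=200, seed=0):
    """Approximately uniform subset of size k of range(n), drawn by a biased
    walk on the Boolean lattice followed by rejection on the rank.
    Toy illustration only; requires 0 < k < n."""
    rng = random.Random(seed)
    lam = k / (n - k)  # bias chosen so that rank k is a typical rank
    current = set()
    while True:
        for _ in range(burn_in):
            x = rng.randrange(n)  # propose toggling one element (one Hasse edge)
            if x in current:
                # Moving down one rank multiplies the weight by 1 / lam.
                if rng.random() < min(1.0, 1.0 / lam):
                    current.remove(x)
            else:
                # Moving up one rank multiplies the weight by lam.
                if rng.random() < min(1.0, lam):
                    current.add(x)
        if len(current) == k:
            return frozenset(current)


if __name__ == "__main__":
    print(sorted(sample_fixed_rank_subset(10, 3)))
```

    Because the stationary weight depends only on the rank, all elements of a given rank are equally likely under the stationary distribution, which is why rejecting on rank yields an approximately uniform sample of that rank.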

    Stochastic kinetics of viral capsid assembly based on detailed protein structures

    We present a generic computational framework for the simulation of viral capsid assembly which is quantitative and specific. Starting from PDB files containing atomic coordinates, the algorithm builds a coarse-grained description of protein oligomers based on graph rigidity. These reduced protein descriptions are used in an extended Gillespie algorithm to investigate the stochastic kinetics of the assembly process. The association rates are obtained from a diffusive Smoluchowski equation for rapid coagulation, modified to account for water shielding and protein structure. The dissociation rates are derived by interpreting the splitting of oligomers as a process of graph partitioning akin to the escape from a multidimensional well. This modular framework is quantitative yet computationally tractable, with a small number of physically motivated parameters. The methodology is illustrated using two different viruses which are shown to follow quantitatively different assembly pathways. We also show how in this model the quasi-stationary kinetics of assembly can be described as a Markovian cascading process in which only a few intermediates and a small proportion of pathways are present. The observed pathways and intermediates can be related a posteriori to structural and energetic properties of the capsid oligomers.
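
    For readers unfamiliar with the Gillespie algorithm mentioned above, here is a minimal sketch of the standard (not the extended) version, applied to a hypothetical reversible dimerization A + A <-> A2. The rate constants k_on and k_off are made-up parameters; they are not derived from a Smoluchowski equation or from protein structure as in the paper.

```python
import math
import random


def gillespie_dimerization(n_monomers, k_on, k_off, t_end, seed=0):
    """Standard Gillespie simulation of A + A <-> A2.
    Returns a list of (time, monomers, dimers) tuples."""
    rng = random.Random(seed)
    t, monomers, dimers = 0.0, n_monomers, 0
    trajectory = [(t, monomers, dimers)]
    while True:
        # Propensities: number of monomer pairs times k_on, dimers times k_off.
        a_assoc = k_on * monomers * (monomers - 1) / 2.0
        a_dissoc = k_off * dimers
        a_total = a_assoc + a_dissoc
        if a_total == 0.0:
            break  # no reaction can fire
        # Exponentially distributed waiting time until the next reaction.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_assoc:
            monomers -= 2
            dimers += 1
        else:
            monomers += 2
            dimers -= 1
        trajectory.append((t, monomers, dimers))
    return trajectory


if __name__ == "__main__":
    for time, m, d in gillespie_dimerization(20, 0.1, 1.0, 2.0)[:5]:
        print(round(time, 4), m, d)
```

    A capsid assembly model would track many oligomer species and reaction channels rather than two, but the loop structure (compute propensities, draw a waiting time, pick a channel) stays the same.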

    Ergodic Control and Polyhedral approaches to PageRank Optimization

    We study a general class of PageRank optimization problems which consist of finding an optimal outlink strategy for a web site subject to design constraints. We consider both a continuous problem, in which one can choose the intensity of a link, and a discrete one, in which each page has obligatory links, facultative links and forbidden links. We show that the continuous problem, as well as its discrete variant when there are no constraints coupling different pages, can both be modeled by constrained Markov decision processes with ergodic reward, in which the webmaster determines the transition probabilities of websurfers. Although the number of actions turns out to be exponential, we show that an associated polytope of transition measures has a concise representation, from which we deduce that the continuous problem is solvable in polynomial time, and that the same is true for the discrete problem when there are no coupling constraints. We also provide efficient algorithms, adapted to very large networks. Then, we investigate the qualitative features of optimal outlink strategies, and identify in particular assumptions under which there exists a "master" page to which all controlled pages should point. We report numerical results on fragments of the real web graph. Comment: 39 pages
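
    The objective being optimized above is the PageRank of the controlled pages. As a minimal sketch of how PageRank itself is evaluated for one fixed outlink strategy (not the optimization method of the paper), the power iteration below takes an adjacency list. The damping factor 0.85 and the uniform treatment of dangling pages are common conventions assumed here, not details taken from the abstract.

```python
def pagerank(out_links, damping=0.85, iterations=200):
    """PageRank by power iteration.
    out_links[i] is the list of pages that page i links to."""
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Teleportation term shared by every page.
        new = [(1.0 - damping) / n] * n
        for page, targets in enumerate(out_links):
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                for target in range(n):
                    new[target] += damping * rank[page] / n
        rank = new
    return rank


if __name__ == "__main__":
    # Pages 1 and 2 both point to page 0; page 0 points back to page 1.
    print(pagerank([[1], [0], [0]]))
```

    Comparing the output for two candidate link lists of the same site gives a brute-force way to evaluate outlink strategies on small graphs.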

    Markov chain aggregation and its applications to combinatorial reaction networks

    We consider a continuous-time Markov chain (CTMC) whose state space is partitioned into aggregates, and each aggregate is assigned a probability measure. A sufficient condition for defining a CTMC over the aggregates is presented as a variant of weak lumpability, which also characterizes that the measure over the original process can be recovered from that of the aggregated one. We show how the applicability of de-aggregation depends on the initial distribution. The application section is a major aspect of the article, where we illustrate that the stochastic rule-based models for biochemical reaction networks form an important area of application for the tools developed in the paper. For the rule-based models, the construction of the aggregates and computation of the distribution over the aggregates are algorithmic. The techniques are exemplified in three case studies. Comment: 29 pages, 9 figures, 1 table; Ganguly and Petrov are authors with equal contribution
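
    The abstract above works with a variant of weak lumpability. A simpler, related check is strong (ordinary) lumpability: a partition is strongly lumpable if, for every pair of aggregates, all states in the first aggregate have the same total rate into the second. The sketch below tests that stronger condition on a generator matrix Q and returns the aggregated generator; it is an illustration under that assumption, not the condition from the paper, and the function name is hypothetical.

```python
def lump_if_strongly_lumpable(Q, partition, tol=1e-12):
    """Return the aggregated generator if `partition` is strongly lumpable
    for the CTMC generator Q, otherwise None.
    `partition` is a list of lists of state indices."""
    k = len(partition)
    lumped = [[0.0] * k for _ in range(k)]
    for a, block_a in enumerate(partition):
        for b, block_b in enumerate(partition):
            # Total rate from each state of block_a into block_b.
            rates = [sum(Q[i][j] for j in block_b) for i in block_a]
            if max(rates) - min(rates) > tol:
                return None  # states of block_a disagree, so not lumpable
            lumped[a][b] = rates[0]
    return lumped


if __name__ == "__main__":
    Q = [
        [-2, 1, 1],
        [3, -4, 1],
        [3, 1, -4],
    ]
    print(lump_if_strongly_lumpable(Q, [[0], [1, 2]]))
```

    Weak lumpability relaxes this requirement by allowing the aggregated process to be Markov only for suitable initial distributions, which is where the dependence on the initial distribution mentioned in the abstract comes from.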

    Generalized Markov stability of network communities

    We address the problem of community detection in networks by introducing a general definition of Markov stability, based on the difference between the probability fluxes of a Markov chain on the network at different time scales. The specific implementation of the quality function and the resulting optimal community structure thus become dependent both on the type of Markov process and on the specific Markov times considered. For instance, if we use a natural Markov chain dynamics and discount its stationary distribution (that is, we take as reference process the dynamics at infinite time), we obtain the standard formulation of the Markov stability. Notably, the possibility to use finite-time transition probabilities to define the reference process naturally allows detecting communities at different resolutions, without the need to consider a continuous-time Markov chain in the small time limit. The main advantage of our general formulation of Markov stability based on dynamical flows is that we work with lumped Markov chains on network partitions, having the same stationary distribution as the original process. In this way the form of the quality function becomes invariant under partitioning, leading to a self-consistent definition of community structures at different aggregation scales.
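
    The standard formulation referred to above can be written, for a discrete-time random walk with transition matrix P and stationary distribution pi, as r(t) = sum over communities C of sum_{i,j in C} [pi_i (P^t)_ij - pi_i pi_j]. The sketch below evaluates that standard quantity on an undirected graph given as an adjacency matrix; it does not implement the generalized, finite-time reference process proposed in the abstract. The function name is hypothetical, and the graph is assumed to have no isolated nodes.

```python
def markov_stability(adjacency, communities, t=1):
    """Standard discrete-time Markov stability of a partition at Markov time t
    for the simple random walk on an undirected graph without isolated nodes."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    total = float(sum(degree))
    pi = [d / total for d in degree]  # stationary distribution of the walk
    P = [[adjacency[i][j] / degree[i] for j in range(n)] for i in range(n)]
    # Compute P^t by repeated multiplication, starting from the identity.
    Pt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        Pt = [
            [sum(Pt[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    # Flux retained inside each community minus the flux expected at stationarity.
    return sum(
        pi[i] * Pt[i][j] - pi[i] * pi[j]
        for community in communities
        for i in community
        for j in community
    )


if __name__ == "__main__":
    # Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
    A = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(markov_stability(A, [[0, 1, 2], [3, 4, 5]], t=1))
```

    At t = 1 this expression coincides with Newman-Girvan modularity for an undirected graph, and larger values of t favor coarser partitions.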