341 research outputs found

    Voter Model with Time dependent Flip-rates

    Full text link
    We introduce time variation in the flip-rates of the Voter Model. This type of generalisation is relevant to models of ageing in language change, allowing the representation of changes in speakers' learning rates over their lifetime and may be applied to any other similar model in which interaction rates at the microscopic level change with time. The mean time taken to reach consensus varies in a nontrivial way with the rate of change of the flip-rates, varying between bounds given by the mean consensus times for static homogeneous (the original Voter Model) and static heterogeneous flip-rates. By considering the mean time between interactions for each agent, we derive excellent estimates of the mean consensus times and exit probabilities for any time scale of flip-rate variation. The scaling of consensus times with population size on complex networks is correctly predicted, and is as would be expected for the ordinary voter model. Heterogeneity in the initial distribution of opinions has a strong effect, considerably reducing the mean time to consensus, while increasing the probability of survival of the opinion which initially occupies the most slowly changing agents. The mean times to reach consensus for different states are very different. An opinion originally held by the fastest changing agents has a smaller chance to succeed, and takes much longer to do so than an evenly distributed opinion.Comment: 16 pages, 6 figure

    The zero exemplar distance problem

    Full text link
    Given two genomes with duplicate genes, \textsc{Zero Exemplar Distance} is the problem of deciding whether the two genomes can be reduced to the same genome without duplicate genes by deleting all but one copy of each gene in each genome. Blin, Fertin, Sikora, and Vialette recently proved that \textsc{Zero Exemplar Distance} for monochromosomal genomes is NP-hard even if each gene appears at most two times in each genome, thereby settling an important open question on genome rearrangement in the exemplar model. In this paper, we give a very simple alternative proof of this result. We also study the problem \textsc{Zero Exemplar Distance} for multichromosomal genomes without gene order, and prove the analogous result that it is also NP-hard even if each gene appears at most two times in each genome. For the positive direction, we show that both variants of \textsc{Zero Exemplar Distance} admit polynomial-time algorithms if each gene appears exactly once in one genome and at least once in the other genome. In addition, we present a polynomial-time algorithm for the related problem \textsc{Exemplar Longest Common Subsequence} in the special case that each mandatory symbol appears exactly once in one input sequence and at least once in the other input sequence. This answers an open question of Bonizzoni et al. We also show that \textsc{Zero Exemplar Distance} for multichromosomal genomes without gene order is fixed-parameter tractable if the parameter is the maximum number of chromosomes in each genome.Comment: Strengthened and reorganize

    On the PATHGROUPS approach to rapid small phylogeny

    Get PDF
    We present a data structure enabling rapid heuristic solution to the ancestral genome reconstruction problem for given phylogenies under genomic rearrangement metrics. The efficiency of the greedy algorithm is due to fast updating of the structure during run time and a simple priority scheme for choosing the next step. Since accuracy deteriorates for sets of highly divergent genomes, we investigate strategies for improving accuracy and expanding the range of data sets where accurate reconstructions can be expected. This includes a more refined priority system, and a two-step look-ahead, as well as iterative local improvements based on a the median version of the problem, incorporating simulated annealing. We apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny

    A framework for orthology assignment from gene rearrangement data

    Get PDF
    Abstract. Gene rearrangements have successfully been used in phylogenetic reconstruction and comparative genomics, but usually under the assumption that all genomes have the same gene content and that no gene is duplicated. While these assumptions allow one to work with organellar genomes, they are too restrictive when comparing nuclear genomes. The main challenge is how to deal with gene families, specifically, how to identify orthologs. While searching for orthologies is a common task in computational biology, it is usually done using sequence data. We approach that problem using gene rearrangement data, provide an optimization framework in which to phrase the problem, and present some preliminary theoretical results.

    Maximum Parsimony on Phylogenetic networks

    Get PDF
    Abstract Background Phylogenetic networks are generalizations of phylogenetic trees, that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past. Results In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network; and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as Sankoff and Fitch algorithms extend naturally for networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied for any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched with the exhaustively determined optimum parsimony scores. Conclusion The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an in-built cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network.</p

    Limited Lifespan of Fragile Regions in Mammalian Evolution

    Full text link
    An important question in genome evolution is whether there exist fragile regions (rearrangement hotspots) where chromosomal rearrangements are happening over and over again. Although nearly all recent studies supported the existence of fragile regions in mammalian genomes, the most comprehensive phylogenomic study of mammals (Ma et al. (2006) Genome Research 16, 1557-1565) raised some doubts about their existence. We demonstrate that fragile regions are subject to a "birth and death" process, implying that fragility has limited evolutionary lifespan. This finding implies that fragile regions migrate to different locations in different mammals, explaining why there exist only a few chromosomal breakpoints shared between different lineages. The birth and death of fragile regions phenomenon reinforces the hypothesis that rearrangements are promoted by matching segmental duplications and suggests putative locations of the currently active fragile regions in the human genome

    Modeling the evolution space of breakage fusion bridge cycles with a stochastic folding process

    Get PDF
    Breakage-Fusion-Bridge cycles in cancer arise when a broken segment of DNA is duplicated and an end from each copy joined together. This structure then 'unfolds' into a new piece of palindromic DNA. This is one mechanism responsible for the localised amplicons observed in cancer genome data. The process has parallels with paper folding sequences that arise when a piece of paper is folded several times and then unfolded. Here we adapt such methods to study the breakage-fusion-bridge structures in detail. We firstly consider discrete representations of this space with 2-d trees to demonstrate that there are 2^(n(n-1)/2) qualitatively distinct evolutions involving n breakage-fusion-bridge cycles. Secondly we consider the stochastic nature of the fold positions, to determine evolution likelihoods, and also describe how amplicons become localised. Finally we highlight these methods by inferring the evolution of breakage-fusion-bridge cycles with data from primary tissue cancer samples

    A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.</p> <p>Results</p> <p>In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.</p> <p>Conclusion</p> <p>The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.</p
    corecore