
    Co-evolution is Incompatible with the Markov Assumption in Phylogenetics

    Markov models are extensively used in the analysis of molecular evolution. A recent line of research suggests that pairs of proteins with functional and physical interactions co-evolve with each other. Here, by analyzing hundreds of orthologous sets of three fungi and their co-evolutionary relations, we demonstrate that the co-evolutionary assumption may violate the Markov assumption. Our results encourage developing alternative probabilistic models for cases of extreme co-evolution.

    More Taxa Are Not Necessarily Better for the Reconstruction of Ancestral Character States

    We show that the accuracy of reconstructing an ancestral state is not an increasing function of the size of taxon sampling. Comment: 21 pages

    Efficient algorithms for reconstructing gene content by co-evolution

    Background: In a previous study we demonstrated that co-evolutionary information can be utilized for improving the accuracy of ancestral gene content reconstruction. To this end, we defined a new computational problem, the Ancestral Co-Evolutionary (ACE) problem, and developed algorithms for solving it. Results: In the current paper we generalize our previous study in various ways. First, we describe new efficient computational approaches for solving the ACE problem. The new approaches are based on reductions to classical methods such as linear programming relaxation, quadratic programming, and min-cut. Second, we report new computational hardness results related to the ACE problem, including practical cases where it can be solved in polynomial time. Third, we generalize the ACE problem and demonstrate how our approach can be used for inferring parts of the genomes of non-ancestral organisms. To this end, we describe a heuristic for finding the portion of the genome (the 'dominant set') that can be used to reconstruct the rest of the genome with the lowest error rate. This heuristic utilizes both evolutionary information and co-evolutionary information. We applied these algorithms to a large input of the ACE problem (95 unicellular organisms, 4,873 protein families, and 10,576 co-evolutionary relations), demonstrating that some of these algorithms can outperform the algorithm used in our previous study. In addition, we show that, based on our approach, a 'dominant set' can be used to reconstruct a major fraction of a genome (up to 79%) with a relatively low error rate (e.g. 0.11). We find that the 'dominant set' tends to include metabolic and regulatory genes with a high evolutionary rate, low protein abundance, and a low number of protein-protein interactions. Conclusions: The ACE problem can be efficiently extended for inferring the genomes of organisms that exist today. In addition, it may be solved in polynomial time in many practical cases. Metabolic and regulatory genes were found to be the most important groups of genes necessary for reconstructing the gene content of an organism based on other related genomes.