
    Recombination-aware alignment of diploid individuals

    From the Twelfth Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics, Cold Spring Harbor, NY, USA, 19-22 October 2014. Veli Mäkinen and Daniel Valenzuela. Recombination-aware alignment of diploid individuals. Presented at RECOMB-CG 2014. BMC Genomics, 15(Suppl 6):S15, 2014. Peer reviewed.

    Decoding coalescent hidden Markov models in linear time

    In many areas of computational biology, hidden Markov models (HMMs) have been used to model local genomic features. In particular, coalescent HMMs have been used to infer ancient population sizes, migration rates, divergence times, and other parameters such as mutation and recombination rates. As more loci, sequences, and hidden states are added to the model, however, the runtime of coalescent HMMs can quickly become prohibitive. Here we present a new algorithm for reducing the runtime of coalescent HMMs from quadratic in the number of hidden time states to linear, without making any additional approximations. Our algorithm can be incorporated into various coalescent HMMs, including the popular method PSMC for inferring variable effective population sizes. Here we implement this algorithm to speed up our demographic inference method diCal, which is equivalent to PSMC when applied to a sample of two haplotypes. We demonstrate that the linear-time method can reconstruct a population size change history more accurately than the quadratic-time method, given similar computation resources. We also apply the method to data from the 1000 Genomes project, inferring a high-resolution history of size changes in the European population. Comment: 18 pages, 5 figures. To appear in the Proceedings of the 18th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2014). The final publication is available at link.springer.co
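
    To illustrate where the quadratic cost mentioned in this abstract comes from, the sketch below shows the textbook forward recursion for a generic HMM. It is not the linear-time algorithm of the paper, nor code from PSMC or diCal; the function names and the toy two-state model are assumptions made purely for illustration.

```python
def forward_step(alpha, transition, emission_col):
    """One step of the standard HMM forward recursion.

    alpha[i]         : forward probability of hidden state i at the previous locus
    transition[i][j] : probability of moving from state i to state j
    emission_col[j]  : probability of the current observation given state j
    """
    n = len(alpha)
    new_alpha = [0.0] * n
    for j in range(n):          # n target states ...
        total = 0.0
        for i in range(n):      # ... times n source states: O(n^2) work per locus
            total += alpha[i] * transition[i][j]
        new_alpha[j] = total * emission_col[j]
    return new_alpha


def forward_likelihood(initial, transition, emissions, observations):
    """Likelihood of an observation sequence under a generic HMM (illustrative)."""
    n = len(initial)
    alpha = [initial[j] * emissions[j][observations[0]] for j in range(n)]
    for obs in observations[1:]:
        emission_col = [emissions[j][obs] for j in range(n)]
        alpha = forward_step(alpha, transition, emission_col)
    return sum(alpha)


# Toy two-state model (made-up numbers, for illustration only).
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emissions = [[0.8, 0.2], [0.3, 0.7]]
likelihood = forward_likelihood(initial, transition, emissions, [0, 1])
```

    The nested loop over source and target states is the step whose cost grows quadratically with the number of hidden states; the abstract reports reducing this dependence to linear for coalescent HMMs.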

    Developing a scoring function for NMR structure-based assignments using machine learning

    Determining the assignment of signals received from the experiments (peaks) to specific nuclei of the target molecule in Nuclear Magnetic Resonance (NMR) spectroscopy is an important challenge. Nuclear Vector Replacement (NVR) ([2, 3]) is a framework for structure-based assignments which combines multiple types of NMR data such as chemical shifts, residual dipolar couplings (RDCs), and NOEs. NVR-BIP [1] is a tool which utilizes a scoring function with a binary integer programming (BIP) model to perform the assignments. In this paper, support vector machines (SVM) and boosting are employed to combine the terms in NVR-BIP's scoring function by viewing the assignment as a classification problem. The assignment accuracies obtained using this approach show that boosting improves the assignment accuracy of NVR-BIP on our data set when RDCs are not available and outperforms SVMs. With RDCs, boosting and SVMs offer mixed results.

    Drawing explicit phylogenetic networks and their integration into SplitsTree

    Background: SplitsTree provides a framework for the calculation of phylogenetic trees and networks. It contains a wide variety of methods for the import/export, calculation and visualization of phylogenetic information. The software is developed in Java and implements a command line tool as well as a graphical user interface. Results: In this article, we present solutions to two important problems in the field of phylogenetic networks. The first problem is the visualization of explicit phylogenetic networks. To solve this, we present a modified version of the equal angle algorithm that naturally integrates reticulations into the layout process and thus leads to an appealing visualization of these networks. The second problem is the availability of explicit phylogenetic network methods for the general user. To advance the usage of explicit phylogenetic networks by biologists further, we present an extension to the SplitsTree framework that integrates these networks. By addressing these two problems, SplitsTree is among the first programs that incorporate implicit and explicit network methods together with standard phylogenetic tree methods in a graphical user interface environment. Conclusion: In this article, we presented an extension of SplitsTree 4 that incorporates explicit phylogenetic networks. The extension provides a set of core classes to handle explicit phylogenetic networks and a visualization of these networks.
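
    For readers unfamiliar with the equal angle algorithm named in this abstract, the following is a minimal sketch of the classic version for plain trees: each subtree receives an angular wedge proportional to its number of leaves, and each edge is drawn along the middle of its wedge. It does not handle reticulations and is not the modified algorithm or the SplitsTree code described above; the adjacency-dict input and the function names are assumptions for illustration.

```python
import math


def count_leaves(adj, node, parent):
    """Number of leaves in the subtree below `node` when entered from `parent`."""
    children = [c for c in adj[node] if c != parent]
    if not children:
        return 1
    return sum(count_leaves(adj, c, node) for c in children)


def equal_angle_layout(adj, root, edge_length=1.0):
    """Return {node: (x, y)} for a tree given as an adjacency dict.

    Hypothetical helper: `root` should be an internal node; it is placed
    at the origin and every edge gets the same length.
    """
    total = sum(count_leaves(adj, c, root) for c in adj[root])
    coords = {root: (0.0, 0.0)}

    def place(node, parent, start):
        x, y = coords[node]
        angle = start
        for child in adj[node]:
            if child == parent:
                continue
            # Wedge width is proportional to the number of leaves below the child.
            width = 2 * math.pi * count_leaves(adj, child, node) / total
            mid = angle + width / 2  # draw the edge along the middle of the wedge
            coords[child] = (x + edge_length * math.cos(mid),
                             y + edge_length * math.sin(mid))
            place(child, node, angle)
            angle += width

    place(root, None, 0.0)
    return coords


# Toy quartet tree ((a,b),(c,d)) with internal nodes u and v.
quartet = {
    "u": ["a", "b", "v"], "v": ["u", "c", "d"],
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
}
layout = equal_angle_layout(quartet, "u")
```

    Recomputing leaf counts at every node keeps the sketch short; a practical implementation would cache them.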

    Domain-oriented edge-based alignment of protein interaction networks

    Motivation: Recent advances in high-throughput experimental techniques have yielded a large amount of data on protein–protein interactions (PPIs). Since these interactions can be organized into networks, and since separate PPI networks can be constructed for different species, a natural research direction is the comparative analysis of such networks across species in order to detect conserved functional modules. This is the task of network alignment.

    A Survey of Combinatorial Methods for Phylogenetic Networks

    The evolutionary history of a set of species is usually described by a rooted phylogenetic tree. Although it is generally undisputed that bifurcating speciation events and descent with modifications are major forces of evolution, there is a growing belief that reticulate events also have a role to play. Phylogenetic networks provide an alternative to phylogenetic trees and may be more suitable for data sets where evolution involves significant amounts of reticulate events, such as hybridization, horizontal gene transfer, or recombination. In this article, we give an introduction to the topic of phylogenetic networks, very briefly describing the fundamental concepts and summarizing some of the most important combinatorial methods that are available for their computation.

    When two trees go to war

    Rooted phylogenetic networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some well-defined sense simultaneously represents them all. We review these four models and investigate how they are related. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary trees, we show that the minimum number of reticulations is independent of the model. The numbers of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when the level of the constructed network, rather than the number of reticulations, is minimised. We use these unification results to settle several complexity questions that have been open in the field for some time. We also give explicit examples to show that, already for data obtained from three binary trees, the models begin to diverge.