3 research outputs found

    Characterization of phylogenetic networks with NetTest

    Background: Typical evolutionary events like recombination, hybridization or gene transfer make necessary the use of phylogenetic networks to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions that will likely not be met by the evolutionary process. We have recently shown that the complexity of simulated networks is a function of the population recombination rate, and that at moderate and large recombination rates the resulting networks cannot be categorized. However, we do not know whether these results extend to networks estimated from real data.
    Results: We introduce a web server for the categorization of explicit phylogenetic networks, including the most relevant theoretical classes developed so far. Using this tool, we analyzed statistical parsimony phylogenetic networks estimated from ~5,000 DNA alignments, obtained from the NCBI PopSet and Polymorphix databases. The level of characterization was correlated to nucleotide diversity, and a high proportion of the networks derived from these data sets could be formally characterized.
    Conclusions: We have developed a public web server, NetTest (freely available from the software section at http://darwin.uvigo.es), to formally characterize the complexity of phylogenetic networks. Using NetTest we found that most statistical parsimony networks estimated with the program TCS could be assigned to a known network class. The level of network characterization was correlated to nucleotide diversity and dependent upon the intra/interspecific levels, although no significant differences were detected among genes. More research on the properties of phylogenetic networks is clearly needed.

    Faster Computation of the Robinson-Foulds Distance between Phylogenetic Networks

    The Robinson-Foulds distance, which is the most widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two networks N_1, N_2 with n leaves, m nodes, and e edges, the Robinson-Foulds distance measures the number of clusters of descendant leaves that are not shared by N_1 and N_2. The fastest known algorithm for computing the Robinson-Foulds distance between those networks runs in O(m(m + e)) time. In this paper, we improve the time complexity to O(n(m + e)/log n) for general networks and O(nm/log n) for general networks with bounded degree, and to optimal O(m + e) time for planar phylogenetic networks and bounded-level phylogenetic networks. We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k phylogenetic network is at most k + 1, which implies that for two level-k phylogenetic networks, our algorithm runs in O((k + 1)(m + e)) time.
    Combinatorial Pattern Matching: 21st Annual Symposium, CPM 2010, New York, NY, USA, June 21-23, 2010
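The cluster-based definition in the abstract above can be illustrated with a small sketch. This is a naive computation and not the faster algorithm the paper describes. It assumes each network is given as a dictionary mapping every node to its list of children (a hypothetical input format), takes the cluster of a node to be the set of leaves reachable from it, and counts clusters present in exactly one network. Whether the published definition halves this count or treats repeated clusters as a multiset is not stated in the abstract, so the plain set-based symmetric difference used here is an assumption.

```python
def clusters(network):
    """Return the set of leaf clusters of a rooted DAG.

    network: dict mapping each node to a list of its children
             (leaves map to an empty list). Hypothetical format.
    """
    memo = {}

    def descend(node):
        # Cluster of a node = set of leaves reachable from it.
        if node not in memo:
            children = network.get(node, [])
            if not children:
                memo[node] = frozenset([node])
            else:
                memo[node] = frozenset().union(*(descend(c) for c in children))
        return memo[node]

    for node in network:
        descend(node)
    return set(memo.values())


def rf_distance(net1, net2):
    """Number of clusters found in exactly one of the two networks."""
    return len(clusters(net1) ^ clusters(net2))


# Network with one reticulation node h (two parents: u and v).
n1 = {"r": ["u", "v"], "u": ["a", "h"], "v": ["h", "c"],
      "h": ["b"], "a": [], "b": [], "c": []}
# Plain tree ((a, b), c) on the same leaves.
n2 = {"r": ["u", "c"], "u": ["a", "b"], "a": [], "b": [], "c": []}

print(rf_distance(n1, n2))
```

In this toy input the two networks differ only in the cluster {b, c}, which the node v contributes to the first network. The memoized traversal visits each edge once but builds a set of up to n leaves per node, which is the kind of cost the paper's improved bounds are aimed at.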

    Knowledge Discovery Models for Product Design, Assembly Planning and Manufacturing System Synthesis

    The variety of products has grown over the last few decades, so the challenge for designers and manufacturers to enhance their design and manufacturing capabilities responsively and cost-effectively is greater than ever. The main objective of this research is to help designers and manufacturers cope with increasing variety management challenges by exploiting the data records of existing or old products, along with appropriate Knowledge Discovery (KD) models, to extract the knowledge embedded in such data and use it to speed up the development of new products. Four product development activities were addressed in this research: product design, product family formation, assembly sequencing and manufacturing system synthesis. The models and methods developed in this dissertation form a package of knowledge-based solutions that can greatly support product designers and manufacturers at various stages of product development and manufacturing planning. For design retrieval, a novel Bill of Materials (BOM) tree matching method was developed, using efficient tree reconciliation algorithms from the biological sciences, to retrieve the closest old design and discover the components and structure it shares with a new product design. As a further application of BOM matching, an enhanced BOM matching method was developed and used for product family formation. For assembly sequencing, a new approach based on the notion of consensus trees used in evolutionary studies was introduced to overcome a critical limitation of individual assembly sequence retrieval methods: they cannot capture assembly sequence data for a new combination of components that has never existed in the same product variant. For manufacturing system synthesis, a novel Integer Programming model was developed to extract association rules between the product design domain and the manufacturing domain, to be used for synthesizing a manufacturing/assembly system for new products. Examples of real products were used to demonstrate and validate the developed models, and comparisons with related existing methods were carried out to demonstrate their advantages. The outcomes of this research provide efficient and easy-to-implement knowledge-based solutions for facilitating cost-effective and rapid product development activities.