989 research outputs found

    Progressive Tree Neighborhood applied to the Maximum Parsimony Problem

    The Maximum Parsimony (MP) problem aims at reconstructing a phylogenetic tree from DNA sequences while minimizing the number of genetic transformations. To solve this NP-complete problem, heuristic methods have been developed, often based on local search. In this paper, we focus on the influence of the neighborhood relations. After analyzing the advantages and drawbacks of the well-known Nearest Neighbor Interchange (NNI), Subtree Pruning Regrafting (SPR), and Tree-Bisection-Reconnection (TBR) neighborhoods, we introduce the concept of Progressive Neighborhood (PN), which consists of progressively constraining the size of the neighborhood as the search advances. We show empirically that, applied to the MP problem, this PN is more efficient and robust than the classic neighborhoods using a descent algorithm. Indeed, it allows us to find better solutions with fewer iterations or trees evaluated.
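
The "number of genetic transformations" that the abstract refers to can be illustrated with a minimal sketch of Fitch's small-parsimony count on a fixed tree. This is not taken from the paper: the function name `fitch_length`, the nested-tuple tree encoding, and the toy sequences are assumptions made only for illustration.

```python
def fitch_length(tree, sequences):
    """Parsimony length of a rooted binary tree under Fitch's algorithm.

    tree: nested 2-tuples whose leaves are taxon names (hypothetical encoding).
    sequences: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    n_sites = len(next(iter(sequences.values())))

    def visit(node, site):
        # Returns (candidate state set, number of changes in this subtree).
        if isinstance(node, str):
            return {sequences[node][site]}, 0
        (left_set, left_cost) = visit(node[0], site)
        (right_set, right_cost) = visit(node[1], site)
        common = left_set & right_set
        if common:
            # Children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # Disjoint state sets: one change is required on this node's branches.
        return left_set | right_set, left_cost + right_cost + 1

    return sum(visit(tree, site)[1] for site in range(n_sites))


# Toy data (made up for the example).
toy = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "GCT"}
good_tree = (("A", "B"), ("C", "D"))
poor_tree = (("A", "C"), ("B", "D"))
print(fitch_length(good_tree, toy), fitch_length(poor_tree, toy))
```

A local search such as the one described in the abstract would call a scoring routine of this kind on every candidate tree, which is why the number of trees evaluated is a natural measure of effort.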

    Progressive Tree Neighborhood Applied to the Maximum Parsimony Problem

    The Maximum Parsimony problem aims at reconstructing a phylogenetic tree from DNA sequences while minimizing the number of genetic transformations. To solve this NP-complete problem, heuristic methods have been developed, often based on local search. In this article, we focus on the influence of the neighborhood relations. After identifying the limits of the commonly used neighborhoods, we introduce the concept of progressive neighborhood, which consists of decreasing the search space as the search advances. We show empirically that, applied to the Maximum Parsimony problem, this progressive neighborhood is more efficient and robust than the classic neighborhoods, as it allows us to find better solutions from any initial solution.
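
The idea of shrinking the neighborhood as the search advances might be sketched generically as follows. The abstract gives neither the shrinking schedule nor the tree moves, so the linear schedule, the `progressive_descent` function, and the integer toy problem below are all assumptions, not the authors' method.

```python
import random


def progressive_descent(start, cost, neighbors, max_radius,
                        min_radius=1, iterations=200, seed=0):
    """First-improvement descent whose move radius shrinks over time.

    neighbors(solution, radius) must return a non-empty list of candidates
    reachable with moves of size at most `radius` (hypothetical interface).
    """
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    for it in range(iterations):
        # Assumed schedule: shrink linearly from max_radius to min_radius.
        frac = it / max(1, iterations - 1)
        radius = max(min_radius,
                     round(max_radius - frac * (max_radius - min_radius)))
        candidate = rng.choice(neighbors(current, radius))
        candidate_cost = cost(candidate)
        if candidate_cost < current_cost:  # accept improving moves only
            current, current_cost = candidate, candidate_cost
    return current, current_cost


# Toy stand-in for tree space: integers, with moves of bounded step size.
def toy_cost(x):
    return (x - 37) ** 2


def toy_neighbors(x, radius):
    return [x + d for d in range(-radius, radius + 1) if d != 0]


best, best_cost = progressive_descent(0, toy_cost, toy_neighbors, max_radius=10)
print(best, best_cost)
```

For phylogenetic trees, the radius would presumably bound how far a pruned subtree may be regrafted, with large moves early on and NNI-like moves near the end.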


    筑波大学 (University of Tsukuba)201

    Phylogenetic Trees and Their Analysis

    Determining the best possible evolutionary history, the lowest-cost phylogenetic tree, to fit a given set of taxa and character sequences using maximum parsimony is an active area of research due to its underlying importance in understanding biological processes. As several steps in this process are NP-Hard when using popular, biologically-motivated optimality criteria, significant amounts of resources are dedicated both to heuristics and to making exact methods more computationally tractable. We examine both phylogenetic data and the structure of the search space in order to suggest methods to reduce the number of possible trees that must be examined to find an exact solution for any given set of taxa and associated character data. Our work on four related problems combines theoretical insight with empirical study to improve searching of the tree space. First, we show that there is a Hamiltonian path through tree space for the most common tree metrics, answering Bryant's Challenge for the minimal such path. We next examine the topology of the search space under various metrics, showing that some metrics have local maxima and minima even with perfect data, while some others do not. We further characterize conditions for which sequences simulated under the Jukes-Cantor model of evolution yield well-behaved search spaces. Next, we reduce the search space needed for an exact solution by splitting the set of characters into mutually-incompatible subsets of compatible characters, building trees based on the perfect phylogenies implied by these sets, and then searching in the neighborhoods of these trees. We validate this work empirically. Finally, we compare two approaches to the generalized tree alignment problem, or GTAP: sequence alignment followed by tree search vs. Direct Optimization, on both biological and simulated data.
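
The character-splitting step can be illustrated, under simplifying assumptions, with the standard four-gamete test for binary characters: two binary characters are compatible exactly when at most three of the four state pairs 00, 01, 10, 11 occur across the taxa. The greedy grouping below is a hypothetical stand-in; the abstract does not say how the subsets are actually built, and the thesis may handle multistate characters.

```python
def compatible(char_a, char_b):
    """Four-gamete test for two binary characters given as equal-length strings."""
    return len(set(zip(char_a, char_b))) < 4


def greedy_compatible_groups(characters):
    """Greedily partition characters into pairwise-compatible groups.

    Returns lists of character indices. This is an illustrative heuristic,
    not the partitioning procedure of the work summarized above.
    """
    groups = []
    for idx, char in enumerate(characters):
        for group in groups:
            if all(compatible(char, characters[j]) for j in group):
                group.append(idx)
                break
        else:
            # No existing group accepts this character: open a new one.
            groups.append([idx])
    return groups


# Each string is one character (column) scored over four taxa (made-up data).
chars = ["0011", "0001", "0101"]
print(greedy_compatible_groups(chars))
```

Each resulting group admits a perfect phylogeny, which could then serve as a starting tree for a neighborhood search.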

    On the Complexity of Parameterized Local Search for the Maximum Parsimony Problem

    Maximum Parsimony is the problem of computing a most parsimonious phylogenetic tree for a taxa set X from character data for X. A common strategy to attack this notoriously hard problem is to perform a local search over the phylogenetic tree space. Here, one is given a phylogenetic tree T and wants to find a more parsimonious tree in the neighborhood of T. We study the complexity of this problem when the neighborhood contains all trees within distance k for several classic distance functions. For the nearest neighbor interchange (NNI), subtree prune and regraft (SPR), tree bisection and reconnection (TBR), and edge contraction and refinement (ECR) distances, we show that, under the exponential time hypothesis, there are no algorithms with running time |I|^o(k), where |I| is the total input size. Hence, brute-force algorithms with running time |X|^O(k) · |I| are essentially optimal. In contrast to the above distances, we observe that for the sECR-distance, where the contracted edges are constrained to form a subtree, a better solution within distance k can be found in k^O(k) · |I|^O(1) time.

    On Neighborhood Tree Search

    We consider the neighborhood tree induced by alternating the use of different neighborhood structures within a local search descent. We investigate the issue of designing a search strategy operating at the neighborhood tree level by exploring different paths of the tree in a heuristic way. We show that allowing the search to 'backtrack' to a previously visited solution and resuming the iterative variable neighborhood descent by 'pruning' the already explored neighborhood branches leads to the design of effective and efficient search heuristics. We describe this idea by discussing its basic design components within a generic algorithmic scheme and we propose some simple and intuitive strategies to guide the search when traversing the neighborhood tree. We conduct a thorough experimental analysis of this approach by considering two different problem domains, namely, the Total Weighted Tardiness Problem (SMTWTP), and the more sophisticated Location Routing Problem (LRP). We show that independently of the considered domain, the approach is highly competitive. In particular, we show that using different branching and backtracking strategies when exploring the neighborhood tree allows us to achieve different trade-offs in terms of solution quality and computing cost. Comment: Genetic and Evolutionary Computation Conference (GECCO'12) (2012)

    A novel approach based on multiobjective variable mesh optimization to Phylogenetics

    One of the most relevant problems in Bioinformatics and Computational Biology is the search and reconstruction of the most accurate phylogenetic tree that explains, as exactly as possible, the evolutionary relationships among species from a given dataset. Different criteria have been employed to evaluate the accuracy of evolutionary hypotheses in order to guide a search algorithm towards the best tree. However, these criteria may lead to distinct phylogenies, which often conflict with one another. Therefore, a multi-objective approach can be useful. In this work, we present a phylogenetic adaptation of a multiobjective variable mesh optimization algorithm for inferring phylogenies, to tackle the phylogenetic inference problem according to two optimality criteria: maximum parsimony and maximum likelihood. The aim of this approach is to propose a complementary view of phylogenetics in order to generate a set of trade-off phylogenetic topologies that represent a consensus between both criteria. Experiments on four real nucleotide datasets show that our proposal can achieve promising results, under both multiobjective and biological approaches, with regard to other classical and recent multiobjective metaheuristics from the state-of-the-art.

    FPGA Hardware Acceleration of a Phylogenetic Tree Reconstruction with Maximum Parsimony Algorithm

    In this paper, we present an FPGA hardware implementation for a phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel and pipeline processing. We show results for eight real-world biological datasets. We compare execution times against our previous hardware approach, and TNT, the fastest available parsimony program, which is also accelerated by the Indirect Calculation of Tree Lengths method. Acceleration rates of 34 to 45 per rearrangement, and 2 to 6 for the whole search, are obtained against our previous hardware approach. Acceleration rates of 2 to 36 per rearrangement, and 18 to 112 for the whole search, are obtained against TNT.

    How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling

    Whatever the phylogenetic method, genetic sequences are often described as strings of characters; thus, molecular sequences can be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must deal with the surprising features of high-dimensional spaces, such as the concentration of measure phenomenon.
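
One way to see the concentration effect mentioned above is to compare pairwise Hamming distances between random sequences of different lengths: as the length (dimension) grows, distances cluster ever more tightly around their mean. The sketch below is an illustration only; the function `relative_spread`, the uniform random sequences, and the sample sizes are assumptions and do not come from the paper.

```python
import random
import statistics


def relative_spread(length, n_pairs=200, alphabet="ACGT", seed=0):
    """Std. deviation divided by mean of Hamming distances between random pairs."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_pairs):
        a = rng.choices(alphabet, k=length)
        b = rng.choices(alphabet, k=length)
        # Hamming distance: number of positions where the sequences differ.
        distances.append(sum(x != y for x, y in zip(a, b)))
    return statistics.pstdev(distances) / statistics.mean(distances)


for n in (20, 200, 2000):
    print(n, round(relative_spread(n), 4))
```

Under these assumptions the relative spread shrinks roughly like one over the square root of the sequence length, so in long alignments most random pairs look almost equally distant.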