14 research outputs found

    Selecting Data for Experiments: Past, Present and Future

    No full text

    Distribution-Insensitive Parallel External Sorting on PC Clusters

    No full text

    Estimating the deviation from a molecular clock

    No full text
    We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e. values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.

    A Linear-Time Algorithm for Computing Inversion Distance Between Signed Permutations with an Experimental Study

    No full text
    Hannenhalli and Pevzner gave the first polynomial-time algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components; in the second stage, certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(nα(n)) algorithm, based on a Union-Find structure, to find its connected components, where α is the inverse Ackermann function. Since for all practical purposes α(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new linear-time algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speed-up by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.
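
    As a rough illustration of the Union-Find idea mentioned in the abstract above, the sketch below groups intervals into connected components of their overlap graph. It is not the algorithm of Berman and Hannenhalli, nor the stack-based linear-time algorithm of the paper: it tests every pair of intervals explicitly, which takes quadratic time and is exactly the cost those algorithms avoid. The interval representation, the assumption that all endpoints are distinct, and the names `UnionFind`, `overlaps`, and `overlap_components` are hypothetical choices made for this example.

```python
from itertools import combinations


class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # First pass: walk up to the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1


def overlaps(p, q):
    """True if the intervals intersect and neither contains the other.

    Assumes all endpoints are distinct (an assumption of this sketch).
    """
    (a, b), (c, d) = sorted((p, q))
    return a < c < b < d


def overlap_components(intervals):
    """Return the components of the overlap graph as sorted index lists.

    Naive version: checks all pairs, so it runs in quadratic time.
    """
    uf = UnionFind(len(intervals))
    for i, j in combinations(range(len(intervals)), 2):
        if overlaps(intervals[i], intervals[j]):
            uf.union(i, j)
    groups = {}
    for i in range(len(intervals)):
        groups.setdefault(uf.find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Intervals 0, 1, 2 overlap in a chain; interval 4 is nested inside
    # interval 3, so those two are not adjacent in the overlap graph.
    example = [(1, 4), (3, 6), (5, 8), (9, 12), (10, 11)]
    print(overlap_components(example))
```

    Only the component-merging step is Union-Find here; the pairwise test stands in for the implicit graph traversal that the cited algorithms perform without building the overlap graph.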